The web is being read by machines now. aitools.hjlabs.in gives you three free, privacy-first tools to control and measure that: generate an llms.txt to guide AI crawlers, build an AI-focused robots.txt to allow or block GPTBot, ClaudeBot, CCBot, Google-Extended and PerplexityBot, and estimate GPT and Claude token counts for any text. No signup, no upload.
Each tool is a single-purpose, no-nonsense utility. Pick what you need — they all run entirely in your browser.
Create an llms.txt file, the proposed standard for telling ChatGPT, Claude, Perplexity and other AI models which pages on your site matter most. Enter your site name, description and key pages, and get a clean Markdown file that follows the proposal, ready to drop at your domain root.
Decide which AI crawlers may train on or fetch your content. Toggle GPTBot, ClaudeBot, CCBot, Google-Extended, PerplexityBot, Bytespider and more, then copy a ready-to-publish robots.txt. The fastest way to block AI crawlers — or selectively allow them.
Paste any prompt or document and instantly see an estimated GPT and Claude token count, plus characters, words and an approximate API cost. Perfect for staying under context limits and budgeting prompts. Runs locally — your text never leaves the page.
For two decades, the contract between websites and machines was simple: a robots.txt file and an XML sitemap told search-engine crawlers what to index, and that was that. Large language models broke that contract. ChatGPT, Claude, Gemini, Perplexity and a growing fleet of AI agents now read the open web to train models, to answer questions in real time, and to cite sources inside generated answers. Your site is no longer just being ranked; it is being read, summarized, and quoted by machines. That shift created an entirely new category of decisions for every site owner, and almost no tooling exists to make those decisions easy. That is the gap aitools.hjlabs.in fills.
The llms.txt proposal is an emerging convention, not yet a formal standard: a single Markdown file at the root of your domain that gives language models a curated map of your most important content. Instead of an LLM guessing which of your 4,000 pages are canonical, you hand it a short, structured list: here is what this site is, here are the key docs, here is the pricing page, here is the API reference. The aim is that sites publishing a good llms.txt are easier for AI to understand and more likely to be cited accurately. Our llms.txt generator turns a few inputs into a file that follows the proposal in seconds. We author these files for our own properties, so we productized the process.
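The file layout described above can be sketched in a few lines. This is a minimal Python illustration, not the generator's actual code (the tools themselves run as browser JavaScript). The `Link` class, the `build_llms_txt` function and the example.com URLs are hypothetical, and the sketch assumes the layout from the llms.txt proposal: an H1 title, a blockquote summary, then H2 sections of Markdown links.

```python
from dataclasses import dataclass


@dataclass
class Link:
    """One entry in an llms.txt link list (hypothetical helper type)."""
    title: str
    url: str
    note: str = ""


def build_llms_txt(site_name: str, summary: str, sections: dict[str, list[Link]]) -> str:
    """Render an H1 title, a blockquote summary, and one H2 section per key."""
    lines = [f"# {site_name}", "", f"> {summary}"]
    for heading, links in sections.items():
        lines += ["", f"## {heading}", ""]
        for link in links:
            # The note after the colon is optional.
            suffix = f": {link.note}" if link.note else ""
            lines.append(f"- [{link.title}]({link.url}){suffix}")
    return "\n".join(lines) + "\n"


# Placeholder inputs; replace with your own site details.
example = build_llms_txt(
    "Example Site",
    "Placeholder summary of what the site offers.",
    {
        "Docs": [Link("API reference", "https://example.com/docs/api.md", "endpoints and auth")],
        "Optional": [Link("Pricing", "https://example.com/pricing")],
    },
)
print(example)
```

Save the printed text as `llms.txt` at the domain root. Keeping the list short is the point: the file is a curated map, not a second sitemap.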
Not every site wants its content used to train the next foundation model, and publishers in particular are drawing hard lines. The mechanism is still robots.txt, but the user-agent tokens are new: GPTBot (OpenAI training), ClaudeBot (Anthropic), CCBot (Common Crawl, which feeds many models), Google-Extended (Gemini training), PerplexityBot, Bytespider (TikTok/ByteDance) and others. Getting the exact agent strings and directives right is fiddly, and a single typo makes a rule fail silently, because a misspelled agent name simply never matches. Our AI robots.txt generator gives you a clean toggle per crawler and emits a correct file, whether you want to block AI crawlers entirely or allow some and deny others.
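A per-crawler toggle of this kind can be sketched as follows. This is an illustrative Python version under stated assumptions, not the tool's own implementation: `AI_CRAWLERS` and `build_robots_txt` are hypothetical names, and the agent list is just the one given above. To guard against the silent-typo problem, the sketch rejects any agent name it does not recognise instead of writing a rule that would never match.

```python
# Agent tokens taken from the list above; extend as needed.
AI_CRAWLERS = ["GPTBot", "ClaudeBot", "CCBot", "Google-Extended", "PerplexityBot", "Bytespider"]


def build_robots_txt(blocked: set[str], crawlers: list[str] = AI_CRAWLERS) -> str:
    """Emit one User-agent group per crawler, disallowing only the blocked ones."""
    unknown = blocked - set(crawlers)
    if unknown:
        # A misspelled agent would otherwise produce a rule that matches nothing.
        raise ValueError(f"Unknown crawler name(s): {sorted(unknown)}")
    groups = []
    for agent in crawlers:
        rule = "Disallow: /" if agent in blocked else "Allow: /"
        groups.append(f"User-agent: {agent}\n{rule}")
    return "\n\n".join(groups) + "\n"


# Example: block two training crawlers, allow the rest.
robots = build_robots_txt({"GPTBot", "CCBot"})
print(robots)
```

Each group is written out explicitly, even for allowed crawlers, so the published file documents the decision made for every agent.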
Everything you send to or receive from an LLM is billed and bounded in tokens, not words. Underestimate and your prompt gets truncated or rejected with a context-window error; overestimate and you trim content you could have kept or budget more money than you need. Our token counter gives an instant client-side estimate of GPT and Claude tokens for any text, alongside characters, words, and an approximate API cost, so you can budget a prompt or check that a document fits a model's context window before you ever call the API.
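The kind of estimate described above can be approximated with simple arithmetic. The sketch below is not the counter's actual method: it assumes the common rule of thumb of roughly four characters per token for English text, which real tokenizers will deviate from. The function names, the 8,000-token context window and the price of 3.00 USD per million tokens are hypothetical placeholders.

```python
import math

# Assumption: ~4 characters per token, a rough heuristic for English text.
CHARS_PER_TOKEN = 4.0


def estimate_tokens(text: str, chars_per_token: float = CHARS_PER_TOKEN) -> int:
    """Return a rough token estimate based on character count alone."""
    if not text:
        return 0
    return math.ceil(len(text) / chars_per_token)


def summarize(text: str, context_window: int, usd_per_million_tokens: float) -> dict:
    """Report characters, words, estimated tokens, fit, and approximate cost."""
    tokens = estimate_tokens(text)
    return {
        "characters": len(text),
        "words": len(text.split()),
        "estimated_tokens": tokens,
        "fits_context": tokens <= context_window,
        "approx_cost_usd": tokens * usd_per_million_tokens / 1_000_000,
    }


# Placeholder limits and pricing; check your provider's current figures.
report = summarize(
    "Everything you send to an LLM is billed in tokens.",
    context_window=8_000,
    usd_per_million_tokens=3.0,
)
print(report)
```

Because this is a heuristic, leave some headroom below the context limit. An exact count requires the tokenizer for the specific model.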
Every tool here runs 100% in your browser. Your site structure, your robots rules, your prompts and documents are processed locally with JavaScript and never transmitted to any server. There is no account, no tracking of your inputs, and no upload. This is the same privacy-first, edge-deployed approach behind our sister tools at fmt.hjlabs.in and pixel.hjlabs.in.
hjLabs.in operates a family of web properties, and we maintain llms.txt and AI-crawler policies across all of them. These tools are the internal utilities we built for ourselves, polished and made free for everyone. The AI-era web is still being defined. Getting your llms.txt and robots.txt right today, and knowing your token counts, is one of the cheapest, highest-leverage things a site owner can do for generative engine optimization (GEO) and AI visibility.