Free tool · No signup

Free Robots.txt Generator

Control who can crawl your site — including GPTBot, ClaudeBot, PerplexityBot, Google-Extended, and 20+ other AI and search crawlers. Built for AEO, not just classic SEO.

Start with a preset

Sitemap (optional): points crawlers at your full URL list. Worth adding even when you also publish llms.txt.

Default for unlisted bots

Disallow paths listed under User-agent: *. Common choices: /admin/, /api/, /wp-admin/, /cart/.
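With those common paths blocked, the generated default stanza would look something like this (the paths are illustrative; adjust them to your site's structure):

```txt
User-agent: *
Disallow: /admin/
Disallow: /api/
Disallow: /wp-admin/
Disallow: /cart/
```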

AI search bots

These crawlers power AI answers and citations at query time. Letting them in is how you get cited.

AI training bots

Crawl your content to train the next foundation model. Blocking them doesn't usually hurt citations — most AI assistants use a separate index-time bot.

Search engines

Classic crawlers behind Google, Bing, etc. Block these only if you really mean it.

robots.txt (84 chars · 5 lines)
# robots.txt — generated by fixaeo.com/robots-txt-generator

User-agent: *
Allow: /

Where to upload

Save as robots.txt and host it at the root of your site (e.g. https://yoursite.com/robots.txt). Most static hosts and CMSes accept a plain text file in the public/static folder.

Why this one's different

Most robots.txt generators stop at Googlebot. AI crawlers have multiplied roughly 5× since 2023, and your file needs to keep up.

24 AI + search crawlers covered

GPTBot, ClaudeBot, PerplexityBot, Google-Extended, Bytespider, CCBot, Applebot-Extended, Meta-ExternalAgent, MistralAI-User, and all the major search engines. Each tagged by maker and purpose so you know what you're blocking.

Smart presets for AEO

One click for 'AEO-friendly' (allow AI search, block AI training), 'Block all AI', or 'Allow everything'. Tweak from there instead of starting from scratch.
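The 'AEO-friendly' preset produces output along these lines (a sketch; the exact bot list the tool emits may be longer):

```txt
# Allow AI search bots (index-time)
User-agent: OAI-SearchBot
Allow: /

User-agent: PerplexityBot
Allow: /

# Block AI training bots
User-agent: GPTBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: CCBot
Disallow: /

# Everything else
User-agent: *
Allow: /
```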

Minimal, correct output

Only emits overrides that differ from your default. The result is a short, readable file you can paste straight to /robots.txt — no boilerplate noise.

The two kinds of AI crawler

Knowing the distinction is what separates vanishing from AI answers from merely opting out of model training.

Index-time

AI search bots

Fetch your page when an AI assistant needs to answer a user's question. Blocking these removes you from citations entirely. Examples: OAI-SearchBot, Claude-SearchBot, PerplexityBot, ChatGPT-User, DuckAssistBot.

Recommendation: allow unless you really don't want to show up in AI answers.

Training-time

AI training bots

Crawl your content to feed the next foundation model. Blocking these opts you out of training datasets but doesn't affect today's citations. Examples: GPTBot, Google-Extended, Bytespider, CCBot, anthropic-ai, Applebot-Extended.

Recommendation: your call. The common AEO stance is to block these and keep index-time bots allowed.
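You can check how a well-behaved crawler interprets this split using Python's standard `urllib.robotparser`. The robots.txt below is a minimal sketch of the allow-search/block-training stance; the URL is a placeholder:

```python
from urllib.robotparser import RobotFileParser

# Sketch of an AEO-friendly robots.txt: one training bot blocked,
# one AI search bot allowed, everyone else allowed by default.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: PerplexityBot
Allow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

print(parser.can_fetch("GPTBot", "https://example.com/post"))         # False: training bot blocked
print(parser.can_fetch("PerplexityBot", "https://example.com/post"))  # True: search bot allowed
print(parser.can_fetch("SomeOtherBot", "https://example.com/post"))   # True: falls back to *
```

This is the same matching logic compliant crawlers use: the most specific User-agent group wins, and unlisted bots fall through to the `*` group.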

Frequently asked questions

AI bots, training vs indexing, and how robots.txt fits into AEO strategy.

What is a robots.txt file?
robots.txt is a plain-text file at the root of your website (yoursite.com/robots.txt) that tells web crawlers which URLs they may or may not fetch. Crawlers check it before requesting any other page. It's an honor-system protocol — well-behaved bots respect it; malicious scrapers ignore it.
Why does robots.txt matter for AEO?
AI assistants like ChatGPT, Claude, and Perplexity send their own crawlers (GPTBot, ClaudeBot, PerplexityBot, etc.) to fetch and index pages. If you block them in robots.txt, your content won't appear in their answers. If you allow them, your site becomes eligible to be cited. Most sites should explicitly allow AI search bots and consider their stance on training-only crawlers separately.
What's the difference between AI search bots and AI training bots?
AI search bots (OAI-SearchBot, Claude-SearchBot, PerplexityBot, ChatGPT-User) fetch pages at query time to power citations and answers — blocking them removes you from those AI assistants entirely. AI training bots (GPTBot, Google-Extended, Bytespider, CCBot, anthropic-ai) crawl to build the next foundation model — blocking them opts you out of training without affecting today's citations. Many SEOs allow the indexers and block the trainers.
Will blocking GPTBot stop ChatGPT from mentioning my site?
Only partly. GPTBot is OpenAI's training crawler, so blocking it stops new training-data ingestion. But ChatGPT also uses OAI-SearchBot and ChatGPT-User for live answers; if you block only GPTBot, your site can still be cited in real-time browsing. To remove yourself entirely, block all three.
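Opting out of all three OpenAI crawlers looks like this:

```txt
# Block OpenAI training, search indexing, and live browsing
User-agent: GPTBot
Disallow: /

User-agent: OAI-SearchBot
Disallow: /

User-agent: ChatGPT-User
Disallow: /
```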
What about Google-Extended?
Google-Extended is a *separate* user-agent that controls whether Google can use your content to train Gemini (formerly Bard) and Vertex AI models. It does NOT affect Googlebot or your classic Google Search rankings, so you can block Google-Extended without losing organic traffic. This is a common AEO-friendly setup.
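The opt-out is a single stanza; Googlebot keeps crawling normally under your existing rules:

```txt
# Opt out of Gemini/Vertex AI training; Google Search is unaffected
User-agent: Google-Extended
Disallow: /
```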
Should I block Common Crawl (CCBot)?
Common Crawl is a non-profit that publishes an open web archive used by many LLMs (including most early GPT models). Blocking CCBot opts you out of that dataset — but doesn't necessarily protect you from individual labs that crawl directly. Useful as a signal of intent more than a hard guarantee.
Where do I put the robots.txt file?
At the root of your domain, served as text/plain. For static sites (Next.js, Astro, Hugo), drop it into the public/ or static/ folder. For WordPress, upload via SFTP to the webroot. After deploying, verify with `curl -I https://yoursite.com/robots.txt` — you should see HTTP 200.
Will robots.txt protect private content?
No. robots.txt is advisory — it tells well-behaved crawlers what to skip, but anyone (and any bot that ignores robots.txt) can still fetch the URL directly. For real privacy, use authentication, IP allowlists, or noindex headers. robots.txt is for telling Google/AI bots which public pages to ignore, not for hiding secrets.
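If the goal is keep-out-of-the-index rather than keep-out-of-reach, a response header is more reliable than robots.txt, since crawlers can't see a noindex directive on pages robots.txt already blocks them from fetching. A sketch for nginx (assuming nginx serves the site; /drafts/ is a hypothetical path):

```nginx
# nginx: tell compliant crawlers not to index or follow links under /drafts/
location /drafts/ {
    add_header X-Robots-Tag "noindex, nofollow" always;
}
```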

Want to see how AI engines find you today?

Run a free AEO audit. We check your robots.txt, llms.txt, schema, citation strength, and how your brand currently shows up across ChatGPT, Claude, Gemini, Perplexity, and Grok.

Run a free AEO audit