Paste your robots.txt and we’ll tell you exactly which AI crawlers you’re allowing — and which you’re accidentally blocking from the answers your customers see in ChatGPT, Perplexity, Claude, and Google.
Same AI bot taxonomy used in the BeCited paid audit. Runs entirely in your browser.
Heads-up: Browsers block direct cross-origin requests, so the URL fetch goes through a public CORS proxy (corsproxy.io). It works for most sites, but some hosts (Cloudflare, anti-bot WAFs) refuse proxy traffic. If the fetch fails, copy your robots.txt into the paste tab, which always works.
The most common GEO (generative engine optimization) mistake is blocking the wrong bot family. Training bots (GPTBot, ClaudeBot, Google-Extended) only feed future model training, so blocking them is a defensible privacy choice. Retrieval bots (OAI-SearchBot, ChatGPT-User, PerplexityBot, Claude-SearchBot) are how live AI search engines fetch your pages to answer real user questions right now. Blocking retrieval bots while allowing training bots gives away your content for training but makes you invisible in the answers your customers actually see. Per BuzzStream’s 2024 analysis, 71% of sites do exactly this.
Training bots: Used to train future AI models. Blocking is a defensible privacy choice.
Retrieval bots: Used by live AI search to fetch pages and cite your business in answers. Block these and you’re invisible.
A bot is flagged if it has its own Disallow: / rule, falls under a wildcard User-agent: * with Disallow: /, or has any path-level disallow. Bots with no matching rule default to allowed.
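The flagging rule above can be sketched in a few lines of Python. This is a minimal illustration, not the checker's actual code: the names `Group`, `parse_groups`, and `classify` and the three status labels are hypothetical, Allow lines are ignored for brevity, and the sketch assumes the wildcard group applies only when no group names the bot directly.

```python
from dataclasses import dataclass, field


@dataclass
class Group:
    """One robots.txt group: its User-agent names and Disallow paths."""
    agents: list = field(default_factory=list)
    disallows: list = field(default_factory=list)


def parse_groups(robots_txt):
    """Split robots.txt text into groups (simplified: Allow lines are ignored)."""
    groups = []
    current = None
    last_was_agent = False
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if ":" not in line:
            continue
        key, value = (part.strip() for part in line.split(":", 1))
        key = key.lower()
        if key == "user-agent":
            # Consecutive User-agent lines share one group.
            if not last_was_agent:
                current = Group()
                groups.append(current)
            current.agents.append(value.lower())
            last_was_agent = True
        else:
            last_was_agent = False
            # An empty "Disallow:" means "allow everything", so skip it.
            if key == "disallow" and current is not None and value:
                current.disallows.append(value)
    return groups


def classify(robots_txt, bot):
    """Return 'blocked', 'partial', or 'allowed' (hypothetical labels)."""
    groups = parse_groups(robots_txt)
    name = bot.lower()
    matched = [g for g in groups if name in g.agents]
    if not matched:  # assumption: fall back to the wildcard group
        matched = [g for g in groups if "*" in g.agents]
    paths = [p for g in matched for p in g.disallows]
    if "/" in paths:
        return "blocked"   # site-wide disallow
    if paths:
        return "partial"   # path-level disallow only
    return "allowed"       # no matching rule defaults to allowed


EXAMPLE = """
User-agent: GPTBot
Disallow: /

User-agent: PerplexityBot
Disallow: /private/

User-agent: *
Disallow:
"""

if __name__ == "__main__":
    for bot in ("GPTBot", "PerplexityBot", "OAI-SearchBot"):
        print(bot, classify(EXAMPLE, bot))
```

Real crawlers also weigh Allow rules and longest-path matching, so treat the output as a first-pass signal rather than a definitive verdict.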
Robots.txt is the cheapest fix in AI search. The hard part is whether ChatGPT actually cites you when a buyer searches. The BeCited audit runs 25–50 buying-intent prompts across 4 engines and scores 13 more site-readiness signals (schema, llms.txt, freshness, E-E-A-T, entity readiness, and more).
Traditional search relied on a single crawler per engine, such as Googlebot. AI search splits the work across two bot families, and most site owners block the wrong ones without realizing it.
Training bots crawl your site to feed model training data. Examples: GPTBot (OpenAI), Google-Extended (Google AI), ClaudeBot (Anthropic), CCBot (Common Crawl), Bytespider (ByteDance, TikTok's parent company), anthropic-ai (Anthropic). You can block these without affecting whether you appear in AI answers today.
Retrieval bots fetch your pages in real time when users ask AI search engines questions. Examples: OAI-SearchBot and ChatGPT-User (ChatGPT search), PerplexityBot (Perplexity), Claude-SearchBot (Claude web search). Block these and you’re absent from the answers your customers are reading right now.
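One way to act on this split is a robots.txt that disallows the training family and explicitly allows the retrieval family. The sketch below embeds such a file and checks it with Python's standard `urllib.robotparser`. The policy, the `check_policy` helper, and the `https://example.com/pricing` URL are illustrative assumptions, and the standard-library matcher may not mirror how each crawler actually interprets the file.

```python
from urllib.robotparser import RobotFileParser

# Bot names taken from the two families described above.
TRAINING_BOTS = ["GPTBot", "ClaudeBot", "Google-Extended",
                 "CCBot", "Bytespider", "anthropic-ai"]
RETRIEVAL_BOTS = ["OAI-SearchBot", "ChatGPT-User",
                  "PerplexityBot", "Claude-SearchBot"]

# Illustrative policy: block training crawlers, allow retrieval crawlers.
ROBOTS_TXT = """\
User-agent: GPTBot
User-agent: ClaudeBot
User-agent: Google-Extended
User-agent: CCBot
User-agent: Bytespider
User-agent: anthropic-ai
Disallow: /

User-agent: OAI-SearchBot
User-agent: ChatGPT-User
User-agent: PerplexityBot
User-agent: Claude-SearchBot
Allow: /

User-agent: *
Allow: /
"""


def check_policy(robots_txt, url="https://example.com/pricing"):
    """Map each known bot to True (may fetch url) or False (disallowed)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {bot: parser.can_fetch(bot, url)
            for bot in TRAINING_BOTS + RETRIEVAL_BOTS}


if __name__ == "__main__":
    for bot, allowed in check_policy(ROBOTS_TXT).items():
        family = "training" if bot in TRAINING_BOTS else "retrieval"
        print(f"{bot:18} {family:10} {'allowed' if allowed else 'blocked'}")
```

Whether to block the training family at all is a policy choice. The point of the check is only to confirm that no retrieval bot ends up disallowed by accident.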