Heads-up: Browsers block direct cross-origin requests, so this tool fetches your robots.txt through a public CORS proxy (corsproxy.io). That works for most sites, but some hosts (Cloudflare, anti-bot WAFs) refuse proxy traffic. If the fetch fails, copy your robots.txt into the paste tab, which always works.

PASS All AI bots allowed

71% of sites make this mistake

You’re blocking AI search but allowing AI training

This is the most common GEO mistake. Training bots (GPTBot, ClaudeBot, Google-Extended) only feed future model training — blocking them is a defensible privacy choice. Retrieval bots (OAI-SearchBot, ChatGPT-User, PerplexityBot, Claude-SearchBot) are how live AI search engines fetch your pages to answer real user questions right now. Blocking retrieval bots while allowing training bots gives away your content for training but makes you invisible in the answers your customers actually see. Per BuzzStream’s 2024 analysis, 71% of sites do exactly this.

Training bots

Used to train future AI models. Blocking is a defensible privacy choice.

Retrieval bots

Used by live AI search to fetch pages and cite your business in answers. Block these and you’re invisible.
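The mismatch described above can be expressed as a small check. This is a minimal sketch, not the tool's actual logic: the two bot sets are taken from the names on this page, `has_geo_mismatch` is a hypothetical helper, and the rule "at least one retrieval bot blocked while at least one training bot is still allowed" is an assumed threshold.

```python
# Bot names as listed on this page; the grouping into two sets is illustrative.
TRAINING_BOTS = {"GPTBot", "ClaudeBot", "Google-Extended"}
RETRIEVAL_BOTS = {"OAI-SearchBot", "ChatGPT-User", "PerplexityBot", "Claude-SearchBot"}


def has_geo_mismatch(blocked: set[str]) -> bool:
    """Return True when a retrieval bot is blocked while a training bot is allowed.

    `blocked` is the set of bot names that a robots.txt fully disallows.
    The "at least one" threshold on each side is an assumption of this sketch.
    """
    retrieval_blocked = bool(RETRIEVAL_BOTS & blocked)
    training_allowed = bool(TRAINING_BOTS - blocked)
    return retrieval_blocked and training_allowed
```

A site that blocks only PerplexityBot would be flagged, while a site that blocks every training bot and no retrieval bot would not.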

How we read your robots.txt: we parse the User-agent groups, then check each AI bot for three conditions: its own group with a Disallow: / rule, a wildcard User-agent: * group with Disallow: /, or any path-level Disallow. Bots with no matching rule default to allowed.
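The reading process described above might look roughly like the following. This is a simplified sketch under stated assumptions, not the checker's real code: `parse_groups` and `bot_status` are hypothetical names, Allow rules are ignored, and a bot-specific group is assumed to take precedence over the wildcard group.

```python
def parse_groups(robots_txt: str) -> dict[str, list[str]]:
    """Map each lowercased User-agent token to its list of Disallow paths."""
    groups: dict[str, list[str]] = {}
    current_agents: list[str] = []
    in_rules = False  # True once the current group has seen a rule line
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if in_rules:  # a User-agent line after rules starts a new group
                current_agents, in_rules = [], False
            current_agents.append(value.lower())
            groups.setdefault(value.lower(), [])
        elif field in ("allow", "disallow"):
            in_rules = True
            if field == "disallow" and value:  # an empty Disallow blocks nothing
                for agent in current_agents:
                    groups[agent].append(value)
    return groups


def bot_status(groups: dict[str, list[str]], bot: str) -> str:
    """Return 'blocked', 'partial', or 'allowed' for one bot."""
    key = bot.lower()
    # Assumption: a bot-specific group overrides the wildcard group.
    rules = groups[key] if key in groups else groups.get("*", [])
    if "/" in rules:
        return "blocked"   # site-wide Disallow: /
    if rules:
        return "partial"   # only path-level disallows
    return "allowed"       # no matching rule
```

Calling `bot_status(parse_groups(text), "GPTBot")` for each bot of interest gives one status per bot. A production parser would also need to weigh Allow rules and wildcard paths, which this sketch leaves out.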
This is 1 of 14 checks

Want the full audit?

Robots.txt is the cheapest fix in AI search. The hard part is whether ChatGPT actually cites you when a buyer searches. The BeCited audit runs 25–50 buying-intent prompts across 4 engines and scores 13 more site-readiness signals (schema, llms.txt, freshness, E-E-A-T, entity readiness, and more).

25–50 buying-intent prompts
4 AI engines
14 site-readiness checks
Action plan included
See the BeCited Audit →
Why this matters

Blocking AI bots is the fastest way to disappear from AI search

Search crawling used to mean a single bot (Googlebot). AI search splits the work across two bot families, and most site owners block the wrong one without realizing it.

Training bots

Crawl your site to collect model training data. Examples: GPTBot (OpenAI), Google-Extended (Google AI), ClaudeBot and anthropic-ai (Anthropic), CCBot (Common Crawl), Bytespider (ByteDance, TikTok's parent company). You can block these without affecting whether you appear in AI answers today.

Retrieval bots

Fetch your pages in real time when users ask AI search engines questions. Examples: OAI-SearchBot and ChatGPT-User (ChatGPT search), PerplexityBot (Perplexity), Claude-SearchBot (Claude web search). Block these and you’re absent from the answers your customers are reading right now.
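As an illustration of the split above, here is one possible robots.txt that blocks several training bots while leaving retrieval bots allowed, checked with Python's standard `urllib.robotparser`. It is a sketch, not a recommendation for every site: the choice of bots, the `allowed` helper, and the example.com URL are placeholders, and each real crawler may interpret the file slightly differently from Python's parser.

```python
from urllib.robotparser import RobotFileParser

# Example policy: training bots share one blocking group,
# everything else (including retrieval bots) falls under the wildcard.
ROBOTS_TXT = """\
User-agent: GPTBot
User-agent: ClaudeBot
User-agent: Google-Extended
User-agent: CCBot
Disallow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(bot: str, url: str = "https://example.com/pricing") -> bool:
    """Report whether this robots.txt lets `bot` fetch `url` (placeholder URL)."""
    return parser.can_fetch(bot, url)
```

With this file, `allowed("GPTBot")` is False while `allowed("PerplexityBot")` and `allowed("OAI-SearchBot")` are True, which matches the intent of blocking training crawlers only.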