friendly4AI LogoMaking websites AI-friendly - Your website optimization platform for AI systemsfriendly4AI
The starting point for making your website AI-friendly. friendly4AI helps you optimize your website for AI systems and improve visibility.

ai@friendly4.ai

friendly4AI © 2026

Why Your Site Is Invisible to AI (And How to Fix It in 5 Minutes)

Marina, friendly4AI Team, 19 May 2026
Last updated: 19 May 2026

Your hosting provider may be blocking GPTBot, ClaudeBot, and PerplexityBot from your site by default. Here's how to check, and the one-line robots.txt fix.

You can have the clearest writing in your industry, perfect Schema.org markup, and a site that loads in under a second, and still get zero citations from ChatGPT, Claude, or Perplexity.

Why? Because before any of that matters, AI engines have to be allowed in. And right now, most sites are not letting them in. Most owners do not know.

The finding that triggered this

According to Otterly's AI Citations Report 2026, about 73% of sites have at least one AI crawler blocked at the robots.txt or CDN layer. That is roughly three out of four sites. We see the same shape in our own scans: AI Bot Accessibility blocks keep turning up across every industry we look at, on sites whose owners never set the rule themselves.

Otterly tracks who cites you. friendly4AI tells you why the crawler is not reaching you in the first place, which is the layer underneath the citation. Both matter, but you cannot improve your citations until crawler access is fixed.

That number caught me off guard. For most of our scoring history we treated "is your robots.txt blocking AI?" as an edge case. It is not. It is the default failure mode. If you are an SEO manager explaining to a stakeholder why AI search matters this quarter, this is also the number you bring into the meeting — three out of four sites in your category are doing this wrong by default.

That is why Score v2.1 now ships a dedicated AI Bot Accessibility section in every report. It checks each major AI crawler against your robots.txt and tells you who can read your site, and who can't.

A site AI cannot read is a zero-citation site

This is the part worth saying plainly. There is no SEO trick, no schema upgrade, no content rewrite that fixes an AI crawler block. If GPTBot is disallowed, OpenAI's training corpus does not see your page. If PerplexityBot is disallowed, Perplexity has nothing to cite. There is no second chance further down the funnel.

It is the single highest-leverage technical fix for AI Visibility, and it costs almost nothing. A one-line change in a file most people already have.

The managed-host trap

Here is where it gets uncomfortable. The block is usually not something you did.

Throughout 2025 and into 2026, several large managed hosting platforms (WP Engine, Squarespace, Wix, and others) added default robots.txt rules that disallow GPTBot, ClaudeBot, and similar AI training agents. The intent was reasonable: protect customers from uncompensated scraping. The effect, especially for owners who never opened their robots.txt, is that they are silently absent from AI answers.

I see this pattern in our inbox almost every week. A small business owner re-scans, sees a Critical verdict, and writes in asking what they did wrong. Answer: nothing. Their host did. They are now in the position of having to override a default they did not know existed, on a platform they pay for to avoid exactly that kind of decision.

When our scanner detects you are on one of these platforms, the report surfaces a "Why is this happening?" callout naming your host and linking to the platform's override instructions. The block is fixable from your side. Your host's default does not have the final word on your robots.txt.

How to check in 30 seconds

Open this URL in your browser:

https://yourdomain.com/robots.txt

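Look for any of these user-agent tokens on a User-agent line followed by Disallow: /. If you see a match, that crawler cannot read your site, and you lose what is in the right-most column. Also check for a User-agent: * group with Disallow: /, which applies to every crawler that has no group of its own.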

User-agent token | Company | Type | What you lose if blocked
GPTBot | OpenAI | Training | Your pages are excluded from future GPT training data; long-term ChatGPT recall about your brand erodes.
OAI-SearchBot | OpenAI | Real-time search | ChatGPT cannot cite your page when it searches the web mid-answer.
ClaudeBot / anthropic-ai | Anthropic | Training | Your content is excluded from Claude's training corpus.
PerplexityBot | Perplexity | Index / search | Perplexity has nothing to cite when answering questions in your topic area.
Google-Extended | Google | AI training opt-out | Your content is excluded from Gemini training and grounding. (Separate from Googlebot; does not affect Google Search ranking.)
CCBot | Common Crawl | Training feed | Many AI training pipelines start from Common Crawl. A block here propagates to multiple downstream models.
Bytespider | ByteDance | Training | You are absent from ByteDance AI products such as Doubao.

Table: Major AI crawlers, what they do, and what a block costs you (2026).

You want both training and real-time crawlers allowed. The first feeds the model itself; the second fetches you when someone asks a question right now.
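If you would rather script the check than read the file by eye, here is a minimal Python sketch. It uses the standard library's urllib.robotparser and the tokens from the table above. The function name blocked_ai_crawlers and the sample file are hypothetical, made up for illustration. Python's parser is an approximation of how each vendor's crawler interprets robots.txt, and it says nothing about blocks applied at the CDN layer, so treat the result as a first pass only.

```python
from urllib.robotparser import RobotFileParser

# User-agent tokens taken from the table above.
AI_CRAWLERS = [
    "GPTBot", "OAI-SearchBot", "ClaudeBot", "anthropic-ai",
    "PerplexityBot", "Google-Extended", "CCBot", "Bytespider",
]


def blocked_ai_crawlers(robots_txt: str, path: str = "/") -> list[str]:
    """Return the AI crawler tokens that robots_txt disallows for path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [agent for agent in AI_CRAWLERS if not parser.can_fetch(agent, path)]


# Hypothetical robots.txt of the kind a host might ship by default.
sample = """\
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: *
Allow: /
"""

print(blocked_ai_crawlers(sample))  # ['GPTBot', 'ClaudeBot']
```

To run it against your own site, paste the contents of https://yourdomain.com/robots.txt into the function in place of the sample text.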

Easier path: run a free scan at friendly4.ai. The AI Bot Accessibility section gives you each crawler's verdict in one view, plus the score impact and the copy-paste fix.

The 5-minute fix

If you are on Squarespace, Wix, or WP Engine, skip the snippet below. Those hosts ship a managed robots.txt you cannot replace by upload. A direct file edit will not stick. Instead, open your friendly4AI report, find the "Why is this happening?" callout for your host, and follow the platform-specific override instructions. Then come back here for the verification step.

For self-hosted sites — your own WordPress, Next.js, static site, custom stack — add this block to your robots.txt and replace any existing AI-crawler rules:

User-agent: GPTBot
Allow: /

User-agent: OAI-SearchBot
Allow: /

User-agent: ChatGPT-User
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: anthropic-ai
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: Google-Extended
Allow: /

User-agent: CCBot
Allow: /

That covers training crawlers and real-time fetchers in one pass. For a deeper breakdown of which bot does what, see Understanding AI Crawlers.
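If your file also carries a catch-all User-agent: * rule, it helps to confirm that the named groups above still win. The sketch below rebuilds the block from a list of tokens and checks it with Python's urllib.robotparser. The helper names build_allow_block and all_allowed and the existing file are hypothetical, and the Python parser only approximates what each real crawler does, so this is a sanity check rather than a guarantee.

```python
from urllib.robotparser import RobotFileParser

# The agents listed in the robots.txt block above.
ALLOWED_AGENTS = [
    "GPTBot", "OAI-SearchBot", "ChatGPT-User", "ClaudeBot",
    "anthropic-ai", "PerplexityBot", "Google-Extended", "CCBot",
]


def build_allow_block(agents: list[str]) -> str:
    """Render one 'User-agent / Allow: /' group per agent, separated by blank lines."""
    return "\n\n".join(f"User-agent: {agent}\nAllow: /" for agent in agents) + "\n"


def all_allowed(robots_txt: str, agents: list[str]) -> bool:
    """True if every agent may fetch the site root according to robots_txt."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return all(parser.can_fetch(agent, "/") for agent in agents)


# Hypothetical existing file that blocks every crawler by default.
existing = "User-agent: *\nDisallow: /\n"
updated = build_allow_block(ALLOWED_AGENTS) + "\n" + existing

print(all_allowed(existing, ALLOWED_AGENTS))  # False
print(all_allowed(updated, ALLOWED_AGENTS))   # True
```

The second result is True because a group that names a crawler takes precedence over the wildcard group for that crawler. Crawlers without a named group still fall under the wildcard rule.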

Re-scan to verify the fix

Once your robots.txt is updated, re-scan your site. In your friendly4AI report, the AI Bot Accessibility section should update from "Critical" or "Mixed" to "All allowed."

If your previous report showed a capped score (two numbers, for example "AI-Readiness: 55 / Potential after fix: 79"), the cap lifts on the next scan once the block is resolved. The "Potential after fix" number becomes your headline score.

We argued about the cap value internally for a week. Sixty was the number that felt honest — clearly below the "good" threshold, not so low it obscures the rest of the work. When AI cannot reach your site, a score of 80 is misleading, no matter how good the rest of your setup looks. The dual-number view shows you both where you are today and where you will be once the crawler block is gone.

One quick note on llms.txt

If you have been tracking GEO news, you may also have heard that llms.txt does not move the needle. Google publicly stated in 2025 that llms.txt is not used in Search or AI Overviews ranking. An independent analysis of 94,000+ URLs found no measurable citation effect. We have re-weighted it accordingly in Score v2.1.

We still detect llms.txt on your site and surface it in an Experimental signals section for completeness. Its weight in AI-Readiness score: 0 points. We did not delete the check; we just stopped recommending it as a priority action.

The reason these two changes ship together is simple. We removed a low-impact recommendation and added a high-impact one, in the same release, so the "Critical" findings band in your report becomes more honest about which actions actually move AI Visibility.

What to do next

Scan your site at friendly4.ai if you have not in the last week, and look at the AI Bot Accessibility section. If you see any Critical or Mixed verdict, copy the robots.txt snippet from your report (it lists only the agents you currently block) and apply it. Re-scan to confirm the fix landed, then move on.

Stuck? If our scanner says you are blocked and you cannot apply the fix yourself, reply to your report email or write to ai@friendly4.ai. We are tagging these so we know which hosts are causing the most pain.

The free scan tells you who can read your site. Tracking who actually cites you across ChatGPT, Claude, Gemini, Grok, and Perplexity is on Starter.

Once AI can reach your site, the rest of your AI-readiness work (structured data, semantic HTML, internal linking) starts compounding. Before that, the score is mostly diagnostic.

If you are in the 27% whose AI Bot Accessibility is already clean, you are ahead of most of the web. The next fixes are in How to Improve Your AI-Readiness Score.

Keep reading

  • Understanding AI Crawlers — training bots, search bots, user fetchers
  • How LLMs Choose Which Websites to Recommend — what gets cited and why
  • What Is AI Visibility? — the outcome the AI Bot Accessibility fix protects
  • How to Improve Your AI-Readiness Score — the full prioritized checklist

Sources

  • Otterly, AI Citations Report 2026 (May 2026) — 73% of sites have at least one AI crawler blocked at the robots.txt or CDN layer
  • Google / Gary Illyes on llms.txt (July 2025) — public statement that llms.txt is not used in Search or AI Overviews ranking
  • ALM Corp robots.txt strategy analysis (2025) — independent analysis of 94,000+ URLs finds no measurable citation lift from llms.txt
  • OpenAI GPTBot documentation — official user-agent and IP-range reference
  • Anthropic crawler documentation — official ClaudeBot reference
