Search-Bot Network Reachability

Score Bands

Verdict	Condition
Pass	all declared search-bot user-agents (OAI-SearchBot, Claude-SearchBot, PerplexityBot) are reachable — no 403 or network block detected at the WAF/CDN layer
Partial	at least one search bot is confirmed network-blocked while at least one is reachable, or reachability is mixed or uncertain for one or more bots
Fail	all probed search-bot user-agents are confirmed network-blocked — every engine is invisible at the network layer

Description

Search-bot network reachability checks whether AI search crawlers can actually reach your origin at the WAF and CDN layer, below robots.txt. friendly4AI probes three search-bot user-agents — OAI-SearchBot (OpenAI), Claude-SearchBot (Anthropic), and PerplexityBot (Perplexity) — and scores pass (100) when all three are reachable, partial (50) when at least one is blocked while another is reachable, and fail (0) when every bot is network-blocked.

What does this parameter measure?

An Allow rule in robots.txt means nothing if the bot's HTTP request never lands on your origin. This parameter looks one layer down, at the WAF and CDN, and asks a simpler question: can the declared AI search-bot user-agents actually get through at the network level?

friendly4AI sends a probe from each of the three search-bot identities and sorts the response into one of three buckets:

REACHABLE — HTTP 200/30x with normal content.
BLOCKED — HTTP 403 or a WAF/CDN challenge page.
INCONCLUSIVE — timeout or ambiguous response.
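The three buckets above can be sketched as a small classifier. This is a minimal illustration, not the scanner's actual logic: the function name classify_probe and the challenge-page markers are assumptions made for the example, since this page does not document how challenge pages are detected.

```python
from typing import Optional

# Hypothetical markers of a WAF/CDN challenge page (assumption for this
# sketch; the scanner's real detection rules are not documented here).
CHALLENGE_MARKERS = ("just a moment", "attention required", "captcha")


def classify_probe(status: Optional[int], body: str = "") -> str:
    """Map one probe response to REACHABLE, BLOCKED, or INCONCLUSIVE.

    status is None when the request timed out or never got a response.
    """
    if status is None:
        return "INCONCLUSIVE"
    text = body.lower()
    looks_like_challenge = any(marker in text for marker in CHALLENGE_MARKERS)
    # A 403 or a challenge interstitial counts as a block.
    if status == 403 or looks_like_challenge:
        return "BLOCKED"
    # 200 or any 3xx redirect with ordinary content counts as reachable.
    if status == 200 or 300 <= status < 400:
        return "REACHABLE"
    # Anything else (for example a 5xx without challenge text) is ambiguous.
    return "INCONCLUSIVE"
```

Checking for challenge text before checking the status code matters in this sketch, because some challenge pages may be served with a 200.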

Each bot's IP ranges are checked against the JSON its engine publishes: OpenAI (openai.com/searchbot.json), Anthropic (claude.com/crawling/bots.json), and Perplexity (perplexity.ai/perplexitybot.json). If every probe comes back inconclusive or times out, the result drops to an advisory UNKNOWN, which is excluded from the score denominator — network ambiguity won't cost you points. But a confirmed block on any search-bot tier can cap your composite score, exactly the way a robots.txt Disallow would for tier-A bots.
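As a rough sketch of the IP-range check described above, the snippet below tests whether an address falls inside a published prefix list. It assumes the JSON has a top-level "prefixes" array whose entries carry an "ipv4Prefix" or "ipv6Prefix" key; verify that against each engine's actual file before relying on it. The sample document uses reserved documentation ranges, not real bot addresses, and the function names are hypothetical.

```python
import ipaddress
import json

# Sample document in the ASSUMED shape; the ranges are reserved
# documentation prefixes, not real crawler addresses.
SAMPLE_RANGES = """
{"prefixes": [
  {"ipv4Prefix": "192.0.2.0/24"},
  {"ipv6Prefix": "2001:db8::/32"}
]}
"""


def load_prefixes(doc: str) -> list:
    """Parse a published-ranges JSON string into ip_network objects."""
    data = json.loads(doc)
    networks = []
    for entry in data.get("prefixes", []):
        cidr = entry.get("ipv4Prefix") or entry.get("ipv6Prefix")
        if cidr:
            networks.append(ipaddress.ip_network(cidr, strict=False))
    return networks


def ip_in_ranges(ip: str, networks: list) -> bool:
    """Return True if ip falls inside any of the given networks."""
    addr = ipaddress.ip_address(ip)
    # Membership is False when address and network IP versions differ.
    return any(addr in net for net in networks)
```

The same helper could be pointed at server logs to confirm that a request carrying a search-bot user-agent really came from that engine's published ranges.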

Why does network reachability matter for AI-readiness?

WAF and CDN bot-management tools tend to block generic bot user-agents and unfamiliar IP ranges out of the box. So if OAI-SearchBot, Claude-SearchBot, or PerplexityBot gets stopped at the network layer, that engine never fetches your pages, no matter how permissive your robots.txt is.

The failure is silent. Your robots.txt says "allowed," the crawler hits a 403, and it gives up. That content never reaches the engine's index, so it never shows up in AI-generated answers on ChatGPT, Claude.ai, or Perplexity. This parameter surfaces that gap and puts a score on it. It pairs with robots-txt-accessibility, which checks the robots.txt declaration only — not whether the request actually arrives.

How is the score calculated?

Under the v4.5 methodology, this Crawlability parameter uses a three-tier gradient driven by confirmed probe outcomes:

Pass (100) — all probed search-bot UAs are REACHABLE; no 403 or cloaking signals detected at the WAF/CDN layer.
Partial (50) — at least one bot is confirmed BLOCKED but at least one is REACHABLE, or reachability is mixed or uncertain across bots.
Fail (0) — every probed bot UA is confirmed BLOCKED; all AI search engines are network-invisible.
UNKNOWN (advisory) — all probes are INCONCLUSIVE or time out; excluded from the score denominator, not treated as a scored 0.

A confirmed block also trips a score cap. For the composite cap calculation, a network-blocked search bot counts the same as a Disallow: / in a tier-A robots.txt.
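The gradient above can be written out as a short function. This is one reading of the rules on this page, not the published v4.5 implementation: the names score_reachability and trips_cap are hypothetical, and treating "some blocked, the rest inconclusive" as Partial is an interpretation of the "mixed or uncertain" wording.

```python
from typing import Dict, Optional


def score_reachability(outcomes: Dict[str, str]) -> Optional[int]:
    """Return 100, 50, or 0, or None for the advisory UNKNOWN result.

    outcomes maps a bot name to REACHABLE, BLOCKED, or INCONCLUSIVE.
    """
    results = list(outcomes.values())
    # UNKNOWN: nothing conclusive, so the parameter is left out of the
    # score denominator instead of being scored as 0.
    if not results or all(r == "INCONCLUSIVE" for r in results):
        return None
    if all(r == "REACHABLE" for r in results):
        return 100
    if all(r == "BLOCKED" for r in results):
        return 0
    # Any other combination is mixed or uncertain.
    return 50


def trips_cap(outcomes: Dict[str, str]) -> bool:
    """A single confirmed block is enough to trip the composite cap."""
    return "BLOCKED" in outcomes.values()
```

For example, two reachable bots plus one blocked bot score 50 and trip the cap, while three timeouts return None and leave the composite score untouched.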

How do I fix a blocked search bot?

Open your WAF or CDN bot-management console (Cloudflare Bot Management, AWS WAF Managed Rules, Akamai Bot Manager, or equivalent) and add explicit allowlist rules for the OAI-SearchBot, Claude-SearchBot, and PerplexityBot user-agent strings.
Put the official IP ranges from each engine's published JSON on your allowlist, so IP-level blocking doesn't trip up requests that already carry the correct UA.
Using Cloudflare's "Block bots" or "Managed Challenge" rules? Make sure AI search crawlers sit outside the managed-challenge scope. A challenge that returns a JavaScript interstitial reads as BLOCKED to the scanner.
Test each bot UA by hand: curl -A "OAI-SearchBot/1.0" https://yourdomain.com should return a 200 with your normal page content, not a Cloudflare or WAF error page.
Re-scan once your WAF rules are updated to confirm all three bots now come back REACHABLE.
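The manual curl check above can be repeated for all three bots with a short script. This sketch uses only the Python standard library. "OAI-SearchBot/1.0" is the string from the curl example; the other two values are bare product tokens used as placeholders, since the full user-agent strings are not given here. The names build_probe and probe are hypothetical.

```python
import urllib.error
import urllib.request

# Placeholder user-agent values (assumption): substitute the full strings
# each engine documents if your WAF rules match on more than the token.
SEARCH_BOT_UAS = {
    "OAI-SearchBot": "OAI-SearchBot/1.0",
    "Claude-SearchBot": "Claude-SearchBot",
    "PerplexityBot": "PerplexityBot",
}


def build_probe(url: str, user_agent: str) -> urllib.request.Request:
    """Build a GET request that presents the given bot user-agent."""
    return urllib.request.Request(
        url, headers={"User-Agent": user_agent}, method="GET"
    )


def probe(url: str, user_agent: str, timeout: float = 10.0):
    """Return (status, body_snippet), or (None, "") on timeout or error."""
    request = build_probe(url, user_agent)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as resp:
            return resp.status, resp.read(2048).decode("utf-8", "replace")
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses still carry a status code and a body.
        return err.code, err.read(2048).decode("utf-8", "replace")
    except OSError:
        # Covers URLError, connection failures, and socket timeouts.
        return None, ""
```

Calling probe("https://yourdomain.com", ua) for each value in SEARCH_BOT_UAS shows the status code each identity receives. Like the curl test, this runs from your own IP address, so it exercises user-agent rules only; it cannot confirm that the engines' published IP ranges are allowed.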

