iscrawlable runs a public, unauthenticated crawler-readiness check across robots.txt, HTTP responses, indexability headers, sitemap, llms.txt, and WAF/CDN signals. A public scan can answer some questions with high confidence. Others require manual review or a future connected scan with read-only access to your CDN. This page lays out which is which.
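As a rough illustration of the robots.txt part of a public check, the sketch below uses Python's standard `urllib.robotparser` to evaluate a robots.txt body against a few crawler tokens. The sample rules, the token list, the URL, and the helper name `robots_allows` are assumptions made for this example; it is not a description of how iscrawlable is implemented.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt body, used only for this example.
SAMPLE_ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def robots_allows(robots_txt: str, agents: list[str], url: str) -> dict[str, bool]:
    """Return, per agent token, whether the robots.txt rules permit fetching url."""
    parser = RobotFileParser()
    # parse() accepts an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


result = robots_allows(
    SAMPLE_ROBOTS_TXT,
    ["GPTBot", "ClaudeBot", "PerplexityBot"],
    "https://example.com/docs/",
)
print(result)
```

With these sample rules, only the agent named in the first group is disallowed; the others fall through to the wildcard group. A robots.txt result like this says nothing about WAF or CDN rules, which can still block a request that robots.txt permits.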
A public scan probes your site the way an external crawler would. It cannot see configuration that lives behind your CDN dashboard — for example, the Block AI Bots toggle in Cloudflare's AI Crawl Control panel, or custom WAF rules that match on attributes a public probe cannot reproduce.
A connected scan would ask for a read-only API token from your CDN provider so those settings can be read directly. That workflow is not available in the self-serve product yet; current results are based on public checks and support-assisted review.
Our public scan sends requests with the published user agent strings of major AI crawlers. Those requests do not originate from the IP ranges the crawlers actually use, and we do not impersonate verified-bot identities. Sites that gate access by source IP or by verified-bot signature may therefore treat our probe differently from a real crawler. That is a known limit of any public crawler-readiness check.
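To make the distinction concrete, here is a minimal sketch of a probe request that declares a crawler user agent. It only builds the request and does not send it. The user agent string is a made-up placeholder, and `build_probe_request` is a hypothetical helper; a real probe would use the exact strings each vendor publishes.

```python
from urllib.request import Request

# Hypothetical placeholder, NOT a real crawler's published user agent string.
EXAMPLE_CRAWLER_UA = "ExampleBot/1.0 (+https://example.com/bot)"


def build_probe_request(url: str, user_agent: str) -> Request:
    """Build (but do not send) a GET request that declares the given user agent."""
    return Request(url, headers={"User-Agent": user_agent})


probe = build_probe_request("https://example.com/", EXAMPLE_CRAWLER_UA)

# urllib stores header names via str.capitalize(), hence the "User-agent" key.
declared_ua = probe.get_header("User-agent")
print(probe.get_method(), probe.full_url, declared_ua)
```

If this request were sent with `urllib.request.urlopen(probe)`, the header would match the declared crawler, but the connection would still come from the sender's own IP address. A site that checks source IP or a verified-bot signature can tell the two apart.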
OpenAI, Anthropic, Perplexity, and Google publish IP ranges for some of their crawlers. We compare the public response a site returns to our probe against the documented behavior of those crawlers, but we cannot fully replicate IP-based allow-lists from outside the perimeter. If your access policy depends on IP attestation, public scan results should be treated as advisory.
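The kind of IP-based allow-list described above can be sketched with Python's standard `ipaddress` module. The CIDR blocks below are reserved documentation ranges used as stand-ins, not real crawler ranges, and `ip_in_ranges` is a hypothetical helper name.

```python
import ipaddress

# Placeholder CIDRs from the reserved documentation ranges (RFC 5737, RFC 3849).
# These are NOT real crawler ranges.
EXAMPLE_PUBLISHED_RANGES = ["192.0.2.0/24", "2001:db8::/32"]


def ip_in_ranges(ip: str, cidrs: list[str]) -> bool:
    """Return True if ip falls inside any of the given CIDR blocks."""
    address = ipaddress.ip_address(ip)
    # Membership is False when the address and network IP versions differ.
    return any(address in ipaddress.ip_network(cidr) for cidr in cidrs)


# An address inside the placeholder range, standing in for a real crawler.
crawler_like = ip_in_ranges("192.0.2.17", EXAMPLE_PUBLISHED_RANGES)
# An address outside it, standing in for a public probe.
probe_like = ip_in_ranges("198.51.100.5", EXAMPLE_PUBLISHED_RANGES)
print(crawler_like, probe_like)
```

A site that applies a check like this will admit the crawler and reject the probe even when both send the same user agent, which is why public results are advisory for IP-gated policies.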
A future connected Cloudflare scan or support-assisted review should inspect:
Public scans cannot read these settings today. They also cannot see rules at the origin server level (nginx / Apache / application code), or policies at any other layer in front of Cloudflare.
We check only declared Perplexity user agents and public access signals. We cannot verify stealth, third-party, or undeclared crawlers from a public scan. Perplexity's user-triggered agent (Perplexity-User) is shown as auxiliary context only and never counts against the main pass/fail result.
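One way to picture the "auxiliary context only" rule is a verdict function that reports every agent but scores only the non-auxiliary ones. The data model below (`AgentResult`, the `auxiliary` flag, and `overall_pass`) is entirely hypothetical and is meant as a sketch of the idea, not as iscrawlable's scoring logic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentResult:
    agent: str
    allowed: bool
    auxiliary: bool = False  # auxiliary agents are reported but not scored


def overall_pass(results: list[AgentResult]) -> bool:
    """Pass only if at least one agent is scored and every scored agent is allowed."""
    scored = [r for r in results if not r.auxiliary]
    return bool(scored) and all(r.allowed for r in scored)


results = [
    AgentResult("PerplexityBot", allowed=True),
    # Blocked, but flagged auxiliary, so it does not affect the verdict.
    AgentResult("Perplexity-User", allowed=False, auxiliary=True),
]
verdict = overall_pass(results)
print(verdict)
```

In this sketch a blocked Perplexity-User leaves the verdict unchanged, while a blocked PerplexityBot would fail it.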
Pass means crawlers appear allowed by public checks. It does not guarantee citation in ChatGPT, Claude, Perplexity, or Google AI results.