iscrawlable runs a public, unauthenticated crawler-readiness check across robots.txt, HTTP responses, indexability headers, sitemap, llms.txt, and WAF/CDN signals. A public scan can answer some questions with high confidence. Others need a connected scan with read-only access to your CDN. This page lays out which is which.
A public scan probes your site the way an external crawler would. It cannot see configuration that lives behind your CDN dashboard — for example, the Block AI Bots toggle in Cloudflare's AI Crawl Control panel, or custom WAF rules that match on attributes a public probe cannot reproduce.
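As an illustration of what an outside probe can evaluate, the sketch below parses a robots.txt body and asks whether a given crawler token may fetch a URL. It uses Python's standard `urllib.robotparser`. The robots.txt content, the URL, and the helper name `allowed_by_robots` are made up for the example, and this is not necessarily how iscrawlable implements its own check.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt body; a real scan would fetch it from the site.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

def allowed_by_robots(robots_body: str, agent: str, url: str) -> bool:
    """Return True if robots_body permits `agent` to fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_body.splitlines())
    return parser.can_fetch(agent, url)

gptbot_ok = allowed_by_robots(ROBOTS_TXT, "GPTBot", "https://example.com/page")
claudebot_ok = allowed_by_robots(ROBOTS_TXT, "ClaudeBot", "https://example.com/page")
```

A result like this reflects only the published robots.txt. It says nothing about CDN or WAF rules that may block the same crawler at a different layer.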
A connected scan asks for a read-only API token from your CDN provider so we can read those settings directly. We use the token only to read configuration, never to change it. The connected scan is a Pro feature.
Our public scan sends requests with the published user agent strings of major AI crawlers. We do not originate from the IP ranges those crawlers actually use, and we do not impersonate verified-bot identities. Sites that gate access by source IP or by verified-bot signature may treat our probe differently from a real crawler. That is a known limit of any public crawler-readiness check.
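A minimal sketch of the probing idea: one request per crawler, differing only in the User-Agent header. The user agent values here are abbreviated placeholders, not the full strings each vendor publishes, and `build_probes` is a hypothetical helper rather than part of iscrawlable. The requests are built but not sent.

```python
from urllib.request import Request

# Hypothetical, abbreviated user agent values for illustration only.
# A real probe should use the full strings each vendor publishes.
PROBE_USER_AGENTS = {
    "GPTBot": "Mozilla/5.0 (compatible; GPTBot/1.0)",
    "ClaudeBot": "Mozilla/5.0 (compatible; ClaudeBot/1.0)",
    "PerplexityBot": "Mozilla/5.0 (compatible; PerplexityBot/1.0)",
}

def build_probes(url: str) -> dict[str, Request]:
    """Build one unsent GET request per crawler user agent."""
    return {
        name: Request(url, headers={"User-Agent": ua}, method="GET")
        for name, ua in PROBE_USER_AGENTS.items()
    }

probes = build_probes("https://example.com/")
```

Sending any of these requests would still originate from the scanner's own IP address, so a site that gates by source IP or verified-bot signature can tell the probe apart from the real crawler.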
OpenAI, Anthropic, Perplexity, and Google publish IP ranges for some of their crawlers. We compare the public response a site returns to our probe against the documented behavior of those crawlers, but we cannot fully replicate IP-based allow-lists from outside the perimeter. If your access policy depends on IP attestation, a connected scan is the only way to verify the rule end-to-end.
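To make the IP-based limit concrete, here is a rough sketch of the kind of allow-list test a site might apply, using Python's standard `ipaddress` module. The CIDR blocks are reserved documentation ranges standing in for a vendor's published list, and `ip_in_ranges` is a hypothetical helper.

```python
import ipaddress

# Placeholder CIDR blocks (reserved documentation ranges), standing in
# for the IP ranges a crawler vendor publishes.
PUBLISHED_RANGES = ["192.0.2.0/24", "2001:db8::/32"]

def ip_in_ranges(ip: str, cidrs: list[str]) -> bool:
    """Return True if `ip` falls inside any of the given CIDR blocks."""
    addr = ipaddress.ip_address(ip)
    return any(addr in ipaddress.ip_network(cidr) for cidr in cidrs)

crawler_like = ip_in_ranges("192.0.2.10", PUBLISHED_RANGES)   # inside a listed block
scanner_like = ip_in_ranges("203.0.113.5", PUBLISHED_RANGES)  # outside every block
```

A probe arriving from an address outside the listed blocks fails this test regardless of its user agent, which is why a public scan cannot confirm that such a rule admits the real crawler.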
If you connect Cloudflare with a read-only API token, we can additionally inspect:
What we still cannot see, even with a Cloudflare token: rules at the origin server level (nginx / Apache / application code), and policies at any other layer in front of Cloudflare. We also do not modify any settings — this scan is read-only by contract.
We check only declared Perplexity user agents and public access signals. We cannot verify stealth, third-party, or undeclared crawlers from a public scan. Perplexity's user-triggered agent (Perplexity-User) is shown as auxiliary context only and never counts against the main pass/fail result.
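The auxiliary rule can be pictured as follows. This is a sketch under assumptions, not the actual scoring logic: the `AgentResult` type, its field names, and the `overall_pass` function are invented for illustration.

```python
from dataclasses import dataclass

@dataclass
class AgentResult:
    agent: str               # crawler token, e.g. "PerplexityBot"
    allowed: bool            # whether public checks suggest access is allowed
    auxiliary: bool = False  # shown for context, excluded from the verdict

def overall_pass(results: list[AgentResult]) -> bool:
    """Pass only if at least one primary agent exists and all are allowed."""
    primary = [r for r in results if not r.auxiliary]
    return bool(primary) and all(r.allowed for r in primary)

results = [
    AgentResult("PerplexityBot", allowed=True),
    AgentResult("Perplexity-User", allowed=False, auxiliary=True),
]
verdict = overall_pass(results)
```

Here the blocked Perplexity-User entry is still reported, but it does not change the verdict because it is marked auxiliary.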
Pass means crawlers appear allowed by public checks. It does not guarantee citation in ChatGPT, Claude, Perplexity, or Google AI results.