Built a CLI that tells me if GPTBot/ClaudeBot/Perplexity can actually reach my site (and where the block is)
I kept getting "your AI visibility is low" reports from various tools that wouldn't tell me *why*. Was the block in robots.txt? At the CDN? At origin? Different fixes, different teams.

I guess this sits somewhere in the "generative engine optimization" bucket, but I wanted the tool to stay very concrete: can these crawlers reach the site, and if not, where are they being blocked?

So I wrote a small Node CLI that answers that question deterministically:

```
npx @geosuite/ai-crawler-bots robots https://my-site.com
```

What it actually does:

- Parses robots.txt with line-level provenance: when a bot is Disallow'd, it tells me *which line in which group*.
- For each tracked bot (24 right now: GPTBot, ChatGPT-User, OAI-SearchBot, ClaudeBot, PerplexityBot, Perplexity-User, Bytespider, etc.), reports the verdict.
- Detects Cloudflare's "Managed Content" markers (`# BEGIN Cloudflare Managed content` … `# END`) and tells me whether my own rules would've allowed the bot.
- Also has a `check <url>` mode that sends a real HTTP probe with each bot's UA and distinguishes edge blocks (CDN fingerprints) from origin blocks, since the two need different remediation.

Zero runtime dependencies, MIT, Node 20+.

Source: github.com/TryGeoSuite/ai-crawler-bots

There are three companion tools in the same scope:

- `@geosuite/schema-templates`: 23 schema.org JSON-LD templates + offline validator.
- `@geosuite/llms-txt-generator`: sitemap.xml → llms.txt.
- `@geosuite/sitemap-builder`: crawls a custom site that has no sitemap and produces a valid sitemap.xml.

Honest disclaimer: I also build a hosted SaaS (trygeosuite.it) on top of similar logic, but the four CLIs are MIT and stand alone. I open-sourced them because I find it dishonest to sell a black box that does things any dev can verify.

Curious what other people are using to debug AI bot reachability, especially anyone running through Cloudflare, Akamai, or Vercel. The "managed content" injection broke my mental model the first time I hit it.
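For anyone wondering what "line-level provenance" looks like in practice, here is a minimal sketch of the idea. It is not the code from the repo: `parseRobots` and `verdict` are hypothetical names, user-agent matching is a plain case-insensitive equality check, and path matching is prefix-only (no `*` or `$` wildcards), so treat it as an illustration under those assumptions rather than a full robots.txt implementation.

```javascript
// Hypothetical sketch, NOT the actual @geosuite/ai-crawler-bots implementation.
// Splits robots.txt into groups and records the 1-based line number of every rule.
function parseRobots(text) {
  const groups = [];
  let current = null;
  let lastWasAgent = false;
  text.split(/\r?\n/).forEach((raw, i) => {
    const line = raw.replace(/#.*$/, "").trim(); // drop comments
    if (!line) return;
    const idx = line.indexOf(":");
    if (idx === -1) return;
    const field = line.slice(0, idx).trim().toLowerCase();
    const value = line.slice(idx + 1).trim();
    if (field === "user-agent") {
      // Consecutive User-agent lines share one group.
      if (!lastWasAgent) {
        current = { agents: [], rules: [], startLine: i + 1 };
        groups.push(current);
      }
      current.agents.push(value.toLowerCase());
      lastWasAgent = true;
    } else if ((field === "allow" || field === "disallow") && current) {
      current.rules.push({ type: field, path: value, line: i + 1 });
      lastWasAgent = false;
    }
    // Other fields (Sitemap, Crawl-delay, ...) are ignored in this sketch.
  });
  return groups;
}

// Returns whether `bot` may fetch `path`, plus the line that decided it.
// Assumption: longest matching prefix wins, and Allow wins a tie.
function verdict(groups, bot, path = "/") {
  const name = bot.toLowerCase();
  let matching = groups.filter((g) => g.agents.includes(name));
  if (matching.length === 0) {
    matching = groups.filter((g) => g.agents.includes("*")); // fall back to wildcard group
  }
  let best = null;
  for (const group of matching) {
    for (const rule of group.rules) {
      if (rule.path === "") continue; // an empty rule matches nothing
      if (!path.startsWith(rule.path)) continue;
      const longer = !best || rule.path.length > best.rule.path.length;
      const tieAllow =
        best && rule.path.length === best.rule.path.length && rule.type === "allow";
      if (longer || tieAllow) best = { rule, group };
    }
  }
  if (!best) return { allowed: true, line: null, groupLine: null };
  return {
    allowed: best.rule.type === "allow",
    line: best.rule.line,
    groupLine: best.group.startLine,
  };
}

// Usage: which line blocks GPTBot from the homepage?
const sampleRobots = [
  "User-agent: *",
  "Disallow: /private",
  "",
  "User-agent: GPTBot",
  "User-agent: ClaudeBot",
  "Disallow: /",
  "Allow: /public/",
].join("\n");
console.log(verdict(parseRobots(sampleRobots), "GPTBot", "/"));
```

The returned `line` and `groupLine` are what make the verdict actionable: instead of "blocked", you get the exact rule and the group it belongs to.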
from Search Engine Optimization: The Latest SEO News https://ift.tt/6EfYB4J