Robots Disallow Checker

Chrome Extension

Audit how Google indexes URLs blocked by your robots.txt

Disallow rules don't always keep URLs out of the index. This extension turns every disallow pattern into a Google query and tells you what Google still has.

Add to Chrome | View on GitHub

What it does

A Disallow rule blocks crawling, not indexing, so pages you disallow in robots.txt can still appear in Google's index, often because they were discovered through external links. Robots Disallow Checker reads your robots.txt, converts every rule into a targeted site: query, and reports how many URLs Google has indexed for each pattern. You get a per-rule picture of the gap between what you meant to block and what Google still holds.
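As a rough illustration of the rule-to-query step, here is a minimal sketch. The function name `disallowToQuery` and the operator mapping (`site:` for the literal prefix, `filetype:` for suffix rules, `inurl:` for fragments after a wildcard) are assumptions made for this example, not the extension's actual logic.

```javascript
// Hypothetical helper: turn one Disallow pattern into a Google query string.
// The operator mapping is an assumption for illustration only.
function disallowToQuery(host, pattern) {
  // Filetype suffix rules such as "/*.pdf$" or "*.pdf".
  const filetype = pattern.match(/^\/?\*\.([a-z0-9]+)\$?$/i);
  if (filetype) {
    return `site:${host} filetype:${filetype[1].toLowerCase()}`;
  }

  // Search operators cannot express a "$" anchor or a trailing "*", so drop them.
  const trimmed = pattern.replace(/\$$/, "").replace(/\*+$/, "");

  // Plain path prefix: site: accepts a host followed by a path prefix.
  if (!trimmed.includes("*")) {
    return `site:${host}${trimmed}`;
  }

  // Mid-path wildcard: keep the literal prefix in site:, and pass the
  // remaining literal fragments to inurl: (slashes trimmed).
  const [prefix, ...rest] = trimmed.split("*");
  const fragments = rest
    .map((f) => f.replace(/^\/+|\/+$/g, ""))
    .filter(Boolean)
    .map((f) => `inurl:${f}`);
  return [`site:${host}${prefix}`, ...fragments].join(" ");
}

console.log(disallowToQuery("example.com", "/private/"));
console.log(disallowToQuery("example.com", "/*.pdf$"));
console.log(disallowToQuery("example.com", "/shop/*/cart"));
```

Because the query drops anchors and wildcards that search operators cannot express, it can match more URLs than the original rule does, which is one reason a count may only be approximate.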

Features

11 pattern types

Supported patterns include plain path prefixes, filetype suffixes (*.pdf), mid-path wildcards, query strings, URL-encoded UTF-8 paths, and Allow exceptions (up to 10 per query).

Google reference parser

Pattern matching follows the same rules as Google's open-source robotstxt parser, so what the extension flags is what Googlebot would actually match.
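The matching rules Google documents for robots.txt are compact: `*` matches any run of characters, a trailing `$` anchors the end of the URL, the longest matching pattern wins, and Allow wins a tie. The sketch below is an independent regex-based illustration of those rules, not code from Google's parser or from the extension; the names `patternToRegExp` and `isAllowed` and the rule object shape are hypothetical.

```javascript
// Convert a robots.txt path pattern into a RegExp.
// "*" matches any run of characters; a trailing "$" anchors the end.
function patternToRegExp(pattern) {
  const anchored = pattern.endsWith("$");
  const body = anchored ? pattern.slice(0, -1) : pattern;
  const source = body
    .split("*")
    // Escape regex metacharacters in the literal parts.
    .map((part) => part.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"))
    .join(".*");
  return new RegExp("^" + source + (anchored ? "$" : ""));
}

// rules: array of { type: "allow" | "disallow", pattern: string } (assumed shape).
// The longest matching pattern wins; on equal length, allow wins.
function isAllowed(path, rules) {
  let best = null;
  for (const rule of rules) {
    if (!patternToRegExp(rule.pattern).test(path)) continue;
    const longer = best === null || rule.pattern.length > best.pattern.length;
    const tieGoesToAllow =
      best !== null &&
      rule.pattern.length === best.pattern.length &&
      rule.type === "allow";
    if (longer || tieGoesToAllow) best = rule;
  }
  // No matching rule means the path is allowed.
  return best === null || best.type === "allow";
}

const sampleRules = [
  { type: "disallow", pattern: "/private/" },
  { type: "allow", pattern: "/private/public.html" },
  { type: "disallow", pattern: "/*.pdf$" },
];
console.log(isAllowed("/private/notes.html", sampleRules)); // false
console.log(isAllowed("/private/public.html", sampleRules)); // true
console.log(isAllowed("/docs/a.pdf?x=1", sampleRules)); // true
```

The last call returns true because `$` requires the URL to end in `.pdf`, and the query string breaks that anchor.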

Audit controls

Stop and resume scans at any point, rescan robots.txt on demand, and inspect individual queries without losing progress.

Exports

One-click export to TSV (for spreadsheets), Markdown (for reports), or debug JSON (for raw data).
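For a sense of what the two tabular exports could look like, here is a small sketch. The row shape (`pattern`, `query`, `count`, `approximate`) and the column layout are assumptions for illustration; the extension's real export format may differ.

```javascript
// Assumed row shape: { pattern, query, count, approximate }.
function formatCount(row) {
  // Approximate counts carry a "~" prefix.
  return row.approximate ? `~${row.count}` : String(row.count);
}

function toTsv(rows) {
  // Tabs and newlines inside a value would break the column layout.
  const clean = (value) => String(value).replace(/[\t\r\n]+/g, " ");
  const header = ["pattern", "query", "count"].join("\t");
  const lines = rows.map((r) =>
    [r.pattern, r.query, formatCount(r)].map(clean).join("\t")
  );
  return [header, ...lines].join("\n");
}

function toMarkdown(rows) {
  // Wrap values in backticks so "*" is not read as emphasis,
  // and escape "|" so it does not split the table cell.
  const cell = (value) => "`" + String(value).replace(/\|/g, "\\|") + "`";
  const lines = rows.map(
    (r) => `| ${cell(r.pattern)} | ${cell(r.query)} | ${formatCount(r)} |`
  );
  return ["| Pattern | Query | Count |", "| --- | --- | ---: |", ...lines].join("\n");
}

const sampleRows = [
  { pattern: "/private/", query: "site:example.com/private/", count: 12, approximate: true },
];
console.log(toTsv(sampleRows));
console.log(toMarkdown(sampleRows));
```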

Local-only storage

All data stays in your browser. A 7-day cache TTL keeps repeat audits fast without sending anything to a server.
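A cache with a 7-day TTL can be sketched in a few lines. This example uses an in-memory `Map` so it runs anywhere; an extension would more likely persist entries in browser storage, and the names `createCache` and `TTL_MS` are hypothetical.

```javascript
// 7 days in milliseconds.
const TTL_MS = 7 * 24 * 60 * 60 * 1000;

// store: any Map-like object; now: injectable clock, useful for tests.
function createCache(store = new Map(), now = () => Date.now()) {
  return {
    set(key, value) {
      store.set(key, { value, savedAt: now() });
    },
    get(key) {
      const entry = store.get(key);
      if (!entry) return undefined;
      // Entries older than the TTL are dropped and treated as a miss.
      if (now() - entry.savedAt > TTL_MS) {
        store.delete(key);
        return undefined;
      }
      return entry.value;
    },
  };
}

const cache = createCache();
cache.set("site:example.com/private/", 12);
console.log(cache.get("site:example.com/private/")); // 12
```

Keying the cache on the query string means a rule that has not changed since the last audit can reuse its stored count instead of triggering a new request.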

No build, no tracking

Vanilla JavaScript ES modules. No analytics, no telemetry, no external dependencies. Full test suite in the repo.

Honest about limits. Counts marked ~ are approximate. Google's search operators don't perfectly mirror robots.txt semantics, so the extension biases toward false positives over false negatives. Large audits (50+ rules) run serially with delays of 10 to 30 seconds between requests, and very large batches may trigger a CAPTCHA.
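The serial pacing described above can be sketched as a loop that waits a random 10 to 30 seconds between requests and checks a stop flag before each one. The names `pickDelayMs` and `runSerially`, the `fetchCount` callback, and the options object are all hypothetical; this is not the extension's scheduler.

```javascript
// Pick a delay between 10,000 and 30,000 ms inclusive.
// random is injectable so the function can be tested deterministically.
function pickDelayMs(random = Math.random) {
  return 10000 + Math.floor(random() * 20001);
}

// Run queries one at a time. fetchCount(query) is a caller-supplied async
// function that returns the result count for a single query.
async function runSerially(queries, fetchCount, options = {}) {
  const sleep =
    options.sleep ?? ((ms) => new Promise((resolve) => setTimeout(resolve, ms)));
  const shouldStop = options.shouldStop ?? (() => false);
  const results = [];
  for (let i = 0; i < queries.length; i++) {
    // Stopping keeps everything collected so far.
    if (shouldStop()) break;
    results.push({ query: queries[i], count: await fetchCount(queries[i]) });
    // No need to wait after the final request.
    if (i < queries.length - 1) await sleep(pickDelayMs());
  }
  return results;
}
```

Under these assumptions, resuming a stopped scan amounts to calling `runSerially` again with the queries that do not yet have a result.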
