geo_auditor — manual — read-only
███▄ ▄███▓ ▄▄▄ ███▄ █ █ ██ ▄▄▄ ██▓
▓██▒▀█▀ ██▒▒████▄ ██ ▀█ █ ██ ▓██▒▒████▄ ▓██▒
▓██ ▓██░▒██ ▀█▄ ▓██ ▀█ ██▒▓██ ▒██░▒██ ▀█▄ ▒██░
▒██ ▒██ ░██▄▄▄▄██ ▓██▒ ▐▌██▒▓▓█ ░██░░██▄▄▄▄██ ▒██░
▒██▒ ░██▒ ▓█ ▓██▒▒██░ ▓██░▒▒█████▓ ▓█ ▓██▒░██████▒
░ ▒░ ░ ░ ▒▒ ▓▒█░░ ▒░ ▒ ▒ ░▒▓▒ ▒ ▒ ▒▒ ▓▒█░░ ▒░▓ ░
░ ░ ░ ▒ ▒▒ ░░ ░░ ░ ▒░░░▒░ ░ ░ ▒ ▒▒ ░░ ░ ▒ ░
░ ░ ░ ▒ ░ ░ ░ ░░░ ░ ░ ░ ▒ ░ ░
░ ░ ░ ░ ░ ░ ░ ░ ░
OVERVIEW
The GEO Auditor measures whether a website is easy for modern AI systems to access, interpret, and cite. It is built for teams who need a practical view of AI readiness, not just classic search visibility.
In the web product, audits run in the background, stream progress live to the browser, and finish with stakeholder-friendly outputs such as an Excel workbook, an optional PDF, a topic map, a chunk heatmap, schema recommendations, and, when requested, a generated llms.txt file.
WHAT THE SYSTEM CHECKS
The audit combines site-wide, page-level, and crawl-level signals:
- Robots and AI Access: Checks crawler rules in robots.txt, and checks whether llms.txt is missing or, if present, thin, weak, or genuinely useful (see the probe sketch after this list).
- Schema, Metadata, and HTML Structure: Measures how clearly the page explains what it is, what it covers, and how the content is organized.
- Content Quality: Looks at depth, formatting, and semantic coherence so the page is not just indexable, but also quotable.
- Entities: Checks whether brands, people, places, and products can be extracted cleanly and mapped to canonical references such as Wikidata.
- Chunk Citability: Scores whether individual passages are self-contained and useful enough to be lifted into an AI answer.
- Authority: Combines domain age, entity schema strength, external brand signals, grounding-page signals, and optional backlink proxy data.
- Agentic Readiness: Looks for practical signals such as WebMCP annotations and a UCP discovery file at /.well-known/ucp.
- Harmonic Centrality: In crawl mode, the system checks which pages are central or peripheral in the discovered internal link graph (see the centrality sketch after this list).
- Rendering Consistency: When both raw and JS-rendered HTML are collected, the audit shows whether crawlers and users are effectively seeing the same content.
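The robots and llms.txt check can be approximated with a short probe. The following is a minimal sketch, not the product's actual implementation: it assumes the origin is reachable over HTTPS, uses the standard-library robots.txt parser plus the third-party requests package, and the AI_CRAWLERS user-agent list and returned field names are illustrative only. It also probes the /.well-known/ucp discovery file mentioned under Agentic Readiness.

    import urllib.robotparser
    import requests

    # Example AI user agents; the real audit may check a different list.
    AI_CRAWLERS = ["GPTBot", "ClaudeBot", "PerplexityBot", "Google-Extended"]

    def probe_access(origin: str) -> dict:
        """Check robots.txt rules for common AI crawlers and probe llms.txt / UCP discovery."""
        rp = urllib.robotparser.RobotFileParser()
        rp.set_url(f"{origin}/robots.txt")
        rp.read()
        allowed = {ua: rp.can_fetch(ua, origin + "/") for ua in AI_CRAWLERS}

        llms = requests.get(f"{origin}/llms.txt", timeout=10)
        ucp = requests.get(f"{origin}/.well-known/ucp", timeout=10)
        return {
            "ai_crawlers_allowed": allowed,
            "llms_txt_present": llms.status_code == 200 and bool(llms.text.strip()),
            "llms_txt_bytes": len(llms.content) if llms.ok else 0,  # "thin" vs "useful" needs deeper parsing
            "ucp_discovery_present": ucp.status_code == 200,
        }

    # Example: probe_access("https://example.com")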
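The crawl-mode centrality check can be reproduced on any internal link edge list with networkx, which ships a standard harmonic_centrality implementation. The sketch below assumes the crawler emits (source URL, target URL) pairs; the product's own graph pipeline may differ.

    import networkx as nx

    def rank_pages(internal_links: list[tuple[str, str]]) -> list[tuple[str, float]]:
        graph = nx.DiGraph(internal_links)
        scores = nx.harmonic_centrality(graph)
        # High scores indicate central hub pages; low scores indicate peripheral or orphaned pages.
        return sorted(scores.items(), key=lambda item: item[1], reverse=True)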
HOW A WEB AUDIT RUNS
- 1. URL Resolution: The system resolves the audit set from sitemap discovery, scoped crawl mode, exact uploaded URLs, or uploaded seed URLs.
- 2. Site-wide Checks: Domain-level signals such as robots and authority run once and are reused across the pages in the same job.
- 3. Background Page Audits: Each page is processed by the worker queue while the browser receives live progress updates (see the sketch after this list).
- 4. Finalization: The system aggregates scores, prepares factor summaries, and writes the final report artifacts.
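The queue and event names are internal to the product, so the following only illustrates the general shape of "background page audits with live progress updates" using asyncio. audit_page and report_progress are hypothetical stand-ins; the real system streams progress to the browser rather than printing it.

    import asyncio

    async def audit_page(url: str) -> dict:
        await asyncio.sleep(0)              # placeholder for the real per-page checks
        return {"url": url, "score": None}

    async def run_job(urls: list[str], report_progress) -> list[dict]:
        results = []
        for done, url in enumerate(urls, start=1):
            results.append(await audit_page(url))
            report_progress(done, len(urls))  # e.g. pushed to the browser as a progress event
        return results

    # asyncio.run(run_job(["https://example.com/"], lambda d, t: print(f"{d}/{t}")))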
INPUT MODES
- Direct URL: Start from a single site or page in the main input field.
- Exact URLs: Upload a file and audit only the listed URLs.
- Seed Crawl: Upload one or more starting URLs and let the crawler discover additional pages from there.
- Scope Controls: Use max URLs, JS sample size, optional path scope, and optional locale filters to shape the run.
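As a rough illustration of the scope controls above, a run configuration might be modeled like this; the field names and defaults are assumptions, not the product's real parameters.

    from dataclasses import dataclass, field

    @dataclass
    class AuditScope:
        max_urls: int = 100                     # hard cap on pages audited
        js_sample_size: int = 10                # how many pages also get JS-rendered
        path_scope: str | None = None           # e.g. "/docs/" to restrict the crawl
        locales: list[str] = field(default_factory=list)  # e.g. ["en", "de"]; empty means all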
OUTPUTS
- Excel Workbook: The main technical export for the job.
- PDF Report: Optional stakeholder-facing summary.
- schemas.json: Recommended schema templates based on the audited domain.
- Topic Map: An HTML graph showing page and entity relationships.
- Chunk Heatmap: A visual, pass/fail-style view of citability by page section.
- llms.txt: Optionally generated only when requested and only when the audited site does not already expose one.
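The llms.txt rule above (generate only on request, and only when the site does not already expose one) boils down to a simple guard. A minimal sketch, with the generator left as a hypothetical callback:

    import requests

    def maybe_generate_llms_txt(origin: str, requested: bool, build_llms_txt) -> str | None:
        if not requested:
            return None
        existing = requests.get(f"{origin}/llms.txt", timeout=10)
        if existing.status_code == 200 and existing.text.strip():
            return None                      # the site already exposes one; do not overwrite
        return build_llms_txt(origin)        # hypothetical generator callback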
HOW TO READ RESULTS
- Weighted factors move the headline score.
- Supporting factors add context but do not change the headline score.
- Diagnostic factors explain technical risk.
- "Not measured" does not automatically mean failure. It can also mean the signal was deliberately not collected for that run.
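As a toy illustration of how the tiers combine, the sketch below counts only weighted factors toward the headline score, excludes supporting and diagnostic factors, and treats a "not measured" factor as absent rather than as zero. Factor names, weights, and the formula itself are made up for the example, not the product's exact scoring.

    def headline_score(factors: dict[str, dict]) -> float:
        weighted = {k: v for k, v in factors.items()
                    if v["tier"] == "weighted" and v["score"] is not None}
        total_weight = sum(v["weight"] for v in weighted.values())
        if total_weight == 0:
            return 0.0
        return sum(v["score"] * v["weight"] for v in weighted.values()) / total_weight

    # A "not measured" factor (score=None) drops out instead of counting as 0,
    # and the diagnostic factor is reported elsewhere but never moves the headline.
    print(headline_score({
        "schema":    {"tier": "weighted",   "weight": 2.0, "score": 80.0},
        "entities":  {"tier": "weighted",   "weight": 1.0, "score": None},   # not measured
        "rendering": {"tier": "diagnostic", "weight": 1.0, "score": 40.0},   # excluded from headline
    }))  # -> 80.0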
● End of file. Click the AUDIT TERMINAL link in the sidebar to return.