geo_auditor — manual — read-only
███▄ ▄███▓ ▄▄▄ ███▄ █ █ ██ ▄▄▄ ██▓
▓██▒▀█▀ ██▒▒████▄ ██ ▀█ █ ██ ▓██▒▒████▄ ▓██▒
▓██ ▓██░▒██ ▀█▄ ▓██ ▀█ ██▒▓██ ▒██░▒██ ▀█▄ ▒██░
▒██ ▒██ ░██▄▄▄▄██ ▓██▒ ▐▌██▒▓▓█ ░██░░██▄▄▄▄██ ▒██░
▒██▒ ░██▒ ▓█ ▓██▒▒██░ ▓██░▒▒█████▓ ▓█ ▓██▒░██████▒
░ ▒░ ░ ░ ▒▒ ▓▒█░░ ▒░ ▒ ▒ ░▒▓▒ ▒ ▒ ▒▒ ▓▒█░░ ▒░▓ ░
░ ░ ░ ▒ ▒▒ ░░ ░░ ░ ▒░░░▒░ ░ ░ ▒ ▒▒ ░░ ░ ▒ ░
░ ░ ░ ▒ ░ ░ ░ ░░░ ░ ░ ░ ▒ ░ ░
░ ░ ░ ░ ░ ░ ░ ░ ░
OVERVIEW
The GEO Auditor measures whether a website is easy for modern AI systems to access, interpret, and cite. It is built for teams who need a practical view of AI readiness, not just classic search visibility.
In the web product, audits run in the background, stream progress live to the browser, and finish with stakeholder-friendly outputs such as an Excel workbook, an optional PDF, a topic map, a chunk heatmap, schema recommendations, and, when requested, a generated llms.txt file.
WHAT THE SYSTEM CHECKS
The audit combines site-wide, page-level, and crawl-level signals:
- Robots and AI Access: Checks crawler rules in robots.txt, and checks whether llms.txt is missing or, if present, thin, weak, or genuinely useful (see the probe sketch after this list).
- Schema, Metadata, and HTML Structure: Measures how clearly the page explains what it is, what it covers, and how the content is organized.
- Content Quality: Looks at depth, formatting, and semantic coherence so the page is not just indexable, but also quotable.
- Entities: Checks whether brands, people, places, and products can be extracted cleanly and mapped to canonical references such as Wikidata.
- Chunk Citability: Scores whether individual passages are self-contained and useful enough to be lifted into an AI answer.
- Authority: Combines domain age, entity schema strength, external brand signals, grounding-page signals, and optional backlink proxy data.
- Agentic Readiness: Looks for practical signals such as WebMCP annotations and a UCP discovery file at /.well-known/ucp.
- Harmonic Centrality: In crawl mode, the system checks which pages are central or peripheral in the discovered internal link graph (see the centrality sketch after this list).
- Rendering Consistency: When both raw and JS-rendered HTML are collected, the audit shows whether crawlers and users are effectively seeing the same content.
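The robots and llms.txt check can be approximated with a short probe. The following is a minimal sketch, not the product's actual implementation: it assumes the origin is reachable over HTTPS, uses the standard-library robots.txt parser plus the third-party requests package, and the AI_CRAWLERS user-agent list and returned field names are illustrative only. It also probes the /.well-known/ucp discovery file mentioned under Agentic Readiness.

    import urllib.robotparser
    import requests

    # Example AI user agents; the real audit may check a different list.
    AI_CRAWLERS = ["GPTBot", "ClaudeBot", "PerplexityBot", "Google-Extended"]

    def probe_access(origin: str) -> dict:
        """Check robots.txt rules for common AI crawlers and probe llms.txt / UCP discovery."""
        rp = urllib.robotparser.RobotFileParser()
        rp.set_url(f"{origin}/robots.txt")
        rp.read()
        allowed = {ua: rp.can_fetch(ua, origin + "/") for ua in AI_CRAWLERS}

        llms = requests.get(f"{origin}/llms.txt", timeout=10)
        ucp = requests.get(f"{origin}/.well-known/ucp", timeout=10)
        return {
            "ai_crawlers_allowed": allowed,
            "llms_txt_present": llms.status_code == 200 and bool(llms.text.strip()),
            "llms_txt_bytes": len(llms.content) if llms.ok else 0,  # "thin" vs "useful" needs deeper parsing
            "ucp_discovery_present": ucp.status_code == 200,
        }

    # Example: probe_access("https://example.com")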
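The crawl-mode centrality check can be reproduced on any internal link edge list with networkx, which ships a standard harmonic_centrality implementation. The sketch below assumes the crawler emits (source URL, target URL) pairs; the product's own graph pipeline may differ.

    import networkx as nx

    def rank_pages(internal_links: list[tuple[str, str]]) -> list[tuple[str, float]]:
        graph = nx.DiGraph(internal_links)
        scores = nx.harmonic_centrality(graph)
        # High scores indicate central hub pages; low scores indicate peripheral or orphaned pages.
        return sorted(scores.items(), key=lambda item: item[1], reverse=True)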
HOW A WEB AUDIT RUNS
- 1. URL Resolution: The system resolves the audit set from sitemap discovery, scoped crawl mode, exact uploaded URLs, or uploaded seed URLs.
- 2. Site-wide Checks: Domain-level signals such as robots and authority run once and are reused across the pages in the same job.
- 3. Background Page Audits: Each page is processed by the worker queue while the browser receives live progress updates (see the sketch after this list).
- 4. Finalization: The system aggregates scores, prepares factor summaries, and writes the final report artifacts.
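The queue and event names are internal to the product, so the following only illustrates the general shape of "background page audits with live progress updates" using asyncio. audit_page and report_progress are hypothetical stand-ins; the real system streams progress to the browser rather than printing it.

    import asyncio

    async def audit_page(url: str) -> dict:
        await asyncio.sleep(0)              # placeholder for the real per-page checks
        return {"url": url, "score": None}

    async def run_job(urls: list[str], report_progress) -> list[dict]:
        results = []
        for done, url in enumerate(urls, start=1):
            results.append(await audit_page(url))
            report_progress(done, len(urls))  # e.g. pushed to the browser as a progress event
        return results

    # asyncio.run(run_job(["https://example.com/"], lambda d, t: print(f"{d}/{t}")))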
INPUT MODES
- Direct URL: Start from a single site or page in the main input field.
- Exact URLs: Upload a file and audit only the listed URLs.
- Seed Crawl: Upload one or more starting URLs and let the crawler discover additional pages from there.
- Scope Controls: Use max URLs, JS sample size, optional path scope, and optional locale filters to shape the run.
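As a rough illustration of the scope controls above, a run configuration might be modeled like this; the field names and defaults are assumptions, not the product's real parameters.

    from dataclasses import dataclass, field

    @dataclass
    class AuditScope:
        max_urls: int = 100                     # hard cap on pages audited
        js_sample_size: int = 10                # how many pages also get JS-rendered
        path_scope: str | None = None           # e.g. "/docs/" to restrict the crawl
        locales: list[str] = field(default_factory=list)  # e.g. ["en", "de"]; empty means all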
OUTPUTS
- Excel Workbook: The main technical export for the job.
- PDF Report: Optional stakeholder-facing summary.
- schemas.json: Recommended schema templates based on the audited domain.
- Topic Map: An HTML graph showing page and entity relationships.
- Chunk Heatmap: A visual, pass/fail-style view of citability by page section.
- llms.txt: Optionally generated only when requested and only when the audited site does not already expose one.
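The llms.txt rule above (generate only on request, and only when the site does not already expose one) boils down to a simple guard. A minimal sketch, with the generator left as a hypothetical callback:

    import requests

    def maybe_generate_llms_txt(origin: str, requested: bool, build_llms_txt) -> str | None:
        if not requested:
            return None
        existing = requests.get(f"{origin}/llms.txt", timeout=10)
        if existing.status_code == 200 and existing.text.strip():
            return None                      # the site already exposes one; do not overwrite
        return build_llms_txt(origin)        # hypothetical generator callback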
HOW TO READ RESULTS
- Weighted factors move the headline score.
- Supporting factors add context but do not change the headline score.
- Diagnostic factors explain technical risk.
- "Not measured" does not automatically mean failure. It can also mean the signal was deliberately not collected for that run.
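As a toy illustration of how the tiers combine, the sketch below counts only weighted factors toward the headline score, excludes supporting and diagnostic factors, and treats a "not measured" factor as absent rather than as zero. Factor names, weights, and the formula itself are made up for the example, not the product's exact scoring.

    def headline_score(factors: dict[str, dict]) -> float:
        weighted = {k: v for k, v in factors.items()
                    if v["tier"] == "weighted" and v["score"] is not None}
        total_weight = sum(v["weight"] for v in weighted.values())
        if total_weight == 0:
            return 0.0
        return sum(v["score"] * v["weight"] for v in weighted.values()) / total_weight

    # A "not measured" factor (score=None) drops out instead of counting as 0,
    # and the diagnostic factor is reported elsewhere but never moves the headline.
    print(headline_score({
        "schema":    {"tier": "weighted",   "weight": 2.0, "score": 80.0},
        "entities":  {"tier": "weighted",   "weight": 1.0, "score": None},   # not measured
        "rendering": {"tier": "diagnostic", "weight": 1.0, "score": 40.0},   # excluded from headline
    }))  # -> 80.0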
● End of file. Click the AUDIT TERMINAL link in the sidebar to return.