Methodology · v0.1 (preview)

How we measure AI answer-engine visibility.

The full methodology page is in draft. Today's free check runs four static scanners — the AI-crawler robots.txt audit (14 user-agents, explicit/fall-through distinction), the llms.txt validator (with honest framing about who actually respects it), the schema.org JSON-LD extractor (with FAQPage flagged as the highest-leverage gap per Relixir 2025), and the HTTP-header inspector. Live engine probes across ChatGPT, Claude, Gemini, and Perplexity are out of scope for this free utility.
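As a rough illustration of the explicit/fall-through distinction in the robots.txt audit, the Python sketch below labels a crawler "explicit" when robots.txt names it in a User-agent line, "fall-through" when only a wildcard (*) group would apply to it, and "unaddressed" otherwise. The function name, the label strings, and the simplified exact-token matching are assumptions made for this example, not the scanner's actual implementation. The three user-agents shown are a subset of the 14 audited.

```python
def classify_agents(robots_txt, agents):
    """Label each agent 'explicit', 'fall-through', or 'unaddressed'.

    Simplified sketch: only User-agent lines are inspected, and tokens are
    compared case-insensitively as whole strings.
    """
    named = set()
    has_wildcard = False
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        field, value = line.split(":", 1)
        if field.strip().lower() != "user-agent":
            continue
        token = value.strip().lower()
        if token == "*":
            has_wildcard = True
        elif token:
            named.add(token)

    labels = {}
    for agent in agents:
        if agent.lower() in named:
            labels[agent] = "explicit"      # the file addresses this bot by name
        elif has_wildcard:
            labels[agent] = "fall-through"  # only the * group applies
        else:
            labels[agent] = "unaddressed"   # no group applies at all
    return labels


sample = """\
User-agent: *
Disallow: /private

User-agent: PerplexityBot
Allow: /
"""
print(classify_agents(sample, ["OAI-SearchBot", "Claude-SearchBot", "PerplexityBot"]))
```

The distinction matters because a fall-through result reflects a rule written for every crawler, while an explicit result shows the site owner made a decision about that specific bot.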

When the Audit tier ships, this page expands to cover the full measurement design: stratified prompt sets across four intent clusters, N=10 sampling per cell pooled to intent-cluster × model × region, bootstrap confidence intervals with 2,000 resamples, model snapshotting via nightly control prompts plus vendor-changelog scraping, and the URL-hallucination exclusion policy. Every academic reference will live here too:
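To make the bootstrap step concrete, here is a minimal Python sketch of a confidence interval for the citation rate in one cell. It assumes each of the N=10 samples is recorded as 1 (brand cited) or 0 (not cited), and it uses a percentile bootstrap at the 95% level. The percentile method, the 95% level, the fixed seed, and the function name are assumptions for illustration; the design above specifies only the 2,000 resamples and the per-cell N.

```python
import random


def bootstrap_ci(outcomes, resamples=2000, alpha=0.05, seed=0):
    """Return (point_estimate, lower, upper) for the mean of `outcomes`.

    Percentile bootstrap: resample with replacement, recompute the mean,
    and read the interval off the sorted resampled means.
    """
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    n = len(outcomes)
    means = sorted(
        sum(rng.choices(outcomes, k=n)) / n for _ in range(resamples)
    )
    lower = means[int((alpha / 2) * resamples)]
    upper = means[min(resamples - 1, int((1 - alpha / 2) * resamples))]
    return sum(outcomes) / n, lower, upper


# Hypothetical cell: the brand was cited in 3 of 10 sampled responses.
cell = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
point, lower, upper = bootstrap_ci(cell)
print(f"citation rate {point:.2f}, 95% CI [{lower:.2f}, {upper:.2f}]")
```

With only ten binary draws the interval is necessarily wide and coarse, which is one reason to pool cells before reporting.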

Sielinski, R. (2026). Quantifying Uncertainty in AI Visibility: A Statistical Framework for Generative Search Measurement. arXiv:2603.08924
Schulte, J., Bleeker, M., Kaufmann, P. (2026). Don't Measure Once: Measuring Visibility in AI Search (GEO). arXiv:2604.07585
Sharma, A. P. (2026). The Discovery Gap: How Product Hunt Startups Vanish in LLM Organic Discovery Queries. arXiv:2601.00912
Aggarwal, P. et al. (2024). GEO: Generative Engine Optimization. arXiv:2311.09735

Where this differs from Google's official guidance.

The headline thesis of Google's May 2026 AI-optimization guide is: “Optimizing for generative AI search is optimizing for the search experience, and thus still SEO.” Google explicitly tells site owners not to ship llms.txt, not to create AI-specific markdown files, not to chunk content for AI, and not to over-focus on structured data. Its position is that the same standard SEO best practices that produce traditional rankings also produce visibility in AI Overviews and AI Mode.

For Google's own AI features, this is accurate. Gemini, AI Overviews, and AI Mode are rooted in Google's Search ranking and quality systems — they run retrieval-augmented generation against the same index. Optimize for Google Search and you optimize for Google's AI surface.

For the engines outside Google's family, it's not accurate. Three empirical anchors from the 2025–26 measurement literature establish that AI-citation graphs diverge sharply from Google's index:

Only 11% of domains are cited by BOTH ChatGPT and Perplexity across the same query set. (Profound platform, 680M citations analyzed.) If two of the largest non-Google engines share only 11% of their cited domains with each other, optimizing for a third graph (Google's) cannot be expected to cover both.
80% of URLs ChatGPT cites do NOT appear in Google's top 100 for the same query. (averi.ai B2B citation benchmarks.) ChatGPT's citations are evidently not drawn from Google's top-ranked results.
Domain Authority correlates only r=0.18 with Google AI Overview selection. (wellows, 15,847 results.) That is an r² of about 0.03, so even within Google's AI surface the classic SEO authority signal explains only a small share of what gets selected.
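For readers who want to reproduce an overlap figure like the first anchor on their own data, the Python sketch below computes the share of domains cited by both of two engines. It assumes overlap is defined as intersection over union (Jaccard); the definition Profound uses is not stated here, so treat that choice, the function name, and the placeholder domains as assumptions.

```python
def overlap_share(domains_a, domains_b):
    """Share of distinct domains cited by both engines (Jaccard index)."""
    a, b = set(domains_a), set(domains_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical citation lists for one query set.
engine_a = ["a.example", "b.example", "c.example"]
engine_b = ["c.example", "d.example"]
print(overlap_share(engine_a, engine_b))  # 1 shared domain out of 4 distinct -> 0.25
```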

So we take Google's position as accurate for Google and partial for everyone else. Our scan still reports on every signal Google's guidance endorses (that guidance is the floor, not the ceiling), but we also report on signals it dismisses: llms.txt, Content Signals, FAQPage schema, and the 14-bot crawler taxonomy including OAI-SearchBot, Claude-SearchBot, and PerplexityBot. We do so because, for the engines outside Google, those signals demonstrably affect citation share. The FAQPage finding, for instance, is anchored to a 2.7× lift measured across multiple AI engines (Relixir 2025), not against Google specifically.
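As one example of how a static check for the FAQPage signal might work, the Python sketch below collects JSON-LD blocks from a page and reports whether any node declares the FAQPage type. It uses only the standard library. The class and function names are hypothetical and the traversal is deliberately simple; it is not a description of the production extractor.

```python
import json
from html.parser import HTMLParser


class JsonLdCollector(HTMLParser):
    """Collect parsed <script type="application/ld+json"> payloads."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buf = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buf = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.blocks.append(json.loads("".join(self._buf)))
            except json.JSONDecodeError:
                pass  # malformed JSON-LD is skipped in this sketch


def schema_types(node):
    """Recursively gather every @type string found in a JSON-LD structure."""
    found = set()
    if isinstance(node, dict):
        declared = node.get("@type")
        if isinstance(declared, str):
            found.add(declared)
        elif isinstance(declared, list):
            found.update(t for t in declared if isinstance(t, str))
        for value in node.values():
            found |= schema_types(value)
    elif isinstance(node, list):
        for item in node:
            found |= schema_types(item)
    return found


def has_faqpage(html):
    collector = JsonLdCollector()
    collector.feed(html)
    collector.close()
    return "FAQPage" in schema_types(collector.blocks)


page = '<script type="application/ld+json">{"@context": "https://schema.org", "@type": "FAQPage", "mainEntity": []}</script>'
print(has_faqpage(page))  # True
```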

The static scan reports honestly on ecosystem-wide signals rather than collapsing everything into Google's preferred frame. Live per-engine measurement (the same prompt set across ChatGPT, Claude, Gemini, and Perplexity, with bootstrap CIs) is out of scope for this free utility. The measurement design outlined above sketches what such a study would look like, for anyone wanting to replicate or extend it.

Contact & corrections.

Questions, corrections, or critiques: methodology@canaifind.com. We log them publicly with disposition.
