AEO measurement · Vol. 1 · 2026
Research library · v0.1 · last updated 2026-05-21

Every source we cite — with summaries.

Most AEO tools cite nothing. Most "AI SEO" advice is recycled from classic SEO blog posts with the bots renamed. We work from peer-reviewed papers, large-N industry studies, and official vendor docs — and we tell you when those sources disagree, because they often do.

This page is the living index. Each entry has an anchor link (click the source name to copy the URL) so you can cite a specific finding without sending someone to the whole library. Missing something we should add? Email methodology@canaifind.com.

Peer-reviewed academic work

Where the formal measurement framework comes from. Sielinski and Schulte et al. underpin our confidence-interval reporting; Aggarwal et al. coined the term "Generative Engine Optimization"; the citation-absorption and hallucination papers are the most recent additions to our reading list.

  1. Quantifying Uncertainty in AI Visibility: A Statistical Framework for Generative Search Measurement

    Sielinski, R. · 2026 · arxiv:2603.08924

    The load-bearing paper for our methodology. Argues that "single-run visibility metrics provide a misleadingly precise picture of domain performance in generative search" and proposes a sampling + bootstrap-CI framework. The implication: a dashboard that reports an AI-visibility score without a confidence interval cannot distinguish most claimed lifts from run-to-run noise. This argument is why we pin every report to a methodology version.

    arxiv.org/abs/2603.08924
  2. Don't Measure Once: Measuring Visibility in AI Search (GEO)

    Schulte, J., Bleeker, M., Kaufmann, P. · 2026 · arxiv:2604.07585

    Companion to Sielinski. Empirical demonstration that running the same prompt set on the same engine on the same day produces materially different citation graphs from run to run; that is, the engines' retrieval is non-deterministic at the per-query level. Establishes N≥10 samples per cell as the empirical floor for stable per-engine citation share, with pooling by intent cluster × model × region above that. Underpins our Stage-2 live-engine probe design.

    arxiv.org/abs/2604.07585
  3. From Citation Selection to Citation Absorption: A Measurement Framework for Generative Engine Optimization Across AI Search Platforms

    Various · 2026 · arxiv:2604.25707

    Distinguishes two distinct AI-engine behaviors: (1) citation selection — which URLs the model attaches as references; (2) citation absorption — which URLs the model paraphrases into its answer body without crediting. Selection is what most dashboards measure; absorption is what actually drives brand-mention share in the answer text. We expect to add an absorption-vs-selection metric in the Audit tier when live probes ship.

    arxiv.org/abs/2604.25707
  4. The Discovery Gap: How Product Hunt Startups Vanish in LLM Organic Discovery Queries

    Sharma, A. P. · 2026 · arxiv:2601.00912

    Tracks 1,000+ Product Hunt launches over six months and measures their citation rate in organic LLM discovery queries (e.g. "best X tool for Y"). Finds a dramatic visibility gap between launch-day mention and post-launch organic discoverability — most products that get launch traction never become citation-eligible at all. Strong empirical support for our position that early-stage brands need AEO measurement, not just SEO.

    arxiv.org/abs/2601.00912
  5. GEO: Generative Engine Optimization

    Aggarwal, P. et al. · 2024 · arxiv:2311.09735 · KDD 2024

    The original paper that coined "Generative Engine Optimization" as a distinct discipline. Presented at KDD 2024 (Knowledge Discovery and Data Mining). Tests nine optimization strategies on a benchmark of 10K queries across multiple LLM-based search engines and finds that techniques that move traditional search rankings frequently do NOT move AI citation rates — and vice versa. The empirical foundation for treating AEO as a distinct field.

    arxiv.org/abs/2311.09735
  6. Aligning Large Language Model Behavior with Human Citation Preferences

    Various · 2026 · arxiv:2602.05205

    Examines whether LLM citation choices reflect what human evaluators consider authoritative or trustworthy citations. Establishes that off-the-shelf models cite Wikipedia and high-DA news sites at rates well above human-judged appropriateness, while under-citing primary sources, vendor docs, and expert-authored content. Relevant to entity-anchor advice: high-DA mention helps citation rate but not citation quality.

    arxiv.org/abs/2602.05205
  7. Do Deployment Constraints Make LLMs Hallucinate Citations? An Empirical Study Across Four Models and Five Prompting Regimes

    Various · 2026 · arxiv:2603.07287

    Quantifies how often production-deployed LLMs fabricate citation URLs in their answers. Across four models and five prompting regimes, hallucinated-URL rates range from 8% to 34% depending on configuration. Methodological foundation for our hallucinated-URL exclusion policy: we HEAD-verify every URL a model cites before counting it toward citation share.

    arxiv.org/abs/2603.07287
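
The sampling + bootstrap-CI idea from entries 1 and 2 can be sketched in a few lines. This is a minimal illustration using a plain percentile bootstrap, which is an assumption on our part and not necessarily the exact estimator either paper specifies. The `runs` data and the function name `bootstrap_ci` are hypothetical.

```python
import random


def bootstrap_ci(outcomes, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile-bootstrap confidence interval for the mean of 0/1 outcomes.

    Returns (point_estimate, lower_bound, upper_bound).
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(outcomes)
    means = []
    for _ in range(n_resamples):
        # Resample the runs with replacement and record the resampled share.
        resample = [outcomes[rng.randrange(n)] for _ in range(n)]
        means.append(sum(resample) / n)
    means.sort()
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return sum(outcomes) / n, lower, upper


# Hypothetical data: one prompt cell sampled N = 10 times on one engine.
# 1 = our domain was cited in that run, 0 = it was not.
runs = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]

share, lower, upper = bootstrap_ci(runs)
print(f"citation share = {share:.2f}, 95% CI = [{lower:.2f}, {upper:.2f}]")
```

With only ten runs the interval is wide, which is the point: a single run reports 0 or 1 for this cell and hides that spread entirely.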

Industry studies & benchmarks

Empirical work from commercial AEO platforms (Profound, Brandlight, averi) plus auditing firms (wellows, Relixir, Seer). Less rigorously reviewed than the academic work, but larger N. We cite findings, not vendor claims about their own products.

  1. Profound Citation-Graph Analysis (680M Citations)

    Profound · 2025 · 680M citations · 11% overlap

    Profound, the largest commercial AEO monitoring platform, published an analysis of 680M citations across ChatGPT, Perplexity, Claude, and Gemini. Headline finding: only 11% of domains are cited by BOTH ChatGPT AND Perplexity for the same query set. The two largest non-Google engines diverge sharply in what they cite, which means single-engine optimization is structurally incomplete.

    www.tryprofound.com
  2. Google ↔ AI Citation Overlap Decline

    Brandlight · 2026 · overlap 70% → <20%

    GEO firm Brandlight reports that the overlap between top Google links and AI-cited sources has dropped from ~70% (2024) to under 20% (2026). This suggests AI engines are increasingly retrieving from sources outside Google's top-ranked results, and it is the empirical basis for our position that Google's "AEO=SEO" framing does not generalize beyond Google's own AI features.

    www.brandlight.ai
  3. Averi B2B Citation Benchmarks

    averi.ai · 2026 · 80% non-Google · 8.4× top vs bottom

    Three headline findings: (1) 80% of URLs ChatGPT cites for B2B buyer-research queries do NOT appear in Google's top 100 for the same query, so ChatGPT's retrieval graph is largely disjoint from Google's ranked index; (2) the top quartile of SaaS brands earns 8.4× more AI citations than the bottom quartile; (3) 89% of B2B buyers now use AI for vendor research. The case for B2B-specific AEO measurement.

    averi.ai
  4. wellows AI Overviews Analysis (15,847 Results)

    wellows · 2025 · r=0.18 DA correlation · n=15,847

    Empirical analysis of 15,847 SERP results to identify which signals predict Google AI Overview inclusion. Headline finding: classic Domain Authority correlates at only r=0.18 with AI Overview selection, which under a simple linear reading is about 3% of selection variance (r² ≈ 0.03). Even within Google's own AI surface, which Google says is "rooted in core Search ranking", the strongest classic SEO signal explains only a small fraction of the outcome. Mid- and long-tail authoritative pages outperform major-brand homepages on AI Overview citation.

    wellows.com
  5. AI Overview Citation → Click Lift

    Seer Interactive · 2025 · +120% organic · +41% paid

    Measured organic click-through and paid click-through impact for pages cited in Google AI Overviews. Found +120% organic clicks per impression and +41% paid clicks per impression for cited pages. The lift is real, measurable, and counters the "zero-click" narrative — AI Overview citations are a discovery surface that drives downstream traffic, not just a terminus.

    www.seerinteractive.com
  6. FAQPage Schema → Citation Rate Lift

    Relixir · 2025 · 2.7× citation · 41% vs 15%

    Compared citation rates for pages WITH FAQPage JSON-LD vs. pages WITHOUT, controlling for content quality and domain authority. Headline finding: 2.7× citation rate for FAQPage pages — 41% vs. 15% baseline — measured across ChatGPT, Claude, and Perplexity. The single highest-leverage structural fix our scanner surfaces. (Caveat: measured for non-Google engines; Google's May-2026 docs say structured data is "not required" for their AI features.)

    www.relixir.com
  7. HubSpot State of Marketing 2026

    HubSpot · 2026 · 50% AI search adoption

    Annual survey of marketing leaders + consumer behavior. Headline AI finding: 50% of consumers now use AI-powered search as their primary discovery surface for at least one product/service category. Establishes the addressable user base for AEO measurement — half the buying population is now in an AI-mediated discovery flow that's invisible to traditional analytics.

    www.hubspot.com/state-of-marketing
  8. AI Search Platform Scale Indicators

    Multiple vendor disclosures · 2026 · ChatGPT 883M MAU · AI Overviews ~55% of SERPs

    ChatGPT reached 883M monthly active users (OpenAI public disclosure). Google AI Overviews appear in approximately 55% of all Google searches (Google quarterly product update + third-party measurement). Two scale anchors for sizing the AEO opportunity: even small percentages of these volumes are material citation surface.

    openai.com
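
Two of the headline numbers above are simple arithmetic and worth checking by hand. The sketch below assumes the wellows r=0.18 is a Pearson-style correlation, so that r² can be read as the share of variance explained; the study's exact statistic is not stated here, so treat that reading as an assumption. The function names are ours.

```python
def variance_explained(r: float) -> float:
    """Share of variance explained by a linear relationship with correlation r."""
    return r * r


def lift(treated_rate: float, baseline_rate: float) -> float:
    """Ratio of two rates, e.g. citation rate with vs. without FAQPage markup."""
    return treated_rate / baseline_rate


# wellows: Domain Authority vs. AI Overview selection, r = 0.18
print(f"r^2 = {variance_explained(0.18):.4f}")

# Relixir: 41% citation rate with FAQPage JSON-LD vs. 15% baseline
print(f"lift = {lift(0.41, 0.15):.1f}x")
```

A correlation of 0.18 sounds modest but non-trivial; squaring it shows how little of the selection outcome it accounts for under a linear reading.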

Official vendor positions

What the engines themselves say about their bots, their AI optimization advice, and what they do or do not support. Includes Google's May-2026 anti-llms.txt position, the OpenAI / Anthropic / Perplexity bot taxonomies, and the web.dev agent-execution UX guide.

  1. Google: AI Optimization Guide

    Google (Search Central) · 2026 · Official Google docs

    Google's official position on optimizing for AI Overviews and AI Mode. Headline thesis: "Optimizing for generative AI search is optimizing for the search experience, and thus still SEO." Explicitly tells site owners NOT to create llms.txt, AI-specific markdown files, or chunked content; says structured data is "not required" for AI features. Accurate for Gemini and AI Overviews (Google's own AI surfaces run RAG over the Search index); incomplete for the non-Google ecosystem.

    developers.google.com/search/docs/fundamentals/ai-optimization-guide
  2. web.dev: AI Agent-Friendly Website UX

    Google / web.dev · 2026 · agent-execution UX

    Companion to Google's AI optimization guide focused on agent-execution rather than answer-engine citation. Recommends semantic HTML over div/span, ARIA roles + tabindex for non-semantic elements, cursor:pointer as actionability signal, label-for-input association, ≥8 square-pixel interactive elements, no ghost/transparent overlays. Mentions WebMCP as an emerging standard. Most checks overlap with general web accessibility tooling.

    web.dev/articles/ai-agent-site-ux
  3. OpenAI: Bot User-Agent Documentation

    OpenAI · 2025 · GPTBot / OAI-SearchBot / ChatGPT-User

    OpenAI's three-bot taxonomy: GPTBot (training crawler, respects robots.txt), OAI-SearchBot (search-index crawler used by ChatGPT at inference, respects robots.txt), and ChatGPT-User (per-user retrieval, ignores robots.txt by design because it's acting on behalf of a user). The GPTBot-vs-OAI-SearchBot distinction is the load-bearing detail behind our foot-gun finding: blocking GPTBot expecting to opt out of training also blocks OAI-SearchBot via the User-agent: * fall-through.

    platform.openai.com/docs/bots
  4. Anthropic: Crawler Documentation (4-Bot Update Feb 2026)

    Anthropic · 2026 · ClaudeBot / Claude-User / Claude-SearchBot / claude-code

    Anthropic's updated four-bot taxonomy: ClaudeBot (training), Claude-User (per-user retrieval; unlike OpenAI's ChatGPT-User, it DOES respect robots.txt), Claude-SearchBot (search index), and claude-code (Claude Code CLI / IDE retrieval, documentation-targeted). Claude-User is deliberately stricter than OpenAI's equivalent user-initiated crawler.

    docs.anthropic.com
  5. Perplexity: Crawler Documentation

    Perplexity · 2025 · PerplexityBot / Perplexity-User

    Perplexity's two-bot taxonomy: PerplexityBot (indexing crawler, respects robots.txt) and Perplexity-User (per-user retrieval, ignores robots.txt). Smaller surface than OpenAI/Anthropic but the same user-initiated-bot-ignores-robots-txt pattern as ChatGPT-User.

    docs.perplexity.ai
  6. Google: llms.txt Position (Search Central Live)

    Gary Illyes / Google · 2025 · Google: NO · Anthropic: YES · OpenAI: unconfirmed

    In a July 2025 Search Central Live appearance, Google's Gary Illyes explicitly confirmed Google does NOT support or read llms.txt. The position was escalated to a formal published anti-recommendation in Google's May-2026 AI optimization guide. Anthropic still respects llms.txt for Claude Desktop and Claude.ai; OpenAI is unconfirmed. The ecosystem split is real and worth reporting honestly.

    developers.google.com/search/docs/fundamentals/ai-optimization-guide
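
The bot taxonomies above matter because robots.txt groups are matched per user agent: a crawler with its own explicit group follows that group, and a crawler without one falls through to `User-agent: *`. A minimal sketch using Python's standard-library parser follows. The robots.txt content and the example.com URL are hypothetical, and the standard-library parser's path matching is simpler than RFC 9309's longest-match rule, which does not matter here because each group has a single rule.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: opt out of training, opt in to ChatGPT search,
# and disallow every crawler that has no group of its own.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: OAI-SearchBot
Allow: /

User-agent: *
Disallow: /
"""


def allowed(agent: str, url: str = "https://example.com/pricing") -> bool:
    """Return what this robots.txt says for the given user-agent token."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, url)


for agent in ("GPTBot", "OAI-SearchBot", "ClaudeBot", "PerplexityBot"):
    print(f"{agent}: {'allowed' if allowed(agent) else 'blocked'}")
```

GPTBot and OAI-SearchBot follow their own groups, while ClaudeBot and PerplexityBot have no group and inherit the wildcard block. This only reports what the file says; per the vendor docs above, user-initiated agents such as ChatGPT-User and Perplexity-User do not consult it.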

Standards & specifications

The IETF RFCs and community specs our scanner implements directly: robots.txt, Web Linking, API Catalog, OAuth Protected Resource, llms.txt, Content Signals, schema.org. Read these to dispute our scoring.

  1. RFC 9309: Robots Exclusion Protocol

    IETF · 2022 · Robots Exclusion

    The standardized robots.txt specification. Defines User-agent groups, Allow/Disallow rules, and the precedence of explicit-User-agent matches over fall-through User-agent: * blocks. Our parser implements this RFC with practical relaxations (BOM tolerance, inline comments, multiple User-agent lines per group). The "explicit match beats wildcard" rule is what enables our foot-gun finding.

    datatracker.ietf.org/doc/html/rfc9309
  2. RFC 8288: Web Linking

    IETF · 2017 · HTTP Link header

    Specifies the HTTP Link header and the registry of standard link relations (canonical, sitemap, describedby, etc.). Our HTTP-headers scanner uses this RFC to detect agent-discovery rels: api-catalog, service-desc, describedby, agent-card. The HTTP-header version of canonical is processed earlier in the retrieval pipeline than HTML <link rel="canonical">, which is why we surface it separately.

    datatracker.ietf.org/doc/html/rfc8288
  3. RFC 9727: API Catalog (/.well-known/api-catalog)

    IETF · 2025 · API Catalog

    Standardizes /.well-known/api-catalog as the discoverable location for an API catalog document (linkset+json), allowing agents to find an organization's machine-readable APIs without scraping HTML. Often paired with service-desc (link to OpenAPI spec) and describedby (link to JSON-LD/RDF description) rels. Cloudflare's IsItAgentReady checks for this; we surface it via the agent-discovery Link rel detection.

    datatracker.ietf.org/doc/html/rfc9727
  4. RFC 9728: OAuth 2.0 Protected Resource Metadata

    IETF · 2025 · OAuth Protected Resource

    Companion to OAuth/OIDC discovery: standardizes /.well-known/oauth-protected-resource as the location for metadata describing how agents obtain access tokens for protected APIs. Out of scope for our citation-focused scan, but relevant for the agent-execution side of AI readiness (which is what Cloudflare's IsItAgentReady measures).

    datatracker.ietf.org/doc/html/rfc9728
  5. llms.txt Specification

    llmstxt.org · 2024 · community spec · split adoption

    Community specification for a markdown index of a site's most important pages, served at /llms.txt. Canonical structure: H1 title, blockquote summary, H2 section headers each containing markdown link lists. Adoption split as of 2026: Anthropic respects it (Claude Desktop, Claude.ai); IDE tooling fetches it (Cursor, Claude Code, GitHub Copilot, Cline, Aider); Google has formally declined to support it; OpenAI is unconfirmed.

    llmstxt.org
  6. Endpoint Context Protocol (ECP)

    endpointcontextprotocol.io · 2026 · community spec · agent content-negotiation

    Community-driven spec for agent-vs-browser content negotiation via HTTP. Two signals: (1) the homepage response should include `Vary: Accept, Sec-Fetch-Dest, User-Agent` to indicate the response is negotiated on those headers; (2) a /.well-known/ecp.json manifest at the site root lists the available representations (HTML, markdown, JSON). MIT-licensed, pre-standards but with growing adoption. Functionally a superset of the Markdown-negotiation pattern (which only uses Accept). Validator at ecptest.com. We added a detection probe for both signals (Vary on the homepage HEAD + GET on /.well-known/ecp.json) after a Reddit commenter (u/ExistentialConcierge) flagged it during the launch thread — credit where due.

    endpointcontextprotocol.io
  7. Content Signals (IETF Draft)

    contentsignals.org / IETF · 2026 · IETF draft · declarative preferences

    IETF draft for a declarative AI-usage preference directive in robots.txt: Content-Signal: search=yes, ai-input=yes, ai-train=no. Distinct from User-agent rules in that it expresses publisher intent per role (traditional search vs. live AI answer vs. training corpus) even when crawlers are allowed access. Voluntary compliance today, but Cloudflare's IsItAgentReady has begun checking for it and adoption is rising in 2026.

    contentsignals.org
  8. schema.org Vocabulary

    schema.org · ongoing · structured data vocabulary

    The structured-data vocabulary for the web. Our scanner checks for Organization (entity anchor + sameAs graph), FAQPage (the highest-leverage finding for non-Google engines per Relixir 2025), Article (editorial pages), HowTo (procedural queries), SoftwareApplication (B2B SaaS vendor evaluation), and Person (E-E-A-T author entities). Google's May-2026 docs say structured data is "not required" for their AI features, but the empirical multi-engine lift stands.

    schema.org
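
For readers who have not written FAQPage markup, here is a minimal sketch of the JSON-LD shape the schema.org vocabulary defines (FAQPage, Question, Answer, with `mainEntity` and `acceptedAnswer`). The helper names `faq_jsonld` and `to_script_tag` and the sample question are hypothetical; this is an illustration, not our scanner's code.

```python
import json


def faq_jsonld(pairs):
    """Build a schema.org FAQPage object from (question, answer) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }


def to_script_tag(data: dict) -> str:
    """Wrap JSON-LD in the script element used to embed it in HTML."""
    # Note: real pages should also guard against "</" inside answer text.
    body = json.dumps(data, indent=2)
    return f'<script type="application/ld+json">\n{body}\n</script>'


# Hypothetical content for illustration only.
faq = faq_jsonld([("Example question?", "Example answer.")])
print(to_script_tag(faq))
```

The questions and answers in the markup should match text that is visible on the page itself.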

See a study we're missing? A claim above you can disprove? A vendor doc that's changed? Send corrections to methodology@canaifind.com. We log them publicly with disposition.
