Report: lululemon.com · checked 2026-05-21 20:32 UTC · methodology v0.1 (preview) · canaifind.com/r/QVjecwCT
Partial: Some fundamentals in place; high-leverage gaps identified.

This is a static scan of robots.txt, llms.txt, schema.org markup, and HTTP headers. Live engine probes across ChatGPT, Claude, Gemini, and Perplexity are queued for a future build. Real visibility lives in category and comparison queries, which the Audit tier measures with a 100-prompt stratified set.

AI crawler robots.txt audit

§1 of 4
OpenAI
GPTBot | Training crawler for future OpenAI models. | ? Unknown (fetch blocked)
OAI-SearchBot | ChatGPT Search index. Disallowing makes you invisible to ChatGPT Search. | ? Unknown (fetch blocked)
ChatGPT-User | User-initiated retrieval. Ignores robots.txt by design. | - Ignores robots.txt
Anthropic
ClaudeBot | Training crawler for Anthropic models. | ? Unknown (fetch blocked)
Claude-User | Retrieves pages when a Claude user asks about them. Respects robots.txt (unlike OpenAI's ChatGPT-User). | ? Unknown (fetch blocked)
Claude-SearchBot | Search index for Claude. Disallowing reduces Claude search quality. | ? Unknown (fetch blocked)
claude-code | Claude Code CLI / IDE retrieval. Documentation-targeted. | ? Unknown (fetch blocked)
Perplexity
PerplexityBot | Perplexity indexing. Disallowing removes you from Perplexity retrieval. | ? Unknown (fetch blocked)
Perplexity-User | User-initiated retrieval. Ignores robots.txt by design. | - Ignores robots.txt
Google
Google-Extended | Training opt-out for Gemini / Bard. Disallowing opts you out of Google AI training. | ? Unknown (fetch blocked)
GoogleOther | Catch-all for non-Search Google crawlers. | ? Unknown (fetch blocked)
Meta
Meta-ExternalAgent | Meta AI crawler. Disallowing opts you out of Meta AI training/retrieval. | ? Unknown (fetch blocked)
Apple
Applebot-Extended | Apple Intelligence training opt-out (separate from Applebot Search). | ? Unknown (fetch blocked)
ByteDance
Bytespider | ByteDance / TikTok AI crawler. | ? Unknown (fetch blocked)
Common Crawl
CCBot | Common Crawl. Widely used as a training-corpus source for major models. | ? Unknown (fetch blocked)
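
Once robots.txt is reachable, the per-crawler statuses above could be derived roughly as follows. This is a minimal sketch using Python's standard `urllib.robotparser`, not the scanner's actual implementation; the `audit_robots` helper, the `SAMPLE` file, and the `example.com` URL are all hypothetical, and the `IGNORES_ROBOTS` set simply mirrors the two agents this report marks as ignoring robots.txt.

```python
from urllib.robotparser import RobotFileParser

# Agents this report lists as ignoring robots.txt by design.
IGNORES_ROBOTS = {"ChatGPT-User", "Perplexity-User"}


def audit_robots(robots_txt: str, agents: list[str],
                 url: str = "https://example.com/") -> dict[str, str]:
    """Classify each agent as allowed, disallowed, or not governed by robots.txt."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())  # parse text already in hand; no network call
    results = {}
    for agent in agents:
        if agent in IGNORES_ROBOTS:
            results[agent] = "ignores robots.txt"
        elif parser.can_fetch(agent, url):
            results[agent] = "allowed"
        else:
            results[agent] = "disallowed"
    return results


# Hypothetical robots.txt: block GPTBot, allow everyone else.
SAMPLE = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

if __name__ == "__main__":
    agents = ["GPTBot", "ClaudeBot", "ChatGPT-User"]
    for agent, status in audit_robots(SAMPLE, agents).items():
        print(f"{agent}: {status}")
```

Python's parser applies rules in file order and matches agent names by substring, so its verdict may differ from how a given crawler interprets the same file; treat the result as an approximation.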

Structured data & discovery files

§2 of 4
Artifact | Status | Note
llms.txt: A markdown index of the site's most important pages, served at /llms.txt. Anthropic Claude Desktop and Claude.ai fetch this. IDE tooling (Cursor, Claude Code, GitHub Copilot, Cline, Aider) routinely retrieves it. Google has explicitly confirmed it does NOT support it (Gary Illyes, July 2025). OpenAI is unconfirmed. | ✗ Missing | Anthropic Claude respects this; Google has confirmed it does not; OpenAI is unconfirmed.
llms-full.txt: Optional full-content companion to llms.txt. Useful for agents with large context windows that prefer a single fetch over crawling. Doesn't replace llms.txt; both can coexist. | ✗ Missing | Optional full-content companion file.
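
For illustration, a minimal llms.txt can be rendered from a list of key pages. The sketch below assumes the commonly described layout (an H1 title, a blockquote summary, then H2 sections of markdown links); the `render_llms_txt` helper, the site name, and the `example.com` URLs are hypothetical placeholders, not pages from the scanned domain.

```python
def render_llms_txt(title: str, summary: str,
                    sections: dict[str, list[tuple[str, str, str]]]) -> str:
    """Render a markdown index: H1 title, blockquote summary, H2 link sections."""
    lines = [f"# {title}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for name, url, note in links:
            # One bullet per page: linked name followed by a short description.
            lines.append(f"- [{name}]({url}): {note}")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


if __name__ == "__main__":
    # Hypothetical site content; replace with the site's real key pages.
    print(render_llms_txt(
        "Example Store",
        "Athletic apparel retailer.",
        {"Products": [("Leggings", "https://example.com/leggings", "Category page")]},
    ))
```

The output would be served as plain text at /llms.txt; llms-full.txt, if provided, would sit alongside it rather than replace it.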
Artifact | Status | Note
schema.org Organization: The brand-identity anchor LLMs use to disambiguate the site. Without it, profile links on LinkedIn, Wikidata, Crunchbase, GitHub etc. aren't bound to the homepage's entity in the AI's knowledge graph. The sameAs array is the load-bearing field. | ✗ Missing | Entity anchor for the sameAs graph.
schema.org FAQPage: Pages with FAQPage JSON-LD show a 2.7× citation rate versus pages without it (41% vs 15% in the Relixir 2025 study). The JSON-LD must mirror visible Q&A content on the page; Google penalises mismatch. Single highest-leverage fix in the audit. | ✗ Missing | 2.7× citation rate vs without (Relixir 2025); highest-leverage single fix.
schema.org Article: For journalistic/editorial pages. Declares author, datePublished, dateModified, and section to AI engines, which preferentially cite recent, dated, authored content in answer-engine results. | ✗ Missing | For editorial pages.
schema.org HowTo: For step-by-step procedural content. AI engines preferentially cite HowTo markup when answering procedural queries ("how do I X"). Maps directly to retrieval intent. | ✗ Missing | For tutorials.
schema.org SoftwareApplication: For product/app pages. Maps to vendor-evaluation queries ("best X for Y"). Effectively required for B2B SaaS visibility in AI citations; 89% of B2B buyers now use AI for vendor research (Averi 2026). | ✗ Missing | For product pages.
Person (author entity): Author entity on bylines, linked to the Article entity. An E-E-A-T (Experience, Expertise, Authoritativeness, Trustworthiness) signal: AI engines weight content authored by named, credentialed people higher than anonymous content. | ✗ Missing | E-E-A-T signal on bylines.
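
The two highest-leverage rows above, Organization and FAQPage, can be sketched as JSON-LD builders. This is an illustrative sketch, not markup taken from the scanned site: the helper names, the organization name, the profile URL, and the Q&A text are hypothetical, while the `@type`, `sameAs`, `mainEntity`, and `acceptedAnswer` keys are standard schema.org vocabulary.

```python
import json


def organization_jsonld(name: str, url: str, same_as: list[str]) -> dict:
    """Organization entity; sameAs binds external profiles to the homepage."""
    return {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": name,
        "url": url,
        "sameAs": list(same_as),
    }


def faq_jsonld(qa_pairs: list[tuple[str, str]]) -> dict:
    """FAQPage entity; the pairs should mirror Q&A text visible on the page."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in qa_pairs
        ],
    }


def as_script_tag(data: dict) -> str:
    """Wrap JSON-LD for embedding in HTML, escaping '</' so it cannot close the tag."""
    payload = json.dumps(data, indent=2).replace("</", "<\\/")
    return f'<script type="application/ld+json">{payload}</script>'


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    org = organization_jsonld(
        "Example Co", "https://example.com/",
        ["https://www.linkedin.com/company/example-co"],
    )
    print(as_script_tag(org))
    print(as_script_tag(faq_jsonld([("Do you ship abroad?", "Yes, to most countries.")])))
```

Generating the FAQPage block from the same source that renders the visible Q&A is one way to keep the two from drifting apart.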

HTTP headers

§3 of 4

Could not fetch the homepage (HTTP 0). Skipping HTTP header checks.

Top findings

§4 of 4
  1. Could not fetch robots.txt.

    The request for lululemon.com/robots.txt failed: the origin did not respond within 5s. We cannot make claims about per-crawler access until we can read the file. AI retrieval crawlers running from datacenter IPs may face the same outcome.

    Med
  2. Could not fetch the homepage.

    The request for https://lululemon.com/ failed: the origin did not respond within 5s. AI retrieval crawlers may face the same outcome from datacenter IPs — we can't audit schema.org markup until the page is reachable.

    Med
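
Both findings come down to a request that got no response within 5 seconds. A probe of that kind might look like the sketch below, which assumes the report's "HTTP 0" means no HTTP response was received at all; the `probe` helper and its injectable `opener` parameter are hypothetical and are not the scanner's own code.

```python
import urllib.error
import urllib.request
from typing import Callable


def probe(url: str, timeout: float = 5.0,
          opener: Callable = urllib.request.urlopen) -> tuple[int, str]:
    """Return (status, detail); status 0 means no HTTP response was received."""
    try:
        with opener(url, timeout=timeout) as response:
            return response.status, "ok"
    except urllib.error.HTTPError as exc:
        # The origin answered, but with an error status such as 403 or 503.
        return exc.code, "http error"
    except OSError as exc:
        # Covers timeouts, DNS failures, and refused or reset connections.
        return 0, f"no response: {exc}"


if __name__ == "__main__":
    for target in ("https://example.com/robots.txt", "https://example.com/"):
        print(target, probe(target))
```

Running the same probe from a residential network and from a datacenter host would help tell whether the origin is down or is selectively dropping datacenter traffic.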