Open ChatGPT. Type "what's the best non-custodial wallet for Solana." Read the answer. No ads, no ten blue links. One synthesized response with three or four citations. The projects it names get the wallet connections. The projects it doesn't name are invisible.
This is the playbook for getting cited: how AI engines decide, the Web3 Trust Hub you need to be on, and the LLM Sitemap methodology we developed. In our experience, LLM-referred traffic converts at roughly 4.4× the rate of organic traffic, because the user arrives with the AI's pre-conferred trust.
TL;DR
- AI Overviews show on 25-50% of Google searches. Every educational crypto query we tested triggered one (Ahrefs, April 2026).
- Citation = Authority × Content × Technical. Any factor at zero kills the output. Most teams optimize content with zero authority and wonder why nothing happens.
- Map your Trust Hub first. For Web3 it's CoinDesk, The Block, Decrypt, Wikipedia, Reddit, plus 4-6 category-specific platforms. Skew presence work 70/30 third-party to self-published.
- Bullets under 15 words extract verbatim. Over 15, your specific claim becomes generic filler.
- LLM Sitemaps + canonical definition. Identical positioning across every page and external profile, structured as a clustered HTML hierarchy AI can parse.
- Pick topics where AI can't compete. Niche specificity wins. On generic keywords, AI now generates a better answer than your page.
- Measure citation share, not rankings. Run your top 20 queries through ChatGPT, Perplexity, AI Overviews monthly.
One mental model. Citation isn't additive. It's multiplicative.
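The difference between a multiplicative and an additive model is easy to see in a few lines. This is only an illustrative sketch: the 0.0 to 1.0 scale, the function names, and the example scores are assumptions, not a scoring system any AI engine publishes.

```python
def _check(name: str, value: float) -> None:
    """Reject scores outside the assumed 0.0-1.0 scale."""
    if not 0.0 <= value <= 1.0:
        raise ValueError(f"{name} must be between 0.0 and 1.0, got {value}")


def citation_score(authority: float, content: float, technical: float) -> float:
    """Multiplicative model: any factor at zero zeroes the result."""
    for name, value in (("authority", authority),
                        ("content", content),
                        ("technical", technical)):
        _check(name, value)
    return authority * content * technical


def additive_score(authority: float, content: float, technical: float) -> float:
    """Additive (averaged) model, shown only for contrast."""
    return (authority + content + technical) / 3


# Hypothetical team: strong content, solid technical setup, no authority.
multiplicative = citation_score(authority=0.0, content=0.9, technical=0.8)
additive = additive_score(authority=0.0, content=0.9, technical=0.8)
print(multiplicative)      # 0.0
print(round(additive, 2))  # about 0.57
```

Under the additive view the team looks more than halfway there. Under the multiplicative view it is at zero until authority moves.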
Why GEO Matters More for Web3 Than Other Industries
- Early adopters. The same audience that left banks for DeFi, Reddit for Farcaster, Excel for Dune. Their AI search adoption runs months ahead of the mainstream.
- Question shape. "How does X work," "is Y safe," "best Z for use case." Exactly what AI handles best.
- Trust threshold. Users research before depositing into smart contracts. AI now mediates that research. Not cited = a YouTube video defines your reputation.
How AI Engines Decide What to Cite
Each engine pulls from a different mix of sources.
The four retrieval mechanics behind every AI answer
Four ways your project gets into an AI answer. Optimization opportunities sit between them.
| Mechanism | How it works | How to influence it | Time to land |
|---|---|---|---|
| Training data recall | LLM "remembers" you from training corpus | Mentions across high-authority pages over time | 6-18 months |
| Live web retrieval (RAG) | LLM searches the web during the query | Rank well in SEO; Bing index matters | Days to weeks |
| Knowledge graph entries | Wikipedia, Wikidata, Crunchbase referenced directly | Earn entries; maintain accurate public records | Months |
| Domain memorization | For named brands, the LLM has the brand "memorized" | Consistent mentions over long periods, consistent framing | 12+ months |
Optimize for live retrieval (days to weeks) and training-data recall (6-18 months) in parallel.
The Five Channels That Feed AI Citations
Every niche has a Trust Hub — the cluster of domains LLMs reference for that niche. For B2B SaaS it's G2, PeerSpot, Wikipedia. For Web3 it's a different set. Map yours before spending on anything.
Skew presence work 70/30: roughly 70% of brand mentions from independent third parties, 30% self-published. The other way reads as manufactured.
1. Wikipedia
Highest-leverage AI citation move. Top source in ChatGPT, major source in every other engine, feeds Google's Knowledge Graph. One legitimate page is worth ten tier-1 PR placements. Strict notability rules — build legitimate independent sources first, then have an experienced editor draft a neutral, fully-cited page.
2. Reddit
~21% of Google AI Overview citations include Reddit. Perplexity surfaces it heavily. ChatGPT's web tool pulls it constantly. What works:
- 90+ days of genuine subreddit presence before promotion. Smaller category subs (r/defi, r/ethereum, r/Solana) over r/CryptoCurrency.
- Answer specific questions usefully. Complete reply, mention project once in context.
- Original research posts. On-chain analysis with TL;DR gets archived and AI-cited.
3. Tier-1 crypto media
Domain authority is the single strongest predictor of AI citation (SHAP 0.63, SE Ranking 2025). One CoinDesk, The Block, or Decrypt editorial feeds training data and live retrieval simultaneously, often for years. Earn it through expert outreach. Wire-service press releases produce zero lift.
4. YouTube
~18-19% of Google AI Overview citations include YouTube. Build 3-5 mid-tier creator relationships and your own founder-on-camera channel. Production value optional, accurate captions mandatory — AI extracts from transcripts.
5. Your own domain
Cited mostly for branded queries unless authority is serious. The work that earns non-branded citation:
- Original research and data. Public Dune dashboards, on-chain analyses with specific numbers.
- Comparison pages. "X vs Y" with structured tables.
- Long-form pillar pages. 3,000+ words on topics where you have genuine expertise.
- FAQ blocks with schema markup.
- An LLM Sitemap. Detailed below.
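For the FAQ blocks with schema markup, one common approach is schema.org `FAQPage` JSON-LD embedded in a script tag. The sketch below generates that markup from question and answer pairs. The helper names and the "ExampleWallet" Q&A are hypothetical placeholders.

```python
import json


def faq_jsonld(pairs: list[tuple[str, str]]) -> dict:
    """Build a schema.org FAQPage object from (question, answer) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }


def faq_script_tag(pairs: list[tuple[str, str]]) -> str:
    """Serialize the FAQPage object into an embeddable JSON-LD script tag."""
    payload = json.dumps(faq_jsonld(pairs), indent=2)
    # Escape "</" so answer text can never close the script tag early.
    payload = payload.replace("</", "<\\/")
    return f'<script type="application/ld+json">{payload}</script>'


# Hypothetical project and answers, for illustration only.
EXAMPLE_FAQ = [
    ("Is ExampleWallet non-custodial?", "Yes. Private keys never leave your device."),
    ("Which chains does ExampleWallet support?", "Solana only."),
]
print(faq_script_tag(EXAMPLE_FAQ))
```

The visible Q&A text on the page should match the markup word for word, so the structured version and the rendered version never disagree.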
Two silent blockers most teams miss
OAI-SearchBot. Separate crawler from GPTBot: GPTBot collects training data, OAI-SearchBot indexes pages for ChatGPT's search results. Older robots.txt templates miss it. Allow GPTBot, OAI-SearchBot, ClaudeBot, and PerplexityBot explicitly.
Cloudflare Bot Fight Mode. Blocks AI crawlers at the network layer regardless of robots.txt. We see this in roughly one in three audits.
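The robots.txt half of this can be checked offline with Python's standard `urllib.robotparser`. The sketch below reports which of the four crawlers a given robots.txt would block. Both templates are made-up examples, and the check says nothing about network-layer blocking such as Bot Fight Mode, which needs a real request to detect.

```python
from urllib.robotparser import RobotFileParser

AI_CRAWLERS = ["GPTBot", "OAI-SearchBot", "ClaudeBot", "PerplexityBot"]


def blocked_ai_crawlers(robots_txt: str, url: str = "https://example.com/") -> list[str]:
    """Return the AI crawlers that this robots.txt disallows for the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [bot for bot in AI_CRAWLERS if not parser.can_fetch(bot, url)]


# Hypothetical older template: names GPTBot only, blocks everyone else.
OLD_TEMPLATE = """\
User-agent: GPTBot
Allow: /

User-agent: *
Disallow: /
"""

# Hypothetical fixed template: all four crawlers allowed explicitly.
FIXED_TEMPLATE = """\
User-agent: GPTBot
User-agent: OAI-SearchBot
User-agent: ClaudeBot
User-agent: PerplexityBot
Allow: /

User-agent: *
Disallow: /private/
"""

print(blocked_ai_crawlers(OLD_TEMPLATE))    # ['OAI-SearchBot', 'ClaudeBot', 'PerplexityBot']
print(blocked_ai_crawlers(FIXED_TEMPLATE))  # []
```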
The GEO Content Stack
Each layer enables the next. Most projects skip the foundation, start at the top, and nothing compounds.
Pick Topics Where AI Can't Compete
Most teams pick topics by search volume. That's how you write "what is crypto" content where ChatGPT generates a better answer than your page. Write where AI can't follow.
LLM Sitemaps: The Methodology We Developed
An XML sitemap tells Googlebot what URLs exist. An LLM Sitemap tells AI crawlers what your project does, how content is organized, what concepts relate, and when to recommend you. A clustered HTML page (not XML) with pillar-cluster hierarchy, embedded FAQs, comparison tables, and explicit semantic relationships. Developed across 50+ B2B SaaS and crypto client engagements.
The most important element: a canonical definition repeated identically across every page and external profile. The formula:
The canonical definition formula
- [Project] is a [specific category] for [target user] that [primary function]. Unlike [well-known alternative], it [differentiator with measurable claim].
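Because the formula is a fill-in template and the definition has to appear identically everywhere, both steps can be scripted. The sketch below renders the definition from its parts and lists the surfaces whose copy does not contain it verbatim. "ExampleWallet", its claims, and the surface names are hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Positioning:
    project: str
    category: str
    target_user: str
    primary_function: str
    alternative: str
    differentiator: str

    def canonical_definition(self) -> str:
        """Fill the formula's slots in order."""
        return (
            f"{self.project} is a {self.category} for {self.target_user} "
            f"that {self.primary_function}. "
            f"Unlike {self.alternative}, it {self.differentiator}."
        )


def _normalize(text: str) -> str:
    """Collapse whitespace so line wrapping does not count as a mismatch."""
    return " ".join(text.split())


def inconsistent_surfaces(canonical: str, surfaces: dict[str, str]) -> list[str]:
    """Names of surfaces whose copy lacks the canonical definition verbatim."""
    target = _normalize(canonical)
    return sorted(name for name, copy in surfaces.items()
                  if target not in _normalize(copy))


# Hypothetical project and claims, for illustration only.
EXAMPLE = Positioning(
    project="ExampleWallet",
    category="non-custodial wallet",
    target_user="Solana traders",
    primary_function="signs transactions on-device",
    alternative="browser-extension wallets",
    differentiator="confirms swaps in under two seconds",
)
definition = EXAMPLE.canonical_definition()
surfaces = {
    "homepage": definition + " Download it today.",
    "linkedin": "ExampleWallet is the operating system for crypto.",
}
print(inconsistent_surfaces(definition, surfaces))  # ['linkedin']
```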
"Specific category" means one thing. Not "comprehensive," not "all-in-one." Identical phrasing across homepage, About, LinkedIn, Crunchbase, PR boilerplate. When the model sees three different descriptions, it hedges and recommends the competitor it sees described consistently.
What goes inside an LLM Sitemap
- Pillar pages, 3-5. One canonical 3,000+ word page per top-level topic.
- Cluster pages, 3-8 per pillar. Sub-topics linking back to pillar and laterally.
- FAQ blocks, 6-10 per cluster. First-person Q&A with FAQ schema.
- Comparison tables, 1-2 per cluster. Structured tables with named columns.
- Explicit semantic cross-links. Anchor text that names the relationship between concepts.
- An HTML index page. The actual "LLM Sitemap." Submit in robots.txt, link from footer.
Projects shipping a real LLM Sitemap see citation share grow 2-3× faster. Structure is the optimization.
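The index page itself is plain nested HTML, so it can be generated from a pillar and cluster inventory. The sketch below is one possible shape, not a fixed format: the class names, the markup layout, and the example pages are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from html import escape


@dataclass
class Cluster:
    title: str
    url: str
    relation: str  # names how this page relates to its pillar


@dataclass
class Pillar:
    title: str
    url: str
    clusters: list[Cluster] = field(default_factory=list)


def render_llm_sitemap(definition: str, pillars: list[Pillar]) -> str:
    """Render a pillar-cluster hierarchy as a simple HTML index page body."""
    parts = ["<main>", f"<p>{escape(definition)}</p>"]
    for pillar in pillars:
        parts.append("<section>")
        parts.append(f'<h2><a href="{escape(pillar.url)}">{escape(pillar.title)}</a></h2>')
        parts.append("<ul>")
        for cluster in pillar.clusters:
            # Anchor text states the relationship, not just the page title.
            label = f"{cluster.title}: {cluster.relation}"
            parts.append(f'<li><a href="{escape(cluster.url)}">{escape(label)}</a></li>')
        parts.append("</ul>")
        parts.append("</section>")
    parts.append("</main>")
    return "\n".join(parts)


# Hypothetical content inventory.
EXAMPLE_PILLARS = [
    Pillar("Wallet security", "/security", [
        Cluster("Seed phrase storage", "/security/seed-phrases",
                "a sub-topic of wallet security"),
        Cluster("Hardware vs. software wallets", "/security/hardware-vs-software",
                "a comparison within wallet security"),
    ]),
]
print(render_llm_sitemap("ExampleWallet is a non-custodial wallet.", EXAMPLE_PILLARS))
```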
Submit your LLM Sitemap explicitly
Add to robots.txt: Sitemap: https://yoursite.com/llm-sitemap. Link from your footer. AI crawlers are starting to look for these signals.
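To confirm the directive is actually present after a deploy, the standard `urllib.robotparser` module can list the `Sitemap:` entries a robots.txt declares (Python 3.8+). This is a minimal sketch: the function name is hypothetical, and it only verifies the line exists, not that any crawler acts on it.

```python
from urllib.robotparser import RobotFileParser


def declared_sitemaps(robots_txt: str) -> list[str]:
    """Return every URL declared with a Sitemap: line in this robots.txt."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.site_maps() or []  # site_maps() returns None when empty


ROBOTS_TXT = """\
User-agent: *
Allow: /

Sitemap: https://yoursite.com/sitemap.xml
Sitemap: https://yoursite.com/llm-sitemap
"""

print("https://yoursite.com/llm-sitemap" in declared_sitemaps(ROBOTS_TXT))  # True
```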
The all-in-one trap
Products positioned as "all-in-one" or "the operating system for X" give the model no clear answer to "when should I recommend this?" Own one specific category first. Vague positioning = invisible citations.
On-Page Optimization for AI Extraction
Six rules. Same project, same fact — completely different outcomes inside an AI answer. The pattern isn't subtle.


