AEO Is Solving the Wrong 8%: How to Invest in the Layer That Actually Determines AI Citation

There is a number buried in a University of Toronto study from September 2025 that should have ended the AEO conversation as it currently exists. It did not, because almost nobody stopped to do the maths on what it actually means.

The number is 92.1%.

That is the proportion of citations made by AI systems in the consumer electronics category that came from third-party authoritative sources — independent coverage, journalism, earned mentions — rather than from the brand’s own website. In automotive the figure was 81.9%. Across industries, the pattern holds: the overwhelming majority of what AI cites is not your content. It is what other people say about you.

Now look at what the AEO industry is selling.

Every guide published in the last two years — and there are hundreds of them — tells you to structure your content with direct answers, use clear headings, add schema markup, write concise paragraphs, front-load your conclusions. This is presented as the path to AI visibility. Some of it is good advice. None of it addresses the 92%.

If AI cites third-party sources more than nine times out of ten, then the entire discipline of on-page AEO optimisation is competing for the remaining fraction. The industry has built a sophisticated methodology for a minority problem and is selling it as the complete solution.

Two problems that look like one

Retrieval is the question of whether AI can extract a useful fragment from your page once it has decided to look at you. Selection is the question of whether AI considers you at all. This is where the 92% lives. Selection comes before retrieval. You cannot be extracted from a page that was never consulted.

The data is now unambiguous

In December 2025, Muck Rack’s Generative Pulse published an analysis of more than one million links from AI responses across Gemini, Perplexity, Claude, and ChatGPT. Their finding: 82% of all citations come from earned media. Non-paid sources account for 94% overall. Two independent datasets, different methodologies, same direction: AI citation is primarily an off-page, earned, trust-infrastructure problem, not an on-page content structure problem.

When Muck Rack mapped citation coverage by outlet, 50% of a brand’s AI citation coverage came from approximately 20 outlets. And when they compared the journalists PR professionals pitch against the journalists whose work AI actually cites, the overlap was 2%.

TurboQuant makes this more urgent

In March 2026, Google published a blog post announcing TurboQuant — algorithms that reduce vector database indexing time to virtually zero. Analyst Marie Haynes has identified it as a likely factor behind the March 2026 core update. The practical consequence: semantic authority and entity recognition matter more. TurboQuant does not change what wins the selection layer. It makes the selection layer operate faster and with less tolerance for entities whose authority is ambiguous.

What on-page work actually does

Structured content, schema, and clear paragraph architecture are the retrieval layer, and the retrieval layer matters. Research from Princeton, Georgia Tech and IIT Delhi found structured content interventions improve AI citation rates by 30-40%. But a business that optimises its on-page content without building the off-page trust infrastructure will become highly retrievable from pages that are rarely consulted.
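Schema markup at this retrieval layer is most commonly expressed as schema.org JSON-LD embedded in the page. A minimal sketch of a FAQ-style block, with a placeholder question and answer drawn from the figures above (not a prescribed template):

```json
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [{
    "@type": "Question",
    "name": "What proportion of AI citations come from third-party sources?",
    "acceptedAnswer": {
      "@type": "Answer",
      "text": "Across the categories studied, 82-92% of AI citations come from earned, third-party coverage rather than the brand's own site."
    }
  }]
}
```

The point is not the specific type — Article, Product, and Organization markup serve the same role — but that the page states its entities and answers in a form a retrieval system can parse without inference.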

The building your business needs to occupy

AI visibility is a four-floor building, and the floors must be built in order. On-page optimisation is the second floor. The 92% lives on the third.

The Four-Floor Model — AI Recommendation Stack

AI does not rank businesses. It selects them.

Floor 1 — Entity Foundation & Discovery (start here). NAP consistency, Bing indexability, Wikidata, llms.txt, technical SEO. AI systems can find and correctly identify your entity before any recommendation is possible. Nothing above works without this. Fail here and you are invisible to AI systems.

Floor 2 — Content Extractability. Structured data, schema markup, machine-readable answers, AI-citable format. AI retrieval systems can parse, extract, and quote your content accurately. Fail here and you are retrieved but not cited.

Floor 3 — Trust & Selection. AI recommendation eligibility, CITATE, entity corroboration, citation. AI systems have enough trust to name and recommend you, not just retrieve you. Fail here and you are cited but not recommended.

Floor 4 — Agentic Execution (2026–2027). MCP, WebMCP, callable tools, governance layer. A future layer: without it you cannot be actioned.

Where most businesses are: the majority of businesses that come through an AI visibility audit are failing at Floor 2 or Floor 3 — not because they lack content, but because their content is not structured for selection and their trust signals are not independently corroborated. Floor 1 failures are common and invisible. Very few businesses are genuinely ready for Floor 4.

Each floor depends on the one below it. The full model is at MCP vs WebMCP: And Why Neither Matters If Your Building Has No Floors.

What to do about the 92%: five steps

Step one: Audit your current AI citation position. Run your business name and your three strongest competitors through ChatGPT, Perplexity, Gemini, and Claude. Note every source cited, every competitor that appears where you do not, and every factual error. This is the only honest baseline.
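The baseline from step one can be tallied with a short script. A minimal sketch, assuming you record each AI answer by hand as a dict of engine, query, brands mentioned, and sources cited — the engines are real, but the queries, brands, and domains below are hypothetical placeholders:

```python
from collections import Counter

# Hypothetical audit log: one record per AI answer you ran by hand.
# Brands, queries, and source domains are placeholders, not real results.
audit = [
    {"engine": "ChatGPT", "query": "best patient portal software",
     "brands": ["Acme Health", "RivalSoft"], "sources": ["techradar.com", "g2.com"]},
    {"engine": "Perplexity", "query": "best patient portal software",
     "brands": ["RivalSoft"], "sources": ["g2.com", "healthitnews.example"]},
    {"engine": "Gemini", "query": "patient portal comparison",
     "brands": ["Acme Health", "RivalSoft"], "sources": ["g2.com"]},
]

def baseline(records, you):
    """Count brand appearances and cited sources, and count
    the answers that omit your brand entirely."""
    brands, sources = Counter(), Counter()
    missed = 0
    for r in records:
        brands.update(r["brands"])
        sources.update(r["sources"])
        if you not in r["brands"]:
            missed += 1
    return brands, sources, missed

brands, sources, missed = baseline(audit, "Acme Health")
```

Even at this scale the output is the honest baseline the step describes: who appears, who is cited, and how often you are absent.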

Step two: Map your query types. For branded queries, owned content dominates AI citations at 80-90%. For specialist queries, owned content holds 50-70%. For general and comparison queries, earned media determines the shortlist and the 92% applies most directly. Your split between on-page and earned-media investment follows from which of these query types drives your demand.
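That investment split is simple arithmetic. A sketch using the midpoints of the shares above (92% third-party on general queries implies roughly 8% owned); the query mix is a hypothetical example, not a benchmark:

```python
# Midpoint owned-content citation shares per query type, taken from the
# ranges quoted above; "general" uses the ~8% owned share implied by 92%.
OWNED_SHARE = {"branded": 0.85, "specialist": 0.60, "general": 0.08}

def earned_weight(query_mix):
    """Weight the earned-media share of citations by the fraction of
    your target demand that falls into each query type."""
    return sum(frac * (1 - OWNED_SHARE[qtype])
               for qtype, frac in query_mix.items())

# Hypothetical demand mix: 20% branded, 30% specialist, 50% general.
mix = {"branded": 0.2, "specialist": 0.3, "general": 0.5}
share = earned_weight(mix)  # fraction of citation opportunity that is earned
```

Under this mix roughly three-fifths of the citation opportunity sits off-page, which is the ratio your budget should reflect.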

Step three: Identify your 20 outlets. Map which publications currently cite you in AI responses, which cite your competitors, and which outlets AI models consistently use for your sector. These approximately 20 outlets drive 50% of your potential AI citation coverage.
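Finding that outlet shortlist is a cumulative-share calculation over the citation counts gathered in step one. A minimal sketch with hypothetical outlet domains and counts:

```python
from collections import Counter

# Hypothetical outlet citation counts from the step-one audit.
citations = Counter({
    "outlet-a.example": 40, "outlet-b.example": 25, "outlet-c.example": 15,
    "outlet-d.example": 10, "outlet-e.example": 6, "outlet-f.example": 4,
})

def shortlist(counts, threshold=0.5):
    """Return the smallest set of outlets, most-cited first, whose
    cumulative share of citations reaches the threshold."""
    total = sum(counts.values())
    covered, chosen = 0, []
    for outlet, n in counts.most_common():
        chosen.append(outlet)
        covered += n
        if covered / total >= threshold:
            break
    return chosen

top = shortlist(citations)
```

In real data the curve flattens around the 20-outlet mark Muck Rack reports; the script simply makes that concentration visible for your own sector.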

Step four: Make your content extractable (Floor 2). Before investing in earned media, ensure your highest-value pages pass the CITATE extractability threshold. Floor 3 investment without Floor 2 in place produces inconsistent citation even when AI does decide to consult you. See the CITATE framework.

Step five: Build your earned media infrastructure as a sustained programme. Target your 20 outlets with consistent activity — press releases with high statistical density, regular publishable content, and publishing cadence that keeps you within the first-seven-day citation window. The 2% journalist overlap between PR investment and AI citation is solved by measurement, not by pitching more.

The commercial consequence

According to Seer Interactive’s analysis of twelve million visits in 2025, traffic arriving via AI citation converts at 14.2%, compared with 2.8% for standard organic search. The difference between capturing that traffic and missing it is not execution quality. It is understanding which problem you are actually solving: selection, or only retrieval.

For the full implementation guide — query-type matrix, business-type segmentation, and the complete priority framework — see the AI Citation Gap guide. For your current four-floor position, start with the AI Visibility Audit.

Related topics:

aeo · ai-discovery-stack · ai-seo · ai-visibility · entity-seo · future-of-seo · geo · llm-optimisation · search-trends
Sean Mullins

Founder of SEO Strategy Ltd with 20+ years in SEO, web development and digital marketing. Specialising in healthcare IT, legal services and SaaS — from technical audits to AI-assisted development.