Someone found you. Not through a cold search. Through research, referrals, reputation. They knew your name, your specialism, your track record. They had already decided. They were ready to make contact.
In 2025, that journey increasingly ends with an AI assistant. “Find me the best way to reach [firm name] and speak to someone who handles [specific matter].” A reasonable request. A natural one.
And if your AI visibility infrastructure is not built correctly, one of three things happens. The AI hedges — it knows you exist but not confidently enough to commit. It conflates you with a similarly named competitor in a different city. Or it routes the enquiry through a generic contact form when the answer should have been a direct number and a named specialist.
The prospective client did the work. They chose you. The instruction was lost at the handover. You never knew the enquiry existed.
This is not a future problem. It is happening now, at scale, invisibly — and the businesses that understand the infrastructure behind it are building a compounding advantage over those that do not.
What this guide covers and who it is for
The AI visibility conversation has been dominated by on-page content advice. Structure your answers better. Add FAQ schema. Write answer-first paragraphs. That guidance is not wrong. It is incomplete — and applied without the context this guide provides, it produces the wrong strategy for most businesses.
| Your situation | Jump to |
|---|---|
| Enterprise or large brand with existing PR infrastructure | The Enterprise Problem |
| B2B specialist — law firm, SaaS, consultancy, professional services | The Specialist Problem |
| SME competing primarily on discovery — B2C or generalist B2B | The Discovery Problem |
| Starting from scratch or auditing your current position | Where to Start |
Eight things this guide will tell you
- Muck Rack’s analysis of over a million AI responses found that 82% of AI citations come from earned media. That statistic is accurate. It is also an average — and applying it without understanding query-type variation produces the wrong strategy for a significant proportion of businesses.
- Our testing across a specialist legal client’s query set found owned content citation rates of 50–70% in core specialist queries. The instruction to “shift everything to earned media” would, applied to that business, mean defunding the assets currently driving its AI citation.
- Earned media determines whether AI selects you. Owned content determines what AI extracts once it has. Both matter. The ratio depends on where your business operates in the query landscape.
- There are four floors to your AI visibility. Most businesses have built one and a half. The majority of the citation opportunity — and all of the agentic opportunity — sits above where most businesses have reached.
- Muck Rack found that roughly half of a brand’s AI citations come from approximately 20 outlets, and that the journalists PR teams pitch have only a 2% overlap with the journalists AI models actually cite.
- Publishing cadence is a citation strategy. Muck Rack found that half of all AI citations are for content published in the last 11 months, with the highest citation density in the first seven days after publication.
- Generic thought leadership is being deprioritised. Muck Rack found management consulting citations fell 35% between July and December 2025. Specific, independently validated, recent information is being rewarded.
- Floor Four is the future layer. If you are investing in MCP before fixing Floors One to Three, you are wasting money.
The number nobody did the maths on
In September 2025, researchers at the University of Toronto examined how AI systems cite sources across 13 industries. They found that when answering queries in consumer electronics, AI cited independent third-party sources 92.1% of the time. In automotive: 81.9%. The pattern held across sectors.
Separately, Muck Rack’s Generative Pulse team analysed over a million links from AI responses generated by ChatGPT, Claude, Gemini and Perplexity between July and December 2025. Their finding: 82% of all links cited by AI come from earned media. Non-paid sources account for 94%.
Two independent datasets, different methodologies, same direction.
The AEO industry — the people writing the guides, restructuring your content, and optimising your schema — has spent three years focused on the other 6–18%.
That is the headline. Here is the problem with stopping there. Those figures are real. They are also averages — and averages conceal the most important variable in AI visibility strategy: what kind of query is being asked. The University of Toronto study measured general consumer queries. Muck Rack measured across a broad prompt set spanning many industries and query types. Neither was designed to answer how citation behaviour changes when a business has deep, comprehensive specialist content in a niche domain — which is exactly where the aggregate obscures the most.
Applying the aggregate to your specific situation without that context is how businesses end up executing the right strategy for the wrong problem.
The query-type matrix: where the real insight lives
Not all queries are created equal. The same AI system reaches for different sources depending on what kind of query is being asked. Understanding where your business sits in this matrix is the most important analytical step before any investment decision.
| Query type | Example | Owned content % of citations | What AI is doing |
|---|---|---|---|
| Branded | “[Firm name] Manchester” | 80–90% | Treating you as the authoritative source. Your site, Google Business Profile, and direct content dominate. |
| Specialist / Niche | “pre-charge representation solicitor UK” | 50–70% | Drawing on your expert content as primary source, supplemented by official guidance. |
| General / Comparison | “best criminal defence solicitor London” | 20–40% | Prioritising aggregators, rankings, sector press, and independent coverage. Earned media determines the shortlist. |
| Educational / Process | “what happens at a voluntary police interview” | 40–60% | Blending accessible expert explanation with official procedural sources. |
A note on these ranges: they are directional, based on AI citation testing across a specialist legal client’s query set in criminal defence, cross-referenced with Muck Rack Generative Pulse data. They are intended to show how citation behaviour varies by query type — not to claim universal benchmark precision across all sectors. The pattern of variation holds; the exact percentages will differ by industry and content depth.
The “82% earned media” figure describes citation behaviour predominantly in the general and comparison query category. For a specialist firm with genuinely comprehensive expert content, the ratio in its core query set looks very different.
The honest answer: earned media determines whether AI selects you. Owned content determines what AI extracts once it has. Both matter. The ratio depends entirely on where your business operates in the query landscape.
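You can measure where your own business sits rather than take these ranges on trust. Here is a minimal sketch in Python, assuming you have already run your query set across the four models and logged each cited URL against its query type; every domain and record below is a placeholder:

```python
from collections import defaultdict
from urllib.parse import urlparse

# Placeholder records: one entry per citation observed in an AI response.
# In practice you collect these by running your query set across ChatGPT,
# Perplexity, Gemini and Claude and logging every cited URL.
citations = [
    {"query_type": "branded",    "url": "https://www.example-firm.co.uk/contact"},
    {"query_type": "specialist", "url": "https://www.example-firm.co.uk/pre-charge-advice"},
    {"query_type": "specialist", "url": "https://www.gov.uk/example-guidance"},
    {"query_type": "comparison", "url": "https://www.legal500.com/rankings/example"},
]

OWN_DOMAINS = {"www.example-firm.co.uk", "example-firm.co.uk"}  # your properties

def owned_share_by_query_type(records):
    """Percentage of citations pointing at your own domains, per query type."""
    totals, owned = defaultdict(int), defaultdict(int)
    for record in records:
        query_type = record["query_type"]
        totals[query_type] += 1
        if urlparse(record["url"]).netloc.lower() in OWN_DOMAINS:
            owned[query_type] += 1
    return {qt: round(100 * owned[qt] / totals[qt], 1) for qt in totals}

print(owned_share_by_query_type(citations))
# With the placeholder data: {'branded': 100.0, 'specialist': 50.0, 'comparison': 0.0}
```

If your comparison-query share looks like your specialist-query share, you are unusual. If it looks like the table above, the rest of this guide tells you why.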
What most businesses will get wrong from this data
The 82% statistic is now widely cited. Here is how each business type will misread it.
Enterprises will use it to justify prestige media coverage as AI strategy. They will pitch the same high-circulation outlets they always have, measure column inches, and assume AI citation follows media authority. It does not — at least not in the way they expect. Muck Rack found that the journalists AI actually cites for a given brand have only a 2% overlap with the journalists that brand’s PR team pitches. Prestige and AI citation influence are weakly correlated. The right outlets for AI visibility are often specialist, niche, and below the radar of a traditional media strategy.
Specialists will read the 82% figure and start defunding their owned content. The reasoning goes: if earned media is where AI citation lives, our resources should shift there. But that reasoning does not account for query type. A specialist firm whose content ranks first and earns a 44% click-through rate is being cited heavily by AI from that owned content — for the specialist and educational queries that drive most of its traffic. Shifting budget to earned media on the basis of an aggregate statistic would be strategically backwards.
SMEs will treat schema markup and technical fixes as trust-building tools. They will complete Floors One and Two, see their content become more extractable, and assume they have addressed the AI visibility problem. They have not. Schema tells AI what your content says. It does not tell AI whether to trust you. That happens at Floor Three, through independent sources AI already respects — and no amount of JSON-LD changes that.
The common error across all three: treating a selection problem as an extraction problem.
The four floors of AI visibility
Most businesses think about AI visibility the way they thought about SEO in 2010: as a content problem. Write better content. Structure it correctly. Add the right metadata. Wait. That mental model addresses one floor of a four-floor building — and it is not the top floor.
| Floor | What it means | If you are failing here |
|---|---|---|
| Floor 1 — Foundation | AI systems can find and correctly identify your entity | You are invisible to AI systems |
| Floor 2 — Extractability | AI can read, parse, and use your content | You are retrieved but not cited |
| Floor 3 — Trust & Selection | Independent sources corroborate you; AI selects you with confidence | You are cited but not recommended |
| Floor 4 — Agentic Execution | AI agents can act with your business, not just recommend it | You cannot be actioned |
Each floor is a dependency for the one above it. You cannot be recommended if you are not trusted. You cannot be trusted if you cannot be extracted. You cannot be extracted if you cannot be found. The full four-floor model is explained here →
Floor One: Foundation
AI systems can find and correctly identify your entity before any recommendation is possible. Nothing above works without this.
Entity recognition. Structured data. Schema markup. NAP (name, address, phone) consistency across every surface AI can read. Wikidata presence. Bing indexability. Your business understood as a coherent, unambiguous entity — not just a website with words on it. Without this, AI will hedge, conflate, or route around you. Most businesses have done partial work here. Few have done it comprehensively.
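To make the foundation concrete, here is an illustrative sketch of the entity markup Floor One calls for, generated in Python for consistency with the other examples in this guide. Every name, URL and identifier is a placeholder; what matters is the shape: one unambiguous entity, NAP details that match every other surface AI can read, and sameAs links tying the entity to records AI already knows.

```python
import json

# Placeholder entity markup. Swap every value for your own; the Wikidata and
# LinkedIn URLs are hypothetical and exist only to show the sameAs pattern.
entity = {
    "@context": "https://schema.org",
    "@type": "LegalService",
    "@id": "https://www.example-firm.co.uk/#organization",
    "name": "Example Firm Solicitors",
    "url": "https://www.example-firm.co.uk/",
    "telephone": "+44 161 000 0000",  # must match your NAP on every other surface
    "address": {
        "@type": "PostalAddress",
        "streetAddress": "1 Example Street",
        "addressLocality": "Manchester",
        "postalCode": "M1 0AA",
        "addressCountry": "GB",
    },
    "sameAs": [
        "https://www.wikidata.org/wiki/Q00000000",        # your Wikidata item
        "https://www.linkedin.com/company/example-firm",  # corroborating profiles
    ],
}

# Emit JSON-LD ready to embed in a <script type="application/ld+json"> tag.
print(json.dumps(entity, indent=2))
```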
Floor Two: Extractability
Can AI read and use your content?
Content structure. Fragment clarity. Answer-first organisation. The conditions that allow an AI to lift a useful, accurate, attributable passage from your page and include it in a response. This is where most AEO guidance lives — and it is legitimate, necessary work. The problem is that most AEO guides treat this floor as the destination. It covers branded and specialist queries. It does not cover the majority of discovery and comparison queries — where citation decisions are made before AI ever reaches your content.
Floor Three: Trust and Selection
Selection is a consensus mechanism — not a ranking system. AI systems weigh corroboration, consistency, and agreement across independent sources to determine which entity is safest to recommend.
AI systems do not select businesses based on what those businesses say about themselves. They select based on what independent sources say: editorial coverage you did not write, review platforms with genuine volume and recency, structured entity databases, named frameworks and statistics that others cite.
The ladder problem. Most businesses have a temporary ladder to Floor Three — some press releases, a few directory listings, occasional organic media coverage. A ladder gets you there occasionally. It does not give you consistent, reliable presence. And Muck Rack found something important: the journalists PR teams pitch have only a 2% overlap with the journalists AI models actually cite. The ladder is real. It leads to the wrong floor.
Floor Four: Agentic Execution
MCP and WebMCP are the infrastructure that lets AI agents act with your business — not just recommend it. Without trust established at Floor Three, those permissions are never granted.
Floor Four is the future layer. If you are investing in MCP before fixing Floors One to Three, you are wasting money. An AI agent that has reached this floor has already worked through the lower floors: found your entity, extracted your content, evaluated your trust signals, selected you. Now it needs to act: complete a contact, raise a request, book a consultation, all without the user leaving the AI interface.
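For orientation only, here is a minimal sketch of what a Floor Four endpoint can look like, written against the FastMCP interface in the official MCP Python SDK. The tool name, fields and behaviour are hypothetical; a production endpoint would sit behind authentication and your own intake systems.

```python
# Hypothetical Floor Four endpoint sketched with the official MCP Python SDK
# (pip install mcp). An agent that has already selected you at Floor Three can
# discover and call this typed, permissioned action instead of scraping a form.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("example-firm")

@mcp.tool()
def request_consultation(name: str, phone: str, matter: str) -> str:
    """Raise a consultation request with the firm's intake team."""
    # In production this would write to your case-management or CRM system
    # and apply whatever vetting and rate limiting your firm requires.
    return f"Consultation request logged for {name} regarding {matter}."

if __name__ == "__main__":
    mcp.run()  # serves the tool over stdio by default
```

The design point is the contrast with the lower floors: this is not content to be read but an action an agent can take on the client’s behalf.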
What a million AI responses actually showed
The following data comes from Muck Rack’s Generative Pulse report (December 2025), which analysed over a million links from real AI responses across ChatGPT, Claude, Gemini and Perplexity between July and December 2025. These are observed findings from a large dataset, not projections.
| Finding | Figure | What it means |
|---|---|---|
| Non-paid media as % of AI citations | 94% | Paid placements are almost entirely absent from AI citation |
| Earned media as % of AI citations | 82% | Independent, third-party sources dominate — in aggregate |
| Journalistic sources as % of all citations | ~25% | Consistent across all four models tested |
| Citations for content published in last 11 months | 50%+ | AI favours recency structurally, not incidentally |
| Highest citation density window | First 7 days after publication | Publishing cadence is a citation strategy |
| Management consulting citations, Jul vs Dec | −35% | Generic thought leadership is being deprioritised |
| Press release citations, Jul vs Dec | +5× (0.2% → 1%) | Structured, stat-dense releases are gaining ground |
| Outlets driving 50% of a brand’s citations | ~20 | Concentration beats breadth |
| Overlap: PR-pitched journalists vs AI-cited journalists | 2% | Most PR activity is pointed at the wrong audience |
Three patterns in this data warrant attention.
Recency is structural, not incidental. Half of all AI citations are for content published in the last 11 months, with the highest density in the first seven days. Consistent publishing outperforms occasional landmark releases.
Generic authority is losing ground. Management consulting thought leadership fell 35% in six months across all models tested. The implication: AI rewards specific, recent, independently validated information from sources it has already learned to trust — not brand authority signals.
Concentration beats breadth. Roughly half of a brand’s AI citations come from approximately 20 outlets. The implication: depth of relationship with the right outlets matters more than broad distribution. The open question for most businesses is whether they know which 20 those are.
What cited press releases look like
Muck Rack found that press release citations rose 5× between July and December 2025. They also found that the structure of cited press releases differs measurably from those that go uncited:
| Structural characteristic | Cited vs. non-cited |
|---|---|
| Statistics included | 2× as many in cited releases |
| Action verbs | 30% more in cited releases |
| Bullet points | 2.5× as many in cited releases |
| Unique companies or products mentioned | Significantly higher in cited releases |
| Objective sentences | 30% higher rate in cited releases |
The implication: a press release written for a journalist to read is a different document from a press release structured for AI extractability. Statistical density, scannable structure, and factual specificity are conditions of citation — not stylistic choices.
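Those characteristics can be checked before a release goes out. The sketch below is an illustrative pre-publication scorer, not a Muck Rack tool; the counting heuristics are rough assumptions, and any thresholds should be calibrated against your own cited and uncited releases.

```python
import re

def release_structure_report(text: str) -> dict:
    """Approximate the structural signals associated with cited releases:
    statistical density and bullet structure."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    statistics = re.findall(r"\d+(?:\.\d+)?%?", text)  # numbers and percentages
    bullets = [ln for ln in text.splitlines() if ln.lstrip().startswith(("-", "•"))]
    return {
        "sentences": len(sentences),
        "statistics": len(statistics),
        "stats_per_sentence": round(len(statistics) / max(len(sentences), 1), 2),
        "bullet_points": len(bullets),
    }

draft = """Example Firm reports a 38% rise in pre-charge instructions in 2025.
- 120 matters handled across three offices
- Average response time cut from 48 hours to 6 hours"""
print(release_structure_report(draft))
```

A draft that scores near zero on statistics and bullets is a release written for a journalist’s eye, not for extraction. Both audiences now matter.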
The enterprise problem
You almost certainly have PR infrastructure. The question is whether it is pointed at the right problem.
For an enterprise communications team, the 2% journalist overlap finding is the most commercially significant number in Muck Rack’s data. Your team is investing — with considerable time and budget — in a journalist list that has approximately 2% overlap with the writers whose work actually influences your AI citation profile. This is not a failure of execution. It is a failure of measurement. Traditional PR metrics — coverage volume, outlet tier, estimated reach — were never designed to track AI citation influence, and they do not.
The fix is a mapping exercise, not a bigger PR budget. Run the queries that matter across ChatGPT, Perplexity, Gemini and Claude. Read every citation. Build the actual list of outlets whose work is appearing. Those are your 20. Realign pitching around them. Separately: the 35% decline in management consulting citations is a direct warning for any enterprise running a thought leadership programme through owned channels. Third-party corroboration of your claims creates AI citation value. First-party publication of those same claims does not.
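The tally itself is mechanical once the citations are collected. A sketch, assuming you have saved every cited URL from your query runs; the collection step is manual reading of responses, and no vendor API is assumed:

```python
from collections import Counter
from urllib.parse import urlparse

# Placeholder URLs standing in for every citation harvested from your query
# runs across ChatGPT, Perplexity, Gemini and Claude.
cited_urls = [
    "https://www.reuters.com/legal/example-story",
    "https://www.lawgazette.co.uk/news/example-item",
    "https://www.reuters.com/legal/another-example",
]

def top_outlets(urls, n=20):
    """Count citations per outlet domain and return the n most cited."""
    domains = Counter(urlparse(u).netloc.removeprefix("www.") for u in urls)
    return domains.most_common(n)

for outlet, count in top_outlets(cited_urls):
    print(f"{outlet}: {count}")
# With the placeholders, reuters.com appears twice; with real data, the
# output is the list of outlets your pitching should realign around.
```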
The AI Visibility Audit maps your current citation profile against competitors, identifies your actual 20 outlets, and produces a prioritised remediation plan. Run an AI Visibility Audit →
The specialist problem
This is where query-type analysis matters most — because the aggregate statistics will actively mislead you if applied without segmentation.
For your branded queries, owned content citation rates run at 80–90%. This is not where your problem is. For your specialist and niche queries, owned content is still significant at 50–70%. If you have genuinely comprehensive, expert-level content in your practice areas, AI is citing it. Do not redirect that investment to earned media on the basis of a statistic that describes a different query type. For your general and comparison queries — the discovery layer where prospects who do not know your name are building a shortlist — earned media dominates. This is where your Floor Three deficit costs you real business.
Here is the thing most specialist firms do not see. Because their content is genuinely good, they assume any visibility problem must be a content-quality problem. The actual problem is that they are never in the shortlist for comparison queries in the first place — because AI’s selection mechanism operates before it ever reaches their content. Floor Two excellence does not help you if Floor Three trust has not put you in the consideration set.
The handover that never happens. Your prospective client has done their homework. They know your name, your specialism, your track record. They ask an AI assistant to help them make direct contact. If your entity foundation is incomplete, AI hedges, conflates you with a similarly named firm, or routes them through a generic contact page. The prospect chose you. The connection failed silently. No record that the enquiry existed.
The AI Visibility Audit diagnoses exactly which floor is failing and what to build next. Run an AI Visibility Audit →
The discovery problem
The aggregate statistics apply to you most directly. Your prospective customers do not know your name — they are asking AI for a recommendation, and AI is building the shortlist from which they will choose. In that query type, owned content accounts for only 20–40% of citations. Earned media is doing the work of getting you into the conversation at all.
The mistake most SMEs make is completing Floors One and Two — and expecting Floor Three outcomes. Schema tells AI what your content says. It does not tell AI whether to trust you. Trust happens off your site, through the independent sources AI already respects. No amount of structured data changes that.
The AI Visibility Audit tells you which floor you are on and what to build next. Run an AI Visibility Audit →
Where to start: the priority framework
Sequence matters. Floor Three work without Floor One in place is partially wasted.
This month — understand before you act
| Action | What it tells you |
|---|---|
| Run your own AI audit across four models | Ask ChatGPT, Perplexity, Gemini and Claude about your business, category, and competitors. Read every citation. Note every error, gap, and competitor that appears where you do not. |
| Map your query landscape | Identify which query types (branded / specialist / comparison / educational) drive the majority of your business. This determines your strategy. |
| Identify your actual 20 outlets | Which publications currently cite you in AI responses? Which cite your competitors? Which do AI models use for your sector? |
This quarter — build the foundation
| Action | What it builds |
|---|---|
| Complete Floor One | Schema markup, structured data, entity disambiguation, NAP consistency, Wikidata presence. Non-negotiable prerequisite. |
| Complete Floor Two | Content structure audit, answer-first reorganisation, extractability review of your highest-value specialist content. |
| Restructure press releases | Double the statistical density. Increase bullet structure. Raise the objective sentence rate. Structural characteristics of cited releases, not stylistic preferences. |
| Establish publishing cadence | Regular, consistent output. The first-7-day citation window rewards frequency alongside quality. |
This year — build the infrastructure that compounds
| Action | What it builds |
|---|---|
| Build relationships with your 20 | Targeted, sustained relationships with the outlets AI actually cites in your sector — not the most prestigious, but the most citation-influential for your query type. |
| Realign PR journalist targeting | Map current journalist relationships against AI citation patterns. The 2% overlap problem is solved by measurement, not by pitching more. |
| Corroborate your specialist claims | Get your key claims and statistics cited by independent sources — academic, journalistic, institutional. Third-party validation creates AI citation value. First-party publication of the same claim does not. |
What is shifting — and why the principle matters more than the source
Citation behaviour is not static. Muck Rack tracked substantial changes across all models between July and December 2025: Wikipedia reliance dropped sharply; management consulting citations fell 35%; press release citations rose 5×; YouTube spiked in Gemini responses then partially reverted; Reuters grew consistently across models.
The specific sources AI trusts will continue to change. Models are updated, retrained, and tuned continuously. The citation mix can shift quickly.
What does not change: AI consistently reaches for sources that are specific, recent, independently validated, and from outlets it has already learned to trust for the query context. Build for the principle, not for today’s source list.
The honest summary
The 82% earned media figure is accurate. It is also incomplete without the query-type context that shows where owned content still dominates — and applying it without that context produces the wrong strategy for a significant proportion of businesses.
The AEO industry’s focus on on-page structure and content extractability is not wrong. It addresses one floor of a four-floor building and presents it as the answer.
The correct question is not “owned or earned?” The correct question is: which query types drive my business, what does citation behaviour look like within those specific query types, and am I building all four floors in the right order?
AI visibility is not a content tactic. It is an evidence-and-selection problem. Businesses still treating it as a formatting exercise are optimising the wrong layer.
The businesses that get this right will be cited because AI has learned — from multiple independent sources — that they are worth citing. The businesses that do not will have increasingly well-structured content that AI cites less and less.
The enquiry you never knew you lost will keep not arriving.
For the argument that underpins this guide in a single shareable piece, see AEO Is Solving the Wrong 8%. For the page-level extractability standard that Floor Two requires, see the CITATE framework. For the formal AI visibility audit that maps where your business sits across all four floors, see the AI Visibility Audit.
Sources: Muck Rack Generative Pulse Report, December 2025 (1M+ AI response links, longitudinal analysis July–December 2025, models: ChatGPT, Claude, Gemini, Perplexity). University of Toronto AI citation study, September 2025 (13 industries). Directional analysis of AI citation behaviour across a specialist legal client’s query set in criminal defence, cross-referenced against live AI system responses, 2025 — intended to illustrate query-type variation, not establish universal benchmarks.