ChatGPT Search: The Largest AI Audience for Your Brand
ChatGPT has over 300 million weekly active users, more than any other AI platform. When OpenAI launched ChatGPT Search in late 2024, integrating real-time web retrieval directly into ChatGPT’s interface and replacing the previous Bing-powered browsing with its own search infrastructure, it created the largest AI discovery surface on the internet. Getting your brand cited in ChatGPT Search is not an emerging opportunity. It is a present commercial reality, and for most B2B businesses, the gap between the opportunity and their preparation for it is significant.
This guide is specifically about ChatGPT Search: the web retrieval mode that activates when ChatGPT needs current or specific information beyond its training knowledge. This is distinct from ChatGPT in conversational mode (drawing on training data only) and from the original Bing-browsing feature it replaced. Understanding the distinction matters because the optimisation approaches differ — and most guidance currently available conflates these modes in ways that produce confused strategy.
ChatGPT SEO sits within the broader LLM Optimisation framework alongside Perplexity SEO, Google AI Overviews Optimisation, and AI Agent Optimisation. Each platform has its own retrieval mechanics and citation patterns. This guide covers what is specific to ChatGPT — the source authority model, the training-versus-retrieval decision, the citation format, and the content characteristics that consistently produce citations in ChatGPT Search responses.
How ChatGPT Search Works
ChatGPT’s search capability uses OAI-SearchBot — OpenAI’s web crawler — to retrieve current content from the web. Unlike a pure language model that draws exclusively from training knowledge, ChatGPT with search enabled retrieves real-time web content for queries where current information, specific facts, or recency matters. The retrieved content is combined with the model’s training knowledge to generate a response, with citations attached to claims drawn from web sources.
When ChatGPT Triggers Web Search
ChatGPT does not search the web for every query. According to OpenAI data reported in 2025, web search is triggered for approximately 31% of ChatGPT prompts — primarily those involving current events, specific factual lookups, product and vendor research, pricing queries, and any topic where the model detects that its training knowledge may be outdated or insufficient. For queries about well-established concepts, definitions, or stable information, ChatGPT draws from training data without retrieval.
This training-versus-retrieval decision has direct implications for GEO strategy. Content that appears in ChatGPT responses without search being triggered is there because of training data inclusion — the model learned about your brand during its training process, before its knowledge cutoff. Content that appears when search is triggered is there because of real-time retrieval quality — your site was found, evaluated and cited in the search session. These are two distinct mechanisms requiring different strategies: training data presence (brand authority, widespread mentions, entity recognition across the web) and retrieval quality (the same content optimisation factors that drive Perplexity and other AI search citations).
OAI-SearchBot and Indexing
OAI-SearchBot is OpenAI’s dedicated search crawler, distinct from GPTBot (which crawls for model training). OAI-SearchBot crawls for real-time retrieval purposes — the content it accesses is used in ChatGPT Search responses, not stored for training. This is an important distinction: blocking GPTBot prevents your content from being used in training data but does not prevent ChatGPT Search from citing you. Blocking OAI-SearchBot prevents real-time citation but does not affect training data. Most businesses should allow both.
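The difference between the two crawlers can be checked with a short script. This is a minimal sketch using Python’s standard-library robots.txt parser; the robots.txt content and the example.com URL are hypothetical, and only the two user-agent tokens come from the text above.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: opts out of training crawls, stays open to search retrieval.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: OAI-SearchBot
Allow: /
"""

def crawler_access(robots_txt: str, url: str, agents: tuple) -> dict:
    """Return whether each user agent may fetch the URL under the given rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}

access = crawler_access(ROBOTS_TXT, "https://example.com/guide", ("GPTBot", "OAI-SearchBot"))
print(access)  # {'GPTBot': False, 'OAI-SearchBot': True}
```

To audit a live site, the same function can be fed the text of your own robots.txt file instead of the sample string.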
Verify that your robots.txt allows OAI-SearchBot, and check your server logs for OAI-SearchBot crawl activity. If you see no OAI-SearchBot activity, your citation gap is at the access layer, which is the simplest and most fixable problem in ChatGPT SEO. Ensure your priority pages serve complete HTML in under two seconds, with all structured data present server-side before JavaScript execution. OAI-SearchBot, like other AI crawlers, has limited tolerance for slow-loading or JavaScript-dependent content.
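The log check can be sketched as a simple count of crawler requests per path. The log lines below are hypothetical and assume a combined-log-style format; the user-agent strings are simplified stand-ins, so adapt the parsing to whatever your server actually writes.

```python
from collections import Counter

# Hypothetical access-log lines; user-agent strings are simplified stand-ins.
SAMPLE_LOG = [
    '203.0.113.5 - - [02/Mar/2026:10:15:32 +0000] "GET /pricing HTTP/1.1" 200 5120 "-" "OAI-SearchBot/1.0"',
    '203.0.113.5 - - [03/Mar/2026:08:01:10 +0000] "GET /guide HTTP/1.1" 200 9340 "-" "OAI-SearchBot/1.0"',
    '198.51.100.7 - - [03/Mar/2026:09:12:44 +0000] "GET /pricing HTTP/1.1" 200 5120 "-" "ExampleBrowser/1.0"',
    '203.0.113.5 - - [04/Mar/2026:11:45:00 +0000] "GET /pricing HTTP/1.1" 200 5120 "-" "OAI-SearchBot/1.0"',
]

def crawler_hits_by_path(lines, agent_token="OAI-SearchBot"):
    """Count requests per path for log lines that mention the crawler token."""
    hits = Counter()
    for line in lines:
        if agent_token.lower() not in line.lower():
            continue
        parts = line.split('"')
        if len(parts) < 2:
            continue  # line is not in the expected format
        request = parts[1].split()  # e.g. ['GET', '/pricing', 'HTTP/1.1']
        if len(request) >= 2:
            hits[request[1]] += 1
    return hits

hits = crawler_hits_by_path(SAMPLE_LOG)
print(hits.most_common())
```

Priority pages that never appear in the result are the ones to investigate first.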
The Citation Format: Inline Links vs Numbered Citations
ChatGPT Search uses inline citation links rather than Perplexity’s numbered source panel. This makes ChatGPT citations less visually prominent — users see an inline footnote marker rather than a numbered list of sources at the side. But inline citations are still clearly visible and clickable, and the brands cited within ChatGPT’s answers still receive the credibility signal of being selected by the AI as a reliable source. The traffic impact from ChatGPT citations can be substantial: with 300 million+ weekly active users, even a modest citation rate across relevant queries translates to meaningful referral traffic volume.
The Three Routes Into a ChatGPT Answer
Understanding how content reaches a ChatGPT answer matters more than understanding which on-page factors correlate with citation, because the route determines the optimisation work. There are three distinct routes — and businesses that conflate them tend to misdirect their effort.
Route 1: Training Data Presence
The first route is the one most businesses never consciously work on: appearing in ChatGPT’s training data. When ChatGPT responds without triggering a web search, every reference it makes — whether or not it cites a source link — is drawn from what the model learned during training. Brands and entities that appeared prominently and accurately in the web content that fed model training are recognised by ChatGPT before any search happens. The model does not pull this information in real time; it has already internalised it.
You cannot directly influence what is in any past training cycle — the knowledge cutoff is fixed. But you can build the brand signals that ensure the next training cycle contains accurate, positive representations of your entity. Wikidata entries (see our Wikidata for SEO guide), Wikipedia presence where eligible, consistent press mentions in named industry publications, LinkedIn company page completeness, named expert quotes in third-party articles, conference proceedings and podcast appearances all contribute to training data presence. The work compounds slowly and asymmetrically — you cannot verify uptake until the next training cycle, but by then you cannot retroactively add yourself either.
Route 2: Real-Time Web Retrieval via OAI-SearchBot
The second route is what most GEO discourse refers to when it says “ChatGPT SEO”: real-time web retrieval. When the model determines that current or specific information is needed — recency queries, pricing, vendor research, comparisons, specific factual lookups — it calls its internal web tool to retrieve fresh content. OAI-SearchBot is the dedicated crawler that fetches this content. The retrieval pool is then evaluated and the model decides which sources to cite in the synthesised answer.
This is the route most directly responsive to on-page optimisation. The retrieval quality factors that drive citation here are the same ones that drive Perplexity citation: clean server-rendered HTML, fast load times, node architecture, content density in the first 30% of the page (per Kevin Indig’s analysis of 1.2 million ChatGPT responses, published in Growth Memo in February 2026, 44.2% of citations come from the first 30% of cited pages), structured data, factual specificity, named entities, dated content. If OAI-SearchBot cannot crawl you, you are absent from Route 2. If it can crawl you but your content is not extractable, you are crawled but uncited.
Route 3: The Bing Index Dependency
The third route is less discussed but operationally important: for a significant proportion of queries, ChatGPT Search’s retrieval pool draws from the Bing index rather than crawling the web independently. The same Microsoft infrastructure that powers Copilot retrieval also informs ChatGPT’s retrieval candidate set for many query types. This means Bing indexing health — sitemap submission to Bing Webmaster Tools, IndexNow configuration, Bing-specific accessibility — matters for ChatGPT citation in ways that are easy to overlook because the connection is not visible in OAI-SearchBot logs alone.
Most businesses optimise heavily for Google indexing and treat Bing as an afterthought. For ChatGPT citation, this is a structural blind spot. See our Bing AI visibility guide for the indexing work that supports both Copilot and ChatGPT retrieval routes.
Why the Three-Routes Distinction Matters
A page that performs perfectly on retrieval quality (Route 2) but whose business has minimal training data presence (Route 1) will be cited when search is triggered — but absent from the majority of ChatGPT responses where search is not triggered. A business with strong entity authority (Route 1) but technical accessibility problems (Route 2 and Route 3) will be discussed without citation. The strategic question is which route is currently your weakest, because that is the route that determines your ceiling.
How ChatGPT Evaluates Sources: Key Differences from Perplexity
ChatGPT Search and Perplexity share the same fundamental content quality evaluation criteria, but they weight signals differently in practice. Understanding these differences determines how you prioritise your optimisation efforts across both platforms.
Authority Weighting Is Higher
ChatGPT Search applies stronger weighting to overall domain authority than Perplexity does. Well-known brands, high-authority domains and recognised industry publications are preferentially cited over equally well-written content from lower-authority domains. This means ChatGPT can be harder to break into for newer or smaller businesses, but it also means that authority-building work — acquiring authoritative backlinks, building entity recognition across the web, establishing consistent brand signals — produces measurable citation improvement on ChatGPT in ways that less authority-sensitive platforms do not reward as directly.
For specialist businesses, the strategy is topical authority rather than overall domain authority. ChatGPT’s authority evaluation is topic-specific: a domain that is not particularly well-known overall but that has demonstrable, concentrated expertise on a specific topic will be cited for that topic over a higher-authority generalist domain with thinner coverage. This is the same mechanism that drives Google’s expertise signals — and it is why building a deep content ecosystem around your core specialisms, rather than covering broad categories shallowly, is the most effective long-term strategy for ChatGPT citation.
Fewer Citations, Higher Quality Threshold
ChatGPT Search typically cites two to four sources per answer, compared to Perplexity’s four to six. This higher selectivity means the quality threshold for citation is higher, but it also means that getting into ChatGPT’s citation set for a query represents a stronger competitive advantage. When ChatGPT cites you among two or three sources, the implicit recommendation is more concentrated than appearing as source five of six in a Perplexity response.
The practical implication: for ChatGPT optimisation, focus on depth over breadth. Two or three outstanding, highly authoritative pages on your core topics will outperform ten adequately optimised pages. ChatGPT’s source selection is not tolerant of mediocrity: the bar for inclusion is higher, but the reward for clearing it is proportionally greater.
Training Data Presence as a Foundation
ChatGPT’s unique characteristic among AI search platforms is the combination of training knowledge and real-time retrieval. Brands and entities that appear prominently in ChatGPT’s training data are treated as recognised entities by the model — which influences how confidently it cites them in search responses. A brand that ChatGPT “knows” from training is cited with higher confidence than an equally authoritative brand that it is encountering primarily through real-time retrieval.
You cannot directly influence what is in ChatGPT’s training data, which has a cutoff date. But you can build the brand signals that make your entity recognisable when the next training cycle occurs, and you can ensure that your entity’s digital footprint is consistent and authoritative enough that ChatGPT’s training data contains positive, high-quality representations of your brand. Wikidata entries, consistent press mentions, published case studies referenced by others, industry awards, and LinkedIn company page completeness all contribute to training data presence in ways that make ChatGPT’s search-mode citations more confident. See our Entity Authority Checklist for the full framework.
Content Comprehensiveness vs Atomic Sections
Perplexity retrieves at the chunk level — individual sections and paragraphs. ChatGPT’s retrieval is more holistic: it evaluates page comprehensiveness as well as specific claim extractability. A page that comprehensively covers a topic from multiple angles — definition, comparison, how-to, use cases, limitations, evidence — scores higher in ChatGPT’s source evaluation than a page that covers one angle very specifically. This does not contradict the node architecture principle — it extends it. Each section should be independently citable (for Perplexity and other platforms), and the page as a whole should be comprehensively authoritative (for ChatGPT’s holistic evaluation).
ChatGPT’s Query Decomposition and Fan-Out
Like other AI search platforms, ChatGPT decomposes complex queries into multiple sub-queries before retrieving sources. When search is triggered, the model generates a set of search terms that collectively address the underlying information need, retrieves results for each, and synthesises a response from the combined retrieval set.
ChatGPT’s fan-out pattern differs from Perplexity’s iterative Pro Search model. ChatGPT typically generates its sub-queries in a single decomposition step rather than iteratively refining them across multiple rounds. This means the initial query decomposition is broader — covering more angles simultaneously — but the total retrieval depth per sub-query is shallower than Perplexity Pro Search’s iterative passes. Content that is highly relevant to a broad range of sub-query phrasings performs consistently in ChatGPT retrieval; content optimised for a narrow keyword cluster may only appear when ChatGPT’s decomposition happens to generate that specific phrasing.
The implication is consistent with the broader GEO principle: semantic coverage of a topic’s full conceptual territory is more stable than narrow keyword targeting. Write for the range of ways a question could be phrased, not for one specific phrasing. The Content SEO node architecture approach — comprehensive coverage structured in independently extractable sections — is the content strategy most durable across ChatGPT’s variable sub-query generation.
Training Data vs Real-Time Retrieval: A Practical Strategy
The dual-channel nature of ChatGPT’s knowledge — training data plus real-time retrieval — creates a two-stage strategy for ChatGPT SEO that no other AI platform requires to the same degree.
Stage 1: Entity recognition in training data. Build the brand signals that ensure ChatGPT’s training data contains accurate, positive representations of your entity. This means: consistent brand mentions across authoritative web sources, a complete and accurate Wikidata entry (see our Wikidata for SEO guide), active Wikipedia presence if eligible, comprehensive LinkedIn company and personal profiles, press mentions in industry publications, and third-party content that references your brand with accurate descriptions of your expertise. None of this is directly controllable, but all of it is achievable through consistent brand-building activity. The brand that ChatGPT “knows” confidently from training will be cited more confidently in search responses.
Stage 2: Retrieval quality for current content. For real-time retrieval, the same content optimisation principles that drive Perplexity citations apply: node architecture, factual specificity, structured data, entity consistency, freshness. Ensure OAI-SearchBot access. Maintain page speed. Keep priority content substantively updated. Implement schema markup server-side. These retrieval quality signals are what get you cited when ChatGPT searches the web for current information.
The businesses that perform best in ChatGPT citations are those that invest in both stages simultaneously — not just content optimisation for retrieval, but the broader brand authority building that makes ChatGPT’s model recognise and trust your entity before it even performs a web search.
Content Strategy for ChatGPT Citations
Given ChatGPT’s higher authority weighting and holistic page evaluation, the content strategy for ChatGPT SEO emphasises depth and credibility over volume and breadth.
Demonstrate Genuine Expertise, Not Category Coverage
ChatGPT’s source selection is particularly sensitive to the difference between genuine expert content and surface-level category coverage. Pages that reflect hands-on experience — specific client results, tested methodologies, named examples, practitioner-level detail — are evaluated as more credible than pages that rehash publicly available information in new words. When we write about GEO strategy on this site, it reflects two years of systematic testing across client engagements, not a summary of other practitioners’ published work. That experiential grounding is visible in the content — specific data points from real projects, honest acknowledgements of limitations, methodology explanations that go beyond the generic — and ChatGPT’s source evaluation can distinguish it from content that does not have the same basis.
Authoritative Source Attribution
ChatGPT’s citation model responds strongly to claims that are backed by named, authoritative sources. Internal data points (your own client results, your own research) have some value, but they carry more weight when combined with references to external authoritative sources — peer-reviewed research, industry surveys from recognised organisations, official regulatory guidance, named expert perspectives. Content that synthesises multiple authoritative sources and attributes each claim correctly provides ChatGPT with the source chain it needs to cite you confidently. GEO-Bench data from Princeton, Georgia Tech and IIT Delhi; Ahrefs divergence research; Similarweb query fan-out findings — these are the kinds of named, attributable sources that elevate a page from opinion to evidence-based analysis in ChatGPT’s evaluation.
E-E-A-T Signals at Author and Domain Level
ChatGPT’s training data contains extensive information about Google’s E-E-A-T framework, and its source evaluation reflects these criteria — not because it follows Google’s guidelines, but because the same quality signals that define E-E-A-T are genuinely associated with more credible, more citable content. Experience (demonstrated first-hand engagement with the topic), Expertise (topical depth and consistency), Authoritativeness (third-party recognition and reference), and Trustworthiness (consistent entity signals, verifiable claims, transparent attribution) all function as positive signals in ChatGPT’s source evaluation.
Implement author attribution on all content: named author, author bio linking to credentials page, author schema with expertise areas and sameAs links. Implement Organisation schema with comprehensive knowsAbout properties. Publish and maintain an About page that clearly establishes the entity’s expertise basis. These E-E-A-T signals are particularly important for ChatGPT because its authority evaluation leans heavily on entity recognition — being clearly identified as an expert entity, not just an anonymous domain with good content.
Case Studies with Specific, Verifiable Metrics
Case studies with quantified results are among the highest-value content types for ChatGPT citation in commercial and professional services categories. When a user asks ChatGPT “which SEO agencies have delivered measurable results for SaaS companies in the UK?”, ChatGPT searches for content that demonstrates specific, verifiable results — not general descriptions of services. Named clients, specific metrics, and outcomes that could in principle be verified are the evidence standard ChatGPT’s citation model applies. Our work with Azure Outdoor Living reaching seven-figure turnover through SEO, Motoring Defence Solicitors achieving seven position-one rankings, and Coviant Software generating 200+ enterprise leads — these are the specific, attributed proof points that ChatGPT cites when answering research-intent queries about SEO results.
The Concentration Problem: ChatGPT Citation Is a Limited-Seat Game
The most important strategic property of ChatGPT citation is one that practitioners often discover only after months of optimisation work fails to produce visible results: citation distribution is highly concentrated. Kevin Indig’s March 2026 analysis of approximately 98,000 ChatGPT citation rows from 1.2 million responses (data from Gauge, published in Growth Memo) found that the top 10 domains capture 46% of all citations within a topic, and the top 30 capture 67%. Indig’s description is direct: there are effectively around 30 seats at the citation table for any given topic, and everything else is nearly invisible.
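If you track citations yourself, the same concentration measure can be computed for your own topic. This is a minimal sketch; the domain names and counts are invented purely for illustration and are not taken from the analysis cited above.

```python
def top_n_share(citations_by_domain: dict, n: int) -> float:
    """Share of all citations captured by the n most-cited domains."""
    total = sum(citations_by_domain.values())
    if total == 0:
        return 0.0
    top = sorted(citations_by_domain.values(), reverse=True)[:n]
    return sum(top) / total

# Hypothetical citation counts for one topic.
sample = {"a.example": 50, "b.example": 30, "c.example": 15, "d.example": 5}
share = top_n_share(sample, 2)
print(f"Top 2 domains hold {share:.0%} of citations")  # Top 2 domains hold 80% of citations
```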
This concentration is slightly less extreme than traditional organic search, but it is still extreme. It is also self-reinforcing. The citation distribution feeds back into training data on the next model cycle (the cited brands become more recognised entities), which compounds the Route 1 advantage for those already in the top 30, which in turn strengthens their retrieval candidacy on Route 2 and Route 3. The seats accrue to those who already hold seats.
The Bigfoot Effect: When Citation Concentrated Further
The concentration is also moving in the wrong direction for newer entrants. On 4 March 2026, ChatGPT switched its default model from GPT-4o/5.2 to GPT-5.3 Instant. Analysis by Resoneo, the French SEO consultancy, using data from the Meteoria monitoring platform (400 daily prompts tracked across 14 weeks, 27,000 comparable responses) measured the impact: the average number of unique domains cited per response dropped from 19 before the transition to 15 after — a reduction of roughly 20%. The citation surface in each response had not shrunk; the number of websites sharing that surface had. Same pie, fewer slices.
Resoneo named the phenomenon the Bigfoot Effect after Dr Pete Meyers’ 2012 Moz observation that Google would sometimes let a single domain occupy the entire first page of organic results, leaving a massive footprint. Independent log analysis from Jérôme Salomon at Oncrawl confirmed the pattern through ChatGPT-User bot crawl volume settling at a lower level over the same period. Some pages were no longer being crawled at all. The root cause is structural: more than 90% of ChatGPT’s weekly users are on the free tier, and the default tier triggers fewer web searches, uses fewer queries, and produces fewer citations than the paid-tier model behaviour on which the early studies were partly based. OpenAI subsequently replaced GPT-5.3 Instant with GPT-5.5 Instant in May 2026; the broader concentration trend has not reversed.
What the Concentration Data Implies for Strategy
The strategic implication is uncomfortable but operationally clear. ChatGPT optimisation cannot be approached as a generic SEO discipline applied to a new platform. It is a competitive contest for a limited number of citation seats per topic, where the seats are concentrated, self-reinforcing, and structurally narrowing as default models reduce retrieval volume. Three implications follow.
First, broad coverage strategies do not work. The teams winning citations are not covering hundreds of keywords thinly; they are building deep, comprehensive coverage of a small number of topics where they can credibly compete for a seat. Indig’s data shows that pages above 20,000 characters average 10.18 citations each compared to 2.39 for pages under 500 characters — but the relationship is non-linear and topic-specific. Length without authority is not the lever.
Second, citation reach (the number of distinct prompts a domain is cited across) is a more useful metric than raw citation count. A domain cited many times for one query is less valuable than a domain cited fewer times across many distinct queries. The strategic move is to architect content around query clusters rather than individual keywords — owning the conceptual territory of a topic rather than ranking for one phrasing.
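The difference between the two metrics is easy to see in code. This is a minimal sketch, assuming your monitoring data can be reduced to (prompt, cited domain) rows; the rows below are hypothetical.

```python
from collections import defaultdict

def reach_and_count(rows):
    """rows: iterable of (prompt, domain). Returns {domain: (reach, raw_count)}."""
    prompts = defaultdict(set)
    counts = defaultdict(int)
    for prompt, domain in rows:
        prompts[domain].add(prompt)  # reach counts distinct prompts only
        counts[domain] += 1          # raw count includes repeats
    return {domain: (len(prompts[domain]), counts[domain]) for domain in counts}

# Hypothetical data: a.example is cited often for one prompt,
# b.example is cited less often but across three prompts.
rows = [("q1", "a.example")] * 4 + [
    ("q1", "b.example"), ("q2", "b.example"), ("q3", "b.example"),
]
metrics = reach_and_count(rows)
print(metrics)  # {'a.example': (1, 4), 'b.example': (3, 3)}
```

Here a.example wins on raw count but b.example has three times the reach, which is the distinction the metric is meant to surface.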
Third, in high-concentration verticals (Indig identifies Education and Crypto as examples), the realistic path is becoming the definitive resource on a specific sub-topic rather than competing for top-30 status across an entire vertical. The 30-seat table is per topic; carving out a defensible sub-topic where you are unambiguously among the top three or four credible sources is more achievable than fighting for a generalist seat. This is the same logic that drives our specialist client work: Motoring Defence Solicitors dominates motoring law in the UK rather than fighting for general legal SEO citation; Azure Outdoor Living dominates premium outdoor living rather than competing across all home improvement.
ChatGPT Search and Structured Data
Structured data plays a similar role in ChatGPT Search as in other AI platforms — it makes your content machine-readable, reduces ambiguity in entity identification, and provides clean, extractable content that the model can cite with higher confidence. The specific schema types most relevant to ChatGPT citation are consistent with the broader GEO schema stack.
Organisation schema with knowsAbout (the schema.org type itself is spelled Organization) establishes your entity’s expertise associations at a machine-readable level. ChatGPT’s authority evaluation is partly based on whether it can confidently identify your entity and its expertise domain, and Organisation schema makes this identification explicit rather than inferred from prose. Include sameAs links to every verified profile: LinkedIn company page, Wikidata entry, industry directories, Companies House, Google Business Profile. Each sameAs link is a cross-reference that strengthens ChatGPT’s confidence in your entity.
Person schema for key individuals matters significantly for ChatGPT, because the model’s authority evaluation incorporates individual expertise signals alongside domain signals. A consultancy whose principal is a named, described, credentialled individual with a schema-backed profile is evaluated as more authoritative than an equivalent consultancy whose people are anonymous. Implement Person schema with jobTitle, alumniOf, knowsAbout, and sameAs links to the individual’s LinkedIn, Wikidata entry (if present), and other verified profiles.
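As an illustration of the two entity types described above, the following sketch builds Organization and Person JSON-LD in Python and wraps it in a script tag for server-side output. Every name, URL and identifier is a placeholder, and the choice of properties is an assumption about a reasonable minimum, not a required set.

```python
import json

# All names, URLs and IDs below are placeholders for illustration.
organization = {
    "@context": "https://schema.org",
    "@type": "Organization",
    "@id": "https://example.com/#organization",
    "name": "Example Consultancy",
    "url": "https://example.com/",
    "knowsAbout": ["Generative Engine Optimisation", "Technical SEO"],
    "sameAs": [
        "https://www.linkedin.com/company/example-consultancy",
        "https://www.wikidata.org/wiki/Q00000000",
    ],
}

person = {
    "@context": "https://schema.org",
    "@type": "Person",
    "name": "Jane Doe",
    "jobTitle": "Principal Consultant",
    "alumniOf": {"@type": "CollegeOrUniversity", "name": "Example University"},
    "knowsAbout": ["Generative Engine Optimisation"],
    "sameAs": ["https://www.linkedin.com/in/jane-doe-example"],
    "worksFor": {"@id": organization["@id"]},  # links the person to the organisation
}

def to_script_tag(data: dict) -> str:
    """Serialise a JSON-LD dict into a script tag for the server-rendered HTML."""
    # Escape "<" so a string value can never close the script element early.
    payload = json.dumps(data, indent=2).replace("<", "\\u003c")
    return '<script type="application/ld+json">' + payload + "</script>"

org_tag = to_script_tag(organization)
person_tag = to_script_tag(person)
print(org_tag)
```

Emitting the tags from the server template, rather than injecting them with client-side JavaScript, keeps them visible to crawlers that do not execute scripts.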
FAQPage and HowTo schema function as they do across other AI platforms — providing structured, extractable content units that ChatGPT can cite with machine-readable precision. Ensure all schema is server-rendered, valid, and present on every priority page. See our JSON-LD implementation guide for the implementation approach we use across client sites.
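For FAQ content, a small helper can generate the markup from question and answer pairs. This is a sketch only; the sample question and answer are placeholders, and the helper name is hypothetical.

```python
import json

def faq_page(pairs):
    """Build a schema.org FAQPage dict from (question, answer) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }

# Placeholder content for illustration.
faq = faq_page([("What is ChatGPT Search?", "The web retrieval mode of ChatGPT.")])
faq_json = json.dumps(faq, indent=2)
print(faq_json)
```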
Measuring ChatGPT Citation Performance
ChatGPT citation measurement is harder than Perplexity measurement because ChatGPT’s citations are less visually prominent and its responses are more variable between sessions. The same query submitted twice may produce different source citations as ChatGPT’s retrieval generates different sub-queries. This variability is not a measurement failure — it reflects how probabilistic AI retrieval works. The goal is to increase citation probability across your target queries, not to lock in a deterministic citation.
Manual Citation Testing
Monthly manual testing remains the primary measurement approach. Using ChatGPT with search enabled (not the default conversational mode), run your 20 to 30 priority queries and record citation presence. For each query, run it two to three times across different sessions to account for variability. Document the trend over time — are you being cited in more sessions, for more queries, in higher-priority positions in the answer? The trend matters more than any individual session result.
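One way to turn those manual test sessions into a trend is to log each session as a row and compute a monthly citation rate. The sketch below assumes a simple (month, query, cited) record; the observations are hypothetical.

```python
from collections import defaultdict

def citation_rate_by_month(observations):
    """observations: iterable of (month, query, cited), one row per test session."""
    cited = defaultdict(int)
    total = defaultdict(int)
    for month, _query, was_cited in observations:
        total[month] += 1
        cited[month] += int(was_cited)
    return {month: cited[month] / total[month] for month in sorted(total)}

# Hypothetical log: two queries, three sessions each, over two months.
observations = [
    ("2026-01", "q1", True), ("2026-01", "q1", False), ("2026-01", "q1", False),
    ("2026-01", "q2", True), ("2026-01", "q2", False), ("2026-01", "q2", False),
    ("2026-02", "q1", True), ("2026-02", "q1", True), ("2026-02", "q1", False),
    ("2026-02", "q2", True), ("2026-02", "q2", False), ("2026-02", "q2", False),
]
rates = citation_rate_by_month(observations)
for month, rate in rates.items():
    print(month, f"{rate:.0%}")
```

Because individual sessions vary, the month-over-month movement in this rate is more informative than any single row.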
Referral Traffic from ChatGPT
ChatGPT Search drives measurable referral traffic. In your analytics, look for sessions from chatgpt.com — these are users who clicked through on a citation link from a ChatGPT response. The volume will depend heavily on how frequently you are cited, how prominent the citation is, and how click-worthy the cited content is. ChatGPT-referred sessions tend to be high-quality: these users have been pre-qualified by the AI’s recommendation and are typically in an active research or evaluation phase. Track conversion rates from ChatGPT referrals specifically — they often outperform other referral sources on a per-session basis.
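If your analytics export includes raw referrer URLs, ChatGPT sessions can be segmented with a hostname check. This is a minimal sketch; the session rows are hypothetical, and treating subdomains of chatgpt.com as ChatGPT referrals is an assumption.

```python
from urllib.parse import urlparse

def is_chatgpt_referral(referrer: str) -> bool:
    """True when the referrer hostname is chatgpt.com or one of its subdomains."""
    host = (urlparse(referrer).hostname or "").lower()
    return host == "chatgpt.com" or host.endswith(".chatgpt.com")

def chatgpt_conversion_rate(sessions) -> float:
    """sessions: iterable of (referrer, converted). Rate for ChatGPT referrals only."""
    outcomes = [converted for referrer, converted in sessions if is_chatgpt_referral(referrer)]
    return sum(outcomes) / len(outcomes) if outcomes else 0.0

# Hypothetical session export.
sessions = [
    ("https://chatgpt.com/", True),
    ("https://chatgpt.com/", False),
    ("https://www.google.com/", False),
    ("https://notchatgpt.com/", True),  # lookalike host, must not be counted
]
rate = chatgpt_conversion_rate(sessions)
print(f"{rate:.0%}")  # 50%
```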
Training Data Presence Testing
Separately from search-mode citation, test how ChatGPT describes your brand in conversational mode (search not triggered). Ask “what do you know about [your brand]?” or “who are [your brand] and what do they specialise in?” The accuracy and completeness of this response reflects your training data presence. If ChatGPT doesn’t know your brand, describes it inaccurately, or confuses it with another entity, you have an entity authority gap that affects both training data presence and search-mode citation confidence. Our Wikidata for SEO guide and the broader AI Visibility Pyramid cover the entity foundation work that addresses this gap.
For the complete AI citation measurement framework across all platforms, see our guide to getting cited by AI, which includes the five-step self-audit workflow that identifies whether your citation gaps are at retrieval eligibility, source selection, or answer inclusion level.
Practical Checklist: Five Things to Audit on Your Highest-Value Pages
Before you invest in broader brand authority work, audit your highest-value commercial pages against these five concrete factors. Each one is observable, fixable, and supported by the published research on ChatGPT citation patterns. Run the audit against your top 10 to 20 commercial pages first — the ones that need to earn citation for your most strategic queries.
1. Opening Density: Is the Answer in the First 30%?
Per Kevin Indig’s analysis of 1.2 million ChatGPT responses, 44.2% of citations come from the first 30% of cited pages. The implication is direct: if your page opens with marketing prose, scene-setting, or an extended introduction before the actual answer arrives, you are losing citation probability on the most extractable section of your content. Open with the answer. Use definitive language (“X is defined as…”, “X refers to…”): Indig’s data shows pages with definitive opening language are roughly 1.8 times as likely to be cited (36.2% vs 20.2%). The supporting context, nuance, and elaboration belong below the fold, not above it.
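A rough way to audit this is to measure where the key answer sentence sits in the page text. The sketch below uses character position as a proxy; the 30% threshold comes from the text above, while the function names and sample pages are hypothetical.

```python
from typing import Optional

def answer_position(page_text: str, answer_snippet: str) -> Optional[float]:
    """Relative position (0.0 to 1.0) where the snippet starts, or None if absent."""
    index = page_text.lower().find(answer_snippet.lower())
    if index == -1 or not page_text:
        return None
    return index / len(page_text)

def answer_in_opening(page_text: str, answer_snippet: str, threshold: float = 0.30) -> bool:
    """True when the snippet starts within the opening share of the page text."""
    position = answer_position(page_text, answer_snippet)
    return position is not None and position <= threshold

# Hypothetical pages: one leads with the answer, one buries it.
answer_first = "GEO is defined as optimising for AI answers. " + "Supporting detail. " * 50
answer_last = "Scene-setting prose. " * 50 + "GEO is defined as optimising for AI answers."

print(answer_in_opening(answer_first, "is defined as"))  # True
print(answer_in_opening(answer_last, "is defined as"))   # False
```

Run it against the visible text of the page (after stripping navigation and boilerplate), since template markup would otherwise distort the position.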
2. OAI-SearchBot Access: Are You Crawlable in Reading Mode?
Verify robots.txt allows OAI-SearchBot. Check server logs for OAI-SearchBot activity in the last 30 days. According to Search Engine Land’s October 2025 reporting, 46% of ChatGPT bot visits begin in reading mode, a plain HTML version of the page with no JavaScript, images, or schema executed. If your page is JavaScript-rendered, injects its schema only client-side, or sits behind paywalls or cookie walls, you are crawled but unparseable. After landing on a page, 63% of ChatGPT agents leave immediately if the content is not extractable. Server-render the answer. Place schema in the HTML, not injected by JavaScript after page load.
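To check whether schema is present before any JavaScript runs, inspect the raw HTML response rather than the rendered DOM. The following sketch parses a raw HTML string with the standard library and lists the JSON-LD types it finds; the two HTML samples are hypothetical.

```python
import json
from html.parser import HTMLParser

class JsonLdCollector(HTMLParser):
    """Collects parsed JSON-LD blocks found in raw (unrendered) HTML."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.blocks.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # invalid JSON-LD is treated as absent

def jsonld_types_in_raw_html(html: str) -> list:
    """Return the @type of each top-level JSON-LD object in the raw HTML."""
    collector = JsonLdCollector()
    collector.feed(html)
    collector.close()
    return [block.get("@type") for block in collector.blocks if isinstance(block, dict)]

# Hypothetical responses: schema in the HTML versus schema added later by a script.
SERVER_RENDERED = (
    '<html><head><script type="application/ld+json">'
    '{"@context": "https://schema.org", "@type": "Organization", "name": "Example"}'
    "</script></head><body><h1>Answer</h1></body></html>"
)
CLIENT_INJECTED = (
    "<html><head><script>/* schema is added later by JavaScript */"
    ' var s = document.createElement("script");</script></head><body></body></html>'
)

print(jsonld_types_in_raw_html(SERVER_RENDERED))  # ['Organization']
print(jsonld_types_in_raw_html(CLIENT_INJECTED))  # []
```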
3. Title-to-Query Alignment
AirOps and Growth Memo’s analysis of 16,851 ChatGPT queries against 353,799 retrieved pages found that pages with a heading match (cosine similarity 0.90 or higher to the query) achieve a 41% citation rate, compared to 30% for pages below 0.50. Title-to-query alignment was the strongest single content signal in the dataset — stronger than word count, heading count, fan-out coverage, or domain authority. The implication is concrete: tighten your title tags to match the phrasing AI models actually use in their sub-queries, not just the phrasing humans type into Google. For your top commercial pages, manually prompt ChatGPT with your target query and note the exact wording the model uses in its sub-queries. Match that phrasing in your H1 and primary H2.
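To get a feel for title-to-query alignment, you can score a title against a sub-query with cosine similarity. The sketch below uses simple word counts as the vectors, which is an assumption made for illustration: the study's scores were presumably computed on embeddings, so the 0.90 and 0.50 thresholds quoted above will not transfer directly to these numbers. Treat the output as a relative ranking between candidate titles, not an absolute score.

```python
import math
import re
from collections import Counter
from typing import List


def tokens(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a: str, b: str) -> float:
    """Cosine similarity between the word-count vectors of two strings.

    A bag-of-words stand-in for embedding similarity (illustrative only).
    Returns 0.0 if either string has no tokens.
    """
    va, vb = Counter(tokens(a)), Counter(tokens(b))
    dot = sum(count * vb[word] for word, count in va.items())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)
```

Score each candidate H1 against the sub-query wording you recorded from ChatGPT and prefer the phrasing that scores highest, provided it still reads naturally.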
4. Citable Entities: Named Sources, Named People, Specific Numbers
Indig’s entity analysis showed that DATE and NUMBER entity types are universal positive citation predictors. Specific statistics with named sources, specific dates on content, named individuals with credentials, and named third-party sources all elevate citation probability. The reverse is also true: pages relying on Knowledge Graph-verified mega-entities (major brands, famous institutions) tend toward generic coverage, which ChatGPT does not preferentially cite. Niche-specific entities and precise numbers outperform brand-name density. Add publish dates, last-updated dates, and at least one specific number with a named source on every commercial page. Replace abstract claims with quantified ones.
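A quick first pass on this factor is to count the numbers and years that appear in a page's body text. The sketch below is a crude stand-in, assuming regular expressions in place of the entity recognition an analysis like Indig's would rely on. The function name and patterns are hypothetical, and it cannot tell whether a number has a named source attached.

```python
import re
from typing import Dict

# Numeric tokens such as 12, 44.2%, 1,200 (illustrative pattern only).
NUMBER_RE = re.compile(r"\b\d+(?:[.,]\d+)*%?")
# Four-digit years from 1900 to 2099, used as a rough proxy for dates.
YEAR_RE = re.compile(r"\b(?:19|20)\d{2}\b")


def citable_signals(text: str) -> Dict[str, int]:
    """Count numeric tokens and year mentions in a block of text.

    Years are also counted as numbers, so "numbers" is always at
    least as large as "years".
    """
    return {
        "numbers": len(NUMBER_RE.findall(text)),
        "years": len(YEAR_RE.findall(text)),
    }
```

A commercial page that returns zero for both counts is a clear candidate for adding dated, sourced statistics. Non-zero counts still need a manual check that each figure is attributed.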
5. Fan-Out Coverage Without Sub-Topic Sprawl
ChatGPT decomposes complex queries into multiple sub-queries before retrieving. AirOps research published in March 2026 found that 95% of fan-out sub-queries have zero monthly search volume by traditional metrics. Conventional keyword research therefore cannot see them, and misses approximately one-third of citation opportunities as a result. But Growth Memo’s April 2026 follow-up found that pages covering 26 to 50% of ChatGPT’s fan-out sub-queries get cited more than pages covering 100%. The balance: address the fan-out, but do not chase universality. Identify the specific sub-queries ChatGPT generates for your target queries (run the queries manually, document the follow-up searches), then ensure your page covers the commercially relevant ones substantively rather than skimming all of them.
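Once you have documented the sub-queries for a target query, you can estimate how much of the fan-out a page touches. The sketch below makes a deliberately simple assumption: a sub-query counts as covered when every one of its words appears somewhere on the page. That is a hypothetical proxy, not how the cited research measured coverage, and it says nothing about whether the coverage is substantive. The 0.26 to 0.50 band mirrors the range quoted above.

```python
import re
from typing import List, Set


def _words(text: str) -> Set[str]:
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def fanout_coverage(page_text: str, sub_queries: List[str]) -> float:
    """Share of sub-queries whose words all appear in the page text.

    Word overlap is an illustrative proxy for coverage, not a
    confirmed measure. Returns 0.0 for an empty sub-query list.
    """
    if not sub_queries:
        return 0.0
    page_words = _words(page_text)
    covered = sum(1 for query in sub_queries if _words(query) <= page_words)
    return covered / len(sub_queries)


def in_target_band(coverage: float, low: float = 0.26,
                   high: float = 0.50) -> bool:
    """True if coverage falls inside the 26 to 50% range quoted in the text."""
    return low <= coverage <= high
```

Use the result to prompt a manual review. A page well above the band may be skimming every sub-topic, and a page below it may be missing commercially relevant sub-queries.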
How to Run This Audit
Take your top 10 commercial pages. For each one, run the target query in ChatGPT with web search enabled, document the citations and fan-out sub-queries, then check the five factors above. Most pages will fail on two or three of the five. The fixes are within engineering reach for any team that controls its CMS: this is not a structural rewrite but a series of specific edits to existing high-value pages. The audit work itself takes a focused day. The implementation work that follows depends on how much your pages already do right, but the typical pattern is two to three weeks for a small portfolio. Measurable citation movement is typically visible within 30 to 60 days of the implementation completing, on the assumption that your Route 1 entity authority is already adequate to support the Route 2 retrieval quality work.
ChatGPT vs Perplexity vs Google AI Overviews: Where to Focus First
Businesses starting their AI citation strategy face a practical prioritisation question: which platform do you invest in first? The honest answer depends on your business and audience, but the following framework applies to most B2B and professional services businesses.
Start with Perplexity for measurement and learning. Its transparency (Steps tab, numbered citations, attributable referral traffic) makes it the best platform for understanding what works and what doesn’t. Build your measurement discipline and iterate on content quality using Perplexity as your feedback mechanism.
Invest in ChatGPT for scale. With 300 million+ weekly active users, ChatGPT is the highest-volume AI discovery surface. The authority signals and content quality that drive Perplexity citations carry over to ChatGPT — but ChatGPT also rewards the entity authority building (Wikidata, cross-platform brand consistency, third-party mentions) that Perplexity is less sensitive to. Add this brand-level authority work as your second strategic layer.
Layer in Google AI Overviews through traditional SEO. AI Overviews source primarily from pages that already rank organically on Google, making strong organic rankings a prerequisite. If your traditional SEO foundations are solid, AI Overviews performance follows. If not, fix your organic rankings first — see our Technical SEO and Content SEO service pages for the foundation work, and our Gemini SEO guide for the platform-specific signals that determine citation versus anonymous content extraction.
The good news is that the shared foundations — content depth, entity authority, structured data, freshness — serve all three platforms simultaneously. You are not running three separate strategies; you are building one authoritative content and entity presence that platforms evaluate differently. The LLM Optimisation framework we use with clients integrates these platform-specific considerations into a single, coherent strategy rather than treating each platform as a separate workstream.