
CITATE: The Framework for AI-Citable Content

CITATE defines the threshold at which content becomes extractable, evidenced, and attributable enough for AI systems to cite. Six criteria across three paired layers — Structure (C1–C2), Evidence (C3–C4), Identity (C5–C6) — enforced in production across 30+ pages since March 2026. This is the canonical definition.

Updated April 2026

CITATE is a content citation framework that defines the threshold at which web pages become extractable, evidenced, and attributable enough for AI systems to cite with confidence. Developed by Sean Mullins, SEO Strategy Ltd, March 2026. Six criteria across three paired layers: Structure (C1–C2), Evidence (C3–C4), Identity (C5–C6). In production across 30+ pages since first deployment.

+14% aggregate citation lift from a declarative opening sentence — the only writing signal with universal positive effect across all seven verticals. DATE and NUMBER are the only entity types with universal positive citation lift in the same dataset. Independently corroborates CITATE C1 (declarative form), C3 (numbered statistic), and C4 (named source). Kevin Indig, Growth Memo, March 2026, 98,000 ChatGPT citation rows, 7 verticals
92.1% of AI citations in consumer electronics came from third-party authoritative sources — independent coverage, journalism, earned mentions — not from the brand's own content. The pattern held across 13 industries, with automotive at 81.9%. CITATE addresses the retrieval layer; this data describes the selection layer that precedes it. University of Toronto AI citation study, September 2025, 13 industries
82% of all AI citations come from earned media, with non-paid sources accounting for 94% — confirmed across over one million AI response links from ChatGPT, Claude, Gemini and Perplexity between July and December 2025. Corroborates the University of Toronto finding from a separate methodology and dataset. Muck Rack Generative Pulse, December 2025, 1M+ links
30–40% improvement in AI citation rates from structured content interventions consistent with CITATE criteria — the largest effect size measured in the GEO-Bench evaluation of generative engine optimisation techniques. Princeton, GaTech, IIT Delhi GEO-Bench, 2024
14.2% conversion rate for traffic arriving via AI citation, compared to 2.8% for standard organic search — the commercial consequence of reaching the CITATE threshold and being named as a recommended source. Seer Interactive, 12M visits, 2025

CITATE defines the threshold at which content becomes citable. It addresses one of five points where AI visibility breaks — content extractability — and it is the only point that can be fixed entirely through how a page is written. The framework maps to three paired layers — Structure, Evidence, Identity — and six specific criteria, C1 through C6.

CITATE was developed by Sean Mullins, Founder of SEO Strategy Ltd, in March 2026. It is filed as a UK trademark (Application UK00004359244, Classes 35 and 41, filed 22 March 2026). The associated domain, citate.uk, is registered and redirects to this page. CITATE is in production across 30+ pages on seostrategy.co.uk and referenced as the Layer 3 threshold in the AI Discovery Stack.

The commercial consequence of failing CITATE is direct: content that does not pass the threshold is retrieved by AI systems but not cited. It contributes to AI answers without the business being named. Understanding which criterion is failing — and it is usually C3, C4, or C5 — tells you exactly what to rewrite. Passing CITATE is necessary but not sufficient for named recommendation; it is the prerequisite for the corroboration and trust work that follows at Layers 4 and 5. For the implementation audit that applies these criteria page by page, see the AI Citation Checklist. For the consultancy that designs and builds CITATE-compliant content architecture, see LLM Optimisation services.

The framework is not theoretical. It has been in production since March 2026 across more than thirty pages on seostrategy.co.uk, enforced through a custom WordPress template that scores each page 1–6 in the admin bar and blocks publication when a page's meta fields would fail the criteria. Every page on this site that uses the page-ai-citable.php template is a live implementation of CITATE. The reference implementation is the GEO agency page, which moved from 4/6 to 6/6 when the framework was applied and has remained there.

This page is the canonical definition. It explains what CITATE means, what each criterion requires, where the framework applies, and where it does not.

In February 2023, Google’s Search Quality team published a definitive statement on AI-generated content: “Our focus on the quality of content, rather than how content is produced, is a useful guide that has helped us deliver reliable, high quality results to users for years.” The question that guidance leaves open is: what does quality look like when a machine evaluates it rather than a human? CITATE is the answer to that question. The six criteria are not new editorial standards invented for AI search — they operationalise what quality has always meant for retrieval systems. Can the content be extracted as a self-contained fragment (C1)? Does it define its terms so they can be understood without surrounding context (C2)? Does it include a specific, verifiable data point (C3) attributed to a named source (C4)? Does it declare the entity making the claim (C5) and commit to a position that can be attributed (C6)? Google has always rewarded content that meets these conditions. CITATE makes the threshold explicit, measurable, and enforceable at the infrastructure level.

Who CITATE is for

CITATE assumes the technical and authority foundations are already in place. It is the right framework for organisations that are already performing in traditional search — clean technical infrastructure, established domain authority, content that ranks and covers the topic — where the remaining gap is AI citation and named recommendation, not discovery or authority building.

For organisations still resolving crawl errors, building topical depth, or establishing link equity, the structural work must come first. AI optimisation applied to a broken foundation amplifies nothing. The independent SEJ/DAC framework for prioritising SEO vs AI search reaches the same conclusion: AI citation optimisation is the right lever for approximately 20% of brands — those already winning in traditional search. CITATE is the implementation standard for that 20%.

If you are not yet in that position, the correct sequence is: resolve the technical foundation → build topical authority → cross the AI Visibility Ceiling → then apply CITATE at the page level to ensure the content that already ranks also becomes extractable and attributable. The AI Visibility Action Plan diagnoses which stage applies to your site.

The threshold, not the checklist

CITATE is a threshold concept, not a checklist concept. The distinction matters.

A checklist is a list of things you should do. CITATE describes a state a page either reaches or does not. Below the threshold, AI systems may retrieve your content but will not cite it confidently. Above it, they extract from it, attribute it, and reference it by name. The difference is not marginal. Seer Interactive analysed twelve million visits and found AI-cited traffic converting at 14.2% versus 2.8% for standard organic — that gap exists because a visitor who arrives via AI citation has already received a recommendation, not just a result.

The distinction is clearest when contrasted with volume-driven content pipelines. An automated SEO pipeline can drive 59,000 daily impressions at a 2.2% click-through rate, sitting at average position 8–10. That is Stage 1 optimisation: retrieval. It reaches the pool. What it does not produce is named AI citation on commercial-intent queries — because retrieval and citation are different outcomes, determined by different signals. A page generating 59,000 impressions at 2.2% CTR from mid-page position is not the same outcome as a page cited in an AI Overview, converting at 14.2%. CITATE is built for the second outcome. Volume without citation is awareness without revenue.

Content that does not reach CITATE may be retrieved by AI systems, but it will not be cited with attribution. This is the line most businesses never cross — not because their content is poor, but because it was written for humans reading linearly, not for systems extracting fragments out of sequence.

The word CITATE was chosen deliberately. It derives from the Latin citare, the root of “to cite” — the act of formally referencing a source with attribution. A page that reaches CITATE is citable in the precise sense: an AI system can extract from it, attribute the extraction to a named source, and reuse it in a generated response without risk of misrepresentation.

The three layers

The six criteria map to three paired layers. Each layer addresses a different question AI systems ask when evaluating content for citation.

Layer | Criteria | The question it answers
Structure | C1 + C2 | Can I extract and understand this?
Evidence | C3 + C4 | Can I trust this?
Identity | C5 + C6 | Can I attribute and recommend this?

A page that passes only one or two layers will not reach CITATE. The layers are not independent contributions — they are sequential dependencies. Structure without Evidence produces an extractable but untrustworthy answer. Evidence without Identity produces a trustworthy but unattributable claim. All three must be present.

Layer 1: Structure (C1–C2)

C1 — Standalone opening answer

The first 40–60 words of every section intended for AI extraction must be readable as a complete, standalone answer to the question that section addresses. Not an introduction. Not a signpost. An answer.

AI systems extract from the beginning of text blocks, not from the middle of them. A page that opens with “In this article, we will explore…” has failed C1 regardless of how good the content that follows is. The opening is what gets pulled. If it requires context from elsewhere on the page to make sense, it will not be extracted.

Failing: “AI visibility is a complex topic that many businesses are beginning to take seriously as search evolves.”

Passing: “AI visibility is the extent to which an AI system can find, understand, and confidently recommend a business. It is determined by five factors: entity recognition, content retrieval, content extractability, third-party trust, and recommendation eligibility.”

C1 sub-criterion — Declarative opening: The opening sentence must follow the form [X] is [Y] or [X] does [Z]. Remove all qualifiers, questions, and preamble from the opening sentence. If the opening contains “may,” “might,” “could,” “this guide,” “in this article,” or begins with “When” or “If,” it fails C1. Kevin Indig’s March 2026 analysis of 98,000 ChatGPT citation rows across seven verticals identifies the declarative opening as the only writing signal with universal positive citation lift — a +14% aggregate effect across all verticals tested. This is the highest-value single edit available at the sentence level, now confirmed independently by two large-scale datasets: Princeton/GaTech/IIT Delhi GEO-Bench (structured content interventions, +30–40% citation lift) and Indig’s 1.2M response dataset.
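The declarative-opening sub-criterion is mechanical enough to smoke-test automatically. Below is a minimal heuristic sketch in Python, using the failure markers listed above and treating the first sentence of a section as the opening; the function name and word lists are illustrative, not part of the framework:

```python
import re

# Failure markers for the C1 declarative-opening sub-criterion,
# taken directly from the criterion text above.
HEDGE_WORDS = {"may", "might", "could"}
PREAMBLE_PHRASES = ("this guide", "in this article")
FORBIDDEN_FIRST_WORDS = {"when", "if"}

def passes_c1_declarative(section_text: str) -> bool:
    """Heuristic: does the opening sentence commit to a declarative form?"""
    first_sentence = re.split(r"(?<=[.!?])\s+", section_text.strip(), maxsplit=1)[0]
    lowered = first_sentence.lower()
    words = re.findall(r"[a-z']+", lowered)
    if not words or words[0] in FORBIDDEN_FIRST_WORDS:
        return False
    if any(phrase in lowered for phrase in PREAMBLE_PHRASES):
        return False
    return not (set(words) & HEDGE_WORDS)

print(passes_c1_declarative("In this article, we will explore AI visibility."))  # False
print(passes_c1_declarative("AI visibility is the extent to which an AI system "
                            "can find, understand, and recommend a business."))  # True
```

A check like this catches the mechanical failures; whether the opening is actually a complete answer still requires the cover test described under C1.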

C2 — Explicit definition

Named concepts, frameworks, and technical terms must be explicitly defined on the page, not assumed. The definition must be a complete sentence of the form “X is Y” — not a description, not a parenthetical, not a link to another page.

AI systems build understanding from definitions. A page that uses a term without defining it forces the AI system to import a definition from elsewhere, which may not match your usage. When your definition is on the page, the AI system uses your version — and attributes it to you.

Failing: “Entity corroboration, which is central to AI recommendation eligibility, requires consistent third-party signals.”

Passing: “Entity corroboration is the accumulation of consistent, independent, third-party evidence about a business entity that increases AI systems’ confidence in naming it as a recommended provider.”

Layer 2: Evidence (C3–C4)

C3 — Statistic with context

At least one specific, quantified claim must appear on the page with enough surrounding context that the number is meaningful in isolation. The stat must be extractable without requiring the reader to know what came before it.

Numbers without context are not evidence. “Conversion rates improve significantly with AI citation” is weaker than “Seer Interactive found AI-cited traffic converting at 14.2% versus 2.8% for standard organic search, across twelve million visits analysed in 2025.” The second version can be extracted and reused. The first cannot be attributed to anything specific.

Failing: “Studies show significant improvements in conversion rates when businesses appear in AI-generated answers.”

Passing: “AI-cited traffic converts at 14.2% compared to 2.8% for standard organic search — a finding from Seer Interactive’s analysis of twelve million visits in 2025. The difference reflects the intent of a visitor who arrives having already received a recommendation.”

C4 — Named source

Every statistic must be attributed to a named, verifiable source. The source name must appear on the same page as the statistic, within the same paragraph or immediately adjacent to it. A footnote or a link to a sources page does not satisfy C4 — the source must be inline.

C3 and C4 are a pair. A stat without a source fails C4. A source without a stat fails C3. Both must be present together. AI systems use source attribution to calibrate their confidence in a claim — an unsourced number is treated as opinion, not evidence. A sourced number is treated as a fact that can be cited with attribution.

Failing: “According to recent research, 48% of Google searches now trigger an AI Overview.”

Passing: “48% of monitored Google searches now trigger an AI Overview, according to Ahrefs tracking data published in 2026.”

External validation — C3 and C4: Kevin Indig’s March 2026 analysis of 98,000 ChatGPT citation rows across seven verticals identifies DATE and NUMBER as the only entity types with universal positive citation lift — independent of vertical-specific factors. Pages that include a specific date and at least one number in the opening 200 words show consistent citation improvement across all seven verticals tested. This independently corroborates C3 and C4: named statistics with sources in the opening content are not a style choice, they are the quantifiable citation mechanism. The Princeton GEO-Bench finding (+30–40% from structured content interventions) and the Indig dataset are now two independent large-scale studies confirming the same mechanism from different methodologies.
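Because C3 and C4 are a pair, they can be smoke-tested together by scanning for a figure, a year, and a capitalised source name in the same sentence. A rough heuristic sketch follows; the regular expressions are illustrative and no substitute for editorial review:

```python
import re

def sentence_passes_c3_c4(sentence: str) -> bool:
    """Rough smoke test: a figure, a year, and a capitalised source name together."""
    has_number = bool(re.search(r"\d+(?:\.\d+)?%|\b\d[\d,]*\b", sentence))
    has_year = bool(re.search(r"\b(?:19|20)\d{2}\b", sentence))
    # Two adjacent capitalised words, e.g. "Seer Interactive" or "Muck Rack".
    has_source = bool(re.search(r"\b[A-Z][a-z]+\s+[A-Z][a-z]+", sentence))
    return has_number and has_year and has_source

passing = ("AI-cited traffic converts at 14.2% compared to 2.8% for standard "
           "organic search, a finding from Seer Interactive's analysis of "
           "twelve million visits in 2025.")
failing = "Studies show significant improvements in conversion rates."
print(sentence_passes_c3_c4(passing), sentence_passes_c3_c4(failing))  # True False
```

The capitalised-pair pattern will miss single-word sources and all-caps names, so treat a failure from this check as a prompt to look, not a verdict.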

C3 and recency — statistics decay. AI systems retrieve at query time, not crawl time. A statistic published in 2023 without a refresh carries lower citation confidence than an equivalent statistic from 2025 or 2026. This is not a soft preference — recency is one of the DATE entity signals Indig’s dataset shows having universal positive citation lift. The practical consequence: a page that passed C3 on publication will drift below threshold as its statistics age. Any C3 statistic more than 18 months old should be verified as still current or replaced with a more recent equivalent. The year of the source is part of the context requirement — not optional metadata. “Conversion rates improve significantly” fails C3. “AI-cited traffic converts at 14.2% versus 2.8% for standard organic, Seer Interactive, 2025” passes. The same claim dated 2022 would pass C3 structurally but would carry reduced citation weight due to age. On fast-moving topics — AI search behaviour, platform adoption rates, market share data — 12 months is the practical freshness window.

Layer 3: Identity (C5–C6)

C5 — Named entity

The page must declare, clearly and explicitly, who produced the content. This is not a byline requirement — it is a citation target requirement. AI systems need a named entity to cite. A page written in the first person plural without a named author or organisation cannot be cited by name. It can only be referenced as an unnamed source.

For individual practitioners and consultancies, this means naming the person. For organisations, it means naming the organisation. Ideally both. The name must appear in the content itself, not only in a footer or meta field.

Named entities are the prerequisite for recommendation, not just citation. Without a named entity, an AI system can extract content — but it cannot confidently recommend a provider. C5 is what separates being cited anonymously from being named specifically.

Failing: “Our team has been working in this space since 2005 and we’ve seen these patterns consistently.”

Passing: “Sean Mullins, founder of SEO Strategy Ltd, has been observing this pattern across B2B clients since 2020. The pattern is consistent: businesses that fail to appear in AI recommendations have typically never built the trust infrastructure, not the content.”

C6 — Attributable claim

The page must contain at least one specific, quotable statement — a claim that is precise enough to stand alone as an attributed quote in a third-party article or an AI-generated response. A claim is not an observation. It is a specific, defensible position that can be attributed to a named source.

C6 is the criterion most pages fail while thinking they have passed. “AI is changing search” is not an attributable claim — it is a platitude. “Most businesses optimising for AI citation are failing at Layer 4 — the trust and identity layer — not at the content layer” is an attributable claim. It is specific, it can be right or wrong, and it can be quoted.

Failing: “It’s important for businesses to think carefully about how AI systems understand and represent them.”

Passing: “Most businesses never cross the AI Visibility Ceiling — the threshold between being topically visible and being named as a recommended provider — because they optimise Stages 1 through 3 of the AI Provider Selection Pipeline while leaving Stages 4 and 5 entirely unaddressed.”

Where CITATE applies and where it does not

CITATE was built for retrieval-grounded AI citation — the mechanism by which AI systems like Google AI Overviews, Perplexity, and Microsoft Copilot select and extract content from indexed web pages in real time. For these systems, CITATE criteria are directly applicable and the 30–40% citation improvement figure from the Princeton/GaTech/IIT Delhi GEO-Bench research maps to structured content interventions consistent with the CITATE framework.

CITATE is less directly applicable to training data citation — the way AI systems incorporate knowledge from their training corpora. Training data selection operates on different signals over different timescales, and the criteria for influencing it are less well-understood and less controllable. CITATE will improve your content’s quality in ways that are likely to correlate with training data inclusion, but this is an indirect and unverifiable effect.

CITATE is also necessary but not sufficient for named AI recommendation. A page that reaches 6/6 CITATE will be extractable and attributable. Whether the business behind it gets named as a recommended provider depends on entity corroboration — the trust infrastructure built outside the page itself. Content reaching CITATE is a prerequisite for recommendation. It is not a guarantee of it. CITATE measures citation readiness, not behavioural validation. Whether users actively search for, click on, and engage with a business across surfaces is a separate layer that operates outside the framework’s scope. The full model for recommendation eligibility is at How AI Systems Decide Which Companies to Recommend.

CITATE and the selection layer

Understanding where CITATE sits in the full dependency model sharpens both its value and its limits. Two independent datasets now quantify what was previously an inferential argument.

In September 2025, researchers at the University of Toronto examined AI citation behaviour across 13 industries. In consumer electronics, AI cited third-party authoritative sources — independent coverage, journalism, earned mentions — 92.1% of the time. In automotive: 81.9%. The pattern held across sectors. Muck Rack’s Generative Pulse team, analysing over one million links from AI responses generated by ChatGPT, Claude, Gemini and Perplexity between July and December 2025, confirmed the direction: 82% of all AI citations come from earned media. Non-paid sources account for 94%.

The implication for CITATE is precise. CITATE addresses the retrieval layer — whether AI can extract a clean, attributable fragment from your page once it has decided to consult you. The data above describes the selection layer — whether AI decides to consult you at all. Selection is determined by what independent sources say about your entity, not by how your page is structured. These are categorically different problems operating at different floors of the same dependency model.

CITATE solves Floor 2. Floor 3 is where the 92% lives. You need both, in that order. A business that reaches 6/6 CITATE on every commercial page and has no Floor 3 trust infrastructure will produce well-structured content that AI retrieves from pages it rarely consults. Clean fragments, extracted infrequently. CITATE is the prerequisite for the corroboration work — not a substitute for it. The full floor model is documented at MCP vs WebMCP: And Why Neither Matters If Your Building Has No Floors.

CITATE and the agentic layer

Google’s introduction of the Google-Agent user agent — which identifies AI agents acting on behalf of users to browse pages, evaluate content, and complete tasks such as submitting forms — confirms that agentic web interaction has moved from a theoretical concern to deployed infrastructure. The agent visiting your site to evaluate whether your business is a credible candidate is now an identifiable, named system, visible in server logs.

A CITATE-compliant page is structurally better suited to this kind of machine evaluation than a page written only for human readers. The explicit definitions (C2), standalone answers (C1), named entities (C5), and attributable claims (C6) that CITATE requires are precisely the signals an AI agent uses to evaluate whether content is credible enough to act upon on a user’s behalf. The agentic evaluation argument is no longer speculative — it is observable in your own server logs. For the broader strategic implications, see the Agentic SEO guide and the WebMCP readiness guide.
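Because the agent is visible in server logs, detecting agentic visits is a simple filter over access-log lines. A minimal sketch for combined log format; the exact user-agent string in the sample is an assumption, so check Google's published crawler documentation for the real token:

```python
def agentic_hits(log_lines, token="Google-Agent"):
    """Yield access-log lines whose user-agent field contains the agent token."""
    for line in log_lines:
        parts = line.split('"')
        # Combined log format: ... "request" status bytes "referer" "user-agent"
        user_agent = parts[-2] if len(parts) >= 2 else ""
        if token in user_agent:
            yield line

sample = [
    '1.2.3.4 - - [01/Apr/2026:10:00:00 +0000] "GET /citate/ HTTP/1.1" 200 5120 '
    '"-" "Mozilla/5.0 (compatible; Google-Agent)"',
    '5.6.7.8 - - [01/Apr/2026:10:01:00 +0000] "GET / HTTP/1.1" 200 2048 '
    '"-" "Mozilla/5.0"',
]
print(len(list(agentic_hits(sample))))  # 1
```

Counting these hits over time is the cheapest way to confirm that agentic evaluation of your pages is actually happening, rather than assuming it.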

The production implementation

CITATE is not a theoretical framework. It is a live production system. The page-ai-citable.php template in the SEO Strategy Ltd theme enforces all six criteria through structured meta fields that must be completed before a page is considered ready for deployment. The citation score appears in the WordPress admin bar on every page using the template — a live 1–6 score showing which criteria pass and which do not.

More than thirty pages on seostrategy.co.uk have been built or rebuilt to reach 6/6 CITATE. Every scaffold entry using the template includes a machine-readable aic_data block that populates all six criteria automatically on deployment. The framework is enforced at the infrastructure level, not left to editorial discretion.

The implementation is documented as a working methodology because the gap between having a framework and publishing it is where most intellectual work disappears. Publishing the criteria here, with the DefinedTerm schema below, means AI systems can attribute the framework by name. Every page built to CITATE standard is also evidence for the framework that built it.
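As an illustration of what a DefinedTerm block for CITATE can look like, the sketch below builds schema.org DefinedTerm markup from this page's own definition. The property selection is an assumption for illustration; the markup on the live page may differ:

```python
import json

# Illustrative schema.org DefinedTerm markup for the CITATE definition.
# Property choices follow schema.org/DefinedTerm; values are from this page.
defined_term = {
    "@context": "https://schema.org",
    "@type": "DefinedTerm",
    "name": "CITATE",
    "description": (
        "The threshold at which a web page becomes extractable, evidenced, "
        "and attributable enough for AI systems to cite with confidence."
    ),
    "url": "https://citate.uk",
}

print(json.dumps(defined_term, indent=2))
```

Emitted inside a script tag of type application/ld+json, a block like this gives AI systems a machine-readable definition to attribute to the named entity.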

Full implementation guidance — the page-by-page structure, section anatomy, and common failure modes — is at Anatomy of an AI-Citable Page. The citation checklist for auditing existing pages against CITATE criteria is at AI Citation Checklist.

Key Definitions

CITATE
The threshold at which a web page becomes extractable, evidenced, and attributable enough for AI systems to cite with confidence. Named after the Latin citare, the root of "to cite." Comprises six criteria across three layers: Structure (C1–C2), Evidence (C3–C4), Identity (C5–C6). Developed by Sean Mullins, SEO Strategy Ltd, March 2026.
Citation threshold
The point at which AI systems will extract from a page and attribute the extraction to a named source. Below the threshold, content may be retrieved and paraphrased without attribution. Above it, content is cited specifically — with source name, context, and the ability to verify the original claim.
Attributable claim (C6)
A specific, defensible statement on a page that is precise enough to be quoted in isolation and attributed to a named source. Distinguished from an observation (which is too vague to attribute) and from a statistic (which is covered by C3–C4). The attributable claim is the quotable position the named entity is willing to be held to.

How to Apply CITATE to a Page

  1. Write a standalone opening (C1)

    Rewrite the opening paragraph of each key section so it reads as a complete answer to the question the section addresses. Test it by covering the rest of the page and asking whether the opening makes sense in isolation. If it requires context from elsewhere, rewrite it until it does not.

  2. Add explicit definitions (C2)

    Identify every named concept, framework, or technical term on the page. For each one, write a definition in the form "X is Y" and place it either in the opening section or immediately following the first use of the term. Do not link to an external definition — the definition must be on the page.

  3. Add a statistic with full context (C3)

    Identify at least one quantified claim that supports the page's core argument. Write it so the number, what it measures, and the scale or context are all in the same sentence. The stat must be extractable without requiring surrounding paragraphs to make sense.

  4. Attribute the statistic to a named source (C4)

    Add the source name, study or publication name, and year inline — in the same sentence or immediately following the statistic. Do not use footnotes or links alone. The source must be readable in the same extraction as the number.

  5. Name the entity responsible for the page (C5)

    Ensure the name of the person or organisation responsible for the content appears explicitly in the body text, not only in a footer or byline. For individual practitioners, use full name and role. For organisations, use the organisation name. Both is better.

  6. Write one attributable claim (C6)

    Identify the most specific, defensible position the page takes and write it as a single sentence that could be quoted in isolation. Test it by asking: could this sentence appear in a third-party article as a quote attributed to the named entity on this page? If not, make it more specific until it can.
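The six steps reduce to six booleans and a score out of six, which is all an admin-bar indicator needs. A minimal scoring sketch follows; the criterion labels follow this page, and how each boolean is determined (editor judgement or heuristic) is left open:

```python
CRITERIA = (
    "C1 standalone opening",
    "C2 explicit definition",
    "C3 statistic with context",
    "C4 named source",
    "C5 named entity",
    "C6 attributable claim",
)

def citate_score(checks: dict[str, bool]) -> tuple[int, list[str]]:
    """Return the score out of six and the list of failing criteria."""
    failing = [c for c in CRITERIA if not checks.get(c, False)]
    return len(CRITERIA) - len(failing), failing

score, failing = citate_score({
    "C1 standalone opening": True,
    "C2 explicit definition": True,
    "C3 statistic with context": True,
    "C4 named source": True,
    "C5 named entity": False,
    "C6 attributable claim": True,
})
print(f"{score}/6, failing: {failing}")  # 5/6, failing: ['C5 named entity']
```

Treating an unrecorded criterion as a failure (the get default) matches the threshold logic: a page is below CITATE until every criterion is affirmatively passed.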

Frequently Asked Questions

What does CITATE stand for?

CITATE is not an acronym — it derives from the Latin citare, the root of "to cite," chosen because it describes precisely what the framework enables. A page that reaches CITATE is citable: an AI system can extract from it, attribute the extraction to a named source, and reuse it in a generated response. The six criteria (C1–C6) are labels for the framework's requirements, not an expansion of the letters.

How is CITATE different from E-E-A-T?

E-E-A-T (Experience, Expertise, Authoritativeness, Trustworthiness) is Google's framework for evaluating the credibility of content and its authors, primarily for ranking purposes. CITATE is specifically designed for AI citation eligibility — the conditions under which AI systems extract and reuse content in generated responses. The two overlap (both value named authorship and evidenced claims) but CITATE is operationally specific: it defines exactly what must be present in the content itself, enforced through structured meta fields, not evaluated subjectively by a human reviewer.

Does a 6/6 CITATE score guarantee AI citation?

No. CITATE is necessary but not sufficient for named AI recommendation. A page that reaches 6/6 will be extractable and attributable — the content-layer conditions are satisfied. Whether the business gets named as a recommended provider depends on entity corroboration: the trust infrastructure built outside the page itself, across review platforms, structured databases, editorial coverage, and Wikidata. CITATE addresses Stage 3 of the AI Provider Selection Pipeline (content extraction). Named recommendation is determined at Stages 4 and 5 (trust and recommendation layers).

Does CITATE apply to training data citation as well as retrieval-grounded citation?

CITATE was designed for retrieval-grounded AI citation — the mechanism used by Google AI Overviews, Perplexity, Microsoft Copilot, and similar systems that retrieve from an index in real time. For training data citation (how AI systems incorporate content from their training corpora), the signals are different and less directly controllable. CITATE will likely correlate with training data inclusion quality, but this is an indirect and unverifiable effect. The framework makes no specific claims about training data.

What is the reference implementation of CITATE?

The reference implementation is the GEO agency page at /llm-optimisation/geo-agency/ — the first page rebuilt specifically to reach 6/6 under the CITATE criteria. It moved from 4/6 to 6/6 when the framework was applied and has maintained that score. The criteria were subsequently deployed across 30+ pages using the page-ai-citable.php WordPress template, which enforces all six criteria through structured meta fields and displays a live citation score in the admin bar.

Can CITATE be applied without the SEO Strategy Ltd theme?

Yes. The CITATE criteria describe content requirements, not technical implementations. Any page — on any platform — can be audited against the six criteria manually. The theme implementation automates enforcement and scoring, but the framework itself is platform-agnostic. The AI citation checklist at /llm-optimisation/ai-citation-checklist/ provides a manual audit tool that applies the criteria without requiring any specific technology.

Sean Mullins

Founder of SEO Strategy Ltd with 20+ years in SEO, web development and digital marketing. Specialising in healthcare IT, legal services and SaaS — from technical audits to AI-assisted development.

Ready to improve your search visibility?

Book a free 30-minute consultation and let's discuss your SEO strategy.

Get in Touch