Voice search is declining. AI voice delivery is rising. Here’s what that means for structured data, content architecture, and the businesses that want to be cited when AI speaks.
Google built Speakable schema for a world where voice assistants read web content aloud. That world is arriving — just not in the way anyone expected.
Search interest in voice search optimisation is collapsing. In the UK, searches for “voice search SEO” dropped 81% year-on-year. “Voice search optimisation” fell by the same margin. “Voice search schema” fell 100%. The command-based “Hey Google, find me a plumber” era peaked and is now in structural decline.
But AI voice delivery is surging. ChatGPT’s voice mode has tens of millions of active users. Google AI Overviews, where search interest is up 179% year-on-year in the UK and 234% in the US, increasingly power spoken summaries. Perplexity reads synthesised answers aloud. Every major AI platform is moving toward delivering content audibly to users who never typed a search query.
This is not the same thing as voice search. And that distinction changes everything about how you should think about structured data, content architecture, and what it means to be “optimised” for the AI delivery layer.
The Two Waves of Voice: Command vs. Conversation
Voice search — the first wave — was transactional and command-based. “What’s the weather?” “Call Mum.” “Navigate to Tesco.” Google built Speakable schema for this wave: a structured data type that lets publishers mark specific content sections as suitable for text-to-speech playback.
Speakable was a sensible idea for its time. If a voice assistant needed to read something from your website, you could tell it which bits to read. The cssSelector property points at specific HTML elements — your page title, your opening paragraph, your key headings — and says “these are the parts that work as spoken content.”
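To make that concrete, here is a minimal sketch of the markup, generated as JSON-LD from Python. The page name, URL, and CSS selectors are placeholders for illustration, not values from any real site, and the helper name build_speakable_webpage is hypothetical.

```python
import json


def build_speakable_webpage(name: str, url: str, css_selectors: list[str]) -> dict:
    """Return a WebPage JSON-LD object whose speakable property lists CSS selectors."""
    return {
        "@context": "https://schema.org",
        "@type": "WebPage",
        "name": name,
        "url": url,
        "speakable": {
            "@type": "SpeakableSpecification",
            # Each selector should point at an element that reads well aloud.
            "cssSelector": css_selectors,
        },
    }


# Placeholder values, assumed for illustration only.
page = build_speakable_webpage(
    name="What Is Entity SEO?",
    url="https://www.example.com/entity-seo",
    css_selectors=["h1", ".article-summary"],
)
print(json.dumps(page, indent=2))
```

The printed JSON would typically sit inside a script tag of type application/ld+json in the page head.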
The problem: almost nobody implemented it. It remains in beta, limited to news content in English. Google’s own documentation still lists it as an experimental feature. In the UK, “speakable schema” gets 10 searches per month. In the US, 50 — though that 50 represents +150% year-on-year growth, which tells you something about early adopters paying attention.
The second wave is fundamentally different. AI voice delivery doesn’t read your content aloud — it synthesises new content from your content, then speaks that synthesis. ChatGPT doesn’t visit your page and read paragraph three into a microphone. It ingests your content, extracts the relevant information, generates a new response, and delivers that response conversationally — sometimes with voice, sometimes with text, always as a reformulated summary rather than a direct reading.
This means the entire paradigm has shifted. The question is no longer “which parts of my page sound good when read aloud?” The question is “which parts of my page will an AI system reliably extract, understand, and cite when generating its own answer?”
What Google Was Trying to Solve — and Why It Still Matters
Speakable schema was Google’s first attempt at solving a real problem: how do publishers maintain some influence over how their content is delivered when the delivery mechanism isn’t a blue link on a results page?
That problem hasn’t gone away. It’s become vastly more important. AI Overviews, ChatGPT, Perplexity, and every emerging AI search platform all face the same challenge: they need to select, extract, and reformat content from publishers. The publishers who make that extraction reliable, unambiguous, and high-quality get cited. The ones who don’t get ignored or, worse, misrepresented.
Speakable was a narrow, mechanical solution — point at CSS selectors, mark them as speakable. The broader solution, the one that actually works for the AI summary layer, is structural. It’s about how your content is architected, not which HTML classes you tag.
But here’s the thing: the mechanical solution and the structural solution aren’t in conflict. They’re layers. And the businesses that implement both are the ones building compounding advantage.
The AI Summary Layer: What Actually Controls How AI Delivers Your Content
If you want AI systems to accurately represent your brand, cite your expertise, and recommend your services when they generate spoken or written summaries, you need to control four things.
1. The Opening Declaration
The first 120–150 words of any page are disproportionately influential in AI extraction. This is where large language models form their initial understanding of what the page is about, who wrote it, and what authority it carries.
Most websites waste this space. They open with narrative scene-setting, vague value propositions, or marketing language that sounds good to humans but tells an AI system nothing useful. “In today’s rapidly evolving digital landscape, businesses are increasingly looking for ways to stand out”: that is 15 words of zero-information filler that an AI will skip entirely.
Compare that with a declarative opening: “Entity SEO is the practice of building machine-readable identity for your brand across search engines, knowledge graphs, and AI systems. It works by establishing clear, unambiguous connections between your business, your expertise, and the topics you’re authoritative for.”
The second version gives an AI system exactly what it needs: a definition, a mechanism, and a scope. It can extract that confidently, cite it accurately, and use it as the foundation for a synthesised answer.
This isn’t about keyword stuffing or SEO tricks. It’s about information density in the position where AI systems look first.
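One rough way to audit this is to pull out the opening window of a page and look for filler. The sketch below is a crude heuristic, not a model of how any AI system actually reads a page: the 150-word limit comes from the range above, while the FILLER_PHRASES list and both function names are assumptions made up for illustration.

```python
# Hypothetical filler list; extend it with phrases from your own copy.
FILLER_PHRASES = ("in today's", "rapidly evolving", "digital landscape")


def opening_window(text: str, limit: int = 150) -> str:
    """Return the first `limit` whitespace-separated words of `text`."""
    return " ".join(text.split()[:limit])


def filler_hits(text: str, limit: int = 150) -> list[str]:
    """List the filler phrases that appear inside the opening window."""
    # Normalise curly apostrophes so both spellings of "today's" match.
    window = opening_window(text, limit).lower().replace("’", "'")
    return [phrase for phrase in FILLER_PHRASES if phrase in window]


vague = ("In today’s rapidly evolving digital landscape, businesses are "
         "increasingly looking for ways to stand out")
declarative = ("Entity SEO is the practice of building machine-readable "
               "identity for your brand across search engines, knowledge "
               "graphs, and AI systems.")

print(filler_hits(vague))
print(filler_hits(declarative))
```

An empty result does not prove an opening is good. It only shows that the obvious filler is gone, so the declarative test (definition, mechanism, scope) still needs a human read.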
2. Entity Clarity
AI systems don’t rank pages. They evaluate entities — people, organisations, concepts, products — and assess whether those entities are authoritative for a given query.
Your structured data is the machine-readable API for your entity identity. Organisation schema (the schema.org type itself is spelled Organization) tells AI systems what you are, where you operate, what you’re known for, and how to verify your identity across platforms (via sameAs links to LinkedIn, Wikidata, industry directories). Person schema does the same for individual experts: their credentials, their expertise areas, their published work.
The combination of Organisation schema with knowsAbout properties, Person schema with credentials and sameAs links, and consistent entity references across your content creates what we call entity clarity. An AI system encountering your content can immediately verify: “This is [Organisation], founded by [Person], authoritative for [Topics], verified across [Platforms].”
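As a sketch of that combination, the Python below builds an Organization node and a Person node in one JSON-LD graph and links them through the founder property. Every name, URL, and identifier here is a placeholder invented for illustration, including the Wikidata and LinkedIn addresses.

```python
import json


def build_entity_graph() -> dict:
    """Return a JSON-LD graph linking an Organization to its founder."""
    person = {
        "@type": "Person",
        "@id": "https://www.example.com/#founder",
        "name": "Jane Example",  # placeholder person
        "jobTitle": "Founder",
        "knowsAbout": ["Entity SEO", "Structured data"],
        "sameAs": ["https://www.linkedin.com/in/jane-example"],
    }
    organization = {
        "@type": "Organization",  # schema.org uses the US spelling
        "@id": "https://www.example.com/#organization",
        "name": "Example Consulting",  # placeholder organisation
        "url": "https://www.example.com/",
        "knowsAbout": ["Entity SEO", "AI visibility"],
        # Reference the Person node by @id rather than repeating it.
        "founder": {"@id": person["@id"]},
        "sameAs": [
            "https://www.linkedin.com/company/example-consulting",
            "https://www.wikidata.org/wiki/Q0",  # placeholder identifier
        ],
    }
    return {"@context": "https://schema.org", "@graph": [organization, person]}


graph = build_entity_graph()
print(json.dumps(graph, indent=2))
```

Using @id references keeps one canonical description of each entity, so the same Person node can also be reused as the author of an article.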
Without entity clarity, your content is just another anonymous web page. With it, you’re a known, verified source that AI systems can cite with confidence.
3. Content Architecture for Synthesis
AI systems don’t extract content the way search engines index it. Search engines care about keywords, links, and page authority. AI systems care about whether they can decompose your content into discrete, reliable facts that can be reassembled into a new response.
This is what query fan-out looks like in practice. When a user asks an AI system a complex question, the system breaks it into sub-questions, retrieves relevant content for each sub-question, extracts specific claims from that content, evaluates the reliability of those claims, and synthesises a response. Your content needs to survive every step of that process.
That means clear heading hierarchies that map to specific questions. Standalone summary paragraphs that can be extracted without losing meaning. Structured definitions that an AI system can quote or paraphrase with confidence. And — critically — no ambiguity about what claims you’re making and what evidence supports them.
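The heading part of that list is easy to check mechanically. Here is a minimal sketch, using only Python's standard library, that collects the headings in a page and flags any that skip a level. The class and function names are hypothetical, and the check says nothing about whether a heading maps to a real question; that part still needs editorial judgement.

```python
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Collect (level, text) pairs for every h1 to h6 element."""

    def __init__(self):
        super().__init__()
        self.headings = []
        self._level = None
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self._level = int(tag[1])
            self._buffer = []

    def handle_data(self, data):
        if self._level is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if self._level is not None and tag == f"h{self._level}":
            self.headings.append((self._level, "".join(self._buffer).strip()))
            self._level = None


def audit_headings(html: str) -> list[str]:
    """Return one message per heading that skips a level in the hierarchy."""
    collector = HeadingCollector()
    collector.feed(html)
    problems = []
    previous = 0
    for level, text in collector.headings:
        if level > previous + 1:
            if previous:
                problems.append(f"h{level} '{text}' skips a level after h{previous}")
            else:
                problems.append(f"h{level} '{text}' appears before any h1")
        previous = level
    return problems


sample = "<h1>Entity SEO</h1><h2>What is entity SEO?</h2><h4>Details</h4>"
print(audit_headings(sample))
```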
4. Structured Data as the Clarity Layer
This is where speakable fits into the bigger picture — not as a voice search optimisation tactic, but as part of the structured data stack that gives AI systems explicit signals about your content.
FAQPage schema provides ready-made question-answer pairs that AI systems can extract with high confidence. HowTo schema breaks procedural content into discrete steps. Article schema with proper author attribution provides provenance — who wrote this, when, and what authority do they carry? Organisation schema provides entity context.
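For example, FAQPage markup is little more than a list of questions with accepted answers. The sketch below builds it from plain question and answer pairs; the helper name build_faq_page is hypothetical and the sample pair reuses the definition from earlier in this article.

```python
import json


def build_faq_page(pairs: list[tuple[str, str]]) -> dict:
    """Return FAQPage JSON-LD for a list of (question, answer) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }


faq = build_faq_page([
    (
        "What is entity SEO?",
        "Entity SEO is the practice of building machine-readable identity "
        "for your brand across search engines, knowledge graphs, and AI systems.",
    ),
])
print(json.dumps(faq, indent=2))
```

The answers in the markup should match the visible text on the page, so generate both from the same source where you can.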
And Speakable schema, despite its limited current deployment, does something none of the others do: it explicitly marks which parts of your content are designed to be delivered as spoken output, for any system that reads the markup. As AI voice delivery expands, and every trend line says it will, that signal becomes increasingly valuable.
The implementation cost is minimal. A few CSS selectors pointing at your headings and opening paragraphs. The potential upside, as voice delivery scales, is that you’ve already told AI systems exactly which parts of your content work as spoken answers.
The Decision Framework: Where Speakable Fits in Your Priority Stack
Not every business needs to implement speakable schema today. Here’s how to think about it.
News publishers should implement it now. Google’s speakable support is currently limited to news content in English, which means news publishers are the only ones who can see direct results today. The summary paragraph control that speakable provides is directly relevant to how Google Assistant selects news content for spoken delivery.
SaaS companies and consultancies should treat it as experimental but low-cost. The structured data itself takes minutes to implement. The real work — ensuring your content has clear, extractable summary paragraphs and entity-grounded authority — benefits your AI visibility regardless of whether speakable specifically drives results.
E-commerce businesses have lower priority for speakable but should focus heavily on product schema clarity, review aggregation, and the entity foundation that helps AI shopping assistants understand and recommend their products.
For everyone: the content architecture work that makes speakable effective — declarative openings, entity clarity, structured definitions, clear heading hierarchies — is the same work that makes your content more visible across AI Overviews, ChatGPT, Perplexity, and every other AI delivery platform. You’re not implementing speakable schema in isolation. You’re building the content infrastructure that the entire AI summary layer depends on.
What Actually Matters in 2026
Let’s be direct about what speakable does and doesn’t do today.
Speakable does not influence AI Overviews. Google does not use speakable selectors when deciding what to include in its AI-generated summaries. AI Overviews use their own content extraction and synthesis pipeline, which is based on content quality, entity authority, and structural clarity — not on which CSS classes you’ve marked as speakable.
Google does not rely on speakable for summarisation. The AI rewriting pipeline generates novel text based on source content. It doesn’t read your speakable-marked paragraphs aloud; it synthesises new responses informed by your content.
AI systems rewrite your content regardless. No amount of structured data will make ChatGPT quote you verbatim. The goal isn’t to control the exact words an AI speaks — it’s to ensure your content is selected as a source, accurately understood, and properly attributed.
Control comes from structure, not tags. The businesses that dominate AI visibility aren’t the ones with the most schema markup. They’re the ones whose content is architecturally designed for extraction: clear entities, declarative content, structured definitions, and unambiguous authority signals.
Speakable is one signal in that architecture. A small one today. Potentially a significant one tomorrow. But the architecture itself — that’s what you should be building now.
The Market Signal
The keyword data tells a clear story. “AI overviews” is at 8,100 monthly searches in the UK (+179% YoY) and 49,500 in the US (+234% YoY). “LLM optimisation” grew 600% year-on-year. “Answer engine optimisation” is up 125%. “Entity SEO” is up 29%.
Meanwhile, “voice search SEO” is down 81%. “Voice search optimisation” is down 81%. “Schema for voice search” shows zero sustained interest.
The market is not asking how to optimise for voice search. The market is asking how to optimise for AI delivery — AI Overviews, LLM responses, answer engines, entity-grounded citation systems.
Speakable sits at the intersection. It’s a structured data type designed for spoken delivery, in a market where spoken delivery is shifting from command-based voice search to AI-synthesised voice output. Its current implementation is limited. Its directional signal is strong.
The businesses that are building entity authority, structuring content for AI extraction, implementing comprehensive schema, and — yes — adding speakable selectors to their most important content sections are the ones positioning for where AI delivery is heading. Not where voice search has been.
Implementation: The Practical Steps
If you’ve read this far and want to act on it, here’s the priority order.
First, fix your content architecture. Audit your top 20 pages. Does each one open with a declarative summary in the first 120–150 words? Can you extract a clear, standalone definition or answer from each major section? Do your headings map to specific questions that an AI system might decompose from a complex query?
Second, build your entity foundation. Ensure your Organisation schema includes knowsAbout, areaServed, founder, and sameAs links to every verifiable profile (LinkedIn, Wikidata, industry directories). Add Person schema for key team members with credentials, expertise areas, and cross-platform verification.
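A quick way to find gaps is to check your existing markup against that list. The sketch below assumes you have already parsed your Organization JSON-LD into a Python dictionary; the function name and the sample node are hypothetical, and the required properties are simply the four named above.

```python
REQUIRED_PROPERTIES = ("knowsAbout", "areaServed", "founder", "sameAs")


def missing_properties(node: dict, required=REQUIRED_PROPERTIES) -> list[str]:
    """List required properties that are absent or empty on a JSON-LD node."""
    return [prop for prop in required if not node.get(prop)]


# Placeholder markup with deliberate gaps, for illustration only.
organization = {
    "@type": "Organization",
    "name": "Example Consulting",
    "knowsAbout": ["Entity SEO"],
    "sameAs": [],  # present but empty, so it counts as missing
}
print(missing_properties(organization))
```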
Third, implement your answer schema stack. FAQPage for pages with question-answer content. HowTo for step-by-step processes. Article with proper author attribution on all content pages. VideoObject for embedded video with transcript where possible.
Fourth, add speakable selectors. On each page, identify the CSS selectors that point to your most important content: the page title, the opening summary paragraph, and the primary heading structure. Add SpeakableSpecification to your WebPage schema with those selectors. Validate the markup with the Schema.org validator and confirm that every selector actually matches an element on the page; selectors that reference classes missing from your HTML can surface as validation errors.
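You can also catch dead selectors before the markup ships. The sketch below is a deliberately simple check using only Python's standard library. It handles bare tag names, single .class selectors, and single #id selectors, and nothing more complex; that limit is an assumption of this example, as are the function and class names.

```python
from html.parser import HTMLParser


class ElementIndex(HTMLParser):
    """Record every tag name, class, and id that appears in a page."""

    def __init__(self):
        super().__init__()
        self.tags, self.classes, self.ids = set(), set(), set()

    def handle_starttag(self, tag, attrs):
        self.tags.add(tag)
        for name, value in attrs:
            if name == "class" and value:
                self.classes.update(value.split())
            elif name == "id" and value:
                self.ids.add(value)


def unmatched_selectors(html: str, selectors: list[str]) -> list[str]:
    """Return the simple selectors that match no element in `html`."""
    index = ElementIndex()
    index.feed(html)
    missing = []
    for selector in selectors:
        if selector.startswith("."):
            found = selector[1:] in index.classes
        elif selector.startswith("#"):
            found = selector[1:] in index.ids
        else:
            found = selector.lower() in index.tags
        if not found:
            missing.append(selector)
    return missing


# Placeholder page and selectors, for illustration only.
page_html = '<h1>Title</h1><p class="article-summary lead">Summary</p>'
print(unmatched_selectors(page_html, ["h1", ".article-summary", ".key-points"]))
```

Anything the function returns is a selector to fix or remove before you publish the SpeakableSpecification.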
The first three steps drive measurable AI visibility results today. The fourth costs almost nothing and positions you for the expansion of AI voice delivery that every trend line is pointing toward.
This article is part of our structured data and schema markup series. For a comprehensive guide to entity authority and AI visibility, see Entity SEO: The Complete Guide.