
Observed Outcomes Register: Documented AI Retrieval, Citation and Feature-Extraction Behaviour

The public log of documented AI retrieval, citation, and feature-extraction behaviour observed on SEO Strategy Ltd client work. Each entry records a dated observation: query, AI surface, retrieval pattern, feature extracted, constraint handling, entity preservation, attribution persistence, confounders, interpretation confidence, and the hypothesis generated. Launched with two entries (May 2026); expanded to six in June 2026 across SEO, legal and B2B SaaS, including a self-entity magnet, a citation-volatility observation substantiated by independent drift research (SISTRIX, Profound), an intent-conditional retrieval split, and a cited-but-not-recommended case. The register’s value compounds with entry count and time span, not with any individual observation.

16 min read 3,246 words Updated Jun 2026

6 documented entries in the register as of June 2026. Launched with two in May 2026 (Olliers Operation Soteria, Daves Taxis Cruise Tour Planner); expanded with four on 24 June 2026 across SEO, legal and B2B SaaS (self-entity magnet, citation volatility, intent-conditional retrieval, cited-but-not-recommended). The register is append-only and updates as observations occur. Source: SEO Strategy Ltd direct observation, March to June 2026.

11 structured fields recorded per entry (date observed, query, AI surface, retrieval pattern, feature extracted, constraint handling observed, entity preservation, attribution persistence, confounders, interpretation confidence, hypothesis generated), ensuring observations are comparable over time and across sectors. Source: SEO Strategy Ltd register methodology, May 2026.

8-12 entries: the threshold at which observations begin carrying evidential weight rather than reading as isolated anecdotes. Six entries is the start of a pattern; the register’s value compounds with entry count and time span, not with any individual observation. Source: SEO Strategy Ltd register design principle, May 2026.

56% weekly source-set churn in Google AI Mode, measured by SISTRIX (82,619 prompts, 17 weeks, six countries, May 2026): a small stable core of cited sources persists while a much larger carousel rotates. Profound, analysing 240 million ChatGPT citations, separately found 40 to 60% of cited domains change month to month for identical queries. This independent research substantiates the register’s Entry 4 volatility observation and underpins its single-snapshot honesty discipline. Sources: SISTRIX (Johannes Beus), AI Citation Drift, May 2026; Profound citation tracking, 2026.

The observed-outcomes register is the public log of documented AI retrieval, citation, and feature-extraction behaviour observed on SEO Strategy Ltd client work. Each entry records a dated observation of an AI system’s actual output for a specific query — the surface, the retrieval pattern, the entity treatment, the operational differentiator extracted, the confounders, and the interpretation confidence. The register exists because the strongest evidence for any framework is not the framework’s internal logic. It is documented behaviour observed in the wild, recorded the same way every time, accumulating over months. The register launched with two entries in May 2026 and was expanded to six in June 2026. It will continue to compound.

Why this register exists

The AI visibility category is flooded with frameworks, opinions, and predictive claims. Almost no one is publishing structured, timestamped, verifiable observations of actual AI system behaviour. That is the gap this register fills.

Each entry is an observation, not a proof. An observation is a single documented data point from a specific AI surface at a specific time, with the confounders named. A proof would require controlled replication across queries, surfaces, and time windows that no one in this discipline currently has. The honest position is that we have observations consistent with hypotheses about how AI retrieval and recommendation systems behave — and that as observations accumulate, those hypotheses either earn evidential weight or get refuted. Either outcome is useful.

The register is read alongside CITATE (the page-level standard the observed pages were built to) and the AI Discovery Stack (the system model the observations sit inside). The register does not prove either framework. It documents specific outputs in specific conditions and lets readers draw their own conclusions.

How entries are structured

Every entry uses the same fields so that observations are comparable over time:

Date observed: when the output was captured
Query: the exact prompt used
AI surface: which platform and which feature (e.g. Google AI Overview, Google AI Mode, ChatGPT, Perplexity)
Retrieval pattern: how the source surfaced (cited link, named card, embedded summary, branded panel)
Feature extracted: the specific operational differentiator the system surfaced, if any, and in whose wording it appeared (the source’s or the system’s own)
Constraint handling observed: whether the system reasoned about user constraints (timing, location, eligibility) using the source’s structure
Entity preservation: whether the source was named, linked, logo-displayed, or all three
Attribution persistence: whether the attribution carried through multiple turns or follow-up queries
Confounders: what could plausibly explain the result other than the framework hypothesis
Interpretation confidence: low, medium, or high, with the reasoning named
Hypothesis generated: what this observation, if it repeats, would suggest
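
As an illustration only, the eleven fields could be held in a fixed record type so that every entry is forced into the same shape. The sketch below is an assumption about how such a record might look, not the register’s actual tooling: the class name RegisterEntry, the helper missing_fields and the Python field names are all hypothetical, and the sample values are abridged from Entry 3 further down this page. Interpretation confidence is kept as free text because the entries use ratings such as “Medium-high” and split ratings with reasoning.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class RegisterEntry:
    """One dated observation, with the eleven fields the register records."""
    date_observed: str
    query: str
    ai_surface: str
    retrieval_pattern: str
    feature_extracted: str
    constraint_handling: str
    entity_preservation: str
    attribution_persistence: str
    confounders: str
    interpretation_confidence: str  # free text, e.g. "Medium", plus the reasoning
    hypothesis_generated: str


# Field names in declaration order (hypothetical names mirroring the list above).
FIELD_NAMES = [f.name for f in fields(RegisterEntry)]


def missing_fields(entry: RegisterEntry) -> list[str]:
    """Return the names of any fields left blank."""
    return [name for name in FIELD_NAMES if not getattr(entry, name).strip()]


# Values abridged from Entry 3 on this page.
entry = RegisterEntry(
    date_observed="24 June 2026",
    query="best seo agency southampton",
    ai_surface="Google AI Mode",
    retrieval_pattern="Named card in the source panel, plus a profile block in the answer body",
    feature_extracted="Firm's own positioning reproduced near-verbatim",
    constraint_handling="Location-biased to Southampton",
    entity_preservation="Named source, linked source, card in the panel",
    attribution_persistence="Not tested across follow-up turns",
    confounders="Self-managed Google Business Profile; location bias; single session",
    interpretation_confidence="Medium",
    hypothesis_generated="Consistent positioning language is preserved near-verbatim",
)

print(len(FIELD_NAMES), missing_fields(entry))
```

Freezing the record is a design choice that mirrors the rule that published entries are not edited after the fact; a blank field shows up in missing_fields rather than passing silently.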

Entry 1: Olliers Solicitors — Operation Soteria

Date observed: March 2026. Client: Olliers Solicitors. Sector: Criminal defence law.

Query: Operation Soteria-related criminal defence query (specific phrasing captured in screenshot evidence held with the client).

AI surface: Google AI Overview.

Retrieval pattern: Citation with the Olliers Solicitors logo displayed alongside the AI Overview answer. The source was named in the answer and visually anchored by the brand mark in the citation panel.

Feature extracted: The page’s structured explanation of the Operation Soteria framework was the substrate for the AI Overview’s answer. The page had been built to CITATE 6/6 with a standalone opening, explicit definitions of the operational terms, attributed claims, and named entity references throughout.

Constraint handling observed: The AI Overview reasoned about the legal context (which police forces, which charge types, which procedural stage) using the structure on the page rather than generic legal definitions. The page’s sectioning by procedural stage was preserved in the AI’s reasoning sequence.

Entity preservation: Named source, linked source, logo displayed in the citation panel. Strong on all three dimensions.

Attribution persistence: Persisted across follow-up queries within the same session. The system continued to surface Olliers Solicitors when adjacent procedural queries were submitted.

Confounders: Olliers is a recognised criminal defence firm with strong off-page corroboration (Chambers, Legal 500, law society listings). The AI Overview citation may reflect that corroboration rather than purely the CITATE-structured page. Disentangling on-page structure from off-page trust is not possible from a single observation.

Interpretation confidence: Medium-high. The combination of named citation, logo display, and structural preservation of the page’s sectioning is consistent with the hypothesis that pages built to extractable structure with strong entity corroboration become reusable substrates for AI Overview answers. It is not conclusive evidence that CITATE structure alone produced the result.

Hypothesis generated: Pages built to a measurement standard (CITATE 6/6) on entities with strong off-page corroboration appear more likely to be selected as the named substrate for AI Overview answers in regulated sectors where citation accuracy carries reputational stakes for the AI system.

Entry 2: Daves Taxis — Cruise Tour Planner

Date observed: May 2026. Client: Daves Taxis (David Plomer, sole trader, West End Southampton). Sector: Local transport / cruise tour services.

Query: Planning-stage cruise tour query for Southampton arrivals with specific docking and sailing time constraints.

AI surface: Google AI Mode.

Retrieval pattern: Branded card display of Daves Taxis with the business named and visually presented as a distinct entity, not merely listed in a results set.

Feature extracted: The AI Mode answer included, in near-verbatim form, the page’s description of the Cruise Tour Planner: “Features a dedicated ‘Cruise Tour Planner’ that filters destinations based on your exact docking and sailing hours.” The operationally distinct feature (the planner that handles the constraints specific to cruise arrivals) was preserved as a named differentiator in the AI’s response. The system did not paraphrase the feature into generic language; it preserved the named tool and its constraint logic.

Constraint handling observed: The AI Mode response sequenced the planning logic (arrival time, docking duration, sailing time, destination selection) using the page’s structured planner as the reasoning scaffold rather than generating generic tour-planning prose. This is the most operationally interesting aspect of the observation: the AI did not just cite the page; it reused the page’s constraint-handling structure inside its own answer.

Entity preservation: Named source, branded card, business presented as a distinct callable entity. Strong on all three.

Attribution persistence: Daves Taxis persisted across multiple planning-context follow-up queries within the same session, including variations in arrival time and tour duration. The system retained the entity and re-applied it to adjacent planning contexts rather than re-retrieving alternatives each turn.

Confounders: Daves Taxis already ranked third organically for “taxi cruise tours Southampton” before the observation, so organic ranking strength is a plausible contributor. The local sector also has fewer competing entities than the criminal defence sector, lowering the bar for entity preservation. Disentangling AI Mode’s feature-extraction behaviour from underlying organic visibility is not possible from a single observation.

Interpretation confidence: High on the feature-extraction observation specifically. The verbatim preservation of “Cruise Tour Planner” as a named tool and the use of its constraint-handling structure in the AI’s reasoning are documented in the captured response. Medium on the broader implication. The hypothesis it generates needs cross-vertical replication before it carries serious evidential weight.

Hypothesis generated: AI retrieval systems may preferentially preserve operationally distinct, named features (planners, calculators, structured decision tools) when those features handle a constraint the user’s query implies. The feature is not just extractable content; it is reusable operational structure the AI can scaffold its own answer around. Generic prose appears more likely to be paraphrased away; named operational tools appear more likely to be preserved verbatim. If this hypothesis holds across further observations, the implication is that high-value pages should increasingly contain named, constraint-aware operational structures (planners, decision flows, calculators) rather than only descriptive prose.

June 2026 batch: a measurement-discipline note

The four entries below were captured on 24 June 2026 in a single working session. They introduce a discipline the first two entries did not need: multi-run observation. Where Entry 4 records six runs of the same query within forty minutes, the variation between those runs is itself the finding. This matters because independent research now quantifies how unstable AI citation sets are. SISTRIX (Johannes Beus, May 2026), tracking 82,619 prompts over 17 weeks across six countries, found Google AI Mode replaces roughly 56% of its cited sources every week, with a small stable core and a much larger rotating carousel. Profound, analysing 240 million ChatGPT citations, found 40 to 60% of cited domains change month to month for identical queries. A single observation is therefore a sample of a volatile distribution, not a fixed position. These entries name run counts where available and explicitly decline to quote a drift percentage from a handful of runs.
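
To make the idea of source-set churn concrete, here is a minimal sketch of how churn between two runs could be computed. This page does not give the exact metric definitions used by SISTRIX or Profound, so the definition below (the share of the previous run’s cited sources that are missing from the next run) is an assumption, and the function name source_churn and the domains are hypothetical.

```python
def source_churn(previous: set[str], current: set[str]) -> float:
    """Share of the previous run's cited sources that are absent from the current run.

    Assumed definition for illustration; the cited studies may measure churn differently.
    """
    if not previous:
        raise ValueError("previous source set is empty")
    return len(previous - current) / len(previous)


# Hypothetical cited domains from two runs of the same query.
run_a = {"a.example", "b.example", "c.example", "d.example"}
run_b = {"a.example", "b.example", "e.example", "f.example"}

churn = source_churn(run_a, run_b)  # two of the four earlier sources dropped out
stable_core = run_a & run_b         # sources present in both runs

print(f"churn: {churn:.0%}, stable core: {sorted(stable_core)}")
```

Under this reading, a source that appears in one run and not the next contributes to churn, while the intersection across runs approximates the stable core.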

Entry 3: SEO Strategy Ltd, self-entity magnet

Date observed: 24 June 2026. Client: SEO Strategy Ltd (self-observation). Sector: SEO and AI visibility consultancy.

Query: “best seo agency southampton”.

AI surface: Google AI Mode (36-site panel), corroborated by the standard organic result for the same query.

Retrieval pattern: Named card in the AI Mode source panel with the /seo-agency-southampton/ page cited, plus a full profile block in the answer body. SEO Strategy was named in the opening summary alongside other local agencies.

Feature extracted: The answer reproduced the firm’s own positioning near-verbatim: “senior consultant with over 20 years”, “no junior handoffs”, “deep, multi-year local client relationships”, and the AI-platform specialism (“AI-driven platforms like ChatGPT and Google AI Overviews”). This is the same feature-preservation pattern seen in the Daves Taxis entry (Entry 2), here on the consultancy’s own entity.

Constraint handling observed: Location-biased to Southampton; the answer reasoned about local market plus AI-visibility specialism using the page’s positioning rather than generic agency boilerplate.

Entity preservation: Named source, linked source, card in the panel. Strong on all three.

Attribution persistence: Not tested across follow-up turns in this session.

Confounders: The Google Business Profile for this entity is self-managed, which can influence the card. Location bias to Southampton is active. SEO Strategy appeared fourth in the opening summary, behind two other named agencies, so this is a magnet, not the top pick. Single session, so volatility is not characterised here.

Interpretation confidence: Medium. The near-verbatim preservation of the firm’s own positioning is clear in the captured response, but the self-managed profile and location bias mean this is not clean evidence that page structure alone produced the placement.

Hypothesis generated: A consultancy’s own consistent positioning language is preserved near-verbatim into AI Mode answers where the entity is well corroborated and the query intent matches its profile. Classification: Citation Magnet (retrieved and selected), with the caveat that magnet status is not the same as being the single recommended pick.

Entry 4: Olliers Solicitors, citation volatility on an informational query

Date observed: 24 June 2026. Client: Olliers Solicitors. Sector: Criminal defence law. Note: This entry concerns a different observation from Entry 1 (Operation Soteria, a clean win). Both are kept. Per this register’s method, contradictory or differing observations on the same entity are recorded side by side rather than reconciled.

Query: “what happens if charged with possession of indecent images”.

AI surface: Google AI Mode (multiple runs) and standard Google search.

Retrieval pattern: Variable across six runs in the same forty-minute window. On standard Google search, the Olliers page held the featured snippet (the answer box), placed above the Sentencing Council and the Crown Prosecution Service. In Google AI Mode, the Olliers page appeared as a cited source in four of six runs. In the two runs where it did not appear, the AI answer had drawn from a noticeably wider source pool (around 51 sites, against roughly 19 to 29 in the runs where it did appear).

Feature extracted: On standard search, the page answered the informational query directly in the snippet (the offence, the sentencing range). On AI Mode runs where present, the page was cited as a supporting source rather than having a named feature reused.

Constraint handling observed: Not the axis of this observation. The finding is run-to-run variance in whether the source is retrieved at all.

Entity preservation: Strong where present (named card, linked) and snippet ownership on standard search; absent entirely in the two wider-pool AI Mode runs.

Attribution persistence: Not measured as session persistence here; the relevant axis is presence versus absence across independent runs.

Confounders: The six runs differed in session state (some logged in, some incognito) as well as in time, so pure temporal drift cannot be isolated from session-state and pool-size effects. Six runs establish direction, not a reliable percentage. The page is a CITATE-built, on-topic specialist page, so this is not a content-gap absence.

Interpretation confidence: Medium-high on the qualitative finding (strong and stable on standard search, present in most AI Mode runs, occasional drop-out in wider-pool runs). Low on any precise drift rate, which six runs cannot support.

Hypothesis generated: A strong, snippet-holding, CITATE-built page can still drop out of individual AI Mode runs. This is carousel behaviour rather than failure to anchor in the stable core, and it is consistent with the independent drift research cited above. A secondary, weaker observation from this set: the two absences coincided with the widest source pools, suggesting that broad exploratory retrievals may reach for generic authorities while tighter retrievals favour the directly relevant specialist page. Six runs make this a hypothesis, not a rule. The practical implication is that AI Mode placement should be tracked as persistence across repeated same-condition runs, never inferred from a single check.
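
Why six runs cannot support a percentage can be shown with a standard Wilson score interval for a proportion. The sketch below is illustrative and rests on a simplifying assumption that the entry’s own confounders contradict: it treats the six runs as independent, same-condition trials, whereas they differed in session state. The function name wilson_interval is hypothetical; the formula is the usual Wilson interval at 95% confidence (z = 1.96).

```python
import math


def wilson_interval(successes: int, runs: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a proportion (z = 1.96 gives roughly 95% confidence)."""
    if runs <= 0 or not 0 <= successes <= runs:
        raise ValueError("need runs > 0 and 0 <= successes <= runs")
    p = successes / runs
    denom = 1 + z**2 / runs
    centre = (p + z**2 / (2 * runs)) / denom
    half_width = z * math.sqrt(p * (1 - p) / runs + z**2 / (4 * runs**2)) / denom
    return centre - half_width, centre + half_width


# Entry 4: the page was cited in four of six AI Mode runs.
low, high = wilson_interval(4, 6)
print(f"plausible presence rate: {low:.0%} to {high:.0%}")
```

Even under that generous independence assumption, four of six gives an interval of roughly 30% to 90%, which is why the entry reports direction only and declines to quote a drift rate.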

Entry 5: Coviant Software (Diplomat MFT), intent-conditional retrieval

Date observed: 24 June 2026. Client: Coviant Software (Diplomat MFT). Sector: B2B SaaS, managed file transfer.

Query: A set of related queries. Profile-matching queries: “automatic pgp encryption software for hipaa compliance” and “affordable MFT for HIPAA compliance”. Generic enterprise queries: “secure mft for healthcare compliance” and “top compliance platforms”.

AI surface: Google AI Mode and ChatGPT.

Retrieval pattern: Split by query intent. On the profile-matching queries, Diplomat MFT was returned unprompted at or near the top, and the Coviant page was the lead citation card on the affordable-MFT query. On the generic enterprise queries, Diplomat was absent and the enterprise incumbents (GoAnywhere, MOVEit, Kiteworks, Progress, OPSWAT) owned the answer.

Feature extracted: On the magnet queries, the answer preserved the product’s differentiators near-verbatim: no-code visual automation, flat-rate licensing that avoids per-user price increases, integrated audit trails, the Basic Edition for smaller teams, and the record of 20 years and zero breaches, which is carried through the Capterra listing.

Constraint handling observed: The systems matched query intent to the entity profile the models hold (affordable, simple, no-code, SMB to mid-market) and returned the entity where intent matched, omitting it where intent implied large-enterprise scale.

Entity preservation: Named, carded, own page cited on the profile queries. Absent on the enterprise comparison queries.

Attribution persistence: On a direct comparison prompt, ChatGPT advised against a simple better-than claim and offered a defensible positioning line of its own (core capabilities most organisations need, lower complexity, typically lower cost than enterprise-focused alternatives), consistent with the entity it already holds.

Confounders: Single session, so volatility is not characterised. Generic comparison queries naturally favour incumbents with deeper third-party corroboration, which is a plausible cause of the absences independent of any Diplomat weakness.

Interpretation confidence: Medium-high on the intent-split pattern, which held consistently across several distinct queries on two surfaces. Lower on the durability of the pattern over time, which needs repeated sampling.

Hypothesis generated: AI retrieval is intent-conditional. The same entity is a Citation Magnet where the query intent matches its corroborated profile and Citation Invisible where the intent implies a profile it does not hold. This is the same shape observed for Olliers across informational versus recommendation queries, now in a third sector (B2B SaaS), which raises its weight as a cross-vertical pattern rather than a local-search artefact.

Entry 6: Daves Taxis, cited but not recommended

Date observed: 24 June 2026. Client: Daves Taxis (David Plomer, Southampton). Sector: Local transport, cruise tour services. Note: A second observation on the same entity as Entry 2, on a different gate.

Query: “cruise taxi tours southampton”.

AI surface: Google AI Mode.

Retrieval pattern: Daves Taxis was the lead citation card in the source panel and was cited inline for the opening definition and the structured drive-times list. However, in the “Top Rated Providers” recommendation block within the same answer, the profiled and recommended businesses were Southampton White Taxi Excursions and Air2Port, not Daves. Two AI Mode renders of the same query also produced materially different framings (one a drive-times answer leading with Daves, the other a top-rated-providers answer leading with the competitors).

Feature extracted: The page’s structured drive-times content was preserved (New Forest roughly 35 minutes, Stonehenge roughly 1 hour 15, Bath roughly 1 hour 30), reused as the scaffold for the answer’s itinerary section.

Constraint handling observed: The answer used Daves’s structured drive-time content as the reasoning scaffold for the planning section, the same feature-extraction behaviour recorded in Entry 2.

Entity preservation: Cited and named as the source authority, but not selected as the recommendation. The two gates diverged within a single answer.

Attribution persistence: Render variance was the notable feature: the same query produced different leading entities across two runs.

Confounders: The recommendation split tracks a visible review-count gap (Daves against Southampton White Taxi Excursions at 4.7 from 25 reviews and Air2Port at 5.0 from 151 reviews), which plausibly drives the selection independent of citation. Single session.

Interpretation confidence: High on the cited-but-not-recommended observation specifically, which is visible in a single captured answer. Medium on the broader implication, which needs replication.

Hypothesis generated: Citation (being retrieved and used as a source) and recommendation (being selected as the suggested choice) are separate gates. An entity can be the cited authority for a topic and still not be the recommended pick. This is the Citation Repellent-adjacent state from the Citation Magnet Classification: the entity is retrieved and cited but not selected, which is a Selection-layer (Floor 3) signal rather than a retrieval problem. The likely lever here is review volume and trust signals rather than page structure, which is already in place.

What this register is not

This is not a results portfolio. It is not a case study collection. It is not a marketing surface. The register exists to document observed AI behaviour with enough structural consistency that the observations can be compared, the hypotheses can be tested, and conclusions can be drawn or refuted as the entry count grows.

Six entries is the start of a pattern, not yet proof. The 24 June 2026 batch added a magnet (SEO Strategy), a volatility observation (Olliers), an intent-split (Coviant) and a cited-but-not-recommended case (Daves), across SEO, legal and B2B SaaS. Eight to twelve entries across multiple sectors, observed repeatedly rather than once, would be enough to begin saying which observations recur, which were idiosyncratic, and which hypotheses are earning evidential weight. The register’s value compounds with entry count and time span, not with any individual observation.

Methodology and update cadence

Each entry is captured at the time of observation: the query, the screenshot, the surface, the response. Entries are added in the order observed. Existing entries are not edited retroactively. If a later observation contradicts an earlier one, both are kept and the contradiction is named in a new entry (the two Olliers entries are an example: a clean win and a later volatility observation, held side by side). From the June 2026 batch onward, observations of unstable queries record run counts and decline to quote a drift percentage from a small number of runs, because the platforms are too variable for a single check to be reliable.
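
The append-only rule can be pictured as a log that only ever grows, where a differing observation is linked to the earlier entry rather than overwriting it. The sketch below is a hypothetical illustration, not the register’s actual implementation: the names Entry, AppendOnlyRegister and differs_from are assumptions, and the one-line summaries are abridged from the entries on this page.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Entry:
    number: int
    entity: str
    summary: str
    differs_from: Optional[int] = None  # number of an earlier entry this one sits beside


class AppendOnlyRegister:
    """Entries can be added in order observed; nothing is edited or removed."""

    def __init__(self) -> None:
        self._entries: list[Entry] = []

    def append(self, entity: str, summary: str, differs_from: Optional[int] = None) -> Entry:
        # A cross-reference may only point at an entry that already exists.
        if differs_from is not None and not 1 <= differs_from <= len(self._entries):
            raise ValueError("differs_from must reference an existing entry")
        entry = Entry(len(self._entries) + 1, entity, summary, differs_from)
        self._entries.append(entry)
        return entry

    @property
    def entries(self) -> tuple[Entry, ...]:
        # Return an immutable copy so callers cannot reorder or delete entries.
        return tuple(self._entries)


register = AppendOnlyRegister()
register.append("Olliers Solicitors", "Operation Soteria: cited with logo in AI Overview")
register.append("Daves Taxis", "Cruise Tour Planner preserved near-verbatim in AI Mode")
register.append("SEO Strategy Ltd", "Self-entity magnet in AI Mode")
register.append("Olliers Solicitors", "Cited in four of six AI Mode runs", differs_from=1)

for e in register.entries:
    print(e.number, e.entity, e.differs_from)
```

Here the fourth entry points back at the first, so both Olliers observations remain readable side by side without either being rewritten.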

The register is updated as observations occur, not on a fixed cadence. Quality of observation beats quantity of entries.

For the framework standards these observations are tested against, see the CITATE framework, the AI Discovery Stack, and the full SEO Strategy Frameworks register. For the commercial engagement that produces this work, see the AI Visibility Audit.

Key Definitions

Observed-outcomes register: A structured, append-only public log of documented AI retrieval, citation, and feature-extraction behaviour observed on client work. Each entry uses the same fields so observations are comparable over time and across sectors. Entries are observations, not proofs — documented data points whose evidential weight comes from accumulation, not from any individual record.
Feature extraction: The behaviour observed when an AI system preserves a named, operationally distinct feature from a source page verbatim in its own response — particularly when that feature handles a constraint the user’s query implies (timing, eligibility, location, sequence). Distinct from paraphrastic summarisation, where the system rewrites the source’s information in generic language. The Daves Taxis Cruise Tour Planner observation is the canonical example.
Attribution persistence: Whether an AI system continues to surface, name, and credit the same source across follow-up queries in the same session, rather than re-retrieving alternatives each turn. High attribution persistence indicates the system has formed a session-level entity preference for the source.

Frequently Asked Questions

What is the observed-outcomes register?

A public log of documented AI retrieval, citation, and feature-extraction behaviour observed on SEO Strategy Ltd client work. Each entry uses the same structured fields so observations are comparable over time. The register exists to generate evidence for or against hypotheses about how AI retrieval and recommendation systems behave — not to prove a framework correct.

Is this a case study collection?

No. Case studies present client results from a marketing angle. The observed-outcomes register documents AI behaviour with structural consistency for cross-comparison, names the confounders, and rates the interpretation confidence. The two formats serve different purposes; this register is for evidence, not promotion.

Why does the register say a source can appear in one AI answer and not the next?

Because AI citation sets are unstable by design. Independent research (SISTRIX, May 2026, 82,619 prompts over 17 weeks) found Google AI Mode replaces roughly 56% of its cited sources every week, with a small stable core and a large rotating carousel; Profound found 40 to 60% of cited domains change month to month for identical queries. Entry 4 of this register records that behaviour live on a client page across six runs in forty minutes. A single check is therefore a sample of a volatile distribution, which is why the register tracks presence over repeated runs (persistence) rather than trusting any one observation.

What does an entry need to qualify for inclusion?

A specific AI surface, a specific query, a documented response with screenshot evidence, and enough information to fill all eleven structured fields including confounders and interpretation confidence. Observations that cannot be honestly rated for confidence or that have unnamed confounders do not get added.

How many entries are in the register?

Six as of June 2026. It launched with two structured entries in May 2026 and was expanded to six on 24 June 2026 across SEO, legal and B2B SaaS. Quality of observation beats quantity: six well-structured entries with named confounders are more useful than dozens of loosely described ones. The register is append-only and compounds over months; eight to twelve entries observed repeatedly across multiple sectors would be enough to begin saying which observations recur and which hypotheses are earning evidential weight.

What is feature extraction and why does it matter?

Feature extraction is the behaviour observed when an AI system preserves a named, operationally distinct feature from a source page verbatim — particularly when the feature handles a constraint the user’s query implies. Distinct from paraphrastic summarisation. If this pattern holds across observations, the implication is that high-value pages should increasingly contain named, constraint-aware operational structures (planners, calculators, decision tools) rather than only descriptive prose.

Founder of SEO Strategy Ltd with 20+ years in SEO, web development and digital marketing. Specialising in healthcare IT, legal services and SaaS — from technical audits to AI-assisted development.
