Last updated: March 2026
You’re not an SEO. You’ve been told the site needs work. Maybe traffic has plateaued. Maybe a competitor appeared from nowhere and is outranking you despite having half your content. Maybe you got a 400-line spreadsheet from an agency and have no idea whether to fix item 12 or item 387 first. This page exists for you.
The checklist below is prioritised by what actually moves rankings and AI visibility — not by what crawl tools flag loudest. It separates the ranking blockers from the hygiene tasks, includes the AI citation readiness checks that no standard audit covers, and ends with a straight answer on what a proper audit costs, how long it takes, and what you should get from it. If you’d rather have SEO Strategy Ltd run it for you: book a free 30-minute consultation.
Why Most SEO Audit Checklists Send You in the Wrong Direction
Every SEO tool produces a list of issues in under ten minutes. Crawl errors, missing meta descriptions, images without alt text, pages without H1 tags — hundreds of items, colour-coded by severity, ready to export into a PDF. The problem is that the list treats every issue as roughly equal. A missing meta description on a low-traffic archive page gets the same flag as a crawl budget problem that’s preventing your 50 best pages from being indexed. You spend two weeks writing meta descriptions. Nothing moves.
Here is where most sites’ performance problems actually live — and it is not in the items crawl tools shout loudest about.
Internal linking architecture accounts for more unexplained ranking plateaus than any other single factor. PageRank is flowing from your homepage into blog posts that link to nothing, while your commercial pages sit at crawl depth six with almost no internal authority pointing at them. This is not a tool output — it is a structural problem you have to map manually. It is also the fix that most consistently produces visible ranking movement in the shortest time.
Index quality matters more than index size. A site with 12,000 indexed pages and 300 that get meaningful traffic has an index quality problem. Google uses site-wide quality signals, which means 11,700 thin, low-value pages are actively dragging down the authority of the 300 that deserve to rank. Consolidating, redirecting, and noindexing the dead weight can produce faster ranking improvements than months of new content.
Content retrieval structure is the check almost no audit runs — and in 2026 it matters more than most technical factors. Google AI Overviews, Perplexity, and ChatGPT retrieve content at paragraph level, not page level. A page can be technically perfect and still be completely invisible to AI systems because its sections don’t open with standalone answers, its statistics lack source attribution, and its entity references are all pronouns. This is covered in the AI Citation Readiness tier below.
The Five Questions That Reveal Most Ranking Problems
Before running a single tool, answer these five questions. In most audits Sean Mullins runs, the answer to one of them is the primary reason the site is underperforming — and fixing it produces more movement than months of new content or link building.
1. Is internal link equity reaching your commercial pages? Open your five most important pages — the ones you need to rank and convert. Count how many internal links point to each of them from pages outside your navigation. If the answer is two or three, that is the problem. Blog posts linking to nothing, homepage links pointing to archives, commercial pages orphaned at crawl depth five — this pattern is extraordinarily common and highly fixable.
2. What percentage of your indexed pages deserve to rank? Run site:yourdomain.com in Google. Look at what comes up. Tag pages, author archives, filtered URLs, thin posts from 2016, pagination variants — all indexed, all diluting your site’s quality signals. The Index Quality Ratio (meaningful indexed pages ÷ total indexed pages) should be above 40%. Most sites are sitting at 5–10%.
3. Where is Google spending its crawl budget? On larger sites, server log analysis routinely shows 60–80% of Googlebot’s crawl activity going to parameter URLs, faceted navigation variants, and pagination — not the pages you want indexed. Fixing crawl waste is often the fastest route to getting fresh content discovered and ranked. A short log-parsing sketch for this check follows the list.
4. Is your content structured for AI retrieval? Does each major section of your key pages open with a direct, standalone answer in the first sentence? Are statistics cited with number, population, action, timeframe, and source? Are entities named explicitly — “SEO Strategy Ltd’s llms.txt Generator WordPress plugin” rather than “our tool”? These structural characteristics determine AI citation, and they are entirely separate from traditional ranking factors.
5. Does Google know what your brand is an authority on? Entity clarity — consistent schema markup, a coherent knowledge graph presence, cross-platform entity signals — determines whether AI systems trust your content enough to cite it. A business with consistent structured data, a Wikidata record, and entity-anchored content is evaluated very differently from an identical business with none of those signals. This is the foundation everything else is built on. See the entity SEO guide for the full framework.
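For question 3, a rough picture of crawl budget allocation can be pulled straight from raw access logs before any paid log analysis tooling is involved. A minimal sketch in Python, assuming a combined-format access log saved as access.log; the user-agent match is a crude filter (a production check should verify Googlebot via reverse DNS), and the URL patterns are placeholders to adapt to your own site structure:

```python
import re
from collections import Counter

# Placeholder path and URL patterns; adapt to your own log location and site structure.
LOG_FILE = "access.log"

# Combined log format: ... "GET /path HTTP/1.1" status size "referer" "user-agent"
REQUEST_RE = re.compile(r'"(?:GET|HEAD) (\S+) HTTP/[\d.]+"')

buckets = Counter()
total = 0

with open(LOG_FILE, encoding="utf-8", errors="ignore") as fh:
    for line in fh:
        if "Googlebot" not in line:  # crude filter; verify via reverse DNS in production
            continue
        match = REQUEST_RE.search(line)
        if not match:
            continue
        path = match.group(1)
        total += 1
        if "?" in path:
            buckets["parameter URLs"] += 1
        elif "/page/" in path or "page=" in path:
            buckets["pagination"] += 1
        elif "/tag/" in path or "/author/" in path:
            buckets["tag/author archives"] += 1
        else:
            buckets["everything else"] += 1

for bucket, hits in buckets.most_common():
    print(f"{bucket:<22} {hits:>6}  ({hits / total:.0%})")
```

If parameter URLs, pagination, and archive variants account for the majority of Googlebot hits, that is the crawl waste described above, and it belongs near the top of the fix list.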
The SEO & AI Audit Checklist — Prioritised by Impact
Four tiers, ordered by what moves rankings and AI visibility. Work through them in this order. Do not start on Tier 4 while Tier 1 problems are unresolved.
Tier 1: Fix These First — Ranking Blockers
These issues, if present, are directly suppressing your rankings right now. Everything else is optimising around a structural problem until these are resolved.
Crawlability and indexation
- Run site:yourdomain.com — are the results what you’d expect, or are tag pages, author archives, and parameter variants dominating?
- Check robots.txt for unintentional blocks — live sites accidentally blocking their own blog directory are more common than you’d think. Read every line and check the robots.txt report in Search Console.
- Verify your XML sitemap exists, is referenced in robots.txt, and contains the pages you actually want indexed. Broken sitemaps appear on roughly one in four audited sites.
- Check for noindex tags on pages that should be indexed — CMS updates and staging-to-production migrations are the most common cause. Check both the HTML meta robots tag and X-Robots-Tag HTTP header.
- Review canonical tags — every indexable page should have a self-referencing canonical. Inconsistencies between canonicals, sitemaps, and internal links confuse Google’s consolidation logic. The sketch after this list batch-checks status codes, robots directives, and canonicals for a list of priority URLs.
- Test JavaScript rendering via Search Console’s URL Inspection tool — compare raw HTML against what Google renders. Content that’s only visible after JS executes may be invisible to Google.
- Check for soft 404s — pages returning 200 status but showing error content or empty results. Out-of-stock product pages are the most common offender.
- Identify orphan pages — if the only route to a page is the XML sitemap and nothing links to it internally, Google’s implicit signal is that you don’t think it matters.
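Several of the indexation checks above (status codes, noindex directives in both the meta robots tag and the X-Robots-Tag header, and self-referencing canonicals) can be batch-checked rather than inspected page by page. A minimal sketch using the third-party requests and beautifulsoup4 packages, with a placeholder URL list:

```python
import requests
from bs4 import BeautifulSoup

# Placeholder list: swap in your own priority pages.
PRIORITY_URLS = [
    "https://www.example.com/",
    "https://www.example.com/services/",
]

for url in PRIORITY_URLS:
    resp = requests.get(url, timeout=15, headers={"User-Agent": "audit-script"})
    soup = BeautifulSoup(resp.text, "html.parser")

    meta_robots = soup.find("meta", attrs={"name": "robots"})
    canonical = soup.find("link", rel="canonical")
    canonical_href = canonical.get("href", "") if canonical else ""

    print(url)
    print("  status:        ", resp.status_code)
    print("  x-robots-tag:  ", resp.headers.get("X-Robots-Tag", "(none)"))
    print("  meta robots:   ", meta_robots.get("content") if meta_robots else "(none)")
    print("  canonical:     ", canonical_href or "(none)")
    print("  self-canonical:", canonical_href.rstrip("/") == url.rstrip("/"))
```

This catches 200-with-noindex and missing or mismatched canonicals quickly. It reads raw HTML only, so rendering problems still need Search Console’s URL Inspection tool.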
Internal linking architecture
- Map click depth for priority pages — can they be reached within three clicks from the homepage? Pages buried deeper get crawled less frequently and tend to rank worse. The crawl sketch after this list computes both click depth and internal inlink counts.
- Draw the internal link authority flow: homepage → ? → ? → commercial pages. In most sites, authority flows into blog posts and away from money pages. This map reveals the structural problem.
- Audit anchor text on internal links to commercial pages — “click here” and “find out more” waste a signal that should describe the target page’s topic.
- Find and fix all broken internal links — every 404 you’re linking to internally is a dead end for both users and Googlebot.
- Identify redirect chains in internal links — a link going through two or three redirects loses equity at each hop. Update to point directly at the final URL.
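Click depth and internal inlink counts can be read from a Screaming Frog crawl export, but they can also be computed directly with a small breadth-first crawl from the homepage. A minimal sketch, assuming a modest same-domain site, the requests and beautifulsoup4 packages, and a placeholder start URL; it does not distinguish navigation links from in-content links, so treat the inlink counts as an upper bound:

```python
from collections import deque
from urllib.parse import urljoin, urlparse

import requests
from bs4 import BeautifulSoup

START = "https://www.example.com/"   # placeholder homepage
DOMAIN = urlparse(START).netloc
MAX_PAGES = 500                      # keep the sketch bounded

depth = {START: 0}
inlinks = {}                         # url -> count of internal links pointing at it
queue = deque([START])

while queue and len(depth) < MAX_PAGES:
    url = queue.popleft()
    try:
        html = requests.get(url, timeout=15).text
    except requests.RequestException:
        continue
    for a in BeautifulSoup(html, "html.parser").find_all("a", href=True):
        target = urljoin(url, a["href"]).split("#")[0]
        if urlparse(target).netloc != DOMAIN:
            continue                  # external link; ignore
        inlinks[target] = inlinks.get(target, 0) + 1
        if target not in depth:
            depth[target] = depth[url] + 1
            queue.append(target)

# Deepest pages first: priority pages at depth four or more, or with
# single-digit inlink counts, are the structural problem described above.
for url, d in sorted(depth.items(), key=lambda item: -item[1])[:20]:
    print(f"depth {d}  inlinks {inlinks.get(url, 0):3}  {url}")
```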
Tier 2: Fix Next — Structural Drag
These issues aren’t acute ranking blockers in most cases, but they represent compounding drag. A site running with these unresolved is working harder than it needs to for every ranking it achieves.
Index quality
- Calculate your Index Quality Ratio: meaningful indexed pages ÷ total indexed pages. Below 40% means index consolidation should be a priority before new content creation. A short script for this calculation follows the list.
- Identify thin, low-value indexed pages — tag archives, author pages, parameter variants, expired content, paginated archives beyond page 2. These dilute site-wide quality signals.
- For thin pages that can’t be improved: consolidate, redirect, or noindex. Reducing index bloat often produces faster authority gains than publishing new content.
- Run a content decay analysis — pages that peaked 12–18 months ago and have declined since. Refreshing these is typically faster ROI than new pages, because the authority is already there.
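The Index Quality Ratio calculation is simple arithmetic, but scripting it makes it repeatable after each consolidation pass. A minimal sketch, assuming an indexed-pages export from Search Console and an organic landing page export from your analytics tool; the file names and column headers are placeholders, and traffic is used here as a proxy for “meaningful”, so pages with commercial value but little traffic should be added to the valuable set by hand:

```python
import csv

def load_urls(path, column):
    """Read one column of URLs from a CSV export and normalise trailing slashes."""
    with open(path, newline="", encoding="utf-8") as fh:
        return {row[column].strip().rstrip("/") for row in csv.DictReader(fh) if row.get(column)}

# Placeholder file names and column headers; match them to your actual exports.
indexed = load_urls("gsc_indexed_pages.csv", "URL")
valuable = load_urls("organic_landing_pages.csv", "Landing page")

meaningful_indexed = indexed & valuable
ratio = len(meaningful_indexed) / len(indexed) if indexed else 0.0

print(f"Indexed pages:            {len(indexed)}")
print(f"Meaningful indexed pages: {len(meaningful_indexed)}")
print(f"Index Quality Ratio:      {ratio:.0%}")
print(f"Consolidation candidates: {len(indexed - valuable)}")
```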
Site speed and Core Web Vitals
- Run PageSpeed Insights on your five most important page templates — focus on the specific recommendations, not the headline score. The sketch after this list pulls LCP for a set of templates via the PageSpeed Insights API.
- Check Largest Contentful Paint (LCP): if it’s over 2.5 seconds, investigate hero image size, server response time, and render-blocking resources.
- Audit image optimisation across the site — missing WebP or AVIF format, images served at the wrong dimensions, missing lazy loading for below-fold images. Fixing images alone frequently delivers more speed improvement than everything else combined.
- Check for render-blocking scripts and stylesheets — if critical CSS isn’t inlined and scripts aren’t deferred, the browser waits before rendering anything.
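LCP for key templates can be pulled programmatically through the public PageSpeed Insights v5 API instead of one browser tab at a time. A minimal sketch with placeholder URLs; no API key is needed for a handful of requests, and the response field names reflect the API as documented at the time of writing:

```python
import requests

PSI_ENDPOINT = "https://www.googleapis.com/pagespeedonline/v5/runPagespeed"

# Placeholder templates: swap in your own key page types.
TEMPLATES = [
    "https://www.example.com/",
    "https://www.example.com/services/",
]

for url in TEMPLATES:
    data = requests.get(PSI_ENDPOINT, params={"url": url, "strategy": "mobile"}, timeout=120).json()

    # Lab LCP from the Lighthouse run, in milliseconds.
    lab_lcp = data["lighthouseResult"]["audits"]["largest-contentful-paint"]["numericValue"]

    # Field LCP from real Chrome users, present only when the page has enough CrUX traffic.
    field = data.get("loadingExperience", {}).get("metrics", {})
    field_lcp = field.get("LARGEST_CONTENTFUL_PAINT_MS", {}).get("percentile")

    print(url)
    print(f"  lab LCP:   {lab_lcp / 1000:.1f}s")
    print(f"  field LCP: {field_lcp / 1000:.1f}s" if field_lcp else "  field LCP: no CrUX data")
```

The field value reflects real Chrome users and is the one that matters for Core Web Vitals assessment; the lab value is the diagnostic you can re-measure immediately after a fix.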
On-page technical signals
- Audit title tags for uniqueness and clarity — pull a full list via Screaming Frog and sort by duplicates. Every page needs a unique title that describes exactly what it covers.
- Check heading hierarchy — every page should have one H1 that matches the primary topic. Multiple H1s per page, or logos wrapped in H1 tags on every page, are still common.
- Verify structured data via Google’s Rich Results Test — confirm schema types are error-free and that data matches what’s visible on the page.
- Check for duplicate content across URL variants: www vs non-www, HTTP vs HTTPS, trailing slash vs no trailing slash — all should resolve to a single canonical URL. The sketch after this list checks each variant’s redirect target.
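The www/non-www, HTTP/HTTPS, and trailing-slash variants in the last item are easy to verify: request each variant without following redirects and confirm it returns a single permanent redirect to the canonical URL. A minimal sketch with placeholder URLs:

```python
import requests

# Placeholder URLs: the canonical version and the variants that should redirect to it.
CANONICAL = "https://www.example.com/services/"
VARIANTS = [
    "http://example.com/services/",
    "http://www.example.com/services/",
    "https://example.com/services/",
    "https://www.example.com/services",  # missing trailing slash
]

for variant in VARIANTS:
    resp = requests.get(variant, allow_redirects=False, timeout=15)
    location = resp.headers.get("Location", "")
    # Pass condition: a permanent redirect straight to the canonical.
    # A 302, a 200, or a hop to anything else needs fixing.
    ok = resp.status_code in (301, 308) and location.rstrip("/") == CANONICAL.rstrip("/")
    print(f"{'OK ' if ok else 'FIX'}  {variant}  ->  {resp.status_code}  {location or '(no redirect)'}")
```

Some servers return relative Location headers; resolve them with urljoin before comparing. A variant that needs two hops to reach the canonical is a redirect chain (see the internal linking checks in Tier 1); a variant that returns 200 is a genuine duplicate.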
Tier 3: AI Citation Readiness — The Checks Most Audits Don’t Run
This is the section that distinguishes a 2026 audit from a 2020 one. The checks above determine whether your pages can be found. These checks determine whether your pages get cited by Google AI Overviews, Perplexity, ChatGPT, and Copilot once they’re found. The GEO-Bench study (Princeton, Georgia Tech, IIT Delhi, 2024) found that adding statistics with full context improved AI citation rates by 41%, and adding authoritative source citations improved citation rates by 28%. A technically perfect page can be completely uncitable if its content structure doesn’t meet these criteria.
Node architecture — can each section stand alone?
- Does each H2 section open with a direct, standalone answer in the first 30–60 words? A section that requires context from the surrounding sections to make sense is not independently retrievable. A rough automated check for this follows the list.
- Are definitions explicit, not implied? “Query fan-out is the process by which AI search systems decompose a single user query into 6–20 parallel sub-queries” is citable. “Query fan-out, which you’ve probably heard about” is not.
- Do statistics carry full context? The citable format is: number + population + action + timeframe + source. “Studies show improvement” is not citable. “The GEO-Bench study (Princeton, Georgia Tech, IIT Delhi, 2024) found that statistics improved AI citation rates by 41%” is.
- Are headings written to answer questions? “Our Methodology” signals nothing. “How Sub-Query Coverage Mapping Works” signals exactly what the section answers.
- Does each section name at least one authoritative source? Source attribution is itself a citation-worthiness signal — content that grounds claims in evidence is evaluated as more trustworthy than content that asserts without it.
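The first check in this list, whether each H2 section opens with a standalone answer, can be roughly automated as a review prompt. A minimal heuristic sketch using requests and beautifulsoup4 with a placeholder URL; the word-count thresholds and pronoun list are assumptions, and a flagged section needs human judgement rather than automatic rewriting:

```python
import re

import requests
from bs4 import BeautifulSoup

URL = "https://www.example.com/guide/"  # placeholder page
DANGLING_OPENERS = {"this", "it", "they", "these", "that", "we"}  # assumed pronoun list

soup = BeautifulSoup(requests.get(URL, timeout=15).text, "html.parser")

for h2 in soup.find_all("h2"):
    first_para = h2.find_next("p")  # first paragraph after the heading, in document order
    if first_para is None:
        continue
    text = first_para.get_text(" ", strip=True)
    first_sentence = re.split(r"(?<=[.!?])\s", text, maxsplit=1)[0]
    words = first_sentence.split()

    problems = []
    if not 10 <= len(words) <= 60:  # assumed thresholds, loosely based on the 30-60 word guidance
        problems.append(f"opening sentence is {len(words)} words")
    if words and words[0].lower() in DANGLING_OPENERS:
        problems.append(f"opens with '{words[0]}'")

    status = "REVIEW: " + "; ".join(problems) if problems else "ok"
    print(f"{h2.get_text(strip=True)[:60]:<60}  {status}")
```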
Entity anchoring — are entities named throughout?
- Are entities named explicitly throughout content, or replaced with pronouns? “SEO Strategy Ltd’s llms.txt Generator WordPress plugin” is entity-anchored. “Our plugin” is not. Every named entity is a potential knowledge graph connection.
- Is your brand name associated with specific expertise claims on key pages — not just in the navigation and footer? AI systems build entity-topic associations from content, not navigation structure.
- Are client names and case study entities named explicitly? “Azure Outdoor Living, a Norfolk-based outdoor structures company” gives AI systems a grounded entity reference. “One of our clients” gives them nothing.
Sub-query coverage — are you answering the full question cluster?
- For each core topic, map the sub-query types: definition, comparison, how-to, use case, objection, entity expansion, metric. AI platforms decompose queries into 6–20 sub-queries (Google named this “query fan-out” at I/O 2025). Missing two or three sub-query types means missing two or three citation slots. A simple coverage matrix, sketched after this list, keeps the mapping honest.
- Query Perplexity, ChatGPT, and Google AI Mode for your core topics. Record who’s cited and why. If competitors appear and you don’t, that is a content brief, not a mystery.
- Check whether your content addresses the conversational, long-form versions of your target queries. iPullRank and Similarweb research (2026) shows AI search queries average 70–80 words, versus 3–4 words in traditional search.
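The sub-query mapping in the first item stays honest when it is kept as an explicit coverage matrix: one row per core topic, one entry per sub-query type, pointing at the URL that answers it or recording a gap. A minimal sketch with a hypothetical topic and hypothetical URLs:

```python
SUB_QUERY_TYPES = [
    "definition", "comparison", "how-to", "use case",
    "objection", "entity expansion", "metric",
]

# Hypothetical coverage map: topic -> {sub-query type: URL that answers it, or None for a gap}.
coverage = {
    "llms.txt": {
        "definition": "/what-is-llms-txt/",
        "comparison": None,                    # e.g. an llms.txt vs robots.txt page not yet written
        "how-to": "/llms-txt-generator-wordpress/",
        "use case": "/llms-txt-case-study/",
        "objection": None,
        "entity expansion": None,
        "metric": None,
    },
}

for topic, answers in coverage.items():
    missing = [sq for sq in SUB_QUERY_TYPES if not answers.get(sq)]
    covered = len(SUB_QUERY_TYPES) - len(missing)
    print(f"{topic}: {covered}/{len(SUB_QUERY_TYPES)} sub-query types covered")
    for sq in missing:
        print(f"  gap: {sq}")
```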
Freshness signals
- Does the page have a visible last-updated date?
- Do statistics include their publication year?
- Are framework or tool names versioned where relevant? “As of March 2026” is a freshness signal. “Recently” is not.
Tier 4: Technical Hygiene — Do These Last
These are the items crawl tools flag loudly and agencies often work on first. They matter — but fix them after the structural and retrieval issues. Spending three weeks on Tier 4 while Tier 1 problems are unresolved is the most common way SEO budgets get wasted.
- Meta descriptions — write them for click-through, not keywords. Google rewrites them frequently anyway, but a well-written description influences CTR when your snippet appears. Do this last, not first.
- Image alt text — write descriptive alt text for every meaningful image, primarily for accessibility. Don’t let an agency spend 10 hours on this while your internal linking is broken.
- Minor Core Web Vitals refinements — once you’re past the LCP and INP thresholds, marginal score improvements produce diminishing ranking returns.
- SSL certificate validity — confirm it’s valid, covers all subdomains, and isn’t approaching expiry. SSL Labs checks this in under a minute.
- Mixed content — confirm HTTPS pages aren’t loading resources over HTTP. Straightforward to resolve once identified. A quick detection sketch follows this list.
- HTML validation errors — fix structural errors; ignore cosmetic warnings. A perfectly valid HTML document with broken internal linking will not rank.
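The mixed content check above can be run against source HTML before the browser console ever flags it. A minimal sketch with a placeholder URL; it inspects raw HTML only, so resources injected by JavaScript will not be caught:

```python
import requests
from bs4 import BeautifulSoup

URL = "https://www.example.com/"  # placeholder page to check

soup = BeautifulSoup(requests.get(URL, timeout=15).text, "html.parser")

insecure = []
for tag, attr in (("img", "src"), ("script", "src"), ("link", "href"), ("iframe", "src")):
    for element in soup.find_all(tag):
        value = element.get(attr, "")
        if isinstance(value, str) and value.startswith("http://"):
            insecure.append((tag, value))

if insecure:
    for tag, value in insecure:
        print(f"mixed content  <{tag}>  {value}")
else:
    print("No http:// resources found in the raw HTML.")
```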
What a Proper Audit Costs, How Long It Takes, and What You Get
You want a direct answer to whether this is worth £2,000–£3,000 and a significant chunk of someone’s time. Here it is.
If you’re doing it yourself: running through this checklist properly — Google Search Console, Screaming Frog (free up to 500 URLs), PageSpeed Insights, server logs — takes 8–12 hours spread across a week. You’ll find real issues. Where DIY consistently falls short is prioritisation: understanding which of the 50 issues you’ve found is suppressing rankings most, and in what sequence to tackle them. That judgement comes from experience across dozens of sites.
If you’re spending £2,000–£3,000, here is what the output should include — and what to reject if it isn’t there:
- A prioritised list of the top 8–10 issues by impact, with “if we fix this, here’s what should happen” attached to each one. Not 400 issues — 10, ranked. If you receive a 400-line spreadsheet, you’ve received a tool export, not an audit.
- A competitor citation analysis: for your target queries, who is currently appearing in Google AI Overviews, Perplexity, and ChatGPT — and why? This turns the audit into a competitive intelligence document.
- A content gap analysis: which sub-query types are you missing, and which existing pages can be restructured to fill them without creating new content?
- A 90-day prioritised roadmap — week 1–2 for acute blockers, month 1 for structural work, months 2–3 for AI citation readiness. A sequenced plan, not a list of recommendations.
What moves, and when: Structural fixes — crawl waste, internal linking, index consolidation — typically show measurable ranking changes within 4–8 weeks. Content restructuring for AI citation readiness can show citation improvement within 2–6 weeks. Entity authority building is a 3–6 month compounding programme. The businesses that get the best ROI from a £2,000–£3,000 audit use it as the foundation for a 6–12 month programme, not a one-off exercise. An audit that sits in a folder produces zero ROI regardless of its quality.
Want SEO Strategy Ltd to run this for you and tell you exactly what matters? Book a free 30-minute consultation.
Proof: What Fixing the Right Things Actually Produces
Pro2col — cannibalisation audit: A full content audit identified 146 competing posts across the same topic clusters — multiple articles targeting the same intent, fragmenting authority without building any. Redirect mapping, consolidation, and rebuilding the strongest version of each cluster produced measurable ranking improvements within six weeks. The problem was not a lack of content. It was 146 pieces of content fighting each other.
Coviant Software — displacement pages from audit data: After identifying specific search patterns in GSC data collaboratively with the client, Sean Mullins built competitor displacement pages — “Serv-U vs Diplomat MFT” and similar comparison pages — targeting queries buyers use during active evaluation. These now rank and convert enterprise leads. The insight came from the audit, not from guessing.
Motoring Defence Solicitors — intent-matched content: The drink driving calculator at Motoring Defence Solicitors ranks for competitive terms and drives qualified leads from users seeking help at the exact moment of need. It ranks because it serves a specific, high-intent sub-query — a direct outcome of understanding the query fan-out around “drink driving” rather than targeting the head term alone.
Azure Outdoor Living — seven-figure organic growth: A sustained SEO programme for Azure Outdoor Living scaled the business to seven-figure national turnover. The foundation was an audit that identified where authority was leaking, where internal linking was weak, and where content architecture wasn’t matching buyer intent at different funnel stages.
Glossary
Index Quality Ratio — The proportion of a site’s indexed pages that have meaningful traffic or commercial value. Calculated as: valuable indexed pages ÷ total indexed pages. Below 40% indicates index bloat that is likely suppressing site-wide authority signals.
Crawl budget — The number of pages Googlebot crawls on a site in a given period. On larger sites, crawl budget is finite. Wasting it on parameter URLs and thin variants means priority pages get crawled less frequently and take longer to rank.
Node architecture — A content structuring principle in which every H2 section is independently retrievable by AI systems: opening with a standalone direct answer, containing explicit definitions, including statistics with full context, and naming entities explicitly throughout.
Query fan-out — The process by which AI search platforms decompose a single user query into 6–20 parallel sub-queries before retrieving any content. Google named this mechanism at I/O 2025. Only around 27% of sub-queries are consistent across repeated searches (Similarweb, March 2026), which makes optimising for specific sub-query phrases a losing strategy — comprehensive semantic coverage is the durable approach.
Entity anchoring — The practice of naming entities — brands, tools, people, frameworks, locations — explicitly and consistently throughout content, rather than using pronouns or generic references, to strengthen knowledge graph recognition and AI retrieval connections.
AI citation readiness — The structural characteristics that make content likely to be extracted and cited by AI retrieval systems: independent retrievability, explicit definitions, statistics with full context, entity anchoring, and named authoritative source attribution.
Content decay — The pattern by which a page’s organic traffic peaks and then declines as competitors produce fresher, more comprehensive content on the same topic. Decaying pages are often faster to restore through updating than to replace, because the authority and internal links are already in place.
Content cannibalisation — Multiple pages on the same site targeting the same or very similar search intent, splitting authority signals and preventing any single page from ranking as strongly as it could. Fixing cannibalisation — not creating more content — is often the fastest ranking lever available.