
Content Audit Agent

[To Build] · agent__content-auditor · Claude Sonnet 4.6

Audits published blog posts for decay signals: outdated facts, thin sections, keyword cannibalisation, missing internal links, and poor AI search structure. Produces a per-post health score with actionable recommendations and enables a one-click “Refresh this post” workflow.

Related: Blog Writer · Content Auditor Agent · RAG Integration · Content Optimizer · Performance Feedback Loop · Content Toolkit Overview


Overview

Function: Audit published blog posts for content decay, gaps, and cannibalisation; generate a refresh brief
Type: Worker - Content Quality
Status: To Build
Priority: P2 - Differentiating
Queue: agent__content-auditor
Concurrency: 2
Timeout: 8 min
Est. cost / task: ~$0.60
Credits: 1 cr per audit
Plan: Pro+

Why This Is Needed

Published blog posts decay. Statistics become outdated, competitors publish fresher content, and internal link opportunities that did not exist at publish time accumulate as the site grows. Without audits, the published content library quietly loses value, and nothing prompts anyone to spend credits fixing it.

The content auditor runs on demand (or on a scheduled cycle) and surfaces exactly what needs to change and why. Its findings feed directly into a pre-filled blog refresh brief, so no work is wasted re-researching.


What the Audit Checks

1. Content Freshness

| Signal | How detected |
| --- | --- |
| Date references older than 18 months | Regex for year patterns (e.g. “In 2022”, “As of last year”) |
| Statistics without a recent source | Sentences containing numbers that lack an inline citation link |
| Product/pricing references | Keywords from client’s own product descriptions cross-checked against current client_docs RAG content |
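
The year-pattern check in the first row could look roughly like the sketch below. It is an illustration, not the agent's actual implementation: the function name `findStaleYearReferences`, the `StaleDateMatch` shape, and the rule that a bare year counts as stale once its last day falls before the 18-month cutoff are all assumptions. Relative phrases such as “As of last year” would need a separate pattern and are not covered here.

```typescript
// Hypothetical result shape for one stale date reference.
interface StaleDateMatch {
  year: number;
  index: number; // character offset of the match in the post body
  snippet: string; // short surrounding text for the finding detail
}

// Flags four-digit years that can only refer to a time before the cutoff.
// ASSUMPTION: a bare year is stale when December 31 of that year is older
// than `maxAgeMonths` relative to `now`.
function findStaleYearReferences(
  body: string,
  now: Date = new Date(),
  maxAgeMonths = 18,
): StaleDateMatch[] {
  const cutoff = new Date(now);
  cutoff.setMonth(cutoff.getMonth() - maxAgeMonths);

  const yearPattern = /\b(19|20)\d{2}\b/g;
  const matches: StaleDateMatch[] = [];
  for (const m of body.matchAll(yearPattern)) {
    const year = Number(m[0]);
    const endOfYear = new Date(year, 11, 31);
    if (endOfYear < cutoff) {
      const index = m.index ?? 0;
      const snippet = body.slice(Math.max(0, index - 20), index + 24).trim();
      matches.push({ year, index, snippet });
    }
  }
  return matches;
}
```

A plain year regex also matches numbers that are not dates (for example “2000 users”), so matches are best treated as candidates for the model to confirm rather than as final findings.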

2. Thin Sections

Sections (content between consecutive H2s) with fewer than 200 words are flagged. The agent also flags any H2 section from the brief outline that contains less than 50% of its expected word allocation.
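
A minimal sketch of the 200-word check, assuming the post body is stored as Markdown with `## ` headings marking H2s (the storage format is not stated in this document). The names `findThinSections` and `SectionCount` are hypothetical, and the outline-allocation check is left out because the brief outline shape is not defined here.

```typescript
// Hypothetical per-section result.
interface SectionCount {
  heading: string;
  wordCount: number;
  thin: boolean;
}

// Splits a Markdown body on H2 headings and counts the words in each section.
// Content before the first H2 (the intro) is ignored in this sketch.
function findThinSections(markdown: string, minWords = 200): SectionCount[] {
  const sections: { heading: string; lines: string[] }[] = [];
  for (const line of markdown.split(/\r?\n/)) {
    const h2 = /^##\s+(.+?)\s*$/.exec(line);
    if (h2) {
      sections.push({ heading: h2[1], lines: [] });
    } else if (sections.length > 0) {
      sections[sections.length - 1].lines.push(line);
    }
  }
  return sections.map(({ heading, lines }) => {
    // Drop empty tokens and bare heading markers such as "###".
    const words = lines
      .join(" ")
      .split(/\s+/)
      .filter((w) => w.length > 0 && !/^#+$/.test(w));
    return { heading, wordCount: words.length, thin: words.length < minWords };
  });
}
```

H3 and deeper headings stay inside their parent H2 section, so their text counts toward that section's total.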

3. Keyword Cannibalisation

Cross-references all published blog posts in the published_content RAG dataset to detect when two posts compete for the same primary keyword. Each finding records:

  • The conflict: this post’s primary keyword appears as the primary or secondary keyword in another published post
  • A suggested resolution: consolidate, differentiate, or prune
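
Once candidate posts have been retrieved, the keyword comparison itself might be sketched as follows. The `PostKeywords` shape, the `findCannibalisation` function, and exact-match comparison after normalisation are assumptions for illustration; the retrieval step against published_content is not shown.

```typescript
// Hypothetical keyword metadata for a published post.
interface PostKeywords {
  id: string;
  title: string;
  primaryKeyword: string;
  secondaryKeywords: string[];
}

interface CannibalisationHit {
  postId: string;
  title: string;
  matchedAs: "primary" | "secondary";
}

// ASSUMPTION: keywords are compared case-insensitively with collapsed whitespace.
const normaliseKeyword = (k: string): string =>
  k.trim().toLowerCase().replace(/\s+/g, " ");

// Returns every other post that uses the audited post's primary keyword
// as its own primary or secondary keyword.
function findCannibalisation(
  audited: PostKeywords,
  others: PostKeywords[],
): CannibalisationHit[] {
  const target = normaliseKeyword(audited.primaryKeyword);
  const hits: CannibalisationHit[] = [];
  for (const post of others) {
    if (post.id === audited.id) continue;
    if (normaliseKeyword(post.primaryKeyword) === target) {
      hits.push({ postId: post.id, title: post.title, matchedAs: "primary" });
    } else if (post.secondaryKeywords.some((k) => normaliseKeyword(k) === target)) {
      hits.push({ postId: post.id, title: post.title, matchedAs: "secondary" });
    }
  }
  return hits;
}
```

Choosing between consolidate, differentiate, and prune is a judgment the agent makes per hit, so it is deliberately not encoded in this sketch.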

4. Missing Internal Links

Queries the published_content RAG dataset to find published posts that are topically related to the audited post but not linked from it. Flags specific internal link opportunities:

  • “Post X covers [related topic] — link from paragraph 3 of this post”
  • “Section Y of this post is referenced by Z other posts but does not link back”

5. AI Search Structure Gaps

Uses the same signals as the Content Optimizer’s AI Search Visibility Score, run in audit mode. It flags:

  • No direct answer in the intro
  • No definition block for the primary keyword
  • No FAQ section
  • No comparison table
  • Statistics without source citations

Input Contract

interface ContentAuditorInput {
  tenantId: string;
  blogPostId: string;
  auditScope: {
    freshness: boolean;
    thinSections: boolean;
    cannibalisation: boolean;
    internalLinks: boolean;
    aiSearchStructure: boolean;
  };
}

Output Contract

interface ContentAuditResult {
  tenantId: string;
  blogPostId: string;
  auditedAt: string; // ISO timestamp
  healthScore: number; // 0–100 composite
  findings: AuditFinding[];

  // Pre-filled refresh brief context, fed into blog-writer when "Refresh" is triggered
  refreshContext: {
    summary: string; // 2-3 sentence summary of what needs to change
    priorityFindings: string[]; // Top 3 findings in plain language for the brief
    suggestedUpdates: string; // Markdown list of specific suggested updates
  };
}

interface AuditFinding {
  category: 'freshness' | 'thin_section' | 'cannibalisation' | 'internal_links' | 'ai_search';
  severity: 'critical' | 'warning' | 'suggestion';
  title: string;
  detail: string;
  location?: string; // e.g. "Section: Benefits of X" or "Paragraph 4"
  action: string; // Specific recommended action
}

Health Score Breakdown

| Category | Weight |
| --- | --- |
| Freshness | 30% |
| Thin sections | 20% |
| Cannibalisation | 20% |
| Internal links | 15% |
| AI search structure | 15% |

Each category scores 0–100 based on the number and severity of findings within it. The composite health score is the weighted average.

Colour coding: red (0–49 = Needs Refresh), amber (50–74 = Review Recommended), green (75–100 = Healthy).
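
The weighting and colour bands above can be sketched as follows. The weights and band thresholds come from this document; the per-severity penalties in `PENALTY` are an assumption, since the document only says category scores depend on the number and severity of findings. The names `categoryScore`, `healthScore`, and `band` are hypothetical.

```typescript
type Category = "freshness" | "thin_section" | "cannibalisation" | "internal_links" | "ai_search";
type Severity = "critical" | "warning" | "suggestion";

interface ScoredFinding {
  category: Category;
  severity: Severity;
}

// Weights from the Health Score Breakdown table.
const WEIGHTS: Record<Category, number> = {
  freshness: 0.3,
  thin_section: 0.2,
  cannibalisation: 0.2,
  internal_links: 0.15,
  ai_search: 0.15,
};

// ASSUMPTION: points deducted per finding; not specified in the source.
const PENALTY: Record<Severity, number> = { critical: 40, warning: 15, suggestion: 5 };

// A category starts at 100 and loses points per finding, floored at 0.
function categoryScore(findings: ScoredFinding[], category: Category): number {
  const deducted = findings
    .filter((f) => f.category === category)
    .reduce((sum, f) => sum + PENALTY[f.severity], 0);
  return Math.max(0, 100 - deducted);
}

// Composite score: weighted average of the five category scores, rounded.
function healthScore(findings: ScoredFinding[]): number {
  const total = (Object.keys(WEIGHTS) as Category[]).reduce(
    (sum, c) => sum + WEIGHTS[c] * categoryScore(findings, c),
    0,
  );
  return Math.round(total);
}

// Colour bands as defined above: 0-49 red, 50-74 amber, 75-100 green.
function band(score: number): "red" | "amber" | "green" {
  if (score < 50) return "red";
  if (score < 75) return "amber";
  return "green";
}
```

Under these assumed penalties, a post with a single critical freshness finding scores 60 in that category, and 0.30 × 60 + 0.20 × 100 + 0.20 × 100 + 0.15 × 100 + 0.15 × 100 = 88 overall, which lands in the green band.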


“Refresh This Post” Workflow

  1. DM clicks "Refresh" on an audited blog post
  2. API: POST /tenant/v1/blog/:id/refresh
  3. Creates a new BlogActivity:
       - activityType: 'blog_post'
       - refreshSourceId: original BlogPost.id
       - contentBrief: { ...original brief + audit refreshContext injected }
  4. Activity appears in Activities list as "Blog Refresh: {original title}"
  5. blog-writer agent runs with the original brief + audit findings as context
       - Agent instructed to preserve and improve, not rewrite from scratch
       - Original post body passed as reference
  6. Normal review workflow: dm_review → client_review → published
  7. On publish: original BlogPost marked as superseded (status: 'superseded')
  8. New post takes the same slug (with optional redirect from old post URL)
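
The activity-creation step could be assembled roughly as in this sketch. The field names `activityType`, `refreshSourceId`, `contentBrief`, and `refreshContext` come from this document; the `OriginalPost` shape, the `buildRefreshActivity` function, the `title` field, and the `referenceBody` key used to carry the original post body are assumptions.

```typescript
interface RefreshContext {
  summary: string;
  priorityFindings: string[];
  suggestedUpdates: string;
}

// Hypothetical view of the original post; the real model is not shown here.
interface OriginalPost {
  id: string;
  title: string;
  body: string;
  brief: Record<string, unknown>;
}

// Builds the payload for a refresh activity: the original brief is kept
// and the audit's refreshContext is injected alongside it.
function buildRefreshActivity(post: OriginalPost, refreshContext: RefreshContext) {
  return {
    activityType: "blog_post" as const,
    refreshSourceId: post.id,
    title: `Blog Refresh: ${post.title}`,
    contentBrief: {
      ...post.brief,
      refreshContext,
      referenceBody: post.body, // ASSUMPTION: key name for the original body
    },
  };
}
```

Spreading the original brief first means the injected `refreshContext` overrides any stale context left over from an earlier refresh.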

Audit Dashboard

Dashboard — Content → Audit tab:

  • Table of all published blog posts with columns: Title · Published date · Last audit date · Health score · Findings count · Actions
  • “Run Audit” button per post (enqueues auditor job)
  • “Bulk Audit” button — enqueues audit jobs for all posts not audited in the last 30 days (limited to 10 at a time; credit check first)
  • Filters: All / Needs Refresh (red) / Review Recommended (amber) / Healthy (green)
  • Sort by health score ascending to surface the worst-performing posts first

Blog Post Detail — Audit Panel:

  • Last audit date + health score badge
  • Findings list grouped by category
  • “Refresh” button (visible when there is at least one critical finding or three or more warning findings)
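
The visibility rule for the “Refresh” button reduces to a small predicate. This is a sketch; `shouldShowRefresh` is a hypothetical name, and only the `severity` field of each finding is needed.

```typescript
type FindingSeverity = "critical" | "warning" | "suggestion";

// Shows the button when there is at least one critical finding
// or at least three warning findings.
function shouldShowRefresh(findings: { severity: FindingSeverity }[]): boolean {
  const critical = findings.filter((f) => f.severity === "critical").length;
  const warnings = findings.filter((f) => f.severity === "warning").length;
  return critical >= 1 || warnings >= 3;
}
```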

Scheduled Audits

Tenants can enable automatic monthly audits on all published posts via Settings → Content → Audit Schedule. When enabled:

  • On the 1st of each month, a cron job enqueues audit tasks for all published blog posts
  • Limited to 20 posts per run to cap credit consumption
  • Posts audited in the last 15 days are skipped
  • The total credit cost is deducted at the start of the run; if the balance is insufficient, the run stops and the DM is notified
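
The selection and credit rules for a scheduled run might be planned as in the sketch below. The 20-post cap, the 15-day skip window, and the 1-credit cost come from this document. The ordering (never-audited posts first, then the least recently audited) is an assumption, as are the names `planScheduledRun`, `PublishedPost`, and `RunPlan`. Deducting credits and notifying the DM are left to the caller.

```typescript
interface PublishedPost {
  id: string;
  auditedAt: string | null; // ISO timestamp of the last audit, or null if never audited
}

type RunPlan =
  | { kind: "run"; postIds: string[]; creditsToDeduct: number }
  | { kind: "insufficient_credits"; required: number; available: number };

const MAX_POSTS_PER_RUN = 20;
const SKIP_IF_AUDITED_WITHIN_DAYS = 15;
const CREDITS_PER_AUDIT = 1;
const DAY_MS = 24 * 60 * 60 * 1000;

// Never-audited posts sort first (timestamp 0).
const lastAuditMs = (p: PublishedPost): number =>
  p.auditedAt === null ? 0 : Date.parse(p.auditedAt);

function planScheduledRun(
  posts: PublishedPost[],
  creditBalance: number,
  now: Date = new Date(),
): RunPlan {
  const skipAfter = now.getTime() - SKIP_IF_AUDITED_WITHIN_DAYS * DAY_MS;
  const eligible = posts
    .filter((p) => p.auditedAt === null || Date.parse(p.auditedAt) < skipAfter)
    // ASSUMPTION: prioritise the posts that have gone longest without an audit.
    .sort((a, b) => lastAuditMs(a) - lastAuditMs(b))
    .slice(0, MAX_POSTS_PER_RUN);

  const required = eligible.length * CREDITS_PER_AUDIT;
  if (required > creditBalance) {
    return { kind: "insufficient_credits", required, available: creditBalance };
  }
  return { kind: "run", postIds: eligible.map((p) => p.id), creditsToDeduct: required };
}
```

Returning a plan instead of enqueuing jobs directly keeps the credit check and the queue side effects separate, which makes the rules easy to test.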

Key Design Decisions

| Decision | Choice | Rationale |
| --- | --- | --- |
| RAG for cannibalisation + link detection | Query published_content dataset | All published posts are already ingested into RAG at publish time; no separate index needed |
| refreshContext in audit output | Agent produces a pre-filled brief context block alongside findings | Eliminates friction between “audit found problems” and “brief is ready to fix them” |
| 1 cr per audit | Same cost as keyword research | Post-publication value; moderate inference usage with multiple RAG queries |
| Supersede on publish | Original post marked superseded when refresh is published | Preserves audit history and ensures the refresh does not overwrite the original until approved |

Implementation Phases

Phase 1 — Agent + Basic Audit

  1. Create docs/agents/content-auditor.md (agent doc)
  2. Add content-auditor to AgentRole type union
  3. Create packages/agents/src/workers/content-auditor.worker.ts
  4. Seed system prompt in packages/db/src/seed.ts
  5. Implement freshness + thin section checks (heuristic, no RAG needed)
  6. Add auditResult JSON field + auditedAt to BlogPost model (migration)
  7. POST /tenant/v1/blog/:id/audit route
  8. Blog post detail: audit panel with findings list

Phase 2 — RAG-Powered Checks

  1. Extend worker to query published_content RAG dataset for cannibalisation + internal link checks
  2. Audit dashboard tab (list view with health scores, filters)
  3. “Bulk Audit” action with credit pre-check

Phase 3 — Refresh Workflow

  1. Add refreshSourceId field to Activity model (migration)
  2. POST /tenant/v1/blog/:id/refresh route — creates refresh activity with audit context injected
  3. Extend blog-writer agent to accept refreshContext in input (update system prompt accordingly)
  4. Mark original post as superseded on refresh post publish

Phase 4 — Scheduled Audits

  1. Tenant settings: audit schedule toggle
  2. Monthly cron job in API scheduler
  3. Credit guard: check balance before bulk audit run

© 2026 Leadmetrics — Internal use only