Client Context Pipeline
[Live] — Runs on tenant registration and on-demand revision.
The Client Context File is the single source of truth injected into every agent in the system. It is a structured Markdown document describing the tenant’s business, audience, brand voice, competitors, and content opportunities. Every blog post, social post, ad, and strategy document is generated with this file in context.
Pipeline Overview
Three agents run sequentially on tenant registration:
client-researcher → competitor-researcher → context-file-writer

Each agent outputs free-form Markdown that is passed as-is to the next step via the BullMQ job payload. The final output (from context-file-writer) is saved to the ClientContext DB record and surfaced for tenant admin review before any downstream agents run.
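A minimal sketch of the sequential hand-off, assuming stubbed agent functions (the real agents run inside BullMQ workers; `runPipeline` and the `Agent` type are illustrative, only the payload field names come from this doc):

```typescript
// Sketch: each agent's free-form Markdown output is placed on the
// next job's payload. Agent internals are stubbed as plain functions.
type SetupJobData = {
  tenantName: string;
  clientResearchOutput?: string;
  competitorResearchOutput?: string;
};

type Agent = (payload: SetupJobData) => string; // returns Markdown

function runPipeline(
  clientResearcher: Agent,
  competitorResearcher: Agent,
  contextFileWriter: Agent,
  payload: SetupJobData,
): string {
  // Step 1: client research
  const clientResearchOutput = clientResearcher(payload);
  // Step 2: competitor research, seeded with step 1's output
  const competitorResearchOutput = competitorResearcher({ ...payload, clientResearchOutput });
  // Step 3: final synthesis receives both research outputs
  return contextFileWriter({ ...payload, clientResearchOutput, competitorResearchOutput });
}
```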
Agent Roles
1. Client Researcher (agent__client-researcher)
Crawls the tenant’s website and searches for external mentions to build a factual profile of the business.
Inputs (job payload):
| Field | Source |
|---|---|
| tenantName | Tenant record |
| emailDomain | Tenant record |
| country | Tenant record |
| plan | Tenant subscription |
| revisionNotes | User-supplied (revision runs only) |
What it does:
- Calls `search_knowledge.js "company overview products services" "website_content"` first — if the tenant has already been crawled via the Website Channel, uses that indexed content as the primary source rather than re-crawling live.
- Live-crawls the website: homepage, /about, /services or /products, /blog (index), /contact, /faq.
- Web-searches for press mentions, social profiles, and review listings.
- Synthesises everything into structured Markdown research notes covering: products/services, USPs, target audience, brand tone, geographic focus, content cadence, social presence, and technical indicators.
Output: Free-form Markdown research notes, passed to competitor-researcher as clientResearchOutput.
2. Competitor Researcher (agent__competitor-researcher)
Identifies and analyses the tenant’s competitive landscape based on the client research.
Inputs (job payload):
| Field | Source |
|---|---|
| tenantName | Tenant record |
| country | Tenant record |
| plan | Tenant subscription |
| clientResearchOutput | Output from client-researcher |
| knownCompetitors | Competitor table (active rows for this tenant) |
| revisionNotes | User-supplied (revision runs only) |
What it does:
- Identifies 4–6 direct competitors (same product/service category, overlapping geography).
- For each competitor: name, URL, services, target audience, marketing channels, content cadence, strengths/weaknesses, awards.
- Identifies 2–3 indirect/content competitors.
- Performs keyword gap analysis.
- Summarises competitive intensity (low / medium / high).
- Seeds from `knownCompetitors` — competitors already tracked in the DB are always included and expanded on.
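The seeding behaviour amounts to a merge that keeps every DB-tracked competitor and deduplicates newly discovered ones against them. A sketch under the assumption that competitors are keyed by name (the function name is illustrative):

```typescript
// Merges DB-tracked competitors with newly discovered ones,
// deduplicating by name (case-insensitive). Known competitors
// are always kept and listed first.
function seedCompetitors(known: string[], discovered: string[]): string[] {
  const seen = new Set(known.map(k => k.toLowerCase()));
  const merged = [...known];
  for (const c of discovered) {
    if (!seen.has(c.toLowerCase())) {
      seen.add(c.toLowerCase());
      merged.push(c);
    }
  }
  return merged;
}
```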
Output: Free-form Markdown competitor analysis, passed to context-file-writer as competitorResearchOutput.
3. Context File Writer (agent__context-file-writer)
Synthesises both research outputs into the canonical Client Context File.
Inputs (job payload):
| Field | Source |
|---|---|
| tenantName | Tenant record |
| country | Tenant record |
| plan | Tenant subscription |
| clientResearchOutput | Output from client-researcher |
| competitorResearchOutput | Output from competitor-researcher |
| revisionNotes | User-supplied (revision runs only) |
What it does:
- Calls `search_knowledge.js` against `client_docs` for brand guidelines, tone-of-voice docs, and uploaded brand assets. These override synthesised content where they conflict.
- Synthesises both research inputs into the 11-section Client Context File (see Output Structure below).
- Writes the output to `context.md` in its working directory.
- The worker reads `context.md` after execution and appends:
  - Brand Voice section — from the `BrandVoice` DB record if `status = "active"`
  - Key Pages section — published/approved blog posts and landing pages for internal linking
Output: Saved to ClientContext.content. Versioned in ClientContextVersion.
Output Structure
The context file is a Markdown document with exactly these sections (in order):
# [Company Name] — Context File
## Business Overview
Narrative description of the company — industry, founding story, size, model.
## Products & Services
| Product/Service | Description | Target User | Key Benefit |
## Target Audience
| Segment | Demographics | Pain Points | Goals |
## Brand Voice & Tone
Bullet list: tone adjectives, personality traits, vocabulary to use, vocabulary to avoid, writing style.
## Unique Selling Propositions
Bullet list of 3–6 distinct USPs.
## Current Marketing Presence
| Channel | Status | Audience Size / Metrics | Notes |
## Geographic Focus
Primary / Secondary / Not served regions.
## Competitive Landscape
| Competitor | Strengths | Weaknesses | Key Differentiator |
## Keyword Opportunities
| Keyword | Intent | Estimated Volume | Difficulty |
## Content Gaps & Opportunities
| Gap / Opportunity | Recommended Format | Priority |
## Technical Notes
Bullet list: CMS, e-commerce platform, booking system, tech indicators.
---
## Brand Voice
(Appended by worker from BrandVoice DB record if active)
## Key Pages (Internal Linking Reference)
(Appended by worker from published BlogPost + LandingPage records)

Tables are mandatory for: Products & Services, Target Audience, Current Marketing Presence, Competitive Landscape, Keyword Opportunities, Content Gaps & Opportunities. Bullet lists are used only for unstructured content.
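The mandatory-table rule can be checked mechanically. A sketch that flags any required section whose body contains no table row (the function name is illustrative; the section list mirrors the spec above):

```typescript
// Sections that must contain a Markdown table per the spec above.
const TABLE_SECTIONS = [
  "Products & Services",
  "Target Audience",
  "Current Marketing Presence",
  "Competitive Landscape",
  "Keyword Opportunities",
  "Content Gaps & Opportunities",
];

// Returns the mandatory-table sections that are absent or have no table row.
function missingTables(markdown: string): string[] {
  const missing: string[] = [];
  for (const section of TABLE_SECTIONS) {
    const start = markdown.indexOf(`## ${section}`);
    if (start === -1) { missing.push(section); continue; }
    // Body runs from the end of this heading to the next "## " heading (or EOF).
    const rest = markdown.slice(start + section.length + 3);
    const end = rest.indexOf("\n## ");
    const body = end === -1 ? rest : rest.slice(0, end);
    if (!/^\|.*\|/m.test(body)) missing.push(section);
  }
  return missing;
}
```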
Database Models
ClientContext
One row per tenant. The canonical context file.
| Field | Type | Notes |
|---|---|---|
| tenantId | String | @unique — one context per tenant |
| status | String | pending → generating → completed → approved |
| content | Text | Full Markdown document |
| generatedAt | DateTime | Set on each generation |
| version | Int | Increments on every regeneration |
ClientContextVersion
Snapshot per generation. Used for history browsing at /context/versions.
| Field | Type | Notes |
|---|---|---|
| contextId | String | FK to ClientContext |
| version | Int | Version number at time of snapshot |
| content | Text | Full Markdown at this version |
| createdBy | String | "AI Agent" or user name |
ClientContextLog
Audit trail for the context timeline UI.
| Field | Type | Notes |
|---|---|---|
| contextId | String | FK to ClientContext |
| action | String | generated, regenerated, approved, revision_requested, manual_edit, research_completed |
| detail | Text | Human-readable description |
| performedBy | String | Agent name or user name |
Status Flow
pending
└─ generating (set when client-researcher starts)
└─ completed (set when context-file-writer saves output)
├─ approved (admin approves → triggers Strategy Writer)
└─ generating (admin requests revision → re-runs full chain)

Manual edits by the admin keep status = completed and increment version but do not re-trigger AI generation.
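The flow above can be encoded as a transition table so invalid status changes are rejected before a DB write. A sketch (the `approved → generating` edge is an assumption covering the manage-portal retrigger, which the flow diagram does not show explicitly):

```typescript
type ContextStatus = "pending" | "generating" | "completed" | "approved";

// Legal transitions, mirroring the status flow above.
const TRANSITIONS: Record<ContextStatus, ContextStatus[]> = {
  pending: ["generating"],               // client-researcher starts
  generating: ["completed"],             // context-file-writer saves output
  completed: ["approved", "generating"], // admin approves, or revision re-runs the chain
  approved: ["generating"],              // assumption: admin retrigger after approval
};

function canTransition(from: ContextStatus, to: ContextStatus): boolean {
  return TRANSITIONS[from].includes(to);
}
```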
Triggers
| Trigger | What runs | Job name |
|---|---|---|
| Tenant completes onboarding wizard | Full chain from client-researcher | setup |
| Admin hits “Retrigger” in manage portal | Full chain from client-researcher | setup with timestamp jobId suffix |
| Tenant admin requests revision with notes | Full chain from client-researcher | revision — notes passed through all 3 agents |
| Brand voice activated | context-file-writer only (skips research) | revision with revisionNotes describing the brand voice update |
Revision runs use timestamp-suffixed jobIds (revision__${tenantId}__${role}__${Date.now()}) to avoid BullMQ dedup blocking re-enqueues.
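The jobId template is documented above; wrapping it in a helper (the function name is illustrative) makes the dedup-avoidance explicit — two revision requests for the same tenant and role milliseconds apart still get distinct jobIds:

```typescript
// Revision jobIds get a timestamp suffix so BullMQ's jobId-based
// deduplication does not block re-enqueueing the same tenant/role pair.
function revisionJobId(tenantId: string, role: string, now: number = Date.now()): string {
  return `revision__${tenantId}__${role}__${now}`;
}
```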
Approval Workflow (HITL)
After context-file-writer completes, status = completed and the tenant admin is notified by email and in-app. They review the full Markdown document at /context.
| Action | What happens |
|---|---|
| Approve | status → approved; Strategy Writer enqueued |
| Request Revision | status → generating; full chain re-runs with revision notes |
| Manual Edit | Content updated directly; version incremented; no AI re-run |
No downstream agents (strategy, blog, social) run until the context file is approved.
Downstream Effects
On context-file-writer completion:
- `ai-visibility-seeder` enqueued — extracts competitor data from the context file into the `Competitor` table
- `opportunity-matcher` enqueued — matches tenant to backlink directory opportunities
- Email notification sent to tenant admin + `content_review` preference recipients
- In-app notification published
Credits
Context file generation consumes context_file credits, reserved at job start and consumed on successful save. Credits are released (not consumed) if the job fails.
Skills Injected
| File | Agent | Purpose |
|---|---|---|
| context_file_structure.md | context-file-writer | Schema reference: exact section names, table column specs, and formatting rules the output must follow |
| search_knowledge.js | all 3 agents | Node script to query the tenant RAG knowledge base |
| CLAUDE.md | all 3 agents | Instructions for when and how to call search_knowledge.js |
Per-Tenant Configuration
Each tenant can have its context generation inputs toggled on or off independently. This is stored in Tenant.contextConfig (a Json? field, schema: TenantContextConfig) and controlled from the Context Settings tab in the manage portal tenant detail page.
| Input | Field | Default | When to disable |
|---|---|---|---|
| Live website crawl | enableWebCrawl | true | Pre-launch tenants, no public site, or JS-heavy sites that block crawlers |
| Competitor research | enableCompetitorResearch | true | Niche markets with no trackable competitors, or client request |
| Brand voice section | enableBrandVoiceSection | true | Tenant has not yet completed brand voice setup |
| Key pages section | enableKeyPagesSection | true | Early-stage clients with no published content |
How it works in the pipeline:
- `contextConfig` is loaded from the DB at job start (parallel with `agentConfig`), parsed via `resolveContextConfig()`, which fills in `true` defaults for any missing keys.
- `enableWebCrawl = false` → the client-researcher prompt is modified to say "Do NOT crawl the website; use only the knowledge base."
- `enableCompetitorResearch = false` → competitor-researcher is skipped entirely; a placeholder string is passed as `competitorResearchOutput` directly to context-file-writer.
- `enableBrandVoiceSection = false` → the brand voice block is not appended after generation completes.
- `enableKeyPagesSection = false` → the key pages block is not appended after generation completes.
All defaults are true — a tenant with no contextConfig row behaves identically to a tenant with all flags explicitly set to true.
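A plausible shape for `resolveContextConfig()` — the function name and flag names come from this doc, the implementation is a sketch:

```typescript
type TenantContextConfig = {
  enableWebCrawl: boolean;
  enableCompetitorResearch: boolean;
  enableBrandVoiceSection: boolean;
  enableKeyPagesSection: boolean;
};

const DEFAULTS: TenantContextConfig = {
  enableWebCrawl: true,
  enableCompetitorResearch: true,
  enableBrandVoiceSection: true,
  enableKeyPagesSection: true,
};

// Fills in `true` for any missing keys; a null/undefined config
// (tenant never touched the settings) resolves to all-true.
function resolveContextConfig(raw: Partial<TenantContextConfig> | null | undefined): TenantContextConfig {
  return { ...DEFAULTS, ...(raw ?? {}) };
}
```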
Known Issues & Improvements (April 2026)
The following bugs were identified and fixed in April 2026:
1. revisionNotes silently dropped
Problem: SetupJobData has a revisionNotes field populated when a user requests a revision, but none of the three agent prompts included it. The agents had no idea what the user wanted changed.
Fix: revisionNotes is now injected into all three prompts under a REVISION INSTRUCTIONS: block when present. All three agents receive the revision focus so research and synthesis can be directed accordingly.
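The injection itself is a small prompt transform. A sketch (the function name is illustrative; the `REVISION INSTRUCTIONS:` block label comes from the fix described above):

```typescript
// Appends the user's revision notes to an agent prompt under a
// REVISION INSTRUCTIONS block; a no-op when notes are absent or blank.
function withRevisionNotes(prompt: string, revisionNotes?: string): string {
  if (!revisionNotes?.trim()) return prompt;
  return `${prompt}\n\nREVISION INSTRUCTIONS:\n${revisionNotes.trim()}`;
}
```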
2. plan not passed to client-researcher or competitor-researcher
Problem: plan was in SetupJobData and passed to context-file-writer, but not to the first two agents. Both agents were generating generic research with no awareness of the tenant’s plan tier.
Fix: plan is now included in all three agent prompts. This allows client-researcher to calibrate research depth (e.g. Agency-tier needs channel-by-channel breakdown; Free-tier needs basic profile) and competitor-researcher to prioritise accordingly.
3. Skill file sections didn’t match system prompt
Problem: The context_file_structure.md skill injected into context-file-writer defined a completely different section schema from the system prompt:
| Skill file (wrong) | System prompt (correct) |
|---|---|
| Brand Identity | Brand Voice & Tone |
| Content & Marketing Context | Current Marketing Presence |
| Goals & KPIs | (not a required section) |
| Additional Notes | Technical Notes |
| (missing) | Unique Selling Propositions |
| (missing) | Geographic Focus |
| (missing) | Keyword Opportunities |
| (missing) | Content Gaps & Opportunities |
The agent received contradictory instructions — skill said one schema, prompt said another.
Fix: context_file_structure.md updated to exactly match the 11 sections defined in the system prompt, with correct table column specs per section.
4. Website crawler data not used by client-researcher
Problem: The WebPage model stores crawled page content indexed into the website_content RAG dataset. But client-researcher was always re-crawling live, ignoring indexed data. This was slower, nondeterministic, and wasted tool calls.
Fix: The client-researcher prompt now instructs the agent to call search_knowledge.js against website_content before live crawling. If indexed content exists, it is used as the primary source; live crawling supplements gaps only.
5. Output not validated before saving
Problem: context-file-writer writes its output to context.md in its working directory. The worker read this file but had no check that it contained valid content (required section headings, minimum length). A malformed or stub output could be silently saved to the DB and credited.
Fix: The worker now validates that the output contains at least 3 ## headings and is at least 500 characters before saving. If the file is too short or missing headings, a warning is logged and surfaced in the HITL review UI.
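The check described above is small enough to sketch directly (the function name and return shape are illustrative; the thresholds — three `##` headings, 500 characters — come from the fix):

```typescript
// Validates context-file-writer output before saving: at least three
// "## " section headings and at least 500 characters.
function isValidContextOutput(markdown: string): { ok: boolean; reason?: string } {
  const headingCount = (markdown.match(/^## /gm) ?? []).length;
  if (markdown.length < 500) return { ok: false, reason: "output under 500 characters" };
  if (headingCount < 3) return { ok: false, reason: "fewer than 3 section headings" };
  return { ok: true };
}
```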