Client Context Pipeline
[Live] — Runs on tenant registration and on-demand revision.
The Client Context File is the single source of truth injected into every agent in the system. It is a structured Markdown document describing the tenant’s business, audience, brand voice, competitors, and content opportunities. Every blog post, social post, ad, and strategy document is generated with this file in context.
Pipeline Overview
Three agents run sequentially on tenant registration:
client-researcher → competitor-researcher → context-file-writer

Each agent outputs free-form Markdown that is passed as-is to the next step via the BullMQ job payload. The final output (from context-file-writer) is saved to the ClientContext DB record and surfaced for tenant admin review before any downstream agents run.
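A minimal sketch of the sequential hand-off, assuming stubbed agent functions (the real agents run inside BullMQ workers; `runPipeline` and the `Agent` type are illustrative, only the payload field names come from this doc):

```typescript
// Sketch: each agent's free-form Markdown output is placed on the
// next job's payload. Agent internals are stubbed as plain functions.
type SetupJobData = {
  tenantName: string;
  clientResearchOutput?: string;
  competitorResearchOutput?: string;
};

type Agent = (payload: SetupJobData) => string; // returns Markdown

function runPipeline(
  clientResearcher: Agent,
  competitorResearcher: Agent,
  contextFileWriter: Agent,
  payload: SetupJobData,
): string {
  // Step 1: client research
  const clientResearchOutput = clientResearcher(payload);
  // Step 2: competitor research, seeded with step 1's output
  const competitorResearchOutput = competitorResearcher({ ...payload, clientResearchOutput });
  // Step 3: final synthesis receives both research outputs
  return contextFileWriter({ ...payload, clientResearchOutput, competitorResearchOutput });
}
```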
Agent Roles
1. Client Researcher (agent__client-researcher)
Crawls the tenant’s website and searches for external mentions to build a factual profile of the business.
Inputs (job payload):
| Field | Source |
|---|---|
| tenantName | Tenant record |
| emailDomain | Tenant record |
| country | Tenant record |
| plan | Tenant subscription |
| revisionNotes | User-supplied (revision runs only) |
What it does:
- Calls `search_knowledge.js "company overview products services" "website_content"` first — if the tenant has already been crawled via the Website Channel, uses that indexed content as the primary source rather than re-crawling live.
- Live-crawls the website: homepage, /about, /services or /products, /blog (index), /contact, /faq.
- Web-searches for press mentions, social profiles, and review listings.
- Synthesises everything into structured Markdown research notes covering: products/services, USPs, target audience, brand tone, geographic focus, content cadence, social presence, and technical indicators.
Output: Free-form Markdown research notes, passed to competitor-researcher as clientResearchOutput.
2. Competitor Researcher (agent__competitor-researcher)
Identifies and analyses the tenant’s competitive landscape based on the client research.
Inputs (job payload):
| Field | Source |
|---|---|
| tenantName | Tenant record |
| country | Tenant record |
| plan | Tenant subscription |
| clientResearchOutput | Output from client-researcher |
| knownCompetitors | Competitor table (active rows for this tenant) |
| revisionNotes | User-supplied (revision runs only) |
What it does:
- Identifies 4–6 direct competitors (same product/service category, overlapping geography).
- For each competitor: name, URL, services, target audience, marketing channels, content cadence, strengths/weaknesses, awards.
- Identifies 2–3 indirect/content competitors.
- Performs keyword gap analysis.
- Summarises competitive intensity (low / medium / high).
- Seeds from `knownCompetitors` — competitors already tracked in the DB are always included and expanded on.
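The seeding behaviour amounts to a merge that keeps every DB-tracked competitor and deduplicates newly discovered ones against them. A sketch under the assumption that competitors are keyed by name (the function name is illustrative):

```typescript
// Merges DB-tracked competitors with newly discovered ones,
// deduplicating by name (case-insensitive). Known competitors
// are always kept and listed first.
function seedCompetitors(known: string[], discovered: string[]): string[] {
  const seen = new Set(known.map(k => k.toLowerCase()));
  const merged = [...known];
  for (const c of discovered) {
    if (!seen.has(c.toLowerCase())) {
      seen.add(c.toLowerCase());
      merged.push(c);
    }
  }
  return merged;
}
```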
Output: Free-form Markdown competitor analysis, passed to context-file-writer as competitorResearchOutput.
3. Context File Writer (agent__context-file-writer)
Synthesises both research outputs into the canonical Client Context File.
Inputs (job payload):
| Field | Source |
|---|---|
| tenantName | Tenant record |
| country | Tenant record |
| plan | Tenant subscription |
| clientResearchOutput | Output from client-researcher |
| competitorResearchOutput | Output from competitor-researcher |
| revisionNotes | User-supplied (revision runs only) |
What it does:
- Calls `search_knowledge.js` against `client_docs` for brand guidelines, tone-of-voice docs, and uploaded brand assets. These override synthesised content where they conflict.
- Synthesises both research inputs into the 11-section Client Context File (see Output Structure below).
- Writes the output to `context.md` in its working directory.
- The worker reads `context.md` after execution and appends:
  - Brand Voice section — from the `BrandVoice` DB record if `status = "active"`
  - Key Pages section — published/approved blog posts and landing pages for internal linking
Output: Saved to ClientContext.content. Versioned in ClientContextVersion.
Output Structure
The context file is a Markdown document with exactly these sections (in order):
# [Company Name] — Context File
## Business Overview
Narrative description of the company — industry, founding story, size, model.
## Products & Services
| Product/Service | Description | Target User | Key Benefit |
## Target Audience
| Segment | Demographics | Pain Points | Goals |
## Brand Voice & Tone
Bullet list: tone adjectives, personality traits, vocabulary to use, vocabulary to avoid, writing style.
## Unique Selling Propositions
Bullet list of 3–6 distinct USPs.
## Current Marketing Presence
| Channel | Status | Audience Size / Metrics | Notes |
## Geographic Focus
Primary / Secondary / Not served regions.
## Competitive Landscape
| Competitor | Strengths | Weaknesses | Key Differentiator |
## Keyword Opportunities
| Keyword | Intent | Estimated Volume | Difficulty |
## Content Gaps & Opportunities
| Gap / Opportunity | Recommended Format | Priority |
## Technical Notes
Bullet list: CMS, e-commerce platform, booking system, tech indicators.
---
## Brand Voice
(Appended by worker from BrandVoice DB record if active)
## Key Pages (Internal Linking Reference)
(Appended by worker from published BlogPost + LandingPage records)

Tables are mandatory for: Products & Services, Target Audience, Current Marketing Presence, Competitive Landscape, Keyword Opportunities, Content Gaps & Opportunities. Bullet lists are used only for unstructured content.
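The mandatory-table rule can be checked mechanically. A sketch that flags any required section whose body contains no table row (the function name is illustrative; the section list mirrors the spec above):

```typescript
// Sections that must contain a Markdown table per the spec above.
const TABLE_SECTIONS = [
  "Products & Services",
  "Target Audience",
  "Current Marketing Presence",
  "Competitive Landscape",
  "Keyword Opportunities",
  "Content Gaps & Opportunities",
];

// Returns the mandatory-table sections that are absent or have no table row.
function missingTables(markdown: string): string[] {
  const missing: string[] = [];
  for (const section of TABLE_SECTIONS) {
    const start = markdown.indexOf(`## ${section}`);
    if (start === -1) { missing.push(section); continue; }
    // Body runs from the end of this heading to the next "## " heading (or EOF).
    const rest = markdown.slice(start + section.length + 3);
    const end = rest.indexOf("\n## ");
    const body = end === -1 ? rest : rest.slice(0, end);
    if (!/^\|.*\|/m.test(body)) missing.push(section);
  }
  return missing;
}
```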
Database Models
ClientContext
One row per tenant. The canonical context file.
| Field | Type | Notes |
|---|---|---|
| tenantId | String | @unique — one context per tenant |
| status | String | pending → generating → completed → approved |
| content | Text | Full Markdown document |
| generatedAt | DateTime | Set on each generation |
| version | Int | Increments on every regeneration |
ClientContextVersion
Snapshot per generation. Used for history browsing at /context/versions.
| Field | Type | Notes |
|---|---|---|
| contextId | String | FK to ClientContext |
| version | Int | Version number at time of snapshot |
| content | Text | Full Markdown at this version |
| createdBy | String | "AI Agent" or user name |
ClientContextLog
Audit trail for the context timeline UI.
| Field | Type | Notes |
|---|---|---|
| contextId | String | FK to ClientContext |
| action | String | generated, regenerated, approved, revision_requested, manual_edit, research_completed |
| detail | Text | Human-readable description |
| performedBy | String | Agent name or user name |
Status Flow
pending
└─ generating (set when client-researcher starts)
└─ completed (set when context-file-writer saves output)
├─ approved (admin approves → triggers Strategy Writer)
└─ generating (admin requests revision → re-runs full chain)

Manual edits by the admin keep status = completed and increment version but do not re-trigger AI generation.
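The flow above can be encoded as a transition table so invalid status changes are rejected before a DB write. A sketch (the `approved → generating` edge is an assumption covering the manage-portal retrigger, which the flow diagram does not show explicitly):

```typescript
type ContextStatus = "pending" | "generating" | "completed" | "approved";

// Legal transitions, mirroring the status flow above.
const TRANSITIONS: Record<ContextStatus, ContextStatus[]> = {
  pending: ["generating"],               // client-researcher starts
  generating: ["completed"],             // context-file-writer saves output
  completed: ["approved", "generating"], // admin approves, or revision re-runs the chain
  approved: ["generating"],              // assumption: admin retrigger after approval
};

function canTransition(from: ContextStatus, to: ContextStatus): boolean {
  return TRANSITIONS[from].includes(to);
}
```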
Triggers
| Trigger | What runs | Job name |
|---|---|---|
| Tenant completes onboarding wizard | Full chain from client-researcher | setup |
| Admin hits “Retrigger” in manage portal | Full chain from client-researcher | setup with timestamp jobId suffix |
| Tenant admin requests revision with notes | Full chain from client-researcher | revision — notes passed through all 3 agents |
| Brand voice activated | context-file-writer only (skips research) | revision with revisionNotes describing the brand voice update |
Revision runs use timestamp-suffixed jobIds (revision__${tenantId}__${role}__${Date.now()}) to avoid BullMQ dedup blocking re-enqueues.
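The jobId template is documented above; wrapping it in a helper (the function name is illustrative) makes the dedup-avoidance explicit — two revision requests for the same tenant and role milliseconds apart still get distinct jobIds:

```typescript
// Revision jobIds get a timestamp suffix so BullMQ's jobId-based
// deduplication does not block re-enqueueing the same tenant/role pair.
function revisionJobId(tenantId: string, role: string, now: number = Date.now()): string {
  return `revision__${tenantId}__${role}__${now}`;
}
```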
Approval Workflow (HITL)
After context-file-writer completes, status = completed and the tenant admin is notified by email and in-app. They review the full Markdown document at /context.
| Action | What happens |
|---|---|
| Approve | status → approved; Strategy Writer enqueued |
| Request Revision | status → generating; full chain re-runs with revision notes |
| Manual Edit | Content updated directly; version incremented; no AI re-run |
No downstream agents (strategy, blog, social) run until the context file is approved.
Downstream Effects
On context-file-writer completion:
- `ai-visibility-seeder` enqueued — extracts competitor data from the context file into the `Competitor` table
- `opportunity-matcher` enqueued — matches tenant to backlink directory opportunities
- Email notification sent to tenant admin + `content_review` preference recipients
- In-app notification published
Credits
Context file generation consumes context_file credits, reserved at job start and consumed on successful save. Credits are released (not consumed) if the job fails.
Skills Injected
| File | Agent | Purpose |
|---|---|---|
| context_file_structure.md | context-file-writer | Schema reference: exact section names, table column specs, and formatting rules the output must follow |
| search_knowledge.js | all 3 agents | Node script to query the tenant RAG knowledge base |
| CLAUDE.md | all 3 agents | Instructions for when and how to call search_knowledge.js |
Per-Tenant Configuration
Each tenant can have its context generation inputs toggled on or off independently. This is stored in Tenant.contextConfig (a Json? field, schema: TenantContextConfig) and controlled from the Context Settings tab in the manage portal tenant detail page.
| Input | Field | Default | When to disable |
|---|---|---|---|
| Live website crawl | enableWebCrawl | true | Pre-launch tenants, no public site, or JS-heavy sites that block crawlers |
| Competitor research | enableCompetitorResearch | true | Niche markets with no trackable competitors, or client request |
| Brand voice section | enableBrandVoiceSection | true | Tenant has not yet completed brand voice setup |
| Key pages section | enableKeyPagesSection | true | Early-stage clients with no published content |
How it works in the pipeline:
- `contextConfig` is loaded from the DB at job start (parallel with `agentConfig`), parsed via `resolveContextConfig()`, which fills in `true` defaults for any missing keys.
- `enableWebCrawl = false` → the client-researcher prompt is modified to say "Do NOT crawl the website; use only the knowledge base."
- `enableCompetitorResearch = false` → competitor-researcher is skipped entirely; a placeholder string is passed as `competitorResearchOutput` directly to context-file-writer.
- `enableBrandVoiceSection = false` → the brand voice block is not appended after generation completes.
- `enableKeyPagesSection = false` → the key pages block is not appended after generation completes.
All defaults are true — a tenant with no contextConfig row behaves identically to a tenant with all flags explicitly set to true.
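A plausible shape for `resolveContextConfig()` — the function name and flag names come from this doc, the implementation is a sketch:

```typescript
type TenantContextConfig = {
  enableWebCrawl: boolean;
  enableCompetitorResearch: boolean;
  enableBrandVoiceSection: boolean;
  enableKeyPagesSection: boolean;
};

const DEFAULTS: TenantContextConfig = {
  enableWebCrawl: true,
  enableCompetitorResearch: true,
  enableBrandVoiceSection: true,
  enableKeyPagesSection: true,
};

// Fills in `true` for any missing keys; a null/undefined config
// (tenant never touched the settings) resolves to all-true.
function resolveContextConfig(raw: Partial<TenantContextConfig> | null | undefined): TenantContextConfig {
  return { ...DEFAULTS, ...(raw ?? {}) };
}
```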
Known Issues & Improvements (April 2026)
The following bugs were identified and fixed in April 2026:
1. revisionNotes silently dropped
Problem: SetupJobData has a revisionNotes field populated when a user requests a revision, but none of the three agent prompts included it. The agents had no idea what the user wanted changed.
Fix: revisionNotes is now injected into all three prompts under a REVISION INSTRUCTIONS: block when present. All three agents receive the revision focus so research and synthesis can be directed accordingly.
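The injection itself is a small prompt transform. A sketch (the function name is illustrative; the `REVISION INSTRUCTIONS:` block label comes from the fix described above):

```typescript
// Appends the user's revision notes to an agent prompt under a
// REVISION INSTRUCTIONS block; a no-op when notes are absent or blank.
function withRevisionNotes(prompt: string, revisionNotes?: string): string {
  if (!revisionNotes?.trim()) return prompt;
  return `${prompt}\n\nREVISION INSTRUCTIONS:\n${revisionNotes.trim()}`;
}
```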
2. plan not passed to client-researcher or competitor-researcher
Problem: plan was in SetupJobData and passed to context-file-writer, but not to the first two agents. Both agents were generating generic research with no awareness of the tenant’s plan tier.
Fix: plan is now included in all three agent prompts. This allows client-researcher to calibrate research depth (e.g. Agency-tier needs channel-by-channel breakdown; Free-tier needs basic profile) and competitor-researcher to prioritise accordingly.
3. Skill file sections didn’t match system prompt
Problem: The context_file_structure.md skill injected into context-file-writer defined a completely different section schema from the system prompt:
| Skill file (wrong) | System prompt (correct) |
|---|---|
| Brand Identity | Brand Voice & Tone |
| Content & Marketing Context | Current Marketing Presence |
| Goals & KPIs | (not a required section) |
| Additional Notes | Technical Notes |
| (missing) | Unique Selling Propositions |
| (missing) | Geographic Focus |
| (missing) | Keyword Opportunities |
| (missing) | Content Gaps & Opportunities |
The agent received contradictory instructions — skill said one schema, prompt said another.
Fix: context_file_structure.md updated to exactly match the 11 sections defined in the system prompt, with correct table column specs per section.
4. Website crawler data not used by client-researcher
Problem: The WebPage model stores crawled page content indexed into the website_content RAG dataset. But client-researcher was always re-crawling live, ignoring indexed data. This was slower, nondeterministic, and wasted tool calls.
Fix: The client-researcher prompt now instructs the agent to call search_knowledge.js against website_content before live crawling. If indexed content exists, it is used as the primary source; live crawling supplements gaps only.
5. Output not validated before saving
Problem: context-file-writer writes its output to context.md in its working directory. The worker read this file but had no check that it contained valid content (required section headings, minimum length). A malformed or stub output could be silently saved to the DB and credited.
Fix: The worker now validates that the output contains at least 3 ## headings and is at least 500 characters before saving. If the file is too short or missing headings, a warning is logged and surfaced in the HITL review UI.
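The check described above is small enough to sketch directly (the function name and return shape are illustrative; the thresholds — three `##` headings, 500 characters — come from the fix):

```typescript
// Validates context-file-writer output before saving: at least three
// "## " section headings and at least 500 characters.
function isValidContextOutput(markdown: string): { ok: boolean; reason?: string } {
  const headingCount = (markdown.match(/^## /gm) ?? []).length;
  if (markdown.length < 500) return { ok: false, reason: "output under 500 characters" };
  if (headingCount < 3) return { ok: false, reason: "fewer than 3 section headings" };
  return { ok: true };
}
```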