
Client Researcher

[Live] · agent__client-researcher · Claude Sonnet 4.6

Researches the client’s website and produces structured Markdown research notes that seed the Context File Writer and the entire downstream agent pipeline.


Overview

| Attribute | Value |
| --- | --- |
| Function | Fetch up to 5 pages from the client's website and write structured research notes |
| Type | Setup (part of the client context pipeline) |
| Model | Claude Sonnet 4.6 |
| Queue | `agent__client-researcher` |
| Concurrency | 3 |
| Timeout | 15 min (`timeoutSec: 900`) |
| Max turns | 20 (hard cap to prevent runaway page fetching) |
| Est. cost / task | ~$0.12–0.15 |
| Plan | Free+ (all tenants) |

Triggers

| Trigger type | When | Who initiates |
| --- | --- | --- |
| Auto (setup chain) | On tenant onboarding completion; first agent in the setup chain | Platform (triggered by `completeOnboarding()`) |
| Human on-demand | "Refresh Context" revision flow in Dashboard → Client Context | Tenant admin |

Input

```ts
interface SetupJobData {
  tenantId: string;
  tenantName: string;
  country: string;
  plan: string;
  revisionNotes?: string; // present on revision runs; focuses the research
  wakeReason: WakeReason;
  enableWebCrawl: boolean;
  ragContext?: string;    // pre-fetched website_content RAG results (server-side pre-fetch)
}
```

The worker does a server-side RAG pre-fetch against the website_content dataset before spawning the agent, and injects results into the prompt via ragContext. This means the agent gets any already-crawled website content without needing to call search_knowledge.js itself.


Output

The agent writes structured Markdown (not JSON). Output is passed as clientResearchOutput to the Competitor Researcher and eventually the Context File Writer.

Structure: clearly labelled sections covering products/services, target audience, brand voice, content presence, geographic focus, social profiles, and any technical notes.
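
As a rough sketch, the notes might look like the skeleton below. The exact heading wording is not fixed anywhere in config — section names here are inferred from the list above, and the agent may phrase them differently:

```markdown
# Client Research Notes: <Client Name>

## Products / Services
...

## Target Audience
...

## Brand Voice
...

## Content Presence
...

## Geographic Focus
...

## Social Profiles
...

## Technical Notes
...
```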


How It Works

  1. Receive job — BullMQ dequeues the job; worker builds the prompt using buildClientResearcherPrompt() (exported from setup.worker.ts) with tenantName, country, plan, optional ragContext, and optional revisionNotes.

  2. Skills dir created — `createSkillsDir()` writes `search_knowledge.js` + `CLAUDE.md` + any DB-mapped skill files into a temp directory passed to the Claude Code CLI via `--add-dir`.

  3. Agentic loop — Claude Code CLI runs with --dangerously-skip-permissions and a cap of 20 turns. The agent:

    • Fetches at most 5 pages: homepage, services/products, about, contact, and optionally the blog index
    • Does not follow links to individual blog posts or case studies
    • Does not run web searches for news or press coverage — only uses what is visible on the fetched pages
    • Notes any awards, press mentions, or milestones visible in the fetched content
  4. Chain forward — On success, the worker loads any existing competitors from the Competitor table and enqueues a competitor-researcher job (or jumps to context-file-writer if enableCompetitorResearch = false).

  5. DB log — A ClientContextLog record is written with action: "research_completed".
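
Step 4 boils down to a small routing decision. A sketch under assumptions: the helper name is hypothetical, and the target queue names are guessed from the `agent__*` naming pattern used elsewhere in this doc:

```typescript
// Decide which job the setup chain enqueues next after research succeeds.
// Hypothetical helper; the real logic lives in setup.worker.ts.
function nextQueue(enableCompetitorResearch: boolean): string {
  return enableCompetitorResearch
    ? "agent__competitor-researcher" // load existing Competitor rows, then research them
    : "agent__context-file-writer";  // skip straight to context generation
}
```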


System Prompt (live in AgentConfig table)

```
You are the Client Researcher agent for Leadmetrics, a digital marketing agency AI platform. Your task is to research a client's business and produce detailed research notes that will be used to generate their marketing context file.

RESEARCH REQUIREMENTS:

1. Fetch at most 5 pages from the client's website: prioritise homepage, services/products, about, and contact. If a blog index is present, fetch it too — but do not follow individual blog post links.
2. Identify and document:
   - Core products or services offered
   - Unique selling propositions (USPs) and differentiators
   - Target audience signals (language used, testimonials, case studies)
   - Geographic focus (local, national, global)
   - Brand tone and voice
   - Current content types and publishing frequency
   - Any pricing tiers or packages visible
   - Social media presence (platforms linked from site)
   - Technical indicators (e-commerce, booking system, lead gen forms)
   - Any awards, press mentions, or notable milestones visible on the pages you fetch
3. Identify the primary industry vertical and sub-niche.
4. Note any obvious content gaps or marketing weaknesses observed on the site.

DO NOT:
- Fetch more than 5 pages total
- Follow links to individual blog posts or case studies
- Run web searches for news or press coverage — only use what is visible on the website

OUTPUT FORMAT:
Write structured Markdown with clearly labelled sections. Be factual — only include information you can verify from the website. Do not speculate or invent details.

Output ONLY the research notes. No preamble, no explanation.
```

The system prompt is stored in AgentConfig.systemPrompt (role client-researcher) and can be updated via Manage → Agents without a redeploy. The seed in packages/db/src/seed.ts mirrors this value.


Skills Injected

| Skill file | Purpose |
| --- | --- |
| `search_knowledge.js` | Query the tenant RAG knowledge base (all datasets) |
| `CLAUDE.md` | Instructions for when/how to call `search_knowledge` |

Tools Used (actual)

| Tool | Calls per run | Purpose |
| --- | --- | --- |
| WebFetch | max 5 | Fetch client website pages |
| WebSearch | 0 | Not used; press/news searches removed April 2026 |
| Bash (`search_knowledge.js`) | 0–1 | Optional RAG lookup if `ragContext` was insufficient |

Total tool calls target: 5–7. Hard cap via --max-turns 20.

Web searches for press coverage were removed in April 2026. For SMB clients these searches returned no useful results and added 3 wasted tool calls per run.


Cost Profile

| Metric | Value |
| --- | --- |
| Typical tool calls | 5–7 |
| Est. cost / task | ~$0.12–0.15 |
| Est. duration | ~1 min |

Before April 2026 tuning: “thoroughly… at minimum” language in the system prompt caused 10+ WebFetch calls (blog posts, sub-pages) + 3 WebSearch calls for press coverage, totalling ~15 tool calls, ~2 min, ~$0.29/run.


Adapter Config

Set in setup.worker.ts for the client-researcher role:

```ts
{
  cwd,
  model,                            // from AgentConfig.model
  dangerouslySkipPermissions: true,
  timeoutSec: 900,
  maxTurnsPerRun: 20,               // hard cap added April 2026
}
```
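
For illustration, this config maps onto the CLI flags mentioned elsewhere in this doc (`--add-dir`, `--max-turns`, `--dangerously-skip-permissions`). The builder function and the `skillsDir` field below are assumptions, not the actual adapter code:

```typescript
// Hypothetical translation of the adapter config into Claude Code CLI argv;
// the real adapter may assemble these differently.
interface AdapterConfig {
  model: string;
  skillsDir: string;                // temp dir from createSkillsDir()
  maxTurnsPerRun: number;
  dangerouslySkipPermissions: boolean;
}

function buildCliArgs(cfg: AdapterConfig): string[] {
  const args = [
    "--model", cfg.model,
    "--add-dir", cfg.skillsDir,
    "--max-turns", String(cfg.maxTurnsPerRun),
  ];
  if (cfg.dangerouslySkipPermissions) args.push("--dangerously-skip-permissions");
  return args;
}
```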

RAG Pre-fetch

The worker does a server-side RAG pre-fetch before spawning the agent (in setup.worker.ts, not in the agentic loop). It queries the website_content dataset and injects results into the prompt as ragContext. This means:

  • If the tenant’s website has already been crawled (via the Website Crawler), those results are used immediately
  • The agent gets structured website content without needing a live WebFetch for pages already indexed
  • Re-runs after a web crawl are faster and more consistent
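
The injection step can be pictured as follows. This is a hypothetical helper, not the actual setup.worker.ts code:

```typescript
// Attach pre-fetched website_content hits to the job payload as ragContext.
// If the pre-fetch returned nothing, the field stays unset and the agent
// simply proceeds without RAG context.
function withRagContext<T extends { ragContext?: string }>(job: T, hits: string[]): T {
  if (hits.length === 0) return job;
  return { ...job, ragContext: hits.join("\n\n") };
}
```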

Error Handling

| Error | Response |
| --- | --- |
| WebFetch returns 404 or timeout on all pages | Output is partial; Context File Writer still runs with whatever was found |
| WebFetch succeeds on homepage only | Agent continues with partial data; notes gaps in output |
| Max turns reached (20) | BullMQ job fails; setup chain retries per BullMQ config |
| RAG pre-fetch returns no results | Agent proceeds without RAG context |
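
Only one of these failure modes actually fails the job; the rest degrade gracefully. A sketch of that distinction (type and function names are illustrative, not from the codebase):

```typescript
type FailureMode =
  | "all-fetches-failed"  // 404/timeout on every page
  | "partial-fetch"       // e.g. homepage only
  | "max-turns"           // hit the 20-turn cap
  | "rag-empty";          // pre-fetch returned nothing

// Only the turn cap fails the BullMQ job (and is retried per queue config);
// partial or missing data still flows downstream to the Context File Writer.
function shouldFailJob(mode: FailureMode): boolean {
  return mode === "max-turns";
}
```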

Tenant Settings Used

| Setting | How it's used |
| --- | --- |
| `tenantName` | Used in job display names |
| `country` | Passed in prompt to frame geographic context |
| `plan` | Passed in prompt |
| `enableWebCrawl` | Passed in job data; controls whether the `website_content` RAG pre-fetch is attempted |

© 2026 Leadmetrics — Internal use only