
Deliverable Planner: Slow Runtime (7-15 min) — Needs Optimization

Status: Open
Severity: Medium — job completes but takes 7-15 min; DNS failures can push it past the 900s timeout
File: packages/agents/src/workers/strategy.worker.ts
Observed: 2026-05-06 00:17 (completed in ~7.5 min on retry, after a DNS failure timed out the first attempt)

Symptom

ERROR [strategy-worker] Deliverable planner job failed message: "Process timed out after 900s"

On the DNS failure run: started 23:55, timed out 00:10 (900s). BullMQ stall-recovery restarted it. On the healthy retry: started 00:10:19, completed 00:17:50 (~7.5 min, no DNS issues).

Even without network problems, ~7.5 min is slow and leaves little margin before the 900s timeout.

Root Causes

1. Client context injected verbatim (30-50K chars)

clientContext.content is the full AI-generated context file. It contains structured analysis of the client’s business, competitors, brand voice, etc. — typically 30,000-50,000 characters. The planner doesn’t need all of it (brand voice, competitor details, key page analysis are irrelevant to deliverable planning), but the whole file is injected verbatim.

2. Strategy document injected verbatim (15-25K chars)

data.approvedStrategy is the full strategy document. Combined with the context file, the total prompt input can exceed 80,000 characters (~20K tokens) before any output tokens.
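That token figure follows from the common rule of thumb of roughly 4 characters per token for English prose (a heuristic, not an exact tokenizer count):

```ts
// Heuristic only: real token counts depend on the tokenizer.
function estimateTokens(chars: number): number {
  return Math.round(chars / 4);
}

estimateTokens(80_000); // ≈ 20,000 input tokens before any output
```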

3. Large JSON output with uncapped action items

The prompt instructs Claude: “Do NOT artificially cap the list. Include every task that genuinely adds strategic value.” Claude generates 20-30 action items, making the output JSON very long and requiring 2-3 continuation turns.

4. claude-sonnet-4-6 is slow for structured JSON generation

Sonnet takes 3-5 min per turn on large prompts. 3 turns × 3-5 min = 9-15 min. Any transient network delay pushes the job over the 900s timeout.

5. DNS failure compounded the issue (incident on 2026-05-05)

A transient getaddrinfo ENOTFOUND api.anthropic.com caused the first attempt to time out waiting for a connection that never arrived, wasting the full 900s budget before DNS recovered.
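A cheap mitigation worth considering alongside the fixes below (a hypothetical sketch, not current adapter behavior; hostResolvable is an illustrative name): resolve the API host before spawning the CLI, so a DNS outage fails in seconds instead of burning the full 900s budget.

```ts
import { lookup } from 'node:dns/promises';

// Fail fast on DNS outages: if the host cannot be resolved within timeoutMs,
// report failure immediately instead of letting the CLI wait out the job timeout.
async function hostResolvable(host: string, timeoutMs = 5_000): Promise<boolean> {
  const timer = new Promise<boolean>((resolve) => {
    const t = setTimeout(() => resolve(false), timeoutMs);
    t.unref?.(); // don't keep the process alive just for this timer
  });
  const resolved = lookup(host).then(
    () => true,
    () => false, // ENOTFOUND etc. => not resolvable
  );
  return Promise.race([resolved, timer]);
}

// e.g. before spawning: if (!(await hostResolvable('api.anthropic.com'))) fail the job early
```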

Root Cause Detail (from code investigation)

Architecture: Claude CLI subprocess, not a direct API call

The adapter spawns claude --print - --output-format stream-json as a child process (the Claude Code CLI in non-interactive mode). Each “turn” inside that session is a full round-trip API call to api.anthropic.com. With maxTurnsPerRun: 3, up to 3 sequential API calls can occur before the job returns.

At 3-5 min per Sonnet call on large prompts: 3 turns × 3-5 min = 9-15 min. The observed 7.5 min was a 2-turn run.
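The arithmetic is worth making explicit, because the worst case exactly exhausts the timeout budget (the 3-5 min per-turn figures are this report's estimates, not measured constants):

```ts
// Back-of-envelope timing check for the current config.
const TIMEOUT_SEC = 900;
const MAX_TURNS = 3;
const PER_TURN_SEC = { best: 3 * 60, worst: 5 * 60 };

const bestCaseSec = MAX_TURNS * PER_TURN_SEC.best;   // 540s
const worstCaseSec = MAX_TURNS * PER_TURN_SEC.worst; // 900s: the entire budget
const marginSec = TIMEOUT_SEC - worstCaseSec;        // 0s of headroom for any network delay
```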

The prompt is genuinely massive

buildDeliverablePlannerPrompt() injects verbatim:

  • clientContext — full context file (30-50K chars). Planner only needs goals, channels, audience, budget. Brand voice, competitor analysis, and key page breakdowns are included but irrelevant to deliverable planning.
  • approvedStrategy — full strategy markdown (15-25K chars). This IS the primary input and cannot be cut.

Combined: ~45-75K chars = 11-19K input tokens before a single output token.

Output schema is large by design

The prompt says “Do NOT artificially cap the list” for action items, producing 20-30 items with 7 fields each, plus goals, deliverables, and a multi-section planningNotes markdown block. Output is ~5-8K tokens — large enough to require continuation turns.

Proposed Fixes

Option 1: Section extraction from context file (quick win, ~1-2 min saving)

Parse the context markdown by # headings at prompt-build time and include only planning-relevant sections:

    function extractPlanningContext(contextContent: string): string {
      const KEEP = ['goals', 'objective', 'channel', 'audience', 'overview', 'summary', 'budget'];
      const SKIP = ['brand voice', 'competitor', 'key page', 'website', 'technical'];

      // Split at markdown headings, then keep sections whose heading matches a
      // KEEP term and drop those matching a SKIP term. Target output: ~4-6K chars.
      return contextContent
        .split(/^(?=#)/m)
        .filter((section) => {
          const heading = (section.split('\n', 1)[0] ?? '').toLowerCase();
          if (SKIP.some((term) => heading.includes(term))) return false;
          return KEEP.some((term) => heading.includes(term));
        })
        .join('\n');
    }

Pros: No quality loss, no extra LLM call, deterministic, < 50 lines of code
Cons: Requires consistent heading names — context-file-writer uses # headings so this should be reliable, but needs verification

Option 2: Pre-distilled planner summary stored on ClientContext (~500-1000 words)

Generate a short planning summary when the context file is approved. Store it in a plannerSummary column on ClientContext. The deliverable planner uses that field instead of content.

Pros: Zero extraction logic at runtime, reusable across planner re-runs, guaranteed relevance
Cons: Requires DB migration + one extra LLM call at context-approval time (Haiku is sufficient). Summary goes stale if context is regenerated without re-running the summary.

Option 3: Two-phase output — goals+deliverables first, action items separately

The output has two logically distinct parts:

  • Phase 1 (fast, ~1-2 min): goals[] + deliverables[] — what the admin reviews
  • Phase 2 (async/deferred): actionItems[] + planningNotes — generated after plan is reviewed or in a background job

Admin sees a plan immediately instead of waiting 7-15 min. Action items load in the background.

Pros: Biggest UX win — eliminates the visible blocking wait entirely
Cons: More complex — needs a second queue or deferred job, UI needs a “loading action items” state
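The two payloads could be typed as below (a hypothetical sketch: interface and helper names are illustrative, field names follow the current output schema, element types are placeholders):

```ts
// Phase 1: the part the admin reviews immediately (~1-2 min to generate).
interface PlannerPhase1 {
  goals: unknown[];
  deliverables: unknown[];
}

// Phase 2: generated later by a deferred/background job.
interface PlannerPhase2 {
  actionItems: unknown[];
  planningNotes: string;
}

// When the phase-2 job completes, merge its output into the stored plan.
function mergePhases(phase1: PlannerPhase1, phase2: PlannerPhase2) {
  return { ...phase1, ...phase2 };
}
```

One likely design: the phase-1 completion handler enqueues the phase-2 job with the phase-1 output as context, keeping the second prompt small.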

Option 4: Cap action items at 12 (two-line change, ~40% output reduction)

Change the prompt instruction from “Do NOT artificially cap the list” to:

`- Limit to a maximum of 12 action items — prioritise the highest-value foundational tasks.`

Pros: Reduces output tokens by ~40%, likely saves a full continuation turn
Cons: May miss some legitimate tasks for large or complex plans
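If the cap is adopted, the worker could also enforce it when parsing the model's JSON, as defense in depth (capActionItems is a hypothetical helper, not existing worker code):

```ts
const MAX_ACTION_ITEMS = 12;

// Truncate defensively in case the model ignores the prompt instruction.
// Assumes items arrive roughly priority-ordered, as the revised prompt requests.
function capActionItems<T>(items: T[]): T[] {
  return items.slice(0, MAX_ACTION_ITEMS);
}
```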

Option 5: Switch model to claude-haiku-4-5 (~5x speedup)

Haiku generates structured JSON ~5x faster than Sonnet at lower cost.

    model: "claude-haiku-4-5",
    timeoutSec: 300,
    lockDuration: 480_000,

Pros: ~5x speedup, significant cost reduction
Cons: Lower quality for complex planning nuance — worth A/B testing before shipping

Recommended Plan

  1. Option 1 + Option 4 (immediate) — section extraction + cap at 12. No schema changes, < 50 lines, expected runtime ~3-4 min instead of 7-15 min.
  2. Option 2 (medium term) — pre-distilled summary stored at context-approval time. Cleaner than runtime extraction once validated.
  3. Option 3 (if UX priority) — two-phase output eliminates the visible wait entirely, highest-value improvement for the user experience.
  4. Option 5 (after 1+4 validated) — Haiku if runtime is still too slow.

Current Config (unchanged)

    model: "claude-sonnet-4-6",
    allowedTools: [],
    timeoutSec: 900,
    maxTurnsPerRun: 3,
    lockDuration: 1_080_000,

© 2026 Leadmetrics — Internal use only