Deliverable Planner: Slow Runtime (7-15 min) — Needs Optimization
Status: Open
Severity: Medium — job completes but takes 7-15 min; DNS failures can push it past the 900s timeout
File: packages/agents/src/workers/strategy.worker.ts
Observed: 2026-05-06 00:17 (completed in ~7.5 min on retry, after a DNS failure caused the first attempt to time out)
Symptom
ERROR [strategy-worker] Deliverable planner job failed
message: "Process timed out after 900s"

On the DNS failure run: started 23:55, timed out 00:10 (900s). BullMQ stall-recovery restarted it. On the healthy retry: started 00:10:19, completed 00:17:50 (~7.5 min, no DNS issues).
Even without network problems, ~7.5 min is slow and leaves little margin before the 900s timeout.
Root Causes
1. Client context injected verbatim (30-50K chars)
clientContext.content is the full AI-generated context file. It contains structured analysis
of the client’s business, competitors, brand voice, etc. — typically 30,000-50,000 characters.
The planner doesn’t need all of it (brand voice, competitor details, key page analysis are
irrelevant to deliverable planning), but the whole file is injected verbatim.
2. Strategy document injected verbatim (15-25K chars)
data.approvedStrategy is the full strategy document. Combined with the context file, the
total prompt input can exceed 80,000 characters (~20K tokens) before any output tokens.
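The ~20K-token figure follows from the common ~4 characters/token heuristic for English text. A quick sketch of the budgeting arithmetic (the heuristic is an approximation, not the real tokenizer):

```typescript
// Rough prompt-size budgeting using the ~4 chars/token heuristic for
// English text (an approximation, not the real tokenizer).
function estimateTokens(chars: number): number {
  return Math.ceil(chars / 4);
}

// Worst-case verbatim injection: 50K-char context + 25K-char strategy
const inputTokens = estimateTokens(50_000 + 25_000); // ≈ 18_750 tokens before any output
```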
3. Large JSON output with uncapped action items
The prompt instructs Claude: “Do NOT artificially cap the list. Include every task that genuinely adds strategic value.” Claude generates 20-30 action items, making the output JSON very long and requiring 2-3 continuation turns.
4. claude-sonnet-4-6 is slow for structured JSON generation
Sonnet takes 3-5 min per turn on large prompts. 3 turns × 3-5 min = 9-15 min. Any transient network delay pushes it over the 900s timeout.
5. DNS failure compounded the issue (incident on 2026-05-05)
A temporary getaddrinfo ENOTFOUND api.anthropic.com caused the first attempt to time out
waiting for a connection that never came, wasting the full 900s budget before DNS recovered.
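One mitigation worth considering alongside the fixes below: fail fast on DNS before spending the 900s budget. A minimal sketch, assuming a pre-flight guard before the CLI spawn (the function and its placement are illustrative, not existing code):

```typescript
import { promises as dns } from 'node:dns';

// Hypothetical pre-flight guard: resolve the API host with a short deadline
// before spawning the Claude CLI, so a DNS outage fails the job in seconds
// (letting BullMQ retry promptly) instead of burning the full 900s timeout.
async function assertApiReachable(host = 'api.anthropic.com', timeoutMs = 5_000): Promise<void> {
  const deadline = new Promise<never>((_, reject) => {
    const t = setTimeout(() => reject(new Error(`DNS lookup for ${host} timed out`)), timeoutMs);
    t.unref?.(); // don't keep the process alive for the timer
  });
  await Promise.race([dns.resolve4(host), deadline]);
}
```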
Root Cause Detail (from code investigation)
Architecture: Claude CLI subprocess, not a direct API call
The adapter spawns claude --print - --output-format stream-json as a child process (the
Claude Code CLI in non-interactive mode). Each “turn” inside that session is a full
round-trip API call to api.anthropic.com. With maxTurnsPerRun: 3, up to 3 sequential
API calls can occur before the job returns.
At 3-5 min per Sonnet call on large prompts: 3 turns × 3-5 min = 9-15 min. The observed
7.5 min was a 2-turn run.
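The subprocess architecture above can be sketched roughly as follows. Only the CLI flags come from this report; the wrapper shape is an assumption, not the adapter's actual code (`cmd`/`args` are parameterised purely so the shape is easy to exercise):

```typescript
import { spawn } from 'node:child_process';

// Illustrative wrapper around the Claude Code CLI in non-interactive mode:
// write the prompt to stdin, collect stream-json output from stdout.
function runPlannerPrompt(
  prompt: string,
  cmd = 'claude',
  args = ['--print', '-', '--output-format', 'stream-json'],
): Promise<string> {
  return new Promise((resolve, reject) => {
    const child = spawn(cmd, args);
    let out = '';
    child.stdout.on('data', (chunk) => (out += chunk));
    child.on('error', reject);
    child.on('close', (code) =>
      code === 0 ? resolve(out) : reject(new Error(`${cmd} exited with code ${code}`)),
    );
    child.stdin.write(prompt);
    child.stdin.end();
  });
}
```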
The prompt is genuinely massive
buildDeliverablePlannerPrompt() injects verbatim:
- clientContext — full context file (30-50K chars). The planner only needs goals, channels, audience, and budget; brand voice, competitor analysis, and key page breakdowns are included but irrelevant to deliverable planning.
- approvedStrategy — full strategy markdown (15-25K chars). This IS the primary input and cannot be cut.
Combined: ~45-75K chars = 11-19K input tokens before a single output token.
Output schema is large by design
The prompt says “Do NOT artificially cap the list” for action items, producing 20-30 items
with 7 fields each, plus goals, deliverables, and a multi-section planningNotes markdown
block. Output is ~5-8K tokens — large enough to require continuation turns.
Proposed Fixes
Option 1: Section extraction from context file (quick win, ~1-2 min saving)
Parse the context markdown by # headings at prompt-build time and include only
planning-relevant sections:
function extractPlanningContext(contextContent: string): string {
  const KEEP = ['goals', 'objective', 'channel', 'audience', 'overview', 'summary', 'budget'];
  const SKIP = ['brand voice', 'competitor', 'key page', 'website', 'technical'];
  // Split on markdown headings; keep KEEP sections, drop SKIP ones (target: ~4-6K chars)
  return contextContent.split(/^(?=#+ )/m).filter((section) => {
    const heading = (section.split('\n', 1)[0] ?? '').toLowerCase();
    return KEEP.some((k) => heading.includes(k)) && !SKIP.some((s) => heading.includes(s));
  }).join('');
}

Pros: No quality loss, no extra LLM call, deterministic, < 50 lines of code
Cons: Requires consistent heading names — context-file-writer uses # headings so this
should be reliable, but needs verification
Option 2: Pre-distilled planner summary stored on ClientContext (~500-1000 words)
Generate a short planning summary when the context file is approved. Store it in a
plannerSummary column on ClientContext. The deliverable planner uses that field instead
of content.
Pros: Zero extraction logic at runtime, reusable across planner re-runs, guaranteed
relevance
Cons: Requires DB migration + one extra LLM call at context-approval time (Haiku is
sufficient). Summary goes stale if context is regenerated without re-running the summary.
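A minimal sketch of where Option 2's hook would sit, assuming the codebase's existing LLM client and persistence layer are passed in. All names here are illustrative; `plannerSummary` is the proposed column from above:

```typescript
// Approval-time hook for Option 2 (sketch). `summarise` stands in for the
// codebase's existing LLM call (claude-haiku-4-5 would be sufficient);
// `save` stands in for writing the proposed plannerSummary column.
function buildSummaryPrompt(contextContent: string): string {
  return (
    'Summarise this client context for deliverable planning in 500-1000 words. ' +
    'Keep only goals, channels, audience, and budget; drop brand voice and competitor detail.\n\n' +
    contextContent
  );
}

async function onContextApproved(
  clientId: string,
  contextContent: string,
  summarise: (prompt: string) => Promise<string>,
  save: (clientId: string, plannerSummary: string) => Promise<void>,
): Promise<void> {
  const summary = await summarise(buildSummaryPrompt(contextContent));
  await save(clientId, summary);
}
```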
Option 3: Two-phase output — goals+deliverables first, action items separately
The output has two logically distinct parts:
- Phase 1 (fast, ~1-2 min): goals[] + deliverables[] — what the admin reviews
- Phase 2 (async/deferred): actionItems[] + planningNotes — generated after the plan is reviewed, or in a background job
Admin sees a plan immediately instead of waiting 7-15 min. Action items load in the background.
Pros: Biggest UX win — eliminates the visible blocking wait entirely
Cons: More complex — needs a second queue or deferred job, UI needs a “loading action
items” state
Option 4: Cap action items at 12 (two-line change, ~40% output reduction)
Change the prompt instruction from “Do NOT artificially cap the list” to:
`- Limit to a maximum of 12 action items — prioritise the highest-value foundational tasks.`

Pros: Reduces output tokens by ~40%, likely saves a full continuation turn
Cons: May miss some legitimate tasks for large or complex plans
Option 5: Switch model to claude-haiku-4-5 (~5x speedup)
Haiku generates structured JSON ~5x faster than Sonnet at lower cost.
model: "claude-haiku-4-5",
timeoutSec: 300,
lockDuration: 480_000,

Pros: ~5x speedup, significant cost reduction
Cons: Lower quality for complex planning nuance — worth A/B testing before shipping
Recommended Fix Order
1. Option 1 + Option 4 (immediate) — section extraction + cap at 12. No schema changes, < 50 lines, expected runtime ~3-4 min instead of 7-15 min.
2. Option 2 (medium term) — pre-distilled summary stored at context-approval time. Cleaner than runtime extraction once validated.
3. Option 3 (if UX priority) — two-phase output eliminates the visible wait entirely, highest-value improvement for the user experience.
4. Option 5 (after 1+4 validated) — Haiku if runtime is still too slow.
Current Config (unchanged)
model: "claude-sonnet-4-6",
allowedTools: [],
timeoutSec: 900,
maxTurnsPerRun: 3,
lockDuration: 1_080_000,