
Channel Action Suggester: Claude Subprocess Exits with Code 1

Status: ✅ Fixed (2026-05-06) — concurrency: 3 → 1
Severity: Medium — affected jobs fail immediately; BullMQ retries eventually succeed, so no data is lost, but credits are wasted and latency increases
File: packages/agents/src/workers/insights/channel-action-suggester.worker.ts

Symptom

After channel insights complete, the action suggester job starts and immediately fails (~400ms) with no meaningful error:

ERROR [channel-action-suggester] Claude adapter returned failure
  error: "Process exited with code 1"
  connectedChannelId: cmotrtt7n0009w13o1uunhhna
ERROR [channel-action-suggester] Channel action suggester job failed
  jobId: channel-action-suggester__<channelId>__<ts>
  Error: Process exited with code 1
      at Worker.processJob (channel-action-suggester.worker.ts:232)

The failure is always near-instant (~400ms–1.5s), far too fast for Claude to have processed anything. Retries succeed once no concurrent job is running.

Root Cause

The execute.ts adapter (packages/adapters/claude-local/src/server/execute.ts) uses a .agent.pid file in the worker’s cwd to detect and kill orphaned Claude subprocesses left over from a server restart:

// Kill any orphaned subprocess from a previous run (server restart scenario).
const pidFile = path.join(config.cwd, ".agent.pid");
// reads old PID → taskkill /F /T /PID <old> → writes current PID
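A minimal sketch of that mechanism as described above; the function name reclaimPidFile and the exact error handling are assumptions, the real logic lives in execute.ts:

import fs from "node:fs";
import path from "node:path";
import { execSync } from "node:child_process";

// Illustrative sketch of the .agent.pid handling described above; names and
// error handling are assumptions, not the actual execute.ts implementation.
function reclaimPidFile(cwd: string, currentPid: number): void {
  const pidFile = path.join(cwd, ".agent.pid");

  if (fs.existsSync(pidFile)) {
    const oldPid = Number(fs.readFileSync(pidFile, "utf8").trim());
    if (Number.isFinite(oldPid) && oldPid !== currentPid) {
      try {
        // Windows: force-kill the recorded process tree. This is exactly the
        // call that kills a *live* sibling job when two jobs share one cwd.
        execSync(`taskkill /F /T /PID ${oldPid}`);
      } catch {
        // Old process is already gone; nothing to clean up.
      }
    }
  }

  // Record the current subprocess so a future restart can clean it up.
  fs.writeFileSync(pidFile, String(currentPid), "utf8");
}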

All channel-action-suggester jobs share the same cwd:

os.tmpdir()/leadmetrics-agents/insights/channel-action-suggester
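That directory is derived from the agent role alone. A sketch of the assumed derivation (workerCwd is referenced later in this document; how it splits the path segments is an assumption):

import os from "node:os";
import path from "node:path";

// Assumed shape of the cwd derivation: every job for the same agent role
// resolves to the same directory, so all of them also share one .agent.pid.
function workerCwd(agentRole: string): string {
  return path.join(os.tmpdir(), "leadmetrics-agents", "insights", agentRole);
}

workerCwd("channel-action-suggester");
// => <tmpdir>/leadmetrics-agents/insights/channel-action-suggester (same for every concurrent job)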

With concurrency: 3, BullMQ runs multiple jobs simultaneously. When Job B starts while Job A is still running:

  1. Job B reads .agent.pid — finds Job A’s PID
  2. Job B issues taskkill /F /T /PID <A> — kills Job A’s Claude process
  3. Job A’s Claude process exits with code 1, empty stderr
  4. Job A’s execute.ts resolves { success: false, error: "Process exited with code 1" }
  5. Job A throws and BullMQ schedules a retry

This was confirmed in live logs (2026-05-06):

13:34:48  Job A starts (Claude running, ~1 min)
13:35:48  Job B starts   ← concurrent; kills Job A via pidFile
13:35:48  Job A FAILS    "Process exited with code 1" (416ms after B started)
13:38:28  Job D starts
13:39:07  Job E starts   ← same pattern; kills Job D
13:39:08  Job D FAILS    "Process exited with code 1"

Why insight workers don’t have this problem: each insight worker type has its own agentRole (e.g. "facebook-insights", "gsc-insights"), so each gets its own cwd and its own pid file. At any given moment typically only one job per channel type is running.

Previous Incorrect Fix (2026-05-05)

An earlier fix added maxStalledCount: 2, maxTurnsPerRun: 3, and lockDuration: 300_000. These helped with a separate stall-retry issue but did not address the concurrent-kill race condition. The "Process exited with code 1" error continued appearing in the 2026-05-06 E2E session.

Fix Applied (2026-05-06)

Set CONCURRENCY from 3 to 1:

// packages/agents/src/workers/insights/channel-action-suggester.worker.ts
const CONCURRENCY = 1; // was 3

With concurrency: 1, BullMQ only runs one job at a time for this worker. The pid file mechanism was designed for this model — one Claude process per agent role at a time. Jobs queue up and execute serially, which is appropriate for a background suggestion generation task.
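For reference, a sketch of the worker registration with the relevant BullMQ options; the queue name, processor body, and Redis connection here are placeholders, not the real wiring:

import { Worker, type Job } from "bullmq";
import IORedis from "ioredis";

const connection = new IORedis({ maxRetriesPerRequest: null }); // placeholder connection

const CONCURRENCY = 1; // was 3

// Placeholder processor; the real one drives the Claude adapter.
async function processChannelActionSuggestion(job: Job): Promise<void> {
  // ...
}

const worker = new Worker("channel-action-suggester", processChannelActionSuggestion, {
  connection,
  concurrency: CONCURRENCY, // one job (and one Claude subprocess) at a time
  lockDuration: 300_000,    // 5 min lock per serial job (see Notes)
  maxStalledCount: 2,       // safety net for cold-start stalls (see Notes)
});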

Why Not Use Per-Execution cwd?

Using path.join(workerCwd(AGENT_ROLE), runId) as the cwd would isolate concurrent jobs, but it would break orphan cleanup: after a server restart, the new job would derive a new runId, use a new directory, and never find the orphaned Claude process left by the previous run.
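A sketch of that rejected alternative, with identifiers mirroring the text above (workerCwd as sketched earlier; runId is assumed here to be a fresh UUID per run):

import os from "node:os";
import path from "node:path";
import { randomUUID } from "node:crypto";

const AGENT_ROLE = "channel-action-suggester";
const runId = randomUUID(); // fresh value on every run, including after a restart

const workerCwd = (role: string) =>
  path.join(os.tmpdir(), "leadmetrics-agents", "insights", role);

// Isolates concurrent jobs (each run gets its own .agent.pid), but breaks orphan
// cleanup: after a restart the fresh runId points at a fresh directory, so the
// pid file written by the now-orphaned Claude process is never found.
const perRunCwd = path.join(workerCwd(AGENT_ROLE), runId);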

Notes

  • lockDuration: 300_000 (5 min) remains, giving each serial job enough lock time
  • maxStalledCount: 2 remains as a safety net for Windows DLL cold-start on first Claude spawn
  • If throughput becomes a bottleneck, the proper fix is to scope the pid file per connectedChannelId (one file per channel, not one per agent role), which would allow concurrent jobs across different channels without interfering; see the sketch after this list
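A sketch of what that per-channel scoping could look like (hypothetical, not implemented; the helper name channelScopedCwd is invented here):

import os from "node:os";
import path from "node:path";

// Hypothetical future shape: scope the working directory (and therefore the
// .agent.pid file) by connectedChannelId, so jobs for different channels never
// touch each other's pid file, while per-channel restart cleanup still works.
function channelScopedCwd(agentRole: string, connectedChannelId: string): string {
  return path.join(os.tmpdir(), "leadmetrics-agents", "insights", agentRole, connectedChannelId);
}

channelScopedCwd("channel-action-suggester", "cmotrtt7n0009w13o1uunhhna");
// => <tmpdir>/leadmetrics-agents/insights/channel-action-suggester/cmotrtt7n0009w13o1uunhhna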

© 2026 Leadmetrics — Internal use only