Skip to Content
AgentsImprovementsGap 5: No Structured Output Contracts

Gap 5: No Structured Output Contracts

Problem

All agent outputs are free-text. Parsing happens ad hoc with regex and loose string checks. Lilian Weng’s 2023 survey explicitly names this as a known reliability risk: “much of the agent demo code focuses on parsing model output” — and it manifests as a real maintenance burden.

Current state in the codebase:

  • blog-writer extracts title, meta description, and slug from unstructured Markdown using regex
  • context-file-writer validates output with validateContextOutput() — checks heading count (≥3) and character length (≥500) but not structure
  • strategy-writer output is unvalidated Markdown stored directly in DB
  • AgentRun.output is a Text field with no schema at all
  • Parsing failures silently produce malformed content that reaches human reviewers

Concrete example

The blog-writer prompt says “Return the title on the first line prefixed with TITLE:”. Occasionally the model writes Title: (lowercase t) or **TITLE:** (markdown bold) or omits it entirely. The regex fails silently and blogPost.title is stored as null or as a fragment of the body text.

What to Build

1. Define Zod output schemas per agent role

// packages/agents/src/schemas/outputs.ts import { z } from "zod"; export const BlogWriterOutput = z.object({ title: z.string().min(10).max(120), metaDescription: z.string().min(50).max(160), slug: z.string().regex(/^[a-z0-9-]+$/), content: z.string().min(500), wordCount: z.number().int().positive(), headings: z.array(z.string()), internalLinks: z.array(z.string().url()).optional(), }); export const ContextFileOutput = z.object({ companyOverview: z.string().min(100), targetAudience: z.string().min(50), coreProducts: z.array(z.string()).min(1), competitors: z.array(z.string()), brandVoice: z.string().min(50), keyMessages: z.array(z.string()).min(3), }); export const StrategyOutput = z.object({ executiveSummary: z.string().min(100), goals: z.array(z.object({ goal: z.string(), metric: z.string(), timeline: z.string(), })).min(1), channels: z.array(z.string()).min(1), contentPillars: z.array(z.string()).min(3), keyInitiatives: z.array(z.object({ title: z.string(), description: z.string(), priority: z.enum(["high", "medium", "low"]), })).min(3), }); export type BlogWriterOutput = z.infer<typeof BlogWriterOutput>; export type ContextFileOutput = z.infer<typeof ContextFileOutput>; export type StrategyOutput = z.infer<typeof StrategyOutput>;

2. Use Claude’s tool_use mode for structured extraction

Instead of asking Claude to format output a certain way and parsing it, use tool_use to enforce structure at the model level:

// packages/agents/src/lib/structured-output.ts export async function extractStructuredOutput<T>( rawOutput: string, schema: z.ZodType<T>, agentRole: string ): Promise<{ success: true; data: T } | { success: false; error: string }> { // First try: parse if the model already returned JSON try { const parsed = JSON.parse(rawOutput); const result = schema.safeParse(parsed); if (result.success) return { success: true, data: result.data }; } catch {} // Second try: extract with a fast haiku call using tool_use const extractionResult = await claudeExtract(rawOutput, schema, agentRole); return extractionResult; }

The prompt to haiku for extraction:

Extract the structured fields from this agent output. Return them exactly matching the required schema. Do not invent content — only extract what is present. [RAW OUTPUT] {rawOutput}

3. Store parsed and raw output separately in AgentRun

model AgentRun { // existing output String? // raw LLM text output // new parsedOutput Json? // validated structured output outputValid Boolean @default(false) outputErrors String? // Zod validation error message if parsing failed }

4. Fail loudly on parse failure for critical agents

For context-file-writer and strategy-writer (high-stakes, human-reviewed), a parse failure should not silently produce bad output. Instead:

const parsed = await extractStructuredOutput(result.output, ContextFileOutput, "context-file-writer"); if (!parsed.success) { // Retry once with explicit format instructions injected const retryResult = await adapter.execute({ ...config, prompt: buildRetryPrompt(result.output, parsed.error) }); const retryParsed = await extractStructuredOutput(retryResult.output, ContextFileOutput, "context-file-writer"); if (!retryParsed.success) { // Store for human review with parse_failed flag await db.agentRun.update({ where: { id }, data: { outputValid: false, outputErrors: retryParsed.error } }); throw new Error(`Output parsing failed after retry: ${retryParsed.error}`); } }

Files to Change

  • New file: packages/agents/src/schemas/outputs.ts — all output schemas
  • New file: packages/agents/src/lib/structured-output.ts — extraction utility
  • packages/db/prisma/schema.prisma — add parsedOutput, outputValid, outputErrors to AgentRun
  • packages/agents/src/workers/blog-writer.worker.ts — replace regex parsing
  • packages/agents/src/workers/setup.worker.ts — replace validateContextOutput() with schema validation
  • packages/agents/src/workers/strategy.worker.ts — add schema validation
  • Gap 6: Critic agent quality gate (critic operates on parsed structured output, not raw text)
  • Gap 3: Hallucination detection (structured output failures are a hallucination signal)

© 2026 Leadmetrics — Internal use only