
Gap 9: No Episodic Memory Across Related Tasks

Problem

Each agent run reads the same static ClientContext file. There is no accumulated knowledge of what has happened across runs for the same tenant. The blog-writer that just produced 10 posts for a tenant doesn’t know:

  • Which topics have already been covered (risk of duplicate content)
  • What writing style got approved vs. rejected
  • What content lengths the client tends to approve
  • Which keywords have already been used heavily

This is the core insight from the Generative Agents paper (Park et al., 2023): agents with a memory stream that accumulates experience and synthesises it into higher-level reflections perform significantly better than agents that start from scratch each time.

Concrete example

The blog-writer for Tenant A generates “10 Benefits of Social Media Marketing” in March. In April, without episodic memory, it generates “Why Social Media Marketing Matters for Your Business” — a near-duplicate. Both posts exist in the CMS, both go through DM review, and the DM catches the duplication manually.

What to Build

1. TenantAgentMemory model

A structured key-value store per (tenantId, agentRole) for accumulated learnings:

```prisma
model TenantAgentMemory {
  id         String   @id @default(cuid())
  tenantId   String
  agentRole  String
  key        String   // e.g. "covered_topics", "preferred_tone", "avg_approved_length"
  value      Json     // type depends on key
  confidence Float    @default(1.0) // degrades over time if contradicted
  updatedAt  DateTime @updatedAt
  createdAt  DateTime @default(now())

  tenant Tenant @relation(fields: [tenantId], references: [id])

  @@unique([tenantId, agentRole, key])
  @@map("tenant_agent_memory")
}
```

Standard memory keys per agent role:

| Agent role | Memory keys |
| --- | --- |
| blog-writer | covered_topics, preferred_length_words, approved_structures, rejected_patterns |
| social-post-writer | used_hashtags, preferred_cta_style, avg_approved_length |
| strategy-writer | previous_goals, channel_history, approved_pillars |
| context-file-writer | revision_count, last_approved_sections |
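The standard keys in the table above could be captured as a typed constant so that extraction jobs cannot write unexpected keys. This is a sketch, not an existing module; the `AgentRole` type and `STANDARD_MEMORY_KEYS` constant simply mirror the table:

```typescript
// Sketch: standard memory keys per agent role, mirroring the table above.
// AgentRole and STANDARD_MEMORY_KEYS are illustrative, not existing exports.
type AgentRole =
  | "blog-writer"
  | "social-post-writer"
  | "strategy-writer"
  | "context-file-writer";

const STANDARD_MEMORY_KEYS: Record<AgentRole, string[]> = {
  "blog-writer": ["covered_topics", "preferred_length_words", "approved_structures", "rejected_patterns"],
  "social-post-writer": ["used_hashtags", "preferred_cta_style", "avg_approved_length"],
  "strategy-writer": ["previous_goals", "channel_history", "approved_pillars"],
  "context-file-writer": ["revision_count", "last_approved_sections"],
};

// Guard used by the extraction job before upserting a memory row
export function isStandardKey(role: AgentRole, key: string): boolean {
  return STANDARD_MEMORY_KEYS[role].includes(key);
}
```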

2. Memory extraction after each approved run

When content is approved (status changes to client_approved or active), run a memory extraction job:

```ts
// packages/agents/src/lib/memory-extractor.ts
export async function extractAndStoreMemory(
  tenantId: string,
  agentRole: string,
  approvedOutput: string,
  existingMemories: TenantAgentMemory[]
): Promise<void> {
  const extractionPrompt = buildExtractionPrompt(agentRole, approvedOutput, existingMemories);

  // Fast haiku call
  const extracted = await claudeHaiku(extractionPrompt);
  const updates = parseMemoryUpdates(extracted);

  for (const [key, value] of Object.entries(updates)) {
    await db.tenantAgentMemory.upsert({
      where: { tenantId_agentRole_key: { tenantId, agentRole, key } },
      update: { value, updatedAt: new Date() },
      create: { tenantId, agentRole, key, value },
    });
  }
}
```
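The `parseMemoryUpdates` helper called above is not defined in this doc. One plausible shape (an assumption, not the actual implementation) tolerates the model wrapping its JSON reply in a markdown code fence and treats anything unparseable as "no updates" rather than failing the run:

```typescript
// Sketch of a parseMemoryUpdates helper (hypothetical, not an existing export).
export function parseMemoryUpdates(raw: string): Record<string, unknown> {
  // Strip an optional ```json ... ``` fence around the payload
  const fenced = raw.match(/```(?:json)?\s*([\s\S]*?)```/);
  const body = (fenced ? fenced[1] : raw).trim();
  try {
    const parsed = JSON.parse(body);
    // Only accept a plain object of key -> value updates
    if (parsed && typeof parsed === "object" && !Array.isArray(parsed)) {
      return parsed as Record<string, unknown>;
    }
  } catch {
    // Malformed JSON from the model: skip this extraction rather than throwing
  }
  return {};
}
```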

Extraction prompt example for blog-writer:

```text
This blog post was just approved by the client. Extract memory updates in JSON format:

{
  "covered_topics": ["append new topic title here"],
  "preferred_length_words": <word count>,
  "approved_structures": ["append observed heading structure"]
}

APPROVED BLOG POST:
{approvedOutput}

EXISTING MEMORY (do not duplicate):
{existingMemories}
```
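The `buildExtractionPrompt` call in the extractor would interpolate these pieces. A minimal sketch, assuming the signature used earlier and a simple `{key, value}` row shape (both are assumptions, not existing code):

```typescript
// Sketch of buildExtractionPrompt (hypothetical helper). Only the
// interpolation is shown; per-role prompt templates would live elsewhere.
interface MemoryRow {
  key: string;
  value: unknown;
}

export function buildExtractionPrompt(
  agentRole: string,
  approvedOutput: string,
  existingMemories: MemoryRow[]
): string {
  // Render existing memory as one line per key so the model can avoid duplicates
  const memoryLines = existingMemories
    .map((m) => `- ${m.key}: ${JSON.stringify(m.value)}`)
    .join("\n");

  return [
    `This ${agentRole} output was just approved by the client.`,
    `Extract memory updates in JSON format.`,
    ``,
    `APPROVED OUTPUT:`,
    approvedOutput,
    ``,
    `EXISTING MEMORY (do not duplicate):`,
    memoryLines || "(none)",
  ].join("\n");
}
```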

3. Memory injection before generation

Before running the agent, load relevant memories and inject them into the prompt:

```ts
// In blog-writer.worker.ts, before building the main prompt
const memories = await db.tenantAgentMemory.findMany({
  where: { tenantId, agentRole: "blog-writer" },
});

const memorySection = buildMemorySection(memories);
// Produces something like:
// ## ACCUMULATED KNOWLEDGE FOR THIS CLIENT
// - Topics already covered: [list of 10 titles]
// - Preferred article length: ~1,400 words
// - Approved content structures: [intro → 3 H2 sections → CTA]
// - Patterns to avoid: listicles without examples, generic intros
```
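`buildMemorySection` itself could be a pure function over the loaded rows. A sketch under two assumptions: the row shape matches the Prisma model, and entries below the 0.3 soft-delete threshold (from section 5) are excluded at read time:

```typescript
// Sketch of buildMemorySection (hypothetical helper): renders memory rows
// into a markdown block for prompt injection.
interface MemoryEntry {
  key: string;
  value: unknown;
  confidence: number;
}

export function buildMemorySection(memories: MemoryEntry[]): string {
  const lines = memories
    .filter((m) => m.confidence >= 0.3) // mirror the soft-delete threshold
    .map((m) => `- ${m.key.replace(/_/g, " ")}: ${JSON.stringify(m.value)}`);

  if (lines.length === 0) return ""; // nothing accumulated yet: inject nothing
  return ["## ACCUMULATED KNOWLEDGE FOR THIS CLIENT", ...lines].join("\n");
}
```

Returning an empty string for a brand-new tenant keeps the first runs identical to today's behaviour.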

4. Reflection synthesis (periodic, not per-run)

Inspired directly by the Generative Agents reflection mechanism: periodically synthesise raw memories into higher-level insights. Run this as a scheduled job (e.g., after every 5 approved posts):

```text
Given these memory entries for this tenant's blog content:

{memories}

What are the 3 most important patterns we should always apply for this tenant?
What are the 3 most common mistakes to avoid?

Return as JSON: { alwaysDo: string[], neverDo: string[] }
```

Store the synthesis result as a special synthesis key in TenantAgentMemory. Inject it at the top of future prompts as the highest-priority context.
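The stored synthesis can then be rendered as the top-of-prompt block. A sketch, assuming the JSON shape the reflection prompt asks for (the helper name is hypothetical):

```typescript
// Sketch: render the stored "synthesis" memory value as the
// highest-priority block at the top of future prompts.
interface Synthesis {
  alwaysDo: string[];
  neverDo: string[];
}

export function buildSynthesisSection(synthesis: Synthesis): string {
  return [
    "## CLIENT PLAYBOOK (highest priority)",
    "Always:",
    ...synthesis.alwaysDo.map((s) => `- ${s}`),
    "Never:",
    ...synthesis.neverDo.map((s) => `- ${s}`),
  ].join("\n");
}
```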

5. Memory confidence decay

If a rejected run contradicts an existing memory, reduce its confidence score. If confidence drops below 0.3, soft-delete the memory:

```ts
// When a run is rejected
if (wakeReason === "rejection" && rejectionFeedback) {
  const contradictedMemories = await findContradictedMemories(tenantId, agentRole, rejectionFeedback);

  for (const memory of contradictedMemories) {
    await db.tenantAgentMemory.update({
      where: { id: memory.id },
      data: { confidence: memory.confidence * 0.6 },
    });
  }
}
```
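The decay rule itself can be isolated as a pure function, which makes the threshold behaviour easy to test. The 0.6 factor and 0.3 threshold come from the text above; modelling "soft delete" as a boolean flag is an assumption:

```typescript
// Sketch: the confidence-decay rule as a pure function.
const DECAY_FACTOR = 0.6;          // per contradicting rejection
const SOFT_DELETE_THRESHOLD = 0.3; // below this, the memory is soft-deleted

export function decayConfidence(current: number): {
  confidence: number;
  softDelete: boolean;
} {
  const next = current * DECAY_FACTOR;
  return { confidence: next, softDelete: next < SOFT_DELETE_THRESHOLD };
}
```

Applied per rejection, two contradictions take a fresh memory from 1.0 to 0.36, and a third (0.216) drops it below the soft-delete threshold.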

Files to Change

  • packages/db/prisma/schema.prisma — add TenantAgentMemory model
  • New file: packages/agents/src/lib/memory-extractor.ts
  • New file: packages/agents/src/lib/memory-injector.ts
  • packages/agents/src/workers/blog-writer.worker.ts — inject memory before generation; extract after approval
  • packages/agents/src/workers/social-post-writer.worker.ts — same pattern
  • packages/api/src/routers/tenant/ — webhook/handler for client_approved status change to trigger extraction
  • New scheduled job in apps/servers/scheduler — periodic reflection synthesis
Related Gaps

  • Gap 1: Learning from feedback history (episodic memory is the structured version of episode retrieval)
  • Gap 2: RAG recency + importance scoring (memory supplements RAG for run-specific knowledge)

© 2026 Leadmetrics — Internal use only