# Gap 9: No Episodic Memory Across Related Tasks

## Problem
Each agent run reads the same static ClientContext file. There is no accumulated knowledge of what has happened across runs for the same tenant. The blog-writer that just produced 10 posts for a tenant doesn’t know:
- Which topics have already been covered (risk of duplicate content)
- What writing style got approved vs. rejected
- What content lengths the client tends to approve
- Which keywords have already been used heavily
This is the core insight from the Generative Agents paper (Park et al., 2023): agents with a memory stream that accumulates experience and synthesises it into higher-level reflections perform significantly better than agents that start from scratch each time.
## Concrete example
The blog-writer for Tenant A generates “10 Benefits of Social Media Marketing” in March. In April, without episodic memory, it generates “Why Social Media Marketing Matters for Your Business” — a near-duplicate. Both posts exist in the CMS, both go through DM review, and the DM catches the duplication manually.
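With a `covered_topics` memory in place, the writer could run a cheap title-similarity check before drafting. A hypothetical guard (function names and the threshold are illustrative, not part of the proposal; a real implementation might compare embeddings instead):

```typescript
// Word-set Jaccard similarity between two titles — the cheapest possible
// duplicate check. Tokens are lowercased and very short words are dropped.
function titleOverlap(a: string, b: string): number {
  const tokens = (s: string) =>
    new Set(
      s.toLowerCase().replace(/[^a-z0-9\s]/g, "").split(/\s+/).filter((w) => w.length > 2)
    );
  const setA = tokens(a);
  const setB = tokens(b);
  const shared = [...setA].filter((w) => setB.has(w)).length;
  return shared / (setA.size + setB.size - shared);
}

// Flag a proposed title if it overlaps heavily with any already-covered topic.
export function isLikelyDuplicate(
  newTitle: string,
  coveredTopics: string[],
  threshold = 0.3
): boolean {
  return coveredTopics.some((t) => titleOverlap(newTitle, t) >= threshold);
}
```

On the example above, "Why Social Media Marketing Matters for Your Business" shares the tokens social/media/marketing with the March post, which is enough to trip the guard before the draft is ever written.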
## What to Build

### 1. `TenantAgentMemory` model

A structured key-value store per `(tenantId, agentRole)` pair for accumulated learnings:
```prisma
model TenantAgentMemory {
  id         String   @id @default(cuid())
  tenantId   String
  agentRole  String
  key        String   // e.g. "covered_topics", "preferred_tone", "avg_approved_length"
  value      Json     // type depends on key
  confidence Float    @default(1.0) // degrades over time if contradicted
  updatedAt  DateTime @updatedAt
  createdAt  DateTime @default(now())

  tenant Tenant @relation(fields: [tenantId], references: [id])

  @@unique([tenantId, agentRole, key])
  @@map("tenant_agent_memory")
}
```

Standard memory keys per agent role:
| Agent role | Memory keys |
|---|---|
| blog-writer | covered_topics, preferred_length_words, approved_structures, rejected_patterns |
| social-post-writer | used_hashtags, preferred_cta_style, avg_approved_length |
| strategy-writer | previous_goals, channel_history, approved_pillars |
| context-file-writer | revision_count, last_approved_sections |
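These role-to-key mappings could live in one shared constant so the extractor and injector agree on key names. A minimal sketch (the constant name, module location, and guard function are illustrative):

```typescript
// Hypothetical shared registry of allowed memory keys, mirroring the table above.
export const MEMORY_KEYS: Record<string, string[]> = {
  "blog-writer": ["covered_topics", "preferred_length_words", "approved_structures", "rejected_patterns"],
  "social-post-writer": ["used_hashtags", "preferred_cta_style", "avg_approved_length"],
  "strategy-writer": ["previous_goals", "channel_history", "approved_pillars"],
  "context-file-writer": ["revision_count", "last_approved_sections"],
};

// Guard against an extraction call inventing keys outside the agreed set.
export function isKnownMemoryKey(agentRole: string, key: string): boolean {
  return MEMORY_KEYS[agentRole]?.includes(key) ?? false;
}
```

Validating extracted keys against this registry keeps a hallucinated key name from silently polluting the store.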
### 2. Memory extraction after each approved run

When content is approved (status changes to `client_approved` or `active`), run a memory extraction job:
```typescript
// packages/agents/src/lib/memory-extractor.ts
import type { TenantAgentMemory } from "@prisma/client";

export async function extractAndStoreMemory(
  tenantId: string,
  agentRole: string,
  approvedOutput: string,
  existingMemories: TenantAgentMemory[]
): Promise<void> {
  const extractionPrompt = buildExtractionPrompt(agentRole, approvedOutput, existingMemories);

  // Fast Haiku call — extraction doesn't need a frontier model
  const extracted = await claudeHaiku(extractionPrompt);
  const updates = parseMemoryUpdates(extracted);

  for (const [key, value] of Object.entries(updates)) {
    await db.tenantAgentMemory.upsert({
      where: { tenantId_agentRole_key: { tenantId, agentRole, key } },
      update: { value, updatedAt: new Date() },
      create: { tenantId, agentRole, key, value },
    });
  }
}
```

Extraction prompt example for blog-writer:
```text
This blog post was just approved by the client. Extract memory updates in JSON format:

{
  "covered_topics": ["append new topic title here"],
  "preferred_length_words": <word count>,
  "approved_structures": ["append observed heading structure"]
}

APPROVED BLOG POST:
{approvedOutput}

EXISTING MEMORY (do not duplicate):
{existingMemories}
```

### 3. Memory injection before generation
Before running the agent, load relevant memories and inject them into the prompt:
```typescript
// In blog-writer.worker.ts, before building the main prompt
const memories = await db.tenantAgentMemory.findMany({
  where: { tenantId, agentRole: "blog-writer" },
});
const memorySection = buildMemorySection(memories);

// Produces something like:
// ## ACCUMULATED KNOWLEDGE FOR THIS CLIENT
// - Topics already covered: [list of 10 titles]
// - Preferred article length: ~1,400 words
// - Approved content structures: [intro → 3 H2 sections → CTA]
// - Patterns to avoid: listicles without examples, generic intros
```

### 4. Reflection synthesis (periodic, not per-run)
Inspired directly by the Generative Agents reflection mechanism: periodically synthesise raw memories into higher-level insights. Run this as a scheduled job (e.g., after every 5 approved posts):
```text
Given these memory entries for this tenant's blog content:
{memories}

What are the 3 most important patterns we should always apply for this tenant?
What are the 3 most common mistakes to avoid?

Return as JSON: { alwaysDo: string[], neverDo: string[] }
```

Store the synthesis result under a special `synthesis` key in `TenantAgentMemory`. Inject it at the top of future prompts as the highest-priority context.
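The injection side of this can be sketched as follows; the `SynthesisResult` type mirrors the JSON shape the prompt asks for, and the section heading is illustrative:

```typescript
// Shape of the stored `synthesis` value, matching the prompt's requested JSON.
interface SynthesisResult {
  alwaysDo: string[];
  neverDo: string[];
}

// Render the synthesis as the highest-priority prompt section.
export function buildSynthesisSection(synthesis: SynthesisResult): string {
  const lines = [
    "## LEARNED RULES FOR THIS CLIENT (highest priority)",
    ...synthesis.alwaysDo.map((rule) => `- ALWAYS: ${rule}`),
    ...synthesis.neverDo.map((rule) => `- NEVER: ${rule}`),
  ];
  return lines.join("\n");
}
```

Keeping the synthesis as a single rendered section makes it trivial to place above the raw memory section when the worker assembles its prompt.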
### 5. Memory confidence decay

If a rejected run contradicts an existing memory, reduce its confidence score. If confidence drops below 0.3, soft-delete the memory:
```typescript
// When a run is rejected
if (wakeReason === "rejection" && rejectionFeedback) {
  const contradictedMemories = await findContradictedMemories(tenantId, agentRole, rejectionFeedback);
  for (const memory of contradictedMemories) {
    const confidence = memory.confidence * 0.6;
    await db.tenantAgentMemory.update({
      where: { id: memory.id },
      data: { confidence },
    });
    // Memories with confidence < 0.3 are treated as soft-deleted:
    // the injector skips them when building the prompt.
  }
}
```

## Files to Change
- `packages/db/prisma/schema.prisma` — add `TenantAgentMemory` model
- New file: `packages/agents/src/lib/memory-extractor.ts`
- New file: `packages/agents/src/lib/memory-injector.ts`
- `packages/agents/src/workers/blog-writer.worker.ts` — inject memory before generation; extract after approval
- `packages/agents/src/workers/social-post-writer.worker.ts` — same pattern
- `packages/api/src/routers/tenant/` — webhook/handler for `client_approved` status change to trigger extraction
- New scheduled job in `apps/servers/scheduler` — periodic reflection synthesis
## Related
- Gap 1: Learning from feedback history (episodic memory is the structured version of episode retrieval)
- Gap 2: RAG recency + importance scoring (memory supplements RAG for run-specific knowledge)