Agent Architecture Improvements

Gaps identified by mapping the Leadmetrics agent architecture against the patterns in "LLM Powered Autonomous Agents" (Lilian Weng, 2023).

Each document describes the problem in terms of the actual codebase, then proposes a concrete fix with file paths.


Index

#   Document                          Area               Priority
01  Learning from Feedback History    Memory / Learning  P2
02  RAG Recency + Importance Scoring  Retrieval          P3
03  Hallucination Detection           Reliability        P3
04  Bridge BullMQ ↔ LangGraph         Architecture       P2
05  Structured Output Contracts       Reliability        P1
06  Critic Agent Quality Gate         Quality            P1
07  Priority Queue Differentiation    Performance        P2
08  Context Window Management         Reliability        P1
09  Episodic Memory Per Tenant        Memory             P2
10  Dynamic Model Routing             Cost / Quality     P3
11  Multi-Reviewer Consensus          Quality            P4
12  Tool Usage Analytics              Observability      P3
13  Cost Circuit Breaker              Safety             P3

Priority grouping

P1 — Defensive (implement first, no architecture changes required)

These prevent bad output and silent failures in the existing pipeline.

  • 05 Structured output contracts — replace regex parsing with Zod schemas + Claude tool_use extraction
  • 06 Critic agent quality gate — blocking Haiku pass before content reaches DM review
  • 08 Context window management — token budget system, prompt builder, truncation strategy
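
As a rough illustration of the token budget and truncation idea in 08, here is a minimal sketch. The section names, the priority numbers, and the 4-characters-per-token estimate are all assumptions made for the example; the real prompt builder and tokenizer may differ.

```typescript
// Hypothetical sketch: section shape, priorities, and the token estimate
// are illustrative assumptions, not taken from the codebase.
interface PromptSection {
  name: string;
  text: string;
  priority: number; // lower number = more important
}

interface BuiltPrompt {
  prompt: string;
  usedTokens: number;
  dropped: string[];
}

const CHARS_PER_TOKEN = 4;

// Rough estimate; a real tokenizer would replace this heuristic.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / CHARS_PER_TOKEN);
}

function buildPrompt(sections: PromptSection[], budgetTokens: number): BuiltPrompt {
  // Visit sections from most to least important.
  const order = sections
    .map((_, i) => i)
    .sort((a, b) => sections[a].priority - sections[b].priority);
  const kept: (string | null)[] = sections.map(() => null);
  const dropped: string[] = [];
  let remaining = budgetTokens;

  for (const i of order) {
    const text = sections[i].text;
    const cost = estimateTokens(text);
    if (cost <= remaining) {
      kept[i] = text;
      remaining -= cost;
    } else if (remaining > 0) {
      // Partial fit: keep the head of the section and cut the tail.
      const cut = text.slice(0, remaining * CHARS_PER_TOKEN);
      kept[i] = cut;
      remaining -= estimateTokens(cut);
    } else {
      dropped.push(sections[i].name);
    }
  }

  // Emit surviving sections in their original order.
  // Separators are not counted against the budget in this sketch.
  const prompt = kept.filter((t): t is string => t !== null).join("\n\n");
  return { prompt, usedTokens: budgetTokens - remaining, dropped };
}
```

Spending the budget in priority order means the system prompt and task survive intact, while lower-priority context such as history is truncated or dropped first. Returning the dropped section names makes truncation visible instead of silent.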

P2 — Compound improvement (meaningful quality gains, moderate effort)

  • 07 Priority queue differentiation — BullMQ priority field, rejection re-runs get CRITICAL priority
  • 01 Learning from feedback history — episode retrieval layer, TenantAgentMemory model
  • 09 Episodic memory per tenant — accumulate approved-run learnings, inject into future runs
  • 04 Bridge BullMQ ↔ LangGraph — triggerAgentJob tool for executor agent (unblocks phase 3)
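
A minimal sketch of how 07 might map job types to BullMQ priorities. In BullMQ a lower `priority` number means higher priority. The tier names other than CRITICAL, the numeric values, and the `AgentJobData` shape are assumptions for illustration.

```typescript
// Hypothetical tiers and numbers; only CRITICAL is named in the list above.
type PriorityTier = "CRITICAL" | "HIGH" | "NORMAL" | "LOW";

// BullMQ convention: a lower number is a higher priority.
const TIER_TO_PRIORITY: Record<PriorityTier, number> = {
  CRITICAL: 1,
  HIGH: 10,
  NORMAL: 50,
  LOW: 100,
};

// Assumed job payload shape, for illustration only.
interface AgentJobData {
  runId: string;
  isRejectionRerun: boolean;
  tier?: PriorityTier;
}

// Rejection re-runs always jump to CRITICAL; everything else uses its
// requested tier, defaulting to NORMAL.
function jobOptionsFor(data: AgentJobData): { priority: number } {
  const tier: PriorityTier = data.isRejectionRerun
    ? "CRITICAL"
    : (data.tier ?? "NORMAL");
  return { priority: TIER_TO_PRIORITY[tier] };
}

// Usage with a BullMQ Queue (not executed here):
//   await queue.add("agent-run", data, jobOptionsFor(data));
```

Assigning an explicit priority to every job keeps the ordering predictable. How jobs without a priority are ordered relative to prioritized ones is worth verifying against the BullMQ version in use.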

P3 — Optimisation (improve over time with data)

  • 02 RAG recency + importance scoring — weighted retrieval formula
  • 03 Hallucination detection — transcript analyser, repeated-tool-call detection
  • 10 Dynamic model routing — task complexity classifier, Haiku for simple tasks
  • 12 Tool usage analytics — AgentToolCall model, citation detection, skill effectiveness dashboard
  • 13 Cost circuit breaker — per-run cost cap, daily platform limit, anomaly alerting
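
One possible shape for the weighted retrieval formula in 02, sketched in the spirit of the recency, importance, and relevance score that the Weng post describes for Generative Agents. The field names, the half-life decay, and the weights are assumptions; the formula in document 02 is not shown on this page.

```typescript
// Hypothetical chunk shape: relevance and importance are assumed to be
// normalised to [0, 1]; ageHours is time since the chunk was written.
interface MemoryChunk {
  id: string;
  relevance: number;
  importance: number;
  ageHours: number;
}

interface ScoreWeights {
  relevance: number;
  recency: number;
  importance: number;
  halfLifeHours: number; // age at which the recency term halves
}

// Exponential decay: 1 for a brand-new chunk, 0.5 after one half-life.
function recencyScore(ageHours: number, halfLifeHours: number): number {
  return Math.pow(0.5, ageHours / halfLifeHours);
}

// Weighted sum of the three signals.
function scoreChunk(chunk: MemoryChunk, w: ScoreWeights): number {
  return (
    w.relevance * chunk.relevance +
    w.recency * recencyScore(chunk.ageHours, w.halfLifeHours) +
    w.importance * chunk.importance
  );
}

// Return the ids of the top-k chunks by combined score.
function rankChunks(chunks: MemoryChunk[], w: ScoreWeights, k: number): string[] {
  return chunks
    .map((c) => ({ id: c.id, score: scoreChunk(c, w) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k)
    .map((c) => c.id);
}
```

Setting the recency and importance weights to zero reduces this to plain similarity ranking, which gives a baseline for measuring whether the extra terms improve retrieval.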

P4 — High-stakes only (significant effort, narrow application)

  • 11 Multi-reviewer consensus — devil’s advocate critic for strategy and context documents only

Cross-cutting dependencies

05 (structured output) ──→ 06 (critic gate) ──→ 11 (multi-reviewer)
07 (priority queues) ──→ 04 (BullMQ↔LangGraph bridge) ──→ 13 (circuit breaker)
08 (context budget) ──→ 13 (circuit breaker) ──→ 02 (RAG scoring)
01 (feedback history) ──→ 09 (episodic memory)
12 (tool analytics) ──→ 03 (hallucination detection) ──→ 08 (context budget — prune unused skills)
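
The dependency chains above can be checked mechanically. The sketch below assumes each arrow A → B means "A should land before B" and uses Kahn's algorithm to produce one valid order and to fail loudly on a cycle. The edge list is transcribed from the chains; the function name is hypothetical.

```typescript
type Edge = [string, string];

// Edges transcribed from the dependency chains above.
// Document 10 has no arrows, so it does not appear here.
const EDGES: Edge[] = [
  ["05", "06"], ["06", "11"],
  ["07", "04"], ["04", "13"],
  ["08", "13"], ["13", "02"],
  ["01", "09"],
  ["12", "03"], ["03", "08"],
];

// Kahn's algorithm: repeatedly emit a node with no unmet prerequisites.
function implementationOrder(edges: Edge[]): string[] {
  const indegree: Record<string, number> = {};
  const successors: Record<string, string[]> = {};
  for (const [from, to] of edges) {
    indegree[from] = indegree[from] ?? 0;
    indegree[to] = (indegree[to] ?? 0) + 1;
    (successors[from] = successors[from] ?? []).push(to);
  }

  // Keep the ready list sorted so the result is deterministic.
  const ready = Object.keys(indegree).filter((n) => indegree[n] === 0).sort();
  const order: string[] = [];
  while (ready.length > 0) {
    const node = ready.shift() as string;
    order.push(node);
    for (const next of successors[node] ?? []) {
      indegree[next] -= 1;
      if (indegree[next] === 0) {
        ready.push(next);
        ready.sort();
      }
    }
  }

  // Any node left unprocessed is part of a cycle.
  if (order.length !== Object.keys(indegree).length) {
    throw new Error("dependency cycle detected");
  }
  return order;
}
```

An order produced this way will not match the priority grouping exactly. For example, 08 is P1 while the 03 → 08 arrow points at it, which suggests that arrow describes a later refinement (pruning unused skills) rather than a hard prerequisite.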

© 2026 Leadmetrics — Internal use only