Gap 10: No Dynamic Model Routing
Problem
Adapter and model selection is static per agent role. AgentConfig.adapter and AgentConfig.model are set in the database and never change based on the actual task being executed.
This means:
- A blog-writer configured for claude-sonnet-4-6 uses the same model for a 3,000-word technical SEO deep-dive as for a 150-word social caption
- A fast, cheap task waits in the same queue and uses the same compute as a slow, complex one
- There is no cost-quality tradeoff optimisation — every task pays for the same model regardless of need
The MRKL architecture (Karpas et al., 2022) describes the LLM as a router that dispatches to the most appropriate expert module per sub-task — neural or symbolic, cheap or powerful. The same principle applies to model selection.
What to Build
1. Task complexity classifier
Before executing a generation, classify the task complexity with a simple heuristic (no LLM call needed):
// packages/agents/src/lib/model-router.ts
export type TaskComplexity = "trivial" | "simple" | "standard" | "complex" | "expert";

export function classifyTaskComplexity(params: {
  agentRole: string;
  outputLengthTarget?: number; // estimated word count
  hasRevisionHistory: boolean; // rejection re-run = harder
  contextRichness: number; // 0–1, how much context is available (not yet used by the rules below)
  requiresReasoning: boolean; // strategy/analysis vs. creative/execution
}): TaskComplexity {
  const { agentRole, outputLengthTarget, hasRevisionHistory, requiresReasoning } = params;
  const targetWords = outputLengthTarget ?? 0;

  // Role-based shortcuts; a rejection re-run of a short-form role falls through to the length rules
  if (agentRole === "social-post-writer" && !hasRevisionHistory) return "simple";
  if (agentRole === "email-writer" && !hasRevisionHistory) return "simple";
  if (agentRole === "strategy-writer") return "expert";
  if (agentRole === "context-file-writer") return "complex";

  // Length- and reasoning-based rules
  if (requiresReasoning && targetWords > 1500) return "complex";
  // Long outputs currently land in the same tier as the default; raise this if they should route higher
  if (targetWords > 2500) return "standard";

  // Default tier. No rule returns "trivial" yet.
  return "standard";
}
2. Model selection table
export const MODEL_SELECTION: Record<TaskComplexity, { adapter: string; model: string; maxTokens: number }> = {
  trivial:  { adapter: "claude_local", model: "claude-haiku-4-5-20251001", maxTokens: 1_024 },
  simple:   { adapter: "claude_local", model: "claude-haiku-4-5-20251001", maxTokens: 2_048 },
  standard: { adapter: "claude_local", model: "claude-sonnet-4-6", maxTokens: 8_192 },
  complex:  { adapter: "claude_local", model: "claude-sonnet-4-6", maxTokens: 16_384 },
  expert:   { adapter: "claude_local", model: "claude-opus-4-7", maxTokens: 32_768 },
};
The AgentConfig.model in DB becomes a floor: the minimum model allowed for that role. Dynamic routing can only upgrade, never downgrade below the configured floor.
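The floor rule needs some way to compare two models. A minimal sketch, assuming a hand-maintained ranking table: MODEL_RANK and applyFloor are hypothetical names, and isMoreCapable is named to match the helper that selectModel calls in step 3 (its real implementation is not shown in this document). The numeric ranks are arbitrary; only their order matters.

```typescript
// Hypothetical capability ranking: a higher number means a more capable model.
// The order (haiku < sonnet < opus) mirrors the tiers in MODEL_SELECTION.
const MODEL_RANK: Record<string, number> = {
  "claude-haiku-4-5-20251001": 1,
  "claude-sonnet-4-6": 2,
  "claude-opus-4-7": 3,
};

// True only when `candidate` ranks strictly above `floor`.
// Unknown model names rank 0, so an unrecognised candidate never overrides a known floor.
export function isMoreCapable(candidate: string, floor: string): boolean {
  return (MODEL_RANK[candidate] ?? 0) > (MODEL_RANK[floor] ?? 0);
}

// Applies the floor rule: keep the dynamically chosen model only if it is an upgrade.
export function applyFloor(dynamicModel: string, floorModel: string): string {
  return isMoreCapable(dynamicModel, floorModel) ? dynamicModel : floorModel;
}
```

With this ranking, a role whose floor is claude-sonnet-4-6 stays on sonnet for a "simple" task (the table would suggest haiku) and moves to opus for an "expert" task.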
3. Cost-aware routing with budget check
Before routing to an expensive model, check whether the tenant’s credit balance can afford the estimated cost:
// Assumes isMoreCapable, estimateCost, and checkCreditBalance are defined elsewhere
// in this module; they are not shown in this document.
export async function selectModel(
  agentRole: string, // currently unused by the routing logic
  complexity: TaskComplexity,
  tenantId: string,
  agentConfig: AgentConfig
): Promise<{ adapter: string; model: string; maxTokens: number }> {
  const dynamic = MODEL_SELECTION[complexity];

  // Never downgrade below configured floor
  const selectedModel = isMoreCapable(dynamic.model, agentConfig.model)
    ? dynamic
    : { adapter: agentConfig.adapter, model: agentConfig.model, maxTokens: dynamic.maxTokens };

  // Estimate cost and check credits
  const estimatedCost = estimateCost(selectedModel.model, selectedModel.maxTokens);
  const hasCredits = await checkCreditBalance(tenantId, estimatedCost);

  if (!hasCredits && selectedModel.model === "claude-opus-4-7") {
    // Downgrade to sonnet if out of credits for opus.
    // Note: this overrides the floor when the configured floor is itself opus.
    return MODEL_SELECTION["complex"];
  }

  return selectedModel;
}
4. Log model selection decision in AgentRun
model AgentRun {
  // existing
  adapter          String?
  model            String?
  // new
  taskComplexity   String? // trivial | simple | standard | complex | expert
  modelRoutedTo    String? // actual model used (may differ from agentConfig.model)
  modelRouteReason String? // "complexity:complex", "revision_history", "credit_limit"
}
5. Model routing analytics in manage portal
Add a tab to /agents?tab=analytics showing:
- Model distribution per agent role (how often does each complexity tier get selected?)
- Cost savings from haiku vs. sonnet vs. opus routing
- Correlation between model selection and quality scores
This gives visibility into whether the routing is working — if trivial tasks are routing to opus, the heuristics need tuning.
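As a rough illustration of the first metric and of the "trivial tasks on opus" check, a small in-memory aggregation over AgentRun rows might look like the sketch below. It assumes the rows are already loaded and carry an agentRole field (an assumption; the schema excerpt does not show one). RoutedRun, modelDistribution, and lowTierRunsOn are hypothetical names.

```typescript
// Minimal shape of the AgentRun fields this sketch reads.
// `agentRole` is an assumed field, not confirmed by the schema excerpt.
interface RoutedRun {
  agentRole: string;
  taskComplexity: string | null;
  modelRoutedTo: string | null;
}

// Counts runs per role and model: result[role][model] = number of runs.
export function modelDistribution(runs: RoutedRun[]): Record<string, Record<string, number>> {
  const result: Record<string, Record<string, number>> = {};
  for (const run of runs) {
    const model = run.modelRoutedTo ?? "unknown";
    const perRole = result[run.agentRole] ?? (result[run.agentRole] = {});
    perRole[model] = (perRole[model] ?? 0) + 1;
  }
  return result;
}

// Returns runs classified as trivial or simple that still ran on `expensiveModel`.
// A non-empty result suggests the heuristics (or a role's floor) deserve a look.
const LOW_TIERS = new Set(["trivial", "simple"]);
export function lowTierRunsOn(runs: RoutedRun[], expensiveModel: string): RoutedRun[] {
  return runs.filter(
    (run) =>
      run.taskComplexity !== null &&
      LOW_TIERS.has(run.taskComplexity) &&
      run.modelRoutedTo === expensiveModel
  );
}
```

Note that a low-tier run on an expensive model is not always a routing bug: a role whose configured floor is opus will land there by design.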
Files to Change
- New file: packages/agents/src/lib/model-router.ts
- packages/agents/src/workers/blog-writer.worker.ts: call selectModel() before adapter.execute()
- packages/agents/src/workers/social-post-writer.worker.ts: same (big win: haiku for simple captions)
- packages/agents/src/workers/setup.worker.ts: same
- packages/db/prisma/schema.prisma: add routing fields to AgentRun
- apps/dashboard/src/app/(dashboard)/... (manage agent analytics): model breakdown chart
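The worker changes all follow the same order: pick a model, execute, then record the routing decision. The sketch below shows that order with injected dependencies so it runs on its own. RoutingDeps, runWithRouting, and recordRun are hypothetical names, and the real selectModel and adapter.execute() signatures differ from the simplified ones assumed here.

```typescript
type TaskComplexity = "trivial" | "simple" | "standard" | "complex" | "expert";

interface ModelChoice {
  adapter: string;
  model: string;
  maxTokens: number;
}

// Fields written to AgentRun, matching the new columns proposed in step 4.
interface RoutingLog {
  taskComplexity: string;
  modelRoutedTo: string;
  modelRouteReason: string;
}

// Hypothetical dependencies, injected so the sketch does not need the real worker code.
interface RoutingDeps {
  selectModel: (complexity: TaskComplexity) => Promise<ModelChoice>;
  execute: (choice: ModelChoice, prompt: string) => Promise<string>;
  recordRun: (fields: RoutingLog) => Promise<void>;
}

// Selects a model, runs the generation with it, then logs what was chosen and why.
export async function runWithRouting(
  deps: RoutingDeps,
  complexity: TaskComplexity,
  prompt: string
): Promise<string> {
  const choice = await deps.selectModel(complexity);
  const output = await deps.execute(choice, prompt);
  await deps.recordRun({
    taskComplexity: complexity,
    modelRoutedTo: choice.model,
    // Simplification: always logs the complexity-based reason. A fuller version
    // would have selectModel report "credit_limit" or "revision_history" itself.
    modelRouteReason: `complexity:${complexity}`,
  });
  return output;
}
```

Logging after execution keeps the record tied to a run that actually happened; whether failed runs should also be logged is left open here.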
Related
- Gap 8: Context window management (smaller models have smaller context windows; budget must account for this)
- Gap 13: Cost circuit breaker (model routing is the first line of cost control; circuit breaker is the safety net)