Cost & Usage Tracking


[Live — April 2026] Credits System

Leadmetrics uses a credit-based model for agent content generation. Credits are the subscription currency; LLM USD costs are tracked separately for internal visibility.

Package — @leadmetrics/billing

    packages/billing/src/
      index.ts     # public exports
      rates.ts     # CREDIT_RATES constant
      balance.ts   # getOrCreateCreditBalance, reserveCredits, consumeCredits, releaseCredits, getCreditCost

Prisma Models (packages/db/prisma/schema.prisma)

| Model | Purpose |
| --- | --- |
| CreditBalance | One row per tenant: available, reserved, consumed, totalAllocated |
| CreditLedger | Immutable audit trail; type ∈ allocated / consumed / reserved / released / adjusted / topped_up |
| CreditTopupOrder | Razorpay top-up order records |

Credit Rates

| Credit type | Rate | Agent / worker |
| --- | --- | --- |
| context_file | 2 | setup.worker (context-file-writer step only) |
| strategy | 5 | strategy-writer.worker |
| deliverable_planner | 1 | strategy.worker |
| activity_planner | 2 | activity.worker |
| blog_post | 2 | blog-writer.worker |
| social_post | 1 | social-post-writer.worker |
| landing_page | 3 | landing-page-writer.worker |
| gbp_post | 1 | gbp-post-writer.worker |
| email_newsletter | 2 | email-writer.worker |
| social_calendar | 1 | social-calendar-planner.worker |
| google_ads_rsa | 2 | google-ads-writer.worker |
| meta_ads | 2 | meta-ads-writer.worker |
| report | 2 | report-writer.worker + custom-report-writer.worker |
| research_notes | 1 | research-note-writer.worker |
| topic_research | 1 | topic-researcher.worker |
| ads_analysis | 1 | ads-analyst.worker |
| anomaly_detection | 1 | anomaly-detector.worker |
| review_response | 1 | review-response-writer.worker |
| site_audit | 2 | site-auditor.worker |
| keyword_cluster | 1 | keyword-researcher.worker |
| content_brief | 1 | content-brief-writer.worker |
| backlink_research | 1 | backlink-researcher.worker |
| backlink_outreach | 1 per email | backlink-outreach-writer.worker |
| channel_insight | 1 | all 8 channel insight workers (via insight-worker-base) |
| ai_narrative | 1 | brand-narrative-analyst.worker |
| ai_visibility | 1 per run | ai-visibility-monitor.worker |

Not charged: rag-ingestion (indexing utility), website-crawler (no LLM), social-post-designer (image generation), blog-faq-writer (secondary enrichment of an already-charged blog post), ai-visibility-seeder (setup utility).
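
As an illustration of how the rate table might be expressed in rates.ts, here is a minimal sketch. Only a handful of rows are shown. The CreditType union, the quantity parameter, and the function body are assumptions made for this example, not the actual package code.

```typescript
// Hypothetical excerpt: the real CREDIT_RATES constant is assumed to cover every row of the table.
const CREDIT_RATES = {
  context_file: 2,
  strategy: 5,
  blog_post: 2,
  landing_page: 3,
  backlink_outreach: 1, // charged per email
  ai_visibility: 1, // charged per run
} as const;

type CreditType = keyof typeof CREDIT_RATES;

// Assumed signature: quantity lets per-unit types (one credit per outreach email) scale.
function getCreditCost(creditType: CreditType, quantity = 1): number {
  if (!Number.isInteger(quantity) || quantity < 1) {
    throw new RangeError(`quantity must be a positive integer, got ${quantity}`);
  }
  return CREDIT_RATES[creditType] * quantity;
}
```

Under these assumptions, a strategy document costs 5 credits and a batch of four outreach emails costs 4.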

Worker Patterns

Pattern A — createContentWorker (14 workers): Pass creditType in options; billing is handled centrally in content.worker.ts:

createContentWorker("email-writer", { creditType: "email_newsletter" })

Pattern B — Standalone workers: Reserve before execute, release on all failure paths, consume after DB save. All consumeCredits/releaseCredits calls are non-fatal (.catch(log.warn)).
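
The Pattern B flow can be sketched as follows. The balance is an in-memory stand-in for the CreditBalance row, and the signatures of reserveCredits, consumeCredits and releaseCredits are assumed (the real ones live in @leadmetrics/billing and are likely to take a tenant ID). runStandaloneJob is a hypothetical name.

```typescript
// In-memory stand-in for one tenant's CreditBalance row (assumption for this sketch).
type Balance = { available: number; reserved: number; consumed: number };
const balance: Balance = { available: 10, reserved: 0, consumed: 0 };

async function reserveCredits(amount: number): Promise<void> {
  if (balance.available < amount) throw new Error("insufficient credits");
  balance.available -= amount;
  balance.reserved += amount;
}

async function consumeCredits(amount: number): Promise<void> {
  balance.reserved -= amount;
  balance.consumed += amount;
}

async function releaseCredits(amount: number): Promise<void> {
  balance.reserved -= amount;
  balance.available += amount;
}

const log = { warn: (err: unknown) => console.warn("billing call failed", err) };

// Reserve first, then consume on success or release on failure.
async function runStandaloneJob<T>(cost: number, work: () => Promise<T>): Promise<T> {
  await reserveCredits(cost); // fatal: without credits the job must not run
  try {
    const result = await work(); // LLM call plus DB save
    await consumeCredits(cost).catch(log.warn); // non-fatal
    return result;
  } catch (err) {
    await releaseCredits(cost).catch(log.warn); // non-fatal
    throw err;
  }
}
```

Keeping consume and release non-fatal means a billing hiccup never discards content that was already generated and saved; the trade-off is that a failed call leaves credits in the reserved state until something reconciles them.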

Insight workers: Billing wired once in packages/agents/src/workers/insights/insight-worker-base.ts — all 8 channel insight workers inherit it automatically.

API

  • GET /admin/v1/billing/tenant-usage?tenantId= — requireSuperAdmin; returns credits + recent ledger + LLM 30-day aggregates
  • POST /admin/v1/billing/credits/adjust — manual credit adjustment

UI Screens

Dashboard /usage — Credits & Usage: balance progress bar (violet/amber/red thresholds), Razorpay top-up, recent ledger.

Dashboard /costs — LLM Costs: 30-day totals, daily bar chart, by-agent-role table, by-model table. Data source: AgentRun table (costUsd, inputTokens, outputTokens).

Dashboard sidebar — “Billing” NavGroup: Credits & Usage (Coins icon) + LLM Costs (DollarSign icon), between Audit Logs and Settings.

Manage portal /tenants/[id] → Credits & LLM tab — per-tenant credit balance, adjust form (amount + note), LLM spend aggregates, recent ledger.


Purpose

Monitor and control LLM spending across all agents, activities, and tenants. Every dollar spent on inference must be attributable, auditable, and controllable. Costs are surfaced at every level of the system so over-runs are caught early and billed back to the right tenant.

Related: Database (llm_calls, billing_events, agent_monthly_spend tables) | Multi-Tenancy (per-tenant cost caps) | Agent Hierarchy (cost caps per agent role in agent_configs)


Budget Cap Hierarchy

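Four nested levels of budget cap apply, numbered from the most granular (Level 1, per-task) to the broadest (Level 4, tenant monthly). An agent task is blocked if the cap at any level would be exceeded. At dispatch time the checks run broadest-first (see Enforcement Order at Task Dispatch).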

    ┌──────────────────────────────────────────────────────────┐
    │ Tenant monthly cap                                       │ ← Platform-wide ceiling per client
    │ └── Agent monthly cap (per role)                         │ ← Monthly ceiling per agent type
    │     └── Campaign budget cap                              │ ← Per-campaign spending limit
    │         └── Per-task cost cap                            │ ← Single execution ceiling
    └──────────────────────────────────────────────────────────┘

What Gets Tracked

| Event | What’s captured |
| --- | --- |
| Every LLM call | model, input tokens, output tokens, cost USD, duration, prompt hash, task run |
| Every tool call | tool name, method, duration, success/failure |
| Agent monthly spend | cumulative cost per agent role per tenant per calendar month |
| Tenant monthly spend | cumulative cost across all agents per tenant per calendar month |
| Budget cap breaches | level (task / campaign / agent / tenant), threshold hit, action taken |
| Session totals | cumulative tokens across all calls in a session |

Cost Calculation

Cost is calculated immediately after each LLM call completes, before the result is returned to the worker.

    const MODEL_PRICING: Record<string, { inputPer1M: number; outputPer1M: number }> = {
      'claude-sonnet-4-6':         { inputPer1M: 3.00,  outputPer1M: 15.00 },
      'claude-opus-4-6':           { inputPer1M: 15.00, outputPer1M: 75.00 },
      'claude-haiku-4-5-20251001': { inputPer1M: 0.25,  outputPer1M: 1.25 },
      'gpt-4o':                    { inputPer1M: 2.50,  outputPer1M: 10.00 },
      'gpt-4o-mini':               { inputPer1M: 0.15,  outputPer1M: 0.60 },
      'gemma3:4b':                 { inputPer1M: 0,     outputPer1M: 0 }, // local
      'llama3.2':                  { inputPer1M: 0,     outputPer1M: 0 }, // local
      'mistral':                   { inputPer1M: 0,     outputPer1M: 0 }, // local
    };

    function calculateCost(model: string, inputTokens: number, outputTokens: number): number {
      const p = MODEL_PRICING[model] ?? { inputPer1M: 0, outputPer1M: 0 };
      return (inputTokens / 1_000_000) * p.inputPer1M + (outputTokens / 1_000_000) * p.outputPer1M;
    }

Pricing is maintained in packages/agent-engine/src/pricing.ts. When a provider changes pricing, update MODEL_PRICING — no DB migration needed.
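
A worked example of the formula, using a trimmed copy of the pricing table so the snippet runs on its own. PRICING and costUsd are stand-in names for this sketch, and the token counts are invented.

```typescript
// Trimmed copy of the pricing table, just enough for a worked example.
const PRICING: Record<string, { inputPer1M: number; outputPer1M: number }> = {
  "claude-sonnet-4-6": { inputPer1M: 3.0, outputPer1M: 15.0 },
  "gemma3:4b": { inputPer1M: 0, outputPer1M: 0 }, // local
};

function costUsd(model: string, inputTokens: number, outputTokens: number): number {
  const p = PRICING[model] ?? { inputPer1M: 0, outputPer1M: 0 };
  return (inputTokens / 1_000_000) * p.inputPer1M + (outputTokens / 1_000_000) * p.outputPer1M;
}

// 12,000 input tokens  -> 0.012 * $3.00  = $0.036
//  3,000 output tokens -> 0.003 * $15.00 = $0.045
const example = costUsd("claude-sonnet-4-6", 12_000, 3_000);
console.log(example.toFixed(6)); // 0.081000

// A model missing from the table is priced at zero, exactly like a local model.
console.log(costUsd("not-in-table", 12_000, 3_000)); // 0
```

The second call shows a side effect of the `??` fallback: a paid model that has not yet been added to the table is recorded as free, so a new model should be added to MODEL_PRICING before it is used.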


LLM Call Logging

Every call is written to llm_calls immediately after the stream closes:

    await db.insert(llmCalls).values({
      id: crypto.randomUUID(),
      taskRunId: taskRun.id,
      tenantId: tenant.id,
      agentRole: agentConfig.role,
      sessionId: session?.id ?? null,
      model,
      promptHash: sha256(fullPrompt), // no raw prompt stored (privacy)
      responseHash: sha256(responseText),
      inputTokens,
      outputTokens,
      costUsd: calculateCost(model, inputTokens, outputTokens),
      durationMs,
      createdAt: new Date(),
    });

After logging, three counters are incremented atomically in a single transaction:

    await db.transaction(async (tx) => {
      // 1. Campaign running total
      await tx.update(campaigns)
        .set({ totalCostUsd: sql`total_cost_usd + ${costUsd}` })
        .where(eq(campaigns.id, campaignId));

      // 2. Agent monthly spend (upsert: the row is created on the first spend of the month)
      await tx.insert(agentMonthlySpend)
        .values({ tenantId, agentRole, month: startOfMonth(), totalCostUsd: costUsd })
        .onConflictDoUpdate({
          target: [agentMonthlySpend.tenantId, agentMonthlySpend.agentRole, agentMonthlySpend.month],
          set: { totalCostUsd: sql`total_cost_usd + ${costUsd}` },
        });

      // 3. Tenant monthly spend
      await tx.update(tenants)
        .set({ currentMonthSpendUsd: sql`current_month_spend_usd + ${costUsd}` })
        .where(eq(tenants.id, tenantId));
    });

Budget Cap Enforcement

Level 1 — Per-task cap

Each agent config has an optional maxCostUsdPerTask. The executor tracks running cost mid-stream and aborts if the cap is hit:

    let runningCost = 0;
    for await (const event of executor.execute(request)) {
      if (event.type === 'text_delta') {
        runningCost = estimateCostFromStreamedTokens(model, estimatedInputTokens, streamedOutputTokens);
        if (agentConfig.limits.maxCostUsdPerTask && runningCost > agentConfig.limits.maxCostUsdPerTask) {
          executor.abort();
          throw new CostCapExceededError(`Task cost cap of $${agentConfig.limits.maxCostUsdPerTask} exceeded`);
        }
      }
      yield event;
    }

Level 2 — Per-campaign cap

Before each LLM call, the worker checks the campaign’s remaining budget:

    async function checkCampaignBudget(campaignId: string, estimatedCallCost: number): Promise<void> {
      const campaign = await db.query.campaigns.findFirst({ where: eq(campaigns.id, campaignId) });
      if (!campaign?.budgetCapUsd) return; // campaign not found, or no cap set

      if (campaign.totalCostUsd + estimatedCallCost > campaign.budgetCapUsd) {
        await db.update(campaigns)
          .set({ status: 'paused', pauseReason: 'budget_exceeded' })
          .where(eq(campaigns.id, campaignId));
        await drainCampaignQueueJobs(campaignId);
        await notifyBudgetBreached({ level: 'campaign', campaign });
        throw new BudgetCapExceededError(`Campaign budget of $${campaign.budgetCapUsd} would be exceeded`);
      }
    }

When paused by budget, all pending tasks for that campaign are drained from the queue. The campaign stays paused until a human increases the budget cap or the month resets (see Month Auto-Reset).

Level 3 — Per-agent monthly cap (NEW)

Each agent config has an optional monthlyBudgetCapUsd. This is a calendar-month ceiling, not a rolling 30-day window: it limits how much any single agent role can spend across all tasks for a tenant within the month, and the count starts again on the 1st.

Before dispatching a task to an agent, the worker checks the current month’s accumulated spend for that agent role:

    async function checkAgentMonthlyBudget(
      tenantId: string,
      agentRole: AgentRole,
      estimatedCallCost: number,
    ): Promise<'ok' | 'warning'> {
      const config = await db.query.agentConfigs.findFirst({
        where: and(eq(agentConfigs.tenantId, tenantId), eq(agentConfigs.role, agentRole)),
      });

      // numeric columns come back from the driver as strings, so convert before doing arithmetic
      const cap = Number(config?.monthlyBudgetCapUsd ?? 0);
      if (!cap) return 'ok'; // no config row, or no cap set

      const spend = await db.query.agentMonthlySpend.findFirst({
        where: and(
          eq(agentMonthlySpend.tenantId, tenantId),
          eq(agentMonthlySpend.agentRole, agentRole),
          eq(agentMonthlySpend.month, startOfMonth()),
        ),
      });
      const currentSpend = Number(spend?.totalCostUsd ?? 0);
      const usagePct = currentSpend / cap;

      // An exceeded cap is reported by throwing, so 'exceeded' is never a return value
      if (currentSpend + estimatedCallCost > cap) {
        await pauseAgent(tenantId, agentRole, 'monthly_budget_exceeded');
        await notifyBudgetBreached({ level: 'agent', agentRole, tenantId });
        throw new BudgetCapExceededError(`Agent ${agentRole} monthly budget of $${cap} exceeded`);
      }
      if (usagePct >= 0.8) return 'warning'; // caller injects prompt warning (see below)
      return 'ok';
    }

When an agent is paused by monthly budget, only that agent role is paused for that tenant. Other agents continue working. The agent resumes automatically on the 1st of the next month.

Level 4 — Tenant monthly cap (NEW)

The tenants.monthlySpendCapUsd field defines the total monthly LLM spend ceiling for the tenant across all agents. Checked before every task dispatch:

    async function checkTenantMonthlyBudget(tenantId: string, estimatedCallCost: number): Promise<void> {
      const tenant = await db.query.tenants.findFirst({ where: eq(tenants.id, tenantId) });
      if (!tenant?.monthlySpendCapUsd) return; // tenant not found, or no cap set

      const usagePct = tenant.currentMonthSpendUsd / tenant.monthlySpendCapUsd;
      if (tenant.currentMonthSpendUsd + estimatedCallCost > tenant.monthlySpendCapUsd) {
        // Pause ALL agent queues for this tenant
        await pauseAllAgentsForTenant(tenantId, 'tenant_monthly_budget_exceeded');
        await notifyBudgetBreached({ level: 'tenant', tenantId });
        throw new BudgetCapExceededError(`Tenant monthly budget of $${tenant.monthlySpendCapUsd} exceeded`);
      }
      if (usagePct >= 0.8) {
        // One-time alert, deduplicated per day
        await sendBudgetWarning({ level: 'tenant', tenantId, usagePct });
      }
    }

When the tenant cap is hit, all agent queues for that tenant are drained. No new tasks are dispatched until the cap is raised (by the tenant admin or super admin) or the month resets.


Agent Warning at 80% (NEW)

When an agent’s monthly spend reaches 80% of its cap, the next task it receives includes a budget warning injected into its system prompt. This makes the agent itself aware of the constraint — it self-limits by being more concise, skipping optional research steps, and prioritising the core deliverable.

The warning is injected in the adapter layer, before the LLM call:

    async function buildSystemPromptWithBudgetWarning(
      baseSystemPrompt: string,
      tenantId: string,
      agentRole: AgentRole,
    ): Promise<string> {
      const budgetStatus = await checkAgentMonthlyBudget(tenantId, agentRole, 0);
      if (budgetStatus !== 'warning') return baseSystemPrompt;

      const spend = await getAgentMonthlySpend(tenantId, agentRole);
      const config = await getAgentConfig(tenantId, agentRole);
      const pct = Math.round((spend / config.monthlyBudgetCapUsd) * 100);

      const warning = `
    ⚠️ BUDGET NOTICE: You have used ${pct}% of your monthly token budget for this client.
    You are approaching your limit. For this task:
    - Be concise in your responses
    - Skip optional research or elaboration steps
    - Focus only on the core deliverable
    - Avoid making unnecessary tool calls
    `;
      return warning + '\n\n' + baseSystemPrompt;
    }

This warning is prepended to the system prompt for all three adapters (Claude, OpenAI, Ollama). It is not stored in the task run’s prompt — it is injected ephemerally at dispatch time.

The warning fires once per task when the agent is in the 80–99% range. At 100%, the agent is hard-stopped (no task is dispatched at all).
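
The 80% and 100% thresholds reduce to a small pure function. The sketch below is illustrative only: classifyBudget is a hypothetical name, and unlike checkAgentMonthlyBudget it ignores the estimated cost of the next call and returns a status instead of throwing.

```typescript
type BudgetStatus = "ok" | "warning" | "exceeded";

// Classifies spend against a cap using the 80% / 100% thresholds.
// A null or non-positive cap is treated as "no cap", so the status is always "ok".
function classifyBudget(spendUsd: number, capUsd: number | null): BudgetStatus {
  if (capUsd === null || capUsd <= 0) return "ok";
  const usagePct = spendUsd / capUsd;
  if (usagePct >= 1) return "exceeded"; // hard stop: no task is dispatched
  if (usagePct >= 0.8) return "warning"; // budget notice is prepended to the system prompt
  return "ok";
}
```

For example, with a $100 cap the function returns "ok" at $79.99, "warning" from $80.00 to $99.99, and "exceeded" at $100.00.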


Budget Alert System

Alerts fire at two thresholds. The 80% alert notifies humans; the 100% alert also blocks execution.

| Threshold | Action: Campaign cap | Action: Agent monthly cap | Action: Tenant monthly cap |
| --- | --- | --- | --- |
| 80% | Slack alert to DM Portal | Slack alert + warning injected into agent’s next prompt | Slack alert to tenant admin |
| 100% | Campaign paused, queue drained | Agent paused for remainder of month | All agents paused, queue drained |

    // Budget alert check: BullMQ repeatable job, runs every 15 minutes
    async function checkBudgetAlerts(): Promise<void> {
      // Campaign-level
      const campaignsWithCaps = await db.query.campaigns.findMany({
        where: and(isNotNull(campaigns.budgetCapUsd), eq(campaigns.status, 'active')),
      });
      for (const campaign of campaignsWithCaps) {
        const usagePct = campaign.totalCostUsd / campaign.budgetCapUsd;
        if (usagePct >= 0.8 && usagePct < 1.0) {
          await sendAlertOnce(`campaign:${campaign.id}:80pct`, {
            text: `⚠️ Campaign "${campaign.name}" is at ${Math.round(usagePct * 100)}% of its budget ($${campaign.totalCostUsd.toFixed(2)} / $${campaign.budgetCapUsd})`,
          });
        }
      }

      // Tenant-level
      const tenantsWithCaps = await db.query.tenants.findMany({
        where: isNotNull(tenants.monthlySpendCapUsd),
      });
      for (const tenant of tenantsWithCaps) {
        const usagePct = tenant.currentMonthSpendUsd / tenant.monthlySpendCapUsd;
        if (usagePct >= 0.8 && usagePct < 1.0) {
          await sendAlertOnce(`tenant:${tenant.id}:80pct:${startOfMonth()}`, {
            text: `⚠️ Tenant "${tenant.name}" is at ${Math.round(usagePct * 100)}% of their monthly LLM budget`,
            recipient: tenant.adminEmail,
          });
        }
      }
    }

sendAlertOnce deduplicates using a Redis key with a 24-hour TTL so the same alert does not fire on every check cycle.
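
The deduplication idea can be sketched with an in-memory map standing in for Redis. sendOnce is a hypothetical name with a simplified signature (it takes a callback, not an alert payload), and the now parameter exists only to make the behaviour easy to exercise.

```typescript
// In-memory stand-in for a Redis key with a TTL (assumption: the real helper uses Redis).
const seen = new Map<string, number>(); // alert key -> expiry timestamp in ms
const DAY_MS = 24 * 60 * 60 * 1000;

// Runs send() only if no unexpired entry exists for this key; returns whether it fired.
function sendOnce(
  key: string,
  send: () => void,
  now: number = Date.now(),
  ttlMs: number = DAY_MS,
): boolean {
  const expiresAt = seen.get(key);
  if (expiresAt !== undefined && expiresAt > now) return false; // still deduplicated
  seen.set(key, now + ttlMs);
  send();
  return true;
}
```

With a 15-minute check cycle, an alert key fires on the first cycle that crosses 80% and is then suppressed for the next 24 hours. In Redis the check-and-set would need to be a single atomic operation (for example SET with the NX and EX options) so that two workers cannot both send.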


Month Auto-Reset (NEW)

On the 1st of every calendar month at 00:00 UTC, a BullMQ cron job resets all monthly spend counters and resumes agents/campaigns that were paused solely due to budget exhaustion.

    // Registered as a repeatable job at startup
    await budgetResetQueue.add('monthly-budget-reset', {}, {
      repeat: { pattern: '0 0 1 * *', tz: 'UTC' }, // 1st of every month at midnight UTC
    });

    // Processor
    async function monthlyBudgetReset(): Promise<void> {
      await db.transaction(async (tx) => {
        // 1. Reset all tenant monthly spend counters
        await tx.update(tenants)
          .set({ currentMonthSpendUsd: 0 });

        // 2. agent_monthly_spend needs no reset: past months are kept as history,
        //    and the new month's rows are created on first spend

        // 3. Resume campaigns paused by budget
        await tx.update(campaigns)
          .set({ status: 'active', pauseReason: null })
          .where(eq(campaigns.pauseReason, 'budget_exceeded'));

        // 4. Resume agents paused by an agent-level or tenant-level monthly cap
        //    (assumes pauseAllAgentsForTenant records its reason in agentConfigs.pauseReason)
        await tx.update(agentConfigs)
          .set({ status: 'active', pauseReason: null })
          .where(inArray(agentConfigs.pauseReason, [
            'monthly_budget_exceeded',
            'tenant_monthly_budget_exceeded',
          ]));
      });

      // Re-enqueue any campaigns that were waiting for budget reset
      await requeuePausedCampaignTasks();
      await logger.info('Monthly budget reset complete');
    }

Important: Only campaigns/agents paused with pauseReason = 'budget_exceeded', 'monthly_budget_exceeded' or 'tenant_monthly_budget_exceeded' are resumed. Campaigns paused for other reasons (e.g. 'manual_pause', 'approval_pending') are not touched.


New Database Table: agent_monthly_spend

Added to providers/provider-db/src/schema/billing.ts:

    export const agentMonthlySpend = pgTable('agent_monthly_spend', {
      id: uuid('id').primaryKey().defaultRandom(),
      tenantId: uuid('tenant_id').notNull().references(() => tenants.id),
      agentRole: text('agent_role').notNull(),
      month: date('month').notNull(), // always 1st of month: '2026-04-01'
      totalCostUsd: numeric('total_cost_usd', { precision: 10, scale: 6 }).notNull().default('0'),
      updatedOn: timestamp('updated_on').notNull().defaultNow(),
    }, (t) => ({
      uniqueIdx: uniqueIndex('agent_monthly_spend_unique').on(t.tenantId, t.agentRole, t.month),
      tenantIdx: index('agent_monthly_spend_tenant_idx').on(t.tenantId),
    }));

This table is append-friendly — one row per (tenant, agent role, month). Historical months are never updated, giving a full spend history per agent over time.
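
The month key has to be computed the same way everywhere it is written or queried, otherwise spend lands in the wrong row around month boundaries. The startOfMonth() helper is not shown in this document; the sketch below is one possible UTC-based version, under the hypothetical name startOfMonthUtc.

```typescript
// Hypothetical helper: first day of the month containing `now`, in UTC, as 'YYYY-MM-DD'.
function startOfMonthUtc(now: Date = new Date()): string {
  const first = new Date(Date.UTC(now.getUTCFullYear(), now.getUTCMonth(), 1));
  return first.toISOString().slice(0, 10);
}

console.log(startOfMonthUtc(new Date("2026-04-17T13:45:00Z"))); // 2026-04-01
// 23:59 on 30 April in UTC-5 is already 1 May in UTC:
console.log(startOfMonthUtc(new Date("2026-04-30T23:59:59-05:00"))); // 2026-05-01
```

Using UTC keeps the key aligned with the reset job, which runs at 00:00 UTC on the 1st.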


New Field on agent_configs: monthlyBudgetCapUsd

    // Added to the agent_configs table
    monthlyBudgetCapUsd: numeric('monthly_budget_cap_usd', { precision: 10, scale: 2 }),
    // null = no monthly cap for this agent

Configurable per agent role per tenant in the Dashboard → Team → Agent configuration modal, and globally in the Manage App → Agent Configs (M4).


Enforcement Order at Task Dispatch

Every time a task is about to be dispatched to an agent, the budget checks run in this order, from the broadest cap to the most granular:

1. checkTenantMonthlyBudget(tenantId, estimatedCost)
   → Throws if tenant monthly cap exceeded
   → Warning if ≥80% (Slack alert, deduplicated)
2. checkAgentMonthlyBudget(tenantId, agentRole, estimatedCost)
   → Throws if agent monthly cap exceeded (agent paused for month)
   → Returns 'warning' if ≥80%
3. checkCampaignBudget(campaignId, estimatedCost)
   → Throws if campaign cap exceeded (campaign paused)
4. Build system prompt
   → If step 2 returned 'warning': prepend budget notice to system prompt
   → Agent receives warning; self-limits accordingly
5. Dispatch task → execute
   → Per-task cap enforced mid-stream (abort if exceeded)
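
Steps 1 to 4 can be sketched as a single function. The BudgetChecks interface and prepareDispatch are hypothetical; they stand in for the real check functions so that the ordering can be shown in isolation.

```typescript
type AgentStatus = "ok" | "warning";

// Hypothetical seam: each method wraps one of the real checks for a fixed tenant/agent/campaign.
interface BudgetChecks {
  checkTenant(): Promise<void>; // throws when the tenant cap would be exceeded
  checkAgent(): Promise<AgentStatus>; // throws when the agent cap would be exceeded
  checkCampaign(): Promise<void>; // throws when the campaign cap would be exceeded
}

// Runs the checks broadest-first and returns the system prompt to dispatch with.
// Any check that throws aborts the dispatch before a prompt is built.
async function prepareDispatch(
  checks: BudgetChecks,
  baseSystemPrompt: string,
  budgetNotice: string,
): Promise<string> {
  await checks.checkTenant();
  const agentStatus = await checks.checkAgent();
  await checks.checkCampaign();
  return agentStatus === "warning"
    ? `${budgetNotice}\n\n${baseSystemPrompt}`
    : baseSystemPrompt;
}
```

Running the tenant check first means a tenant that is already over its cap fails fast, before any agent or campaign rows are read.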

Aggregation

By campaign

    SELECT
      c.id,
      c.name,
      cl.name AS client_name,
      COALESCE(SUM(l.cost_usd), 0) AS total_cost_usd,
      SUM(l.input_tokens) AS total_input_tokens,
      SUM(l.output_tokens) AS total_output_tokens,
      COUNT(l.id) AS total_calls
    FROM campaigns c
    LEFT JOIN clients cl ON cl.id = c.client_id
    LEFT JOIN tasks t ON t.campaign_id = c.id
    LEFT JOIN task_runs tr ON tr.task_id = t.id
    LEFT JOIN llm_calls l ON l.task_run_id = tr.id
    GROUP BY c.id, cl.name;

By agent role (monthly)

    SELECT
      ams.agent_role,
      ams.month,
      ams.total_cost_usd,
      ac.monthly_budget_cap_usd,
      ROUND((ams.total_cost_usd / NULLIF(ac.monthly_budget_cap_usd, 0)) * 100, 1) AS pct_used
    FROM agent_monthly_spend ams
    JOIN agent_configs ac
      ON ac.tenant_id = ams.tenant_id AND ac.role = ams.agent_role
    WHERE ams.tenant_id = :tenantId
    ORDER BY ams.month DESC, ams.total_cost_usd DESC;

By tenant (platform-wide, for Manage App)

    SELECT
      t.id,
      t.name,
      t.plan,
      t.current_month_spend_usd,
      t.monthly_spend_cap_usd,
      ROUND((t.current_month_spend_usd / NULLIF(t.monthly_spend_cap_usd, 0)) * 100, 1) AS pct_used
    FROM tenants t
    WHERE t.deleted_on IS NULL
    ORDER BY t.current_month_spend_usd DESC;

Audit Log

The llm_calls table provides a complete audit trail:

| What | How |
| --- | --- |
| Who spent what | task_run_id → task → campaign → client join chain |
| Which agent role | agent_role column (denormalised on llm_calls for fast queries) |
| When | created_at on every row |
| Which model | model column |
| What was asked | prompt_hash (SHA-256; no raw content stored) |
| What was produced | response_hash |
| How much it cost | cost_usd |

Dashboard Metrics

The Cost Dashboard screen (/costs) surfaces:

| Metric | Source |
| --- | --- |
| Total spend this month | tenants.currentMonthSpendUsd |
| Spend vs. monthly cap | currentMonthSpendUsd / monthlySpendCapUsd (progress bar) |
| Spend by agent (donut chart) | agent_monthly_spend for current month |
| Agent budget utilisation table | Per agent: spend / cap / % / status (active / paused / warning) |
| Spend by client (bar chart) | llm_calls GROUP BY client |
| Daily spend trend (line chart) | llm_calls GROUP BY DATE_TRUNC('day') |
| Model split (stacked bar) | llm_calls GROUP BY model |
| Budget alert table | Campaigns and agents at ≥80% cap |
| LLM call log (paginated table) | Full llm_calls with filters: agent, campaign, model, date range |

Ollama Cost Handling

Ollama calls have cost_usd = 0.00 by design. Token counts (input_tokens, output_tokens) are still tracked for:

  • Understanding relative workload of local vs. cloud calls
  • Future cost modelling if switching a task to a paid model
  • Context window management (prevent Ollama context overflow)

Because every call costs zero, agent_monthly_spend.totalCostUsd stays at $0 for Ollama-only agents, so they are never blocked by budget caps.

Package Location

    packages/agent-engine/
    ├── src/
    │   ├── pricing.ts         # MODEL_PRICING, calculateCost()
    │   ├── budget.ts          # checkTenantMonthlyBudget(), checkAgentMonthlyBudget(),
    │   │                      # checkCampaignBudget(), pauseAgent(), resumeAgent()
    │   ├── budget-warning.ts  # buildSystemPromptWithBudgetWarning(): 80% prompt injection
    │   └── budget-reset.ts    # monthlyBudgetReset(): BullMQ cron processor
    apps/api/
    └── src/
        └── routes/
            └── costs.ts       # API endpoints for Cost Dashboard queries

© 2026 Leadmetrics — Internal use only