Cost & Usage Tracking
[Live — April 2026] Credits System
Leadmetrics uses a credit-based model for agent content generation. Credits are the subscription currency; LLM USD costs are tracked separately for internal visibility.
Package — @leadmetrics/billing
packages/billing/src/
index.ts # public exports
rates.ts # CREDIT_RATES constant
balance.ts # getOrCreateCreditBalance, reserveCredits, consumeCredits, releaseCredits, getCreditCost
Prisma Models (packages/db/prisma/schema.prisma)
| Model | Purpose |
|---|---|
| CreditBalance | One row per tenant: available, reserved, consumed, totalAllocated |
| CreditLedger | Immutable audit trail: type ∈ allocated/consumed/reserved/released/adjusted/topped_up |
| CreditTopupOrder | Razorpay top-up order records |
Credit Rates
| Credit type | Rate | Agent / worker |
|---|---|---|
| context_file | 2 | setup.worker (context-file-writer step only) |
| strategy | 5 | strategy-writer.worker |
| deliverable_planner | 1 | strategy.worker |
| activity_planner | 2 | activity.worker |
| blog_post | 2 | blog-writer.worker |
| social_post | 1 | social-post-writer.worker |
| landing_page | 3 | landing-page-writer.worker |
| gbp_post | 1 | gbp-post-writer.worker |
| email_newsletter | 2 | email-writer.worker |
| social_calendar | 1 | social-calendar-planner.worker |
| google_ads_rsa | 2 | google-ads-writer.worker |
| meta_ads | 2 | meta-ads-writer.worker |
| report | 2 | report-writer.worker + custom-report-writer.worker |
| research_notes | 1 | research-note-writer.worker |
| topic_research | 1 | topic-researcher.worker |
| ads_analysis | 1 | ads-analyst.worker |
| anomaly_detection | 1 | anomaly-detector.worker |
| review_response | 1 | review-response-writer.worker |
| site_audit | 2 | site-auditor.worker |
| keyword_cluster | 1 | keyword-researcher.worker |
| content_brief | 1 | content-brief-writer.worker |
| backlink_research | 1 | backlink-researcher.worker |
| backlink_outreach | 1 per email | backlink-outreach-writer.worker |
| channel_insight | 1 | all 8 channel insight workers (via insight-worker-base) |
| ai_narrative | 1 | brand-narrative-analyst.worker |
| ai_visibility | 1 per run | ai-visibility-monitor.worker |
Not charged: rag-ingestion (indexing utility), website-crawler (no LLM), social-post-designer (image generation), blog-faq-writer (secondary enrichment of an already-charged blog post), ai-visibility-seeder (setup utility).
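The rate table maps naturally onto a lookup constant. The sketch below is illustrative only: the source names `CREDIT_RATES` and `getCreditCost` but does not show them, so the object shape, the `CreditType` alias, and the `units` parameter (used here for the per-email `backlink_outreach` rate) are assumptions. Only a subset of the rates is listed.

```typescript
// Assumed shape for CREDIT_RATES; the real constant lives in
// packages/billing/src/rates.ts and may differ. Values are copied from the table.
const CREDIT_RATES = {
  context_file: 2,
  strategy: 5,
  blog_post: 2,
  landing_page: 3,
  social_post: 1,
  backlink_outreach: 1, // charged per email
} as const;

type CreditType = keyof typeof CREDIT_RATES;

// Assumed signature: `units` covers per-item rates such as backlink_outreach.
function getCreditCost(creditType: CreditType, units: number = 1): number {
  if (!Number.isInteger(units) || units < 1) {
    throw new RangeError(`units must be a positive integer, got ${units}`);
  }
  return CREDIT_RATES[creditType] * units;
}

const strategyCost = getCreditCost("strategy"); // 5 credits
const outreachCost = getCreditCost("backlink_outreach", 4); // 4 emails at 1 credit each
```

Typing the credit type as a key of the constant means a misspelled type fails at compile time instead of silently charging nothing.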
Worker Patterns
Pattern A — createContentWorker (14 workers): Pass creditType in options; billing is handled centrally in content.worker.ts:
createContentWorker("email-writer", { creditType: "email_newsletter" })
Pattern B — Standalone workers: Reserve before execute, release on all failure paths, consume after DB save. All consumeCredits/releaseCredits calls are non-fatal (.catch(log.warn)).
Insight workers: Billing wired once in packages/agents/src/workers/insights/insight-worker-base.ts — all 8 channel insight workers inherit it automatically.
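The Pattern B lifecycle (reserve, then either consume or release) can be sketched against an in-memory balance. This is an illustration under stated assumptions, not the package's implementation: the helper names match the exports listed for `@leadmetrics/billing`, but their signatures, the `Balance` interface, and `runStandaloneJob` are hypothetical, and the real calls are asynchronous, write ledger rows, and are wrapped in `.catch(log.warn)`.

```typescript
// In-memory stand-in for a CreditBalance row; field names follow the model table.
interface Balance {
  available: number;
  reserved: number;
  consumed: number;
}

// Hypothetical signatures for the three billing helpers.
function reserveCredits(b: Balance, amount: number): void {
  if (b.available < amount) throw new Error("insufficient credits");
  b.available -= amount;
  b.reserved += amount;
}

function consumeCredits(b: Balance, amount: number): void {
  b.reserved -= amount;
  b.consumed += amount;
}

function releaseCredits(b: Balance, amount: number): void {
  b.reserved -= amount;
  b.available += amount;
}

// Pattern B: reserve before execute, release on failure, consume after the save.
async function runStandaloneJob<T>(b: Balance, cost: number, work: () => Promise<T>): Promise<T> {
  reserveCredits(b, cost);
  let output: T;
  try {
    output = await work(); // generate the content and save it to the DB
  } catch (err) {
    releaseCredits(b, cost); // every failure path returns the reservation
    throw err;
  }
  consumeCredits(b, cost); // charged only once the result is saved
  return output;
}
```

Holding credits in `reserved` while the job runs keeps two concurrent jobs from both spending the same available balance.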
API
GET /admin/v1/billing/tenant-usage?tenantId= (requireSuperAdmin): returns credits, recent ledger, and LLM 30-day aggregates
POST /admin/v1/billing/credits/adjust: manual credit adjustment
UI Screens
Dashboard /usage — Credits & Usage: balance progress bar (violet/amber/red thresholds), Razorpay top-up, recent ledger.
Dashboard /costs — LLM Costs: 30-day totals, daily bar chart, by-agent-role table, by-model table. Data source: AgentRun table (costUsd, inputTokens, outputTokens).
Dashboard sidebar — “Billing” NavGroup: Credits & Usage (Coins icon) + LLM Costs (DollarSign icon), between Audit Logs and Settings.
Manage portal /tenants/[id] → Credits & LLM tab — per-tenant credit balance, adjust form (amount + note), LLM spend aggregates, recent ledger.
Purpose
Monitor and control LLM spending across all agents, activities, and tenants. Every dollar spent on inference must be attributable, auditable, and controllable. Costs are surfaced at every level of the system so over-runs are caught early and billed back to the right tenant.
Related: Database (llm_calls, billing_events, agent_monthly_spend tables) | Multi-Tenancy (per-tenant cost caps) | Agent Hierarchy (cost caps per agent role in agent_configs)
Budget Cap Hierarchy
Budget caps are enforced at four levels, numbered in this document from most granular (Level 1, per-task) to broadest (Level 4, tenant monthly). An agent task is blocked if the cap at any level is exceeded.
┌──────────────────────────────────────────────────────────┐
│ Tenant monthly cap │ ← Platform-wide ceiling per client
│ └── Agent monthly cap (per role) │ ← Monthly ceiling per agent type
│ └── Campaign budget cap │ ← Per-campaign spending limit
│ └── Per-task cost cap │ ← Single execution ceiling
└──────────────────────────────────────────────────────────┘
What Gets Tracked
| Event | What’s captured |
|---|---|
| Every LLM call | model, input tokens, output tokens, cost USD, duration, prompt hash, task run |
| Every tool call | tool name, method, duration, success/failure |
| Agent monthly spend | cumulative cost per agent role per tenant per calendar month |
| Tenant monthly spend | cumulative cost across all agents per tenant per calendar month |
| Budget cap breaches | level (task / campaign / agent / tenant), threshold hit, action taken |
| Session totals | cumulative tokens across all calls in a session |
Cost Calculation
Cost is calculated immediately after each LLM call completes, before the result is returned to the worker.
const MODEL_PRICING: Record<string, { inputPer1M: number; outputPer1M: number }> = {
  'claude-sonnet-4-6': { inputPer1M: 3.00, outputPer1M: 15.00 },
  'claude-opus-4-6': { inputPer1M: 15.00, outputPer1M: 75.00 },
  'claude-haiku-4-5-20251001': { inputPer1M: 0.25, outputPer1M: 1.25 },
  'gpt-4o': { inputPer1M: 2.50, outputPer1M: 10.00 },
  'gpt-4o-mini': { inputPer1M: 0.15, outputPer1M: 0.60 },
  'gemma3:4b': { inputPer1M: 0, outputPer1M: 0 }, // local
  'llama3.2': { inputPer1M: 0, outputPer1M: 0 }, // local
  'mistral': { inputPer1M: 0, outputPer1M: 0 }, // local
};

// Returns the USD cost of one call; a model missing from MODEL_PRICING is costed at 0.
function calculateCost(model: string, inputTokens: number, outputTokens: number): number {
  const p = MODEL_PRICING[model] ?? { inputPer1M: 0, outputPer1M: 0 };
  return (inputTokens / 1_000_000) * p.inputPer1M
    + (outputTokens / 1_000_000) * p.outputPer1M;
}
Pricing is maintained in packages/agent-engine/src/pricing.ts. When a provider changes pricing, update MODEL_PRICING; no DB migration is needed.
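As a worked example of the formula, the sketch below prices one call with 12,000 input tokens and 1,500 output tokens. It re-declares a two-entry pricing table so that it runs on its own; `PRICING_SAMPLE` and `costUsdFor` are illustrative names, not part of the package.

```typescript
// Two entries copied from MODEL_PRICING (USD per 1M tokens).
const PRICING_SAMPLE: Record<string, { inputPer1M: number; outputPer1M: number }> = {
  "claude-sonnet-4-6": { inputPer1M: 3.0, outputPer1M: 15.0 },
  "gemma3:4b": { inputPer1M: 0, outputPer1M: 0 },
};

// Same arithmetic as calculateCost, against the sample table.
function costUsdFor(model: string, inputTokens: number, outputTokens: number): number {
  const p = PRICING_SAMPLE[model] ?? { inputPer1M: 0, outputPer1M: 0 };
  return (inputTokens / 1_000_000) * p.inputPer1M + (outputTokens / 1_000_000) * p.outputPer1M;
}

// claude-sonnet-4-6:
//   input:  (12,000 / 1,000,000) * 3.00  = 0.0360
//   output: ( 1,500 / 1,000,000) * 15.00 = 0.0225
//   total:                                 0.0585 USD (up to floating-point rounding)
const sonnetCost = costUsdFor("claude-sonnet-4-6", 12_000, 1_500);
const localCost = costUsdFor("gemma3:4b", 12_000, 1_500); // 0: local model
const unknownCost = costUsdFor("not-in-table", 12_000, 1_500); // 0: fallback for unknown models
```

The last line shows a consequence of the `??` fallback: a model name that is missing from the table is costed at zero, so a new or misspelled model ID would not count towards any cap until it is added.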
LLM Call Logging
Every call is written to llm_calls immediately after the stream closes:
await db.insert(llmCalls).values({
  id: crypto.randomUUID(),
  taskRunId: taskRun.id,
  tenantId: tenant.id,
  agentRole: agentConfig.role,
  sessionId: session?.id ?? null,
  model,
  promptHash: sha256(fullPrompt), // no raw prompt stored, for privacy
  responseHash: sha256(responseText),
  inputTokens,
  outputTokens,
  costUsd: calculateCost(model, inputTokens, outputTokens),
  durationMs,
  createdAt: new Date(),
});
After logging, three counters are incremented atomically in a single transaction:
await db.transaction(async (tx) => {
  // 1. Campaign running total
  await tx.update(campaigns)
    .set({ totalCostUsd: sql`total_cost_usd + ${costUsd}` })
    .where(eq(campaigns.id, campaignId));

  // 2. Agent monthly spend (upsert: row created on first spend of the month)
  await tx.insert(agentMonthlySpend)
    .values({ tenantId, agentRole, month: startOfMonth(), totalCostUsd: costUsd })
    .onConflictDoUpdate({
      target: [agentMonthlySpend.tenantId, agentMonthlySpend.agentRole, agentMonthlySpend.month],
      set: { totalCostUsd: sql`total_cost_usd + ${costUsd}` },
    });

  // 3. Tenant monthly spend
  await tx.update(tenants)
    .set({ currentMonthSpendUsd: sql`current_month_spend_usd + ${costUsd}` })
    .where(eq(tenants.id, tenantId));
});
Budget Cap Enforcement
Level 1 — Per-task cap
Each agent config has an optional maxCostUsdPerTask. The executor tracks running cost mid-stream and aborts if the cap is hit:
let runningCost = 0;
for await (const event of executor.execute(request)) {
  if (event.type === 'text_delta') {
    runningCost = estimateCostFromStreamedTokens(model, estimatedInputTokens, streamedOutputTokens);
    if (agentConfig.limits.maxCostUsdPerTask && runningCost > agentConfig.limits.maxCostUsdPerTask) {
      executor.abort();
      throw new CostCapExceededError(`Task cost cap of $${agentConfig.limits.maxCostUsdPerTask} exceeded`);
    }
  }
  yield event;
}
Level 2 — Per-campaign cap
Before each LLM call, the worker checks the campaign’s remaining budget:
async function checkCampaignBudget(campaignId: string, estimatedCallCost: number): Promise<void> {
  const campaign = await db.query.campaigns.findFirst({ where: eq(campaigns.id, campaignId) });
  if (!campaign?.budgetCapUsd) return; // campaign not found, or no cap set
  if (campaign.totalCostUsd + estimatedCallCost > campaign.budgetCapUsd) {
    await db.update(campaigns)
      .set({ status: 'paused', pauseReason: 'budget_exceeded' })
      .where(eq(campaigns.id, campaignId));
    await drainCampaignQueueJobs(campaignId);
    await notifyBudgetBreached({ level: 'campaign', campaign });
    throw new BudgetCapExceededError(`Campaign budget of $${campaign.budgetCapUsd} would be exceeded`);
  }
}
When a campaign is paused for budget reasons, all of its pending tasks are drained from the queue. The campaign stays paused until a human increases the budget cap or the month resets (see Month Auto-Reset).
Level 3 — Per-agent monthly cap (NEW)
Each agent config has an optional monthlyBudgetCapUsd. This is a calendar-month ceiling, not a rolling 30-day window: it limits how much any single agent role can spend across all tasks for a tenant within the month, and the count starts again on the 1st.
Before dispatching a task to an agent, the worker checks the current month’s accumulated spend for that agent role:
async function checkAgentMonthlyBudget(
  tenantId: string,
  agentRole: AgentRole,
  estimatedCallCost: number,
): Promise<'ok' | 'warning' | 'exceeded'> {
  const config = await db.query.agentConfigs.findFirst({
    where: and(eq(agentConfigs.tenantId, tenantId), eq(agentConfigs.role, agentRole)),
  });
  if (!config?.monthlyBudgetCapUsd) return 'ok'; // no config row, or no cap set
  // numeric columns are returned as strings, so convert before doing arithmetic
  const capUsd = Number(config.monthlyBudgetCapUsd);

  const spend = await db.query.agentMonthlySpend.findFirst({
    where: and(
      eq(agentMonthlySpend.tenantId, tenantId),
      eq(agentMonthlySpend.agentRole, agentRole),
      eq(agentMonthlySpend.month, startOfMonth()),
    ),
  });
  const currentSpend = Number(spend?.totalCostUsd ?? 0);
  const usagePct = currentSpend / capUsd;

  if (currentSpend + estimatedCallCost > capUsd) {
    await pauseAgent(tenantId, agentRole, 'monthly_budget_exceeded');
    await notifyBudgetBreached({ level: 'agent', agentRole, tenantId });
    throw new BudgetCapExceededError(`Agent ${agentRole} monthly budget of $${capUsd} exceeded`);
  }
  if (usagePct >= 0.8) return 'warning'; // caller injects prompt warning (see below)
  return 'ok';
}
When an agent is paused by monthly budget, only that agent role is paused for that tenant. Other agents continue working. The agent resumes automatically on the 1st of the next month.
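Stripped of database access and notifications, the agent-level check reduces to a three-way classification. The sketch below isolates that rule so the boundaries are easy to see; `classifyAgentBudget` and `BudgetStatus` are hypothetical names, and unlike the worker code it returns `"exceeded"` instead of pausing the agent and throwing.

```typescript
type BudgetStatus = "ok" | "warning" | "exceeded";

// Pure decision rule: no persistence, no side effects.
function classifyAgentBudget(
  currentSpendUsd: number,
  estimatedCallCostUsd: number,
  capUsd: number | null,
  warnAt: number = 0.8,
): BudgetStatus {
  if (capUsd === null || capUsd <= 0) return "ok"; // no cap configured
  // The hard stop looks at projected spend (current + estimate)...
  if (currentSpendUsd + estimatedCallCostUsd > capUsd) return "exceeded";
  // ...while the warning looks at what has already been spent.
  if (currentSpendUsd / capUsd >= warnAt) return "warning";
  return "ok";
}

const at78 = classifyAgentBudget(39, 0.5, 50); // 78% used
const at80 = classifyAgentBudget(40, 0.5, 50); // exactly 80% used
const over = classifyAgentBudget(49.8, 0.5, 50); // 50.3 projected against a cap of 50
const uncapped = classifyAgentBudget(12, 0.5, null);
```

Because the hard stop uses projected spend and the warning uses actual spend, a single large estimate can move an agent from `"ok"` straight to `"exceeded"` without it ever receiving a warning.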
Level 4 — Tenant monthly cap (NEW)
The tenants.monthlySpendCapUsd field defines the total monthly LLM spend ceiling for the tenant across all agents. Checked before every task dispatch:
async function checkTenantMonthlyBudget(tenantId: string, estimatedCallCost: number): Promise<void> {
  const tenant = await db.query.tenants.findFirst({ where: eq(tenants.id, tenantId) });
  if (!tenant?.monthlySpendCapUsd) return; // tenant not found, or no cap set
  const usagePct = tenant.currentMonthSpendUsd / tenant.monthlySpendCapUsd;
  if (tenant.currentMonthSpendUsd + estimatedCallCost > tenant.monthlySpendCapUsd) {
    // Pause ALL agent queues for this tenant
    await pauseAllAgentsForTenant(tenantId, 'tenant_monthly_budget_exceeded');
    await notifyBudgetBreached({ level: 'tenant', tenantId });
    throw new BudgetCapExceededError(`Tenant monthly budget of $${tenant.monthlySpendCapUsd} exceeded`);
  }
  if (usagePct >= 0.8) {
    // One-time alert, deduplicated per day
    await sendBudgetWarning({ level: 'tenant', tenantId, usagePct });
  }
}
When the tenant cap is hit, all agent queues for that tenant are drained. No new tasks are dispatched until the cap is raised (by the tenant admin or super admin) or the month resets.
Agent Warning at 80% (NEW)
When an agent’s monthly spend reaches 80% of its cap, the next task it receives includes a budget warning injected into its system prompt. This makes the agent itself aware of the constraint — it self-limits by being more concise, skipping optional research steps, and prioritising the core deliverable.
The warning is injected in the adapter layer, before the LLM call:
async function buildSystemPromptWithBudgetWarning(
  baseSystemPrompt: string,
  tenantId: string,
  agentRole: AgentRole,
): Promise<string> {
  const budgetStatus = await checkAgentMonthlyBudget(tenantId, agentRole, 0);
  if (budgetStatus !== 'warning') return baseSystemPrompt;

  const spend = await getAgentMonthlySpend(tenantId, agentRole);
  const config = await getAgentConfig(tenantId, agentRole);
  const pct = Math.round((spend / config.monthlyBudgetCapUsd) * 100);
  const warning = `
⚠️ BUDGET NOTICE: You have used ${pct}% of your monthly token budget for this client.
You are approaching your limit. For this task:
- Be concise in your responses
- Skip optional research or elaboration steps
- Focus only on the core deliverable
- Avoid making unnecessary tool calls
`;
  return warning + '\n\n' + baseSystemPrompt;
}
This warning is prepended to the system prompt for all three adapters (Claude, OpenAI, Ollama). It is not stored in the task run’s prompt; it is injected ephemerally at dispatch time.
The warning fires once per task when the agent is in the 80–99% range. At 100%, the agent is hard-stopped (no task is dispatched at all).
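The 80–99% window can be expressed as a pure function over spend and cap, which makes the boundaries testable without a database. This is a sketch under assumptions: `withBudgetNotice` is a hypothetical helper, the notice wording is shortened, and it rounds the percentage down (not to the nearest integer) so the notice never reports 100% while the agent is still allowed to run.

```typescript
// Hypothetical pure helper; the adapter-layer code looks up spend and cap itself.
function withBudgetNotice(basePrompt: string, spendUsd: number, capUsd: number | null): string {
  if (capUsd === null || capUsd <= 0) return basePrompt; // no cap, nothing to warn about
  const pct = Math.floor((spendUsd * 100) / capUsd);
  if (pct < 80 || pct >= 100) return basePrompt; // below threshold, or hard-stopped before dispatch
  const notice = [
    `BUDGET NOTICE: You have used ${pct}% of your monthly token budget for this client.`,
    "For this task: be concise, skip optional research, and focus on the core deliverable.",
  ].join("\n");
  return notice + "\n\n" + basePrompt;
}

const basePrompt = "You are a blog writer.";
const warned = withBudgetNotice(basePrompt, 42.5, 50); // 85% used: notice prepended
const untouched = withBudgetNotice(basePrompt, 20, 50); // 40% used: prompt unchanged
```

Keeping the base prompt intact at the end of the returned string means the stored task prompt and the dispatched prompt differ only by the prepended notice.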
Budget Alert System
Alerts fire at two thresholds. The 80% alert notifies humans; the 100% alert also blocks execution.
| Threshold | Action — Campaign cap | Action — Agent monthly cap | Action — Tenant monthly cap |
|---|---|---|---|
| 80% | Slack alert to DM Portal | Slack alert + warning injected into agent’s next prompt | Slack alert to tenant admin |
| 100% | Campaign paused, queue drained | Agent paused for remainder of month | All agents paused, queue drained |
// Budget alert check: BullMQ repeatable job, runs every 15 minutes
async function checkBudgetAlerts(): Promise<void> {
  // Campaign-level
  const campaignsWithCaps = await db.query.campaigns.findMany({
    where: and(isNotNull(campaigns.budgetCapUsd), eq(campaigns.status, 'active')),
  });
  for (const campaign of campaignsWithCaps) {
    const usagePct = campaign.totalCostUsd / campaign.budgetCapUsd;
    if (usagePct >= 0.8 && usagePct < 1.0) {
      await sendAlertOnce(`campaign:${campaign.id}:80pct`, {
        text: `⚠️ Campaign "${campaign.name}" is at ${Math.round(usagePct * 100)}% of its budget ($${campaign.totalCostUsd.toFixed(2)} / $${campaign.budgetCapUsd})`,
      });
    }
  }

  // Tenant-level
  const tenantsWithCaps = await db.query.tenants.findMany({
    where: isNotNull(tenants.monthlySpendCapUsd),
  });
  for (const tenant of tenantsWithCaps) {
    const usagePct = tenant.currentMonthSpendUsd / tenant.monthlySpendCapUsd;
    if (usagePct >= 0.8 && usagePct < 1.0) {
      await sendAlertOnce(`tenant:${tenant.id}:80pct:${startOfMonth()}`, {
        text: `⚠️ Tenant "${tenant.name}" is at ${Math.round(usagePct * 100)}% of their monthly LLM budget`,
        recipient: tenant.adminEmail,
      });
    }
  }
}
sendAlertOnce deduplicates using a Redis key with a 24-hour TTL so the same alert does not fire on every check cycle.
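The deduplication idea behind sendAlertOnce can be illustrated with an in-memory map standing in for Redis. Everything here is an assumption made for illustration: the `AlertDeduper` class, its `shouldSend` method, and the explicit `nowMs` argument (passed in so the expiry behaviour is deterministic). In Redis the same check-and-set is typically a single `SET` with the `NX` and `EX` options.

```typescript
// In-memory stand-in for "Redis key with a 24-hour TTL".
class AlertDeduper {
  private readonly expiresAt = new Map<string, number>();
  private readonly ttlMs: number;

  constructor(ttlMs: number = 24 * 60 * 60 * 1000) {
    this.ttlMs = ttlMs;
  }

  // Returns true when the alert should be sent, false when it is a duplicate.
  shouldSend(key: string, nowMs: number): boolean {
    const expiry = this.expiresAt.get(key);
    if (expiry !== undefined && expiry > nowMs) return false; // key still live: suppress
    this.expiresAt.set(key, nowMs + this.ttlMs); // first send, or previous key expired
    return true;
  }
}

const HOUR_MS = 60 * 60 * 1000;
const deduper = new AlertDeduper();
const firstSend = deduper.shouldSend("campaign:c1:80pct", 0); // first alert goes out
const repeatSend = deduper.shouldSend("campaign:c1:80pct", HOUR_MS); // suppressed within the TTL
const nextDaySend = deduper.shouldSend("campaign:c1:80pct", 25 * HOUR_MS); // TTL elapsed, fires again
```

With a check every 15 minutes, a 24-hour TTL turns up to 96 potential alerts per day into one per key.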
Month Auto-Reset (NEW)
On the 1st of every calendar month at 00:00 UTC, a BullMQ cron job resets all monthly spend counters and resumes agents/campaigns that were paused solely due to budget exhaustion.
// Registered as a repeatable job at startup
await budgetResetQueue.add('monthly-budget-reset', {}, {
  // 1st of every month at midnight; tz pins the schedule to UTC instead of the server's local time
  repeat: { pattern: '0 0 1 * *', tz: 'UTC' },
});

// Processor
async function monthlyBudgetReset(): Promise<void> {
  await db.transaction(async (tx) => {
    // 1. Reset all tenant monthly spend counters
    await tx.update(tenants)
      .set({ currentMonthSpendUsd: 0 });

    // 2. agent_monthly_spend needs no reset: rows are keyed by month, so the new
    //    month's row is created on first spend and earlier rows remain as history

    // 3. Resume campaigns paused by budget
    await tx.update(campaigns)
      .set({ status: 'active', pauseReason: null })
      .where(eq(campaigns.pauseReason, 'budget_exceeded'));

    // 4. Resume agents paused by monthly budget
    await tx.update(agentConfigs)
      .set({ status: 'active', pauseReason: null })
      .where(eq(agentConfigs.pauseReason, 'monthly_budget_exceeded'));
  });

  // Re-enqueue any campaigns that were waiting for budget reset
  await requeuePausedCampaignTasks();
  await logger.info('Monthly budget reset complete');
}
Important: Only campaigns/agents paused with pauseReason = 'budget_exceeded' or 'monthly_budget_exceeded' are resumed. Campaigns paused for other reasons (e.g. 'manual_pause', 'approval_pending') are not touched.
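The selective-resume rule is easy to check in isolation. The sketch below applies it to plain objects; `PausedRow` and `resumeBudgetPaused` are hypothetical names, and the real reset performs the same filter with SQL `WHERE` clauses inside a transaction.

```typescript
interface PausedRow {
  id: string;
  status: "active" | "paused";
  pauseReason: string | null;
}

// Resume only rows whose pauseReason is one of `budgetReasons`; all other rows are returned unchanged.
function resumeBudgetPaused(rows: PausedRow[], budgetReasons: string[]): PausedRow[] {
  return rows.map((row): PausedRow =>
    row.pauseReason !== null && budgetReasons.includes(row.pauseReason)
      ? { ...row, status: "active", pauseReason: null }
      : row,
  );
}

const campaignRows: PausedRow[] = [
  { id: "c1", status: "paused", pauseReason: "budget_exceeded" },
  { id: "c2", status: "paused", pauseReason: "manual_pause" },
  { id: "c3", status: "active", pauseReason: null },
];

const afterReset = resumeBudgetPaused(campaignRows, ["budget_exceeded"]);
// c1 becomes active with pauseReason null; c2 stays paused; c3 is untouched.
```

Matching on the pause reason, not on the paused status, is what keeps a manual pause from being silently lifted at month end.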
New Database Table: agent_monthly_spend
Added to providers/provider-db/src/schema/billing.ts:
export const agentMonthlySpend = pgTable('agent_monthly_spend', {
  id: uuid('id').primaryKey().defaultRandom(),
  tenantId: uuid('tenant_id').notNull().references(() => tenants.id),
  agentRole: text('agent_role').notNull(),
  month: date('month').notNull(), // always 1st of month: '2026-04-01'
  totalCostUsd: numeric('total_cost_usd', { precision: 10, scale: 6 }).notNull().default('0'),
  updatedOn: timestamp('updated_on').notNull().defaultNow(),
}, (t) => ({
  uniqueIdx: uniqueIndex('agent_monthly_spend_unique').on(t.tenantId, t.agentRole, t.month),
  tenantIdx: index('agent_monthly_spend_tenant_idx').on(t.tenantId),
}));
This table is append-friendly: one row per (tenant, agent role, month). Rows for past months are never updated, giving a full spend history per agent over time.
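The one-row-per-month behaviour depends on every writer computing the same month key. The source calls `startOfMonth()` without showing it, so the sketch below assumes a UTC-based key in the `'2026-04-01'` form used by the `month` column, and models the upsert with a `Map`. `monthKey`, `spendByKey`, and `recordSpend` are hypothetical names.

```typescript
// Assumed behaviour for the month key: first day of the UTC month, as 'YYYY-MM-01'.
function monthKey(d: Date): string {
  const year = d.getUTCFullYear();
  const month = String(d.getUTCMonth() + 1).padStart(2, "0");
  return `${year}-${month}-01`;
}

// Map keyed like the unique index (tenant, agent role, month).
const spendByKey = new Map<string, number>();

// Upsert: create the row on first spend of the month, otherwise add to it.
function recordSpend(tenantId: string, agentRole: string, costUsd: number, at: Date): number {
  const key = `${tenantId}|${agentRole}|${monthKey(at)}`;
  const total = (spendByKey.get(key) ?? 0) + costUsd;
  spendByKey.set(key, total);
  return total;
}

recordSpend("t1", "blog-writer", 0.25, new Date("2026-04-30T23:59:59Z"));
recordSpend("t1", "blog-writer", 0.5, new Date("2026-04-30T23:59:59Z"));
// One second later it is May in UTC, so a new row starts and April's row is left as history.
recordSpend("t1", "blog-writer", 0.125, new Date("2026-05-01T00:00:00Z"));
```

Using UTC getters matters here: with local-time getters, a server running in a non-UTC timezone would roll over to the new month at a different instant than a reset job scheduled for midnight UTC.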
New Field on agent_configs: monthlyBudgetCapUsd
// Added to the agent_configs table
monthlyBudgetCapUsd: numeric('monthly_budget_cap_usd', { precision: 10, scale: 2 }),
// null = no monthly cap for this agent
Configurable per agent role per tenant in the Dashboard → Team → Agent configuration modal, and globally in the Manage App → Agent Configs (M4).
Enforcement Order at Task Dispatch
Every time a task is about to be dispatched to an agent, the budget checks run in this order, broadest cap first. The per-task cap is enforced last, during execution:
1. checkTenantMonthlyBudget(tenantId, estimatedCost)
   → Throws if tenant monthly cap exceeded
   → Warning if ≥80% (Slack alert, deduplicated)
2. checkAgentMonthlyBudget(tenantId, agentRole, estimatedCost)
   → Throws if agent monthly cap exceeded (agent paused for month)
   → Returns 'warning' if ≥80%
3. checkCampaignBudget(campaignId, estimatedCost)
   → Throws if campaign cap exceeded (campaign paused)
4. Build system prompt
   → If step 2 returned 'warning': prepend budget notice to system prompt
   → Agent receives warning; self-limits accordingly
5. Dispatch task → execute
   → Per-task cap enforced mid-stream (abort if exceeded)
Aggregation
By campaign
SELECT
  c.id, c.name, cl.name AS client_name,
  COALESCE(SUM(l.cost_usd), 0) AS total_cost_usd,
  SUM(l.input_tokens) AS total_input_tokens,
  SUM(l.output_tokens) AS total_output_tokens,
  COUNT(l.id) AS total_calls
FROM campaigns c
LEFT JOIN clients cl ON cl.id = c.client_id
LEFT JOIN tasks t ON t.campaign_id = c.id
LEFT JOIN task_runs tr ON tr.task_id = t.id
LEFT JOIN llm_calls l ON l.task_run_id = tr.id
GROUP BY c.id, c.name, cl.name;
By agent role (monthly)
SELECT
  ams.agent_role,
  ams.month,
  ams.total_cost_usd,
  ac.monthly_budget_cap_usd,
  ROUND((ams.total_cost_usd / NULLIF(ac.monthly_budget_cap_usd, 0)) * 100, 1) AS pct_used
FROM agent_monthly_spend ams
JOIN agent_configs ac ON ac.tenant_id = ams.tenant_id AND ac.role = ams.agent_role
WHERE ams.tenant_id = :tenantId
ORDER BY ams.month DESC, ams.total_cost_usd DESC;
By tenant (platform-wide, for Manage App)
SELECT
  t.id, t.name, t.plan,
  t.current_month_spend_usd,
  t.monthly_spend_cap_usd,
  ROUND((t.current_month_spend_usd / NULLIF(t.monthly_spend_cap_usd, 0)) * 100, 1) AS pct_used
FROM tenants t
WHERE t.deleted_on IS NULL
ORDER BY t.current_month_spend_usd DESC;
Audit Log
The llm_calls table provides a complete audit trail:
| What | How |
|---|---|
| Who spent what | task_run_id → task → campaign → client join chain |
| Which agent role | agent_role column (denormalised on llm_calls for fast queries) |
| When | created_at on every row |
| Which model | model column |
| What was asked | prompt_hash (SHA-256 — no raw content stored) |
| What was produced | response_hash |
| How much it cost | cost_usd |
Dashboard Metrics
The Cost Dashboard screen (/costs) surfaces:
| Metric | Source |
|---|---|
| Total spend this month | tenants.currentMonthSpendUsd |
| Spend vs. monthly cap | currentMonthSpendUsd / monthlySpendCapUsd — progress bar |
| Spend by agent (donut chart) | agent_monthly_spend for current month |
| Agent budget utilisation table | Per agent: spend / cap / % / status (active / paused / warning) |
| Spend by client (bar chart) | llm_calls GROUP BY client |
| Daily spend trend (line chart) | llm_calls GROUP BY DATE_TRUNC('day') |
| Model split (stacked bar) | llm_calls GROUP BY model |
| Budget alert table | Campaigns and agents at ≥80% cap |
| LLM call log (paginated table) | Full llm_calls with filters: agent, campaign, model, date range |
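The chart metrics in the table are all group-and-sum queries over llm_calls. As an illustration of the shape of data each chart consumes, the sketch below performs the same grouping in memory; in the dashboard this is done in SQL, and `LlmCallRow`, `sumCostBy`, and the sample rows are made up for the example.

```typescript
interface LlmCallRow {
  model: string;
  costUsd: number;
  createdAt: string; // ISO 8601 timestamp in UTC
}

// Generic "GROUP BY key, SUM(cost_usd)" over an array of rows.
function sumCostBy(rows: LlmCallRow[], keyOf: (row: LlmCallRow) => string): Map<string, number> {
  const totals = new Map<string, number>();
  for (const row of rows) {
    const key = keyOf(row);
    totals.set(key, (totals.get(key) ?? 0) + row.costUsd);
  }
  return totals;
}

// Made-up sample data.
const sampleCalls: LlmCallRow[] = [
  { model: "claude-sonnet-4-6", costUsd: 0.5, createdAt: "2026-04-01T09:15:00Z" },
  { model: "gpt-4o-mini", costUsd: 0.25, createdAt: "2026-04-01T17:40:00Z" },
  { model: "claude-sonnet-4-6", costUsd: 0.125, createdAt: "2026-04-02T08:05:00Z" },
];

const byModel = sumCostBy(sampleCalls, (r) => r.model); // model split
const byDay = sumCostBy(sampleCalls, (r) => r.createdAt.slice(0, 10)); // daily trend, keyed by UTC date
```

Taking the first ten characters of a UTC ISO timestamp plays the role of DATE_TRUNC('day') here; it only matches the SQL result when the database session also truncates in UTC.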
Ollama Cost Handling
Ollama calls have cost_usd = 0.00 by design. Token counts are still tracked (input_tokens, output_tokens) for:
- Understanding relative workload of local vs. cloud calls
- Future cost modelling if switching a task to a paid model
- Context window management (prevent Ollama context overflow)
agent_monthly_spend.totalCostUsd correctly stays at $0 for Ollama-only agents, so they are never blocked by budget caps.
Package Location
packages/agent-engine/
├── src/
│ ├── pricing.ts # MODEL_PRICING, calculateCost()
│ ├── budget.ts # checkTenantMonthlyBudget(), checkAgentMonthlyBudget(),
│ │ # checkCampaignBudget(), pauseAgent(), resumeAgent()
│ ├── budget-warning.ts # buildSystemPromptWithBudgetWarning() — 80% prompt injection
│ └── budget-reset.ts # monthlyBudgetReset() — BullMQ cron processor
apps/api/
└── src/
└── routes/
└── costs.ts # API endpoints for Cost Dashboard queries