Cost & Usage Tracking

[Live — April 2026] Credits System

Leadmetrics uses a credit-based model for agent content generation. Credits are the subscription currency; LLM USD costs are tracked separately for internal visibility.

Package — `@leadmetrics/billing`


packages/billing/src/
  index.ts        # public exports
  rates.ts        # CREDIT_RATES constant
  balance.ts      # getOrCreateCreditBalance, reserveCredits, consumeCredits, releaseCredits, getCreditCost

Prisma Models (`packages/db/prisma/schema.prisma`)

Model	Purpose
`CreditBalance`	One row per tenant — `available`, `reserved`, `consumed`, `totalAllocated`
`CreditLedger`	Immutable audit trail — type ∈ `allocated`/`consumed`/`reserved`/`released`/`adjusted`/`topped_up`
`CreditTopupOrder`	Razorpay top-up order records

Credit Rates

Credit type	Rate	Agent / worker
`context_file`	2	setup.worker (context-file-writer step only)
`strategy`	5	strategy-writer.worker
`deliverable_planner`	1	strategy.worker
`activity_planner`	2	activity.worker
`blog_post`	2	blog-writer.worker
`social_post`	1	social-post-writer.worker
`landing_page`	3	landing-page-writer.worker
`gbp_post`	1	gbp-post-writer.worker
`email_newsletter`	2	email-writer.worker
`social_calendar`	1	social-calendar-planner.worker
`google_ads_rsa`	2	google-ads-writer.worker
`meta_ads`	2	meta-ads-writer.worker
`report`	2	report-writer.worker + custom-report-writer.worker
`research_notes`	1	research-note-writer.worker
`topic_research`	1	topic-researcher.worker
`ads_analysis`	1	ads-analyst.worker
`anomaly_detection`	1	anomaly-detector.worker
`review_response`	1	review-response-writer.worker
`site_audit`	2	site-auditor.worker
`keyword_cluster`	1	keyword-researcher.worker
`content_brief`	1	content-brief-writer.worker
`backlink_research`	1	backlink-researcher.worker
`backlink_outreach`	1 per email	backlink-outreach-writer.worker
`channel_insight`	1	all 8 channel insight workers (via insight-worker-base)
`ai_narrative`	1	brand-narrative-analyst.worker
`ai_visibility`	1 per run	ai-visibility-monitor.worker

Not charged: rag-ingestion (indexing utility), website-crawler (no LLM), social-post-designer (image generation), blog-faq-writer (secondary enrichment of an already-charged blog post), ai-visibility-seeder (setup utility).
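
To make the rate table concrete, here is a minimal sketch of a rate lookup. It assumes a flat rate per credit type and a unit multiplier for the "per email" type. The name `creditCostFor`, the `units` parameter, and the five-entry subset are illustrative; the real `CREDIT_RATES` and `getCreditCost` in `@leadmetrics/billing` may be shaped differently.

```typescript
// Hypothetical subset of the rate table above; the real constant lives in rates.ts.
const CREDIT_RATES = {
  strategy: 5,
  landing_page: 3,
  blog_post: 2,
  social_post: 1,
  backlink_outreach: 1, // charged per email
} as const;

type CreditType = keyof typeof CREDIT_RATES;

// `units` is an assumed parameter: 1 for flat-rate types, the email count for backlink_outreach.
function creditCostFor(type: CreditType, units: number = 1): number {
  if (!Number.isInteger(units) || units < 1) {
    throw new RangeError(`units must be a positive integer, got ${units}`);
  }
  return CREDIT_RATES[type] * units;
}

console.log(creditCostFor("strategy"));             // 5
console.log(creditCostFor("backlink_outreach", 4)); // 4
```

Typing the key as `keyof typeof CREDIT_RATES` means a worker that passes an unknown credit type fails at compile time rather than being silently charged nothing.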

Worker Patterns

Pattern A — createContentWorker (14 workers): Pass creditType in options; billing is handled centrally in content.worker.ts:


createContentWorker("email-writer", { creditType: "email_newsletter" })

Pattern B — Standalone workers: Reserve before execute, release on all failure paths, consume after DB save. All consumeCredits/releaseCredits calls are non-fatal (.catch(log.warn)).
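
The Pattern B lifecycle can be sketched against an in-memory balance. This is a synchronous simplification under stated assumptions: `Balance`, `reserve`, `consume`, `release`, and `runBilledTask` are illustrative names, not the signatures exported by `balance.ts`, and the real calls are asynchronous and wrapped in `.catch(log.warn)` so a billing failure never fails the job.

```typescript
// In-memory stand-in for a CreditBalance row (field names follow the model table above).
interface Balance {
  available: number;
  reserved: number;
  consumed: number;
}

function reserve(b: Balance, credits: number): void {
  if (b.available < credits) throw new Error("insufficient credits");
  b.available -= credits;
  b.reserved += credits;
}

function consume(b: Balance, credits: number): void {
  b.reserved -= credits;
  b.consumed += credits;
}

function release(b: Balance, credits: number): void {
  b.reserved -= credits;
  b.available += credits;
}

// Hypothetical standalone-worker flow: reserve, run, then consume or release.
function runBilledTask<T>(b: Balance, credits: number, execute: () => T, save: (result: T) => void): T {
  reserve(b, credits); // fails fast, before any LLM work, if the tenant is out of credits
  let result: T;
  try {
    result = execute();
    save(result);
  } catch (err) {
    release(b, credits); // every failure path hands the reservation back
    throw err;
  }
  consume(b, credits); // charged only after the result is persisted
  return result;
}

const balance: Balance = { available: 10, reserved: 0, consumed: 0 };
runBilledTask(balance, 2, () => "draft", () => {});
console.log(balance); // { available: 8, reserved: 0, consumed: 2 }
```

Reserving first keeps two concurrent jobs from both spending the last credits; consuming only after the save means a tenant is never charged for output that was lost.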

Insight workers: Billing wired once in packages/agents/src/workers/insights/insight-worker-base.ts — all 8 channel insight workers inherit it automatically.

API

GET /admin/v1/billing/tenant-usage?tenantId= — requireSuperAdmin; returns credits + recent ledger + LLM 30-day aggregates
POST /admin/v1/billing/credits/adjust — manual credit adjustment

UI Screens

Dashboard /usage — Credits & Usage: balance progress bar (violet/amber/red thresholds), Razorpay top-up, recent ledger.

Dashboard /costs — LLM Costs: 30-day totals, daily bar chart, by-agent-role table, by-model table. Data source: AgentRun table (costUsd, inputTokens, outputTokens).

Dashboard sidebar — “Billing” NavGroup: Credits & Usage (Coins icon) + LLM Costs (DollarSign icon), between Audit Logs and Settings.

Manage portal /tenants/[id] → Credits & LLM tab — per-tenant credit balance, adjust form (amount + note), LLM spend aggregates, recent ledger.

Purpose

Monitor and control LLM spending across all agents, activities, and tenants. Every dollar spent on inference must be attributable, auditable, and controllable. Costs are surfaced at every level of the system so over-runs are caught early and billed back to the right tenant.

Related: Database — llm_calls, billing_events, agent_monthly_spend tables | Multi-Tenancy — per-tenant cost caps | Agent Hierarchy — cost caps per agent role in agent_configs

Budget Cap Hierarchy

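Four levels of budget enforcement apply. They are numbered from most granular (Level 1, per-task) to broadest (Level 4, tenant monthly); the diagram below nests them from broadest to most granular. An agent task is blocked if the cap at any level is exceeded.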


┌──────────────────────────────────────────────────────────┐
│  Tenant monthly cap                                      │  ← Platform-wide ceiling per client
│    └── Agent monthly cap (per role)                      │  ← Monthly ceiling per agent type
│           └── Campaign budget cap                        │  ← Per-campaign spending limit
│                  └── Per-task cost cap                   │  ← Single execution ceiling
└──────────────────────────────────────────────────────────┘

What Gets Tracked

Event	What’s captured
Every LLM call	model, input tokens, output tokens, cost USD, duration, prompt hash, task run
Every tool call	tool name, method, duration, success/failure
Agent monthly spend	cumulative cost per agent role per tenant per calendar month
Tenant monthly spend	cumulative cost across all agents per tenant per calendar month
Budget cap breaches	level (task / campaign / agent / tenant), threshold hit, action taken
Session totals	cumulative tokens across all calls in a session

Cost Calculation

Cost is calculated immediately after each LLM call completes, before the result is returned to the worker.


const MODEL_PRICING: Record<string, { inputPer1M: number; outputPer1M: number }> = {
  'claude-sonnet-4-6':          { inputPer1M: 3.00,   outputPer1M: 15.00  },
  'claude-opus-4-6':            { inputPer1M: 15.00,  outputPer1M: 75.00  },
  'claude-haiku-4-5-20251001':  { inputPer1M: 1.00,   outputPer1M: 5.00   },
  'gpt-4o':                     { inputPer1M: 2.50,   outputPer1M: 10.00  },
  'gpt-4o-mini':                { inputPer1M: 0.15,   outputPer1M: 0.60   },
  'gemma3:4b':                  { inputPer1M: 0,       outputPer1M: 0      }, // local
  'llama3.2':                   { inputPer1M: 0,       outputPer1M: 0      }, // local
  'mistral':                    { inputPer1M: 0,       outputPer1M: 0      }, // local
};
 
function calculateCost(model: string, inputTokens: number, outputTokens: number): number {
  const p = MODEL_PRICING[model] ?? { inputPer1M: 0, outputPer1M: 0 };
  return (inputTokens / 1_000_000) * p.inputPer1M
       + (outputTokens / 1_000_000) * p.outputPer1M;
}

Pricing is maintained in packages/agent-engine/src/pricing.ts. When a provider changes pricing, update MODEL_PRICING — no DB migration needed.
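
As a worked example of the formula, the sketch below prices one call with 12,000 input tokens and 1,500 output tokens. The two pricing entries are copied from the table above; `costBreakdown` and its `known` flag are illustrative additions, included because `calculateCost` as documented prices an unrecognised model at $0, which would also exempt it from every budget cap.

```typescript
type Pricing = { inputPer1M: number; outputPer1M: number };

// Two entries copied from MODEL_PRICING above.
const PRICING: Record<string, Pricing> = {
  "claude-sonnet-4-6": { inputPer1M: 3.0, outputPer1M: 15.0 },
  "gpt-4o-mini": { inputPer1M: 0.15, outputPer1M: 0.6 },
};

// Hypothetical helper: same arithmetic as calculateCost, but reports unknown models.
function costBreakdown(model: string, inputTokens: number, outputTokens: number) {
  const p = PRICING[model];
  if (!p) return { known: false, inputUsd: 0, outputUsd: 0, totalUsd: 0 };
  const inputUsd = (inputTokens / 1_000_000) * p.inputPer1M;
  const outputUsd = (outputTokens / 1_000_000) * p.outputPer1M;
  return { known: true, inputUsd, outputUsd, totalUsd: inputUsd + outputUsd };
}

// 12,000 / 1M * $3.00 = $0.0360 input; 1,500 / 1M * $15.00 = $0.0225 output
const call = costBreakdown("claude-sonnet-4-6", 12_000, 1_500);
console.log(call.totalUsd.toFixed(4)); // 0.0585
console.log(costBreakdown("gpt-4o-mini", 12_000, 1_500).totalUsd.toFixed(4)); // 0.0027
console.log(costBreakdown("some-new-model", 12_000, 1_500).known); // false
```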

LLM Call Logging

Every call is written to llm_calls immediately after the stream closes:


await db.insert(llmCalls).values({
  id:           crypto.randomUUID(),
  taskRunId:    taskRun.id,
  tenantId:     tenant.id,
  agentRole:    agentConfig.role,
  sessionId:    session?.id ?? null,
  model,
  promptHash:   sha256(fullPrompt),    // no raw prompt stored — privacy
  responseHash: sha256(responseText),
  inputTokens,
  outputTokens,
  costUsd:      calculateCost(model, inputTokens, outputTokens),
  durationMs,
  createdAt:    new Date(),
});

After logging, three counters are incremented atomically in a single transaction:


await db.transaction(async (tx) => {
  // 1. Campaign running total
  await tx.update(campaigns)
    .set({ totalCostUsd: sql`total_cost_usd + ${costUsd}` })
    .where(eq(campaigns.id, campaignId));
 
  // 2. Agent monthly spend (upsert — row created on first spend of the month)
  await tx.insert(agentMonthlySpend)
    .values({ tenantId, agentRole, month: startOfMonth(), totalCostUsd: costUsd })
    .onConflictDoUpdate({
      target: [agentMonthlySpend.tenantId, agentMonthlySpend.agentRole, agentMonthlySpend.month],
      // Qualify the column: inside ON CONFLICT DO UPDATE a bare column name is
      // ambiguous between the existing row and the EXCLUDED (proposed) row.
      set: { totalCostUsd: sql`agent_monthly_spend.total_cost_usd + ${costUsd}` },
    });
 
  // 3. Tenant monthly spend
  await tx.update(tenants)
    .set({ currentMonthSpendUsd: sql`current_month_spend_usd + ${costUsd}` })
    .where(eq(tenants.id, tenantId));
});

Budget Cap Enforcement

Level 1 — Per-task cap

Each agent config has an optional per-task ceiling, limits.maxCostUsdPerTask. The executor re-estimates the running cost on every streamed text delta and aborts the call as soon as the estimate exceeds the cap:


let runningCost = 0;
 
for await (const event of executor.execute(request)) {
  if (event.type === 'text_delta') {
    runningCost = estimateCostFromStreamedTokens(model, estimatedInputTokens, streamedOutputTokens);
 
    if (agentConfig.limits.maxCostUsdPerTask && runningCost > agentConfig.limits.maxCostUsdPerTask) {
      executor.abort();
      throw new CostCapExceededError(`Task cost cap of $${agentConfig.limits.maxCostUsdPerTask} exceeded`);
    }
  }
  yield event;
}
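
The loop above depends on surrounding worker state. A self-contained sketch of the same idea follows, simplified to a synchronous generator so it can run on its own. The `outputTokens` field on each delta, the `withTaskCap` name, and passing the input cost in as a precomputed number are all assumptions for illustration; the real executor is asynchronous and estimates tokens through `estimateCostFromStreamedTokens`.

```typescript
type StreamEvent =
  | { type: "text_delta"; outputTokens: number } // tokens carried by this delta (assumed field)
  | { type: "done" };

class CostCapExceededError extends Error {}

// Re-price after every delta and stop as soon as the running cost passes the cap.
function* withTaskCap(
  events: Iterable<StreamEvent>,
  outputPer1M: number,
  inputCostUsd: number,
  maxCostUsdPerTask: number,
): Generator<StreamEvent> {
  let streamedOutputTokens = 0;
  for (const event of events) {
    if (event.type === "text_delta") {
      streamedOutputTokens += event.outputTokens;
      const runningCost = inputCostUsd + (streamedOutputTokens / 1_000_000) * outputPer1M;
      if (runningCost > maxCostUsdPerTask) {
        throw new CostCapExceededError(`Task cost cap of $${maxCostUsdPerTask} exceeded`);
      }
    }
    yield event; // only events that stayed under the cap reach the consumer
  }
}

const deltas: StreamEvent[] = [
  { type: "text_delta", outputTokens: 400 },
  { type: "text_delta", outputTokens: 400 },
  { type: "text_delta", outputTokens: 400 },
];

// $15 per 1M output tokens, $0.003 already spent on input, cap of $0.016
const delivered: StreamEvent[] = [];
let capMessage = "";
try {
  for (const e of withTaskCap(deltas, 15, 0.003, 0.016)) delivered.push(e);
} catch (err) {
  capMessage = (err as Error).message;
}
console.log(delivered.length); // 2
console.log(capMessage);       // Task cost cap of $0.016 exceeded
```

The running cost after each delta is roughly $0.009, $0.015, and $0.021, so the third delta trips the cap and is never yielded.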

Level 2 — Per-campaign cap

Before each LLM call, the worker checks the campaign’s remaining budget:


async function checkCampaignBudget(campaignId: string, estimatedCallCost: number): Promise<void> {
  const campaign = await db.query.campaigns.findFirst({ where: eq(campaigns.id, campaignId) });
 
  if (!campaign?.budgetCapUsd) return;  // campaign not found or no cap set
 
  if (campaign.totalCostUsd + estimatedCallCost > campaign.budgetCapUsd) {
    await db.update(campaigns)
      .set({ status: 'paused', pauseReason: 'budget_exceeded' })
      .where(eq(campaigns.id, campaignId));
    await drainCampaignQueueJobs(campaignId);
    await notifyBudgetBreached({ level: 'campaign', campaign });
    throw new BudgetCapExceededError(`Campaign budget of $${campaign.budgetCapUsd} would be exceeded`);
  }
}

When paused by budget, all pending tasks for that campaign are drained from the queue. The campaign stays paused until a human increases the budget cap or the month resets (see Month Auto-Reset).

Level 3 — Per-agent monthly cap (NEW)

Each agent config has an optional monthlyBudgetCapUsd. This is a calendar-month ceiling (it resets on the 1st of the month, not on a rolling 30-day window): it limits how much any single agent role can spend across all tasks for a tenant within the month.

Before dispatching a task to an agent, the worker checks the current month’s accumulated spend for that agent role:


async function checkAgentMonthlyBudget(
  tenantId: string,
  agentRole: AgentRole,
  estimatedCallCost: number,
): Promise<'ok' | 'warning'> {  // an exceeded cap throws BudgetCapExceededError instead of returning
  const config = await db.query.agentConfigs.findFirst({
    where: and(eq(agentConfigs.tenantId, tenantId), eq(agentConfigs.role, agentRole)),
  });
 
  if (!config?.monthlyBudgetCapUsd) return 'ok';  // no config row, or no cap set
  // numeric columns are read back as strings by default, so convert before doing arithmetic
  const capUsd = Number(config.monthlyBudgetCapUsd);

  const spend = await db.query.agentMonthlySpend.findFirst({
    where: and(
      eq(agentMonthlySpend.tenantId, tenantId),
      eq(agentMonthlySpend.agentRole, agentRole),
      eq(agentMonthlySpend.month, startOfMonth()),
    ),
  });

  const currentSpend = Number(spend?.totalCostUsd ?? 0);
  const usagePct = currentSpend / capUsd;

  if (currentSpend + estimatedCallCost > capUsd) {
    await pauseAgent(tenantId, agentRole, 'monthly_budget_exceeded');
    await notifyBudgetBreached({ level: 'agent', agentRole, tenantId });
    throw new BudgetCapExceededError(`Agent ${agentRole} monthly budget of $${config.monthlyBudgetCapUsd} exceeded`);
  }
 
  if (usagePct >= 0.8) return 'warning';  // caller injects prompt warning (see below)
  return 'ok';
}

When an agent is paused by monthly budget, only that agent role is paused for that tenant. Other agents continue working. The agent resumes automatically on the 1st of the next month.
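
The decision itself can be isolated as a pure function, which makes the thresholds easy to test without a database. This is a sketch: `classifyAgentBudget` and its parameter names are illustrative, and it returns `'exceeded'` where the documented function throws. It accepts strings because Drizzle reads `numeric` columns back as strings by default; without the conversion, `"41.2" + 0.25` concatenates to `"41.20.25"` instead of adding.

```typescript
type BudgetStatus = "ok" | "warning" | "exceeded";

function classifyAgentBudget(
  totalCostUsd: string | number | null | undefined,
  monthlyBudgetCapUsd: string | number | null | undefined,
  estimatedCallCost: number,
  warnAt: number = 0.8,
): BudgetStatus {
  if (monthlyBudgetCapUsd == null) return "ok"; // null cap means the agent is uncapped
  const cap = Number(monthlyBudgetCapUsd);
  const spend = Number(totalCostUsd ?? 0); // no row yet means nothing spent this month
  if (spend + estimatedCallCost > cap) return "exceeded";
  return spend / cap >= warnAt ? "warning" : "ok";
}

console.log(classifyAgentBudget("12.000000", "50.00", 0.25)); // ok
console.log(classifyAgentBudget("41.200000", "50.00", 0.25)); // warning
console.log(classifyAgentBudget("49.900000", "50.00", 0.25)); // exceeded
console.log(classifyAgentBudget(undefined, null, 1.0));       // ok
```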

Level 4 — Tenant monthly cap (NEW)

The tenants.monthlySpendCapUsd field defines the total monthly LLM spend ceiling for the tenant across all agents. Checked before every task dispatch:


async function checkTenantMonthlyBudget(tenantId: string, estimatedCallCost: number): Promise<void> {
  const tenant = await db.query.tenants.findFirst({ where: eq(tenants.id, tenantId) });
 
  if (!tenant?.monthlySpendCapUsd) return;  // tenant not found or no cap set
 
  const usagePct = tenant.currentMonthSpendUsd / tenant.monthlySpendCapUsd;
 
  if (tenant.currentMonthSpendUsd + estimatedCallCost > tenant.monthlySpendCapUsd) {
    // Pause ALL agent queues for this tenant
    await pauseAllAgentsForTenant(tenantId, 'tenant_monthly_budget_exceeded');
    await notifyBudgetBreached({ level: 'tenant', tenantId });
    throw new BudgetCapExceededError(`Tenant monthly budget of $${tenant.monthlySpendCapUsd} exceeded`);
  }
 
  if (usagePct >= 0.8) {
    // One-time alert — deduplicated per day
    await sendBudgetWarning({ level: 'tenant', tenantId, usagePct });
  }
}

When the tenant cap is hit, all agent queues for that tenant are drained. No new tasks are dispatched until the cap is raised (by the tenant admin or super admin) or the month resets.

Agent Warning at 80% (NEW)

When an agent’s monthly spend reaches 80% of its cap, the next task it receives includes a budget warning injected into its system prompt. This makes the agent itself aware of the constraint — it self-limits by being more concise, skipping optional research steps, and prioritising the core deliverable.

The warning is injected in the adapter layer, before the LLM call:


async function buildSystemPromptWithBudgetWarning(
  baseSystemPrompt: string,
  tenantId: string,
  agentRole: AgentRole,
): Promise<string> {
  const budgetStatus = await checkAgentMonthlyBudget(tenantId, agentRole, 0);
 
  if (budgetStatus !== 'warning') return baseSystemPrompt;
 
  const spend = await getAgentMonthlySpend(tenantId, agentRole);
  const config = await getAgentConfig(tenantId, agentRole);
  const pct = Math.round((spend / config.monthlyBudgetCapUsd) * 100);
 
  const warning = `
⚠️ BUDGET NOTICE: You have used ${pct}% of your monthly token budget for this client.
You are approaching your limit. For this task:
- Be concise in your responses
- Skip optional research or elaboration steps
- Focus only on the core deliverable
- Avoid making unnecessary tool calls
`;
 
  return warning + '\n\n' + baseSystemPrompt;
}

This warning is prepended to the system prompt for all three adapters (Claude, OpenAI, Ollama). It is not stored in the task run’s prompt — it is injected ephemerally at dispatch time.

The warning fires once per task when the agent is in the 80–99% range. At 100%, the agent is hard-stopped (no task is dispatched at all).
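
The 80–99% window can be expressed as a small pure function. This is a sketch rather than the adapter code: `withBudgetNotice` is a hypothetical name, the notice is shortened to one line, and the percentage is clamped to 99 so rounding never tells an agent it has used 100% of a budget it is still allowed to spend from.

```typescript
function withBudgetNotice(baseSystemPrompt: string, spendUsd: number, capUsd: number | null): string {
  if (capUsd === null || capUsd <= 0) return baseSystemPrompt; // uncapped agent
  const usage = spendUsd / capUsd;
  // Below 80% there is nothing to say; at 100% the task is not dispatched at all.
  if (usage < 0.8 || usage >= 1) return baseSystemPrompt;
  const pct = Math.min(99, Math.round(usage * 100));
  const notice = `BUDGET NOTICE: You have used ${pct}% of your monthly budget for this client.`;
  return `${notice}\n\n${baseSystemPrompt}`;
}

const base = "You are a blog writer.";
console.log(withBudgetNotice(base, 42, 50).split("\n")[0]);
// BUDGET NOTICE: You have used 84% of your monthly budget for this client.
console.log(withBudgetNotice(base, 10, 50) === base); // true
```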

Budget Alert System

Alerts fire at two thresholds. The 80% alert notifies humans; the 100% alert also blocks execution.

Threshold	Action — Campaign cap	Action — Agent monthly cap	Action — Tenant monthly cap
80%	Slack alert to DM Portal	Slack alert + warning injected into agent’s next prompt	Slack alert to tenant admin
100%	Campaign paused, queue drained	Agent paused for remainder of month	All agents paused, queue drained


// Budget alert check — BullMQ repeatable job, runs every 15 minutes
async function checkBudgetAlerts(): Promise<void> {
  // Campaign-level
  const campaignsWithCaps = await db.query.campaigns.findMany({
    where: and(isNotNull(campaigns.budgetCapUsd), eq(campaigns.status, 'active')),
  });
 
  for (const campaign of campaignsWithCaps) {
    const usagePct = campaign.totalCostUsd / campaign.budgetCapUsd;
    if (usagePct >= 0.8 && usagePct < 1.0) {
      await sendAlertOnce(`campaign:${campaign.id}:80pct`, {
        text: `⚠️ Campaign "${campaign.name}" is at ${Math.round(usagePct * 100)}% of its budget ($${campaign.totalCostUsd.toFixed(2)} / $${campaign.budgetCapUsd})`,
      });
    }
  }
 
  // Tenant-level
  const tenantsWithCaps = await db.query.tenants.findMany({
    where: isNotNull(tenants.monthlySpendCapUsd),
  });
 
  for (const tenant of tenantsWithCaps) {
    const usagePct = tenant.currentMonthSpendUsd / tenant.monthlySpendCapUsd;
    if (usagePct >= 0.8 && usagePct < 1.0) {
      await sendAlertOnce(`tenant:${tenant.id}:80pct:${startOfMonth()}`, {
        text: `⚠️ Tenant "${tenant.name}" is at ${Math.round(usagePct * 100)}% of their monthly LLM budget`,
        recipient: tenant.adminEmail,
      });
    }
  }
}

sendAlertOnce deduplicates using a Redis key with a 24-hour TTL so the same alert does not fire on every check cycle.
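
The deduplication can be sketched with an in-memory map standing in for Redis. In Redis the same effect is usually achieved with a single `SET key value NX EX 86400`, which succeeds only if the key does not already exist. The `AlertDeduper` class and its injectable clock are illustrative, not the actual `sendAlertOnce` implementation.

```typescript
class AlertDeduper {
  private readonly expiresAt = new Map<string, number>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(ttlMs: number = 24 * 60 * 60 * 1000, now: () => number = Date.now) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Returns true only for the first call per key within the TTL window.
  shouldSend(key: string): boolean {
    const t = this.now();
    const expiry = this.expiresAt.get(key);
    if (expiry !== undefined && expiry > t) return false; // still inside the dedup window
    this.expiresAt.set(key, t + this.ttlMs);
    return true;
  }
}

let clock = 0; // fake clock in milliseconds, so the example does not wait 24 hours
const dedupe = new AlertDeduper(24 * 60 * 60 * 1000, () => clock);

const first = dedupe.shouldSend("campaign:c1:80pct");   // true: alert goes out
clock += 15 * 60 * 1000;                                // next 15-minute check cycle
const second = dedupe.shouldSend("campaign:c1:80pct");  // false: suppressed
clock += 24 * 60 * 60 * 1000;                           // a day later
const third = dedupe.shouldSend("campaign:c1:80pct");   // true: key expired
console.log(first, second, third); // true false true
```

Note that the tenant key in the job above also embeds the month, so a tenant that is still over 80% is alerted again at most once per day, and a new month always starts with a fresh key.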

Month Auto-Reset (NEW)

On the 1st of every calendar month at 00:00 UTC, a BullMQ cron job resets all monthly spend counters and resumes agents/campaigns that were paused solely due to budget exhaustion.


// Registered as a repeatable job at startup
await budgetResetQueue.add('monthly-budget-reset', {}, {
  // 1st of every month at 00:00; tz pins the schedule to UTC instead of the server's local zone
  repeat: { pattern: '0 0 1 * *', tz: 'UTC' },
});
 
// Processor
async function monthlyBudgetReset(): Promise<void> {
  await db.transaction(async (tx) => {
    // 1. Reset all tenant monthly spend counters
    await tx.update(tenants)
      .set({ currentMonthSpendUsd: 0 });
 
    // 2. agent_monthly_spend needs no reset: rows are keyed by month, so the new
    //    month's row is created on first spend and earlier rows remain as history
 
    // 3. Resume campaigns paused by budget
    await tx.update(campaigns)
      .set({ status: 'active', pauseReason: null })
      .where(eq(campaigns.pauseReason, 'budget_exceeded'));
 
    // 4. Resume agents paused by monthly budget
    await tx.update(agentConfigs)
      .set({ status: 'active', pauseReason: null })
      .where(eq(agentConfigs.pauseReason, 'monthly_budget_exceeded'));
  });
 
  // Re-enqueue any campaigns that were waiting for budget reset
  await requeuePausedCampaignTasks();
 
  await logger.info('Monthly budget reset complete');
}

Important: Only campaigns/agents paused with pauseReason = 'budget_exceeded' or 'monthly_budget_exceeded' are resumed. Campaigns paused for other reasons (e.g. 'manual_pause', 'approval_pending') are not touched.
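
The resume rule amounts to an allow-list of pause reasons. A minimal in-memory sketch follows; the `PauseReason` union is assembled from the reasons named in this section, and `Pausable` and `resumeBudgetPaused` are hypothetical names rather than part of the schema.

```typescript
type PauseReason = "budget_exceeded" | "monthly_budget_exceeded" | "manual_pause" | "approval_pending";

interface Pausable {
  id: string;
  status: "active" | "paused";
  pauseReason: PauseReason | null;
}

// Only these reasons are cleared by the monthly reset.
const BUDGET_PAUSE_REASONS: ReadonlySet<PauseReason> = new Set<PauseReason>([
  "budget_exceeded",
  "monthly_budget_exceeded",
]);

function resumeBudgetPaused(rows: Pausable[]): Pausable[] {
  return rows.map((row): Pausable =>
    row.status === "paused" && row.pauseReason !== null && BUDGET_PAUSE_REASONS.has(row.pauseReason)
      ? { ...row, status: "active", pauseReason: null }
      : row, // manual and approval pauses are left exactly as they were
  );
}

const afterReset = resumeBudgetPaused([
  { id: "campaign-1", status: "paused", pauseReason: "budget_exceeded" },
  { id: "campaign-2", status: "paused", pauseReason: "manual_pause" },
  { id: "agent-1", status: "paused", pauseReason: "monthly_budget_exceeded" },
]);
console.log(afterReset.map((r) => `${r.id}:${r.status}`).join(", "));
// campaign-1:active, campaign-2:paused, agent-1:active
```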

New Database Table: `agent_monthly_spend`

Added to providers/provider-db/src/schema/billing.ts:


export const agentMonthlySpend = pgTable('agent_monthly_spend', {
  id:           uuid('id').primaryKey().defaultRandom(),
  tenantId:     uuid('tenant_id').notNull().references(() => tenants.id),
  agentRole:    text('agent_role').notNull(),
  month:        date('month').notNull(),          // always 1st of month: '2026-04-01'
  totalCostUsd: numeric('total_cost_usd', { precision: 10, scale: 6 }).notNull().default('0'),
  updatedOn:    timestamp('updated_on').notNull().defaultNow(),
}, (t) => ({
  uniqueIdx: uniqueIndex('agent_monthly_spend_unique').on(t.tenantId, t.agentRole, t.month),
  tenantIdx: index('agent_monthly_spend_tenant_idx').on(t.tenantId),
}));

This table is append-friendly — one row per (tenant, agent role, month). Historical months are never updated, giving a full spend history per agent over time.
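
Because the unique index includes month, every writer and reader must derive the same key for the same calendar month. One way to do that is sketched below, assuming months are bucketed in UTC to match the 00:00 UTC reset; `monthKeyUtc` is a hypothetical helper, not the `startOfMonth()` used in the snippets above.

```typescript
// Returns the first day of the UTC month as 'YYYY-MM-01', the format stored in the month column.
function monthKeyUtc(at: Date = new Date()): string {
  const year = at.getUTCFullYear();
  const month = at.getUTCMonth() + 1; // getUTCMonth() is zero-based
  return `${year}-${month < 10 ? "0" : ""}${month}-01`;
}

console.log(monthKeyUtc(new Date("2026-04-17T09:30:00Z"))); // 2026-04-01
console.log(monthKeyUtc(new Date("2026-04-30T23:59:59Z"))); // 2026-04-01
console.log(monthKeyUtc(new Date("2026-05-01T00:00:00Z"))); // 2026-05-01
```

Using the UTC getters matters: with local-time getters, a server in a zone behind UTC would file the first hours of a new month under the previous month's row.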

New Field on `agent_configs`: `monthlyBudgetCapUsd`


// Added to the agent_configs table
monthlyBudgetCapUsd: numeric('monthly_budget_cap_usd', { precision: 10, scale: 2 }),
// null = no monthly cap for this agent

Configurable per agent role per tenant in the Dashboard → Team → Agent configuration modal, and globally in the Manage App → Agent Configs (M4).

Enforcement Order at Task Dispatch

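Every time a task is about to be dispatched to an agent, the three pre-dispatch budget checks run in the order below, broadest cap first. The fourth cap, per-task, is enforced mid-stream once the task is executing: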


1. checkTenantMonthlyBudget(tenantId, estimatedCost)
   → Throws if tenant monthly cap exceeded
   → Warning if ≥80% (Slack alert, deduplicated)

2. checkAgentMonthlyBudget(tenantId, agentRole, estimatedCost)
   → Throws if agent monthly cap exceeded (agent paused for month)
   → Returns 'warning' if ≥80%

3. checkCampaignBudget(campaignId, estimatedCost)
   → Throws if campaign cap exceeded (campaign paused)

4. Build system prompt
   → If step 2 returned 'warning': prepend budget notice to system prompt
   → Agent receives warning; self-limits accordingly

5. Dispatch task → execute
   → Per-task cap enforced mid-stream (abort if exceeded)
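
The ordering above can be sketched as a short pipeline. This is a synchronous simplification under stated assumptions: `BudgetCheck` and `runPreDispatchChecks` are illustrative names, each check is modelled as a function that throws when its cap would be exceeded, and only the agent-level warning feeds the prompt notice, as in steps 2 and 4.

```typescript
type CheckOutcome = "ok" | "warning";

interface BudgetCheck {
  level: "tenant" | "agent" | "campaign";
  run: () => CheckOutcome; // throws when the cap would be exceeded
}

// Runs checks in the given order; the first throw stops dispatch and skips the rest.
function runPreDispatchChecks(checks: BudgetCheck[]): { injectBudgetNotice: boolean } {
  let injectBudgetNotice = false;
  for (const check of checks) {
    const outcome = check.run();
    if (check.level === "agent" && outcome === "warning") injectBudgetNotice = true;
  }
  return { injectBudgetNotice };
}

const order: string[] = [];
const plan = runPreDispatchChecks([
  { level: "tenant", run: () => { order.push("tenant"); return "ok"; } },
  { level: "agent", run: () => { order.push("agent"); return "warning"; } },
  { level: "campaign", run: () => { order.push("campaign"); return "ok"; } },
]);
console.log(order.join(" > "), plan.injectBudgetNotice); // tenant > agent > campaign true
```

Running the broadest cap first means an over-budget tenant is rejected before any per-agent or per-campaign lookups are made.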

Aggregation

By campaign


SELECT
  c.id, c.name, cl.name AS client_name,
  COALESCE(SUM(l.cost_usd), 0) AS total_cost_usd,
  SUM(l.input_tokens)           AS total_input_tokens,
  SUM(l.output_tokens)          AS total_output_tokens,
  COUNT(l.id)                   AS total_calls
FROM campaigns c
LEFT JOIN clients cl ON cl.id = c.client_id
LEFT JOIN tasks t    ON t.campaign_id = c.id
LEFT JOIN task_runs tr ON tr.task_id = t.id
LEFT JOIN llm_calls l  ON l.task_run_id = tr.id
GROUP BY c.id, cl.name;

By agent role (monthly)


SELECT
  ams.agent_role,
  ams.month,
  ams.total_cost_usd,
  ac.monthly_budget_cap_usd,
  ROUND((ams.total_cost_usd / NULLIF(ac.monthly_budget_cap_usd, 0)) * 100, 1) AS pct_used
FROM agent_monthly_spend ams
JOIN agent_configs ac ON ac.tenant_id = ams.tenant_id AND ac.role = ams.agent_role
WHERE ams.tenant_id = :tenantId
ORDER BY ams.month DESC, ams.total_cost_usd DESC;

By tenant (platform-wide, for Manage App)


SELECT
  t.id, t.name, t.plan,
  t.current_month_spend_usd,
  t.monthly_spend_cap_usd,
  ROUND((t.current_month_spend_usd / NULLIF(t.monthly_spend_cap_usd, 0)) * 100, 1) AS pct_used
FROM tenants t
WHERE t.deleted_on IS NULL
ORDER BY t.current_month_spend_usd DESC;

Audit Log

The llm_calls table provides a complete audit trail:

What	How
Who spent what	`task_run_id → task → campaign → client` join chain
Which agent role	`agent_role` column (denormalised on `llm_calls` for fast queries)
When	`created_at` on every row
Which model	`model` column
What was asked	`prompt_hash` (SHA-256 — no raw content stored)
What was produced	`response_hash`
How much it cost	`cost_usd`

Dashboard Metrics

The Cost Dashboard screen (/costs) surfaces:

Metric	Source
Total spend this month	`tenants.currentMonthSpendUsd`
Spend vs. monthly cap	`currentMonthSpendUsd / monthlySpendCapUsd` — progress bar
Spend by agent (donut chart)	`agent_monthly_spend` for current month
Agent budget utilisation table	Per agent: spend / cap / % / status (active / paused / warning)
Spend by client (bar chart)	`llm_calls GROUP BY client`
Daily spend trend (line chart)	`llm_calls GROUP BY DATE_TRUNC('day')`
Model split (stacked bar)	`llm_calls GROUP BY model`
Budget alert table	Campaigns and agents at ≥80% cap
LLM call log (paginated table)	Full `llm_calls` with filters: agent, campaign, model, date range

Ollama Cost Handling

Ollama calls have cost_usd = 0.00 by design. Token counts are still tracked (input_tokens, output_tokens) for:

Understanding relative workload of local vs. cloud calls
Future cost modelling if switching a task to a paid model
Context window management (prevent Ollama context overflow)

Because cost_usd is always 0, agent_monthly_spend.totalCostUsd stays at $0 for Ollama-only agents, so they are never blocked by budget caps.

Package Location


packages/agent-engine/
├── src/
│   ├── pricing.ts         # MODEL_PRICING, calculateCost()
│   ├── budget.ts          # checkTenantMonthlyBudget(), checkAgentMonthlyBudget(),
│   │                      # checkCampaignBudget(), pauseAgent(), resumeAgent()
│   ├── budget-warning.ts  # buildSystemPromptWithBudgetWarning() — 80% prompt injection
│   └── budget-reset.ts    # monthlyBudgetReset() — BullMQ cron processor

apps/api/
└── src/
    └── routes/
        └── costs.ts       # API endpoints for Cost Dashboard queries