System Manual

The Credit Engine

How usage is metered: one normalized credit per 1,000 baseline tokens, model multipliers that track real cost, per-org and per-member budgets, hard stops, and plan-gated downshift.

The credit unit

Usage is metered in credits. Internally every run logs raw tokens — the ground truth for cost and margin — but customers see one normalized number. The conversion is defined once, in billing.ts, and read everywhere downstream:

typescript
// 1 credit = 1,000 tokens of the baseline (fast) model.
const TOKENS_PER_CREDIT = 1_000;

credits_charged = max(1, ceil( total_tokens / 1000 * model_multiplier ))

// Multipliers mirror real API price ratios vs the fast baseline:
fast    = 1×    // Gemini Flash, Claude Haiku
smart   = 12×   // Claude Sonnet, Gemini Pro
premium = 60×   // Claude Opus

Because the multiplier tracks the true price ratio, a credit maps to real cost no matter which model an agent uses. A premium run on the same token count costs 60× the credits of a fast run — which is exactly why model choice, escalation, and downshift matter to a budget. Every run is charged at least 1 credit.
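
The conversion can be sketched as a small function. This is a minimal illustration, not the contents of billing.ts: only `TOKENS_PER_CREDIT` and the formula come from the manual, while `ModelTier`, `MODEL_MULTIPLIER`, and `creditsCharged` are hypothetical names. Multiplying before dividing is a choice made here so that integer token counts stay integers until the final `ceil`; the manual does not say how rounding is handled internally.

```typescript
// Sketch only: identifiers other than TOKENS_PER_CREDIT are hypothetical.
type ModelTier = "fast" | "smart" | "premium";

// 1 credit = 1,000 tokens of the baseline (fast) model.
const TOKENS_PER_CREDIT = 1_000;

// Multipliers as listed in the manual.
const MODEL_MULTIPLIER: Record<ModelTier, number> = {
  fast: 1,
  smart: 12,
  premium: 60,
};

// credits = max(1, ceil(total_tokens / 1000 * multiplier)).
// Multiplying first keeps the numerator an integer for integer token counts.
function creditsCharged(totalTokens: number, tier: ModelTier): number {
  const scaledTokens = totalTokens * MODEL_MULTIPLIER[tier];
  return Math.max(1, Math.ceil(scaledTokens / TOKENS_PER_CREDIT));
}

const sonnetRunCredits = creditsCharged(5_000, "smart"); // 60
const emptyRunCredits = creditsCharged(0, "fast");       // 1 (minimum charge)
```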

  • Tokens: raw usage per run; the ground truth logged for cost & margin.
  • ÷ 1,000 (normalize): 1 baseline credit = 1,000 fast-model tokens.
  • × model multiplier: fast 1×, smart 12×, premium 60×.
  • Credits: metered vs. allowance; counted against the monthly cap.
  • Worked example: 5,000 Sonnet (smart) tokens = 5,000 ÷ 1,000 = 5, then 5 × 12 = 60 credits.
  • Hard stop at the cap: when the org (or a member's budget) reaches the monthly allowance, the pre-run gate blocks the run and routes it to triage, so a run never proceeds on credit the org doesn't have.

Figure: how raw tokens become metered credits — normalized per 1,000 tokens, scaled by the model multiplier, then counted against the monthly allowance with a hard stop at the cap.

Model multipliers & tier classification

The tier of a model is inferred from its id: anything containing "opus" is premium; "sonnet" (or Gemini Pro) is smart; "haiku", "flash", or any other Gemini model is fast. A model that matches none of these is priced conservatively as smart (12×) rather than fast, to avoid undercharging.

fast · 1× · Gemini Flash, Claude Haiku
smart · 12× · Claude Sonnet, Gemini Pro
premium · 60× · Claude Opus
unknown model · priced as smart (12×) to avoid undercharging
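
The classification rules above could be sketched as a substring match on the model id. The function name `classifyTier`, the case-insensitive matching, and the example ids are all assumptions made for illustration; the manual only states which keywords map to which tier.

```typescript
// Sketch only: classifyTier and the example ids are hypothetical.
type ModelTier = "fast" | "smart" | "premium";

function classifyTier(modelId: string): ModelTier {
  // Case-insensitive matching is an assumption, not stated in the manual.
  const id = modelId.toLowerCase();
  if (id.includes("opus")) return "premium";
  if (id.includes("sonnet")) return "smart";
  if (id.includes("haiku") || id.includes("flash")) return "fast";
  // Gemini Pro is smart; every other Gemini model is fast.
  if (id.includes("gemini")) return id.includes("pro") ? "smart" : "fast";
  // Anything unrecognized is priced as smart (12×).
  return "smart";
}

const opusTier = classifyTier("claude-3-opus");     // "premium"
const geminiProTier = classifyTier("gemini-1.5-pro"); // "smart"
const unknownTier = classifyTier("mystery-model");  // "smart"
```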

Metering & enforcement

The credit meter wraps every run as part of the harness, in three moves:

  1. Pre-run gate

    Before an agent runs, the meter checks the org's remaining credits for the billing period and, on plans that support them, the member's personal budget. If either is exhausted, the gate applies a hard stop: the run is blocked and routed to triage with a clear reason, so it never runs on credit it doesn't have.
  2. Model downshift

    The meter returns the plan, so the harness can downshift the agent's model to the best tier the plan allows — the agent still runs, on a cheaper brain.
  3. Post-run reconcile

    After a successful run, actual tokens are converted to credits, a row is appended to the per-org credit_ledger (with agent, model, and run id), and the period counter is atomically incremented. Metering is best-effort — a metering failure is logged but never breaks a completed run.
Note: Credits are org-scoped. Admins on org plans can set per-member budgets; when a member hits their budget the run is blocked with blockedBy: "member", distinct from the org running dry (blockedBy: "organization"), so the UI can tailor the message. Out-of-credit top-ups are sold at $25 per 1,000 credits.
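
The pre-run gate and post-run reconcile can be sketched with in-memory stand-ins. Everything named here (`Budget`, `GateResult`, `preRunGate`, `LedgerRow`, `reconcile`) is hypothetical; only the two `blockedBy` values and the ledger fields come from the manual. Checking the org before the member is an assumption, since the manual gives no order, and the plain array and counter stand in for the real credit_ledger and its atomic increment.

```typescript
// Sketch only: every type and function name here is hypothetical.
type BlockedBy = "organization" | "member";

interface Budget {
  used: number;      // credits consumed this billing period
  allowance: number; // credits available this billing period
}

type GateResult =
  | { allowed: true }
  | { allowed: false; blockedBy: BlockedBy };

// Pre-run gate: hard stop when either budget is exhausted.
// Checking the org first is an assumption; the manual gives no order.
function preRunGate(org: Budget, member?: Budget): GateResult {
  if (org.used >= org.allowance) {
    return { allowed: false, blockedBy: "organization" };
  }
  if (member !== undefined && member.used >= member.allowance) {
    return { allowed: false, blockedBy: "member" };
  }
  return { allowed: true };
}

interface LedgerRow {
  agent: string;
  model: string;
  runId: string;
  credits: number;
}

// Post-run reconcile: append a ledger row, then bump the period counter.
// Best-effort: a failure is reported to the caller, never thrown, so a
// completed run is not broken. A real implementation would log it.
function reconcile(org: Budget, ledger: LedgerRow[], row: LedgerRow): boolean {
  try {
    ledger.push(row);
    org.used += row.credits;
    return true;
  } catch {
    return false;
  }
}

const exampleOrg: Budget = { used: 2_900, allowance: 3_000 };
const exampleLedger: LedgerRow[] = [];
const exampleGate = preRunGate(exampleOrg, { used: 50, allowance: 50 });
// exampleGate is { allowed: false, blockedBy: "member" }
```

Returning a discriminated union keeps the reason attached to the block decision, which is what lets a UI show a different message for a member budget than for an exhausted org.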

Worked example: computing credits for a run

Example · A 9,200-token run on three different models

An agent run consumes 9,200 total tokens. The credit charge depends entirely on the model tier it ran on:

  • Fast (Haiku, 1×): ceil(9200 / 1000 × 1) = ceil(9.2) = 10 credits.
  • Smart (Sonnet, 12×): ceil(9.2 × 12) = ceil(110.4) = 111 credits.
  • Premium (Opus, 60×): ceil(9.2 × 60) = ceil(552) = 552 credits.

Same work, same tokens, yet roughly 55× the credits (552 vs. 10) between the cheapest and most expensive brain. If this agent were on a Pro plan (fast + smart only) but defaulted to Opus, it would be downshifted to Sonnet and charged 111 credits, not 552. Measured against a Starter plan's 500 included credits, a single Sonnet run of this size would consume more than a fifth of the monthly allowance (111 of 500), which is why Starter is fast-tier only.

The five plan tiers

Five tiers govern included credits, which model tiers an agent may reach, whether the harness may escalate to Opus, seat counts, per-member budgets, API access, and audit retention. Every plan can use every agent; where agents are concerned, tiers differ only in which brains they may run on and how many credits are included.

Starter · $0/mo · 500 credits · fast only · no Opus escalation · 1 seat · no per-member budgets · no API · 7-day audit log
Pro · $25/mo · 3,000 credits · fast + smart · no Opus escalation · 1 seat · no per-member budgets · no API · 30-day audit log
Team · $99/mo · 12,000 credits · fast + smart · no Opus escalation · 3 seats · per-member budgets · API · 60-day audit log
Growth · $299/mo · 40,000 credits · fast + smart + premium · Opus escalation · 10 seats · per-member budgets · API · 90-day audit log
Enterprise · custom · committed credits · all tiers · Opus escalation · unlimited seats · per-member budgets · API · 365-day audit log
Tip: A restricted plan downshifts a premium agent to its best allowed model rather than blocking it, so a Pro customer still gets every agent in the fleet, just running on Sonnet instead of Opus.
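
Plan-gated downshift can be sketched as a lookup followed by a walk down the tier order. The names `Plan`, `TIER_ORDER`, `PLAN_TIERS`, and `downshift` are hypothetical; the allowed tiers per plan are taken from the plan list above, and the real harness may represent plans differently.

```typescript
// Sketch only: PLAN_TIERS, TIER_ORDER, and downshift are hypothetical names.
type ModelTier = "fast" | "smart" | "premium";
type Plan = "starter" | "pro" | "team" | "growth" | "enterprise";

// Cheapest to most expensive.
const TIER_ORDER: ModelTier[] = ["fast", "smart", "premium"];

// Model tiers each plan may reach, per the plan list.
const PLAN_TIERS: Record<Plan, ModelTier[]> = {
  starter: ["fast"],
  pro: ["fast", "smart"],
  team: ["fast", "smart"],
  growth: ["fast", "smart", "premium"],
  enterprise: ["fast", "smart", "premium"],
};

// Return the requested tier if the plan allows it,
// otherwise the best allowed tier below it.
function downshift(plan: Plan, requested: ModelTier): ModelTier {
  const allowed = PLAN_TIERS[plan];
  for (let i = TIER_ORDER.indexOf(requested); i >= 0; i--) {
    if (allowed.includes(TIER_ORDER[i])) return TIER_ORDER[i];
  }
  return "fast"; // every plan in the list includes the fast tier
}

const proOnOpus = downshift("pro", "premium");       // "smart"
const growthOnOpus = downshift("growth", "premium"); // "premium"
```

Because the function always returns some tier, the agent still runs on a restricted plan; only the pre-run gate can block it.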

Screenshot: the usage meter in the dashboard, with a ring showing credits used vs. included for the current billing period, a breakdown by department and model tier, per-member budget bars, and the credit ledger of recent runs (agent, model, tokens, credits).