System Manual

The Credit Engine

How usage is metered: one normalized credit per 1,000 baseline tokens, model multipliers that track real cost, per-org and per-member budgets, hard stops, and plan-gated downshift.

The credit unit

Usage is metered in credits. Internally every run logs raw tokens — the ground truth for cost and margin — but customers see one normalized number. The conversion is defined once, in billing.ts, and read everywhere downstream:

typescript

// 1 credit = 1,000 tokens of the baseline (fast) model.
const TOKENS_PER_CREDIT = 1_000;

// credits_charged = max(1, ceil(total_tokens / TOKENS_PER_CREDIT * model_multiplier))
//
// Multipliers mirror real API price ratios vs the fast baseline:
//   fast    = 1×    (Gemini Flash, Claude Haiku)
//   smart   = 12×   (Claude Sonnet, Gemini Pro)
//   premium = 60×   (Claude Opus)

Because the multiplier tracks the true price ratio, a credit maps to real cost no matter which model an agent uses. A premium run on the same token count costs 60× the credits of a fast run — which is exactly why model choice, escalation, and downshift matter to a budget. Every run is charged at least 1 credit.
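The formula above can be sketched as a small runnable function. This is a minimal sketch, not the contents of billing.ts: `Tier`, `MODEL_MULTIPLIER`, and `creditsCharged` are names assumed here for illustration. It multiplies before dividing, which is mathematically the same as the formula but keeps the intermediate value an integer for integer token counts, so the ceiling is not thrown off by floating-point rounding.

```typescript
// Hypothetical names; only TOKENS_PER_CREDIT and the multiplier values come from the manual.
const TOKENS_PER_CREDIT = 1_000;

type Tier = "fast" | "smart" | "premium";

const MODEL_MULTIPLIER: Record<Tier, number> = {
  fast: 1, // Gemini Flash, Claude Haiku
  smart: 12, // Claude Sonnet, Gemini Pro
  premium: 60, // Claude Opus
};

function creditsCharged(totalTokens: number, tier: Tier): number {
  if (!isFinite(totalTokens) || totalTokens < 0) {
    throw new RangeError("totalTokens must be a non-negative finite number");
  }
  // Multiply first: tokens * multiplier is an integer, so only the final
  // division can produce a fraction for Math.ceil to round up.
  const scaled = (totalTokens * MODEL_MULTIPLIER[tier]) / TOKENS_PER_CREDIT;
  // Every run is charged at least 1 credit.
  return Math.max(1, Math.ceil(scaled));
}
```

Under these assumptions, `creditsCharged(5000, "smart")` returns 60, and a zero-token run still returns 1 because of the minimum charge.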

Figure: how raw tokens become metered credits — normalized per 1,000 tokens, scaled by the model multiplier, then counted against the monthly allowance with a hard stop at the cap.

Tokens	Raw usage per run; the ground truth logged for cost & margin
÷ 1,000	Normalize: 1 baseline credit = 1,000 fast-model tokens
× model multiplier	fast 1×, smart 12×, premium 60×
Credits	Metered vs. allowance; counted against the monthly cap

Worked example: 5,000 Sonnet (smart) tokens = 5,000 ÷ 1,000 = 5, then 5 × 12 = 60 credits.

Hard stop at the cap: when the org (or a member's budget) reaches the monthly allowance, the pre-run gate blocks the run and routes it to triage; a run never proceeds on credit the org doesn't have.

Model multipliers & tier classification

The tier of a model is inferred from its id: anything with "opus" is premium; "sonnet" (or Gemini Pro) is smart; "haiku", "flash", or other Gemini models are fast. An unknown model is priced conservatively as smart so the meter never undercharges.

fast	1× — Gemini Flash, Claude Haiku
smart	12× — Claude Sonnet, Gemini Pro
premium	60× — Claude Opus
unknown model	Priced as smart (12×) to avoid undercharging
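The classification rule above can be sketched as a single function over the model id. This is an illustrative sketch under stated assumptions: `classifyTier` is a hypothetical name, matching is done case-insensitively on substrings, and a "Gemini Pro" id is assumed to contain both "gemini" and a standalone "pro" token. The actual matching logic is not shown in the manual.

```typescript
type Tier = "fast" | "smart" | "premium";

// Hypothetical helper: infers the pricing tier from a model id string.
function classifyTier(modelId: string): Tier {
  const id = modelId.toLowerCase();
  if (/opus/.test(id)) return "premium";
  if (/sonnet/.test(id)) return "smart";
  const isGemini = /gemini/.test(id);
  // Assumption: Gemini Pro ids carry "pro" as its own token, e.g. "gemini-x-pro".
  if (isGemini && /(^|[^a-z])pro([^a-z]|$)/.test(id)) return "smart";
  // Haiku, Flash, and any other Gemini model are fast.
  if (isGemini || /haiku|flash/.test(id)) return "fast";
  // Unknown models are priced as smart so the meter never undercharges.
  return "smart";
}
```

The order of checks matters in this sketch: the Gemini Pro test has to run before the general Gemini fallback, otherwise every Gemini id would be classified as fast.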

Metering & enforcement

The credit meter wraps every run as part of the harness, in three moves:

• Pre-run gate: Before an agent runs, the meter checks the org's remaining credits for the billing period and, on plans that support them, the member's personal budget. If either is exhausted, it is a hard stop: the run is blocked and routed to triage with a clear reason rather than run on credit the org doesn't have.
• Model downshift: The meter returns the plan, so the harness can downshift the agent's model to the best tier the plan allows. The agent still runs, on a cheaper brain.
• Post-run reconcile: After a successful run, actual tokens are converted to credits, a row is appended to the per-org credit_ledger (with agent, model, and run id), and the period counter is atomically incremented. Metering is best-effort: a metering failure is logged but never breaks a completed run.
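The post-run reconcile step can be sketched as follows. Everything named here is an assumption for illustration: the `MeterStore` interface, the `LedgerRow` shape, `reconcileRun`, and the in-memory store are hypothetical, and a real store would rely on a database-level atomic increment rather than a plain object. The sketch shows only the best-effort property: a metering failure is logged and reported, never thrown back at the completed run.

```typescript
// Hypothetical ledger row: the manual lists agent, model, and run id as recorded fields.
interface LedgerRow {
  orgId: string;
  agent: string;
  model: string;
  runId: string;
  tokens: number;
  credits: number;
}

// Hypothetical storage interface standing in for credit_ledger and the period counter.
interface MeterStore {
  appendLedgerRow(row: LedgerRow): void;
  incrementPeriodCounter(orgId: string, credits: number): void;
}

// In-memory stand-in, useful only for demonstration and tests.
function createInMemoryStore(): MeterStore & { ledger: LedgerRow[]; used: Record<string, number> } {
  const ledger: LedgerRow[] = [];
  const used: Record<string, number> = {};
  return {
    ledger,
    used,
    appendLedgerRow(row: LedgerRow): void {
      ledger.push(row);
    },
    incrementPeriodCounter(orgId: string, credits: number): void {
      used[orgId] = (used[orgId] ?? 0) + credits;
    },
  };
}

// Best-effort: returns false and logs on failure instead of throwing.
function reconcileRun(
  store: MeterStore,
  row: LedgerRow,
  logError: (message: string) => void,
): boolean {
  try {
    store.appendLedgerRow(row);
    store.incrementPeriodCounter(row.orgId, row.credits);
    return true;
  } catch (err) {
    logError("metering failed for run " + row.runId + ": " + String(err));
    return false;
  }
}
```

One design point this sketch leaves open: if the ledger append succeeds and the counter increment fails, the two can disagree. A production implementation might wrap both writes in one transaction or rebuild the counter from the ledger.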

Note: Credits are org-scoped. Admins on org plans can set per-member budgets. When a member hits their budget, the run is blocked with blockedBy: "member", which is distinct from the org running dry (blockedBy: "organization"), so the UI can tailor the message. Out-of-credit top-ups are sold at $25 per 1,000 credits.
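A pre-run gate that distinguishes the two block reasons might look like the sketch below. Only the `blockedBy` values "member" and "organization" come from the manual; `GateInput`, `GateResult`, `preRunGate`, the `reason` strings, and the choice to check the org before the member are assumptions made for illustration.

```typescript
// Hypothetical input: usage and limits for the current billing period.
interface GateInput {
  orgCreditsUsed: number;
  orgCreditsIncluded: number;
  // Left undefined on plans without per-member budgets, or when no budget is set.
  memberCreditsUsed?: number;
  memberBudget?: number;
}

type GateResult =
  | { allowed: true }
  | { allowed: false; blockedBy: "organization" | "member"; reason: string };

function preRunGate(input: GateInput): GateResult {
  // Assumption: the org check runs first, so an exhausted org wins over an exhausted member.
  if (input.orgCreditsUsed >= input.orgCreditsIncluded) {
    return {
      allowed: false,
      blockedBy: "organization",
      reason: "Organization has used its credits for this billing period",
    };
  }
  if (input.memberBudget !== undefined && (input.memberCreditsUsed ?? 0) >= input.memberBudget) {
    return {
      allowed: false,
      blockedBy: "member",
      reason: "Member has reached their personal budget",
    };
  }
  return { allowed: true };
}
```

Modeling the result as a discriminated union means a caller has to check `allowed` before it can read `blockedBy`, which is one way to let the UI tailor the message per block reason.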

Worked example: computing credits for a run

Example · A 9,200-token run on three different models

An agent run consumes 9,200 total tokens. The credit charge depends entirely on the model tier it ran on:

Fast (Haiku, 1×): ceil(9200 / 1000 × 1) = ceil(9.2) = 10 credits.
Smart (Sonnet, 12×): ceil(9.2 × 12) = ceil(110.4) = 111 credits.
Premium (Opus, 60×): ceil(9.2 × 60) = ceil(552) = 552 credits.

Same work, same tokens, yet roughly 55× the credits (552 vs. 10) between the cheapest and most expensive brain. If this agent were on a Pro plan (fast + smart only) but defaulted to Opus, the meter would downshift it to Sonnet and charge 111, not 552. On a Starter plan (500 included credits), a single Sonnet run of this size would consume more than a fifth of the monthly allowance (111 of 500), which is why Starter is fast-tier only.

The five plan tiers

Five tiers govern included credits, which model tiers an agent may reach, whether the harness may escalate to Opus, seat counts, per-member budgets, API access, and audit retention. Every plan can use every agent; as far as agents are concerned, plans differ only in which brains they may run on and how many credits are included.

Starter — $0/mo	500 credits · fast only · no Opus escalation · 1 seat · no per-member budgets · no API · 7-day audit log
Pro — $25/mo	3,000 credits · fast + smart · no Opus escalation · 1 seat · no per-member budgets · no API · 30-day audit log
Team — $99/mo	12,000 credits · fast + smart · no Opus escalation · 3 seats · per-member budgets · no API · 60-day audit log
Growth — $299/mo	40,000 credits · fast + smart + premium · Opus escalation · 10 seats · per-member budgets · rate-limited API · 90-day audit log
Enterprise — custom	Committed credits · all tiers · Opus escalation · unlimited seats · per-member budgets · unlimited API · 365-day audit log

Tip: A restricted plan downshifts a premium agent to its best allowed model rather than blocking it, so a Pro customer still gets every agent in the fleet, just running on Sonnet instead of Opus.
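The plan-gated downshift can be sketched as a lookup of each plan's best allowed tier. The tier limits below are taken from the plan table; `PlanName`, `BEST_TIER`, `TIER_RANK`, and `downshiftTier` are hypothetical names, and choosing a concrete model within the resulting tier is outside this sketch.

```typescript
type Tier = "fast" | "smart" | "premium";
type PlanName = "starter" | "pro" | "team" | "growth" | "enterprise";

// Best tier each plan may reach, per the plan table.
const BEST_TIER: Record<PlanName, Tier> = {
  starter: "fast",
  pro: "smart",
  team: "smart",
  growth: "premium",
  enterprise: "premium",
};

// Ordering used to compare tiers: higher rank means a more expensive brain.
const TIER_RANK: Record<Tier, number> = { fast: 0, smart: 1, premium: 2 };

// Returns the requested tier if the plan allows it, else the plan's best tier.
function downshiftTier(plan: PlanName, requested: Tier): Tier {
  const best = BEST_TIER[plan];
  return TIER_RANK[requested] > TIER_RANK[best] ? best : requested;
}
```

With this sketch, `downshiftTier("pro", "premium")` returns "smart", matching the Pro customer in the tip who runs on Sonnet instead of Opus. A request at or below the plan's limit is returned unchanged.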

Screenshot: the dashboard usage meter, with a ring showing credits used vs. included for the current billing period, a breakdown by department and model tier, per-member budget bars, and the credit ledger of recent runs (agent, model, tokens, credits).
