Agent Cost Monitoring

We run ≈$8-9k/week of Claude Code list-price spend across slots a1-a10. Cost incidents (runaway patrol sessions, broken cache eviction, accidental Opus usage) used to surface weeks late, and only when someone parsed the JSONL transcripts by hand. `pnpm crux sys cost report` is the on-demand terminal view that catches them in days, not weeks.

Quick reference

# Default: last 7 days, daily totals
pnpm crux sys cost report

# Rank slots by spend over last 14 days
pnpm crux sys cost report --by=session --since=14d

# 5-hour billing-block view (matches Claude's rate-limit windows)
pnpm crux sys cost report --by=blocks

# Per-model breakdown — useful for "is this Opus or Sonnet?"
pnpm crux sys cost report --by=session --breakdown

# Machine-readable for piping into another tool
pnpm crux sys cost report --by=session --json | jq '.sessions[0]'

The command is a wrapper around ccusage, which parses the same JSONL transcripts (`~/.claude/projects/*/*.jsonl`) we used to grep manually. Cost numbers match list-price billing (Opus 4.7 = $5/$25 per Mtok input/output, Sonnet 4.6 = $3/$15, etc.).
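As a worked example of the list-price arithmetic, the sketch below prices a token count using only the two rates quoted above. The function name `list_price_usd`, the `PRICES_PER_MTOK` table, and its model keys are illustrative labels, not part of ccusage or crux. The sketch also ignores cache-read and cache-write tokens, which are billed at different rates, so it will not reproduce a real report total.

```python
# Illustrative list-price arithmetic; all names here are hypothetical, not a ccusage API.
# Rates are (input, output) in USD per million tokens, as quoted in this doc.
PRICES_PER_MTOK = {
    "opus-4.7": (5.0, 25.0),
    "sonnet-4.6": (3.0, 15.0),
}


def list_price_usd(model: str, input_tokens: int, output_tokens: int) -> float:
    """Price plain input/output tokens at list price (cache tokens are not modeled)."""
    input_rate, output_rate = PRICES_PER_MTOK[model]
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000


if __name__ == "__main__":
    # Same workload on both models: 2M input tokens, 400k output tokens.
    for model in PRICES_PER_MTOK:
        print(model, list_price_usd(model, 2_000_000, 400_000))
```

For that workload the Opus figure comes to $20 and the Sonnet figure to $12, which is why accidental Opus usage stands out in a per-model breakdown.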

What `--by` means

| `--by` | Grouping | Useful for |
| --- | --- | --- |
| `daily` (default) | Per-day totals | Spotting day-over-day spikes |
| `session` | Per project directory (= per slot for us) | "Which slot is burning the most?" |
| `weekly` | Per ISO week | Weekly trend |
| `monthly` | Per calendar month | Budget review |
| `blocks` | Per 5-hour billing window | Matches Claude's rate-limit semantics |

Note: ccusage calls a slot a "session" because it groups by project directory; an actual Claude Code conversation has its own UUID JSONL file. For per-conversation breakdowns, raw ~/.claude/projects/<slot>/<uuid>.jsonl is the source — there's no first-class command for that yet.
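Until a first-class command exists, a per-conversation view can be approximated by reading the raw files directly. The following is a minimal sketch, not a supported tool. It assumes each transcript line is a JSON object whose `message.usage` carries the counters named in `USAGE_FIELDS`; those field names are an assumption about the transcript format, and `sum_usage` and `per_conversation` are hypothetical helpers. It performs no deduplication and no pricing, so treat its totals as a rough cross-check rather than a match for the report.

```python
import json
from collections import Counter
from pathlib import Path

# Assumed counter names inside each transcript line's message.usage object.
USAGE_FIELDS = (
    "input_tokens",
    "output_tokens",
    "cache_creation_input_tokens",
    "cache_read_input_tokens",
)


def sum_usage(lines):
    """Total the usage counters across the JSONL lines of one conversation."""
    totals = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip truncated or partially written lines
        if not isinstance(entry, dict):
            continue
        message = entry.get("message")
        usage = message.get("usage") if isinstance(message, dict) else None
        if not isinstance(usage, dict):
            continue  # lines without a usage object contribute nothing
        for field in USAGE_FIELDS:
            totals[field] += int(usage.get(field) or 0)
    return dict(totals)


def per_conversation(slot_dir):
    """Map each <uuid> in a slot directory to its token totals."""
    result = {}
    for path in sorted(Path(slot_dir).expanduser().glob("*.jsonl")):
        with path.open(encoding="utf-8") as handle:
            result[path.stem] = sum_usage(handle)
    return result
```

Calling `per_conversation` on a slot's directory under `~/.claude/projects/` returns one entry per conversation UUID, which can then be sorted by whichever counter looks suspicious.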

Detecting an incident — playbook

When you suspect a cost spike (Discord ping, monthly bill review, gut feel):

1. Run `pnpm crux sys cost report --by=session --since=7d` to rank slots.
2. If one slot is way ahead, drill in: `pnpm crux sys cost report --by=blocks --since=2d --breakdown`.
3. Check the slot's tmux window, or run `crux sys sessions list --slot=N`, to see what's running.
4. Common causes:
   - Opus on a refactor that should be Sonnet: kill, swap model, restart.
   - Re-read/re-summarize loop: kill, file a Linear ticket on the looping pattern.
   - A patrol session with a stuck loop: see .claude/rules/agent-planning-discipline.md.
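The first two playbook steps can be sketched as a small filter over the `--json` output. This is a rough illustration under stated assumptions: the top-level `sessions` array appears in the quick reference, but the per-entry field names `sessionId` and `totalCost` are assumed, and "way ahead" is given a hypothetical definition here (more than `factor` times the median slot cost). Both helper names are made up for this example.

```python
from statistics import median


def rank_slots(report):
    """Return (slot, cost) pairs sorted by cost, highest first."""
    sessions = report.get("sessions", [])
    pairs = [
        # sessionId / totalCost are assumed field names, not confirmed here.
        (str(s.get("sessionId", "?")), float(s.get("totalCost", 0.0)))
        for s in sessions
    ]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


def slots_way_ahead(report, factor=3.0):
    """Flag slots whose cost exceeds `factor` times the median slot cost."""
    ranked = rank_slots(report)
    if len(ranked) < 2:
        return []  # nothing to compare against
    typical = median(cost for _, cost in ranked)
    if typical <= 0:
        return []
    return [(slot, cost) for slot, cost in ranked if cost > factor * typical]


if __name__ == "__main__":
    # Hand-written sample in the assumed shape; real input would come from
    #   pnpm crux sys cost report --by=session --since=7d --json
    sample = {"sessions": [
        {"sessionId": "a1", "totalCost": 40.0},
        {"sessionId": "a2", "totalCost": 35.0},
        {"sessionId": "a3", "totalCost": 410.0},
    ]}
    for slot, cost in slots_way_ahead(sample):
        print(f"{slot}\t${cost:.2f}")
```

A median-based threshold is used because a single runaway slot inflates a mean but barely moves the median. The `factor` of 3 is arbitrary and would need tuning against real weeks.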

What's deliberately out of scope

- Live dashboards / alerts: we already run Grafana in the production k8s cluster. If we ever want live cost dashboards, the right move is to wire Claude Code's built-in OTel exporter (`CLAUDE_CODE_ENABLE_TELEMETRY=1`) to the existing prod Prometheus, not to run a parallel docker stack on each workstation.
- Per-PR / per-Linear-ticket cost attribution: needs a join with `agent_sessions`. Separate ticket if it becomes useful.
- Cost-by-user attribution: single-user setup, not worth the labels.
