# Cost Optimization
Caliper has three enforcement layers. Only two of them use the Anthropic API.
| Layer | API usage? | Speed |
|---|---|---|
| Convention enforcement (`npx caliper check`) | No (free) | Sub-second |
| Local AI review (`npx caliper review`) | Yes | 30 s–3 min |
| PR review (`npx caliper <pr>`) | Yes | 1–10 min |
Convention checks are pure pattern matching --- no API calls. They are always the fastest way to catch issues.
API usage scales with the number of files, risk level, and which pipeline phases run. The progress display shows a running token count during each review.
## Fast path
Small PRs automatically use a fast path that skips synthesis, lenses, consolidation, and narrative phases. This cuts cost and time significantly.
A PR qualifies when all three thresholds are met:
| Threshold | Default | Config key |
|---|---|---|
| Changed lines | < 100 | `fastPath.maxChangedLines` |
| Files | ≤ 3 | `fastPath.maxFiles` |
| High-risk files | 0 | `fastPath.maxHighRiskFiles` |
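The all-three rule above can be sketched as a single predicate. This is a minimal illustration, assuming a hypothetical `qualifiesForFastPath` helper; the names and shapes here are not Caliper's actual internals.

```typescript
// Illustrative sketch only; field and function names are assumptions.
interface FastPathConfig {
  maxChangedLines: number;  // strict upper bound (< 100 by default)
  maxFiles: number;         // inclusive bound (≤ 3 by default)
  maxHighRiskFiles: number; // inclusive bound (0 by default)
}

interface PrStats {
  changedLines: number;
  files: number;
  highRiskFiles: number;
}

// A PR qualifies only when all three thresholds are met.
function qualifiesForFastPath(
  pr: PrStats,
  cfg: FastPathConfig = { maxChangedLines: 100, maxFiles: 3, maxHighRiskFiles: 0 }
): boolean {
  return (
    pr.changedLines < cfg.maxChangedLines &&
    pr.files <= cfg.maxFiles &&
    pr.highRiskFiles <= cfg.maxHighRiskFiles
  );
}
```

Note that the changed-lines bound is strict (`<`) while the other two are inclusive, matching the defaults table.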
Tune the thresholds in `.caliper/config.yaml`:

```yaml
fastPath:
  maxChangedLines: 200   # relax for repos with frequent small PRs
  maxFiles: 5
  maxHighRiskFiles: 1    # allow one high-risk file on the fast path
```

You can also force the behavior per run:
```bash
npx caliper 42 --fast   # force fast path (skip synthesis/lenses/consolidation/narrative)
npx caliper 42 --full   # force full pipeline even for small PRs
```

## Model selection
Caliper assigns models to pipeline phases based on PR scope. Three semantic slots map to Claude models:
| Slot | Default model | Used for |
|---|---|---|
| `small` | Haiku | Mechanical tasks: crossfile selection, consolidation on small PRs, narrative on small PRs |
| `medium` | Sonnet | Standard review, synthesis, lenses on normal PRs |
| `large` | Opus | Deep reasoning: review, synthesis, and lenses on large PRs |
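Resolving a slot name to a concrete model amounts to layering the user's config over these defaults. The sketch below is an assumption for illustration (the `resolveModel` helper is hypothetical), but it mirrors the two config shapes shown in this section: a single model string for everything, or a per-slot mapping.

```typescript
// Illustrative sketch only; slot names come from the table above,
// but this resolution logic is an assumption, not Caliper's code.
type Slot = "small" | "medium" | "large";

const DEFAULT_MODELS: Record<Slot, string> = {
  small: "haiku",   // mechanical tasks
  medium: "sonnet", // standard review
  large: "opus",    // deep reasoning on large PRs
};

// `model: haiku` (a string) applies one model to every slot;
// `model: { small: ..., ... }` overrides slots individually.
function resolveModel(
  slot: Slot,
  config?: string | Partial<Record<Slot, string>>
): string {
  if (typeof config === "string") return config;
  return config?.[slot] ?? DEFAULT_MODELS[slot];
}
```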
Configure in `.caliper/config.yaml`:
```yaml
# Use one model for everything (lightest API usage):
model: haiku

# Or assign per slot:
model:
  small: haiku
  medium: sonnet
  large: sonnet   # use Sonnet instead of Opus for the large slot
```

The `npx caliper init` wizard offers two presets:
- Light and cheap --- Haiku for most phases, Sonnet only for large PRs. Lighter API usage, good for most codebases.
- Thorough --- Sonnet and Opus. Heavier API usage, better for complex architectural review.
## Cost controls
### `--max-cost` flag (CI mode)
Set a hard cost ceiling. If the estimated cost exceeds this amount, Caliper posts an explanatory comment and exits cleanly without running the review.
```bash
npx caliper 42 --ci --max-cost 2.00
```

### `costWarningThreshold` config
In interactive mode, Caliper prompts for confirmation when the estimated cost exceeds this threshold. Default is $2.
```yaml
# .caliper/config.yaml
costWarningThreshold: 5   # warn above $5 instead of $2
```

Set it high (e.g., 999) to effectively disable the warning.
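Taken together, the two controls gate a review run on its estimated cost before any API work happens. The following is a hedged sketch of that decision logic; the `costGate` function and its return values are assumptions for illustration, not Caliper's API.

```typescript
// Illustrative only: how a hard ceiling (CI) and a warning
// threshold (interactive) might gate a review run.
type Decision = "run" | "skip" | "confirm";

function costGate(
  estimatedCost: number,
  opts: { ci: boolean; maxCost?: number; costWarningThreshold?: number }
): Decision {
  const warnAt = opts.costWarningThreshold ?? 2; // default is $2
  if (opts.ci) {
    // --max-cost: over the ceiling, post a comment and exit cleanly.
    if (opts.maxCost !== undefined && estimatedCost > opts.maxCost) {
      return "skip";
    }
    return "run";
  }
  // Interactive mode: prompt for confirmation above the threshold.
  return estimatedCost > warnAt ? "confirm" : "run";
}
```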
## Convention checks: the best optimization
The single most effective way to reduce API usage is to catch more issues deterministically. Convention checks compiled from your `CLAUDE.md` run in under a second with no API calls --- every issue they catch is one the AI does not need to find.
```bash
npx caliper refresh        # compile CLAUDE.md rules into deterministic checks
npx caliper init --agent   # install the stop hook so checks run after every agent turn
```

Rules that cannot be checked mechanically become conventions for the AI review layers. Nothing is dropped --- the enforcement just moves to the most efficient viable layer.
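To make "deterministic check" concrete, here is a minimal sketch of the kind of rule a convention could compile down to: pure pattern matching over source lines, with no API call anywhere. The rule, function name, and finding shape are all invented for illustration; Caliper's compiled format is not shown in this doc.

```typescript
// Illustrative sketch of a deterministic convention check.
// Hypothetical example rule: "never commit console.log".
interface Finding {
  line: number;
  message: string;
}

function checkNoConsoleLog(source: string): Finding[] {
  const findings: Finding[] = [];
  source.split("\n").forEach((text, i) => {
    // Pure regex match; sub-second, zero tokens.
    if (/\bconsole\.log\(/.test(text)) {
      findings.push({ line: i + 1, message: "Remove console.log before merging" });
    }
  });
  return findings;
}
```

Anything this check flags never reaches the AI review layers, which is exactly why encoding rules this way is the cheapest optimization available.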
## Tips for reducing API usage
- Use convention checks first. The more rules you encode in `CLAUDE.md` and compile with `npx caliper refresh`, the less work the AI has to do.
- Use `--fast` to skip expensive phases. Synthesis, lenses, consolidation, and narrative are valuable for complex PRs but unnecessary for straightforward changes.
- Use `--min-severity` to reduce noise. In CI, `--min-severity recommendation` skips nits and only posts findings that matter, reducing the number of AI-generated comments.
- Set `model: haiku` for lighter pipelines. A single Haiku model for all slots uses the fewest tokens per review while still catching real issues.
- Keep PRs small. API usage scales with diff size. Smaller PRs naturally land in the fast path and use lighter model slots.
- Use `--max-cost` in CI. Prevents runaway API usage on unexpectedly large PRs.