# Cost Optimization
Caliper has three enforcement layers. Only two of them use the Anthropic API.
| Layer | API usage? | Speed |
|---|---|---|
| Convention enforcement (`npx caliper check`) | No (free) | Sub-second |
| Local AI review (`npx caliper review`) | Yes | 30 s–3 min |
| PR review (`npx caliper <pr>`) | Yes | 1–10 min |
Convention checks are pure pattern matching --- no API calls. They are always the fastest way to catch issues.
API usage scales with the number of files, risk level, and which pipeline phases run. The progress display shows a running token count during each review.
## Fast path
Small PRs automatically use a fast path that skips synthesis, lenses, consolidation, and narrative phases. This cuts cost and time significantly.
A PR qualifies when all three thresholds are met:
| Threshold | Default | Config key |
|---|---|---|
| Changed lines | < 100 | `fastPath.maxChangedLines` |
| Files | ≤ 3 | `fastPath.maxFiles` |
| High-risk files | 0 | `fastPath.maxHighRiskFiles` |
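The all-three rule above can be sketched as a single predicate. This is a minimal illustration, assuming a hypothetical `qualifiesForFastPath` helper; the names and shapes here are not Caliper's actual internals.

```typescript
// Illustrative sketch only; field and function names are assumptions.
interface FastPathConfig {
  maxChangedLines: number;  // strict upper bound (< 100 by default)
  maxFiles: number;         // inclusive bound (≤ 3 by default)
  maxHighRiskFiles: number; // inclusive bound (0 by default)
}

interface PrStats {
  changedLines: number;
  files: number;
  highRiskFiles: number;
}

// A PR qualifies only when all three thresholds are met.
function qualifiesForFastPath(
  pr: PrStats,
  cfg: FastPathConfig = { maxChangedLines: 100, maxFiles: 3, maxHighRiskFiles: 0 }
): boolean {
  return (
    pr.changedLines < cfg.maxChangedLines &&
    pr.files <= cfg.maxFiles &&
    pr.highRiskFiles <= cfg.maxHighRiskFiles
  );
}
```

Note that the changed-lines bound is strict (`<`) while the other two are inclusive, matching the defaults table.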
Tune the thresholds in `.caliper/config.yaml`:

```yaml
fastPath:
  maxChangedLines: 200   # relax for repos with frequent small PRs
  maxFiles: 5
  maxHighRiskFiles: 1    # allow one high-risk file on the fast path
```

You can also force the behavior per run:
```bash
npx caliper 42 --fast   # force fast path (skip synthesis/lenses/consolidation/narrative)
npx caliper 42 --full   # force full pipeline even for small PRs
```

## Model selection
Caliper assigns models to pipeline phases based on PR scope. Three semantic slots map to Claude models:
| Slot | Default model | Used for |
|---|---|---|
| `small` | Haiku | Mechanical tasks: crossfile selection, consolidation on small PRs, narrative on small PRs |
| `medium` | Sonnet | Standard review, synthesis, lenses on normal PRs |
| `large` | Opus | Deep reasoning: review, synthesis, and lenses on large PRs |
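Resolving a slot name to a concrete model amounts to layering the user's config over these defaults. The sketch below is an assumption for illustration (the `resolveModel` helper is hypothetical), but it mirrors the two config shapes shown in this section: a single model string for everything, or a per-slot mapping.

```typescript
// Illustrative sketch only; slot names come from the table above,
// but this resolution logic is an assumption, not Caliper's code.
type Slot = "small" | "medium" | "large";

const DEFAULT_MODELS: Record<Slot, string> = {
  small: "haiku",   // mechanical tasks
  medium: "sonnet", // standard review
  large: "opus",    // deep reasoning on large PRs
};

// `model: haiku` (a string) applies one model to every slot;
// `model: { small: ..., ... }` overrides slots individually.
function resolveModel(
  slot: Slot,
  config?: string | Partial<Record<Slot, string>>
): string {
  if (typeof config === "string") return config;
  return config?.[slot] ?? DEFAULT_MODELS[slot];
}
```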
Configure in `.caliper/config.yaml`:
```yaml
# Use one model for everything (lightest API usage):
model: haiku

# Or assign per slot:
model:
  small: haiku
  medium: sonnet
  large: sonnet   # use Sonnet instead of Opus for the large slot
```

The `npx caliper init` wizard offers two presets:
- Light and cheap --- Haiku for most phases, Sonnet only for large PRs. Lighter API usage, good for most codebases.
- Thorough --- Sonnet and Opus. Heavier API usage, better for complex architectural review.
## Cost controls
### `--max-cost` flag (CI mode)
Set a hard cost ceiling. If the estimated cost exceeds this amount, Caliper posts an explanatory comment and exits cleanly without running the review.
```bash
npx caliper 42 --ci --max-cost 2.00
```

### `costWarningThreshold` config
In interactive mode, Caliper prompts for confirmation when the estimated cost exceeds this threshold. Default is $2.
```yaml
# .caliper/config.yaml
costWarningThreshold: 5   # warn above $5 instead of $2
```

Set it high (e.g., 999) to effectively disable the warning.
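Taken together, the two controls gate a review run on its estimated cost before any API work happens. The following is a hedged sketch of that decision logic; the `costGate` function and its return values are assumptions for illustration, not Caliper's API.

```typescript
// Illustrative only: how a hard ceiling (CI) and a warning
// threshold (interactive) might gate a review run.
type Decision = "run" | "skip" | "confirm";

function costGate(
  estimatedCost: number,
  opts: { ci: boolean; maxCost?: number; costWarningThreshold?: number }
): Decision {
  const warnAt = opts.costWarningThreshold ?? 2; // default is $2
  if (opts.ci) {
    // --max-cost: over the ceiling, post a comment and exit cleanly.
    if (opts.maxCost !== undefined && estimatedCost > opts.maxCost) {
      return "skip";
    }
    return "run";
  }
  // Interactive mode: prompt for confirmation above the threshold.
  return estimatedCost > warnAt ? "confirm" : "run";
}
```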
## Convention checks: the best optimization
The single most effective way to reduce API usage is to catch more issues deterministically. Convention checks compiled from your `CLAUDE.md` run in under a second with no API calls --- every issue they catch is one the AI does not need to find.
```bash
npx caliper refresh        # compile CLAUDE.md rules into deterministic checks
npx caliper init --agent   # install the stop hook so checks run after every agent turn
```

Rules that cannot be checked mechanically become conventions for the AI review layers. Nothing is dropped --- the enforcement just moves to the most efficient viable layer.
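To make "deterministic check" concrete, here is a minimal sketch of the kind of rule a convention could compile down to: pure pattern matching over source lines, with no API call anywhere. The rule, function name, and finding shape are all invented for illustration; Caliper's compiled format is not shown in this doc.

```typescript
// Illustrative sketch of a deterministic convention check.
// Hypothetical example rule: "never commit console.log".
interface Finding {
  line: number;
  message: string;
}

function checkNoConsoleLog(source: string): Finding[] {
  const findings: Finding[] = [];
  source.split("\n").forEach((text, i) => {
    // Pure regex match; sub-second, zero tokens.
    if (/\bconsole\.log\(/.test(text)) {
      findings.push({ line: i + 1, message: "Remove console.log before merging" });
    }
  });
  return findings;
}
```

Anything this check flags never reaches the AI review layers, which is exactly why encoding rules this way is the cheapest optimization available.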
## Tips for reducing API usage
- Use convention checks first. The more rules you encode in `CLAUDE.md` and compile with `npx caliper refresh`, the less work the AI has to do.
- Use `--fast` to skip expensive phases. Synthesis, lenses, consolidation, and narrative are valuable for complex PRs but unnecessary for straightforward changes.
- Use `--min-severity` to reduce noise. In CI, `--min-severity recommendation` skips nits and only posts findings that matter, reducing the number of AI-generated comments.
- Set `model: haiku` for lighter pipelines. A single Haiku model for all slots uses the fewest tokens per review while still catching real issues.
- Keep PRs small. API usage scales with diff size. Smaller PRs naturally land in the fast path and use lighter model slots.
- Use `--max-cost` in CI. Prevents runaway API usage on unexpectedly large PRs.