
Cost Optimization

Caliper has three enforcement layers. Only two of them use the Anthropic API.

| Layer | API usage? | Speed |
| --- | --- | --- |
| Convention enforcement (`npx caliper check`) | No — free | Sub-second |
| Local AI review (`npx caliper review`) | Yes | 30 s – 3 min |
| PR review (`npx caliper <pr>`) | Yes | 1 – 10 min |

Convention checks are pure pattern matching --- no API calls. They are always the fastest way to catch issues.

API usage scales with the number of files, risk level, and which pipeline phases run. The progress display shows a running token count during each review.

Fast path

Small PRs automatically use a fast path that skips synthesis, lenses, consolidation, and narrative phases. This cuts cost and time significantly.

A PR qualifies when all three thresholds are met:

| Threshold | Default | Config key |
| --- | --- | --- |
| Changed lines | < 100 | `fastPath.maxChangedLines` |
| Files | ≤ 3 | `fastPath.maxFiles` |
| High-risk files | 0 | `fastPath.maxHighRiskFiles` |
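The qualification rule can be sketched as follows. This is an illustrative model, not Caliper's internal API; the type and function names (`FastPathConfig`, `qualifiesForFastPath`) are invented for the example:

```typescript
// Illustrative sketch of the fast-path check; names are hypothetical, not Caliper's internals.
interface FastPathConfig {
  maxChangedLines: number;  // strict upper bound (< 100 by default)
  maxFiles: number;         // inclusive upper bound (3 by default)
  maxHighRiskFiles: number; // inclusive upper bound (0 by default)
}

interface PrStats {
  changedLines: number;
  files: number;
  highRiskFiles: number;
}

const DEFAULTS: FastPathConfig = {
  maxChangedLines: 100,
  maxFiles: 3,
  maxHighRiskFiles: 0,
};

// A PR takes the fast path only when ALL three thresholds are satisfied.
function qualifiesForFastPath(pr: PrStats, cfg: FastPathConfig = DEFAULTS): boolean {
  return (
    pr.changedLines < cfg.maxChangedLines &&
    pr.files <= cfg.maxFiles &&
    pr.highRiskFiles <= cfg.maxHighRiskFiles
  );
}
```

Note that a single high-risk file disqualifies a PR under the defaults, regardless of how small the diff is.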

Tune the thresholds in .caliper/config.yaml:

```yaml
fastPath:
  maxChangedLines: 200   # relax for repos with frequent small PRs
  maxFiles: 5
  maxHighRiskFiles: 1    # allow one high-risk file on the fast path
```

You can also force the behavior per-run:

```bash
npx caliper 42 --fast    # force fast path (skip synthesis/lenses/consolidation/narrative)
npx caliper 42 --full    # force full pipeline even for small PRs
```

Model selection

Caliper assigns models to pipeline phases based on PR scope. Three semantic slots map to Claude models:

| Slot | Default model | Used for |
| --- | --- | --- |
| `small` | Haiku | Mechanical tasks: cross-file selection, consolidation on small PRs, narrative on small PRs |
| `medium` | Sonnet | Standard review, synthesis, and lenses on normal PRs |
| `large` | Opus | Deep reasoning: review, synthesis, and lenses on large PRs |
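The config shape described below (a single model name, or a per-slot map) resolves roughly like this. The function and type names here are hypothetical, written only to make the resolution order concrete:

```typescript
// Illustrative sketch of slot-to-model resolution; names are hypothetical.
type Slot = "small" | "medium" | "large";

// `model` in config may be one string for every slot, or a per-slot map.
type ModelConfig = string | Partial<Record<Slot, string>>;

const SLOT_DEFAULTS: Record<Slot, string> = {
  small: "haiku",
  medium: "sonnet",
  large: "opus",
};

function resolveModel(config: ModelConfig | undefined, slot: Slot): string {
  if (typeof config === "string") return config;   // one model for everything
  return config?.[slot] ?? SLOT_DEFAULTS[slot];    // per-slot override, else default
}
```

Under this model, an unconfigured slot silently falls back to its default, so a partial map like `{ large: "sonnet" }` still leaves `small` on Haiku and `medium` on Sonnet.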

Configure in .caliper/config.yaml:

```yaml
# Use one model for everything (lightest API usage):
model: haiku

# Or assign per slot:
model:
  small: haiku
  medium: sonnet
  large: sonnet    # use Sonnet instead of Opus for the large slot
```

The npx caliper init wizard offers two presets:

  • Light and cheap --- Haiku for most phases, Sonnet only for large PRs. Lighter API usage, good for most codebases.
  • Thorough --- Sonnet and Opus. Heavier API usage, better for complex architectural review.

Cost controls

--max-cost flag (CI mode)

Set a hard cost ceiling. If the estimated cost exceeds this amount, Caliper posts an explanatory comment and exits cleanly without running the review.

```bash
npx caliper 42 --ci --max-cost 2.00
```

costWarningThreshold config

In interactive mode, Caliper prompts for confirmation when the estimated cost exceeds this threshold. Default is $2.

```yaml
# .caliper/config.yaml
costWarningThreshold: 5   # warn above $5 instead of $2
```

Set it high (e.g., 999) to effectively disable the warning.
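Taken together, the two controls behave roughly as sketched below. This is a model of the behavior described above, not Caliper's source; the names (`costDecision`, the option fields) are invented for the example:

```typescript
// Illustrative sketch of the two cost controls; names are hypothetical.
type Decision = "run" | "prompt" | "abort";

function costDecision(
  estimatedCost: number,
  opts: { ci: boolean; maxCost?: number; warnThreshold?: number }
): Decision {
  if (opts.ci) {
    // --max-cost is a hard ceiling: exceed it and the review is skipped
    // (Caliper posts an explanatory comment and exits cleanly).
    if (opts.maxCost !== undefined && estimatedCost > opts.maxCost) return "abort";
    return "run";
  }
  // Interactive mode: prompt for confirmation above costWarningThreshold ($2 default).
  const threshold = opts.warnThreshold ?? 2;
  return estimatedCost > threshold ? "prompt" : "run";
}
```

The key asymmetry: `--max-cost` never prompts (CI has no one to answer), while `costWarningThreshold` never aborts on its own.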

Convention checks: the best optimization

The single most effective way to reduce API usage is to catch more issues deterministically. Convention checks compiled from your CLAUDE.md run in under a second with no API calls; every issue they catch is one the AI does not need to find.

```bash
npx caliper refresh              # compile CLAUDE.md rules into deterministic checks
npx caliper init --agent         # install the stop hook so checks run after every agent turn
```

Rules that cannot be checked mechanically become conventions for the AI review layers. Nothing is dropped --- the enforcement just moves to the most efficient viable layer.
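As an illustration of that split, consider a few hypothetical CLAUDE.md rules (these are invented examples, not output of `npx caliper refresh`):

```markdown
<!-- Hypothetical CLAUDE.md rules, for illustration only -->
- Never commit `console.log` statements.      <!-- mechanical: can compile to a pattern check -->
- Name test files `*.test.ts`, not `*.spec.ts`. <!-- mechanical: can compile to a pattern check -->
- Keep modules loosely coupled.               <!-- judgment call: stays with the AI review layers -->
```

The first two are pattern-shaped and could be enforced for free at the convention layer; the third requires judgment and would remain a convention for the AI review layers.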

Tips for reducing API usage

  1. Use convention checks first. The more rules you encode in CLAUDE.md and compile with npx caliper refresh, the less work the AI has to do.

  2. Use --fast to skip expensive phases. Synthesis, lenses, consolidation, and narrative are valuable for complex PRs but unnecessary for straightforward changes.

  3. Use --min-severity to reduce noise. In CI, --min-severity recommendation skips nits and only posts findings that matter, reducing the number of AI-generated comments.

  4. Set model: haiku for lighter pipelines. A single Haiku model for all slots uses the fewest tokens per review while still catching real issues.

  5. Keep PRs small. API usage scales with diff size. Smaller PRs naturally land in the fast path and use lighter model slots.

  6. Use --max-cost in CI. Prevents runaway API usage on unexpectedly large PRs.

© 2026 Caliper AI. All rights reserved.