Claude Code Optimization

From $750/month to 12% of quota in 3 days

Same workflow. Same Claude. Just the right settings, cache habits, and a few small swaps that most people never make.

Root Cause 1

Cache Misses

If your prompt cache breaks mid-session, you pay full input price on every turn instead of about 0.1× for cached tokens.

What breaks the cache
  • Adding or removing a tool in the middle of a session
  • Switching models in the middle of a session
What to do
  • Decide your tools and model BEFORE you start the session
  • Never change them mid-session
  • A healthy cache hit rate is 90% or higher
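
To see why the hit rate matters, here is a rough back-of-the-envelope sketch. It assumes cached input tokens are billed at 0.1× and uncached ones at full price, as described above, and it ignores any surcharge for writing to the cache. The function name is made up for illustration.

```python
def effective_input_multiplier(hit_rate: float, cached_price: float = 0.1) -> float:
    """Average price multiplier for input tokens at a given cache hit rate.

    Assumption: cache hits cost `cached_price` x the base input price and
    misses cost 1.0x. Cache-write surcharges are ignored.
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_rate * cached_price + (1.0 - hit_rate) * 1.0


if __name__ == "__main__":
    for rate in (0.0, 0.5, 0.9):
        multiplier = effective_input_multiplier(rate)
        print(f"hit rate {rate:.0%}: {multiplier:.2f}x base input price")
```

Under these assumptions, a 90% hit rate works out to roughly 0.19× the base input price, against 1.0× when the cache is broken on every turn.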
Root Cause 2

Context Bloat

The longer your session runs, the more Claude has to remember, and that costs tokens. Opus 4.7 defaults to a 1M context window, which is overkill.

Settings to add
CLAUDE_CODE_DISABLE_1M_CONTEXT = 1
CLAUDE_AUTOCOMPACT_PCT_OVERRIDE = 80
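
One way to apply these is through the `env` block of a Claude Code settings file. The sketch below merges the two variables into an existing settings dictionary without touching other keys. The file location (`~/.claude/settings.json`) and the `env` key are assumptions about your setup, and `with_env_overrides` and `update_settings_file` are hypothetical helpers. Exporting the variables in your shell before launching should work just as well.

```python
import json
from pathlib import Path

# Variable names and values are taken from the settings listed above.
OVERRIDES = {
    "CLAUDE_CODE_DISABLE_1M_CONTEXT": "1",
    "CLAUDE_AUTOCOMPACT_PCT_OVERRIDE": "80",
}


def with_env_overrides(settings: dict, overrides: dict = OVERRIDES) -> dict:
    """Return a copy of `settings` with `overrides` merged into its "env" block."""
    merged = dict(settings)
    merged["env"] = {**settings.get("env", {}), **overrides}
    return merged


def update_settings_file(path: Path) -> None:
    """Read, merge, and rewrite a JSON settings file (path is an assumption)."""
    current = json.loads(path.read_text()) if path.exists() else {}
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(with_env_overrides(current), indent=2) + "\n")


if __name__ == "__main__":
    # Dry run: show the merged result without writing anything to disk.
    print(json.dumps(with_env_overrides({"env": {"EXISTING": "x"}}), indent=2))
```

To write the change for real, call `update_settings_file(Path.home() / ".claude" / "settings.json")`, after checking that this is where your settings actually live.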

5 habits to keep context clean

/compact: at 50% full or after every task (don't wait)
/clear: between unrelated tasks (fresh session, fresh cache)
/rewind: if a turn went wrong (cheaper than working around bad context)
Subagents: use for bulk work (keeps your main session lean)
@filename: tag files directly so Claude doesn't search for them

Subagent model guide

Haiku: boring mechanical work
Sonnet: research, code exploration
Opus: planning, complex decisions only
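
The guide above can be written down as a small lookup table, which is handy if you script your subagent setup. This is only a sketch: the task labels and the `pick_subagent_model` helper are hypothetical, and the model names are the tiers from the guide, not exact model identifiers.

```python
# Task labels (keys) are made up for illustration; the tiers come from the guide.
SUBAGENT_MODEL = {
    "mechanical": "haiku",        # boring mechanical work
    "research": "sonnet",         # research
    "code_exploration": "sonnet", # code exploration
    "planning": "opus",           # planning
    "complex_decision": "opus",   # complex decisions only
}


def pick_subagent_model(task_kind: str) -> str:
    """Return the model tier suggested by the guide for a given task label."""
    try:
        return SUBAGENT_MODEL[task_kind]
    except KeyError:
        known = ", ".join(sorted(SUBAGENT_MODEL))
        raise ValueError(f"unknown task kind {task_kind!r}; expected one of: {known}")


if __name__ == "__main__":
    for kind in ("mechanical", "code_exploration", "planning"):
        print(f"{kind}: {pick_subagent_model(kind)}")
```

Failing loudly on an unknown label is deliberate here: silently falling back to the most expensive tier would hide exactly the kind of waste this section is about.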
Root Cause 3

Wrong Model or Effort Level

Most people use Opus-level reasoning on tasks that only need Haiku. Default reasoning uses ~2× more tokens than medium.

Effort levels to set per prompt

/effort low: quick fixes, simple tasks
/effort medium: most everyday prompts (big savings)
/effort high: complex reasoning
/effort xhigh: default for agentic coding
/effort max: almost never worth it
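
To get a feel for the savings, the sketch below estimates reasoning tokens relative to running everything at the default effort. It relies only on the ~2× figure quoted above, and it assumes every prompt would otherwise use about the same number of reasoning tokens, which is a simplification. The function name is made up for illustration.

```python
def relative_reasoning_tokens(share_on_medium: float, default_vs_medium: float = 2.0) -> float:
    """Reasoning tokens relative to an all-default baseline (1.0 = no change).

    Assumptions: the default effort uses `default_vs_medium` times the tokens
    of medium, and all prompts are otherwise equally expensive.
    """
    if not 0.0 <= share_on_medium <= 1.0:
        raise ValueError("share_on_medium must be between 0 and 1")
    if default_vs_medium <= 0:
        raise ValueError("default_vs_medium must be positive")
    stays_on_default = 1.0 - share_on_medium
    moved_to_medium = share_on_medium / default_vs_medium
    return stays_on_default + moved_to_medium


if __name__ == "__main__":
    for share in (0.0, 0.5, 0.7, 1.0):
        print(f"{share:.0%} of prompts on medium: {relative_reasoning_tokens(share):.2f}x baseline")
```

With these assumptions, moving 70% of prompts to medium leaves about 0.65× of the baseline reasoning tokens.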

Model routing strategy

  • Start a Sonnet session when work is simple — cheaper
  • Start an Opus session when planning is involved — delegate the rest to Sonnet/Haiku
  • If you keep hitting limits, route through OpenRouter — same Claude Code interface, ~1/12th the cost
Root Cause 4

Wrong Input Format

Some file types cost way more tokens than necessary by default.

Don't use: Screenshots for web pages
Use instead: agent-browser (~90% fewer tokens)

Don't use: Claude's built-in PDF reader
Use instead: pdftotext (avoids loading PDFs as images)

Don't use: Re-reading whole repo each task
Use instead: code-review-graph (6.8× fewer tokens on reviews, up to 49× on daily tasks)
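
As an example of the pdftotext swap, the sketch below converts a PDF to plain text before it ever reaches Claude. It assumes the `pdftotext` binary (from poppler-utils) is installed and on your PATH. The helper names and the `report.pdf` path are placeholders.

```python
import subprocess


def pdftotext_command(pdf_path: str, keep_layout: bool = True) -> list[str]:
    """Build the pdftotext command line; "-" as output sends text to stdout."""
    cmd = ["pdftotext"]
    if keep_layout:
        cmd.append("-layout")  # try to preserve columns and tables
    cmd += [pdf_path, "-"]
    return cmd


def pdf_to_text(pdf_path: str) -> str:
    """Run pdftotext and return the extracted text (raises if the tool fails)."""
    result = subprocess.run(
        pdftotext_command(pdf_path),
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout


if __name__ == "__main__":
    # Placeholder path: point this at a real PDF before running.
    print(" ".join(pdftotext_command("report.pdf")))
```

Save the output to a `.txt` file and tag that with @filename instead of the original PDF.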
Root Cause 5

Not Watching Your Usage

You can't fix what you can't see. Three tools to track it.

phuryn/claude-usage: Historical view, breakdown by day/week/session (Pro/Max/Team)
Gronsten/claude-usage-monitor: Real-time, shows your current 5-hour window live
platform.claude.com/usage/cache: Anthropic's cache dashboard, API users only
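
If you are on the API, you can also compute a hit rate yourself from the usage numbers returned with each response. The sketch below assumes a usage dictionary with the `input_tokens`, `cache_creation_input_tokens`, and `cache_read_input_tokens` fields. The definition of "hit rate" used here (cached reads over all input tokens) is one reasonable choice, not necessarily the one any dashboard uses, and the sample numbers are invented.

```python
def cache_hit_rate(usage: dict) -> float:
    """Share of input tokens that were served from the prompt cache.

    Assumed definition: cache reads / (cache reads + cache writes + uncached input).
    """
    read = usage.get("cache_read_input_tokens", 0)
    written = usage.get("cache_creation_input_tokens", 0)
    uncached = usage.get("input_tokens", 0)
    total = read + written + uncached
    return read / total if total else 0.0


if __name__ == "__main__":
    # Invented numbers, for illustration only.
    sample = {
        "input_tokens": 500,
        "cache_creation_input_tokens": 500,
        "cache_read_input_tokens": 9000,
    }
    print(f"cache hit rate: {cache_hit_rate(sample):.0%}")
```

A value that drops sharply partway through a session is a hint that something, such as a tool or model change, invalidated the cache.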
The One-Line Summary

Lock your tools and model before you start. Compact early and often. Use cheaper models for simple tasks. Use the right input format. Track your usage. That's it.