Same workflow. Same Claude. Just the right settings, cache habits, and a few small swaps that most people never make.
If your prompt cache breaks mid-session, you pay full price every turn instead of 0.1×.
The longer your session runs, the more Claude has to remember — and that costs tokens. Opus 4.7 defaults to a 1M context window which is overkill.
CLAUDE_CODE_DISABLE_1M_CONTEXT = 1 CLAUDE_AUTOCOMPACT_PCT_OVERRIDE = 80
/compactat 50% full or after every task — don't wait/clearbetween unrelated tasks — fresh session, fresh cache/rewindif a turn went wrong — cheaper than working around bad contextSubagentsuse for bulk work — keeps your main session lean@filenametag files directly so Claude doesn't search for themMost people use Opus-level reasoning on tasks that only need Haiku. Default reasoning uses ~2× more tokens than medium.
/effort lowquick fixes, simple tasks/effort mediummost everyday prompts (big savings)/effort highcomplex reasoning/effort xhighdefault for agentic coding/effort maxalmost never worth itSome file types cost way more tokens than necessary by default.
You can't fix what you can't see. Three tools to track it.
phuryn/claude-usageHistorical view, breakdown by day/week/session (Pro/Max/Team)Gronsten/claude-usage-monitorReal-time, shows your current 5-hour window liveplatform.claude.com/usage/cacheAnthropic's cache dashboard — API users onlyLock your tools and model before you start. Compact early and often. Use cheaper models for simple tasks. Use the right input format. Track your usage. That's it.