Claude Opus 4.8 released with fast mode option
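The new Opus 4.8 model is available under the model ID claude-opus-4.8. The release also adds an optional fast mode (-o fast 1) for organizations with that feature enabled on their account, and changes the default max_tokens from 8,192 to each model's maximum output.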
June 1, 2026
Why it matters
The default max_tokens now matches each model's maximum output instead of capping responses at 8,192 tokens, which removes a common cause of truncated output when no limit is set. Fast mode targets latency-sensitive workloads, though the cited release notes do not quantify the speedup or any cost difference.
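The change to the default can be sketched as a small resolution rule. This is an illustration of the behavior described above, not the tool's actual code: the function name resolve_max_tokens, the legacy flag, and the 64,000 figure are all hypothetical, and only the 8,192 value comes from the release notes.

```python
from typing import Optional

# The old fixed default named in the release notes.
LEGACY_DEFAULT_MAX_TOKENS = 8192


def resolve_max_tokens(
    explicit: Optional[int],
    model_max_output: int,
    legacy: bool = False,
) -> int:
    """Return the max_tokens value a request would carry.

    Hypothetical helper: an explicit value always wins; otherwise the
    default is 8,192 (old behavior) or the model's maximum (new behavior).
    """
    if explicit is not None:
        return explicit
    if legacy:
        return min(LEGACY_DEFAULT_MAX_TOKENS, model_max_output)
    return model_max_output


# Illustrative number only, not a published limit for any model.
HYPOTHETICAL_MODEL_MAX = 64_000

old_default = resolve_max_tokens(None, HYPOTHETICAL_MODEL_MAX, legacy=True)
new_default = resolve_max_tokens(None, HYPOTHETICAL_MODEL_MAX)
pinned = resolve_max_tokens(2_000, HYPOTHETICAL_MODEL_MAX)
print(old_default, new_default, pinned)
```

Under this sketch, callers that already pin max_tokens see no difference; only requests that relied on the default get the larger budget.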
Implementation verdict
For existing Opus deployments, adoption is a model ID swap to claude-opus-4.8. The longer default output needs no code changes, provided you do not set max_tokens explicitly. Fast mode requires account-level feature enablement, so check with your Anthropic contact. Worth testing now if you have hit the 8,192-token default or need lower latency.
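A minimal sketch of the swap, assuming the -o fast 1 option in the sources is the key/value option syntax of the llm command-line tool (the brief does not name the tool). The helper build_command is hypothetical; it only assembles the argument list and does not run anything.

```python
import shlex
from typing import List


def build_command(
    prompt: str,
    model: str = "claude-opus-4.8",
    fast: bool = False,
) -> List[str]:
    """Assemble an argv list for a CLI call (hypothetical helper).

    Assumes an `llm`-style CLI where -m selects the model and
    -o KEY VALUE sets a model option.
    """
    argv = ["llm", "-m", model]
    if fast:
        # Only takes effect for organizations with fast mode enabled.
        argv += ["-o", "fast", "1"]
    argv.append(prompt)
    return argv


# Print the shell-quoted command rather than executing it.
print(shlex.join(build_command("Summarize this changelog", fast=True)))
```

Keeping fast mode behind a flag makes it easy to compare latency and output with and without it before turning it on by default.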
Sources
- 1. New model: Claude Opus 4.8 (claude-opus-4.8)
- 2. New -o fast 1 option for fast mode, for organizations with that feature enabled on their account
- 3. Default max_tokens for each model now defaults to that model's maximum output rather than 8,192
Dev Signal