Claude Opus 4.8 released with fast mode option

The new Opus 4.8 model is available as claude-opus-4.8. The release also adds an optional -o fast 1 mode for organizations with the feature enabled on their account, and removes the 8,192-token default ceiling on max_tokens.

June 1, 2026

Summary

Default max_tokens now matches each model's actual limit instead of artificially capping output at 8,192, eliminating a common gotcha in token budgeting. Fast mode provides a speed/cost tradeoff for latency-sensitive workloads.
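As a rough illustration of the default change described above, the sketch below contrasts the old fixed default with the new per-model default. The function resolve_max_tokens, the MODEL_MAX_OUTPUT table, and its model names and limits are all hypothetical placeholders, not published values or a real API. Clamping an explicit request to the model limit is likewise an assumption made for the example.

```python
from typing import Optional

# Hypothetical per-model output limits, for illustration only.
# The names and numbers are placeholders, not published values.
MODEL_MAX_OUTPUT = {
    "example-small": 16_000,
    "example-large": 64_000,
}

# The old fixed default named in the release notes.
LEGACY_DEFAULT = 8_192


def resolve_max_tokens(
    model: str,
    requested: Optional[int] = None,
    legacy: bool = False,
) -> int:
    """Pick the max_tokens value to send for a request.

    An explicit request wins, clamped to the model's limit (an assumption
    of this sketch). With no explicit value, the legacy behaviour falls
    back to 8,192 and the new behaviour uses the model's own maximum.
    """
    limit = MODEL_MAX_OUTPUT[model]
    if requested is not None:
        return min(requested, limit)
    if legacy:
        return min(LEGACY_DEFAULT, limit)
    return limit


if __name__ == "__main__":
    print(resolve_max_tokens("example-large", legacy=True))
    print(resolve_max_tokens("example-large"))
```

One practical consequence: callers that never set max_tokens and implicitly relied on the old cap may want to pass an explicit value once the default follows the model limit.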

Implementation verdict

Drop-in model ID replacement (claude-opus-4.8) for existing Opus deployments. No code changes are needed to get the longer default output limit. Fast mode requires account-level feature enablement, so check with your Anthropic contact. Worth testing right away if you have hit output token limits or run latency-sensitive workloads.

Sources

  1. New model: Claude Opus 4.8 (claude-opus-4.8)
  2. New -o fast 1 option for fast mode, for organizations with that feature enabled on their account
  3. Default max_tokens for each model now defaults to that model's maximum output rather than 8,192
