Gemini 3.5 Flash launches GA at 3x prior cost
Google ships Gemini 3.5 Flash into production across consumer and API surfaces with a 1M-token context window, but it costs 3x as much as 3 Flash Preview and 6x as much as 3.1 Flash-Lite.
May 29, 2026
Why it matters
The price increase directly raises cost per inference for Gemini API users. Running the cited benchmark on 3.5 Flash (high) cost $1,551.60, about 1.7x the $892.28 it cost on Gemini 3.1 Pro Preview, so cost-sensitive workloads need to re-evaluate model selection.
Implementation verdict
Replaces Gemini 3 Flash Preview and 3.1 Flash-Lite for production inference. Requires budget reassessment before migration. Worth testing for capabilities, but skip it if your current Flash variant meets your latency/quality bar: the price premium is steep and mirrors industry-wide cost creep.
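As a rough aid for that budget reassessment, here is a minimal Python sketch that projects monthly spend from token volume. Only the 3.5 Flash rates ($1.50/million input, $9/million output) come from the sources below. The prior-model rates are derived from the stated 3x and 6x multiples, on the assumption that each multiple applies equally to input and output. The names `Pricing`, `scaled_down`, and `monthly_cost` are hypothetical, and the workload figures are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    input_per_million: float   # USD per 1M input tokens
    output_per_million: float  # USD per 1M output tokens


def scaled_down(pricing: Pricing, multiple: float) -> Pricing:
    """Derive a cheaper model's rates from a stated price multiple.

    Assumption: the multiple applies uniformly to input and output rates.
    """
    return Pricing(
        pricing.input_per_million / multiple,
        pricing.output_per_million / multiple,
    )


def monthly_cost(pricing: Pricing, input_tokens: int, output_tokens: int) -> float:
    """Projected USD cost for a month's token volume."""
    return (
        input_tokens / 1_000_000 * pricing.input_per_million
        + output_tokens / 1_000_000 * pricing.output_per_million
    )


# Rates quoted in the brief's sources.
GEMINI_35_FLASH = Pricing(input_per_million=1.50, output_per_million=9.00)

# Derived from the "3x" and "6x" claims, not quoted directly (assumption).
GEMINI_3_FLASH_PREVIEW = scaled_down(GEMINI_35_FLASH, 3)
GEMINI_31_FLASH_LITE = scaled_down(GEMINI_35_FLASH, 6)

if __name__ == "__main__":
    # Hypothetical workload: 2B input tokens and 200M output tokens per month.
    input_tokens, output_tokens = 2_000_000_000, 200_000_000
    for name, pricing in [
        ("3.5 Flash", GEMINI_35_FLASH),
        ("3 Flash Preview (derived)", GEMINI_3_FLASH_PREVIEW),
        ("3.1 Flash-Lite (derived)", GEMINI_31_FLASH_LITE),
    ]:
        cost = monthly_cost(pricing, input_tokens, output_tokens)
        print(f"{name}: ${cost:,.2f}/month")
```

Swap in your own token counts and confirm current rates on the official pricing page before relying on the output. Under the uniform-multiple assumption the cost ratio between models stays at 3x or 6x regardless of the input/output mix.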
Sources
- 1. Gemini 3.5 Flash is available today to billions of people globally
- 2. 1,048,576 input tokens and 65,536 maximum output tokens
- 3. The new 3.5 Flash is 3x the price of 3 Flash Preview and 6x the price of 3.1 Flash-Lite
- 4. At $1.50/million input and $9/million output
- 5. Running the benchmark for 3.5 Flash (high) cost significantly more than 3.1 Pro Preview
Dev Signal