ElevenLabs Scribe V2, Gemini 3 Flash, and AssemblyAI Universal 3-Pro handle bilingual switching best; transcription errors propagate directly into downstream task failure, measurable via Answer Error Rate.
June 10, 2026
Summary
Enterprise voice agents serving bilingual customers fail silently on code-switched input: tickets get misrouted and policy questions get wrong answers. Benchmark data lets you pick ASR systems that preserve semantic meaning, not just character accuracy.
Implementation verdict
Replaces guesswork about which ASR handles code-switching. Requires testing your own language pairs against this benchmark, which covers Spanish-English, French-English, Canadian French-English, and German-English. Ready now: use AU-Harness to evaluate models. Scribe V2 and Gemini 3 Flash are your safest bets; Whisper defaults to translation, not transcription, on code-switched audio.
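To make "semantic meaning, not just character accuracy" concrete, here is a minimal Python sketch that scores the same transcripts two ways: word error rate against a reference, and an answer-level error rate. It does not use AU-Harness or any vendor API. The answer error rate here is an assumed, simplified definition (the share of questions a downstream step answers wrongly when fed the ASR transcript) and may differ from the benchmark's own formula. `Sample`, `toy_router`, and the example utterances are hypothetical stand-ins for your own data and agent.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class Sample:
    reference: str        # human transcript of the code-switched audio
    hypothesis: str       # transcript produced by the ASR system under test
    question: str         # downstream task prompt
    expected_answer: str  # answer a correct transcript should lead to


def normalize(text: str) -> List[str]:
    """Lowercase and split into word tokens (Unicode-aware)."""
    return re.findall(r"\w+", text.lower())


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Word-level Levenshtein distance, computed row by row."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            curr.append(min(prev[j] + 1,              # deletion
                            curr[j - 1] + 1,          # insertion
                            prev[j - 1] + (r != h)))  # substitution or match
        prev = curr
    return prev[-1]


def word_error_rate(reference: str, hypothesis: str) -> float:
    ref, hyp = normalize(reference), normalize(hypothesis)
    if not ref:
        raise ValueError("reference transcript is empty")
    return edit_distance(ref, hyp) / len(ref)


def answer_error_rate(samples: Sequence[Sample],
                      answer_fn: Callable[[str, str], str]) -> float:
    """Assumed definition: fraction of samples where the downstream answer,
    computed from the ASR hypothesis, differs from the expected answer."""
    if not samples:
        raise ValueError("no samples to score")
    wrong = sum(
        normalize(answer_fn(s.hypothesis, s.question)) != normalize(s.expected_answer)
        for s in samples
    )
    return wrong / len(samples)


def toy_router(transcript: str, question: str) -> str:
    """Hypothetical downstream step: routes on a Spanish keyword.
    The question is ignored here; a real agent would use it."""
    return "billing" if "factura" in normalize(transcript) else "general"


if __name__ == "__main__":
    samples = [
        # ASR kept the code-switch: the routing keyword survives.
        Sample("necesito ayuda con mi factura please",
               "necesito ayuda con mi factura please",
               "Which queue?", "billing"),
        # ASR translated instead of transcribing: the keyword is lost.
        Sample("necesito ayuda con mi factura please",
               "i need help with my bill please",
               "Which queue?", "billing"),
    ]
    for s in samples:
        print(f"WER: {word_error_rate(s.reference, s.hypothesis):.2f}")
    print(f"Answer error rate: {answer_error_rate(samples, toy_router):.2f}")
```

Swapping `toy_router` for your real routing or question-answering step, and the two samples for recordings in your own language pairs, gives a rough first check on whether a candidate ASR system's errors actually reach the task.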
Sources
Dev Signal