Quantization hides bias emergence below perplexity thresholds
Standard metrics miss fairness degradation in quantized models: 3-bit quantization causes 6-21% of previously unbiased items to develop stereotypical behaviors while perplexity barely shifts.
May 19, 2026
Summary
If you're deploying quantized models to production, aggregate metrics won't catch bias emergence. You need item-level fairness audits before compression, not after, or you'll ship models that silently amplify stereotypes while passing quality gates.
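A minimal sketch of what an item-level audit could look like, assuming the full-precision and quantized models answer the same BBQ-style ambiguous items and each answer has already been labeled "stereotyped", "anti_stereotyped", or "unknown". The label names, function names, and toy data are illustrative assumptions, not the cited study's code or results.

```python
from typing import Sequence

STEREOTYPED = "stereotyped"
UNKNOWN = "unknown"


def bias_emergence_rate(baseline: Sequence[str], quantized: Sequence[str]) -> float:
    """Fraction of items that were not stereotyped at baseline
    but are stereotyped after quantization."""
    if len(baseline) != len(quantized):
        raise ValueError("answer lists must be aligned item by item")
    # Only items the baseline model answered without the stereotype can "emerge".
    unbiased = [i for i, answer in enumerate(baseline) if answer != STEREOTYPED]
    if not unbiased:
        return 0.0
    flipped = sum(1 for i in unbiased if quantized[i] == STEREOTYPED)
    return flipped / len(unbiased)


def unknown_rate(answers: Sequence[str]) -> float:
    """Share of items where the model picked the 'unknown' option."""
    return sum(answer == UNKNOWN for answer in answers) / len(answers)


# Toy data (made up): five items answered by each model.
baseline = ["unknown", "unknown", "stereotyped", "unknown", "anti_stereotyped"]
quantized = ["unknown", "stereotyped", "stereotyped", "anti_stereotyped", "anti_stereotyped"]

emergence = bias_emergence_rate(baseline, quantized)
unknown_drop = unknown_rate(baseline) - unknown_rate(quantized)
print(f"bias emergence: {emergence:.1%}, drop in 'unknown' rate: {unknown_drop:.1%}")
```

The comparison is per item on purpose: an average bias score can stay flat while individual items flip in opposite directions, which is the failure mode aggregate metrics hide.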
Implementation verdict
This doesn't replace existing quantization pipelines yet; it replaces your confidence in standard eval metrics. Acting on it requires adding fairness benchmarks (BBQ-style) to your quantization testing matrix. Worth implementing now if you deploy 4-bit or lower to any inference service touching user-facing classification or generation.
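One way to wire this into a testing matrix is a release gate that checks a fairness number next to perplexity for every bit-width. The sketch below is hypothetical throughout: the `QuantResult` fields, the thresholds, and the sample numbers are placeholders chosen for illustration, not measured values or recommendations from the sources.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class QuantResult:
    bits: int
    perplexity: float
    emergence_rate: float  # share of previously unbiased items now stereotyped


def gate(baseline_ppl: float, result: QuantResult,
         max_ppl_increase: float = 0.05,   # assumed threshold
         max_emergence: float = 0.01) -> List[str]:  # assumed threshold
    """Return the names of failed checks; an empty list means the config passes."""
    failures = []
    ppl_increase = (result.perplexity - baseline_ppl) / baseline_ppl
    if ppl_increase > max_ppl_increase:
        failures.append("perplexity")
    if result.emergence_rate > max_emergence:
        failures.append("bias_emergence")
    return failures


# Placeholder numbers for illustration only.
baseline_ppl = 10.0
matrix = [
    QuantResult(bits=8, perplexity=10.04, emergence_rate=0.002),
    QuantResult(bits=4, perplexity=10.25, emergence_rate=0.04),
    QuantResult(bits=3, perplexity=11.20, emergence_rate=0.15),
]

report: Dict[int, List[str]] = {r.bits: gate(baseline_ppl, r) for r in matrix}
for bits, failures in report.items():
    print(f"{bits}-bit: {'PASS' if not failures else 'FAIL ' + ', '.join(failures)}")
```

In this toy matrix the 4-bit config clears the perplexity check and fails only on bias emergence, which is the case a perplexity-only gate would let through.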
Sources
1. 3-bit quantization causes 6-21% of previously unbiased items to develop new stereotypical behaviors
2. models' willingness to select "unknown" answers declines by 17.4%
3. perplexity increases by less than 0.5% at 8-bit and under 3% at 4-bit across all three models, yet 2.5-5.6% of items already develop new biases at 4-bit
4. aggregate evaluation metrics systematically miss fairness-critical degradation
Dev Signal