Quantization hides bias emergence below perplexity thresholds

Standard metrics miss fairness degradation in quantized models: 3-bit quantization causes 6-21% of previously unbiased items to develop stereotypical behaviors while perplexity barely shifts.

May 19, 2026

Summary

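Across three models, perplexity increases by less than 0.5% at 8-bit and under 3% at 4-bit, yet 2.5-5.6% of items already develop new biases at 4-bit. At 3-bit, 6-21% of previously unbiased items develop new stereotypical behaviors. Models' willingness to select "unknown" answers also declines by 17.4%. Aggregate evaluation metrics systematically miss this fairness-critical degradation.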

Why it matters

If you're deploying quantized models to production, aggregate metrics won't catch bias emergence. You need item-level fairness audits before compression, not after, or you'll ship models that silently amplify stereotypes while passing quality gates.

Implementation verdict

This doesn't replace existing quantization pipelines yet; it replaces your confidence in standard eval metrics. It requires adding fairness benchmarks (BBQ-style) to your quantization testing matrix. Worth implementing now if you deploy 4-bit or lower to any inference service touching user-facing classification or generation.
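
A minimal sketch of what an item-level check could look like, assuming you already have per-item answers from the full-precision and quantized models on a BBQ-style benchmark. The answer labels ("stereotyped", "anti_stereotyped", "unknown"), the function names, the definition of "unbiased" as "not stereotyped at baseline", and the 2% gate threshold are all illustrative assumptions, not taken from the cited work.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ItemResult:
    item_id: str
    # Hypothetical label set: "stereotyped", "anti_stereotyped", or "unknown".
    answer: str


def emergent_bias_rate(baseline, quantized):
    """Share of items that were not stereotyped at baseline but are after quantization."""
    quant_by_id = {r.item_id: r.answer for r in quantized}
    # Assumption: an item counts as "unbiased" if the baseline answer is not stereotyped.
    unbiased = [
        r for r in baseline
        if r.answer != "stereotyped" and r.item_id in quant_by_id
    ]
    if not unbiased:
        return 0.0
    flipped = sum(1 for r in unbiased if quant_by_id[r.item_id] == "stereotyped")
    return flipped / len(unbiased)


def unknown_rate(results):
    """Share of items where the model selected the "unknown" answer."""
    if not results:
        return 0.0
    return sum(1 for r in results if r.answer == "unknown") / len(results)


def passes_fairness_gate(baseline, quantized, max_emergent=0.02):
    """Illustrative release gate; the 2% default is an arbitrary placeholder."""
    return emergent_bias_rate(baseline, quantized) <= max_emergent
```

Run this alongside the perplexity check rather than instead of it: the two can disagree, and that disagreement is the case the brief is warning about. Comparing `unknown_rate` before and after compression gives a second, cheaper signal on the same data.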

Sources

  1. 3-bit quantization causes 6-21% of previously unbiased items to develop new stereotypical behaviors
  2. Models' willingness to select "unknown" answers declines by 17.4%
  3. Perplexity increases by less than 0.5% at 8-bit and under 3% at 4-bit across all three models, yet 2.5-5.6% of items already develop new biases at 4-bit
  4. Aggregate evaluation metrics systematically miss fairness-critical degradation

Dev Signal
