slm-inference structured-output constraint-tax json-validation sub-3b-models

Structured output constraints degrade small model accuracy

Hard schema constraints on sub-3B models raise schema validity from 61.5% to 100.0% but cut answer accuracy from 19.7% to 11.0%. The cost is semantic, not structural.

July 1, 2026

Summary

If you deploy small language models (SLMs) for JSON or tool-call outputs, it is unsafe to assume that schema enforcement improves reliability. Measure executable accuracy and wrong-valid-schema rate separately from schema validity, not schema validity alone.
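A minimal sketch of reporting these rates separately, in Python. The record fields and the `summarize` helper are hypothetical names, not from the cited work. Two definitions are assumptions: wrong-valid-schema rate is taken over all outputs (schema-valid but wrong answer), and executable accuracy means the output was schema-valid and the downstream call produced the expected result.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class EvalRecord:
    """One model output, already judged by your own checkers (hypothetical shape)."""
    schema_valid: bool    # output parsed and matched the target schema
    answer_correct: bool  # extracted answer matched the reference
    executed_ok: bool     # downstream call ran and gave the expected result (assumed definition)


def summarize(records: List[EvalRecord]) -> Dict[str, float]:
    """Return the four rates as fractions of all outputs."""
    n = len(records)
    if n == 0:
        raise ValueError("no records to summarize")

    def rate(predicate) -> float:
        return sum(1 for r in records if predicate(r)) / n

    return {
        "schema_validity": rate(lambda r: r.schema_valid),
        "answer_accuracy": rate(lambda r: r.answer_correct),
        # Assumption: an output only counts as executable if it is also schema-valid.
        "executable_accuracy": rate(lambda r: r.schema_valid and r.executed_ok),
        # Assumption: denominator is all outputs, not only the schema-valid ones.
        "wrong_valid_schema_rate": rate(lambda r: r.schema_valid and not r.answer_correct),
    }
```

Keeping the four numbers in one report makes the failure mode visible: schema validity can rise while the wrong-valid-schema rate rises with it.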

Why it matters

Implementation verdict

This finding replaces no existing tool; it exposes a measurement gap in schema-only validation. Closing that gap requires tracking four metrics independently: schema validity, answer accuracy, executable accuracy, and wrong-valid-schema rate. Prefer delayed constraint packaging (reason free, constrain late) over hard constrained decoding. It is worth testing on your SLM pipeline now, before a production rollout.
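One way to read "reason free, constrain late" is as a two-pass call: the model reasons without any schema, then a second call packages only the final answer as JSON. The sketch below is an interpretation under that assumption, not the cited method. `generate_free` and `generate_constrained` are hypothetical callables standing in for your own inference client, and the prompt wording is illustrative.

```python
import json
from typing import Callable, Dict, Set


def reason_free_constrain_late(
    question: str,
    generate_free: Callable[[str], str],         # hypothetical: unconstrained decoding
    generate_constrained: Callable[[str], str],  # hypothetical: schema-constrained decoding
    required_keys: Set[str],
) -> Dict:
    # Pass 1: let the model reason in free text, with no schema applied.
    reasoning = generate_free(f"Question: {question}\nThink step by step.")

    # Pass 2: apply the constraint only to packaging the final answer.
    packaging_prompt = (
        f"Question: {question}\n"
        f"Reasoning: {reasoning}\n"
        "Return only a JSON object with the final answer."
    )
    raw = generate_constrained(packaging_prompt)

    # Minimal structural check; a full JSON Schema validator could replace it.
    obj = json.loads(raw)
    if not isinstance(obj, dict):
        raise ValueError("expected a JSON object")
    missing = required_keys - obj.keys()
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    return obj
```

The structural check here only confirms shape. Whether the packaged answer is correct still has to be measured separately.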

Sources

1. hard answer-only schema decoding raises schema validity from 61.5% to 100.0%, but lowers answer accuracy from 19.7% to 11.0% and increases wrong-valid-schema outputs from 49.5% to 88.9%
2. reason free, constrain late
3. production systems should report schema validity, answer accuracy, executable accuracy, and wrong-valid-schema rate separately

Dev Signal
