quantization local-inference image-generation mobile-ai open-weights

Bonsai Image 4B runs diffusion on iPhone

Summary

PrismML released 1-bit and ternary quantized image models (0.93–1.21 GB) that retain up to 95% of the full-precision model's image-generation quality while running locally on phones and M-series Macs.

Why it matters

Running generation on the device eliminates round-trip latency for iterative image workflows and removes per-generation serving costs, which shifts viable product patterns toward on-device, privacy-native image generation. Developers can build generation directly into the app UX instead of rationing cloud calls.

Implementation verdict

Can replace cloud-dependent image generation pipelines for iterative use cases. Requires an iOS/macOS deployment pipeline and integration of the quantized model. Ready now: open weights under Apache 2.0, the Bonsai Studio app available for testing, and a GitHub repo. Start from the iPhone 17 Pro Max baseline (about 9.4 seconds per 512×512 image), and verify quality on your own prompt distribution before shipping, since the 95% quality figure is stated as an upper bound.
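One way to make "verify quality on your prompt distribution" concrete is to score the same prompts with the full-precision and quantized models and look at how much of the score survives. The sketch below is a minimal illustration, not PrismML's evaluation method: the function name `retention_report` is hypothetical, the scores are placeholders rather than measurements, and it assumes you already have some per-prompt quality metric where higher is better and values are positive. The 0.95 threshold simply mirrors the "up to 95%" figure.

```python
from statistics import mean


def retention_report(full_scores, quant_scores, threshold=0.95):
    """Compare per-prompt quality scores (hypothetical helper).

    full_scores  -- scores for the full-precision model, one per prompt
    quant_scores -- scores for the quantized model, same prompt order
    threshold    -- minimum acceptable quantized/full ratio (assumption)
    """
    if not full_scores or len(full_scores) != len(quant_scores):
        raise ValueError("need two non-empty score lists of equal length")
    if any(f <= 0 for f in full_scores):
        raise ValueError("full-precision scores must be positive")

    # Per-prompt fraction of quality retained by the quantized model.
    ratios = [q / f for f, q in zip(full_scores, quant_scores)]
    # Indices of prompts whose retention falls below the threshold.
    below = [i for i, r in enumerate(ratios) if r < threshold]
    return {
        "mean_retention": mean(ratios),
        "worst_retention": min(ratios),
        "below_threshold": below,
    }


# Placeholder scores for three prompts; replace with your own metric output.
full = [0.80, 0.90, 0.70]
quant = [0.78, 0.81, 0.70]
report = retention_report(full, quant)
print(report)
```

Looking at the worst prompt and the list of failing indices, not only the mean, helps catch prompt categories where the quantized model degrades more than the headline number suggests.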

Sources

1. reduces the footprint of a modern 4B-class diffusion transformer by up to 8.3x
2. the first image model in its parameter class to run directly on the iPhone
3. On iPhone 17 Pro Max, Bonsai Image 4B generates a 512x512 image in about 9.4 seconds
4. retain up to 95% of the image-generation quality of the full-precision model
5. Both 1-bit and Ternary Bonsai Image 4B will be released with open weights and code under the Apache 2.0 license
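The size and latency figures quoted above can be cross-checked with back-of-envelope arithmetic. The sketch below is only that: it assumes "4B" means exactly 4 billion parameters, that sizes are decimal gigabytes, and that the 8.3x reduction pairs with the smallest (0.93 GB) file. None of these assumptions is confirmed by the sources, and the helper names are hypothetical.

```python
PARAMS = 4e9  # assumption: "4B" taken as exactly 4 billion parameters
GB = 1e9      # assumption: decimal gigabytes (10**9 bytes)


def effective_bits_per_param(size_gb, params=PARAMS):
    """Average bits stored per parameter for a file of the given size."""
    return size_gb * GB * 8 / params


def implied_baseline_gb(size_gb, reduction):
    """Full-precision size implied by a quantized size and reduction factor."""
    return size_gb * reduction


def images_per_minute(seconds_per_image):
    """Sequential throughput at a fixed per-image latency."""
    return 60 / seconds_per_image


small_gb, large_gb = 0.93, 1.21  # ends of the quoted size range

print(round(effective_bits_per_param(small_gb), 2))   # 1.86
print(round(effective_bits_per_param(large_gb), 2))   # 2.42
print(round(implied_baseline_gb(small_gb, 8.3), 2))   # 7.72
print(round(images_per_minute(9.4), 1))               # 6.4
```

Under these assumptions the implied baseline of about 7.7 GB is in the neighborhood of the 8 GB that 4 billion 16-bit weights would occupy, and a 9.4-second generation works out to roughly six images per minute on the quoted device.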

Dev Signal
