Composer 2.5 improves long-task execution and collaboration
Targeted textual feedback during RL training fixes localized failures (bad tool calls, style violations) that global reward signals miss, enabling better long-horizon behavior without full rollout retraining.
May 20, 2026
Summary
Composer now handles multi-step coding tasks more reliably with fewer false starts, reducing iteration cycles in sustained agentic work. Better instruction following and communication style cut friction in human-AI collaboration loops.
Implementation verdict
Drop-in replacement for Composer 2, priced at $0.50 per million input tokens and $2.50 per million output tokens (standard) or $3.00 and $15.00 (fast tier). No client-side changes are required; Cursor users get it automatically. Worth switching today if you run long-context code tasks: the 25x increase in synthetic training tasks and the targeted textual feedback are aimed at the timeout and retry patterns common in multi-file editing.
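To make the pricing concrete, here is a minimal cost sketch using the per-million-token rates quoted in this brief. The `Tier` class, the `estimate_cost` helper, and the example token counts are hypothetical and chosen only for illustration; they are not part of any Cursor API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    """Per-million-token prices in USD (hypothetical helper type)."""
    input_per_m: float
    output_per_m: float


# Rates as quoted in this brief.
STANDARD = Tier(input_per_m=0.50, output_per_m=2.50)
FAST = Tier(input_per_m=3.00, output_per_m=15.00)


def estimate_cost(tier: Tier, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one task on the given tier."""
    total = input_tokens * tier.input_per_m + output_tokens * tier.output_per_m
    return total / 1_000_000


# Assumed workload: a long-running task with 2M input and 200k output tokens.
standard_cost = estimate_cost(STANDARD, 2_000_000, 200_000)
fast_cost = estimate_cost(FAST, 2_000_000, 200_000)

print(f"standard: ${standard_cost:.2f}")  # standard: $1.50
print(f"fast:     ${fast_cost:.2f}")      # fast:     $9.00
```

At the quoted rates the fast tier costs six times the standard tier for both input and output, so the ratio stays the same whatever the input/output mix.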
Sources
1. It is better at sustained work on long-running tasks, follows complex instructions more reliably, and is more pleasant to collaborate with
2. Composer 2.5 is trained with 25x more synthetic tasks than Composer 2
3. we trained Composer 2.5 with targeted textual feedback
4. Credit assignment during RL is becoming an increasingly difficult challenge as rollouts can span hundreds of thousands of tokens
5. Composer 2.5 is priced at $0.50/M input and $2.50/M output tokens
Dev Signal