openai-models reinforcement-learning reasoning tool-use inference-cost

OpenAI ships o3, o4-mini with scaling RL improvements

o4-mini is cheaper and better across the board; o3 gains 10x compute efficiency on RL, now dominating benchmarks like SEAL and AIME.

June 11, 2026

Summary

o3 and o4-mini introduce end-to-end tool use and multimodal reasoning in chain-of-thought, reducing inference cost per task. Vision and tool capabilities reshape what agents can execute without external orchestration.

Why it matters

Implementation verdict

o4-mini replaces o1-mini for cost-sensitive reasoning tasks. It requires API access (vision and tools are not yet available). o3 is 4-5x more expensive than Gemini 2.5 Pro: worth testing on tasks where the reasoning gain justifies the cost, but skip it for simple completions. Codex CLI (open source) is ready now for code generation workflows.
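One rough way to check whether a 4-5x price premium is justified is to compare expected cost per solved task rather than cost per call. The sketch below assumes attempts are independent, that a wrong answer can be detected and retried, and that the success rates come from your own evals. The function names and all numbers are hypothetical illustrations, not figures from the sources.

```python
def cost_per_solved_task(cost_per_attempt: float, success_rate: float) -> float:
    """Expected spend to obtain one correct answer when retrying until success.

    With independent attempts, the expected number of attempts is
    1 / success_rate, so the expected cost is cost_per_attempt / success_rate.
    """
    if not 0 < success_rate <= 1:
        raise ValueError("success_rate must be in (0, 1]")
    return cost_per_attempt / success_rate


def pricier_model_pays_off(cost_ratio: float,
                           cheap_success_rate: float,
                           pricey_success_rate: float) -> bool:
    """True if the pricier model is cheaper per solved task.

    cost_ratio is the pricier model's cost per attempt relative to the
    cheaper one (e.g. 4.5 for "4-5x more expensive"). The cheaper model's
    cost per attempt is normalized to 1.0.
    """
    cheap = cost_per_solved_task(1.0, cheap_success_rate)
    pricey = cost_per_solved_task(cost_ratio, pricey_success_rate)
    return pricey < cheap


# Hypothetical success rates; replace them with measurements from your own evals.
hard_task = pricier_model_pays_off(4.5, cheap_success_rate=0.15, pricey_success_rate=0.90)
easy_task = pricier_model_pays_off(4.5, cheap_success_rate=0.60, pricey_success_rate=0.90)
print(hard_task, easy_task)
```

Under these assumptions the pricier model only wins when its success rate exceeds the cheaper model's by more than the cost ratio, which is why it tends to pay off on hard reasoning tasks and not on simple completions.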

Sources

1. o4-mini is cheaper and better across the board
2. improvements in both scaling RL
3. o3 is 4-5x more expensive than Gemini 2.5 Pro
4. o3 is absolutely dominating the SEAL leaderboard
5. Codex CLI, which oneupped Claude Code by being fully open source
6. o3 and o4-mini can integrate uploaded images directly into their chain of thought

Dev Signal
