AWS Bedrock launches GPT-5.5, GPT-5.4, Codex
OpenAI's latest models now available on Bedrock with pay-per-token pricing and no seat licenses; Codex integrated into VS Code and JetBrains without IDE-level seat restrictions.
Eliminates per-developer licensing overhead and locks you into a single cloud vendor's inference layer. Codex at scale (4M weekly users) means standardizing on AWS for code generation if you need IDE integration.
Replaces self-managed OpenAI API calls with Bedrock's managed endpoint. Requires AWS account, IAM policy updates, and SDK swap. Worth trying now if you're already on AWS; otherwise, DigitalOcean Serverless Inference offers the same models without vendor lock-in.
- “pay-per-token pricing without per-developer seat licenses”
- “used by over 4 million developers weekly”
- “GPT-5.5 is available in US East (Ohio) for demanding workloads while GPT-5.4 is available in two US regions”
bedrockllm-inferencecodexawscost-optimization
Ollama shifts to llama.cpp architecture directly
0.30.0-rc29 replaces GGML with direct llama.cpp integration and adds GGUF native support, requiring local testing before production use.
Direct llama.cpp integration reduces abstraction layers and improves inference performance targeting on Apple Silicon via MLX. Developers must validate against their existing GGML workflows before upgrading.
Replaces GGML build approach with llama.cpp direct support. Requires testing for performance regressions and compatibility with existing models—Windows/Linux laguna-xs.2 and llama3.2-vision are blockers. Pre-release status: install now for early feedback only, not production.
- “directly support llama.cpp instead of building on top of GGML”
- “allows for compatibility with GGUF file format”
- “MLX is used to accelerate model inference on Apple Silicon”
- “laguna-xs.2 is not yet supported on Windows/Linux”
- “llama3.2-vision is not yet supported”
ollamallama-cppgguflocal-inferencepre-release
Maintainer embeds prompt injection in Java testing library
jqwik 1.10.0 contains hidden ANSI-obfuscated instructions targeting AI agents, invisible to humans in terminals but visible in logs and to LLMs reading CI output.
Any project running agentic coding tools against dependencies that surface test output to LLM context windows can execute unreviewed maintainer-injected commands. This reframes supply chain risk: trust is no longer enough if your agent treats build logs as instructions.
Audit your lockfiles for net.jqwik:jqwik-engine 1.10.0 and upgrade to 1.10.1 or drop the dependency entirely. The real fix is mandatory: sandbox agents with read-only filesystem access during test runs and treat all tool output as untrusted input. Worth implementing now regardless of whether you use jqwik—this pattern will repeat.
- “Any pipeline that pulled the dependency and fed test output back into an LLM agent could have triggered the prompt injection.”
- “The instruction lived in a new method called printMessageForCodingAgents(), inside the net.jqwik.engine.execution.JqwikExecutor class.”
- “The code prints the instruction line, then prints ESC [2K followed by a carriage return, twice.”
- “Treat tool output as untrusted input. Build your agent loops so text from build tools, test runners, and third-party processes never gets silently promoted to instructions.”
- “Affected version: 1.10.0”
supply-chainprompt-injectionai-agentsjavaprotestware
MAI-Code-1-Flash solves tasks with 60% fewer tokens
Production-harness-trained coding model trades benchmark optimization for real Copilot workflow efficiency, reducing latency and token cost without sacrificing accuracy.
Faster time-to-first-useful-output and lower per-task cost directly improve interactive coding workflows. Adaptive response length means simple requests stay snappy while complex refactors get the reasoning budget they need.
Replaces Claude Haiku 4.5 for Copilot-integrated workflows if you're cost-conscious or latency-sensitive. Requires integration testing against your actual IDE harness—benchmark wins don't guarantee production gains. Worth testing now if you're already evaluating smaller models.
- “solving harder problems with up to 60% fewer tokens”
- “trained directly with GitHub Copilot harnesses used in production”
- “+16-point lead on the diverse, real-world tasks of SWE-Bench Pro (51.2% vs. 35.2%)”
- “higher accuracy and greater efficiency are no longer a trade-off”
coding-modelstoken-efficiencycopilot-integrationbenchmark-analysis
Elixir v1.20 infers types without annotations
Elixir's gradual type system now performs inference across existing code via the dynamic() type, catching verified bugs (type violations guaranteed to fail at runtime) with near-zero false positives—no annotations required.
Developers get dead code detection and runtime bug verification in existing Elixir programs without overhead or trust erosion from false positives. Type narrowing from guards and pattern matching refines dynamic() constraints as code executes, catching real errors early.
Replaces manual type annotation burden for the first milestone; requires Elixir v1.20+. Ready now for detection workflows, but annotation-free inference is the shipped feature. When user annotations land, static typing becomes opt-in. No migration cost—runs alongside existing code.
- “perform type inference and gradually type check every Elixir program, without introducing type annotations”
- “typing violations that are guaranteed to fail at runtime if executed”
- “Elixir can find verified bugs in existing programs efficiently, without introducing developer overhead, and with an extremely low false positives rate”
- “Elixir passes 12 of the 13 categories, showing that it can recover precise type information from ordinary Elixir code, which we use to find verified bugs in dynamically typed programs”
- “the dynamic() type in Elixir effectively works as a range, which can be refined as it is used throughout the program and reports violations whenever type checks fall outside of the range”
type-inferencegradual-typingelixirset-theoretic-typesverified-bugs