Browser Run exposes CDP endpoint for agent control
Cloudflare's Browser Rendering rebrand to Browser Run now exposes Chrome DevTools Protocol directly, eliminating the need to write Workers and letting existing CDP scripts migrate with a one-line config change.
Removes infrastructure overhead for agent-browser automation; existing CDP scripts and agent frameworks (Claude Desktop, Cursor) can now target Cloudflare's global network without self-hosting Chrome or abstraction layer rewrites. Session recordings, Human-in-the-Loop handoff, and 120 concurrent browsers scale agent workflows beyond prototype.
Replaces self-hosted Chrome infrastructure. Requires: Cloudflare account, API token, pointing browserWSEndpoint to the new CDP endpoint. Ready now—existing Puppeteer/Playwright code works unchanged if you're already using Browser Rendering; if you're running self-hosted Chrome with CDP scripts, it's a one-line migration. WebMCP is speculative (Chromium 146+ adoption unknown) but doesn't block adoption.
“run full browser sessions on Cloudflare's global network, drive them with code or AI, record and replay sessions, crawl pages for content, debug in real time”
“120 concurrent browsers, up from 30”
“Point your WebSocket URL at Browser Run and stop managing your own browser infrastructure”
“CDP gives agents the most control possible over the browser”
“you can now connect from any language, any environment, without needing to write a Cloudflare Worker”
Get issues like this in your inbox — free, 3x a week.
Quick Signals
Genkit middleware intercepts generation calls three layers deep
Middleware hooks attach to generate(), model, and tool layers to inject retries, fallbacks, human approval, and custom logic without rewriting prompts.
Replaces scattered error-handling and safety logic across your codebase with composable, reusable middleware that works across TypeScript, Go, and Dart. Eliminates the need to encode policy rules in every prompt.
Ready now in TypeScript, Go, Dart; Python coming soon. Five pre-built modules (Retry, Fallback, ToolApproval, Skills, Filesystem) handle 80% of production needs. Custom middleware requires minimal boilerplate (~20 lines). Worth adopting immediately if you're building agentic apps and need deterministic guardrails.
“Middleware hooks attach at three layers of this loop”
“Only the model call is retried; the surrounding tool loop is not replayed”
“The middleware system is available today in TypeScript, Go, and Dart, with Python support coming soon”
“provide a name and a factory function that returns the hooks you want”
Parallel text diffusion model trades output quality for local inference speed by generating 256 tokens per forward pass instead of sequential decoding.
Eliminates GPU underutilization in single-user local inference by shifting from memory-bandwidth bottleneck to compute-bound workload, unlocking real-time interactive features like inline editing and code infilling without cloud latency.
Enjoying Dev Signal? Get every issue in your inbox.
Free forever · 3 issues a week · One-click unsubscribe
3 issues a week · Free forever · 4,200+ developers
Replaces autoregressive Gemma 4 for speed-critical local workflows only; requires dedicated GPU with 18GB VRAM (H100: 1000+ tok/s, RTX 5090: 700+ tok/s); experimental quality makes it unsuitable for production output. Worth trying now for interactive apps, not general-purpose replacement.
“up to 4x faster text generation on GPUs”
“26B Mixture of Experts (MoE) model that activates only 3.8B parameters during inference”
“1000+ tokens per second on a single NVIDIA H100, 700+ tokens per second on NVIDIA GeForce RTX 5090”
“Generating 256 tokens in parallel with each forward pass allows every token to attend to all others”
“DiffusionGemma's overall output quality is lower than standard Gemma 4”
Unified Model API translates client requests to OpenAI Chat Completions format, routing transparently to Anthropic, Google Vertex, or other backends—one governance layer covers all providers.
Eliminates vendor lock-in at the API layer and collapses governance overhead. Teams mixing models across providers no longer need separate rate-limiting, content safety, and token accounting per backend.
Replaces custom adapter code for model portability. Requires Azure API Management tier (available across all SKUs) and registration of backend providers in APIM. Unified Model API is public preview; content safety for MCP/A2A and extended token metrics are GA. Worth trying now if already running APIM; adoption friction is low for new deployments.
“Unified Model API lets clients standardize on a single format, currently OpenAI Chat Completions, while APIM transparently transforms requests to the backend provider's native format”
“every governance policy, rate limit, content safety check, and token metric applies consistently, regardless of which provider handles inference”
“the policy buffers events in a sliding window and simply stops forwarding further events to the client without returning an error”
“APIM now logs reasoning tokens, cached tokens, and audio tokens to Application Insights”
api-gatewaymulti-model-routingazuregovernancemcp
Cloudflare Mesh routes agent traffic through private networks
Mesh provides bidirectional private networking for AI agents without VPN interactive login or manual SSH tunnel setup, inheriting Cloudflare One's security policies automatically.
Agents now need secure access to private infrastructure (databases, staging APIs, home services) without exposing them publicly or requiring per-resource tunnel configuration. Mesh eliminates the credential-leakage and visibility gaps that plague traditional private network tools for autonomous workloads.
Replaces VPN, SSH tunnels, and per-service Tunnel configs for agent access patterns. Requires Cloudflare One subscription and lightweight connector deployment. Ready now—GA release with existing Cloudflare One policies apply automatically. Worth trying if you're already on Cloudflare One; otherwise evaluate cost vs. self-hosted mesh alternatives.
“agents need to reach private resources, but the tools for doing that were built for humans, not autonomous software”
“Mesh is directly integrated into your existing Cloudflare One deployment. Your existing Gateway policies, Access rules, and device posture checks apply to Mesh traffic automatically”
“Every device and node on your Mesh can access one another using their private IPs”
“All Mesh traffic routes through Cloudflare's global network across 330+ cities”
Modular skill packages loaded into coding agents fix the gap between general LLM knowledge and SDK-specific patterns—paginators, waiters, async/await, error handling—that agents consistently misgenerate.
AI-generated AWS SDK code frequently doesn't compile, silently fails on real data, or leaves performance and cost on the table. Skills reduce the need for manual code review and rework when agents handle S3, DynamoDB, and client initialization tasks.
Replaces manual prompting/guardrails for AWS SDK tasks in agents that support the open skills format. Requires agent integration and one of three available skills (Swift, JavaScript v3, Python). Ready now—install via `npx skills add` and test against your workflows. Start with the SDK language you use most in agent tasks.
“AI coding agents know the general shape of AWS SDK usage, but they get the details wrong.”
“Skills are modular packages that give AI coding agents specialized SDK knowledge.”
“Across our test suite, code generated with a skill installed consistently passed more checks than code generated without one.”
“S3Client() is async throws, and so is listBuckets.”