Kimi 2.7 ranked 2nd, behind Fable 5 and ahead of GPT-5 xhigh
We have re-run our ErdosBench smoke test on 14 problems with Kimi 2.7, Qwen 3.7 Max, and Grok 4.3, and compared them with the top performers from previous runs.
Kimi 2.7 is amazingly good. More below.
Kimi K2.7 Code is now live on Workers AI. This code-optimized MoE model delivers big benchmark gains and a massive 262k context window for complex coding tasks.
https://t.co/h5gx6b5fzP
🌘 Kimi-K2.7-Code, our latest coding model, is now released and open-sourced!
🔷 Improved coding & agent performance over K2.6: +21.8% on Kimi Code Bench v2, +11.0% on Program Bench, and +31.5% on MLS Bench Lite.
🔷 Reasoning efficiency: Less overthinking, with 30% lower reasoning-token usage compared to K2.6.
🔷 Long-horizon coding: Improved instruction following, higher end-to-end coding task success rates.
⚡️ 6x High-Speed Mode coming soon!
🔌 Available today via Kimi API and Kimi Code.
🔗 Kimi Code: https://t.co/uvoSJKyGCY
🔗 API: https://t.co/EOZkbOwCN4
Browser Run's /snapshot endpoint now supports a formats parameter. Return screenshot, Markdown, and accessibility tree together in a single API call.
https://t.co/of84pmaYGB
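A rough sketch of what a single call asking for several formats might look like. Only the `/snapshot` path and the `formats` parameter name come from the announcement; the endpoint host, the `url` field, the bearer-token header, and the three format strings are assumptions for illustration, so check the linked docs for the real values. The request is built but not sent.

```python
import json
import urllib.request

# Hypothetical placeholders: the real host, auth scheme, and body fields may differ.
ENDPOINT = "https://api.example.com/snapshot"
API_TOKEN = "YOUR_API_TOKEN"


def build_snapshot_request(page_url, formats):
    """Build (but do not send) a POST request asking for several formats at once."""
    body = json.dumps({"url": page_url, "formats": formats}).encode("utf-8")
    return urllib.request.Request(
        ENDPOINT,
        data=body,
        headers={
            "Authorization": f"Bearer {API_TOKEN}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Format names below are guesses based on the post's wording.
request = build_snapshot_request(
    "https://example.com",
    ["screenshot", "markdown", "accessibility_tree"],
)
payload = json.loads(request.data)
print(payload["formats"])
```

Sending it would be a matter of passing `request` to `urllib.request.urlopen` once the placeholders are replaced with real values.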
Cloudflare DLP now lets you define custom topics for AI prompt protection. Detect sensitive concepts like confidential roadmap details even when prompts paraphrase them without using specific keywords.
https://t.co/WIeSUyIXYW
We've just added two new Claude Managed Agents features:
1. Scheduled deployments - run tasks on a schedule
2. Environment variables - expose vault credentials for CLIs as environment variables
New from Code with Claude Tokyo: scheduled deployments and environment variables in vaults are in public beta in Claude Managed Agents, and dynamic workflows in Claude Code are generally available.
Agents now run on a schedule, use your tools securely, and take on bigger jobs.
Docker for AI Agents is officially over 🤯
Pydantic open-sourced a new way to run LLM-generated code that:
- does not need Docker.
- does not spin up containers.
- does not call any cloud sandbox.
- does not cost a cent to run.
It's called Monty.
Instead of spinning up a Docker container every time your agent writes code, it runs Python directly in your own process, locked down by a tiny Rust interpreter that controls every filesystem, network, and env call.
boots in 0.06ms. ~3,000x faster than a Docker container.
snapshots execution to bytes so you can pause and resume mid-run.
no containers. no images. no daemon.
100% open source.
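As a quick sanity check on the two numbers quoted above, the 0.06ms boot time and the "~3,000x" ratio together imply a Docker container start of roughly 180ms. The post does not state that baseline, so the figure below is only what the quoted numbers work out to, not a measurement.

```python
# Figures quoted in the post; the speedup is approximate ("~3,000x").
monty_boot_ms = 0.06
speedup = 3000

# Docker start time implied by the two quoted figures.
implied_docker_ms = monty_boot_ms * speedup
print(f"Implied Docker container start: ~{implied_docker_ms:.0f} ms")
```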
SMTP submission is now available in beta for Cloudflare Email Service. Send emails through standard clients like Nodemailer and smtplib.
https://t.co/uUD4jxzcJs
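A minimal sketch of SMTP submission from Python's standard `smtplib`, which the post names as a supported client. The host, port, credentials, and addresses are hypothetical placeholders, and the use of STARTTLS on port 587 is an assumption about the service; take the real settings from the linked docs. The send call is left commented out so the snippet runs without a server.

```python
import smtplib
from email.message import EmailMessage

# Hypothetical placeholders: none of these values come from the announcement.
SMTP_HOST = "smtp.example.com"
SMTP_PORT = 587
SMTP_USER = "user@example.com"
SMTP_PASSWORD = "app-password"


def build_message(sender, recipient, subject, body):
    """Assemble a plain-text email."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def send(msg, host=SMTP_HOST, port=SMTP_PORT, user=SMTP_USER, password=SMTP_PASSWORD):
    """Submit the message over SMTP, upgrading to TLS before authenticating."""
    with smtplib.SMTP(host, port, timeout=10) as client:
        client.starttls()
        client.login(user, password)
        client.send_message(msg)


message = build_message(
    "sender@example.com",
    "recipient@example.com",
    "Hello over SMTP",
    "Sent with smtplib.",
)
# send(message)  # uncomment after replacing the placeholders with real settings
```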