🌘 Kimi-K2.7-Code, our latest coding model, is now released and open-sourced!
🔷 Improved coding & agent performance over K2.6: +21.8% on Kimi Code Bench v2, +11.0% on Program Bench, and +31.5% on MLS Bench Lite.
🔷 Reasoning efficiency: Less overthinking, with 30% lower reasoning-token usage compared to K2.6.
🔷 Long-horizon coding: Improved instruction following and higher end-to-end coding task success rates.
⚡️ 6x High-Speed Mode coming soon!
🔌 Available today via Kimi API and Kimi Code.
🔗 Kimi Code: https://t.co/uvoSJKyGCY
🔗 API: https://t.co/EOZkbOwCN4
For the next two weeks, Plus and Pro users can invite up to three friends to try Codex.
When a friend sends their first Codex message, you’ll both get another banked reset.
We heard you wanted to use Codex rate limit resets on your own time.
Starting today, we’re rolling out the ability to save rate limit resets to use later.
We’re starting Go, Plus, Pro, and Business users with one free reset:
Anthropic can use the internet, copyrighted books, code, user feedback, public human knowledge, synthetic data, and its own models to improve Claude.
But if a developer uses Claude to bootstrap a competitive open alternative, Anthropic calls foul.
That is called gatekeeping.
🔓 And the best part — we're open-sourcing it.
1,000+ tps on a 1T model wasn't a single breakthrough; it came from deep model × system co-design between the MiMo and TileRT teams, all on general-purpose GPUs (no Cerebras-style wafer-scale, no Groq-style SRAM ASICs).
On the model side: FP4 quantization (smaller footprint, less memory traffic) + DFlash, our block-masked parallel speculative decoding that accepts far more tokens per verification. On the system side, TileRT tailors its compiler & kernels to exactly these techniques.
The result: a 1T model breaking 1,000 tps on a single, standard 8-GPU node.
🤗 Open weights (FP4 + DFlash checkpoint): https://t.co/jYQsgeruMg
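
To make "accepts far more tokens per verification" concrete, here is a minimal sketch of the generic greedy speculative-decoding acceptance rule. It is not DFlash itself, whose block-masked parallel scheme is not described in this post. The function name `accept_draft` and the token lists are hypothetical and for illustration only.

```python
from typing import List


def accept_draft(draft: List[int], target: List[int]) -> List[int]:
    """Generic greedy verification step (assumption: NOT the DFlash algorithm).

    draft:  tokens proposed by a cheap draft mechanism.
    target: the large model's greedy choice at each drafted position,
            plus one extra position after the last drafted token.
    Returns the tokens emitted by this single verification pass.
    """
    if len(target) != len(draft) + 1:
        raise ValueError("target must have exactly one more entry than draft")

    accepted: List[int] = []
    for position, token in enumerate(draft):
        if token != target[position]:
            break  # first disagreement: discard the rest of the draft
        accepted.append(token)

    # The large model always contributes one token of its own: either the
    # correction at the first mismatch or a bonus token after a full match.
    accepted.append(target[len(accepted)])
    return accepted
```

Under this rule, one verification pass emits between 1 and `len(draft) + 1` tokens, so the more drafted tokens the large model agrees with, the fewer expensive passes are needed per generated token.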
If you really think about it, despite being mocked as “ClosedAI,” OpenAI has contributed enormously to the field: GPT, GPT-2, GPT-3, CLIP, the ChatGPT paper, the GPT-4 Technical Report, and the Sora technical blog. It even open-sourced Codex.
Anthropic, meanwhile, has contributed far less to the public research ecosystem while increasingly promoting fear-based narratives and restricting access through heavy gatekeeping.
The world I least want to live in is one where the future of AI is controlled by companies that prioritize secrecy, gated access, and centralized control over openness, reproducibility, and scientific progress.
Things I really dislike about Fable:
1. Anthropic collects my prompt history, stores it, and does whatever it wants with it for 30 days. No opt-out.
2. They can nerf their most expensive model whenever they want without telling me, while billing me the same amount and wasting my time.