We open sourced Kimi K2.6. The next frontier in test-time compute isn't bigger models. It's better organizations of intelligence.
The hardest things were never built by one person. They require coordination. Different skills, different contexts, different minds arguing until something better emerges.
Meet Kimi K2.6: Advancing Open-Source Coding
🔹Open-source SOTA on HLE w/ tools (54.0), SWE-Bench Pro (58.6), SWE-bench Multilingual (76.7), BrowseComp (83.2), Toolathlon (50.0), Charxiv w/ python (86.7), Math Vision w/ python (93.2)
What's new:
🔹Long-horizon coding - 4,000+ tool calls, over 12 hours of continuous execution, with generalization across languages (Rust, Go, Python) and tasks (frontend, devops, perf optimization).
🔹Motion-rich frontend - Videos in hero sections, WebGL shaders, GSAP + Framer Motion, Three.js 3D.
🔹Agent Swarms, elevated - 300 parallel sub-agents × 4,000 steps per run (up from K2.5's 100 / 1,500). One prompt, 100+ files.
🔹Proactive Agents - K2.6 powers OpenClaw, Hermes Agent, etc. for 24/7 autonomous ops.
🔹Claw Groups (research preview) - bring your own agents, command your friends' agents, and keep bots & humans in the loop.
-
K2.6 is now live on https://t.co/YutVbwktG0 in chat mode and agent mode.
For production-grade coding, pair K2.6 with Kimi Code: https://t.co/uvoSJKyGCY
-
🔗 API: https://t.co/EOZkbOwCN4
🔗 Tech blog: https://t.co/9wWvgIQSS3
🔗 Weights & code: https://t.co/Be0hjs2RTP
Congrats to the @cursor_ai team on the launch of Composer 2!
We are proud to see Kimi-k2.5 provide the foundation. Seeing our model integrated effectively through Cursor's continued pretraining & high-compute RL training is the open model ecosystem we love to support.
Note: Cursor accesses Kimi-k2.5 via Fireworks' hosted RL and inference platform as part of an authorized commercial partnership.
Introducing Kimi K2.5: open-source visual agentic intelligence
🚀State-of-the-art benchmarks: Humanity's Last Exam full set (50.2%), BrowseComp (74.9%)
Vision is humanity's native language. When Kimi understands what you see, creation becomes instinctive.
No coding, no frontend jargon. Upload a mockup, share a video, describe your vision. Kimi turns it into code with taste.
Enjoy creation 🌈
🥝Meet Kimi K2.5, Open-Source Visual Agentic Intelligence.
🔹Global SOTA on Agentic Benchmarks: HLE full set (50.2%), BrowseComp (74.9%)
🔹Open-source SOTA on Vision and Coding: MMMU Pro (78.5%), VideoMMMU (86.6%), SWE-bench Verified (76.8%)
🔹Code with Taste: turn chats, images & videos into aesthetic websites with expressive motion.
🔹Agent Swarm (Beta): self-directed agents working in parallel, at scale. Up to 100 sub-agents, 1,500 tool calls, 4.5× faster compared with single-agent setup.
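For readers curious what "agents working in parallel" can look like in code, here is a minimal sketch of a fan-out pattern with a concurrency cap and a tool-call budget. It is not Kimi's implementation. The names `Budget`, `sub_agent`, `run_swarm` and `demo` are hypothetical, the sub-agent is a stand-in that does no real work, and treating the 1,500 tool calls as one budget shared across the whole run is an assumption. Only the two numeric limits come from the post.

```python
import asyncio

MAX_SUB_AGENTS = 100      # concurrency cap (figure from the post)
TOOL_CALL_BUDGET = 1_500  # assumed to be shared across the whole run


class Budget:
    """Shared counter of remaining tool calls (hypothetical helper)."""

    def __init__(self, total: int) -> None:
        self.remaining = total

    def take(self) -> bool:
        # Runs on a single event loop with no await inside, so no lock is needed.
        if self.remaining <= 0:
            return False
        self.remaining -= 1
        return True


async def sub_agent(task: str, steps: int, budget: Budget) -> str:
    """Stand-in for one sub-agent; a real one would call a model and tools."""
    done = 0
    for _ in range(steps):
        if not budget.take():
            break  # budget exhausted, stop early
        await asyncio.sleep(0)  # placeholder for an actual tool call
        done += 1
    return f"{task}: {done} tool calls"


async def run_swarm(tasks: list[str], steps: int, budget: Budget) -> list[str]:
    """Fan tasks out to sub-agents, at most MAX_SUB_AGENTS at a time."""
    gate = asyncio.Semaphore(MAX_SUB_AGENTS)

    async def guarded(task: str) -> str:
        async with gate:
            return await sub_agent(task, steps, budget)

    # gather returns results in the same order as the input tasks.
    return await asyncio.gather(*(guarded(t) for t in tasks))


def demo() -> tuple[list[str], int]:
    budget = Budget(TOOL_CALL_BUDGET)
    tasks = [f"file_{i}" for i in range(120)]
    results = asyncio.run(run_swarm(tasks, steps=10, budget=budget))
    return results, budget.remaining
```

Calling `demo()` runs 120 placeholder tasks of 10 steps each, so 1,200 of the 1,500 budgeted calls are used. The semaphore keeps no more than 100 sub-agents active at once, and the shared budget stops the run when it reaches zero.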
-
🥝K2.5 is now live on https://t.co/YutVbwktG0 in chat mode and agent mode.
🥝K2.5 Agent Swarm in beta for high-tier users.
🥝For production-grade coding, you can pair K2.5 with Kimi Code: https://t.co/A5WQozJF3s
-
🔗 API: https://t.co/EOZkbOwCN4
🔗 Tech blog: https://t.co/6h2KkoA0xd
🔗 Weights & code: https://t.co/H38KegeDIY
Kimi K2 Thinking is here!
Scale up reasoning with more thinking tokens and tool-call steps.
Now live on https://t.co/YutVbwktG0, the Kimi app, and API.
Today, we're releasing Kimi K2 Thinking, our best open-source model.
What makes it different isn't just the benchmarks, though it achieves SOTA results on Humanity's Last Exam, BrowseComp, and other challenging tests. What matters is how it thinks.
It reminds me of the minds on our team: always asking the next question, refusing to settle for the first answer, following each thread until it leads somewhere true.
This is test-time scaling in its full form, giving models the space to think longer and act more deliberately.
🚀 Hello, Kimi K2 Thinking!
The Open-Source Thinking Agent Model is here.
🔹 SOTA on HLE (44.9%) and BrowseComp (60.2%)
🔹 Executes up to 200 to 300 sequential tool calls without human intervention
🔹 Excels in reasoning, agentic search, and coding
🔹 256K context window
Built as a thinking agent, K2 Thinking marks our latest efforts in test-time scaling — scaling both thinking tokens and tool-calling turns.
K2 Thinking is now live on https://t.co/YutVbwktG0 in chat mode, with full agentic mode coming soon. It is also accessible via API.
🔌 API is live: https://t.co/EOZkbOwCN4
🔗 Tech blog: https://t.co/n7xxaszqzF
🔗 Weights & code: https://t.co/4ukcXB0iP6
🚨 BREAKING: @Kimi_Moonshot’s Kimi-K2 is now the #1 open model in the Arena!
With over 3K community votes, it ranks #5 overall, overtaking DeepSeek as the top open model.
Huge congrats to the Moonshot team on this impressive milestone! The leaderboard now features 7 different providers in the top 15 - the most competitive it’s ever been.
More insights in the thread 🧵
@LEON_0xx0 @Kimi_Moonshot @YouWareAI Thank you for the support! We’re pushing hard to make it better. You should see a noticeable speed bump within the next few days!
We put Kimi K2 @Kimi_Moonshot to the test on @YouWareAI using actual user queries. The performance is shockingly good, and the cost savings are amazing. Here are 8 test cases from our platform to show you the difference👇
Moonshot AI has surpassed xAI in token market share, just a few days after launching Kimi K2
🎁 We also just put up a free endpoint for Kimi - try it now! 👇