That tracks with the test. Grok Build 0.1 was strong on this specific bug-detection task and cheap doing it, but detection and full feature implementation across a repo are different jobs, and the ranking changes depending on which one you're asking for. Opus carrying the heavier repo-understanding work matches what we see too. It's the reason Code Reviewer keeps all of them on tap instead of betting on one. Use Grok where it's sharp, reach for Opus where it isn't.
Grok Build 0.1 cost $0.08 and outperformed 4 frontier models on the hardest bug in our code review test.
It also tied for second place on overall detection against Opus 4.8, GPT-5.5, Gemini 3.1 Pro, and Sonnet 4.6.
More expensive no longer equals better.
Exactly right. No single model catches everything, and the misses don't fully overlap, so running two or three in review surfaces more than any one of them alone. That's the whole reason Code Reviewer is model-agnostic. Pick a few, let them disagree, and the disagreements are usually where the real bugs are.
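To make the "misses don't fully overlap" point concrete, here's a minimal sketch. The model names and bug IDs are made-up placeholders, not results from our test, and the helper functions are hypothetical, not part of Code Reviewer.

```python
# Hypothetical detection results: which bug IDs each reviewer model flagged.
# Names and IDs are placeholders for illustration only.
detections = {
    "model_a": {"bug-1", "bug-2", "bug-4"},
    "model_b": {"bug-2", "bug-3"},
    "model_c": {"bug-1", "bug-5"},
}


def combined_catches(detections, models):
    """Return every bug flagged by at least one of the chosen models."""
    found = set()
    for name in models:
        found |= detections[name]
    return found


def disputed(detections, models):
    """Return bugs flagged by some, but not all, of the chosen models."""
    sets = [detections[name] for name in models]
    return set.union(*sets) - set.intersection(*sets)


pair = ["model_a", "model_b"]
print(sorted(combined_catches(detections, pair)))
print(sorted(disputed(detections, pair)))
```

In this toy data, the pair together flags more bugs than either model alone, and the disputed set is the short list worth a human look first.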
Full breakdown with per-category detection rates, cost per catch, and what each model got right and wrong:
https://t.co/bFc41XKQKA
Tested inside Code Reviewer in @kilocode - 500+ models, pay per token at provider cost, and swap whenever you want.
The cost math is what makes this hard to ignore. Grok and Sonnet tie on detection, but Grok gets there for about a third of the price - roughly 3x cheaper per catch than Sonnet, and nearly 9x cheaper than Opus.
The usual logic where cheap means less useful just doesn't apply here.
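For anyone who wants to run the same math on their own reviews, here's a rough sketch of how a "cost per catch" comparison can be computed. The dollar amounts and catch counts are placeholders, not figures from our test, and the function names are hypothetical.

```python
def cost_per_catch(total_cost_usd, bugs_caught):
    """Total spend divided by the number of bugs the model caught."""
    if bugs_caught <= 0:
        raise ValueError("bugs_caught must be positive")
    return total_cost_usd / bugs_caught


def times_cheaper(baseline_cost_per_catch, candidate_cost_per_catch):
    """How many times cheaper the candidate is per catch than the baseline."""
    return baseline_cost_per_catch / candidate_cost_per_catch


# Placeholder numbers: two models that tie on detection (8 catches each)
# but differ in total spend.
pricey = cost_per_catch(3.0, 8)
cheap = cost_per_catch(1.0, 8)
print(times_cheaper(pricey, cheap))
```

When two models tie on detection, the per-catch ratio collapses to the ratio of total spend, which is why a tie on quality makes the price gap the whole story.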
Kilo Code v7 for VS Code is @ProductHunt's #1 Product of the Month for May, and three of the top four spots in the OSS category for the year so far are also from @kilocode. 🎉
Thank you to every developer who voted us up. It genuinely made our month!
This is why I think the model ecosystem will only keep expanding.
When a model can compete with other frontier models on quality at a fraction of the price, it's a signal, not an outlier.
@xai shipping a model this good and cost-efficient means the frontier isn't one lab's to own anymore.
This one's been on the request list for a while. A lot of you are already on X Premium+ or SuperGrok and have been asking how to get those models into Kilo without paying for them twice. Now you can. Grok Build 0.1 included.
grok-build-0.1 is available in Kilo right now.
Built for speed and agentic coding. If you have SuperGrok or X Premium+, you can route to it from the Kilo IDE extension or CLI.
Go break something interesting.
https://t.co/ICLE1GDzXU
Grok Build 0.1 just blew my mind in Kilo Code.
$0.35 for a 3D, holographic @SpaceX Starship simulator.
1 prompt with the rocket specs.
1 prompt with the launch animation details.
Pretty remarkable stuff from the @xai team.
My second prompt instructed Kilo on the launch animation.
I told it that I wanted the camera to follow the rocket until the booster separated, then follow the booster back to the landing pad. Another one-shot.