Cursor Composer 2.5 is actually really good. They basically took the open Kimi 2.5 which is already a great coding model, trained it on Colossus 2 (SpaceX’s huge 200k GPU cluster), did continued pretraining on a ton of code data with long context, then ran this big RL stage with 25 times more synthetic long-horizon tasks, GRPO and self-summarization.
Post-training with high quality data works incredibly well. Google should just do this to catch up.
The singularity is here, it’s just not evenly distributed.
That is: the singularity is usually modeled in time, but we can model it in space. Conceptually, the change starts at one point on the surface of earth, and then propagates out. Imperceptibly and then suddenly.
Therefore: nothing ever happens, right before it’s happening.
I would go a step further than framing this as time, instead these are entirely different realities. Step into an AI lab where people are using a SOTA model and it’s like Scotty just beamed you to a different planet with a completely different understanding of the world and what is possible.
Proximity has always been a thing but not like this.
At first this made me think of crossing the chasm but what’s happening with AI is very different to the adoption of any single product. For example this is not like how smartphone adoption happened from early adopters to laggards. It’s more that a group of people can be living in a completely different reality of what is possible across the entire spectrum of applications.
With AI super-intelligence, the state of the art can be applied to every product, every service, every topic of research.
The closest comparison is the Industrial Revolution. Walking around Birmingham England in the early 1800’s you would have an understanding of how steam power and machinery was going to change everything. Enabling mass produced iron, and machine tools for interchangeable parts which led to power looms, engines and Babbage computers. The rest of the world at this time was still farming by hand- two different realities across the entire spectrum of science and technology.
Except this comparison is not close. The implications now are far more significant.
I'm lucky enough to have a great doctor and access to excellent Bay Area medical care. I've taken lots of standard screening tests over the years and have tried lots of "health tech" devices and tools.
With all this said, by far the most useful preventative medical advice that I've ever received has come from unleashing coding agents on my genome, having them investigate my specific mutations, and having them recommend specific follow-on tests and treatments.
Population averages are population averages, but we ourselves are not averages. For example, it turns out that I probably have a 30x(!) higher-than-average predisposition to melanoma. Fortunately, there are both specific supplements that help counteract the particular mutations I have, and of course I can significantly dial up my screening frequency. So, this is very useful to know.
I don't know exactly how much the analysis cost, but probably less than $100. Sequencing my genome cost a few hundred dollars.
(One often sees papers and articles claiming that models aren't very good at medical reasoning. These analyses are usually based on employing several-year-old models, which is a kind of ludicrous malpractice. It is true that you still have to carefully monitor the agents' reasoning, and they do on occasion jump to conclusions or skip steps, requiring some nudging and re-steering. But, overall, they are almost literally infinitely better for this kind of work than what one can otherwise obtain today.)
There are still lots of questions about how this will diffuse and get adopted, but it seems very clear that medical practice is about to improve enormously. Exciting times!
@bcherny FYI latest v of cc is wild today, making code changes while explicitly in plan mode and without confirmation to proceed. Also pulling in old plans which are already executed:
Feedback ID: 5594563c-729d-42c6-b0e5-4a72a968da36
Using the terminal is such a nicer experience than VS code/intelliJ/Cursor which often felt laggy handling large files - I never feel input lag in the terminal using Claude Code and Open Code and I can have many terminals going simultaneously.
@karpathy Yeah I experience the same. For medical it’s important to control context precisely and be deliberate with memory via API call instead of the consumer interfaces.