What can a neuron compute?
Real biological neurons are complex, but how capable are they?
Using a new method, we found that a single cortical neuron can classify cats vs dogs, recognize spoken words, and solve 10-bit parity, all tasks thought to require entire networks. (1/15)
Today I'm publishing a new essay, Policy on the AI Exponential. AI is progressing extremely fast—much faster than the policy process was built to handle. The essay lays out where I think the technology is now, and the action needed to close the gap: https://t.co/Lh6PWae178
How far can we compress the discrete tokens in an LLM's context into compact latent vectors?
With the right training recipe at large scale, our Latent Context Language Models (LCLMs) compress context up to 16× and land on a new Pareto frontier for long-context inference. 🧵(1/n)
Today, we're introducing Claude Fable 5 and Mythos 5, two configurations of our next major language model.
I'd normally highlight the numbers: It's SOTA on nearly all benchmarks. I want to talk about something else, because with Fable 5 out in the world, I think a third era quietly started today.
I lead Claude Code & Cowork on the desktop, so I think a lot about how people use AI to get work done. I believe we're about to see a major shift, moving from giving AI tasks to giving it responsibilities.
We've known about LLM test-time compute scaling since @OpenAI o1.
Yet 2 years later labs still report scalar evals for models; safety orgs are still surprised when a scaffold does better via 100x inference; and RSPs still ignore inference budget when deciding critical thresholds.
In medieval times, within the arms race of ever more demonic torture devices, some sadistic genius came up with the idea of the Little Ease.
This was a prison cell built so small in every dimension that a grown man could not stand upright in it nor lie down at full length nor properly sit.
The pain is relentless and without relief and inflicted by one's own body. Prisoners were known to go insane within a few days. A stay at the Little Ease was considered even more cruel than the rack, the thumbscrew, and the other ghoulish machinery of the Tower of London.
A breeding pig will spend her whole life in a version of that box.
These are social, roaming creatures (more intelligent than dogs) who will never leave this corset of steel.
They have been selectively bred to be bigger than their frames can support. Yet we put them in cells so confined that they cannot comfortably sit, and their attempts to do so (for example, by sneaking their limbs into adjacent stalls) reliably lead to fractures and sprains.
They cannot sweat, yet have nothing to roll around in to cool themselves off except their own manure, which (contrary to the common misconception) they are so averse to (thanks to their strong sense of smell) that new sows will often suffer from constipation to avoid soiling the space where they eat and sleep.
Here is how the writer Matthew Scully described what he saw at one of Smithfield’s “gestation barns”:
> “Sores, tumors, ulcers, pus pockets, lesions, cysts, bruises, torn ears, swollen legs everywhere. Roaring, groaning, tail biting, fighting, and other “Vices,” as they’re called in the industry. Frenzied chewing on bars and chains, stereotypical “vacuum” chewing on nothing at all, stereotypical rooting and nest building with imaginary straw. And “social defeat,” lots of it, in every third or fourth stall some completely broken being you know is alive only because she blinks and stares up at you … creatures beyond the power of pity to help or indifference to make more miserable, dead to the world except as heaps of flesh into which the [insemination] rod may be stuck once more and more flesh reproduced.”
—
The Save Our Bacon Act is trying to roll back the few state protections we have against this barbaric cruelty - for example California’s Prop 12, which banned the sale of pork from pigs kept in gestation crates.
It’s incredibly important we don’t end up with this sort of federal preemption.
SOB will not only kill the most important animal welfare related laws in the US of the past decade, but more importantly, it will also restrict ALL future legislative progress (aka how the animal welfare movement has gotten its biggest wins).
The Senate is currently deciding whether to add the SOB Act to the Farm Bill.
With relatively little money now, we can discourage the most pivotal senators on the Ag Committee from backing this amendment.
Defeating this bill is even more important given the amount of philanthropic funding I expect to come online in the next year or two.
It will plausibly be over 10x more expensive to repeal SOB than to prevent it from passing in the first place.
All that money that could be spent transforming our society's relationship to mass animal suffering will instead have to be spent just getting us back to where we are right now.
That's why money spent now fighting this bill (and I mean right NOW) is so effective.
If you’re in a position to donate six figures, please DM me.
We recently submitted a confidential S-1. We expect it to leak so we’re just announcing it. We have not decided on timing yet; it may be a while because there are things we want to do that are likely easier as a private company. But it’s a complicated set of tradeoffs and this gives us the option to go public sooner if that ends up being best.
This announcement is being made pursuant to Rule 135 under the Securities Act of 1933, as amended, and does not constitute an offer to sell or the solicitation of an offer to buy any securities. Any offers, solicitations of offers to buy, or any sales of securities will be made in accordance with the registration requirements of the Securities Act.
Qualcomm is a piece of shit supplier to anyone trying to innovate on the edge, especially startups.
My client is trying to buy ~500,000 SoCs and wants them to be American because patriotism.
Instead Qualcomm has them waiting months to even get pricing, just to be quoted with 80% margins at lead times that are not competitive with overseas suppliers.
They're about to ship the M1: 37g, IMU, Camera, NPU, CPU, WiFi/BT, entirely hardware synchronized and designed for mass-manufacturing.
Hundreds of thousands pre-ordered, scaling to millions, for next-gen devices for robotics, automotive, night vision...
Client wants to build American but the way Qualcomm is engaging right now means we have to keep building perception and compute systems with overseas suppliers.
MediaTek priced it better and offered more competitive lead times.
Chinese suppliers such as Rockchip are even faster but can't do that.
We take for granted that larger models are better than smaller ones, but why is this so? Our new paper, led by Jing Huang and @EkdeepL, traces this to a data-induced competition for resources (neurons), using formal analysis, idealized tasks, and real pretraining.
Cheers, chills, and a standing ovation when RASolute 302 showed unprecedented survival on daraxonrasib for patients with progressive pancreatic cancer
Seldom do you sense you’re witnessing a historic moment in cancer care but this feels like ras targeting has arrived
#ASCO26
A new and possibly controversial perspective:
In this video, I explain the sense in which generative AI trained by supervised learning is incapable of making novel discoveries.
https://t.co/zin5QbbT9N
The text of the speech:
AI Creativity and Discovery
Good day ladies and gentlemen. I regret that I am unable to be with you all today to engage in a back-and-forth discussion, but I am nevertheless pleased to be able to share with you, via this recording, some high-level thoughts about the current and future state of artificial intelligence, and in particular about AI’s relationship to science and mathematics, which is, as I understand it, the central focus of this meeting and of the SAIR Foundation.
I would like to start with an old joke; I am sure you have heard it before. It is the one about the researcher whose work is being evaluated, and the review comes back, and says “This work is both novel and good. Unfortunately, the parts that are good are not novel, and the parts that are novel are not good.”
My first point about AI is that this assessment applies exactly to large parts of AI as we know it today. Not all of today’s AI, but a large part of it. Pretty much all of what we mean by “Generative AI”, which includes large language models, image and video models, and even the new methods for learning world models. All of these AIs take large numbers of examples and produce a “model” which behaves similarly to the examples, that is, which generates text like people, or images like artists or nature, and videos like we find on the internet. Don’t get me wrong, Generative AI can be extremely useful. No doubt about that. But the assessment of the joke still applies. These systems can produce output that is both novel and good, but not at the same time.
In many ways this is just absolutely not a problem. When we ask an AI for an answer from the internet, or to summarize a document, we don’t want it to be novel. We are happy if the quality of the answer, the goodness, comes from the source material—from the people who wrote the document or the articles on the internet. If the AI’s answer is novel it means it is going beyond the source material, adding something beyond it. This is what we call “hallucinations”. In most cases, we don’t like it when the AI makes something up, when it adds something novel.
One exception, of course, is when we are looking not for facts or reality, but for fiction and entertainment. We might ask for a bedtime story for a child, or an image based on existing images on the internet but which is nevertheless different and distinct from them. In these cases, it is never easy for us to know how creative the AI is actually being, as we do not know how close the AI’s story, poem, or image is to the source material. In a real practical sense we cannot know this because the internet is too big and the possible sources that the AI may draw upon are too numerous.
When we ask for a fiction or novelty, the AI can give it to us because its processing is in part stochastic. Every decision can go multiple ways and will go different ways and produce a different trajectory every time. The trajectory can be random—and thus novel—or it can be based on the training data—and thus “good” because the training data is good, sourced from people or reality. Thus, the trajectory is either novel or good—based on randomness or based on data—but never both at the same time.
Really, I think it is okay if the output of Generative AI is never good and novel at the same time. For the researcher in the joke this is a devastating criticism, but for most things it is not, and for Generative AI it is not. Generative AI is meant to be a mimic. This is what supervised learning is for. Generative AI can be extremely useful, even when it just mimics, if it is faster, or cheaper, or smaller, or more customizable, or more copy-able, than the thing being mimicked. It is okay if Generative AI cannot be both novel and good at the same time. It is still a transformative technology.
But it is a limitation. And remember we are here to use AI for science and mathematics, and for these areas the assessment of the reviewer in the joke is devastating. For these areas we need true creativity and discovery. Generative AI, or Mimicking AI, will never get us there. For these we need something more, and indeed we have something more in other parts of AI. We have many AI systems which can give us more. We have AlphaGo with its world-changing move 37, or AlphaZero with its brilliant original chess-playing style. We have GT-Sophy that drives simulated racecars better than any human. We have AlphaFold and AlphaProof and Claude-Code, which have brought true advances in science, mathematics, and programming. We have RL-Lyft which optimizes the assignment of cars to passengers in the ride-hailing business. All these systems have found things that are both novel and good. And, truth be told, some language models have been augmented in ways that make them more than Generative AI based on supervised learning.
All these systems have some additional feature that makes them capable of true creativity and true discovery. It is important for us to recognize what this is, and that it is not present in ordinary, garden-variety Generative AI. It is something that cannot come from just supervised learning, from learning from examples. What is it? Well, it is a simple thing, a commonsense thing. It is not new. We have many names for it, but unfortunately none of them are very good names. I will call it Discovery. Basically, Discovery is just the idea of trying many things and seeing which of them work, then keeping those that worked the best. Evolution by natural selection works this way. The scientific method works this way. And just ordinary life and learning works this way. We try things and remember what works. What could be more obvious? In the behavioral case, psychology has two names for it, “instrumental learning” and “operant conditioning”, and in machine learning it is what we mean by “reinforcement learning”. We also see the idea of Discovery in planning and combinatorial search: anything that involves the idea of “generate and test”.
The essence of Discovery is to combine three steps:
1. Variation,
2. Evaluation, and
3. Selective retention.
Of course, I am not the first to say this. I am not the first to point out that this combination of steps is key to science, to evolution by natural selection, and to animal behavior. I think particularly of papers by Donald Campbell, by Daniel Dennett, and by Gary Cziko. What is new in my remarks is to directly relate the idea of Discovery to modern AI to help us see that it is not present in supervised learning or Generative AI—in particular, that Discovery is not present in backpropagation or gradient descent.
Let me say explicitly what is missing from Generative AI. As we have remarked, these systems do have a stochastic aspect, so they do generate a variety of trajectories and behavior. What is missing is the Evaluation step. The generator was pre-trained by supervised learning, leaving no way at runtime to Evaluate what it generates. And of course without Evaluation there can be no Selective retention, and thus no Discovery. The variation can bring novelty, but without evaluation there is no Discovery, and arguably, no creativity. That is, I would say that creativity requires that the new things generated be Evaluated. Without evaluation, and retention of the best, there is nothing created. The novelty flickers into existence but, if its value is unrecognized, it flickers away and is lost.
In many cases, Evaluation is done by people to make a discovery. As when we have Generative AI make many pictures for us, and then we pick the one that we like the best. The human+AI system completes the discovery.
In many other cases, the Evaluation comes from a clear objective. Some moves lead to checkmate, some steps lead to a proof, some actions result in high reward, some genotypes make more copies, some theories explain the data better.
Some prefer the Variation step to be called Blind variation, where “blind” here means that it is uninformed, a shot in the dark. It does not need to be completely uninformed; a good scientist does not select theories to test at random. But neither can it be completely informed and determined. There must be some uncertainty about where the answer lies in order for there to be a discovery. In practice, the variation is partly informed and partly blind, but it is the blind part that corresponds to the discovery.
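The three steps of Discovery can be made concrete with a tiny generate-and-test loop. What follows is only a minimal sketch under stated assumptions: the function names (`discover`, `objective`, `gaussian_step`), the quadratic objective, and the Gaussian step size are all hypothetical illustrations, not anything specified in the speech.

```python
import random


def discover(evaluate, propose, start, steps=2000, seed=0):
    """Generate-and-test: vary, evaluate, and keep only what scores better."""
    rng = random.Random(seed)
    best, best_score = start, evaluate(start)
    for _ in range(steps):
        # 1. Variation: partly informed (starts from the current best),
        #    partly blind (a random perturbation).
        candidate = propose(best, rng)
        # 2. Evaluation: score the candidate against an explicit objective.
        score = evaluate(candidate)
        # 3. Selective retention: keep the candidate only if it is better.
        if score > best_score:
            best, best_score = candidate, score
    return best, best_score


# Hypothetical objective with a single peak at x = 3.
def objective(x):
    return -(x - 3.0) ** 2


# Hypothetical variation operator: a small Gaussian step from the current best.
def gaussian_step(x, rng):
    return x + rng.gauss(0.0, 0.5)


best, best_score = discover(objective, gaussian_step, start=0.0)
print(best, best_score)
```

If the evaluation and retention lines are removed, the loop still produces variety but nothing accumulates, which is the sense in which variation alone yields novelty without discovery.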
Now let us briefly go all the way to modern deep learning, to the backpropagation algorithm. At first it might seem that backpropagation is incapable of discovery because it is deterministic and thus incapable of variation. But this is not correct. The weight updates of backprop are deterministic, but the weights are initialized to small random values. The random initialization is often downplayed, but in fact it is a necessary form of variation; it must be done properly to get good performance. In backprop this Variation is done once, at network initialization, so its effect is temporary, and later the network may lose its ability to learn. This is the weakness of deep learning that is alleviated with a new algorithm that my group presented in Nature a couple of years ago. Our “continual backpropagation” made one small change: every so often a less-used neuron would be re-initialized to small random weights. This allows the variation to continue and plasticity to be retained.
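The re-initialization step described above can be sketched in a few lines. This is an illustrative sketch only, not the published algorithm: the speech says just that a less-used neuron is occasionally reset to small random weights, so the `utility` score, the `maturity` threshold that protects young units, and the zeroing of outgoing weights are all assumptions made here for the example.

```python
import random


def reinit_least_used(w_in, w_out, utility, age, rng, maturity=100, scale=0.1):
    """Re-initialize the least-used hidden unit that is old enough.

    w_in[j]    : incoming weights of hidden unit j
    w_out[j]   : outgoing weights of hidden unit j
    utility[j] : running usage estimate for unit j (hypothetical measure)
    age[j]     : updates since unit j was last (re)initialized
    Returns the index of the unit that was reset, or None if none qualified.
    """
    # Assumption: only units at least `maturity` updates old may be replaced.
    eligible = [j for j in range(len(utility)) if age[j] >= maturity]
    if not eligible:
        return None
    j = min(eligible, key=lambda k: utility[k])
    # Fresh small random incoming weights restore variation for this unit.
    w_in[j] = [rng.uniform(-scale, scale) for _ in w_in[j]]
    # Assumption: zero the outgoing weights so the network output is unchanged.
    w_out[j] = [0.0 for _ in w_out[j]]
    utility[j] = 0.0
    age[j] = 0
    return j


rng = random.Random(0)
w_in = [[0.5, -0.2], [0.9, 0.4], [0.1, 0.3]]
w_out = [[1.0], [0.7], [-0.6]]
utility = [0.8, 0.05, 0.4]
age = [500, 500, 20]
replaced = reinit_least_used(w_in, w_out, utility, age, rng)
print(replaced, w_in[replaced], w_out[replaced])
```

In a training loop this function would be called every so often between ordinary backprop updates, so that variation keeps arriving after initialization.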
Although there is much more to be said about Creativity and Discovery, this is the key point: they are more than supervised learning, more than pattern recognition, more than prediction, and more than world modeling. Those things are important, but they alone will not bring us to discovery. Discovery requires Evaluation from a person or from an explicit goal, and only in the latter case will we attain full autonomy.
So that is my call to arms. If we want the full power of AI scientists, then we should share the goals with them so they can create, evaluate, discover, and in these ways fully participate in achieving the goals. Let’s be bold! Let’s fully automate Creativity and Discovery!
Earlier this month I volunteered at Stanford’s Future of Math symposium, and ever since, I've been puzzling through what it now means to pursue mathematics as a student in the age of AI. I wrote an essay to make sense of it all: https://t.co/cYyUDqSady
A Buddhist hall in Western Japan known for housing an "eternal flame" that has been burning for over 1,000 years caught fire and burned to the ground on Miyajima Island.
Whoa. This breakthrough is going to fundamentally affect the structure of how universities select and retain professors. And more generally the structure of work and creativity. AI has managed to discover a significant result in research mathematics in a one-shot query (!!!), disproving a conjecture that a huge number of people have tried working on (including myself, although I am not anywhere near as good as the other experts who have thought about it).
@wtgowers (Fields Medalist who has been thinking a lot about this space of AI and math) wrote: "if a human had written the paper and submitted it to the Annals of Mathematics and I had been asked for a quick opinion, I would have recommended acceptance without any hesitation. No previous AI-generated proof has come close to that." [The Annals of Mathematics is perhaps the most prestigious math journal in the world.]
OpenAI's announcement: https://t.co/faCjJFkY43
Mathematicians' discussion: https://t.co/RF0PzhRonM
In college classes, we already have a major issue where take-home assignments are susceptible to students using AI. Even with in-person exams, we're having issues where students use AI in the bathroom during the exam. Now, how will universities decide which professors to hire and promote? Universities used to judge on the basis of whether you got papers published in highly prestigious journals. Should the person who can use AI to generate massive quantities of publishable results be picked, if it can be done with one-shot prompts?
More generally, everyone (not just universities) needs to rethink their objectives for hiring and promotion. I am also an entrepreneur, and have been refining my hiring process in this age of AI. Before, I used to be particularly impressed by academic competition performance. Today, I search for people who hold 2 certain principles very strongly: they enjoy (1) delighting other people, and (2) achieving understanding through their own thought. I find these people are good at figuring out what makes customers/partners tick (hence able to identify good directions to run without micromanagement), and also curious enough to learn forever. They also tend to already be pretty strong skills-wise, because those two principles inherently drive them to build skills.
I actually think the whole world would be better off with more Thought Full people: https://t.co/cPdqAhQRz3. We'll need people like that to figure out how to help humanity survive. If you'd like to collaborate on ways to make a future, feel free to reach out. That's all I work on nowadays.
this is what children in shenzhen learn about in their science and tech museum:
- supply chain logistics
- photolithography for chip design
- applications of mxene-liquid crystal elastomer materials (in solar/optics/robotics)
- biological 3D printing
just to name a few...
so what will you (and/or your children) learn about today?
It was very fun to hear Ron tell the stories of early Jane Street.
Their first cluster was a pile of 6 Dell boxes in their office.
And it was important to them that it be physically in their office so that if something went haywire, they could just physically unplug the machine.
Goes without saying that they can no longer fit their hundreds of thousands of GPUs in their office.