Andrew Ng just revealed why the AI companies throwing the most compute at the problem are going to lose.
The winner of the intelligence race won’t use the most compute.
They’ll waste the least.
Ng: “Most of your high-dimensional data lies on a lower-dimensional subspace. It’s just a fact of life.”
Here’s what that means in practice.
You have a 10,000-dimensional dataset.
Every dimension dragged through every calculation.
Every training cycle hauling dead weight the model will never use.
Ng: “You’re carrying around these 10,000-dimensional examples throughout your whole training process.”
That bloat isn’t just inefficient.
It’s a tax on every computation you run.
Memory bandwidth. Network bandwidth. Computational speed.
All of it eaten by dimensions that contribute nothing to what the model learns.
They contribute noise.
The insight that separates the architects from the arms race: that 10,000-dimensional dataset is almost entirely captured by a much smaller subspace.
The signal lives in a fraction of the space you’re paying to process.
Compress it. 10,000 dimensions down to 1,000.
Ng: “You can run your learning algorithm on a much lower-dimensional set of data and it may be much more efficient.”
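A minimal sketch of what that compression can look like, assuming the technique in question is principal component analysis (PCA), a common way to find the lower-dimensional subspace Ng describes. The data is synthetic, and the helper names (`fit_pca`, `project`), the 99% variance threshold, and the sizes are illustrative assumptions, scaled down from 10,000 dimensions so it runs quickly.

```python
import numpy as np

def fit_pca(X, variance_to_keep=0.99):
    """Return (mean, components), keeping the fewest principal directions
    that explain at least `variance_to_keep` of the total variance."""
    mean = X.mean(axis=0)
    # SVD of the centered data; the rows of Vt are the principal directions,
    # ordered from most to least variance explained.
    _, s, Vt = np.linalg.svd(X - mean, full_matrices=False)
    explained = np.cumsum(s**2) / np.sum(s**2)
    # First index where cumulative explained variance reaches the threshold.
    k = min(int(np.searchsorted(explained, variance_to_keep)) + 1, len(s))
    return mean, Vt[:k]

def project(X, mean, components):
    """Map examples into the lower-dimensional subspace."""
    return (X - mean) @ components.T

# Synthetic data (an assumption for illustration): 200 observed dimensions
# generated from only 10 underlying factors, plus a little noise.
rng = np.random.default_rng(0)
n, d, latent = 500, 200, 10
X = rng.normal(size=(n, latent)) @ rng.normal(size=(latent, d))
X += 0.01 * rng.normal(size=(n, d))

mean, components = fit_pca(X)
X_small = project(X, mean, components)

print("shape:", X.shape, "->", X_small.shape)
print("bytes:", X.nbytes, "->", X_small.nbytes)
```

Any downstream learning algorithm would then train on `X_small` instead of `X`. How many dimensions survive depends entirely on the data and on the variance threshold you pick; real datasets are rarely as cleanly low-rank as this toy one.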
Same hardware. Same budget. A fraction of the friction.
Brute force is the strategy of whoever has the deepest pockets.
Compression is the strategy of whoever actually understands the problem.
The companies that master this don’t just build faster models.
They build models that extract more signal from less data than blind scaling ever will.
Intelligence was never about processing everything.
It’s about knowing what to cut.