turns out AI models cannot do math. not even grade school math, the kind a 10-year-old solves.
Apple researchers published a devastating study (GSM-Symbolic) that exposes a massive illusion at the core of artificial intelligence.
they took the standard math benchmark (GSM8K) that every AI company uses to brag about how smart their model is.
first, they just changed the names in the word problems. the models' scores fluctuated, even though the math was exactly the same.
then they changed the numbers. performance dropped further.
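a rough sketch of what those first two tests amount to, in python: turn one word problem into a template, then resample the name and the numbers while the underlying math stays the same. the template, the names, and the number ranges here are made up for illustration, they are not taken from the study.

```python
import random

# hypothetical template, invented for this sketch (not from the study)
TEMPLATE = (
    "{name} picks {a} apples on Monday and {b} apples on Tuesday. "
    "How many apples does {name} have?"
)

# hypothetical name pool
NAMES = ["Sophie", "Liam", "Priya", "Mateo"]


def make_variant(rng):
    """Build one surface-level variant of the same addition problem."""
    name = rng.choice(NAMES)
    a = rng.randint(2, 50)
    b = rng.randint(2, 50)
    return {
        "question": TEMPLATE.format(name=name, a=a, b=b),
        "a": a,
        "b": b,
        "answer": a + b,  # the logic never changes, only the surface details
    }


rng = random.Random(0)  # fixed seed so the variants are reproducible
variants = [make_variant(rng) for _ in range(3)]
for v in variants:
    print(v["question"], "->", v["answer"])
```

to a person, every variant is the same problem. a model that actually reasons should score the same on all of them.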
but then they ran the test that broke everything.
they added one single, completely irrelevant sentence to the word problem. something like: "By the way, 5 of the apples were green."
a 10-year-old ignores the green apples and solves the underlying math.
the AI didn't.
across every state-of-the-art model they tested, accuracy dropped. in the worst cases, by up to 65%.
the AI blindly grabbed the irrelevant number and tried to shove it into the equation. it didn't know why it was doing the math. it just saw a number and assumed it was supposed to use it.
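here is a toy version of that failure in python. to be clear, this is a caricature: it is not how a language model works internally and it is not code from the study. `every_number_solver` and `relevant_only_solver` are hypothetical helpers, and the problem text is invented.

```python
import re

# invented problem text, for illustration only
BASE = "Sophie picks 12 apples on Monday and picks 30 apples on Tuesday."
DISTRACTOR = "By the way, 5 of the apples were green."
ASK = "How many apples does Sophie have?"

plain = f"{BASE} {ASK}"
noop = f"{BASE} {DISTRACTOR} {ASK}"  # same math, one irrelevant sentence added


def every_number_solver(problem):
    """Toy stand-in for the failure mode: use every number in the text."""
    return sum(int(n) for n in re.findall(r"\d+", problem))


def relevant_only_solver(problem):
    """Only count quantities attached to the action the question asks about."""
    return sum(int(n) for n in re.findall(r"picks (\d+)", problem))


print(every_number_solver(plain))   # 42
print(every_number_solver(noop))    # 47, the green apples leaked into the sum
print(relevant_only_solver(noop))   # 42
```

the first solver looks fine right up until the irrelevant sentence shows up. that is the pattern the distractor test is built to catch.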
the researchers' read: there is no genuine logical reasoning happening under the hood. the models are replicating reasoning steps they saw in their training data.
we are deploying these systems to run our finances, analyze our legal documents, and make complex strategic decisions.
but the models don't actually understand the logic they are spitting out.
they just know what a smart answer is supposed to look like.