Build American AI, a nonprofit linked to a super PAC bankrolled by executives at OpenAI and Andreessen Horowitz, is funding a campaign to spread pro-AI messaging and stoke fears about China. https://t.co/6cYWEMELYH
🚨SHOCKING: Apple just proved that AI models cannot do math. Not advanced math. Grade school math. The kind a 10-year-old solves.
And the way they proved it is devastating.
Apple researchers took the most popular math benchmark in AI — GSM8K, a set of grade-school math problems — and made one change. They swapped the numbers. Same problem. Same logic. Same steps. Different numbers.
Every model's performance dropped. Every single one. 25 state-of-the-art models tested.
But that wasn't the real experiment.
The real experiment broke everything.
They added one sentence to a math problem. One sentence that is completely irrelevant to the answer. It has nothing to do with the math. A human would read it and ignore it instantly.
Here's the actual example from the paper:
"Oliver picks 44 kiwis on Friday. Then he picks 58 kiwis on Saturday. On Sunday, he picks double the number of kiwis he did on Friday, but five of them were a bit smaller than average. How many kiwis does Oliver have?"
The correct answer is 190. The size of the kiwis has nothing to do with the count.
A 10-year-old would ignore "five of them were a bit smaller" because it's obviously irrelevant. It doesn't change how many kiwis there are.
But o1-mini, OpenAI's reasoning model, subtracted 5. It got 185.
Llama did the same thing. Subtracted 5. Got 185.
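For anyone who wants to check the arithmetic, here is a minimal sketch of the kiwi problem in Python. The variable names are just illustrative, and the "distracted" line only mimics the mistake described above. It is not how any model actually computes its answer.

```python
# Quantities as stated in the kiwi problem quoted above.
friday = 44
saturday = 58
sunday = 2 * friday  # "double the number of kiwis he did on Friday"

# The "five of them were a bit smaller" clause describes size, not count,
# so it contributes nothing to the total.
correct_total = friday + saturday + sunday

# The mistaken answer treats the irrelevant 5 as something to subtract.
distracted_total = correct_total - 5

print(correct_total)     # 190
print(distracted_total)  # 185
```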
They didn't reason through the problem. They saw the number 5, saw a sentence that sounded like it mattered, and blindly turned it into a subtraction.
The models do not understand what subtraction means. They see a pattern that looks like subtraction and apply it. That is all.
Apple tested this across all models. They call the dataset "GSM-NoOp" — as in, the added clause is a no-operation. It does nothing. It changes nothing.
The results are catastrophic.
Phi-3-mini dropped over 65%. More than half of its "math ability" vanished from one irrelevant sentence.
GPT-4o dropped from 94.9% to 63.1%.
o1-mini dropped from 94.5% to 66.0%.
o1-preview, OpenAI's most advanced reasoning model at the time, dropped from 92.7% to 77.4%.
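To put those three drops on the same footing, here is a rough sketch that turns each before/after pair into a drop in percentage points and a relative drop. The scores are the ones quoted above. The helper name `drop` is made up for illustration.

```python
# (before, after) accuracy in percent, as quoted in the lines above.
scores = {
    "GPT-4o": (94.9, 63.1),
    "o1-mini": (94.5, 66.0),
    "o1-preview": (92.7, 77.4),
}

def drop(before, after):
    """Return (percentage-point drop, relative drop in percent), rounded to 1 decimal."""
    points = before - after
    relative = 100 * points / before
    return round(points, 1), round(relative, 1)

summary = {model: drop(before, after) for model, (before, after) in scores.items()}

for model, (points, relative) in summary.items():
    print(f"{model}: -{points} points ({relative}% of its original score)")
```

On these numbers the percentage-point drops work out to 31.8, 28.5, and 15.3 respectively.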
Even giving the models 8 examples of the exact same question beforehand, with the correct solution shown each time, barely helped. The models still fell for the irrelevant clause.
This means it's not a prompting problem. It's not a context problem. It's structural.
The Apple researchers also found that models convert words into math operations without understanding what those words mean. They see the word "discount" and multiply. They see a number near the word "smaller" and subtract. Regardless of whether it makes any sense.
The paper's exact words: "current LLMs are not capable of genuine logical reasoning; instead, they attempt to replicate the reasoning steps observed in their training data."
And: "LLMs likely perform a form of probabilistic pattern-matching and searching to find closest seen data during training without proper understanding of concepts."
They also tested what happens when you increase the number of steps in a problem. Performance didn't just decrease. The rate of decrease accelerated. Adding two extra clauses to a problem dropped Gemma2-9b from 84.4% to 41.8%. Phi-3.5-mini from 87.6% to 44.8%. The more thinking required, the more the models collapse.
A real reasoner would slow down and work through it. These models don't slow down. They pattern-match. And when the pattern becomes complex enough, they crash.
This paper was published at ICLR 2025, one of the most prestigious AI conferences in the world.
You are using AI to help you make financial decisions. To check legal documents. To solve problems at work. To help your children with homework. And Apple just proved that the AI is not thinking about any of it. It is pattern matching. And the moment something unexpected shows up in your question, it breaks. It does not tell you it broke. It just quietly gives you the wrong answer with full confidence.
In 2010, Aaron Swartz downloaded 4.8 million articles from JSTOR. We don't know why - it may have been to make them freely available online (if so, he never did, as he was caught). He made no money from this, and there is no suggestion he intended to.
He was indicted on 13 felony counts, facing a maximum of 95 years in prison and $3 million in fines. Two years after his arrest, awaiting trial, he hanged himself.
In 2019 & 2021, Ben Mann downloaded at least 5 million books from pirate libraries. He did so while working at OpenAI and Anthropic; the books were downloaded for the purposes of training AI models at those companies, two of the most successful commercial companies in recent history.
He has never faced criminal charges, and, as a co-founder of Anthropic, must now be extremely wealthy (it is known that many of his co-founders are billionaires).
your personal AI foley artist to create the perfect sound effect 🔉
maestro SFX on https://t.co/GsXQfSKV5R, coming soon....🎶
#sfx #aimusic #beatovenai #soundeffects
"AI cannot 'create' without our music. We never consented to having our music used to build a multi-billion dollar technology. We have not been asked, we have not been paid, but we have been robbed, disrespected and mocked by rich technocrats. This fight is bigger than music."
- @djpain1, one of the independent musicians suing AI music companies Suno & Udio
Timbaland and Suno have some explaining to do.
Suno was allegedly caught stealing @kfreshmusic's beat and producer tag in an AI-generated beat prompted by Timbaland.
Today we’re pitching @beatovenai at @join_ef’s Angel Demo Day in San Francisco. We can’t wait to present. Here’s the tl;dr on what we’re building:
ADCx India Schedule Now Available
ADCx India 3-day meet-up for audio developers combines Music Hack Day India & a 1-day Audio Developer Conference pop-up.
Jan 19 Live Stream
https://t.co/r4wCrwhVBJ
Tickets & Info:
https://t.co/Rzgt493WkT
#audio #developer #programmer #india
We are delighted to announce that @sidb0710, Co-founder & CTO at @beatovenai, will be joining us as a speaker at Cypher 2024!
Siddharth Bhardwaj is the co-founder and CTO at https://t.co/UPM3l3xPCK. He has been working at the intersection of audio and technology for the past 11 years, completing his Master’s in Sound and Music Computing from the Music Technology Group, UPF, Barcelona, and his Bachelor’s from IIIT-Allahabad. He has also worked with several startups on problems relating to audio signal processing, machine learning, deep learning and generative music.
Don't miss the chance to gain valuable insights from one of the leading experts in AI and technology. Be there to witness groundbreaking discussions and innovations!
Stay updated: https://t.co/ewusJDnqYm
#Cypher2024 #AI #TechInnovation #FutureTech
Don’t miss the AI ethics roundtable at #ETSoonicornsSummit2024: ‘Countering Deepfakes & Inauthentic Content: The Ethical AI Imperative’ with leaders @sidb0710, @geethamhp, Umakant_Soni, and @sachinmalhan. Register now to secure your spot: https://t.co/LawAR5dsSF | September 20 in Bengaluru.
@DrTBehrens Creative Commons licenses exist for this very reason. The choice should lie with the artists whether they want their art to be used freely by anyone or not.
This is a disaster from the government! This is not an attempt to safeguard the public interest but rather to safeguard political interests via censorship. It seems too reactionary in the aftermath of Gemini's unreliable outputs, and it is definitely regressive and harmful to the tech ecosystem.
India just kissed its future goodbye!
Every company deploying a GenAI model now requires approval from the Indian government!
That is, you now need approval for merely deploying a 7b open source model 🤯🤯
If you know the Indian government, you know this will be a huge drag! All forms will need to be completed in triplicate and there will be a dozen hoops to jump through!
This is how monopolies thrive, countries decay and consumers suffer!
Sadly India is already dominated by monopolies, nepotism and bureaucracy and this new rule just made it far worse.
@okio_ai This looks great! I am a bit confused about the licensing part, though. I am assuming (based on what I read) that your generative models are trained on top of MusicGen, which does not allow commercial use, but your models are available for commercial use? Am I missing something here?
We played around with OpenAI's Sora prompts on our Text to Music feature to whip up some background music.
Gotta say, we're pretty stoked about the results! 🤟🏽
What do you think?
#AI #OpenAI #aimusic #beatovenai #texttomusic
Sora x https://t.co/b3r1M1UT3R 🔥
We tried the prompt used to generate this video on Sora to generate background music on https://t.co/GsXQfSLsVp. Loving the results so far :D
What do you think?
#ADCxIndia Panelist: Siddharth Bhardwaj
Audio Tech Ecosystem in India: Opportunities and Challenges
Join the Co-founder of https://t.co/BCoudl10Q1 and other distinguished guests on this informative panel.
👉 https://t.co/5ovyA4Yfaw
#audio #developer #India #music #technology