🧵 1/8 The Illusion of Thinking: Are reasoning models like o1/o3, DeepSeek-R1, and Claude 3.7 Sonnet really "thinking"? 🤔 Or are they just throwing more compute towards pattern matching?
The new Large Reasoning Models (LRMs) show promising gains on math and coding benchmarks, but we found their fundamental limitations are more severe than expected.
In our latest work, we compared each “thinking” LRM with its “non-thinking” LLM twin. Unlike most prior works that only measure the final performance, we analyzed their actual reasoning traces—looking inside their long "thoughts". Our analysis reveals several interesting results ⬇️
📄 https://t.co/PjnYpVRdX3
Work led by @ParshinShojaee and @i_mirzadeh, with @KeivanAlizadeh2, @mchorton1991, and Samy Bengio.
PARENTS: please check your kid's candy this halloween - i just found a 12-month enterprise salesforce contract for 15 seats with auto-renewal and a $50,000 early breakup fee inside this snickers bar
I am surprised there is not more talk about the shameful heist, unprecedented in open source history, that Automattic pulled off (technically: WordPress Foundation, confirmed to "belong to" Automattic's CEO):
Like Apple "took ownership" of Spotify.
https://t.co/wGlbLpQskH
How is it allowed that:
- Audible has 65% market share in the US for audiobooks
- It offers a 20% royalty share to authors if audiobooks are sold non-exclusively (so, outside Audible as well). So for a $10 audiobook: Amazon takes $8, publisher gets $2!
80% take rate!!
So this is the future of WordPress. Automattic - the entity controlling WordPress.org and the WP trademark - can take over *any* plugin it wants, when it wants, and how it wants.
Automattic is burning the principles of open source for their own profit.
A sad, new era.
@MicrosoftTeams The newest version does not display user tags in the channel member list, which wastes time in large orgs. Please offer a way to always display members' tags. This wasn’t properly implemented!!
ANALYSIS COMPLETE!
I'm a professional CSS programmer with 8 decades of experience in HARDCORE PROGRAMMING
Here is what actually happened with the #crowdstrike cobalt strike cyber ATT&CK [REDACTED] thread of KNOWLEDGE THEY don't want you to know🧵🧵 🧵 🧵 :
(1/?)
A serious problem among multinational IT support teams today is the dilution of job titles, giving “senior” or “engineer” titles to those who’ve yet to earn them.