bleeding out, lol @yet - Twitter Profile

Pinned Tweet

bleeding out, lol @yet

about 5 years ago

waiting in the post jab holding area and they are playing wilson phillips hold on and i am having a moment

0

18

0

yet retweeted

Rimsha Bhardwaj

@heyrimsha

14 days ago

A Wharton economist ran a randomized controlled trial on almost a thousand high school students in Turkey.

The result was so brutal for the AI-in-education narrative that it had to be peer-reviewed by PNAS before people would believe it.

Her name is Hamsa Bastani. She teaches operations and information at the Wharton School at the University of Pennsylvania, and the study she published in 2025 alongside her co-authors is one of the cleanest experiments anyone has run on what AI actually does to learning when you remove it from the equation and check what is left.

The setup was a randomized controlled trial, the same methodology used in clinical drug trials. Nearly a thousand high school math students in Turkey were split into three groups and put through four sessions of ninety minutes each. One group practiced with GPT Base, a standard ChatGPT-4 interface that could answer any question directly. One group practiced with GPT Tutor, a version of the same model that had been prompted to guide students with hints rather than hand them the answer. One group practiced with nothing but their textbook and their own head.

During the practice sessions, the AI groups looked like a miracle. The GPT Base group solved 48% more problems than the students working alone. The GPT Tutor group solved 127% more. Every administrator looking at those numbers would have written a press release about the transformative power of AI in education and moved on.

Then the actual exam came, and AI was not allowed.

The students who had practiced with GPT Base scored 17% worse than the students who had practiced alone. Seventeen percent worse, despite having solved nearly half again as many problems in the sessions leading up to it. The students who had struggled the most, who had sat with the confusion and worked through it without a tool to rescue them, were now the only ones who could actually do the math when it counted.

Bastani's team read through the chat logs to understand what had actually been happening during the practice sessions, and the answer was exactly what the exam results had already implied. The GPT Base group had not been learning. They had been extracting answers and moving on, and every moment that felt like understanding was actually the model doing the cognitive work while the student's brain waited for the next problem to arrive. The paper describes it precisely: without guardrails, students attempt to use GPT-4 as a crutch during practice, and subsequently perform worse on their own.

The detail that should follow every conversation about AI in education is the one buried in the post-test survey results. The students who had relied on AI the most during practice were also the most confident they had understood the material. The tool had not just failed to teach them. It had convinced them they had learned something they had not, which is a different kind of failure entirely and a much harder one to correct because the student has no idea it is happening.

The crutch had made them confident and weak at the same time.

143

3K

1K

339K

yet retweeted

Ryan Moulton @moultano

21 days ago

This is maybe obvious to everyone, but it just occurred to me why bioterrorism is such a significant model risk. It's analogous to code with vulnerabilities that can never be patched.

22

555

30

84

43K

yet retweeted

BlueDot Impact

@BlueDotImpact

22 days ago

The linear representation hypothesis says neural networks encode concepts as directions in activation space. We trained a small model where 7 of 8 features behave this way. The 8th doesn't. $2,500+ in prizes to whoever can tell us how it's actually encoded. Bonus points if you can train a model with an even weirder representation. Link in thread 🧵

1

13

1

12

986
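
The linear representation hypothesis mentioned in the BlueDot Impact post above can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the dimensions, the orthonormal directions, the feature values, and the helper names are hypothetical, and the "radius" encoding is just one example of a non-linear scheme, not a guess at how the challenge model stores its 8th feature.

```python
import math


def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def encode_linear(feature_values, directions):
    """Build an activation as the sum of value_i * direction_i."""
    dim = len(directions[0])
    activation = [0.0] * dim
    for value, direction in zip(feature_values, directions):
        for j in range(dim):
            activation[j] += value * direction[j]
    return activation


def encode_as_radius(value, angle):
    """Hypothetical non-linear encoding: the feature is the vector's length."""
    return [value * math.cos(angle), value * math.sin(angle)]


# Three hypothetical features stored along orthonormal directions in 4-d space.
directions = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
]
values = [0.5, -2.0, 3.0]
activation = encode_linear(values, directions)

# Under a linear encoding, projecting onto each direction reads the feature back.
recovered = [dot(activation, d) for d in directions]

# Under the radius encoding, the same feature value can point anywhere,
# so a fixed probe direction gives inconsistent readings while the norm does not.
a = encode_as_radius(2.0, 0.0)
b = encode_as_radius(2.0, math.pi / 2)
probe = [1.0, 0.0]
probe_readings = (dot(a, probe), dot(b, probe))
norms = (math.hypot(*a), math.hypot(*b))
```

In this sketch a dot product is enough to read the linearly encoded features, whereas the radius-encoded feature needs a non-linear readout (the norm), which is the kind of gap a linear probe would miss.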

yet retweeted

Justin McElroy @thoughtographic

24 days ago

thoughtographic's tweet photo. https://t.co/YqhL0RVbvs

23

14K

362

381

383K

yet retweeted

John Carmack

@ID_AA_Carmack

27 days ago

Space launch was a clear case where there was a large difference in efficiency between what was possible and what was done in practice before SpaceX. A large part of that was due to everything being locked in to what (just barely) already worked, with huge risk aversion. With national prestige or a half billion dollar geosync satellite on the line, speculative engineering ideas that might result in a public debacle were not welcome. When failure is not an option, success can stay very expensive. You need to experiment to improve, and that fundamentally means being comfortable with failure. If you know it is going to work, it isn’t an experiment. I have long believed that nuclear power today is in precisely the same state as space launch two decades ago, but the even more pressing question now is if semiconductor fabrication might also be. On the one hand, Moore’s Law has been a sequence of heroic miracles of technology at the wafer fabrication level, grinding out hundreds of compounding small improvements. On the other hand, fabs are “too big to fail”, and there are elements of extreme conservatism at play. Intel’s “Copy exactly!” fab development exemplifies that mindset – instead of every new building being an opportunity to explore and optimize processes, it was deemed more valuable to just replicate. While each individual machine may be straining against physical limits of technology, it is possible that the systems orchestrating them all together could be far from optimal. The explore / exploit axis is fundamental to all decision making, but human risk avoidance probably biases away from optimal exploration.

103

3K

291

506

198K

yet retweeted

Danielle Baskin @djbaskin

28 days ago

someone at this art show made a remote control red carpet stanchion so they can randomly trap people or make them VIP

7

369

14

72

40K

yet retweeted

Peter Beck

@Peter_J_Beck

27 days ago

for your enjoyment

690

26K

2K

3K

1M

yet retweeted

Owen Brake

@OwenBrakes

about 1 month ago

The RF world is insane. Researchers recovered AES-128 keys from a Bluetooth chip by listening to its own antenna from 10 meters away. Crypto-engine switching noise couples into the RF chain, rides the 2.4 GHz carrier, and leaks out as radio.

OwenBrakes's tweet photo. https://t.co/uWKIMBDsm4

109

6K

851

3K

348K

yet retweeted

Bernie Sanders

@BernieSanders

about 1 month ago

If the world’s leading scientists say there’s even a 10% chance humanity could be destroyed because of uncontrolled AI, shouldn’t we do everything possible to prevent it? This isn’t about competition with China. It's about coming together to prevent what might be a catastrophe.

541

4K

839

132

187K

yet retweeted

Jay in Kyiv

@JayinKyiv

about 1 month ago

Q: How many Ukrainians does it take to change a light bulb? A: None.

344

23K

4K

1K

831K

yet retweeted

Nate Soares ⏹️

@So8res

about 1 month ago

@SecScottBessent The existential threat is not about chatbots (governable by global standards). It's about the creation of superintelligent machines that nobody understands. US labs say they run that risk; foreign labs do too. We need international coordination to ensure nobody builds it.

7

172

10

3

3K

yet retweeted

Sen. Bernie Sanders

@SenSanders

about 1 month ago

Uncontrolled AI poses a severe danger to all of humanity. On Wednesday, I'll be hosting a discussion with leading AI scientists from the US and China about the need for international cooperation against this existential threat. This is an enormously important issue. Join us.

SenSanders's tweet photo. https://t.co/8BHsXWcwch

992

3K

690

306

1M

yet retweeted

Acyn

@Acyn

about 2 months ago

Mamdani: When I ran for mayor, I said I was going to tax the rich. Well, today we're taxing the rich...

1K

136K

11K

9K

13M

yet retweeted

山谷剛史　アジア中国ITライター＆異国飯

@YamayaT

about 2 months ago

Apparently some roads in China now have utility poles with drone stations like this installed on them.

68

1K

314

256

194K

yet retweeted

Ben Rhodes

@brhodes

about 2 months ago

In the best case scenario, Trump struck a deal to reopen a Strait that was open before the pointless war he started, with the IRGC demonstrating its control over the Strait and potentially extracting fees plus sanctions relief. Thousands of innocents - including hundreds of children - dead in Lebanon and Iran for no reason. U.S. troops killed and wounded. U.S. embassies and bases in the Middle East badly damaged. U.S. standing in the world obliterated. U.S. munitions badly depleted. Hundreds of billions spent. Prices up everywhere. More global economic fallout to come. Putin strengthened and enriched. Just a catastrophic situation even in the best of circumstances. A profoundly shameful episode in American history no matter what happens next.

836

27K

8K

3K

1M

yet retweeted

Nate Soares ⏹️

@So8res

about 2 months ago

Imagine Dario hopping on TikTok and saying "holy shit guys wake up, I think we're close to AIs that can automate AI research." Direct communication is underutilized.

6

202

7

22

12K

bleeding out, lol @yet

about 2 months ago

all together now...

0

26

bleeding out, lol @yet

about 2 months ago

A primer on existential AI risk https://t.co/iLZJd1sKcS

0

27

bleeding out, lol @yet

about 2 months ago

yet's tweet photo. https://t.co/TFV85b5gpo

0

23

yet retweeted

Steve Hou

@stevehou

2 months ago

Apparently workers in China have been creating “colleagues.skill” to distill their coworkers hoping to make them redundant hence saving themselves. In response someone has recently invented an “anti-distillation.skill” that has gone viral on GitHub.🤣

88

4K

384

3K

1M
