Facts and Quips⏹

22 days ago

Instead of regulating frontier AI development with a burdensome patchwork of conflicting national laws, we need a global governance framework banning it everywhere on earth

FactsAndQuips retweeted

Rational Animations

@RationalAnimat1

about 1 month ago

Developing a superintelligent AI that does what we want without killing everyone may be extremely difficult. In this video, we explain why, using arguments from "If Anyone Builds It, Everyone Dies" by @ESYudkowsky and @So8res.

202

FactsAndQuips retweeted

Feels good to be a Bitcoiner

about 1 month ago

Today is the 40th anniversary of the Chernobyl disaster. What can we learn from it? Four lessons with important implications for AI (from IF ANYONE BUILDS IT, EVERYONE DIES, by @ESYudkowsky and @So8res):
>"1. 𝐀𝐧 𝐞𝐧𝐠𝐢𝐧𝐞𝐞𝐫𝐢𝐧𝐠 𝐜𝐡𝐚𝐥𝐥𝐞𝐧𝐠𝐞 𝐢𝐬 𝐦𝐮𝐜𝐡 𝐡𝐚𝐫𝐝𝐞𝐫 𝐭𝐨 𝐬𝐨𝐥𝐯𝐞 𝐰𝐡𝐞𝐧 𝐭𝐡𝐞 𝐮𝐧𝐝𝐞𝐫𝐥𝐲𝐢𝐧𝐠 𝐩𝐫𝐨𝐜𝐞𝐬𝐬𝐞𝐬 𝐫𝐮𝐧 𝐨𝐧 𝐭𝐢𝐦𝐞𝐬𝐜𝐚𝐥𝐞𝐬 𝐟𝐚𝐬𝐭𝐞𝐫 𝐭𝐡𝐚𝐧 𝐡𝐮𝐦𝐚𝐧𝐬 𝐜𝐚𝐧 𝐫𝐞𝐚𝐜𝐭. Transistors switch even faster than neutrons multiply. Engineers can contrive to make events run slow enough for humans to react, but if the contrivance fails the humans are back to being frozen statues, on the timescale that matters.
>2. 𝐀𝐧 𝐞𝐧𝐠𝐢𝐧𝐞𝐞𝐫𝐢𝐧𝐠 𝐜𝐡𝐚𝐥𝐥𝐞𝐧𝐠𝐞 𝐢𝐬 𝐦𝐮𝐜𝐡 𝐡𝐚𝐫𝐝𝐞𝐫 𝐭𝐨 𝐬𝐨𝐥𝐯𝐞 𝐰𝐡𝐞𝐧 𝐭𝐡𝐞𝐫𝐞 𝐢𝐬 𝐚 𝐧𝐚𝐫𝐫𝐨𝐰 𝐦𝐚𝐫𝐠𝐢𝐧 𝐟𝐨𝐫 𝐞𝐫𝐫𝐨𝐫, 𝐞𝐬𝐩𝐞𝐜𝐢𝐚𝐥𝐥𝐲 𝐢𝐟 𝐢𝐭’𝐬 𝐚 𝐧𝐚𝐫𝐫𝐨𝐰 𝐦𝐚𝐫𝐠𝐢𝐧 𝐛𝐞𝐭𝐰𝐞𝐞𝐧 “𝐮𝐧𝐢𝐦𝐩𝐫𝐞𝐬𝐬𝐢𝐯𝐞” 𝐚𝐧𝐝 “𝐞𝐱𝐩𝐥𝐨𝐬𝐢𝐯𝐞.” The analogy to intelligence is how apes and hominids wandered around for a few million years, and then got smart enough to set off a whole cascade of inventions: Agriculture led to writing led to science led to spacecraft. It would be a narrow target to make hominids that were intelligent enough to be profitable office workers, but not intelligent enough for explosive technological development.
>3. 𝐒𝐞𝐥𝐟-𝐚𝐦𝐩𝐥𝐢𝐟𝐲𝐢𝐧𝐠 𝐩𝐫𝐨𝐜𝐞𝐬𝐬𝐞𝐬, 𝐥𝐢𝐤𝐞 𝐚𝐧 𝐨𝐯𝐞𝐫𝐡𝐞𝐚𝐭𝐢𝐧𝐠 𝐫𝐞𝐚𝐜𝐭𝐨𝐫 𝐛𝐨𝐢𝐥𝐢𝐧𝐠 𝐨𝐟𝐟 𝐢𝐭𝐬 𝐜𝐨𝐨𝐥𝐚𝐧𝐭 𝐰𝐚𝐭𝐞𝐫 𝐚𝐧𝐝 𝐭𝐡𝐞𝐧 𝐨𝐯𝐞𝐫𝐡𝐞𝐚𝐭𝐢𝐧𝐠 𝐦𝐨𝐫𝐞, 𝐥𝐞𝐚𝐯𝐞 𝐥𝐢𝐭𝐭𝐥𝐞 𝐫𝐨𝐨𝐦 𝐟𝐨𝐫 𝐞𝐫𝐫𝐨𝐫. And nuclear engineers don’t even have it that bad, compared to artificial superintelligence developers. Nuclear reactors that get too hot don’t start intelligently redesigning themselves to increase their own reactivity rate. Overheating nuclear reactors don’t start trying to fool the operators into complacency until the reactor is ready to fully explode.
>4. 𝐂𝐨𝐦𝐩𝐥𝐢𝐜𝐚𝐭𝐢𝐨𝐧𝐬 𝐦𝐚𝐤𝐞 𝐞𝐧𝐠𝐢𝐧𝐞𝐞𝐫𝐢𝐧𝐠 𝐩𝐫𝐨𝐛𝐥𝐞𝐦𝐬 𝐰𝐨𝐫𝐬𝐞. Chernobyl Unit 4 managed to get into a weird state where lowering the control rods caused the reactor to explode. No engineer designed for that. The operators didn’t know that something unusual would happen if the reactor had been operating at low power for a while and some of the water had been shut off. And they had never seen the reactor’s state change that fast. The complicated internals of a nuclear reactor have nothing on the unknown complications that lurk in the hundreds of billions of weights that make up a modern LLM.
>From these lessons in combination, we infer an additional lesson for engineers: If someone doesn’t know exactly what’s going on inside a complicated device subject to all these curses—speed, narrow margins, self-amplification, complications—then they should stop. They should shut it down immediately, the moment the behavior looks strange; don’t wait until the behavior becomes visibly concerning.
>The operators at Chernobyl knew about delayed neutrons and prompt neutrons. They knew that a nuclear reactor walks a line a fraction of a percent wide between life and death. They knew the theory saying that a reactor’s apparently human-manageable timescale is an artifice, a clever contrivance that hides neutron generation times measured in microseconds.
>A wise operator treats a device like that with respect. If the device starts behaving in any way odd or unexpected, then it is no longer operating inside the narrow, constrained region where they are sure they understand exactly what is going on. Which means that nobody knows what’s going on inside there anymore. Who knows whether the clever contrivances will keep working? They can only guess. When a dangerous device starts acting strangely, it is not time to withdraw all but eight control rods and expect the reactor to keep playing nice. It is time to shut it down.
>The operators did not treat the reactor with that sort of respect. They knew, intellectually, that it could explode, but they had never seen the reactor change that fast. Besides, before 1986, the Soviets did not have a culture conducive to caution around nuclear reactors. They had a system where, if you didn’t perform the scheduled safety test, you got fired.
>(In the coming chapters we’ll discuss the lack of safety culture prevailing in AI, which is much worse.)"

[Image: HumanHarlan's tweet photo]

about 2 months ago

@TheZvi Usually great in a fresh session where I have a complex coding task for it. But for in-the-loop drudgery and longer sessions, it's much more likely than 4.6 to ignore explicit instructions or take lazier approaches.

256

FactsAndQuips retweeted

Alex Krusz ➡️ vibecamp!

@AlexKrusz

about 2 months ago

621

751

261K

FactsAndQuips retweeted

SubatomicArticles

@OptiMiserJoe

about 2 months ago

A fun exercise: Read AI discourse, but replace every instance of the word "AI" or "model" with "Vizier."

427

FactsAndQuips retweeted

about 2 months ago

While it's still fresh, pay attention to how it feels for a single training run to have just made the difference between "speculative risk" and "serious threat that the world is unprepared for." And take a moment to be thankful that this particular threat is one that can be mitigated by choosing to not deploy it (and that the first ones to reach this capability are choosing to not deploy it). If the industry succeeds at creating agentic machines that are capable enough to outsmart and outmaneuver humanity, choosing not to deploy won't be good enough. They'll just deploy themselves.

137

3 months ago

Star Trek: The Motion Picture (1979): Members of a sluggish “carbon-based infestation” must save Earth from a budding superintelligence while caught between two of its orifices. (Remarkably prescient!)

3 months ago

@HumanHarlan I (unfortunately) think it's because people like to cover the story in a way where they don't have to be up front about this being an experiment under lab conditions. Murder is so "big, if true" that there's no avoiding that caveat, and caveated tellings are less viral.

FactsAndQuips retweeted

tetraspace (🛫🔜🇰🇷) 💎⏹️🇺🇳

@TetraspaceWest

3 months ago

accs say that, while a pause would be good, the US unilaterally pausing wouldn't fix the problem. Decel doomers, on the other hand, say that, while a pause would be good, the US unilaterally pausing wouldn't fix the problem

FactsAndQuips retweeted

Nate Soares ⏹️

@So8res

3 months ago

Many experts warn that jumping off an enormous cliff without a parachute could lead to increased health insurance premiums, getting lost in the wilderness, or - in extreme cases - even death.

245

FactsAndQuips retweeted

3 months ago

Grok says xAI comes before national interests https://t.co/zNRwushCte

623

FactsAndQuips retweeted

Chris Painter

@ChrisPainterYup

3 months ago

119

FactsAndQuips retweeted

Nate Soares ⏹️

@So8res

3 months ago

Many of the ways this can go wrong are much derpier than "an engineer fumbled an esoteric alignment obstacle". Those obstacles would be sufficient to kill us, but we're not on track to die from the dignified obstacles. Not even close. Shut it all down.

3 months ago

"Don't listen to doomers. History shows that a receding tide just means more beach!" The beach:

3 months ago

@bellaforristal Don't get me started on their browser biscuits.

305

FactsAndQuips retweeted

Rob Wiblin

@robertwiblin

3 months ago

Every AI lab is working to make their AI helpful, harmless and honest. Max Harms (@raelifin) thinks this is a complete wrong turn, and 'aligning' AI to human values is actively dangerous. In his view a safe AGI must have absolutely no opinion about how the world ought to be, be willingly modifiable, and be entirely indifferent to being shut down. The opposite of all commercial models today. The key appeal is that so-called 'corrigibility' could be an attractor state – get close enough and the AI actively helps you make it more corrigible over time. That forgiveness would at least give us a shot. It's a strategy that feels natural within the 'MIRI worldview', recently laid out by his colleagues @ESYudkowsky and @So8res in 'If Anyone Builds It Everyone Dies'. But it risks causing a different AI catastrophe, because the resulting AI model would necessarily be willing to assist any human operator with a power grab, or indeed any crime at all. I interviewed Max on the 80,000 Hours Podcast to debate the MIRI worldview, and what we should do to figure out if corrigibility ought to be our one and only focus. Links below – enjoy!
00:01:56 If anyone builds it, will everyone die? The MIRI perspective on AGI risk
00:24:28 Evolution failed to ‘align’ us, just as we'll fail to align AI
00:42:56 We're training AIs to want to stay alive and value power for its own sake
00:52:24 Objections: Is the 'squiggle/paperclip problem' really real?
01:05:02 Can we get empirical evidence re: 'alignment by default'?
01:10:17 Why do few AI researchers share Max's perspective?
01:18:34 We're training AI to pursue goals relentlessly — and superintelligence will too
01:24:51 The case for a radical slowdown
01:27:53 Max's best hope: corrigibility as stepping stone to alignment
01:32:34 Corrigibility is both uniquely valuable, and practical, to train
01:45:06 What training could ever make models corrigible enough?
01:51:38 Corrigibility is also terribly risky due to misuse risk
01:58:57 A single researcher could make a corrigibility benchmark. Nobody has.
02:12:20 Red Heart & why Max writes hard science fiction
02:34:08 Should you homeschool? Depends how weird your kids are.

467

464

296K

3 months ago

@KelseyTuoc Methinks there is much reason in his sayings.

218