The lesson from Yoshua Bengio's comment for today and the future is: prioritize safety over usefulness! If we do not, then we will keep making the same mistake Yoshua made.
"Doing AI safely is much, much harder than just doing AI."
@geoffreyhinton @OpenAI @AnthropicAI
People are realizing that AIs are nowhere near human intelligence and learning abilities.
Yet they have become very useful by compensating for their lack of common sense, lack of understanding of reality, and limited reasoning and planning abilities, by the accumulation of enormous amounts of declarative knowledge.
“It absolutely defied all of our expectations. This was so surprising that my whole project changed.”
This phenomenon could have far-reaching implications, and is leaving neuroscientists baffled.
https://t.co/0pXW6Dtjoz
I worry AI will trigger a volatile transitional phase the world isn’t prepared for, marked by widespread job displacement and market instability before the benefits of disinflation and eventual abundance materialize.
We’re seeing early signs of it now. The market is largely being carried by a few dominant tech companies, margins are expanding while headcount is being cut, entry-level hiring has fallen off a cliff, and the majority of Americans can now barely afford a burrito.
What many people don't seem to realize when they argue that AIs cannot come up with genuinely new ideas is that almost 99% of all research papers written by humans (say in POPL, NeurIPS, ...) are just small deltas on existing research, with very little novelty of their own (hence the long list of citations and related work sections).
I’ll admit - I was sceptical about the idea of AI psychosis. Not about the specific cases, which were all too believable, but about the scale. How much was this happening? And anyway, wouldn’t better models make it go away?
Then I read a paper by Anthropic and the University of Toronto which has strangely received very little attention
@Kasparov63 What timing! I had just posted my new @PsychToday article on this very topic! I compared what is happening with AI agents interacting in the wild to a digital Petri dish. We’re blind to what will emerge…or evolve. - https://t.co/9tVHAYPmJT
New research reveals that constant complaining does more than annoy those around you—it can actually weaken your brain.
Every time you focus on what’s wrong, your body releases stress hormones like cortisol, which interfere with neural function and reduce the brain’s ability to adapt and learn.
The impact is not just mental. Elevated cortisol levels can impair memory, decision-making, and problem-solving skills.
Over time, a habit of negativity can make your brain less resilient, affecting emotional regulation and overall cognitive performance. Essentially, the more you complain, the harder it becomes for your brain to handle challenges effectively.
Shifting your focus from problems to solutions isn’t just good advice—it’s backed by science.
Practising gratitude, positive thinking, and constructive problem-solving can lower stress hormones, strengthen neural pathways, and help your brain remain agile and adaptable throughout life.
#TheSciencePulse
#BrainHealth #PositiveMindset
there will be a major disruptive event caused by someone’s ai agent at some point.
no amount of safety testing could ever stop this.
moltbook is an early glimpse (a lobster in the coal mine) of what’s to come.
we should let these things roam freely now, figure out the types of damage they can cause and build systems of defence.
we don’t want to face our first major public event two years from now.
the models will be far too intelligent.
This DeepMind paper just quietly killed the most comforting lie in AI safety.
The idea that safety is about how models behave most of the time sounds reasonable. It’s also wrong the moment systems scale. DeepMind shows why averages stop mattering when deployment hits millions of interactions.
The paper reframes AGI safety as a distribution problem. What matters isn’t typical behavior. It’s the tail. Rare failures. Edge cases. Low-probability events that feel ignorable in tests but become inevitable in the real world.
Benchmarks, red-teaming, and demos all sample the middle. Deployment samples everything. Strange users, odd incentives, hostile feedback loops, environments nobody planned for. At scale, those cases stop being rare. They are guaranteed.
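A back-of-the-envelope sketch of why rare cases stop being rare at scale. It is not taken from the paper: it assumes every interaction fails independently with the same fixed probability `p`, and the function name and the example value `p = 1e-6` are made up for illustration.

```python
import math


def prob_at_least_one_failure(p: float, n: int) -> float:
    """Chance of at least one failure across n independent interactions,
    each failing with probability p (a simplifying assumption)."""
    if p >= 1.0:
        return 1.0
    # 1 - (1 - p)**n, computed with log1p/expm1 so tiny p values
    # do not get lost to floating-point rounding.
    return -math.expm1(n * math.log1p(-p))


if __name__ == "__main__":
    p = 1e-6  # hypothetical per-interaction failure rate
    for n in (1_000, 1_000_000, 100_000_000):
        print(f"{n:>11,} interactions: {prob_at_least_one_failure(p, n):.6f}")
```

Under these assumptions, a one-in-a-million failure is very unlikely to show up in a thousand test interactions, shows up with roughly 63% probability over a million, and is effectively certain over a hundred million. Real deployments are messier (failures are correlated and the rate shifts with context), so treat this as intuition, not an estimate.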
Here’s the uncomfortable insight: progress can make systems look safer while quietly making them more dangerous. If capability grows faster than tail control, visible failures go down while catastrophic risk stacks up off-screen.
Two models can look identical on average and still differ wildly in worst-case behavior. Current evaluations can’t see that gap. Governance frameworks assume they can.
You can’t certify safety with finite tests when the risk lives in distribution shift. You’re never testing the system you actually deploy. You’re sampling a future you don’t control.
That’s the real punchline.
AGI safety isn’t a model attribute. It’s a systems problem. Deployment context, incentives, monitoring, and how much tail risk society tolerates all matter more than clean averages.
This paper doesn’t reassure. It removes the illusion.
The question isn’t whether the model usually behaves well.
It’s what happens when it doesn’t — and how often that’s allowed before scale makes it unacceptable.
Paper: https://t.co/fA84LCt2fK
As amazing as LLMs are, improving their knowledge today involves a more piecemeal process than is widely appreciated. I’ve written before about how AI is amazing... but not that amazing. Well, it is also true that LLMs are general... but not that general. We shouldn’t buy into the inaccurate hype that LLMs are a path to AGI in just a few years, but we also shouldn’t buy into the opposite, also inaccurate hype that they are only demoware. Instead, I find it helpful to have a more precise understanding of the current path to building more intelligent models.
First, LLMs are indeed a more general form of intelligence than earlier generations of technology. This is why a single LLM can be applied to a wide range of tasks. The first wave of LLM technology accomplished this by training on the public web, which contains a lot of information about a wide range of topics. This made their knowledge far more general than earlier algorithms that were trained to carry out a single task such as predicting housing prices or playing a single game like chess or Go. However, they’re far less general than human abilities. For instance, after pretraining on the entire content of the public web, an LLM still struggles to adapt to writing in certain styles that many human editors could handle, or to use simple websites reliably.
After leveraging pretty much all the open information on the web, progress got harder. Today, if a frontier lab wants an LLM to do well on a specific task, such as coding in a specific programming language or saying sensible things about a specific niche in, say, healthcare or finance, researchers might go through a laborious process of finding or generating lots of data for that domain and then preparing that data (cleaning low-quality text, deduplicating, paraphrasing, etc.) so that training on it gives the LLM that knowledge.
Or, to get a model to perform certain tasks, such as using a web browser, developers might go through an even more laborious process of creating many RL gyms (simulated environments) to let an algorithm repeatedly practice a narrow set of tasks.
A typical human, despite having seen vastly less text or practiced far less in computer-use training environments than today's frontier models, nonetheless can generalize to a far wider range of tasks than a frontier model. Humans might do this by taking advantage of continuous learning from feedback, or by having superior representations of non-text input (the way LLMs tokenize images still seems like a hack to me), and many other mechanisms that we do not yet understand.
Advancing frontier models today requires making a lot of manual decisions and taking a data-centric AI approach to engineering the data we use to train our models. Future breakthroughs might allow us to advance LLMs in a less piecemeal fashion than I describe here. But even if they don’t, the ongoing piecemeal improvements, coupled with the limited degree to which these models do generalize and exhibit “emergent behaviors,” will continue to drive rapid progress.
Either way, we should plan for many more years of hard work. A long, hard — and fun! — slog remains ahead to build more intelligent models.
[Original text: https://t.co/SHRN5JDvTW ]
AI is evolving too quickly for an annual report to suffice. To help policymakers keep pace, we're introducing the first Key Update to the International AI Safety Report. 🧵⬇️
(1/10)