Daniel @daniel_berns - Twitter Profile

about 1 month ago

New blackboard lecture w @ericjang11

He walks through how to build AlphaGo from scratch, but with modern AI tools.

Sometimes you understand the future better by stepping backward. AlphaGo is still the cleanest worked example of the primitives of intelligence: search, learning from experience, and self-play. You have to go back to 2017 to get insight into how the more general AIs of the future might learn.

Once he explained how AlphaGo works, it gave us the context to have a discussion about how RL works in LLMs and how it could work better – naive policy gradient RL has to figure out which of the 100k+ tokens in your trajectory actually got you the right answer, while AlphaGo’s MCTS suggests a strictly better action every single move, giving you a training target that sidesteps the credit assignment problem. The way humans learn is surely closer to the second.

Eric also kickstarted an Autoresearch loop on his project. And it was very interesting to discuss which parts of AI research LLMs can already automate pretty well (implementing and running experiments, optimizing hyperparameters) and which they still struggle with (choosing the right question to investigate next, escaping research dead ends). Informative to all the recent discussion about when we should expect an intelligence explosion, and what it would look like from the inside.

Timestamps:
0:00:00 – Basics of Go
0:08:06 – Monte Carlo Tree Search
0:31:53 – What the neural network does
1:00:22 – Self-play
1:25:27 – Alternative RL approaches
1:45:36 – Why doesn’t MCTS work for LLMs
2:00:58 – Off-policy training
2:11:51 – RL is even more information inefficient than you thought
2:22:05 – Automated AI researchers
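The contrast the tweet draws between the two training signals can be sketched in a few lines. This is a minimal illustration, not code from the lecture: the function names `reinforce_loss` and `mcts_target_loss` are hypothetical, the policy-gradient form is the plain REINFORCE estimator without a baseline, and the per-move target is assumed to be the normalized MCTS visit counts, as in AlphaGo Zero.

```python
import math
from typing import Sequence


def reinforce_loss(token_logps: Sequence[float], reward: float) -> float:
    """Naive policy-gradient loss: one scalar reward scales every token.

    Each log-probability gets the same weight, so the loss alone cannot
    say which tokens earned the reward (the credit assignment problem).
    """
    return -reward * sum(token_logps)


def mcts_target_loss(
    policy: Sequence[Sequence[float]],
    visits: Sequence[Sequence[int]],
) -> float:
    """Per-move cross-entropy between search visit counts and the policy.

    Every move has its own target distribution, so each decision gets a
    direct supervised signal. Assumes policy probabilities are positive
    wherever the visit count is positive.
    """
    total = 0.0
    for probs, counts in zip(policy, visits):
        n = sum(counts)
        total -= sum((c / n) * math.log(p) for p, c in zip(probs, counts) if c > 0)
    return total


pg = reinforce_loss([-1.0, -2.0], reward=1.0)
ce = mcts_target_loss([[0.5, 0.5]], [[10, 0]])
```

In the first loss the only supervision is the final reward; in the second, the search result at each position is itself the label.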

65

3K

285

3K

694K

daniel_berns retweeted

annie @_annieversary

3 months ago

what the Fuck https://t.co/jSWRRiEBQ5

188

8K

791

5K

1M

Daniel @daniel_berns

3 months ago

Pay per fail

Shishir

@ShishirShelke1

3 months ago

Artemis II crew is thousands of miles away from Earth And they’re asking ground crew for help because they have two versions of Microsoft Outlook open and neither is working This scene is now canon 😭

790

177K

21K

20K

7M

0

2

daniel_berns retweeted

𝚃𝙷𝙴 𝚆𝙷𝙸𝚃𝙴 𝚁𝙰𝙱𝙱𝙸𝚃

@White_Rabbit_OG

3 months ago

Finally 👏👏👏

401

81K

8K

7K

2M


daniel_berns retweeted

Turing Post

@TheTuringPost

4 months ago

A useful survey – "Anatomy of Agentic Memory"

Explains why agent memory systems often fail in practice, focusing on how systems store and manage information over long interactions

Covers:

- Memory-Augmented Generation (MAG)
- Agent memory architectures - lightweight semantic, entity-centric/personalized, episodic & reflective, structured/hierarchical
- Benchmark saturation and metric problems
- Backbone dependence
- LLM-as-a-judge instability
- System costs: latency, retrieval overhead, throughput

7

111

25

116

7K

daniel_berns retweeted

Andrej Karpathy

@karpathy

5 months ago

New art project. Train and inference GPT in 243 lines of pure, dependency-free Python. This is the *full* algorithmic content of what is needed. Everything else is just for efficiency. I cannot simplify this any further. https://t.co/HmiRrQugnP

646

25K

3K

29K

5M

Daniel @daniel_berns

5 months ago

@snaut66 @bayraktar_1love Ukrainian "space debris"?

0

36

Daniel @daniel_berns

7 months ago

@LucasSa56947288 Holding hands?

0

2

daniel_berns retweeted

Juan Francisco @medpedmex

8 months ago

@Farmaenfurecida https://t.co/u7X8Vb8scr

2

45

9

22

2K

Daniel @daniel_berns

9 months ago

@NVIDIAGeForce GeForce Day

0

1

daniel_berns retweeted

Sakana AI

@SakanaAILabs

10 months ago

What if we could evolve AI models like organisms in nature, letting them compete, mate, and combine their strengths to produce ever-fitter offspring?

Excited to share our new work: “Competition and Attraction Improve Model Fusion” presented at GECCO’25🦎 where it was a runner-up for best paper!

Paper: https://t.co/ihfdriOPNw
Code: https://t.co/lOW2ghb5bj

Summary of Paper

At Sakana AI, we draw inspiration from nature’s evolutionary processes to build the foundation of future AI systems. Nature doesn’t create one single, monolithic organism; it fosters a diverse ecosystem of specialized individuals that compete, cooperate, and combine their traits to adapt and thrive. We believe AI development can follow a similar path.

What if instead of building one giant monolithic AI, we could evolve a whole ecosystem of specialized models that collaborate and combine their skills? Like a school of fish 🐟, where collective intelligence emerges from the group.

This new paper builds on our previous research on model merging, which follows such an evolutionary path. We started by using evolution to find the best “recipes” to merge existing models (our Nature Machine Intelligence paper: https://t.co/zqs7kGL4Gl). Then, we explored how to maintain diversity to acquire new skills in LLMs (our ICLR 2025 paper: https://t.co/00Buf1051A). Now, we're combining these ideas into a full evolutionary system.

A key limitation remained in earlier work: model merging required manually defining how models should be partitioned (e.g., by fixed layer or blocks) before they could be combined. What if we could let evolution figure that out too?

Our new paper proposes M2N2 (Model Merging of Natural Niches), a more fluid method, which overcomes this with three key, nature-inspired ideas:

1/ Evolving Merging Boundaries 🌿: Instead of merging models using pre-defined, static boundaries (e.g. fixed layers), M2N2 dynamically evolves the “split-points” for merging. This allows for a far more flexible and powerful exploration of parameter combinations, like swapping variable-length segments of DNA rather than entire chromosomes.

2/ Diversity through Competition 🐠: To ensure we have a rich pool of models to merge, M2N2 makes them compete for limited resources (i.e., data points in a training set). This forces models to specialize and find their own “niche,” creating a population of diverse, high-performing specialists that are perfect for merging.

3/ Attraction and Mate Selection 💏: Merging models can be computationally expensive. M2N2 introduces an “attraction” heuristic that intelligently pairs models for fusion based on their complementary strengths—choosing partners that perform well where the other is weak. This makes the evolutionary search much more efficient.

Does it work?

The results are fascinating: This is the first time model merging has been used to evolve models entirely from scratch, outperforming other evolutionary algorithms. In one experiment, starting with random networks, M2N2 evolved an MNIST classifier that achieves performance comparable to CMA-ES, but is far more computationally efficient.

Does it scale?

We also showed that M2N2 can scale to large, pre-trained models: We used M2N2 to merge a math specialist LLM with an agentic specialist LLM. M2N2 produced a merged model that excelled at both math and web shopping tasks, significantly outperforming other methods. The flexible split-point was crucial here.

Does it work on multimodal models?

When we applied M2N2 to text-to-image models, we merged several models by adapting them only for Japanese prompts. The resulting model not only improved on Japanese but also retained its strong English capabilities—a key advantage over fine-tuning, which can suffer from catastrophic forgetting.

This nature-inspired approach is central to Sakana AI’s mission to find new foundations for AI based on collective intelligence. Rather than scaling monolithic models, we envision a future where ecosystems of diverse, specialized models co-evolve, collaborate, and combine, leading to more adaptive, robust, and creative AI. 🐙

We hope this work sparks more interest in these under-explored ideas!

Published in ACM GECCO’25: Proceedings of the Genetic and Evolutionary Computation Conference. DOI: https://t.co/5eSwhvs5tQ
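Two of the ideas described in the thread, evolved split-points and attraction-based mate selection, can be sketched roughly as follows. This is a simplified illustration under stated assumptions and is not taken from the linked code: models are treated as flat lists of floats, the names `split_point_merge`, `attraction`, and `pick_partner` are hypothetical, and the attraction score here simply sums how much a candidate outperforms the first model on each data point.

```python
from typing import List, Sequence


def split_point_merge(
    a: Sequence[float], b: Sequence[float], split: int, w: float
) -> List[float]:
    """Blend a and b with weight w before `split` and weight 1 - w after it.

    In an evolutionary loop, `split` and `w` would be mutated along with
    the parameters instead of being fixed by hand (assumption).
    """
    if len(a) != len(b):
        raise ValueError("parameter vectors must have the same length")
    if not 0 <= split <= len(a):
        raise ValueError("split must lie within the vector")
    head = [w * x + (1 - w) * y for x, y in zip(a[:split], b[:split])]
    tail = [(1 - w) * x + w * y for x, y in zip(a[split:], b[split:])]
    return head + tail


def attraction(scores_a: Sequence[float], scores_b: Sequence[float]) -> float:
    """Total margin by which b beats a on the data points where a is weaker."""
    return sum(max(sb - sa, 0.0) for sa, sb in zip(scores_a, scores_b))


def pick_partner(scores: Sequence[Sequence[float]], first: int) -> int:
    """Index of the model whose strengths best complement `first`."""
    candidates = [i for i in range(len(scores)) if i != first]
    return max(candidates, key=lambda i: attraction(scores[first], scores[i]))


child = split_point_merge([1.0, 1.0, 1.0, 1.0], [0.0, 0.0, 0.0, 0.0], split=2, w=1.0)
partner = pick_partner([[1, 0, 0], [1, 0, 0], [0, 1, 1]], first=0)
```

With `w=1.0` the child takes everything before the split from the first parent and everything after it from the second, which is the "variable-length segment swap" picture from the thread; intermediate weights interpolate instead.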

39

826

150

484

177K

daniel_berns retweeted

Quanta Magazine

@QuantaMagazine

12 months ago

A team of mathematicians just built the first “monostable” tetrahedron. Someday, it might help inform the design of a self-righting spacecraft. https://t.co/MshP0f8fgk

8

286

73

64

26K

Daniel @daniel_berns

12 months ago

@pickover 1) a * (100 + 10 + 1) / (3 * a) = b then b = 111/3 = 37.
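The arithmetic in this reply can be checked directly. The sketch below assumes the puzzle concerns a three-digit number with repeated digit a divided by the sum of its digits (the original puzzle is not shown here); the function name `repdigit_over_digit_sum` is hypothetical.

```python
def repdigit_over_digit_sum(a: int) -> int:
    """Divide the number 'aaa' by the sum of its digits, a + a + a."""
    number = a * (100 + 10 + 1)  # aaa = 111 * a
    digit_sum = 3 * a
    return number // digit_sum  # exact, since 111 = 3 * 37


# The digit a cancels, so every nonzero digit gives the same quotient.
results = {a: repdigit_over_digit_sum(a) for a in range(1, 10)}
```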

0

37

daniel_berns retweeted

Google AI Developers

@googleaidevs

12 months ago

YouTube tutorial + Gemini CLI

7

201

8

90

31K

daniel_berns retweeted

Gary Marcus

@GaryMarcus

12 months ago

BREAKING: Explosive new paper from MIT/Harvard/UChicago.

Things just got worse — a lot worse — for LLM’s and the myth that they can understand and reason.

The paper documents a pattern they called Potemkins, a kind of reasoning inconsistency (see figure below). They show that LLMs - even models like o3 — make these errors frequently.

You can’t possibly create AGI based on machines that cannot keep consistent with their own assertions. You just can’t.

“success on benchmarks only demonstrates potemkin understanding: the illusion of understanding driven by answers irreconcilable with how any human would interpret a concept … these failures reflect not just incorrect understanding, but deeper internal incoherence in concept representations”

Game over for any hopes of building AGI on a pure LLM substrate. cc @geoffreyhinton, checkmate.

227

3K

577

3K

417K

Daniel @daniel_berns

about 1 year ago

Nice article for the weekend

Sergey Levine

@svlevine

about 1 year ago

Self-supervised representation learning looks a bit like RL. What if we literally use RL as a SSL method for visual representations? Turns out that it works quite well. In new work by @its_dibya, we show how this can be done: https://t.co/JTTbqIW2Pv

7

581

98

480

49K

0

34

daniel_berns retweeted

Sergey Levine

@svlevine

about 1 year ago

Self-supervised representation learning looks a bit like RL. What if we literally use RL as a SSL method for visual representations? Turns out that it works quite well. In new work by @its_dibya, we show how this can be done: https://t.co/JTTbqIW2Pv

7

581

98

480

49K

Daniel @daniel_berns

about 1 year ago

About detection of emerging system of values in LLM.

Ethan Mollick

@emollick

about 1 year ago

https://t.co/7eNytkUmIp

3

46

4

29

16K

0

7

Daniel @daniel_berns

about 1 year ago

Design principles for agents

Jerry Liu

@jerryjliu0

about 1 year ago

We’re excited to release an interactive guide highlighting the definitive set of principles for building AI agents 🔥 Based on the popular 12-Factor agents repo by @dexhorthy. We packaged the principles into an interactive website and Colab notebook with working code examples, so that you can incorporate these principles into your agent application in minutes. As a refresher - some of the principles include getting structured outputs from tools, state management, checkpointing, human-in-the-loop, error handling, composing smaller agents into bigger ones, and more. This is amazing work by @seldo and should be required reading for anyone looking to get started building agents. Website: https://t.co/4F1yt37OwS Notebook: https://t.co/jiNCzQt05H

5

123

22

166

34K

0

12

Daniel @daniel_berns

about 1 year ago

Paper about reasoning with multimodal input.

Xin Eric Wang

@xwang_lk

about 1 year ago

Project scooped by OpenAI? No worries! Proprietary LLM methods are behind the door and not the only way forward.

Introducing GRIT (Grounded Reasoning with Images & Text): Teach Multimodal LLMs to think with images for general visual reasoning tasks, with just 20 examples! Pure RL, no supervised fine-tuning, no labeled reasoning data.

Simpler, efficient, and fully open-source. 🚀 #OpenSource #MLLM #reasoning

4

260

36

227

49K

0

1

0

125
