Anastasia Razdaibiedina

@razdaibi

Research Scientist @GoogleDeepMind | PhD @UofT 🇨🇦 ex-@MetaAI @MSFTResearch | efficient ML · data · lifelong learning · AI agents |🏃‍♀️🎸🧘‍♀️🧋| made in 🇺🇦

Joined January 2023

182 Following

219 Followers

431 Posts

Pinned Tweet

Anastasia Razdaibiedina @razdaibi

3 months ago

Happy to share that I started a new role as a Research Scientist at Google DeepMind Toronto working with amazing @kswersk and the team! Looking forward to new adventures 🥳🤩🚀🇨🇦

razdaibi retweeted

Akari Asai

@AkariAsai

6 days ago

DR Tulu has been selected for an oral presentation at ICML 2026 (0.7% of all submissions) 🥳 Check out our latest version, featuring additional ablations and a deeper analysis of RL with evolving rubrics for unverifiable open ended tasks!

razdaibi retweeted

Akshay 🚀

@akshay_pachaar

4 days ago

Google just dropped a new LLM! You can run it locally on just 8GB RAM. Let's fine-tune this on our own data (100% locally):

127

356

783K

razdaibi retweeted

Dimitris Papailiopoulos

@DimitrisPapail

2 months ago

https://t.co/lbjcGDxpJn

144

474K

razdaibi retweeted

James Zou @james_y_zou

2 months ago

Training multi-agent teams is hard. #AgentFlow comes to the rescue. We introduce Flow-GRPO, an efficient method to train multi-agent teams. Improves planning and tool use. Selected as an #ICLR2026 Oral (top 1%)🚀

201

143

28K

razdaibi retweeted

Antonio Lupetti

@antoniolupetti

2 months ago

"Why We Think" by Lilian Weng is a serious look at how LLMs reason. The argument: more thinking time doesn't automatically mean better reasoning. The inference strategy is what actually moves the needle. Technical and worth the 40 minutes. https://t.co/ZJ6vwGN34p

160

84K

razdaibi retweeted

Jeff Dean

@JeffDean

2 months ago

Today we're releasing Gemma 4, our new family of open foundation models, built on the same research and technology as our Gemini 3 series. These models set a new standard for open intelligence, offering SOTA reasoning capabilities from edge-scale (2B and 4B w/ vision/audio) up to a 26B parameter MoE model and a 31B dense model. By releasing Gemma 4 under the Apache 2.0 license, we hope to enable more innovation across the research and developer communities. Our earlier Gemma 3 models were downloaded 400M times and over 100,000 variants of those models have been published, so we're excited to see what the community will do with the even better Gemma 4 models! Learn more at https://t.co/BW6O3Gr8bc and https://t.co/8M0XSQSP4u Great work by everyone involved! #Gemma4 #AI #OpenSource #ML

177

176

100K

razdaibi retweeted

Tri Dao

@tri_dao

2 months ago

It's my favorite kind of work: linear algebra insight + fast kernels. When playing w Muon a while ago, we were thinking why not speed it up by operating on the small square matrix X X^T instead of the large rectangular matrix X. Jack, Noah, and Berlin spent many months understanding eigenvalues/vectors of the intermediate matrices in Muon, and finally came up with a simple and elegant algo to make this work.

610

90K

razdaibi retweeted

Google Research

@GoogleResearch

3 months ago

Introducing TurboQuant: Our new compression algorithm that reduces LLM key-value cache memory by at least 6x and delivers up to 8x speedup, all with zero accuracy loss, redefining AI efficiency. Read the blog to learn how it achieves these results: https://t.co/CDSQ8HpZoc

39K

22K

19M

razdaibi retweeted

Niklas Muennighoff @Muennighoff

3 months ago

One gem from Composer paper is that RL improved both pass@k & pass@1. Suggests RL does not just reweigh existing capabilities but also teaches new ones? 💎 https://t.co/7JNMI9kYka

328

182

62K

razdaibi retweeted

News from Google

@NewsFromGoogle

3 months ago

Google was just named #1 in the @FastCompany 2026 World’s Most Innovative Companies list. 🎉 Google is also ranked #1 in their Artificial Intelligence category. See the full story. https://t.co/SJYa8hgJVm

715

254K

razdaibi retweeted

Nathan Lambert

@natolambert

3 months ago

Really excited for the things @rosstaylor90 has been building to go out into the world. He's been one of the people I can always rely on to have non-cope takes on what we need to do to make the open ecosystem great. What a great time in RL.

198

28K

razdaibi retweeted

Cursor @cursor_ai

3 months ago

We're releasing a technical report describing how Composer 2 was trained.

169

480

razdaibi retweeted

He He

@hhexiy

3 months ago

https://t.co/H3TAsaThYQ

876

128

118K

Anastasia Razdaibiedina @razdaibi

3 months ago

@kevinwty Cool, thanks :)

Anastasia Razdaibiedina @razdaibi

3 months ago

@jennyzhangzt great work! So the improvement loop operates over programs and the meta-level improvement strategy without updating LLM weights. Do you see a viable extension where the system could safely incorporate gradient-based updates of the LLM itself?

razdaibi retweeted

Jenny Zhang

@jennyzhangzt

3 months ago

Introducing Hyperagents: an AI system that not only improves at solving tasks, but also improves how it improves itself.

The Darwin Gödel Machine (DGM) demonstrated that open-ended self-improvement is possible by iteratively generating and evaluating improved agents, yet it relies on a key assumption: that improvements in task performance (e.g., coding ability) translate into improvements in the self-improvement process itself. This alignment holds in coding, where both evaluation and modification are expressed in the same domain, but breaks down more generally. As a result, prior systems remain constrained by fixed, handcrafted meta-level procedures that do not themselves evolve.

We introduce Hyperagents – self-referential agents that can modify both their task-solving behavior and the process that generates future improvements. This enables what we call metacognitive self-modification: learning not just to perform better, but to improve at improving.

We instantiate this framework as DGM-Hyperagents (DGM-H), an extension of the DGM in which both task-solving behavior and the self-improvement procedure are editable and subject to evolution. Across diverse domains (coding, paper review, robotics reward design, and Olympiad-level math solution grading), hyperagents enable continuous performance improvements over time and outperform baselines without self-improvement or open-ended exploration, as well as prior self-improving systems (including DGM). DGM-H also improves the process by which new agents are generated (e.g. persistent memory, performance tracking), and these meta-level improvements transfer across domains and accumulate across runs.

This work was done during my internship at Meta (@AIatMeta), in collaboration with Bingchen Zhao (@BingchenZhao), Wannan Yang (@winnieyangwn), Jakob Foerster (@j_foerst), Jeff Clune (@jeffclune), Minqi Jiang (@MinqiJiang), Sam Devlin (@smdvln), and Tatiana Shavrina (@rybolos).

158

657

501K

Anastasia Razdaibiedina @razdaibi

3 months ago

@MainzOnX Great 🤩

Anastasia Razdaibiedina @razdaibi

3 months ago

@_christinabaek Thanks 😊
