Ajay Sridhar @ajaysridhar0 - Twitter Profile

Pinned Tweet

8 months ago

VLAs are great, but most lack long-term memory humans use for everyday tasks. This is a critical gap for solving complex, long-horizon problems. Introducing MemER: Scaling Up Memory for Robot Control via Experience Retrieval. A thread 🧵 (1/8)

6

319

45

201

43K

ajaysridhar0 retweeted

Jubayer Ibn Hamid

@jubayer_hamid

4 days ago

The most capable reasoning systems in AI scale inference compute along several axes: sequential compute to think longer, parallel compute to sample many independent attempts, and aggregative compute to synthesize prior traces into a new improved one. But during training, we only optimize how models use sequential compute. This creates a fundamental mismatch between how we ultimately deploy these systems and how we train them, leaving much of search and synthesis unoptimized. We introduce SPIRAL, an RL framework for making all inference-compute primitives end-to-end learnable: models learn to coordinate sequential, parallel, and aggregative reasoning using only the reward of the final output. Work with @ifdita_hasan (co-lead), @michaelyli_, @oshaikh13, @yoonholeee, @DorsaSadigh, @chelseabfinn, @noahdgoodman 🧵

14

391

84

451

141K

ajaysridhar0 retweeted

Jadelynn @_jadelynn

16 days ago · Stanford

test-time compute [ttc] in robotics isn't free & isn't always worth it. smart allocation of ttc recovers frontier-level planning at a fraction of the cost! coauthor @milanganai w/ Yasmina @ajaysridhar0 Mozghan @katielulula Clark Barrett @jiajunwu_cs @chelseabfinn @drmapavone 🧵

3

64

14

34

40K

ajaysridhar0 retweeted

Gabe Guo 🦄 @therealgabeguo

about 2 months ago

🚨New time series generative model just dropped. Paper: https://t.co/2pdg4Y1Ig4 Demo: https://t.co/llqVryo2tg ⏰Meet ABC: Any-Subset Autoregressive Diffusion Bridges in Continuous Time & Space. With @StefanoErmon @elon_lit @Jose_Blanchet @thanawatsornwan @lutong_hao

3

124

22

73

17K

Ajay Sridhar @ajaysridhar0

3 months ago

Excited to present MemER at ICLR 2026 in April. The code is now open source: https://t.co/hG5lI1Fgeg

Ajay Sridhar @ajaysridhar0

8 months ago

VLAs are great, but most lack long-term memory humans use for everyday tasks. This is a critical gap for solving complex, long-horizon problems. Introducing MemER: Scaling Up Memory for Robot Control via Experience Retrieval. A thread 🧵 (1/8)

6

319

45

201

43K

1

37

4

16

4K

ajaysridhar0 retweeted

Patrick Yin

@patrickhyin

3 months ago

We’re releasing OmniReset, a framework for training robot policies using large-scale RL and diverse resets for contact-rich, dexterous manipulation. OmniReset pushes the frontier of robustness and dexterity, without any reward engineering or demonstrations. Try the policies yourself in our interactive simulator! https://t.co/3hW3nYx2vD (1/N 🧵)

21

466

93

305

109K

ajaysridhar0 retweeted

Jensen Gao @jensen_gao

3 months ago

Understanding generalization in robotics can be tricky. If a robot does the dishes in a new kitchen, does this require new behavior, or is the countertop just a new color? Excited to share RADAR 📡, work I did at @GoogleDeepMind towards better characterizing robot evaluations.

4

65

9

38

12K

ajaysridhar0 retweeted

Yinpei Dai @YinpeiD

4 months ago

Robot memory methods are growing fast, but systematic evaluation is largely lacking. 📉 Introducing RoboMME: a new benchmark for memory-augmented robotic manipulation! 🤖🧠 Featuring 16 tasks across temporal, spatial, object, and procedural memory 🔗 https://t.co/4ELtnhDwrt

7

236

48

153

72K

ajaysridhar0 retweeted

Marcel Torné @marceltornev

4 months ago

We equipped PI policies with memory! And taught our robots to do long-horizon real world tasks such as preparing the items for a recipe, cooking a grilled cheese and cleaning the kitchen!

8

89

15

7

10K

ajaysridhar0 retweeted

Noriaki Hirose @Noriaki_Hirose

4 months ago

My journey at UC Berkeley is coming to an end as I return to Japan. Over the past four years, I’ve had the privilege of collaborating with @svlevine and his students and I sincerely appreciate their support and contributions. I’ve learned and grown tremendously through our works

3

107

4

7

6K

ajaysridhar0 retweeted

Yanjiang Guo

@Yanjiang_Guo

4 months ago

Excited to share VLAW: Iterative Co-Improvement of Vision-Language-Action Policy and World Model. We explore improving VLA inside a learned world model, and find that the key is to jointly improve VLA and WM! Website: https://t.co/Hfkff47bHZ

4

272

43

183

59K

ajaysridhar0 retweeted

Noriaki Hirose @Noriaki_Hirose

4 months ago

Robotic foundation models generalize well, but high inference latency limits real-time deployment. 🚀 AsyncVLA enables real-time control of large robotic models, even under network delays. Great collaboration with @CatGlossop, @shahdhruv_ & @svlevine! #EmbodiedAI #EdgeAI

7

320

51

238

55K

ajaysridhar0 retweeted

Perry Dong @perryadong

5 months ago

Reinforcement learning doesn't scale like supervised learning, yet. We introduce Transformer Q-Learning (TQL): a method that unlocks scaling of transformer-based value functions in RL. We show that value-based RL can also achieve performance gains through scale. (1/7)

10

269

38

257

54K

ajaysridhar0 retweeted

Jie Wang

@JieWang_ZJUI

6 months ago

VLAs nowadays enable robotic manipulation to perform impressive tasks like folding clothes, making coffee, and cleaning dishes. However, surprisingly, most VLAs lack memory. Unlike their close relatives LLMs, VLAs have no context window and no access to history. This causes them to repeatedly fail in the same way without learning from online experience. But why? Why not simply extend the context window like LLMs? It's not that we don't want to -- it's because it's extremely difficult. Here, I share a talk by @chelseabfinn at NeurIPS that scopes the challenges in developing long-horizon autonomy for embodied agents. At the end, there's a reading list on memory for robotics. ⭐

7

368

49

300

21K

ajaysridhar0 retweeted

Suvir Mirchandani @suvir_m

6 months ago

Data collection remains a bottleneck in imitation learning for robotics: it’s tedious & often needs access to a robot. Can we make the data collection process more accessible and engaging? We introduce RoboCade, a platform for gamifying remote robot data collection 🎮🤖 (1/6)

2

58

20

35

19K

ajaysridhar0 retweeted

Paul Zhou @zhiyuan_zhou_

6 months ago

Do you ever find finetuning VLA overfits to the target task, to the point where generalist ability is lost and even minor deviations beyond the SFT data break the policy? We found an extremely simple solution: directly merge the base and finetuned policy in weight space 🤯 👇🧵

7

385

49

238

127K
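
The tweet above describes merging a base and a finetuned policy directly in weight space but does not say which merge rule is used. As a minimal sketch, assuming plain linear interpolation between two checkpoints with identical parameter names and shapes (the function name `merge_weights`, the `alpha` parameter, and the dict-of-lists checkpoint format are all hypothetical, chosen only for illustration):

```python
from typing import Dict, List

# Hypothetical checkpoint format: parameter name -> flat list of floats.
Weights = Dict[str, List[float]]


def merge_weights(base: Weights, finetuned: Weights, alpha: float = 0.5) -> Weights:
    """Linearly interpolate two checkpoints parameter by parameter.

    alpha = 0.0 returns the base weights, alpha = 1.0 the finetuned ones.
    Linear interpolation is an assumption here, not the rule stated in the tweet.
    """
    if base.keys() != finetuned.keys():
        raise ValueError("checkpoints must share the same parameter names")
    merged: Weights = {}
    for name, base_values in base.items():
        tuned_values = finetuned[name]
        if len(base_values) != len(tuned_values):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        merged[name] = [
            (1.0 - alpha) * b + alpha * t
            for b, t in zip(base_values, tuned_values)
        ]
    return merged


if __name__ == "__main__":
    base = {"w": [0.0, 2.0]}
    finetuned = {"w": [1.0, 4.0]}
    print(merge_weights(base, finetuned, alpha=0.5))
```

In this sketch, `alpha` trades off generalist behavior (closer to the base) against target-task performance (closer to the finetuned policy); how that value would be chosen in practice is not covered by the tweet.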

Ajay Sridhar @ajaysridhar0

6 months ago

Had a great time chatting with @micoolcho and @chris_j_paxton about memory for robot policies!

RoboPapers

@RoboPapers

6 months ago

Most robot policies today still largely lack memory: they make all their decisions based on what they can see right now. MemER aims to change that by learning which frames are important; this lets it deal with tasks like object search. @ajaysridhar0, @jenpan_, and @satviks107Sharma tell us about how to achieve this fundamental capability for long-horizon task execution. Watch Episode #54 of RoboPapers with @micoolcho and @chris_j_paxton to learn more!

7

88

14

43

18K

0

10

0

573

ajaysridhar0 retweeted

Jenn Grannen @jenngrannen

7 months ago

Meet Scanford 📚🤖: a robot that improves foundation models by doing useful work in the wild. Deployed for 2 weeks in the Stanford East Asia Library, Scanford scans books, helps librarians, and continually improves the VLM it relies on. 🔗 https://t.co/r2ZXyeKaIf 🧵1/8

17

484

73

198

94K

ajaysridhar0 retweeted

Dhruv Shah

@shahdhruv_

7 months ago

My group @Princeton is hiring! We are looking for strong postdoc and PhD candidates to join our quest for intelligent robots in open-world environments. Read more below and get in touch 🤖🐅🧡 https://t.co/7o35pwPZCz

15

856

140

324

310K

ajaysridhar0 retweeted

Mateo Guaman Castro

@mateoguaman

8 months ago

How can we create a single navigation policy that works for different robots in diverse environments AND can reach navigation goals with high precision? Happy to share our new paper, "VAMOS: A Hierarchical Vision-Language-Action Model for Capability-Modulated and Steerable Navigation"! 📜 Paper: https://t.co/XmyuBnrM1D 🌐 Website: https://t.co/Jt80tySWzQ

4

123

40

32

17K

Ajay Sridhar @ajaysridhar0

8 months ago

@yoyu0203 Good point - the HLP is finetuned to predict the labeled indices of the keyframes from the context. We use a subset of subtask transition frames as labels. We only pick useful ones (e.g., last frame of "look inside bin") and skip others (no frames from "reset scooper").

0

1

0

1

355
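
The reply above says the keyframe labels are a hand-picked subset of subtask transition frames, such as the last frame of "look inside bin". A minimal sketch of how such labels could be built for one context window, assuming subtask annotations are available as (name, first frame, last frame) tuples and that labels are indices relative to the start of the context (the function `keyframe_labels`, the segment format, and the relative indexing are assumptions for illustration, not the released MemER code):

```python
from typing import List, Set, Tuple

# Hypothetical annotation format: (subtask name, first frame, last frame), inclusive.
Segment = Tuple[str, int, int]


def keyframe_labels(
    segments: List[Segment],
    useful_subtasks: Set[str],
    context_start: int,
    context_end: int,
) -> List[int]:
    """Return keyframe label indices for one context window.

    A frame is labeled when it is the last frame of a subtask marked useful
    and it falls inside [context_start, context_end]. Indices are relative
    to context_start (an assumption made for this sketch).
    """
    labels = []
    for name, _first, last in segments:
        if name in useful_subtasks and context_start <= last <= context_end:
            labels.append(last - context_start)
    return sorted(labels)


if __name__ == "__main__":
    segments = [
        ("look inside bin", 0, 40),
        ("reset scooper", 41, 60),
        ("look inside bin", 61, 90),
    ]
    print(keyframe_labels(segments, {"look inside bin"}, 0, 100))
```

Subtasks left out of `useful_subtasks`, such as "reset scooper" in the reply, contribute no labels, which mirrors the "only pick useful ones" rule described there.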
