Debidatta Dwibedi @debidatta - Twitter Profile

Pinned Tweet

11 months ago

Our Vision-Language-Action robot demo at #RSS2025 was eye-opening. The ultimate eval for any generalist model: new environment, new objects from audience, and new instructions. For the first time it really hit me: what if we've been underestimating what these models can do?

3

85

10

21

12K

debidatta retweeted

Sean Kirmani

@SeanKirmani

6 months ago

Introducing Veo Robotics! In this work, we show that an action-conditioned video model can be used as a general robot simulator for evaluation, safety, etc. https://t.co/CFVvSCZ0GR

5

92

17

35

18K

Debidatta Dwibedi @debidatta

11 months ago

For more details check out https://t.co/bViLhiZAsN

0

1

0

319

debidatta retweeted

Peng Xu

@sippeyxp

12 months ago

🔥Gemini Robotics On-Device is here! VLA with similar generalization, instruction following, and fast adaptation as our March release, now fits on a 4090! More exciting: we're 🚀an SDK and a model dev service (flywheel) alongside it 🎯= democratizing model development! #DeepMind #robotics is collaborating with select Trusted Testers to refine the process. For everyone else, check the 🧵 below for videos showcasing. We're just getting started—all suggestions and guidance are welcome! https://t.co/Uzdc3isfyY

4

29

9

16K

Debidatta Dwibedi @debidatta

12 months ago

What was once a dream is now real! ✨ Excited to announce Gemini Robotics On-Device: our VLA model that runs locally and shows impressive performance on 3 robot types. On-device intelligence, no internet needed!

Google DeepMind @GoogleDeepMind

12 months ago

We’re bringing powerful AI directly onto robots with Gemini Robotics On-Device. 🤖 It’s our first vision-language-action model to help make robots faster, highly efficient, and adaptable to new tasks and environments - without needing a constant internet connection. 🧵

106

3K

540

633

819K

0

24

1

0

1K

debidatta retweeted

Kevin Zakka @kevin_zakka

about 1 year ago

Booster recovery controller from last night. Sim design, training and deployment on hardware took < 1 day. With @qiayuanliao

38

674

82

187

105K

debidatta retweeted

Carolina Parada

@parada_car88104

about 1 year ago

✨🤖 Today our team is so excited to bring Gemini 2.0 into the physical world with Gemini Robotics, our most advanced AI models to power the next generation of helpful robots. 🤖✨ Check it out! https://t.co/cRLKmKmcFV And read our blog: https://t.co/k8NE4tg2Cs We are looking forward to seeing how robot developers will use these models to continue to advance robot performance with Gemini at the core.

3

101

13

3

8K

debidatta retweeted

Yuge Shi (Jimmy) @YugeTen

over 1 year ago

✨New blog post✨: my attempt as a vision researcher at finally understanding RLHF -- a deep dive into PPO & DeepSeek's GRPO! No hot take, I promise. https://t.co/cjIgpd7c14

25

1K

173

1K

89K

debidatta retweeted

Kevin Zakka @kevin_zakka

over 1 year ago

The ultimate test of any physics simulator is its ability to deliver real-world results. With MuJoCo Playground, we’ve combined the very best: MuJoCo’s rich and thriving ecosystem, massively parallel GPU-accelerated simulation, and real-world results across a diverse range of robot platforms: quadrupeds, humanoids, dexterous hands, and arms. Best of all? You can get started today with a single command: pip install playground https://t.co/t6pZCNeOSK

37

898

178

392

153K

debidatta retweeted

Antoine Yang @AntoineYang2

over 1 year ago

Gemini 2.0 Flash's video understanding is here 🚀 Think: search in videos via timecodes, extract text from moving camera footage, analyze screen recordings in real-time interactions with native audio out 🔊 Come and try it https://t.co/Z9zVQbNBUD 😀 https://t.co/Axa4IVplCo

2

83

10

21

9K

debidatta retweeted

Vidhi Jain @viddivj

over 1 year ago

🧵1/8 So annoying when my 🤖 vacuum cleaner buzzes loudly during my Zoom meeting! Can we teach robots to be aware of their noise levels at home? Introducing ANAVI—a framework that uses indoor visuals to predict sound propagation! 🎶🏠

5

120

25

32

17K

debidatta retweeted

Julen Urain

@robotgradient

almost 2 years ago

YouTube is a LARGE dataset of demonstration videos to train Generalist robot agents, but lacks action data. How can we learn DEXTEROUS skills from them? In #CoRL2024, we explore the problem of learning a Generalist Piano Playing agent from YouTube videos. https://t.co/nRRy3hdqkL

6

315

43

147

42K

debidatta retweeted

No Context Brits

@NoContextBrits

almost 2 years ago

“The first slot machine was invented in 1894.” People in 1893:

57

20K

1K

2K

3M

debidatta retweeted

Stephen James

@stepjamUK

almost 2 years ago

As we explore new opportunities and the future of this talented group, we’re grateful for all the support. Feel free to reach out—our DMs are open! @mohito1905 @younggyoseo @iainhaughton @nc__dev @chrysalis_ai @eugene_teoh @JafarUruc @SridharSola

4

36

7

1

10K

debidatta retweeted

Alexander Kolesnikov @__kolesnikov__

about 2 years ago

We just released PaliGemma-3B, a very capable Vision-Language Model. Do not waste any time, finetune it for your task: Code: https://t.co/V9wQU7jtmv Colab: https://t.co/aDGJd7Iz8z Kaggle: https://t.co/A5ZrnjDZni HF: https://t.co/Du52eHcXNh Vertex AI: https://t.co/qxK9Irgera

4

312

54

187

28K

debidatta retweeted

Michael Tschannen @mtschannen

about 2 years ago

We just released a big 🎁GIVT update! 📈 Larger models and improved image generation results across the board 💡 Improved GMM formulation and adapter module 💻 Code, model checkpoints, and a colab are now available at https://t.co/zaf5orekfZ More details below... 1/

5

249

46

182

66K

Debidatta Dwibedi @debidatta

about 2 years ago

With Vid2Robot, we take a step towards developing robots that can perform tasks by observing humans do them in videos. Check 🧵 below for more details.

Vidhi Jain @viddivj

about 2 years ago

What if we could show a robot how to do a task? We present Vid2Robot, which is a robot policy trained to decode human intent from visual cues and translate it into actions in its environment. 🤖 Website: https://t.co/ufFHK1Dgbg Arxiv: https://t.co/qEUjaXovJa 🧵(1/n)

5

205

39

68

30K

1

7

1

1K

Debidatta Dwibedi @debidatta

about 2 years ago

@adityagolatkar2 That's an interesting connection! Hadn't thought of it that way.

0

23

Debidatta Dwibedi @debidatta

about 2 years ago

Can we train a model to describe different parts of images in varying levels of detail? Introducing FlexCap, a VLM designed to output localized captions in N words where we can control N with special length tokens. https://t.co/tDsyHF1AVI

2

38

9

17

14K

Debidatta Dwibedi @debidatta

about 2 years ago

@adityagolatkar2 The model counts implicitly because of special length tokens that we add. If we use the token length_N then the model outputs N words before outputting EOS.

1

0

28

Debidatta Dwibedi @debidatta

about 2 years ago

Project webpage: https://t.co/tDsyHF28Lg Arxiv: https://t.co/D8BubwWjnl This is joint work with @viddivj, @JonathanTompson, Andrew Zisserman and @yusufaytar.

0

6

1

0

644

Debidatta Dwibedi @debidatta

about 2 years ago

FlexCap has been useful for robotics. We used it in AutoRT (https://t.co/04yiJAldCg) to find objects in the robot's environment. It also helped create the dataset used to train SpatialVLM (https://t.co/Zyo3dJ6gV7).

1

0

257
