Hugo Hadfield @HugoHadfield1 - Twitter Profile

Pinned Tweet

8 months ago

It turns out you can just take an off-the-shelf VLM and fine-tune it directly to output robot actions *as text* and it performs better than/as-good-as all the more complex model architectures… Check out the paper!

Ankit Goyal

@imankitgoyal

8 months ago

What's the right architecture for a VLA? VLM + custom action heads (π₀)? VLM with special discrete action tokens (OpenVLA)? Custom design on top of the VLM (OpenVLA-OFT)? Or... VLM with ZERO modifications? Just predict action as text. The results will surprise you. VLA-0: Outperforms π₀, GR00T-N1, MolmoAct, SmolVLA. With ZERO changes to the VLM. 🧵⬇️
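
To make "predict action as text" concrete, here is a minimal sketch of how a continuous action vector could be written as a string of integers and parsed back. The function names, the [-1, 1] action range, the 1000-bin resolution, and the space separator are all assumptions chosen for illustration; they are not taken from the VLA-0 paper or code.

```python
def encode_action(action, low=-1.0, high=1.0, bins=1000):
    """Turn a list of floats into text such as "0 500 1000".

    Each value is clipped to [low, high] and mapped to an integer in [0, bins].
    The range and bin count are illustrative assumptions.
    """
    tokens = []
    for value in action:
        clipped = min(max(value, low), high)
        tokens.append(str(round((clipped - low) / (high - low) * bins)))
    return " ".join(tokens)


def decode_action(text, low=-1.0, high=1.0, bins=1000):
    """Parse the text produced by encode_action back into a list of floats."""
    return [low + int(token) / bins * (high - low) for token in text.split()]


if __name__ == "__main__":
    text = encode_action([0.1234, -0.5, 1.0])
    print(text)
    print(decode_action(text))
```

With this scheme a language model only ever sees and emits ordinary digits and spaces, so no new tokens or output heads are needed; the cost is a quantization error of at most half a bin per dimension.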

HugoHadfield1 retweeted

Ankit Goyal

@imankitgoyal

about 2 months ago

Evaluation is a critical bottleneck in building robot foundation models. Check out our latest work RoboLab, led by @xuningy, which addresses this exact challenge. It's a high-fidelity simulation environment for testing these models. A truly generalist policy should be able to complete these tasks zero-shot, and this benchmark highlights exactly how far we still have to go. More info 👇

HugoHadfield1 retweeted

Xuning Yang @xuningy

about 2 months ago

RoboLab comes with RoboLab-120 — a curated, diverse benchmark of 120 tasks to get started. Set up and run in <20 min. (6/6) Try it out 👇 🌐 https://t.co/pNMITqaCus 📄 https://t.co/CDS0tpFnZ0 💻 https://t.co/bnJmhPMXa5

HugoHadfield1 retweeted

Xuning Yang @xuningy

about 2 months ago

When every generalist robot model scores 95%+ on a benchmark, the numbers become meaningless. What if we built a photorealistic benchmark that never saturates and can generate new scenes and tasks with AI Workflows in minutes? We introduce RoboLab! 🧵(1/6)

https://t.co/GxFIivVmKa

HugoHadfield1 retweeted

Caelan Garrett @CaelanGarrett

3 months ago

Check out Yash Narang's GTC talk today where he will highlight some of our work on GPU-accelerated multi-arm manipulation planning! https://t.co/H36o9bTnqf https://t.co/PFNiKc4Sk5

HugoHadfield1 retweeted

Ankit Goyal

@imankitgoyal

6 months ago

Happy to share that the code for VLA-0 is out now: https://t.co/Vg8wsCSIPQ Given its simplicity, it’s a great starting point to try out VLAs!

Hugo Hadfield @HugoHadfield1

7 months ago

Today we have open-sourced our training code for VLA-0, our state-of-the-art VLA with zero modifications. Have a go with it here: https://t.co/ePC2z5UnTd

HugoHadfield1 retweeted

Ankit Goyal

@imankitgoyal

8 months ago

Huge thanks to my incredible collaborators: @HugoHadfield1, Xuning Yang, Valts Blukis, Fabio Ramos. And the amazing teams at NVIDIA: @NVIDIARobotics @NVIDIAAI @NVIDIAEmbedded. If you're excited about simple, effective approaches to VLAs: 💻 Code: https://t.co/za0bgtQE5x (Coming soon!) 🌐 Page: https://t.co/ctqopKWyij 📄 Paper: https://t.co/wUqKcosUXv

HugoHadfield1 retweeted

Physics Memes

@ThePhysicsMemes

almost 2 years ago

I feel the pain #students

Hugo Hadfield @HugoHadfield1

over 1 year ago

Built a little automated Nth-order derivative package yesterday afternoon, as I got a bit tired of dealing with nasty time series data with noisy/missing values, and people seem to like it :) https://t.co/eEFGXQl6J5

Hugo Hadfield @HugoHadfield1

almost 2 years ago

@thomasahle Eric is a force of nature in almost any software/hardware/mathematical environment he finds his way to. First met him when he was 18 years old, designing XAP processors for Cambridge Consultants and already a core NumPy contributor. The guy shows no signs of slowing down 🚀

HugoHadfield1 retweeted

Thomas Ahle

@thomasahle

almost 2 years ago

Cool project by @EricWieser to formalize all of the Matrix Cookbook in Lean! https://t.co/MV34Fhvorr

Hugo Hadfield @HugoHadfield1

almost 2 years ago

Now with blog post write up! https://t.co/wrwK90yHiF

Hugo Hadfield @HugoHadfield1

almost 2 years ago

The advantage of this method vs a checkerboard is that 1. You don’t need to stand in the rain in front of your robot holding a massive checkerboard and feeling like an idiot 2. You can just get any old image that has lines and straight edges in and it works 6/n

Hugo Hadfield @HugoHadfield1

almost 2 years ago

So how can you actually play with some code that does this? I’ve found this paper https://t.co/vzMUtntdQF which looks well great. I’ve added code fix-ups, Python bindings, and a mapping to OpenCV here: https://t.co/WGVk8wd8ih Would highly recommend having a play, it works great! 7/7

Hugo Hadfield @HugoHadfield1

almost 2 years ago

Humans can tell when an image has fisheye lens distortion: it just looks wrong, like a GoPro video. We can also tell if an image is correctly undistorted: all the lines which should be straight are straight. Begs the question, can we make computers understand this too? 1/n

https://t.co/4NGVWFVUPs
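
One way a computer could score "lines which should be straight are straight" is to take pixel coordinates sampled along a detected edge and measure how far they stray from their best-fit line. The sketch below is only an illustration of that idea, not the method from the linked paper or repository; the function name is hypothetical, and it assumes the edge points have already been extracted by some edge detector.

```python
import math


def straightness_error(points):
    """RMS perpendicular distance of 2D points from their best-fit line.

    Uses the smaller eigenvalue of the 2x2 covariance matrix of the points,
    which equals the mean squared distance to the total-least-squares line.
    Returns a value near 0 for a perfectly straight edge.
    """
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points) / n
    syy = sum((y - mean_y) ** 2 for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points) / n
    half_trace = (sxx + syy) / 2
    root = math.sqrt(((sxx - syy) / 2) ** 2 + sxy ** 2)
    smallest = max(half_trace - root, 0.0)  # guard against tiny negative round-off
    return math.sqrt(smallest)


if __name__ == "__main__":
    straight = [(t, 2 * t + 1) for t in range(10)]
    curved = [(t, 0.05 * t * t) for t in range(10)]
    print(straightness_error(straight))
    print(straightness_error(curved))
```

Summing a score like this over many edges gives a single number that an optimizer could try to reduce by adjusting the distortion parameters, which is why any image with plenty of straight edges can stand in for a checkerboard.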
