TimSong @TimSong52005757 - Twitter Profile

Pinned Tweet

TimSong @TimSong52005757

3 months ago

So honored to have the support of my professors and peers, including the omnipotent Long @LongLeRobot, on my first PhD project. 3D space is the bridge between the task and action space, where the guidance from foundation knowledge flows.

Kostas Daniilidis

@KostasPenn

3 months ago

VLA policies learn generalist robot behaviors from massive teleoperation datasets, hoping that the right behavior emerges. But they rarely use perception during training or inference: powerful foundation models of 3D geometry, semantics, or human motion are ignored. @TimSong52005757, @LongLeRobot, and our @GRASPlab team introduce Omniguide, based on a simple idea: Instead of retraining policies, let perception guide action at inference time. We express diverse guidance sources as attractive and repulsive energy fields in 3D space and inject their gradients into the generative process of VLA policies. This lets perception modules steer actions without retraining the policy: A modular path toward composable robot intelligence. Project: https://t.co/5Vqvz1t8TR Paper: https://t.co/IoYzKrOzKM
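To make the idea of attractive and repulsive fields concrete, here is a toy sketch in plain Python. It is not the OmniGuide implementation: the quadratic energies, the function names, the `scale` weight, and `policy_update` (a stand-in for whatever update a diffusion or flow policy would produce on its own) are all assumptions chosen for illustration.

```python
import math

def attractive_grad(x, goal):
    # Assumed energy: E_att = 0.5 * ||x - goal||^2, so grad = x - goal.
    return [xi - gi for xi, gi in zip(x, goal)]

def repulsive_grad(x, obstacle, radius):
    # Assumed energy: E_rep = 0.5 * max(0, radius - d)^2 with d = ||x - obstacle||.
    # Inside the radius, grad = -(radius - d) * (x - obstacle) / d; outside it is zero.
    diff = [xi - oi for xi, oi in zip(x, obstacle)]
    d = math.sqrt(sum(c * c for c in diff))
    if d >= radius or d == 0.0:
        return [0.0] * len(x)
    return [-(radius - d) * c / d for c in diff]

def guided_step(x, policy_update, goal, obstacle, radius, scale=0.1):
    # Hypothetical guidance step: apply the policy's own update, then
    # descend the combined energy so the point is pulled toward the goal
    # and pushed away from the obstacle.
    g_att = attractive_grad(x, goal)
    g_rep = repulsive_grad(x, obstacle, radius)
    return [
        xi + ui - scale * (ga + gr)
        for xi, ui, ga, gr in zip(x, policy_update, g_att, g_rep)
    ]
```

Calling `guided_step` repeatedly inside a sampling loop, with `policy_update` supplied by the policy at each iteration, is one way such gradients could steer a trajectory without touching the policy's weights.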

TimSong52005757 retweeted

Chen Geng

@gengchen01

4 days ago

🌟Your static 3D world models are now alive and interactable! 🚀Introducing NeuROK, a neural simulation framework that turns any static 3D object into an interactive 4D asset — no per-category physics, no physical annotations for training. 📄 https://t.co/PSAILjHmZb 🧵 1/n

TimSong52005757 retweeted

Zhiyang (Frank) Dou

@frankzydou

about 1 month ago

Excited to share that our work NeuralActuator: Neural Actuation Modeling for Robot Dynamics and External Force Perception has been accepted to #RSS2026! Your robot — even a low-cost one — can feel external forces without torque or tactile sensors. TL;DR: NeuralActuator is a neural actuator model that jointly predicts 1️⃣torque to capture the nonlinear and time-varying current–to–torque relationship of low-cost servos, 2️⃣external contact forces (and force detection gates) for sensorless force perception, 3️⃣and motor conditions that indicate each motor’s operating regime. Here is a fast-forward video clip ⬇️ We are also covering more robots like LeRobot-S101 and Franka Panda. More details coming soon.

TimSong52005757 retweeted

Guanya Shi

@GuanyaShi

2 months ago

I’m so tired of writing rebuttals to this kind of “lack of novelty” review: “This paper trivially combines A, B, and C, so the algorithmic novelty is limited.” Technically, most (if not all) robotics papers are convex combinations of existing ideas. I still deeply appreciate A+B+C papers, especially when they deliver:
- New capabilities: the “trivial combination” unlocks behaviors we simply couldn’t achieve before
- Sensible & organic design: A+B+C is clearly the right composition, not some arbitrary A′+B+C′
- Nontrivial interactions: careful analysis of the dynamics, coupling, or failure modes between A, B, C
- Rehabilitating old ideas: A was dismissed for years, but paired with modern B/C, it suddenly works and teaches us why
- System-level & "interface" insight: the contribution is not any single piece, but how the pieces talk to each other
- Scaling laws or regimes: identifying when/why A+B+C works (and when it doesn’t)
- Engineering clarity: making something actually work robustly in the real world is not “trivial”
- New problem formulations: sometimes the real novelty is in the reformulation; only under this view does A+B+C make sense.
Maybe worth keeping these in mind when reviewing the next A+B+C paper : )

TimSong52005757 retweeted

Jinqi Luo

@peterljq

2 months ago

Really exciting work! This is exactly the kind of promising direction for test-time controllability. Agents should actively perceive to guide and correct their behavior.

TimSong52005757 retweeted

Dinesh Jayaraman @dineshjayaraman

3 months ago

Excited about this project: we showed how, for challenging tasks, a VLA can "get by with a little help from its friends"😉 --- powerful perception models that infer geometry and more, by constructing 3D guidance fields for diffusion and flow policies. @LongLeRobot @TimSong52005757

TimSong52005757 retweeted

Chris Paxton

@chris_j_paxton

3 months ago

It was a lot of fun to see this in person. We see so much online, but there's nothing nearly as convincing as just doing a demo, first try, right in front of someone. And it's cool work too, addressing a serious shortcoming of current policies.

TimSong52005757 retweeted

Long Le @LongLeRobot

3 months ago

For this project, we’ve also optimized the speed significantly and shown real-time demos to several visitors coming to Penn including @chris_j_paxton and @DJiafei. Here’s a video of the robot finishing a task in 20 secs autonomously

TimSong52005757 retweeted

Robots Digest 🤖

@robotsdigest

3 months ago

Scaling VLAs with more robot data is just not enough. OmniGuide shows you can fix generalist policies at inference time. Add guidance fields in 3D space that attract toward goals and repel from obstacles, and steer the policy without retraining.

TimSong52005757 retweeted

Long Le @LongLeRobot

3 months ago

Check out my most recent project that I've been working on for more than half a year -- co-led with Tim @TimSong52005757! We showed a flexible and effective recipe for improving generalist VLAs leveraging non-robotic foundation models! The key is guidance in 3D! More details @ https://t.co/aBauZS5TiT

TimSong @TimSong52005757

over 1 year ago

@20Kamio @JiahuiLei1998 @CrossEntropi @LingjieLiu1 @KostasPenn Hi! We compared our method with GSTex on the NeRF synthetic dataset. I checked with the author: their method cannot run on all the scenes of the mipnerf360 dataset, so that comparison is not reported in either their paper or ours.

Photo: https://t.co/dkRbs4mSEX

TimSong52005757 retweeted

Jiahui Lei

@JiahuiLei1998

over 1 year ago

Capture a scene with 2DGS that can be viewed from extremely close and far away viewpoints, without aliasing while maintaining detailed texture. Check our new work https://t.co/VP3MasjC7J led by @TimSong52005757 @CrossEntropi and advised by me, @LingjieLiu1 and @KostasPenn

TimSong52005757 retweeted

MrNeRF

@janusch_patas

over 1 year ago

3D Gaussian Splatting in Robotics: A Survey Paper: https://t.co/GnqtiIdifP

TimSong52005757 retweeted

Yuanwen Yue

@YueYuanwen

almost 2 years ago

Introducing #ECCV2024 work FiT3D: Improving 2D Feature Representations by 3D-Aware Fine-Tuning. 2D foundation models are awesome - but we live in a 3D world. How to inject 3D awareness into 2D foundation models? 🤔 In FiT3D, we first lift 2D foundation features (e.g. DINOv2) into a 3D Gaussian representation for each scene. Then we use the rendered 3D-aware features of multiple scenes to finetune the 2D foundation model. We show that semantic features fused into 3D representations can in turn effectively improve 2D foundation models. 💻 Code: https://t.co/2xrkQToNmt 🚀 Project: https://t.co/HIwQlEb7ez 🤗 Demo: https://t.co/JG2JOqVJjR With @_anurag_das, @FrancisEngelman, @SiyuTang3, and @janericlenssen #ETHZurich #MPI_INF #GoogleAI

TimSong52005757 retweeted

MrNeRF

@janusch_patas

almost 2 years ago

Fasten your seat belts. Michael from @radiancefields and I will host a weekly X Space discussing exciting developments with researchers and creators in the radiance field and GenAI community, starting next Thursday. Our first guest will be @scannerian1, one of the world's leading experts in capturing 3D scenes from images. We plan to create an interactive community where we can share knowledge and grow together. We hope you will join and participate to make this space a success. We have some exciting guests lined up. Stay tuned and let's have fun together sharing the latest insights.
