David Chan @_dmchan - Twitter Profile

4 days ago

Some really awesome work out of our lab! Congrats team :)

4 days ago

Excited to share T-Rex: Tactile-Reactive Dexterous Manipulation 🦖🤖
Touch is fundamental to human dexterity, yet most Vision-Language-Action (VLA) models either ignore tactile feedback or lack the ability to react to high-frequency contact signals. In this work, we tackle both the data and architectural challenges of tactile-reactive dexterous manipulation.
🦖 A 100-hour tactile-synchronized dexterous manipulation dataset with 7,700+ trajectories, 22 motor primitives, and 200+ everyday objects.
🦖 A tactile-reactive MoT architecture with spatial-temporal tactile encoding and asynchronous high-frequency tactile refinement.
🦖 A scalable training recipe combining 22,889 hours of human egocentric pretraining with tactile-grounded robot mid-training.
Across 12 real-world contact-rich manipulation tasks, T-Rex achieves over 30% higher average success rate than the strongest baseline. We are fully open-sourcing the dataset, models, teleoperation stack, training code, and inference pipeline.
🌐 Project: https://t.co/AiHKRR8YXU
📄 Paper: https://t.co/mXY2UNLlqc
💻 Code: https://t.co/7skCxUtwKC
🤗 Dataset: https://t.co/uNwW8dcRZL
🧵 Thread ↓

_dmchan retweeted

luuk de leest @luuk58

15 days ago

fun fact: during the keynote, Apple cuts out a slice at 3k, 4k, 5k and 6kHz whenever they say "Siri", so that not everyone's HomePods start talking back 🗣️🚫 https://t.co/x13WbNPztr

_dmchan retweeted

Baifeng @baifeng_shi

3 months ago

Humans can see in high-res, high-FPS in real-time. Why can't VLMs? Introducing AutoGaze: ViTs/VLMs "gaze" only at key video regions! Up to 4-100x token savings, 19x speedup, and enables scaling to 4K-res 1K-frame videos. 📄 https://t.co/GhbWZwMAg7 🌐 https://t.co/mEJ991MAIR 🤗 https://t.co/FOfc2QRThi (1/n)🧵

_dmchan retweeted

Edson Araujo @edsonroteia

4 months ago

📢 Deadline Extension for MMFM Workshop @ #CVPR2026! We are extending the submission deadline to **March 14, 2026 (AoE)**. For updated details on submission timelines and guidelines, please refer to the workshop website and OpenReview page below. We’re excited to see your work!

David Chan @_dmchan

4 months ago

Our deadline is only one week away! Don't forget to submit!

Edson Araujo @edsonroteia

4 months ago

The 5th edition of the MMFM Workshop is coming to @CVPR 2026!
"What is Next in Multimodal Foundation Models?" exploring the frontiers of vision, language, and beyond.
June 2026 | Denver, CO
Details in thread 👇 https://t.co/dY9CgWHWqB

David Chan @_dmchan

5 months ago

Some awesome new work from our group which explores the multimodal capabilities of interactive agents!

Zirui "Colin" Wang @zwcolin

5 months ago

🎮 We release VisGym: Diverse, Customizable, Scalable Environments for Multimodal Agents (w/ @junyi42 @aomaru_21490) 🌐 With 17 environments across multiple domains, we show systematically the brittleness of VLMs in visual interaction, and what training leads to. 🧵[1/8]

David Chan @_dmchan

5 months ago

Ooooh, fancy!

Haven Feng @HavenFeng

5 months ago

✨Thinking with Blender~ Meet VIGA: a multimodal agent that autonomously codes 3D/4D blender scenes from any image, with no human, no training! @berkeley_ai #LLMs #Blender #Agent 🧵1/6

_dmchan retweeted

XuDong Wang @XDWang101

7 months ago

Objectness should be user-defined — not human-label-defined! Unsupervised SAM 2 (UnSAMv2) makes it real✨ 1 point + a continuous granularity slider = the mask you want! UnSAMv2 beats SAM2: +16% NoC-90, +26% 1-IoU, +37% AR on 11+ datasets (w/ just 6k unlabeled images)!💪 1/n

_dmchan retweeted

Jay Alammar @JayAlammar

8 months ago

The Illustrated NeurIPS 2025: A Visual Map of the AI Frontier New blog post! NeurIPS 2025 papers are out—and it’s a lot to take in. This visualization lets you explore the entire research landscape interactively, with clusters, summaries, and @cohere LLM-generated explanations that make the field easier to grasp. Link in thread!

_dmchan retweeted

Phillip Isola @phillip_isola

8 months ago

Arxiv has been such a wonderful service but I think this is a step in the wrong direction. We have other venues for peer review. To me the value of arxiv lies precisely in its lack of excessive moderation. I'd prefer it as "github for science," rather than yet another journal.

David Chan @_dmchan

8 months ago

I’m not sure what they get out of this, but I’m here for it!

Vaibhav (VB) Srivastav @reach_vb

8 months ago

Chinese doordash dropping MIT license foundation video models??? “We introduce LongCat-Video, a foundational video generation model with 13.6B parameters, delivering strong performance across Text-to-Video, Image-to-Video, and Video-Continuation generation tasks.” https://t.co/jPTY2Uac1S

David Chan @_dmchan

8 months ago

Two days at ICCV = Two new papers! Interrupting LLMs’ reasoning should have seamless and predictable behavior. Turns out, that’s not the case.

Patrick Wu @tsunghan_wu

8 months ago

Humans handle dynamic situations easily, what about models? Turns out, they break in three distinct ways:
⛔ Force Stop → Reasoning leakage (won’t stop)
⚡️ Speedup → Panic (rushed answers)
❓ Info Updates → Self-doubt (reject updates)
👉 Check out https://t.co/vr7f2ZYMTp

David Chan @_dmchan

8 months ago

This is some awesome new work from our lab - ECHO is a way we can build benchmarks automatically from social media! Check it out!

Jiaxin Ge @jiaxin_ge_

8 months ago

✨Introducing ECHO, the newest in-the-wild image generation benchmark! You’ve seen new image models and new use cases discussed on social media, but old benchmarks don’t test them! We distilled this qualitative discussion into a structured benchmark. 🔗 https://t.co/wJmmEY8TFQ

David Chan @_dmchan

8 months ago

I'm at #ICCV2025 this week - send me a DM or email if you'd like to find a time to talk anything multimodal! Speaking of multimodal, don't forget to check out our workshop: "What's Next in Multimodal Foundation Models?" on Monday in 326 B! https://t.co/2BjLLfag9y

_dmchan retweeted

Roei Herzig @roeiherzig

8 months ago

🌺 Join us in Hawaii at ICCV 2025 for the workshop “What is Next in Multimodal Foundation Models?”
🗓️ Monday, October 20 | 8:00 – 12:00 📍 Room 326 B
We’ve got a stellar lineup of speakers & panelists. Details here: 🔗 https://t.co/t2DmcZAlWM
@ICCVConference https://t.co/EuZFSk35CZ

_dmchan retweeted

Wen-Han Hsieh @henseoba

10 months ago

🚀Excited to share that our paper, “Do What? Teaching Vision-Language-Action Models to Reject the Impossible,” has been accepted to #EMNLP2025 Findings! 📄Paper: https://t.co/TDamXwV5i8 🌎Project page: https://t.co/QIK01vk60b

David Chan @_dmchan

11 months ago

I'll be in Vienna for ACL starting today - I’m presenting work on how LMMs perform in-context updates in a Bayesian way, but I’m excited to talk anything multimodal! Feel free to reach out if you’re around! #ACL2025

David Chan @_dmchan

11 months ago

Awesome work exploring the power of serial computing!

Konpat Ta Preechakul @konpatp

11 months ago

Some problems can’t be rushed; they can only be done step by step, no matter how many people or processors you throw at them.
We’ve scaled AI by making everything bigger and more parallel: Our models are parallel. Our scaling is parallel. Our GPUs are parallel.
But what if the real bottleneck isn’t size, but depth? What if the model just didn’t have enough serial steps to get it right? Some problems need depth, not width. This is the Serial Scaling Hypothesis.
This is not the same as recent studies in scaling test-time compute, which focus on train vs. test and are agnostic to parallel vs. serial. For example: test-time majority voting increases compute by running models in parallel, but doesn’t help when the task itself is serial.
We argue: what really matters is how the compute is structured. And for many real-world problems, it must be serial.
Read more at: https://t.co/msytYszWK0 or 🧵. (In collaboration with: @layer07_yuxi, Kananart Kuwaranancharoen and @YutongBAI1002)

David Chan @_dmchan

11 months ago

Me (to Cursor): Refactor this code.
Cursor: Sure! I've refactored your code! It's shorter and cleaner now!
Me: Are you sure there are no feature regressions?
Cursor: The code is missing essential functionality.
Me: ....

_dmchan retweeted

Patrick Wu @tsunghan_wu

about 1 year ago

📢 Call for Papers! Last chance to hang with the CV crowd in Hawaii 🌴
We're hosting the 4th MMFM Workshop at #ICCV2025. Submit your work on vision, language, audio & more by July 1 🗓️
Also check out the CVPR edition 👉 @MMFMWorkshop
🔗 https://t.co/ZpLDbqIAOy https://t.co/qBoo0JgFeR
