Jerry Han

Verified account

@JerryHan_og

Co-founder & CTO at @OpenGraph_Labs, building multimodal data infrastructure

Seoul

Joined September 2025

124 Following

841 Followers

59 Posts

about 1 month ago

Physical AI needs human data, but human data capture is still way too hard.
Not because pressing record is hard. Because the moment you add cameras + sensors, everything gets messy:
Every device has its own clock.
Streams can silently fail.
Recording health has to be checked.
Start / stop has to line up.
Synchronization has to be solved after.
SyncField Desktop turns it into one workflow.
Auto-discover cameras + sensors.
Connect streams with aliases.
Drag, arrange, and monitor panels.
Record everything in one click.
Review synchronized playback.
Get frame-aligned data on disk.
No handclaps. No LED flashes. No sync scripts. No file wrangling.
Just humans doing real tasks, captured cleanly.
If you're working on human data for Physical AI, reach out: https://t.co/MR4Mljaut5

1

5

3

1

489

about 2 months ago

https://t.co/cmQBKGe1Cl

2

65

14

97

7K

JerryHan_og retweeted

3 months ago

Introducing EgoVerse: an ecosystem for robot learning from egocentric human data.
Built and tested by 4 research labs + 3 industry partners, EgoVerse enables both science and scaling.
1300+ hrs, 240 scenes, 2000+ tasks, and growing.
Dataset design, findings, and ecosystem 🧵

34

877

162

459

267K

JerryHan_og retweeted

OpenGraph Labs 🧤 @OpenGraph_Labs

3 months ago

Excited to share that @OpenGraph_Labs has been accepted into @NVIDIA’s Inception Program 🚀
Our mission is to build reliable infrastructure for multimodal data capture, powering the next generation of robotics & world models 🌎 https://t.co/cXMBokeSey

0

20

3

0

2K

3 months ago

@qigbrian 4x3 pressure per finger, bending, and IMU! all synced and calibrated through our capture pipeline

0

0

0

0

107

3 months ago

World models can predict the next frame. They can't predict the next touch.
That's the gap visuo-tactile world models will close.
Is the robot gripping hard enough? Is the surface rigid or soft? When exactly does contact begin and end?
Vision doesn't know. Tactile does.
We built @OpenGraph_Labs to capture what cameras miss.
Egocentric RGB × 5-finger multi-taxel tactile gloves. Frame-synced. Calibrated. In-the-wild.
No lab setups. No scripted pick-and-place. Just humans doing real tasks in real stores.
Watch the exact moment contact happens. The pressure map lights up in sync.
Every touch. Every frame. 👇

4

118

15

84

13K

3 months ago

repeatability depends on the sensor, we're sensor-agnostic and calibrate across devices to keep data consistent. occlusion is exactly the point haha, vision loses the object the moment you grab it. tactile doesn't. we have thimble-type sensors too, way lighter and less sweaty 👍

0

1

0

1

100

3 months ago

Owning your data infra is the new moat in robotics.

3 months ago

Data can’t just be outsourced 🤯
To iterate fast, robotics teams must own their data infrastructure.
Introducing SyncField: turnkey data infrastructure for in-the-wild data collection (best for UMI-style & embodied human data).
#Robotics #UMI #DataCollection

8

136

12

85

15K

0

4

0

3

411

3 months ago

@tmztmobile Fun idea, hadn't thought about that. But occlusion makes it tricky. Hand shadows, blind spots, bad angles where the camera can't see contact.

1

0

0

1

113

3 months ago

Are you sure your training data is actually synced?
Egocentric camera sees a hand grasping an orange, but the wrist cam shows nothing and tactile reads zero contact. Your policy is learning from broken data and doesn't even know it.
In Physical AI, multi-modal sync is everything.
→ Egocentric: 30fps
→ Wrist: 30fps
→ Tactile: 100Hz
Different devices, different clocks, slightly different rates. The drift starts small. Barely noticeable frame by frame. But over a 4-minute episode, that tiny difference compounds into seconds of misalignment. And you had no way to even check. Until now.
We built the Sync Quality Dashboard. One score tells you if your data is clean. Then go deeper. Clock offset, drift rate, jitter, frame drops, per-stream correction. All visible, all measurable.
In a 4-min episode, accumulated clock drift reached 7.5 seconds by the end of the recording. After correction: 9.0ms. That's the difference between "roughly aligned" and "actually aligned."
Visually confirm vision-to-vision, vision-to-tactile alignment frame by frame. No more "trust me, the data is fine."
We don't just collect multi-modal demos. We ship a quality assurance layer so you can verify every episode before it touches your model.
All data in @LeRobotHF format. Ready to train. Verified in sync.
Stop guessing. Start verifying.

5

103

11

121

10K
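Editor's sketch: the drift figure in the post above (7.5 s accumulated over a 4-minute, 240 s episode) corresponds to a rate error of roughly 3.1%. The post does not say how the Sync Quality Dashboard estimates or corrects drift, so the code below is only one common approach, assuming a linear clock model (device time = scale × reference time + offset) and assuming that a few paired timestamps between the device and a reference clock are available. The names `fit_clock` and `to_reference` are hypothetical.

```python
def fit_clock(ref_times, dev_times):
    """Least-squares fit of dev = scale * ref + offset (assumed linear model)."""
    n = len(ref_times)
    mean_r = sum(ref_times) / n
    mean_d = sum(dev_times) / n
    cov = sum((r - mean_r) * (d - mean_d) for r, d in zip(ref_times, dev_times))
    var = sum((r - mean_r) ** 2 for r in ref_times)
    scale = cov / var
    offset = mean_d - scale * mean_r
    return scale, offset


def to_reference(dev_time, scale, offset):
    """Map a device timestamp back onto the reference clock."""
    return (dev_time - offset) / scale


# Synthetic example: a device clock that starts 0.5 s ahead and runs 3.125% fast,
# which accumulates 7.5 s of drift over a 240 s recording.
ref = [0.0, 60.0, 120.0, 180.0, 240.0]
dev = [0.5 + 1.03125 * t for t in ref]

scale, offset = fit_clock(ref, dev)
drift_over_episode = (scale - 1.0) * 240.0
print("offset (s):", offset)
print("drift over 240 s (s):", drift_over_episode)
print("corrected end time (s):", to_reference(dev[-1], scale, offset))
```

On real recordings the fit would leave a residual from jitter and measurement noise; the synthetic data here is noise-free, so the correction is exact.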

3 months ago

@iamRezaSayar Not public. Just internal tooling we use for our own data.

0

1

0

0

97

3 months ago

@DominiqueCAPaul No public link, it's internal tooling for now.

0

0

0

0

122

3 months ago

That's a cool idea, fine-tuning could help for sure. But some contacts are just invisible to vision. Cameras get occluded by shadows and blind spots, so you can't always tell if contact is actually happening. You'd risk training on noise. That's why we think tactile is the missing signal, not better pixels.

0

1

0

0

28

3 months ago

VLMs see everything. Feel nothing.
VLMs annotate what looks like contact. Tactile sensors verify what actually is contact.
We ran VLM annotation on real manipulation demos. It labeled a grasp as "approach." Skipped release phases entirely. Hallucinated state transitions that never occurred.
6 out of 36 action phases wrong. 17% of your training data, corrupted.
Why? Pixels don't know when a fingertip touches a surface. Pixels don't know when grip pressure hits zero. Tactile sensors do.
So we built a pipeline that catches every error automatically. Tactile evidence validates every contact transition. Wrong labels corrected, missing phases inserted.
Not faster labeling. Truthful labeling.
This is one of the core problems we're solving at @OpenGraph_Labs

2

43

5

27

4K
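Editor's sketch: the post above describes checking VLM phase labels against tactile evidence but gives no implementation details. The code below is an illustration of the general idea only, not the OpenGraph pipeline. It assumes a single pressure channel sampled at a fixed rate, a fixed contact threshold, and a hypothetical label vocabulary in which only "grasp" and "hold" imply contact. All names (`contact_intervals`, `flag_mismatches`, `CONTACT_LABELS`) are made up for the example.

```python
def contact_intervals(pressure, rate_hz, threshold):
    """Return (start_s, end_s) spans where pressure is above the threshold."""
    intervals, start = [], None
    for i, p in enumerate(pressure):
        if p > threshold and start is None:
            start = i                      # contact begins
        elif p <= threshold and start is not None:
            intervals.append((start / rate_hz, i / rate_hz))
            start = None                   # contact ends
    if start is not None:                  # still in contact at end of recording
        intervals.append((start / rate_hz, len(pressure) / rate_hz))
    return intervals


CONTACT_LABELS = {"grasp", "hold"}  # hypothetical: labels that should involve contact


def overlaps(seg_start, seg_end, intervals):
    """True if the segment shares any time with a contact interval."""
    return any(s < seg_end and seg_start < e for s, e in intervals)


def flag_mismatches(segments, intervals):
    """Return segments whose label disagrees with the tactile evidence."""
    flagged = []
    for label, start, end in segments:
        touching = overlaps(start, end, intervals)
        if touching != (label in CONTACT_LABELS):
            flagged.append((label, start, end))
    return flagged


# Synthetic 10 Hz pressure trace with contact from 0.3 s to 0.6 s.
pressure = [0, 0, 0, 5, 6, 7, 0, 0, 0, 0]
contacts = contact_intervals(pressure, rate_hz=10, threshold=1)

# Hypothetical VLM output: the middle segment is a grasp mislabeled as "approach".
segments = [("approach", 0.0, 0.3), ("approach", 0.3, 0.6), ("retreat", 0.6, 1.0)]
print(flag_mismatches(segments, contacts))
```

For scale, the error rate quoted in the post is 6 / 36 ≈ 16.7%, which rounds to the stated 17%.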

3 months ago

@sanskxr02 Oh nice, curious what you're building! Keep me posted

1

1

0

0

65

3 months ago

@YuvalDimri Thanks! We built our own abstraction layer. New sensors or data types basically just plug right in. For tactile, we extract contact events from each sensor's raw data.

0

1

0

0

81

4 months ago

Robots can't learn if their eyes and hands are out of sync.
A 30fps camera and a 1kHz tactile sensor don't speak the same language.
Multiple cameras, multiple sensors, all on different clocks at different rates. Jitter from USB and OS scheduling. Drift that compounds every second of recording.
We built a multi-modal sync pipeline that aligns all of it to ±2.5ms. Automatically.
Every frame matched. Zero sensor samples lost. No hardware triggers needed.
Sensor-agnostic. Hardware-agnostic. Just plug in and record.
Physical AI needs real hand-eye coordination. Not approximate, precise.
We're building this at OpenGraph. https://t.co/H3yzDkaw1j

5

290

39

235

21K
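Editor's sketch: once both streams are on a common clock, pairing a 30 fps camera with a 1 kHz tactile sensor can be done by nearest-timestamp matching. The post does not describe the actual pipeline, so this is a generic illustration. It assumes timestamps are already drift-corrected, in seconds, and sorted. The 2.5 ms tolerance is borrowed from the ±2.5ms figure in the post, and the function names are hypothetical.

```python
from bisect import bisect_left


def nearest_index(sorted_times, t):
    """Index of the entry in sorted_times closest to t."""
    i = bisect_left(sorted_times, t)
    if i == 0:
        return 0
    if i == len(sorted_times):
        return len(sorted_times) - 1
    before, after = sorted_times[i - 1], sorted_times[i]
    return i if (after - t) < (t - before) else i - 1


def match_frames(frame_times, sample_times, tolerance_s=0.0025):
    """Pair each frame with its nearest sensor sample, or None if too far away."""
    matches = []
    for f_idx, t in enumerate(frame_times):
        s_idx = nearest_index(sample_times, t)
        if abs(sample_times[s_idx] - t) <= tolerance_s:
            matches.append((f_idx, s_idx))
        else:
            matches.append((f_idx, None))  # e.g. a dropout in the sensor stream
    return matches


# Synthetic streams: four camera frames at 30 fps, 200 tactile samples at 1 kHz.
frame_times = [i / 30 for i in range(4)]
sample_times = [i / 1000 for i in range(200)]
print(match_frames(frame_times, sample_times))
```

This pairs one sample with each frame; keeping every sensor sample, as the post describes, would instead assign all samples between consecutive frame timestamps to a frame.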

3 months ago

@lee_redden For real. Still a pain point in 2026, so we're fixing it.

0

2

0

0

265

3 months ago

@kellypeilinchan Exactly. Software sync has to be right first. Everything built on top inherits the error otherwise. That's why we started there.

0

1

0

0

223

JerryHan_og retweeted

4 months ago

🏆 Won 1st Place at the AGI Hackathon at @agihouse_org with @juliakeem @JerryHan_og and @OpenGraph_Labs!
We built a "Temporal Action Segmentation Pipeline" for Physical AI.
The Problem: Robotics data today = short clips, RGB-only, lab settings. We need long-horizon, multi-modal, in-the-wild data.
Our Solution:
🎬 Input: Long manipulation video (5+ mins)
🤖 Gemini VLM → Action & Phase segmentation
🎯 SAM3 → Object tracking with text prompts
🌐 Pi3 → 3D reconstruction & camera poses
📚 Skill clustering → Reusable skill library
→ Output: Structured robot training data with timestamps, masks & 3D
Humans ARE the ultimate robots 🦾
#PhysicalAI #Robotics #Hackathon #Gemini #SegmentAnything
Huge thanks to @henry_yu_01 @NomadicML @zoox @DynaRobotics

9

98

8

53

8K
