Junlin Han @han_junlin - Twitter Profile

Pinned Tweet

8 months ago

Excited to share our new work: “Learning to See Before Seeing”! 🧠➡️👀 We investigate an interesting phenomenon: how do LLMs, trained only on text, learn about the visual world? Project page: https://t.co/9mQt3qnckL

7

157

26

57

26K

Junlin Han @han_junlin

18 days ago

Do the right things that can scale.

Jianyuan@CVPR

@jianyuan_wang

18 days ago

Introducing VGGT-Ω: scaling feed-forward reconstruction across static and dynamic scenes, and studying whether the learned geometric representations transfer beyond reconstruction.

14

845

144

242

774K

1

13

0

1

2K

Junlin Han @han_junlin

about 2 months ago

@phillip_isola PRH fan here. Beyond measurement metrics, just sharing my feelings. When working with LLM + vision (understand & gen), we can feel the Platonic behind it. It’s less about one modality replacing others and more about high rep similarity making cross-modal learning so much easier.

0

5

0

1

558

Junlin Han @han_junlin

about 2 months ago

@TongPetersb @sainingxie @ylecun @mengyer @YiMaTweets @LukeZettlemoyer @liuzhuang1234 Our friendship began during our PhD application cycle back in 2022, and there’s hardly any need to say more about how amazing your work has been. Behind all of it is an incredibly hardworking person, with strong belief and a very kind heart. Congrats, Dr. Tong!!!

1

9

0

906

Junlin Han @han_junlin

about 2 months ago

@__JohnNguyen__ @TongPetersb @DavidJFan @sainingxie @ylecun @_ellisbrown the talk and the work behind it are amazing! Congrats Dr. Tong!! @TongPetersb

0

5

0

711

Junlin Han @han_junlin

about 2 months ago

@xuanchi13 super impressive!!! Amazing work!

1

2

0

561

Junlin Han @han_junlin

3 months ago

@xxunhuang so true, can’t agree more!!! 🤣

0

3

0

290

Junlin Han @han_junlin

3 months ago

AMI was founded by amazing researchers I have deeply respected since the very beginning of my research career. They have built fundamental things: before the LLM era (SSL, JEPA, moco, mae, barlow twins, arch from LeNet to more recently resnext, convnext...), during the LLM era (gpt, dalle, gemini, Cambrian, rae, beyond language modeling...), and AMI will undoubtedly open the path of building advanced, world-centric intelligence that shapes the future!

Saining Xie

@sainingxie

3 months ago

i’m joining forces with @ylecun and an incredible group of people to start AMI Labs @amilabs. AMI isn’t a conventional lab. we don’t intend to become one. a lot to say about why this moment matters, but for now we’re heads down building. join us: https://t.co/zXj1IyBYDc

153

3K

160

476

498K

2

59

1

8

7K

Junlin Han @han_junlin

3 months ago

AMI was founded by amazing researchers I have deeply respected since the very beginning of my research career. They have built fundamental things: before the LLM era (SSL, JEPA, moco, mae, barlow twins, arch from LeNet to more recently resnext, convnext...), during the LLM era (gpt, dalle, gemini, Cambrian, Jepa, rae, beyond language modeling...), and AMI will undoubtedly open the path of building advanced, world-centric intelligence that shapes the future!

1

6

0

957

Junlin Han @han_junlin

3 months ago

@nanliuuu 😊 thanks Nan!

0

1

0

70

Junlin Han @han_junlin

3 months ago

We believe the next leap in General Intelligence lies beyond language. Vision holds an untapped ocean of potential for true world modeling. By training all from scratch, we show that vision can play a more foundational role in intelligence, rather than just being an add-on!!

Peter Tong

@TongPetersb

3 months ago

Train Beyond Language. We bet on the visual world as the critical next step alongside and beyond language modeling. So, we studied building foundation models from scratch with vision. We share our exploration: visual representations, data, world modeling, architecture, and scaling behavior! [1/9]

35

1K

220

721

217K

2

48

2

12

5K

Junlin Han @han_junlin

3 months ago

@__JohnNguyen__ 😆😆😆 I probably need to switch my coding llm to Claude!

0

5

0

122

han_junlin retweeted

David Fan

@DavidJFan

3 months ago

[1/9] What happens when you treat vision as a first-class citizen during multimodal pretraining? To find out, we studied the design space of training Transfusion-style models that input and output all modalities, from scratch. Here is what we learned about visual representations, data, world modeling, architecture, and scaling behavior! Paper: https://t.co/ik6JGgjbTD Website: https://t.co/nklaggMEfT @TongPetersb, @DavidJFan, @__JohnNguyen__, @ellisbrown, @GaoyueZhou, @JasonQSY, @boyangzheng, @webalorn, @han_junlin, @rob_fergus, @NailaMurray, @gh_marjan, @ml_perception, Nicolas Ballas, @_amirbar, Michael Rabbat, Jakob Verbeek, @LukeZettlemoyer, @koustuvsinha, @ylecun, @sainingxie

12

301

60

209

51K

han_junlin retweeted

John Nguyen

@__JohnNguyen__

3 months ago

Humans communicate through language and interact with the world through vision, yet most multimodal models are language-first. What happens when we go beyond language? 🤔 Beyond Language Modeling: a deep dive into the design space of truly native multimodal models Paper: https://t.co/KOpmL1PItn Project: https://t.co/Oy6XuEtUAi

10

202

39

157

40K

Junlin Han @han_junlin

3 months ago

@DavidJFan I know this was truly a massive challenge that went far beyond the technical aspects. Thank you, David, for driving this forward and delivering such incredible work. Absolutely amazing!!🤩

0

1

0

47

Junlin Han @han_junlin

3 months ago

It was really a high-stakes bet and a challenging journey. Huge congrats to the amazing team, especially Peter @TongPetersb , David @DavidJFan , and John @__JohnNguyen__ for their incredible work in leading the project! You are da best!!!

0

5

0

231

Junlin Han @han_junlin

3 months ago

It covers many design spaces in unified pre-training, from visual rep and arch to world modeling and scaling. Each part has very useful findings backed up with tons of explorations. Crucially, we show that vision and language are highly complementary, a synergy for intelligence.

1

5

0

281

Junlin Han @han_junlin

3 months ago

Now accepted to ICLR 2026 with an oral presentation! See you in Rio.

Junlin Han @han_junlin

8 months ago

Excited to share our new work: “Learning to See Before Seeing”! 🧠➡️👀 We investigate an interesting phenomenon: how do LLMs, trained only on text, learn about the visual world? Project page: https://t.co/9mQt3qnckL

7

157

26

57

26K

1

55

3

17

7K

Junlin Han @han_junlin

7 months ago

@yawarnihal amazing!! Congrats!🎊🎉

0

1

0

70
