Wei Yu @GnosisYu - Twitter Profile

Pinned Tweet

2 months ago

Dropping an exciting new demo of MosaicMem! 👀🔥 A friend brought up a great question: why not combine long-horizon navigation video generation, promptable world events, and scene concatenation? Fair point — so we gave it a shot. 🎬✨ For more technical details, check this thread 🧵👇 https://t.co/qyQYwmHsE6 #WorldModel #GenerativeAI #VideoGeneration #InteractiveAI #Genie3 #EmbodiedAI #GameAI

0

107

21

72

9K

Wei Yu @GnosisYu

2 days ago

@_sam_sinha_ @BerntBornich @1x_tech Huge congrats, @sam_sinha! 🎉

0

2

0

1

144

GnosisYu retweeted

Howard Xiao

@howard_xhc

4 days ago

I’m very excited to share that my very first Ph.D. project, Policy-based Foveated Imaging and Perception, will be presented at #SIGGRAPH2026! Intelligent sensing transcends passive capture. Our framework allows ultra-high-resolution sensors to intelligently allocate acquisition bandwidth, perceiving the blooming present and awakening the vivid past. We further demo our framework on a physical 200MP sensor prototype, running in real time using only a laptop CPU! I’m extremely grateful for the advice and support from my Ph.D. advisor @GordonWetzstein, and for the wonderful collaboration with @jan_on_x and @boyang_deng! Please check out our paper and our website 📷+👁️: https://t.co/GC4V7qqtP9.

6

33

4

9

4K

GnosisYu retweeted

Shuhong Zheng @zhengshuhong

11 days ago

Exciting to share our work "Good Token Hunting" 🔍 (Yes, the name is inspired by the classic movie "Good Will Hunting" 🎬!), which focuses on accelerating visual geometry transformers 🚀 by limiting the number of keys/values each query can attend in global attention layers. [1/6]

[Photo] https://t.co/ER24A6cEpq

1

27

8

19

17K

GnosisYu retweeted

Xuanchi Ren

@xuanchi13

11 days ago

The latent-vs-pixel debate misses the point. GPT Image 2 shows what users notice: pixel-level fidelity. Latent models show what scales: compact semantic structure. We connect them by replacing VAE/RAE decoders with a Pixel Diffusion Decoder. Code and Model available: https://t.co/JjtecJzF0W 🧵(1/N)

16

412

69

306

668K

GnosisYu retweeted

Lily Goli @lily_goli

16 days ago

🚀 🚀 🚀 Excited to share our new paper: "Remember to be Curious: Episodic Context and Persistent Worlds for 3D Exploration". What does it take for an agent to stay curious in a 3D world? The answer is memory.
🌐 Project: https://t.co/G4SjLoFJht
📄 Paper: https://t.co/iUFwp5NvRu
💻 Code: https://t.co/KZRaQLyzyh

2

222

40

129

70K

GnosisYu retweeted

Runway

@runwayml

16 days ago

Aleph 2.0 is here. Now you can edit a single frame in your video, preview the change and then Aleph 2.0 carries that edit across the rest of your video. Try it now in the new Edit Studio on web at the link below.

136

2K

263

1K

5M

GnosisYu retweeted

Odyssey @odysseyml

19 days ago

Meet our new friend, Starchild-1 ❤️ Starchild-1 is the first ever real-time multimodal world model. A world model understands and simulates the world. Starchild-1 has learned to generate not just the visuals of the world, but the sounds of it too!

63

781

73

475

120K

GnosisYu retweeted

Dwarkesh Patel

@dwarkesh_sp

22 days ago

New blackboard lecture w/ @ericjang11. He walks through how to build AlphaGo from scratch, but with modern AI tools.

Sometimes you understand the future better by stepping backward. AlphaGo is still the cleanest worked example of the primitives of intelligence: search, learning from experience, and self-play. You have to go back to 2017 to get insight into how the more general AIs of the future might learn.

Once he explained how AlphaGo works, it gave us the context to have a discussion about how RL works in LLMs and how it could work better: naive policy gradient RL has to figure out which of the 100k+ tokens in your trajectory actually got you the right answer, while AlphaGo’s MCTS suggests a strictly better action every single move, giving you a training target that sidesteps the credit assignment problem. The way humans learn is surely closer to the second.

Eric also kickstarted an Autoresearch loop on his project. And it was very interesting to discuss which parts of AI research LLMs can already automate pretty well (implementing and running experiments, optimizing hyperparameters) and which they still struggle with (choosing the right question to investigate next, escaping research dead ends). Informative to all the recent discussion about when we should expect an intelligence explosion, and what it would look like from the inside.

Timestamps:
0:00:00 – Basics of Go
0:08:06 – Monte Carlo Tree Search
0:31:53 – What the neural network does
1:00:22 – Self-play
1:25:27 – Alternative RL approaches
1:45:36 – Why doesn’t MCTS work for LLMs
2:00:58 – Off-policy training
2:11:51 – RL is even more information inefficient than you thought
2:22:05 – Automated AI researchers

65

3K

286

3K

683K

GnosisYu retweeted

Haoyi Zhu @HaoyiZhu

23 days ago

🤩Excited to share SANA-WM: a 2.6B open-source world model for minute-scale 720p video generation. Given one image + text + a 6-DoF camera trajectory, it synthesizes action-controllable 60s worlds on a single GPU. Project: https://t.co/5NINfiFoTK Paper: https://t.co/JKczmyRsJL

27

1K

142

795

121K

GnosisYu retweeted

Wayve @wayve_ai

about 1 month ago

Meet LA-Pose. Our latest model taking Wayve another step towards generalization at scale. LA-Pose employs large-scale self-supervised learning, building strong motion representations for 3D perception from 10.2 million unlabeled driving video snippets, unlike today's strongest approaches that often depend on expensive, carefully curated 3D supervision. With only a lightweight pose head and limited labelled data, LA-Pose achieves:
📷 State-of-the-art camera pose estimation
🌎 Strong zero-shot generalization across diverse driving scenarios
🏷️ Orders of magnitude less labelled data than fully supervised 3D approaches
Our full blog post: https://t.co/CcNWuLHJsn
Explore the full paper here: https://t.co/DHRsAS9ckV

1

145

37

100

36K

GnosisYu retweeted

Alberto

@taiuti

30 days ago

We're taking our first step towards democratizing World Models, so that everyone can build on this incredible technology. We have more to share, but enjoy a glimpse of what's to come, today. Try it here: https://t.co/FP7acKd7v7

42

796

63

235

1M

GnosisYu retweeted

Junchen Liu

@JunchenLiu77

about 1 month ago

Excited to share that this work has been accepted to #ICML and the code is now publicly available at https://t.co/nAQec6sLOC!

1

43

2

5

7K

GnosisYu retweeted

Nathan Yan @OfficialNathanY

about 1 month ago

What if your robot could understand any object you describe, just from a phone camera? RADIO-ViPE builds a 3D map from raw monocular video that you can query with natural language. (1/4)

11

423

49

339

61K

GnosisYu retweeted

Xindi Wu @cindy_x_wu

5 months ago

New #NVIDIA Paper: We introduce Motive, a motion-centric, gradient-based data attribution method that traces which training videos help or hurt video generation. By isolating temporal dynamics from static appearance, Motive identifies which training videos shape motion in video generation. 🔗 https://t.co/TbKXjQMN3H [1/10]

11

582

119

265

110K

GnosisYu retweeted

Moonlake @moonlake

about 1 month ago

Introducing Moonlake's 3D Agent. Our agent acts like a technical artist that can build and reconstruct articulated assets and large-scale editable scenes with hundreds of objects from a single image and can improve its generations continuously. Learn more in the thread below.

39

1K

180

2K

1M

GnosisYu retweeted

Qixing Huang @qixing_huang

about 1 month ago

Presenting Yunpeng’s paper on positional encoding field. I am at P3-1908!

0

104

7

59

9K

GnosisYu retweeted

Rohan Paul

@rohanpaul_ai

about 2 months ago

In Beijing's 2026 humanoid robot half-marathon, HONOR's Lightning completed the 21 km course in 50:26. That beats the current human men's half-marathon world record of 57:20. Last year's winner took over 2 hours 40 minutes. Massive progress in 12 months.

4

91

25

13K

GnosisYu retweeted

Skywork

@Skywork_ai

about 2 months ago

Introducing Skywork 3.0: A 24/7 Autonomous Cloud Workforce. Available to all Skywork subscribers.

11

70

6

24

8K

GnosisYu retweeted

Rohan Paul

@rohanpaul_ai

about 2 months ago

This robot took home the “Best Design” award in today's Beijing humanoid robot half-marathon, in recognition that its motion looks closer to natural human running than that of most competitors. TienKung Ultra completed the full 21.1 km in 1 hour 15 minutes.

28

573

88

97

44K

GnosisYu retweeted

Kai He @Kai__He

about 2 months ago

We open-sourced the code and model for UniRelight! 🎉 Given an input video and a target lighting configuration, our method jointly predicts a relit video and its corresponding albedo. Code: https://t.co/4zF94saWvo Model: https://t.co/d8i66UyvhU

7

277

45

192

32K
