Lei Shu @shulindt - Twitter Profile

shulindt retweeted

5 days ago

Missing Fable 5🥹? Come and cast your votes on 3DCodeArena https://t.co/KBEEv4qH5A to see its performance on agentic 3D modeling via writing code. Check https://t.co/6RWSgVW5XQ for more results from 12 frontier LLMs/VLMs on text/image-to-3D generation via Blender code. Code is released at https://t.co/SpL82YEz42. Please star it if you find it interesting! #Fable #Claude #Gemini #GPT #Blender #ProceduralModeling #GoogleDeepmind

shulindt retweeted

Mohit Bansal

@mohitban47

3 months ago

🚨 New #CVPR2026 collaboration with Google DeepMind --> Ego2Web bridges egocentric video perception and web execution, enabling agents that see the first-person real-world video of the user’s surroundings, and take actions on the web grounded in the egocentric video:
▪️ Introduces a task where agents must ground egocentric video (first-person view) into concrete web actions (requires visual grounding → entity extraction → planning → real website execution).
▪️ Covers realistic cross-domain tasks, e.g., e-commerce (find/buy items you saw), media retrieval (find related videos), knowledge lookup (identify & query entities), maps/local (locate places from visual cues).
▪️ Proposes Ego2WebJudge to automatically evaluate whether web agent results are correctly grounded in the video context.
▪️ Reveals concrete failure modes across 6 strong agents (GPT-5.4, Claude, Gemini-based agents, etc.): weak visual grounding, brittle cross-modal reasoning, and planning breakdowns (only ~58% success rate).
Details 👇👇

shulindt retweeted

Shoubin Yu

@shoubin621

3 months ago

Introducing Ego2Web from Google DeepMind and UNC Chapel Hill, accepted to #CVPR2026. AI agents can browse the web. But can they act based on what you see? Existing benchmarks focus only on web interaction while ignoring the real world. Ego2Web bridges egocentric video perception and web execution, enabling agents that can see through first-person video, understand real-world context, and take actions on the web grounded in the egocentric video. This opens a path toward AI assistants that operate seamlessly across physical and digital environments. We hope Ego2Web serves as an important step for building more capable, perception-driven agents. 🧵👇

shulindt retweeted

Shoubin Yu

@shoubin621

3 months ago

Awesome collaboration with @shulindt @AntoineYang2 @Francis_YAO_ Srinivas Sunkara, Maria Wang, @jdchen @mohitban47 @BoqingGo @unc_ai_group @unccs @GoogleDeepMind Check the full paper for more details! ArXiv: https://t.co/hdzQDpirbf Code: https://t.co/ebiMxeExjL Benchmark: https://t.co/PAWgIORG4r Webpage: https://t.co/8VE4D3BUAl @huggingface page: https://t.co/EISEEmq4Oy

shulindt retweeted

Hu Xu (back on robotics)

@Hu_Hsu

3 months ago

Our team at FAIR, @AIatMeta is looking for a 2026 Summer Intern to work on video pretraining, with related interests in video generation, world models, or robotics. Given the current timing, we’re especially happy to hear from PhD students who prioritized research over the past year and may not have had much time for ad-hoc/random attempts at internship interviews (I was definitely in that situation during my PhD 😆). Feel free to DM me.

shulindt retweeted

Hu Xu (back on robotics)

@Hu_Hsu

7 months ago

Thrilled that SAM 3 and SAM 3D @AIatMeta leverage the concept-rich images of Meta CLIP (https://t.co/v4FHn1uOjJ), greatly expanding and scaling concept-based image curation.

shulindt retweeted

Hu Xu (back on robotics)

@Hu_Hsu

10 months ago

Truly appreciate the authors of Molmo @Molmo_AI (from @allen_ai and @UW) for promoting open research and adopting MetaCLIP. There are many forms of openness today—such as open APIs, open weights, and open-source for reproducibility etc. I view MetaCLIP and Molmo's research approach as “from scratch” that is on top of being open: building the entire process without relying on black-box modules that limit the research scope, pushing the limit of every module and sharing both the knowledge and insights gained along the way.

shulindt retweeted

Hu Xu (back on robotics)

@Hu_Hsu

11 months ago

The cross-modal multilingual capability in Meta CLIP 2 comes naturally from scaling. It reflects the Bitter Lesson (@RichardSSutton): build a scalable learning environment and let optimization and scale do the heavy lifting. Inspired by Jensen Huang’s scaling law plot (at CES2025), scaling isn’t just about increasing compute or data at each stage; it’s about consistently identifying bottlenecks, removing constraints, and rethinking a simpler setup to make it more scalable. This mindset extends to data curation itself: starting from ImageNet to worldwide data curation. @AIatMeta

shulindt retweeted

Jason Weston

@jaseweston

11 months ago

🌿Introducing MetaCLIP 2 🌿 📝: https://t.co/mSncoFH5bE code, model: https://t.co/11j9HcaeAB After four years of advancements in English-centric CLIP development, MetaCLIP 2 is now taking the next step: scaling CLIP to worldwide data. The effort addresses long-standing challenges: (1) large-scale non-English data curation pipelines are largely undeveloped, and (2) the curse of multilinguality, where English performance often degrades in multilingual CLIP compared to English-only CLIP. With a complete recipe for worldwide CLIP—spanning data curation, modeling, and training—we show that English and non-English worlds can mutually benefit and elevate each other, achieving SoTA multilingual performance. Join the Meta booth at #ACL2025 to learn more. (1/3)

shulindt retweeted

Hu Xu (back on robotics)

@Hu_Hsu

11 months ago

Thanks for highlighting this impact. Yes, we have kept removing well-known existing filters since the start of Meta CLIP 1, and the English filter is the last one. We believe every data point carries unique information (e.g., a file name contains concepts, a timestamp hints at how old the camera is, etc.), and removing it risks losing (unknown) information. It's only a question of how much humans really care about that information. Curation (select, not remove) is TheRightWay™, alignment is even more TheRightWay™. As illustrated with a Venn diagram in our #icml2025 talk, the bitter lesson is basically that one cannot approximate a desired training distribution well enough with a finite number of binary classifiers.

shulindt retweeted

Jeff Dean

@JeffDean

11 months ago

Colab Pro is now available for free for verified US students and faculty for one year.

shulindt retweeted

ComputerUseAgents Workshop @workshopcua

11 months ago

Join us at the Computer Use Agents workshop at ICML2025. Happening now in West Meeting Room 211, Vancouver Convention Centre! We have a day packed with fantastic invited and contributed talks, posters, and discussions!

Lei Shu

@shulindt

11 months ago

The #Google #DeepMind WebQuest benchmark will be presented at the #ICML2025 Workshop on #ComputerUseAgents (July 19). Happy to chat offline about all things #AIAgent #LLMAgents

ComputerUseAgents Workshop @workshopcua

about 1 year ago

⏳ Less than 1 day left to submit! 🔦 Speaker Spotlight Time! We’re thrilled to welcome Yu Su (@ysu_nlp), Distinguished Assistant Professor at The Ohio State University, as an invited speaker at the ICML 2025 Workshop on Computer Use Agents! His work bridges LLM agents, memory, and planning, driving some of the most cited advances in the field. #ICML2025 #LLMAgents #ComputerUseAgents #NLProc

shulindt retweeted

Hu Xu (back on robotics)

@Hu_Hsu

11 months ago

Thanks for the invited talk. Happy to share our industrial insights on “scaling data alignment” from Meta CLIP (its wide adoption and what’s next) in the DataWorld workshop at #ICML2025. Happy to chat offline about data research.

Lei Shu

@shulindt

11 months ago

https://t.co/3U5Sn1fcG6

shulindt retweeted

Zeyuan Allen-Zhu, Sc.D.

@ZeyuanAllenZhu

12 months ago

Facebook AI Research (FAIR) is a small, prestigious lab in Meta. We don't train large models like GenAI or MSL, so it's natural that we have limited GPUs. GenAI or MSL's success or failure, past or future, doesn't reflect the work of FAIR. It is important to make this distinction

shulindt retweeted

Demis Hassabis

@demishassabis

about 1 year ago

It’s been an amazing few months of relentless building, shipping, and optimising our models incorporating your feedback. Excited for more users and developers to try out the incredible Gemini 2.5 series!

shulindt retweeted