Guangxing Han @GuangxingHan - Twitter Profile

2 months ago

True multimodal AI needs to understand the world spatially 🎯 🚀 Excited to release #CVPR2026 TIPSv2 from @GoogleDeepMind, a foundational image-text encoder with spatial awareness, leading to strong overall results and massive gains on patch-text alignment. 🔥 1/N

Posted by @andrefaraujo.

GuangxingHan retweeted

Google DeepMind @GoogleDeepMind

7 months ago

This is Gemini 3: our most intelligent model that helps you learn, build and plan anything. It comes with state-of-the-art reasoning capabilities, world-leading multimodal understanding, and enables new agentic coding experiences. 🧵

Guangxing Han @GuangxingHan

8 months ago

Shraman is one of the best young researchers I have been working with. He has demonstrated profound skill in multimodal LLMs for visual grounding, segmentation and reasoning. Reach out to him if you need a top-tier vision-language multimodal expert!

Shraman Pramanick

@Shramanpramani2

8 months ago

My role at Meta's SAM team (MSL, previously at FAIR Perception) has been impacted within 3 months of joining after PhD. If you work with multimodal LLMs for grounding or complex reasoning, or have a long-term vision of unified understanding and generation, let's talk. I am on the job market starting immediately. #metalayoffs #FAIR #MSL #SAM

GuangxingHan retweeted

Sundar Pichai

@sundarpichai

9 months ago

Our new Gemini 2.5 Computer Use model is now available in the Gemini API, setting a new standard on multiple benchmarks with lower latency. These are early days, but the model’s ability to interact with the web – like scrolling, filling forms + navigating dropdowns – is an important next step in building general-purpose agents. Developers can try these capabilities via API in @googleaistudio + Vertex AI.

Guangxing Han @GuangxingHan

11 months ago

@YangsiboHuang @jasondeanlee at least better readability. Congrats!

GuangxingHan retweeted

Demis Hassabis

@demishassabis

12 months ago

Thrilled to welcome @windsurf_ai founders @_mohansolo & Douglas Chen and some of the brilliant Windsurf eng team to @GoogleDeepMind. Excited to be working with them to turbocharge our Gemini efforts on coding agents, tool use and much more. Great to have you on board!

GuangxingHan retweeted

Giorgos Kordopatis-Zilos @g_kordo

12 months ago

🚨 Deadline Extension Instance-Level Recognition and Generation (ILR+G) Workshop at ICCV2025 @ICCVConference 📅 new deadline: June 26, 2025 (23:59 AoE) 📄 paper submission: https://t.co/gTGYhrTc6Z 🌐 ILR+G website: https://t.co/Oy1vGAg5uh #ICCV2025 #ComputerVision #AI

GuangxingHan retweeted

Sundar Pichai

@sundarpichai

about 1 year ago

Gemini 2.5 Pro + 2.5 Flash are now stable and generally available. Plus, get a preview of Gemini 2.5 Flash-Lite, our fastest + most cost-efficient 2.5 model yet. 🔦 Exciting steps as we expand our 2.5 series of hybrid reasoning models that deliver amazing performance at the Pareto frontier of cost and speed. 🚀

GuangxingHan retweeted

Arjun Karpur @arjunkarpur

about 1 year ago

Excited to be presenting TIPS at this morning’s #ICLR2025 poster session! Come by poster #318 and say hi 👋 w/ @kfrancischen @andrefaraujo @kmaninis #ICLR #ICLR25

GuangxingHan retweeted

Google DeepMind @GoogleDeepMind

about 1 year ago

Think you know Gemini? 🤔 Think again. Meet Gemini 2.5: our most intelligent model 💡 The first release is Pro Experimental, which is state-of-the-art across many benchmarks - meaning it can handle complex problems and give more accurate responses. Try it now → https://t.co/sxHWaCUzRp

GuangxingHan retweeted

André Araujo @andrefaraujo

over 1 year ago

Multimodal AI encoders often lack spatial understanding… but not anymore! Our #ICLR2025 TIPS model (Text-Image Pretraining with Spatial awareness) from @GoogleDeepMind can help 💡🚀 Check out our strong & versatile image-text encoder 💪 Paper & code: https://t.co/LCiqV4gaQ0

GuangxingHan retweeted

André Araujo @andrefaraujo

over 1 year ago

Excited to release a super capable family of image-text models from our TIPS #ICLR2025 paper! https://t.co/1scX7H1DIb We have models from ViT-S to -g, with spatial awareness, suitable to many multimodal AI applications. Can’t wait to see what the community will build with them!

GuangxingHan retweeted

André Araujo @andrefaraujo

over 1 year ago

Very happy to see learnings from our TIPS method (ICLR'25 accepted https://t.co/IP6JowSDcE) adopted into SigLIP2! A very nice collaboration, great outcome!

GuangxingHan retweeted

André Araujo @andrefaraujo

over 1 year ago

Want some TIPS? Well, then check out “Text-Image Pretraining with Spatial awareness” :) TIPS is a general-purpose image-text encoder, for off-the-shelf dense and image-level prediction. Finally image-text pretraining with spatially-aware representations! https://t.co/LCiqV4gaQ0

Guangxing Han @GuangxingHan

over 1 year ago

@liuzhuang1234 @PrincetonCS Congrats!

Guangxing Han @GuangxingHan

over 1 year ago

@_tim_brooks @GoogleDeepMind Congrats Google

Guangxing Han @GuangxingHan

almost 2 years ago

@ProfTomYeh @AnthropicAI interesting work. Any link for the original paper?

GuangxingHan retweeted

André Araujo @andrefaraujo

almost 2 years ago

Our call for papers for the ILR workshop at #ECCV2024 is open! Deadline on July 25th, options for both long and short papers. Don't miss this opportunity to showcase your work in the broad area of instance-level recognition! Submit at: https://t.co/K0amT3liUC

Guangxing Han @GuangxingHan

almost 2 years ago

@zibuyu9 Prof. Liu, please tweet more often 🤣

GuangxingHan retweeted

André Araujo @andrefaraujo

about 2 years ago

Announcing the #ECCV2024 workshop on Instance-Level Recognition (ILR)! This is the 6th edition in our workshop series, with amazing keynote speakers: @CordeliaSchmid, @jampani_varun and @g_kordo. Call for papers now open! All information on our website: https://t.co/y2jJrvpDAa
