Haobo Yuan @HarborYuan - Twitter Profile

11 days ago

Proud to share our lab’s @MMLabNTU work Log-linear Sparse Attention (LLSA) - a trainable sparse attention mechanism that reduces attention complexity from O(N²) to O(N log N), making diffusion transformers much more efficient. Also, special shout-out to the first author @zhouyifan1107 for presenting the poster in full costume - truly above and beyond. The level of dedication is impressive! 👏 #CVPR2026 #DiffusionModels #EfficientAI #SparseAttention

[ccloy's tweet photo]

Haobo Yuan @HarborYuan

over 1 year ago

Sa2VA is the first unified model for dense grounded understanding of both images and videos. It combines SAM-2 and MLLM models to enable a wide range of image and video tasks with minimal one-shot instruction tuning.

Haobo Yuan @HarborYuan

over 1 year ago

Introducing our Sa2VA. 🔥 Project Page: https://t.co/6RFu459ks2 GitHub: https://t.co/4VPQxABO33 Huggingface Demo: https://t.co/yyWI7WvQXh (Feat: @huggingface @fffiloni ) We provide various ways to get a quick start. Have fun~🥳

merve

@mervenoyann

over 1 year ago

ByteDance just dropped SA2VA: a new family of vision LMs combining Qwen2VL/InternVL and SAM2 with MIT license 💗 The models are capable of tasks involving vision-language understanding and visual referrals (referring segmentation) for images and videos ⏯️ take a look 🧶

[mervenoyann's tweet photo: https://t.co/TE8A2PEvg6]

HarborYuan retweeted

Adina Yakup

@AdinaYakup

over 1 year ago

Sa2VA 🔥 a unified model for dense grounded understanding of images & videos, released by ByteDance. Model: https://t.co/yjTV7yGjiI Paper: https://t.co/ntvcGrtJmg ✨ 1B/4B/8B ✨ Based on InternVL, with Qwen2, Qwen2.5, and InternLM as the language part. ✨ Unifies text, images, and videos into a shared token space for seamless multimodal interactions

Haobo Yuan @HarborYuan

almost 2 years ago

Our Open-Vocabulary SAM has been accepted by ECCV 2024. 🔥 Project Page: https://t.co/izru0l2he3 Code: https://t.co/I4ysfJi2V5 #ECCV

AK

@_akhaliq

over 2 years ago

mmlab-ntu presents Open-Vocabulary SAM: Segment and Recognize Twenty-thousand Classes Interactively. Paper page: https://t.co/QqODebOcPp Open-Vocabulary SAM extends SAM's segmentation capabilities with CLIP-like real-world recognition, while significantly reducing computational costs. It outperforms combined SAM and CLIP methods in object recognition on the COCO open vocabulary benchmark.

HarborYuan retweeted

Xiangtai Li

@xtl994

over 2 years ago

Happy to share that our survey has been accepted by T-PAMI. @HenghuiDing @HarborYuan We present the first comprehensive survey on open-vocabulary learning: detection, segmentation, video, 3D analysis, etc. Paper: https://t.co/uju0cI37oy Github: https://t.co/9BivSt22uN

[xtl994's tweet photo: https://t.co/644wEnpkGo]

Haobo Yuan @HarborYuan

over 2 years ago

@ClementDelangue Congrats. Hugging Face🤗 is really helpful in my research.

HarborYuan retweeted

MrBeast

@MrBeast

over 2 years ago

I’m gonna give 10 random people that repost this and follow me $25,000 for fun (the $250,000 my X video made) I’ll pick the winners in 72 hours

HarborYuan retweeted

Xiangtai Li

@xtl994

over 2 years ago

Glad to share one research work during my post-doc study with @HarborYuan @ccloy. Name: OMG-Seg: Is One Model Good Enough For All Segmentation? Arxiv: https://t.co/uO1UFFcRjl Project Page: https://t.co/3tflOkGmOR Code: https://t.co/Xji2qTcLti Demo: https://t.co/QBkW7YWHCJ

[xtl994's tweet photo: https://t.co/i7rV5RMYLl]

Haobo Yuan @HarborYuan

over 2 years ago

Thank you @AK for sharing our work! We are excited to introduce our Open-Vocabulary SAM, which fuses the knowledge from two foundation models (CLIP and SAM) into a unified architecture. Website: https://t.co/08w7K1s4qU Code: https://t.co/I4ysfJi2V5 Paper: https://t.co/sgSNvCvBup

AK

@_akhaliq

over 2 years ago

mmlab-ntu presents Open-Vocabulary SAM: Segment and Recognize Twenty-thousand Classes Interactively. Paper page: https://t.co/QqODebOcPp Open-Vocabulary SAM extends SAM's segmentation capabilities with CLIP-like real-world recognition, while significantly reducing computational costs. It outperforms combined SAM and CLIP methods in object recognition on the COCO open vocabulary benchmark.

HarborYuan retweeted

iSchool at Illinois @iSchoolUI

over 2 years ago

Maintaining anonymity on the internet has become increasingly challenging. @MIT @techreview explores China's shift away from online anonymity, highlighting research conducted by #iSchoolUI PhD student @kyriezz78. Check out the full article: https://t.co/xW13oghxnK

[iSchoolUI's tweet photo: https://t.co/72x1y1Shz3]

HarborYuan retweeted

Chong Zhou @ChongZhou7

over 2 years ago

Introducing EdgeSAM, the first SAM variant that can run at over 30 FPS on an iPhone 14 with minimal compromise in performance. Code, models, and Hugging Face demo are available! arXiv: https://t.co/kPt7fnlCvC Project page: https://t.co/6w60RZQgnG

HarborYuan retweeted

Google DeepMind @GoogleDeepMind

over 2 years ago

We’re excited to announce 𝗚𝗲𝗺𝗶𝗻𝗶: @Google’s largest and most capable AI model. Built to be natively multimodal, it can understand and operate across text, code, audio, image and video - and achieves state-of-the-art performance across many tasks. 🧵 https://t.co/mwHZTDTBuG

HarborYuan retweeted

Xiangtai Li

@xtl994

over 2 years ago

Happy to achieve 500+ stars. https://t.co/XDmGcqE4uC. We will continue to update the draft on the arxiv. @HenghuiDing @wenweiz97 @HarborYuan

HarborYuan retweeted

Andrea Tagliasacchi @CVPR @taiyasaki

over 2 years ago

This is how you get a 2-column (SIGGRAPH style) teaser image in #CVPR2024 (code and outcome below)

Haobo Yuan @HarborYuan

over 2 years ago

@encounter19972 Awesome work!!

HarborYuan retweeted

encounter1997 @encounter19972

over 2 years ago

Excited to share our recent work "Object-aware Inversion and Reassembly for Image Editing" Paper: https://t.co/q6mn33f2V6 Code: https://t.co/Kl8ox0c7Dr Project Page: https://t.co/EKJuo3tlMW (1/5)

HarborYuan retweeted