Jie Lei @jayleicn - Twitter Profile

Pinned Tweet

Jie Lei @jayleicn

7 months ago

Sharing our latest work SAM 3, the most advanced model for segmenting anything in images and videos.

AI at Meta @AIatMeta

7 months ago

Today we’re excited to unveil a new generation of Segment Anything Models:
1️⃣ SAM 3 enables detecting, segmenting and tracking of objects across images and videos, now with short text phrases and exemplar prompts.
🔗 Learn more about SAM 3: https://t.co/CjMnf7fspz
2️⃣ SAM 3D brings the model collection into the 3rd dimension to enable precise reconstruction of 3D objects and people from a single 2D image.
🔗 Learn more about SAM 3D: https://t.co/yXcvts8Ogc
These models offer innovative capabilities and unique tools for developers and researchers to create, experiment and uplevel media workflows.

140

4K

595

2K

1M

0

6

0

405

jayleicn retweeted

Kate Saenko @kate_saenko_

6 months ago

My team at Meta is looking for summer research interns! We develop cutting-edge perception models like SAM 3, SAM 3D and Perception Encoder. Application link: https://t.co/yTEwBRK7Kh (the video is SAM 3 with prompt "fish")

8

279

26

156

21K

jayleicn retweeted

Kate Saenko @kate_saenko_

6 months ago

We have LM Arena for chatbots, but what about one for computer vision models? It now exists! You can blind compare and rate models side by side on vision tasks. #SAM3 is currently the top scoring and fastest model for object detection! https://t.co/E89h6YetCI

2

60

12

19

10K

jayleicn retweeted

Nikhila Ravi @nikhilaravi

7 months ago

🧵Announcing Segment Anything 3! SAM 3 extends SAM 2 with open vocabulary text and exemplar prompts, enabling it to detect, segment, and track all instances of a target category in images/videos. We're releasing code, a checkpoint, an eval benchmark, & demo playground. SAM 3 will be coming soon to features in Edits, Vibes, & FB Marketplace! Deep dive below 👇

7

149

16

33

29K

jayleicn retweeted

Manling Li @ManlingLi_

almost 3 years ago

I am excited to join @northwesterncs as an assistant professor in Fall24 and @StanfordSVL as a postdoc with @jiajunwu_cs. I cannot say how much I appreciate the help from my advisor @elgreco_winter, references @ShihFuChang @kchonyc @JiaweiHan @kathymckeown and many many people.

37

378

15

20

99K

Jie Lei @jayleicn

almost 3 years ago

I miss the days working with Linjie, best collaborator ever.

Linjie (Lindsey) Li @LINJIEFUN

almost 3 years ago

I am humbled to be re-featured as Women in Computer Vision for the BEST of CVPR section of the Computer Vision News July Magazine. It was great chatting with Ralph Anzarouth. I hope my unconventional career path can encourage more female researchers. https://t.co/vzLUNMRmQg

3

134

5

6

20K

2

5

0

2K

Jie Lei @jayleicn

almost 3 years ago

Welcome to our tutorial @CVPR!

Manling Li @ManlingLi_

almost 3 years ago

Knowledge vs Large Models? Welcome to our #CVPR23 tutorial "Knowledge-Driven Vision-Language Encoding" with @Xudong_Lin_AI @jayleicn @mohitban47 @cvondrick @Shih-Fu Chang @elgreco_winter
Jun 19: 9:00-12:30
Loc: East 8
Website: https://t.co/cxHORLzTDh
Zoom: https://t.co/FbuuxRtZgg

2

176

41

29

27K

0

21

5

2

3K

jayleicn retweeted

UNC Computer Science @unccs

about 3 years ago

Exciting research from @UNCCS coming to #CVPR2023 shows that pretrained vision models can understand audio-visual data without audio pretraining #ComputerVision #MachineLearning @yilin_sung @jayleicn @mohitban47 @gberta227 @CVPRConf @CVPR

0

17

5

2

3K

Jie Lei @jayleicn

about 3 years ago

Check out our recent work studying the important factors of video-language pre-training.

Gedas Bertasius @gberta227

about 3 years ago

What makes modern Video-Language (VidL) perform well? Check out our #CVPR2023 paper "VindLU: A Recipe for Effective Video-and-Language Pretraining" where we demystify the most critical factors in the VidL model design. https://t.co/fNZ0RtdEgu @fncheng2333 @jayleicn @mohitban47

1

53

11

9

12K

1

7

3

2

3K

Jie Lei @jayleicn

about 3 years ago

@rown Great work! Congrats Rowan!

0

1

0

326

Jie Lei @jayleicn

over 3 years ago

Come and join our AAAI tutorial on knowledge-driven vision-language pre-training tomorrow afternoon.

Manling Li @ManlingLi_

over 3 years ago

What is the value of knowledge in the era of large-scale pretraining? Welcome to our #AAAI23 tutorial "Knowledge-Driven Vision-Language Pretraining" with @Xudong_Lin_AI @jayleicn @mohitban47 @Shih-Fu Chang @elgreco_winter
Feb 8: 2-6pm
Loc: Room 201
Zoom: https://t.co/IBuba5YEkd

1

83

17

19

51K

0

13

2

0

2K

jayleicn retweeted

Mohit Bansal @mohitban47

over 3 years ago

🎉🎉BIG congrats to @ZinengTang for the amazing achievement of being selected as Winner (out of 4 in North America) of the 2023 CRA Outstanding Undergraduate Researcher Award! #ProudAdvisor🙂 🚨 Zineng is applying for a PhD this year 👉 https://t.co/R4URBXSZmo @CRAtweets @unccs

1

52

9

0

13K

Jie Lei @jayleicn

over 3 years ago

@gabriel_ilharco Very interesting work! Inspiring.

0

1

0

366

Jie Lei @jayleicn

over 3 years ago

Efficient vision language learning with our Perceiver-VL.

Jaemin Cho @jmin__cho

over 3 years ago

Self-attention for VL tasks (esp. video+text) is too expensive! Check out our #WACV2023 paper “Perceiver-VL: Efficient Vision-and-Language Modeling with Iterative Latent Attention” https://t.co/in7X3bPKVV https://t.co/mo24Iu8dSn @ZinengTang* @jmin__cho* @jayleicn @mohitban47 🧵

2

59

17

11

0

9

4

0

jayleicn retweeted

Yi Lin Sung @yilin_sung

over 3 years ago

🎉Our LST paper was accepted to #NeurIPS2022🎉 Ladder Side-tuning achieves both memory & parameter efficiency in NLP + VL tasks. Talk video: https://t.co/GmsHSd0sRB Camera-ready version: https://t.co/VIK83Fw7lC We will be in New Orleans, happy to chat! @jmin__cho @mohitban47

1

40

13

5

0

Jie Lei @jayleicn

over 3 years ago

Static frame-level info + LLM = a strong few-shot video captioner.

Zhenhailong Wang @zhenhailongW

about 4 years ago

Can GPT-3 understand videos? Glad to share our new work VidIL on prompting LLMs to understand videos using image descriptors (frame caption + visual token). We show strong few-shot video-to-text generation ability WITHOUT the need to train on ANY videos: https://t.co/96zmEqGe1n

3

67

12

17

0

25

2

5

0

Jie Lei @jayleicn

over 3 years ago

@HanGuo97 @ericxing @yoonrkim @ZhitingHu @mohitban47 Big Congrats!

1

0

Jie Lei @jayleicn

over 3 years ago

@swarnaNLP @Google @mohitban47 @uncnlp @unccs @mishumausam Big Congrats!🎊

1

0

Jie Lei @jayleicn

over 3 years ago

Neat idea - directly using audio and video signals for learning vision language models.

AK @_akhaliq

over 3 years ago

TVLT: Textless Vision-Language Transformer abs: https://t.co/psa2t7q8bD github: https://t.co/piHjmeKdsa

0

161

28

33

0

12

5

0

Jie Lei @jayleicn

almost 4 years ago

Check out our #ECCV2022 oral paper on efficient long-range video retrieval using sparse frame+audio.

Yan-Bo Lin @yblin98

almost 4 years ago

🥳🥳 Check out our #ECCV2022 oral paper. We propose ECLIPSE 🌒, which integrates audio🔊🎵 into the popular CLIP model to make long-range video retrieval 2.92x faster and 2.34x more memory-efficient. https://t.co/2YpUIpVWlm https://t.co/B7mWFxBcm3 w. @jayleicn @mohitban47 @gberta227 🧵👇

1

49

8

0

19

4

0
