Visual Inference Lab @visinf - Twitter Profile

about 15 hours ago

Work by: @ChristophR1996*, @olvr_hhn*, @neekans, @lealtaixe, C. Rupprecht, D. Cremers and @stefanroth 📄Paper: https://t.co/JStX3RQLEd 🌍Project Page: https://t.co/FpSVtkyiaZ 💻Code: https://t.co/D7gczjLFBH 📹Video: https://t.co/32K3HoXSMg 👁️CVPR: Friday, Poster Session 2 #333

Visual Inference Lab @visinf

about 15 hours ago · Denver

📢 [CVPR’26] Can we learn to detect, segment, and track every object in a video without human supervision? Yes, we introduce VideoCUPS, the first unsupervised video panoptic segmentation (VPS) method: 1. Get pseudo-labels from monocular videos. 2. Train a VPS model on them.

Visual Inference Lab @visinf

about 15 hours ago

When fine-tuned with just 10% of labels, VideoCUPS already matches a fully supervised model trained on all Cityscapes-VPS labels, and outperforms the DINO-initialized baseline significantly.

Visual Inference Lab @visinf

1 day ago

[6/6] Multimodal Knowledge Distillation for Egocentric Action Recognition Robust to Missing Modalities @dustin_carrion*, M. Santos-Villafranca*, A. Perez-Yus, J. Bermudez-Cameo, J.J. Guerrero, @schaub_simone Paper: https://t.co/nzKtI4rcMn Project Page: https://t.co/wyhQxxq8R2

Visual Inference Lab @visinf

1 day ago

[1/6] 📢 We are in Denver at #CVPR2026 presenting 5 papers!

Visual Inference Lab @visinf

1 day ago

[5/6] MUFASA: A Multi-Layer Framework for Slot Attention S. Bock*, L. Schüßler*, @krissingh_, @schaub_simone, @stefanroth Paper: https://t.co/5MWZUczdoH Project Page: https://t.co/p5FvnQWGNo

Visual Inference Lab @visinf

2 days ago

[3/3] Project page: https://t.co/QH7qZ6Ytq6 Poster (ICRA): Thursday, 03:00 PM, P207 (Hall C - ThI2I) Poster (CVPRW): Thursday, 10:00 AM, A2A-MML Workshop, Hall A

Visual Inference Lab @visinf

2 days ago

[1/3] Multimodal Knowledge Distillation for Egocentric Action Recognition Robust to Missing Modalities by @dustin_carrion*, Maria Santos-Villafranca*, Alejandro Perez-Yus, Jesus Bermudez-Cameo, Jose J. Guerrero, and @schaub_simone

Visual Inference Lab @visinf

2 days ago

[2/3] KARMMA is a multimodal-to-multimodal distillation framework for egocentric action recognition that does not require modality-aligned data and supports any subset of modalities at inference. It produces a lightweight student robust to missing modalities without retraining.

visinf retweeted

Nikita Araslanov

@neekans

9 days ago

In-context learning suggests that a model has learned versatile representations. What if we use in-context learning itself as a training task for visual representations? 📣 Introducing 𝗟𝗜𝗟𝗔: 𝗟𝗶𝗻𝗲𝗮𝗿 𝗜𝗻-𝗖𝗼𝗻𝘁𝗲𝘅𝘁 𝗟𝗲𝗮𝗿𝗻𝗶𝗻𝗴 ✨ @CVPR 2026 Oral ✨ 𝗟𝗜𝗟𝗔 trains on videos without manual annotation. Key idea: An optimal linear mapping that predicts dense cues (e.g. depth, flow), estimated on one video frame, should also predict the corresponding cues of another frame from the same video. This yields compelling results on dense vision tasks: video object segmentation, (zero-shot) semantic segmentation and surface normal estimation. Paper, code, models and demo: https://t.co/Xn2SgskKQ8 Joint work with @ma_sundermeyer, Hidenobu Matsuki, David Joseph Tan and @fedassa (and special thanks to David and Federico for hosting my research visit at Google). #cvpr2026 @Google @MunichCenterML @tumcvg @TU_Muenchen
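
The key idea stated in the tweet above (a linear map fitted on one frame should also predict the cues of another frame) can be illustrated with a small NumPy sketch. This is not the LILA implementation: the closed-form ridge solver, the function names, the `ridge` parameter, and the synthetic "features" and "cues" are all assumptions made for illustration only.

```python
import numpy as np


def fit_linear_map(features, cues, ridge=1e-3):
    """Closed-form ridge regression (an assumed choice of solver).

    features: (num_pixels, feat_dim) per-pixel features of one frame
    cues:     (num_pixels, cue_dim)  dense cues (e.g. depth) of the same frame
    Returns W of shape (feat_dim, cue_dim) minimizing
    ||features @ W - cues||^2 + ridge * ||W||^2.
    """
    feat_dim = features.shape[1]
    gram = features.T @ features + ridge * np.eye(feat_dim)
    return np.linalg.solve(gram, features.T @ cues)


def cross_frame_error(feat_a, cues_a, feat_b, cues_b, ridge=1e-3):
    """Fit the map on frame A, then measure how well it predicts frame B's cues."""
    w = fit_linear_map(feat_a, cues_a, ridge)
    pred_b = feat_b @ w
    return float(np.mean((pred_b - cues_b) ** 2))


# Synthetic stand-ins for two frames of one video (hypothetical data):
# both frames share the same underlying feature-to-cue relationship.
rng = np.random.default_rng(0)
true_map = rng.normal(size=(16, 1))
feat_a = rng.normal(size=(200, 16))
feat_b = rng.normal(size=(200, 16))
cues_a = feat_a @ true_map
cues_b = feat_b @ true_map

# Low error: the map estimated on frame A transfers to frame B.
error = cross_frame_error(feat_a, cues_a, feat_b, cues_b)

# Much higher error when frame B's cues are unrelated to its features.
cues_unrelated = rng.normal(size=(200, 1))
error_unrelated = cross_frame_error(feat_a, cues_a, feat_b, cues_unrelated)
```

In a training setup, an error of this kind would presumably act as a loss on the network producing the features; how LILA does this in practice is not described in the tweet, so the linked paper is the place to check.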

visinf retweeted

Claudia Cuttano @ClaudiaCuttano

12 days ago

✨#CVPR2026 Oral ✨ A tale of a failed experiment: what if you fine-tune DINOv2 on sparse keypoints, beat every benchmark, only to discover it performs worse than the original frozen model on novel keypoints? 🚀MARCO closes this gap: a unified model for generalisable correspondences https://t.co/vE62YiTVfd

visinf retweeted

Gabriele Trivigno @gabTrivv

about 1 month ago

What if a model could learn dense semantic matches from just a handful of annotated landmarks, while still generalizing to unseen keypoints and categories — and running 10× faster than diffusion-based approaches? MARCO is selected as an Oral at #CVPR2026! A unified model for generalizable semantic correspondence, built on DINOv2⭐️ 👉 Try our model: https://t.co/Zvt4QTRVJQ

visinf retweeted

Claudia Cuttano @ClaudiaCuttano

about 2 months ago

✨ As a first-year PhD student, I used to wonder what it must feel like to have a paper selected as an Oral at #CVPR. Today, I’m experiencing that feeling twice! I’m beyond happy to share that both of my first-author papers have been selected as #Oral at #CVPR2026 🎉

visinf retweeted

Gabriele Trivigno @gabTrivv

about 2 months ago

🔥 Can in-context segmentation emerge directly from frozen DINOv3 features? At #CVPR2026, we present INSID3: Training-Free In-Context Segmentation with DINOv3, a collaboration between PoliTo, TU Darmstadt and TU Munich. A training-free approach that generalizes from object-level to part-level and personalized segmentation, across natural, medical, underwater, and aerial domains. Check it out: https://t.co/AaMRbfjLyn
