Tanush @ CVPR2026 @tanushyy - Twitter Profile

Tanush @ CVPR2026

@tanushyy

1 day ago

Come find us at the KnowledgeMR, VidLLMs, and CVSports workshops today at @CVPR!

Jae Sung Park @CVPR2026

@jjaesungpark

1 day ago

VideoNet will appear as a @CVPR Highlight ✨ + at 3 workshops TODAY! Multimodal AI is improving fast, but can it tell apart moves only a domain expert could name? ✒️🪀 You probably got the hang of it from the clips below 😉 Can models do the same with few-shot examples? Come find out 👇

🔗 Website: https://t.co/Y9AOF3Rid0
📍 Poster: Fri, Jun 5, 4:00–6:00 PM
🗓️ Workshops (all today, Jun 4):
- KnowledgeMR (🏆 Best Paper Award candidate, talk by @tanushyy)
- CVSports
- VidLLMs


Tanush @ CVPR2026

@tanushyy

29 days ago

@hq_fang Thanks! You as well for MolmoAct 2 :)


Tanush @ CVPR2026

@tanushyy

29 days ago

@jason_lee328 Thanks Jason!


Tanush @ CVPR2026

@tanushyy

29 days ago

@weikaih04 Thanks Weikai!



Tanush @ CVPR2026

@tanushyy

about 1 month ago

Huge thanks to @mrezasal1, @jjaesungpark, @RamanujanVivek, @HannaHajishirzi, @YejinChoinka, Ali Farhadi, @rohun_t, @RanjayKrishna 🫶 & to Apple (@OncelTuzel) for the funding 🙏 This release is the culmination of 15 months of work 😅 and would not have been possible without them 🫡


Tanush @ CVPR2026

@tanushyy

about 1 month ago

Remember action recognition? The days of trying to climb on Kinetics?👻 Announcing VideoNet, a CVPR 2026 Highlight 🎉 which revitalizes action recognition in the VLM era Explore our data with this fun, interactive demo: https://t.co/W53aBi3QAX (1/8) 🧵


Tanush @ CVPR2026

@tanushyy

about 1 month ago

Try VideoNet today! 🌐 Website: https://t.co/mVAlEp8KQq 📜 Paper: https://t.co/nmWtOacw7c 🖥️ Code: https://t.co/hoXKrlWJlw ▶️ Demo: https://t.co/dKAKTLtH2Z


tanushyy retweeted

Weikai Huang @ CVPR 2026

@weikaih04

about 2 months ago

Thrilled to announce our latest project at @allen_ai @RAIVNLab: WildDet3D

Humans understand objects in 3D effortlessly: we see a mug on a desk, judge the distance to a parked car, or estimate the height of a building across the street. For CV / Robotics models, this remains surprisingly hard.

We've built great models that each handle a piece of the puzzle: FoundationPose for 6-DoF pose over tabletops, MoGe 2 for accurate metric depth estimation, SAM for 2D segmentation and tracking. But they're fragmented. Each solves one sub-task, and none gives you the full picture: where is this object in 3D, how big is it, and how is it oriented?

Monocular 3D object detection is exactly this task: recovering the full 3D bounding box of any object from a single RGB image. It's the missing link that connects 2D perception to real-world 3D understanding for robotics, AR/VR, and embodied AI.

So why hasn't anyone cracked open-world 3D detection? Data. Existing 3D datasets (Omni3D, COCO3D) cover fewer than 100 categories, locked to driving corridors and indoor rooms. The annotation methods (BEV labelling, point cloud labelling) fundamentally don't scale to in-the-wild scenes where you don't have LiDAR or a well-reconstructed point cloud. And in-the-wild objects are much more diverse in size and pose than vehicles and furniture.

To tackle this, we designed a human-in-the-loop pipeline. We build complex pseudo-3D box generators using different algorithms/models. Then 1700+ human annotators from Prolific select the best candidate and verify quality. After several months of work with thousands of annotators, we got the result: WildDet3D-Data, with 1M total images, 13.5K object categories, and 100K human-verified 3D detection images. That's 138x more category coverage than Omni3D. Street food carts, violins, traffic cones, sculptures: objects no 3D dataset has ever covered.

With this data, we trained WildDet3D, a single geometry-aware architecture built on SAM 3 and LingBot-Depth that unifies every way you'd want to interact with a 3D detector:
- Text: "find all chairs"
- Box prompt: click a 2D box, get its 3D box (geometric, one-to-one)
- Exemplar prompt: draw one box, find all similar objects (one-to-many)
- Point prompt: click on an object

And when you have extra depth (LiDAR, stereo, anything), just pass it in. The model fuses it and gets substantially better: +20.7 AP on average. No depth? It works fine without it.

Results on our new in-the-wild benchmark (WildDet3D-Bench, 700+ open-world categories): 22.6 AP text / 24.8 AP box, up from 2.3 AP for the previous best. With depth: 41.6 AP text / 47.2 AP box. Also SOTA on Omni3D (34.2 AP text / 36.4 AP box) with 10x fewer training epochs, and strong zero-shot transfer to Argoverse 2 and ScanNet (40.3 / 48.9 ODS).


Tanush @ CVPR2026

@tanushyy

4 months ago

@ethnlshn SPD author strikes gold again‼️


Tanush @ CVPR2026

@tanushyy

4 months ago

@ethnlshn 🐐🐐


tanushyy retweeted

Ethan Shen

@ethanlshen

4 months ago

Today, we release SERA-32B, an approach to coding agents that matches Devstral 2 at just $9,000. It is fully open-source and you can train your own model easily - at 26x the efficiency of using RL. Paper: https://t.co/aeD6T2WW3O Here’s how 🧵


tanushyy retweeted

Ethan Kilbreath

@EthanArles

over 3 years ago

Total TDS/INT: Michael Penix Jr. 33/7, Stetson Bennett 26/6
Total YDS: Penix Jr. 4440, Bennett 3609
Total YDS/Game: Penix Jr. 370, Bennett 278
Passer Rating: Penix Jr. 155.5, Bennett 154.2
Adjusted Yards/Attempt: Penix Jr. 9.2, Bennett 9.0


tanushyy retweeted

John Gennaro

@johnmgennaro

over 3 years ago

If you’re unfamiliar, Chargers Pro Bowl OG Kris Dielman tried flying home from NY after suffering a concussion. He had a seizure, they had to do an emergency landing, and he never played again. They’re lucky he lived.
