Akshit Singh @akshit_fbd - Twitter Profile

Pinned Tweet

8 months ago

Introducing TTRV: Test-Time Reinforcement Learning for Vision-Language Models 🧠📸 Most RL methods need labeled data — but humans learn directly from experience. We bring that ability to VLMs. 🚀 TTRV adapts models on-the-fly at inference, with no labels.

akshit_fbd's tweet photo. Introducing TTRV: Test-Time Reinforcement Learning for Vision-Language Models 🧠📸

Most RL methods need labeled data — but humans learn directly from experience.

We bring that ability to VLMs.
🚀 TTRV adapts models on-the-fly at inference, with no labels. https://t.co/nelxf4ly3o

2

4

3

1

623

akshit_fbd retweeted

Hilde Kuehne 🛫 @cvpr2026

@HildeKuehne

3 days ago

Presenting today @CVPR Poster 422 11:45-13:45: TTRV: Test-Time Reinforcement Learning for Vision Language Models Can a VLM improve itself at test time without any labels or external supervision? We show that the answer is YES. 👇 Poster 422 11:45-13:45 https://t.co/he1vTMIe3H

1

24

4

6

2K

akshit_fbd retweeted

Thao Nguyen (Shibe) @thaoshibe

3 days ago

give me your best #CVPR2026 poster me first: i shamelessly nominate myself 🤡🫣💀🤣 @CVPR @CVPRConf #CVPR

4

89

7

17

8K

akshit_fbd retweeted

Zihan "Zenus" Wang

@wzenus

6 days ago

Time for this meme again :)

10

472

51

312

65K

akshit_fbd retweeted

Paul Gavrikov @ CVPR 🇺🇸

@PaulGavrikov

9 days ago

Tomorrow I will travel to Denver, CO for #CVPR2026 🏔️🤖🇺🇸 I'm incredibly excited to dive into the latest research, catch up with old friends, and make new connections. If you are attending and want to chat about Computer Vision, Multi-modal LLMs, or just grab a coffee, let's connect! Please also step by our 3 posters: "VisualOverload: Visual Understanding of VLMs in Really Dense Scenes" Paul Gavrikov, Wei Lin, Jehanzeb Mirza, Soumya Jahagirdar, Muhammad Huzaifa, Sivan Doveh, Serena Yeung-Levy, James Glass, Hilde Kuehne 📍 Main Poster Session 6 on June 7th, 15:30 PM - 17:30 PM, Poster 431 (ExHall A) "TTRV: Test-Time Reinforcement Learning for Vision Language Models" Akshit Singh, Shyam Marjit, Wei Lin, Paul Gavrikov, Serena Yeung-Levy, Hilde Kuehne, Rogerio Feris, Sivan Doveh, James Glass, Jehanzeb Mirza 📍 Main Poster Session 5 on June 7th, 11:45 AM - 13:45 PM Poster 422 (ExHall F) "When Negation Is a Geometry Problem in Vision-Language Models" Fawaz Sammani, Tzoulio Chamiti, Paul Gavrikov, Nikos Deligiannis 📍 MAR Workshop Poster Session June 4th, 11:50 AM - 12:30 PM, Poster 179 See you in Denver!

PaulGavrikov's tweet photo. Tomorrow I will travel to Denver, CO for #CVPR2026 🏔️🤖🇺🇸

I'm incredibly excited to dive into the latest research, catch up with old friends, and make new connections.
If you are attending and want to chat about Computer Vision, Multi-modal LLMs, or just grab a coffee, let's connect! Please also step by our 3 posters:

"VisualOverload: Visual Understanding of VLMs in Really Dense Scenes"
Paul Gavrikov, Wei Lin, Jehanzeb Mirza, Soumya Jahagirdar, Muhammad Huzaifa, Sivan Doveh, Serena Yeung-Levy, James Glass, Hilde Kuehne
📍 Main Poster Session 6 on June 7th, 15:30 PM - 17:30 PM, Poster 431 (ExHall A)

"TTRV: Test-Time Reinforcement Learning for Vision Language Models"
Akshit Singh, Shyam Marjit, Wei Lin, Paul Gavrikov, Serena Yeung-Levy, Hilde Kuehne, Rogerio Feris, Sivan Doveh, James Glass, Jehanzeb Mirza
📍 Main Poster Session 5 on June 7th, 11:45 AM - 13:45 PM Poster 422 (ExHall F)

"When Negation Is a Geometry Problem in Vision-Language Models"
Fawaz Sammani, Tzoulio Chamiti, Paul Gavrikov, Nikos Deligiannis
📍 MAR Workshop Poster Session June 4th, 11:50 AM - 12:30 PM, Poster 179

See you in Denver!

0

9

1

0

402

akshit_fbd retweeted

Peyman Milanfar

@docmilanfar

about 2 months ago

Outstanding researchers excel at the art of finding problems right at the edge of our understanding — that can actually be solved.

6

434

31

104

28K

akshit_fbd retweeted

Anthropic

@AnthropicAI

about 1 month ago

New Anthropic research: Natural Language Autoencoders. Models like Claude talk in words but think in numbers. The numbers—called activations—encode Claude’s thoughts, but not in a language we can read. Here, we train Claude to translate its activations into human-readable text.

594

17K

2K

9K

2M

akshit_fbd retweeted

John Hewitt @johnhewtt

about 1 month ago

New paper! Subliminal learning—transferring hidden signals between language models—is more powerful than we thought. By biasing the teacher with a steering vector instead of a prompt, we achieve strong, consistent transfer, which we use to study its mechanisms. w/@GeorgeMorgulis

johnhewtt's tweet photo. New paper! Subliminal learning—transferring hidden signals between language models—is more powerful than we thought. By biasing the teacher with a steering vector instead of a prompt, we achieve strong, consistent transfer, which we use to study its mechanisms. w/@GeorgeMorgulis https://t.co/9rMRB3BbjE

6

302

35

198

20K

akshit_fbd retweeted

紫云

@dviolettchan

about 2 months ago

AI research used to be in a unique position. Unlike biology or materials science, it was not extremely resource-intensive: 1-2 GPUs could still support top-venue-level work. Unlike math or theoretical physics, it was also not purely intelligence-intensive. (1/3)

7

744

21

220

95K

akshit_fbd retweeted

Yulu Gan

@yule_gan

3 months ago

Simply adding Gaussian noise to LLMs (one step—no iterations, no learning rate, no gradients) and ensembling them can achieve performance comparable to or even better than standard GRPO/PPO on math reasoning, coding, writing, and chemistry tasks. We call this algorithm RandOpt. To verify that this is not limited to specific models, we tested it on Qwen, Llama, OLMo3, and VLMs. What's behind this? We find that in the Gaussian search neighborhood around pretrained LLMs, diverse task experts are densely distributed — a regime we term Neural Thickets. Paper: https://t.co/rFJz2kVEOA Code: https://t.co/HAmonfpXIA Website: https://t.co/QZ6AMIsKCw

yule_gan's tweet photo. Simply adding Gaussian noise to LLMs (one step—no iterations, no learning rate, no gradients) and ensembling them can achieve performance comparable to or even better than standard GRPO/PPO on math reasoning, coding, writing, and chemistry tasks. We call this algorithm RandOpt.

To verify that this is not limited to specific models, we tested it on Qwen, Llama, OLMo3, and VLMs.

What's behind this? We find that in the Gaussian search neighborhood around pretrained LLMs, diverse task experts are densely distributed — a regime we term Neural Thickets.

Paper: https://t.co/rFJz2kVEOA
Code: https://t.co/HAmonfpXIA
Website: https://t.co/QZ6AMIsKCw

89

3K

433

3K

697K

akshit_fbd retweeted

Chelsea Finn

@chelseabfinn

3 months ago

We added short-term visual memory + long-term text memory to pi models. 🤖 Enables robots to: - complete tasks up to 15 min long - cook grilled cheese while keeping track of time - adapt in-context Paper & videos: https://t.co/mcYlotvJlO

12

491

47

137

31K

akshit_fbd retweeted

elvis

@omarsar0

3 months ago

Can AI agents agree? Communication is one of the biggest challenges in multi-agent systems. New research tests LLM-based agents on Byzantine consensus games, scenarios where agents must agree on a value even when some participants behave adversarially. The main finding: valid agreement is unreliable even in fully benign settings, and degrades further as group size grows. Most failures come from convergence stalls and timeouts, not subtle value corruption. Why does it matter? Multi-agent systems are being deployed in high-stakes coordination tasks. This paper is an early signal that reliable consensus is not an emergent property you can assume. It needs to be designed explicitly. Paper: https://t.co/3fllhchiKX Learn to build effective AI agents in our academy: https://t.co/1e8RZKs4uX

omarsar0's tweet photo. Can AI agents agree?

Communication is one of the biggest challenges in multi-agent systems.

New research tests LLM-based agents on Byzantine consensus games, scenarios where agents must agree on a value even when some participants behave adversarially.

The main finding: valid agreement is unreliable even in fully benign settings, and degrades further as group size grows. Most failures come from convergence stalls and timeouts, not subtle value corruption.

Why does it matter?

Multi-agent systems are being deployed in high-stakes coordination tasks. This paper is an early signal that reliable consensus is not an emergent property you can assume. It needs to be designed explicitly.

Paper: https://t.co/3fllhchiKX

Learn to build effective AI agents in our academy: https://t.co/1e8RZKs4uX

39

337

62

293

43K

akshit_fbd retweeted

Fatima Institute for Global AI Research @fatimafellowshp

3 months ago

Fatima Institute is offering a free AI research opportunity! If you’re from a developing country and planning on applying to grad school, you can work on a research project with a mentor, get help with applications, and $1,000 in compute credits. Apply https://t.co/Q4G4Ew3ugn

fatimafellowshp's tweet photo. Fatima Institute is offering a free AI research opportunity!

If you’re from a developing country and planning on applying to grad school, you can work on a research project with a mentor, get help with applications, and $1,000 in compute credits.

Apply https://t.co/Q4G4Ew3ugn https://t.co/BL6YubrJW4

4

160

42

177

9K

akshit_fbd retweeted

Hilde Kuehne 🛫 @cvpr2026

@HildeKuehne

3 months ago

🚀4 Papers accepted to CVPR 2026! Checkout: VOLD: https://t.co/sPXnXi4mXL AMoE: https://t.co/FXEn1CfUz4 VisualOverload: https://t.co/iRsFcDq7wk TTRV: https://t.co/NYWdbrkzCv Detailed posts follow soon! Big congrats @BousselhamWalid Sofian Caybouti @PaulGavrikov @akshit_fbd

1

104

11

24

5K

akshit_fbd retweeted

Srinath Sridhar @drsrinathsridha

3 months ago

Vincent's post is timely. I wrote a follow up examining some of his arguments and providing an alternative perspective. https://t.co/D3BBGiV938 Yes, the lesson is bitter, but I believe the flavor is also sweet. Thread 🧵

drsrinathsridha's tweet photo. Vincent's post is timely. I wrote a follow up examining some of his arguments and providing an alternative perspective.

https://t.co/D3BBGiV938

Yes, the lesson is bitter, but I believe the flavor is also sweet. Thread 🧵 https://t.co/QPxAPuAPTL

4

98

19

82

14K

Akshit Singh @akshit_fbd

4 months ago

@MNaseerSubhani great! Can you share your paper

1

0

1K

akshit_fbd retweeted

Maria Brbic @mariabrbic

4 months ago

Are neural nets across modalities really converging to the same representation as they scale, as the Platonic Representation Hypothesis suggests? We show that common representational similarity metrics are confounded by network width & depth. We propose a permutation-based null calibration that fixes this. Result❓ • Global convergence largely disappears. • Local neighborhoods persist. We propose the alternative Aristotelian Representation Hypothesis: Neural networks, trained with different objectives on different data and modalities, are converging to shared local neighborhood relationships Very proud of @FabianGroger and @ShuoWen18 for this work! Paper: https://t.co/GmkhwsiN1N Webpage: https://t.co/xaI31BU2FS Code: https://t.co/5qItdzRBZP

12

596

91

461

70K

akshit_fbd retweeted

Amy Tam

@amytam01

4 months ago

https://t.co/b9f99xbcMx

195

8K

925

14K

3M

akshit_fbd retweeted

Vincent Sitzmann @vincesitzmann

4 months ago

In my recent blog post, I argue that "vision" is only well-defined as part of perception-action loops, and that the conventional view of computer vision - mapping imagery to intermediate representations (3D, flow, segmentation...) is about to go away. https://t.co/aFmE9CHHau

44

1K

165

791

392K

akshit_fbd retweeted

Alif Munim (d/acc)

@alifmunim

4 months ago

We trained a foundation model on 18 million heart ultrasound videos to predict structure instead of pixels. Introducing EchoJEPA, the first foundation-scale JEPA for medical video. Paper: https://t.co/iN7MBfSBFW Code: https://t.co/n4svDzRM7Q 🧵 1/n

58

3K

377

2K

590K

akshit_fbd retweeted

Gabriele Berton

@gabriberton

4 months ago

This PR will go down in history as the first ever argument between people and an AI It gets completely dystopian very quickly Here's what happened: 1) An AI (OpenClaw agent) made a PR on matplotlib GitHub repo 2) Maintainer says AI's PRs are not allowed

gabriberton's tweet photo. This PR will go down in history as the first ever argument between people and an AI

It gets completely dystopian very quickly

Here's what happened:

1) An AI (OpenClaw agent) made a PR on matplotlib GitHub repo

2) Maintainer says AI's PRs are not allowed https://t.co/I8VTqPds2n

2

38

1

14

14K

Akshit Singh

@akshit_fbd

Last Seen Users on Sotwe

Trends for you

Most Popular Users