Super stoked to share that our work is accepted to the main conference of ACL 2026!!
See you in sunny San Diego
#ACL2026 #NLProc
Paper thread below 🧵
🚨New Paper🚨
Learning to Plan and Orchestrate for Open-Ended Image Editing
Image editors handle concrete tasks, but struggle with abstract ones (e.g., adapt this ad for winter), which require coordinated edits.
We tackle this via planning and experiential tool learning.
🧵
🚨 Very happy to have advised this #icml2026 work led by @MaheshRam23629 exploring RL for cooperative reasoning in LLMs
tl;dr: post-training LLMs to play the card game Hanabi improves performance on unseen tasks like instruction following and temporal reasoning!
🔥 Our paper "Sparks of Cooperative Reasoning: LLMs as Strategic Hanabi Agents" was accepted to ICML 2026!
We show that post-training LLMs with RL to play a cooperative game teaches generalizable cooperative reasoning.
1/n
Hi #ml twitter!
I'm in San Francisco from 5/6 to 5/12 and I'd love to catch up over a coffee and chat about research and things y'all are working on!!
(post @NeurIPSConf submission hangover)
@andrew_atanov Very cool work Andrei and team!! I've been thinking a lot about adaptive representations for video, excited to read in detail over the weekend :)
We all knew LLM agents struggle to explore, but we had to eyeball it. We couldn't measure exploration errors. Until now. 🗺️
We built a policy-agnostic metric to quantify exploration and exploitation errors in LLM agents.
Spoiler: Exploration error is what kills agent performance in our setting 🧵 (1/8)
Excited to see Qwen-3.5-Omni evaluated on AV-SpeakerBench
From ~54 → ~71.5: open models are catching up in audiovisual reasoning.
Gemini still leads, but the gap is shrinking.
Special thanks to my mentor, @yu_zhuoran32720, and my advisor, @yong_jae_lee, from the Wisconsin AI Vision Lab, for making AV-SpeakerBench possible.
Two questions:
- "Is Qwen-3.5-Omni really open-source?"
- "Why no strong AV models yet from @OpenAI @AnthropicAI?"
AV reasoning is the next frontier.
Benchmark Project Page: https://t.co/GXNwPCuDlN
Qwen blog: https://t.co/imClscpV7E
@GoogleDeepMind @_gaganm A HUGE shoutout to my network for your overwhelming response to my internship search post :')
I had SO many wonderful conversations and I'm so thankful for all your generosity and kindness!
https://t.co/T0hpE0HB43
Hi ML Twitter!
My Summer 2026 internship unfortunately fell through last minute 😵‍💫
If your team is looking for interns, I'd love to connect - RTs appreciated
My website: https://t.co/rNih6t6Emb
🚨 I'm super stoked to share that I'll be spending this summer as a student researcher at the GenAI team of @GoogleDeepMind with @_gaganm, working on long-horizon tool use!!
🚨 New work with @Meta @RealityLabs
We introduce EGAgent, an agentic reasoning framework for very long video understanding powered by entity scene graphs
Why? With long multimodal data streams, agents must search and reason across multiple modalities!
🧵 (1/n)
If youโre working on benchmarks for cultural authenticity or methodology for AI pluralism, we want to see your work at the MAPS Workshop @CVPR 2026.
We are also hosting the Machine Translation for Vision (MTV) challenge, don't miss out!
Check out the full CFP below.
Hi computer vision fam!
Interested in generative models that suit a diverse plurality of human values and cultures?
Please submit to the @maps_cvpr workshop at #CVPR2026, led by the amazing @PShravannayak, which I'm helping co-organize!
Looking forward to your exciting works!!
📢 Submissions are now OPEN for our @CVPR Workshop: Multimodal Alignment for a Pluralistic Society (MAPS)!
Help us build AI that reflects global diversity, culture, and human values.
Mar 3 – Apr 10, 2026
Short papers (4 pgs)
https://t.co/ersHIrd6SY
#CVPR2026