Today's a special day for me! We released Nemotron-Personas-Korea, the 1st Korean persona dataset🇰🇷💚 Built the largest persona PGM ever from 62 census data, capturing up to 10^46 states to closely simulate Korea. Already trending Top5 on 🤗 plz hit like❤️ https://t.co/JmpC5o1o86
I’m returning to Berkeley as a Ph.D. student in EECS, specializing in Computer Vision!
I am grateful for my mentors/family/friends who have guided me through this process. I can't wait to continue building exciting work in @berkeley_ai. Stay tuned!!
We are presenting "Generate, but Verify". Our backtracking scheme significantly reduces VLM hallucinations.
See how we implemented this in a single VLM architecture.
Visit our poster session to chat with us!
📄: https://t.co/3Znp7uT8yT
@NeurIPSConf
Excited to see @deepseek_ai push the generation-verification idea to IMO-level math (insane!)
We’ve been exploring the same flow earlier this year and it really cuts VLM hallucinations!
NeurIPS folks, come chat at our poster: Thu Dec 4, 4:30–7:30 pm, Exhibit Hall C/D/E, #4618.
As an undergrad, I am super fortunate to learn from and work w Patrick, who has so many ideas for surfacing exciting problems in AI that need to be solved someday
and this essay is amazing - check it out!
Accidentally wrote a blog post on dynamic human-AI interaction this week, sharing some tentative ideas I find interesting in this field and the connection to our REVERSE-VLM. I’ll be presenting it at NeurIPS next week.
Happy to chat more in San Diego 🙂
https://t.co/0fPibuxifk
Attending EMNLP2025 (Suzhou)!
🍀 Showing our work on visual puzzles with @_dmchan, and supporting the community as a volunteer
Happy to meet everyone and to make new connections!
@emnlpmeeting #EMNLP2025
https://t.co/CZxI6c7dpC : An exciting dataset to test VLM capabilities
✨Introducing ECHO, the newest in-the-wild image generation benchmark!
You’ve seen new image models and new use cases discussed on social media, but old benchmarks don’t test them!
We distilled this qualitative discussion into a structured benchmark.
🔗 https://t.co/wJmmEY8TFQ
I really LOOOOOOVE making logos for all my papers (and events... see the spoof BAIR one lol)
At this point it’s basically turned into a whole TV show cast: SESAME BUN, Haystack, Puzzle, REVERSE guard, and the latest: Dugtrio by the street! 🍔 🌾 🧩 🔥🔧 👀
Stay tuned for the paper (plan to release in a week or two!)
NeurIPS 2025 ✅ Our generate-verify de-hallucination paper is in! ✔️ DFS-backtracking–like tricks fix VLM hallucinations ✔️ Explicit confidence targets matter (we stressed this before @OpenAI’s “Why LMs Hallucinate”)
👉 Check it out: https://t.co/k9utaMjUzS
See u all at SD!
🔥 Excited to share that my first paper “Puzzled by Puzzles: When Vision-Language Models Can’t Take a Hint” has been accepted to main conference #EMNLP2025 🎉
📝 Paper: https://t.co/CZxI6c6FA4
Follow the thread below and stay tuned if interested
🔍 Just dropped: “Puzzled by Puzzles: When Vision-Language Models Can’t Take a Hint” 👉 https://t.co/CZxI6c6FA4
Puns + pictures + positioning = a nightmare for today’s AI.
These models just don’t get it (yet). 😵‍💫
Check out the 🧵 to see our findings (1/4)
#AI #Multimodal #VLM
We break down:
🌀 Confusion types
❌ Failure patterns
📉 Sensitivity to hints
Even state-of-the-art VLMs struggle with logic that feels intuitive to us.
Dive in and see where today’s models fall short — and why.
🧩 https://t.co/t4b4cNEXyH
#AI #Rebus #VLM
We all learned DFS in undergrad — but did you know it can fix hallucinations in VLMs?
💡 Meet REVERSE-VLM: a self-correcting model using DFS-style backtracking + resampling
📉 12% fewer hallucinations (CHAIR-MSCOCO)
📈 28% more accurate (HaloQuest)
🔗 https://t.co/ZrizoRvvF9
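For anyone wondering what "DFS-style backtracking + resampling" looks like in general, here is a toy sketch in Python. It is not the REVERSE-VLM implementation: `propose`, `is_suspect`, and `max_retries` are hypothetical stand-ins for a model's token sampler, a verifier that flags likely hallucinations, and a retry budget. The idea shown is only the control flow: reject a flagged token and resample at the same position, and if the retries run out, back up to the previous position and try again from there.

```python
import random
from typing import Callable, List, Optional


def generate_with_backtracking(
    propose: Callable[[List[str], random.Random], str],
    is_suspect: Callable[[List[str], str], bool],
    max_len: int,
    max_retries: int = 3,
    seed: int = 0,
) -> Optional[List[str]]:
    """Toy generate-then-verify loop with depth-first backtracking.

    propose    -- hypothetical sampler: (prefix, rng) -> next token
    is_suspect -- hypothetical verifier: True if the token looks hallucinated
    Returns a sequence of max_len accepted tokens, or None if every
    branch within the retry budget was rejected.
    """
    rng = random.Random(seed)

    def dfs(prefix: List[str]) -> Optional[List[str]]:
        if len(prefix) == max_len:
            return prefix
        for _ in range(max_retries):
            token = propose(prefix, rng)
            if is_suspect(prefix, token):
                continue  # reject and resample at this position
            result = dfs(prefix + [token])
            if result is not None:
                return result
        # Retry budget exhausted here: signal the caller to backtrack.
        return None

    return dfs([])


if __name__ == "__main__":
    vocab = ["dog", "frisbee", "grass", "unicorn"]  # made-up tokens
    flagged = {"unicorn"}  # pretend the verifier knows this is not in the image

    caption = generate_with_backtracking(
        propose=lambda prefix, rng: rng.choice(vocab),
        is_suspect=lambda prefix, token: token in flagged,
        max_len=5,
        max_retries=8,
    )
    print(caption)
```

In a real system the verifier would be a learned confidence signal rather than a lookup set, and the retry step might change the sampling temperature or rewrite the prompt. Those details are left out of this sketch.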