Excited to announce that I will be starting a CS PhD at @MIT_CSAIL in the fall w/ NSF GRFP funding, advised by @sarameghanbeery! After graduating this week with my BS+MEng from MIT, I'm going to be at @ETH_en this summer. Please send me any recommendations for the area!
I’ll be attending #CVPR2026 this week, June 2–7, and presenting:
Vibe Spaces for Creatively Connecting and Expressing Visual Concepts
🗓️ Saturday June 6 at 4:45 pm
📍 Poster 44 in Exhibit Hall A
Happy to chat at @CVPR – come to our poster or DM if you’re interested!
Excited to share ID-Sim, our identity-focused similarity metric, presenting at #CVPR2026 this week in Denver! 🎉
Humans are remarkably good at distinguishing highly similar objects across different contexts.
We asked: can we train a metric that does the same?
LLMs can learn to self-generate curricula for hard problems that they can't yet solve! Using meta-RL, with rewards grounded in learning progress, models produce their own stepping stones that kickstart learning on hard problems where direct RL plateaus.
Poster at the ICLR RSI workshop today!
NVFP4 allows models to be quantized to 4 bits without too much performance degradation, but can we push 4-bit performance even further?
Today, we're releasing a new class of low-precision block-scaled data types that natively adapt to your input data: for 4-bit quantization, IF4 (Int/Float 4) allows each scaled group of 16 values to be saved as FP4 or INT4 depending on which option offers less error. Selections are recorded using the scale factor’s sign bit, which is unused in NVFP4, allowing IF4 to offer better performance with no memory overhead!
Our data types provide better downstream accuracy in LLMs, they can be implemented efficiently in next-generation hardware accelerators, and they reveal some interesting insights about low-bit quantization! 🧵
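To make the per-group selection concrete, here is a minimal sketch of the idea as described above, not the released implementation. Assumptions: the scale is kept as a plain Python float rather than a low-precision scale factor, the INT4 grid is symmetric (-7..7), squared error is the selection criterion, a negative scale marks INT4, and the function names are hypothetical.

```python
import math

# Magnitudes representable by a 4-bit E2M1 float (the FP4 format used by NVFP4).
FP4_GRID = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]
# Symmetric INT4 magnitudes 0..7 (assumption: the -8 code is left unused).
INT4_GRID = [float(i) for i in range(8)]


def _quantize(block, grid):
    """Scale the block so its largest magnitude hits the top of the grid,
    round every value to the nearest grid point, and report squared error."""
    amax = max(abs(x) for x in block)
    scale = amax / grid[-1] if amax > 0 else 1.0
    dequantized = []
    for x in block:
        nearest = min(grid, key=lambda g: abs(abs(x) / scale - g))
        dequantized.append(math.copysign(nearest * scale, x))
    error = sum((a - b) ** 2 for a, b in zip(block, dequantized))
    return scale, dequantized, error


def quantize_block_if4(block):
    """Quantize one group (16 values in the post) as FP4 or INT4, whichever
    gives less error. The choice is carried in the sign of the returned scale:
    positive means FP4, negative means INT4 (assumed convention)."""
    fp_scale, fp_deq, fp_err = _quantize(block, FP4_GRID)
    int_scale, int_deq, int_err = _quantize(block, INT4_GRID)
    if int_err < fp_err:
        return -int_scale, int_deq
    return fp_scale, fp_deq


# Evenly spaced values suit the uniform INT4 grid.
uniform_block = [float(k) for k in range(-7, 8)] + [7.0]
# One outlier among small values suits the non-uniform FP4 grid.
outlier_block = [6.0] + [0.5] * 15

print(quantize_block_if4(uniform_block)[0])  # negative scale: INT4 selected
print(quantize_block_if4(outlier_block)[0])  # positive scale: FP4 selected
```

Because the format flag rides in the sign of the scale, the sketch stores nothing beyond what a single-format block-scaled scheme would store, which mirrors the "no memory overhead" point in the post.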
Targeted instruction tuning for LLMs involves selecting a subset of instructions from a candidate pool using a small query set from target tasks. Despite growing interest, we still lack guidance on what to select. Our new preprint brings clarity to this space (thread 👇).
I'll be at NeurIPS on workshop days, helping organize DCVLR (https://t.co/dyOBSkU31H) on Dec 6 from 11:30 am - 2:30 pm!
Please reach out to chat about data curation across modalities (especially scientific data), data-efficient learning, and DCVLR!
I am on the job market this year! My research advances methods for reliable machine learning from real-world data, with a focus on healthcare. Happy to chat if this is of interest to you or your department/team.
So excited to announce the DCVLR (Data Curation for Vision-Language Reasoning) competition at NeurIPS 2025, led by @Oumi_PBC and sponsored by @LambdaAPI!
🌟open-data 🌟
🤖 open-models 🤖
💻 open-source 💻
💪anyone can compete for free 💪
https://t.co/7FLCl255cK
🧵 1 / n
The submission portal is now OPEN to take part in this interesting @NeurIPSConf 2025 data curation competition!! This is the first open-data, open-models, open-source competition for data curation in vision-language reasoning -- learn more 👇
https://t.co/iFqY5docF1
If you are attending #ICML2025, check out our DataWorld workshop on Sat July 19. We have updated the website with more info on speakers & accepted papers! https://t.co/K3U540rqoe
Also happy to chat offline about all things ✨ data ✨
📢 Announcing our data-centric workshop at ICML 2025 on unifying data curation frameworks across domains!
📅 Deadline: May 24, AoE
🔗 Website: https://t.co/K3U540rqoe
We have an amazing lineup of speakers + panelists from various institutions and application areas.
What happens when models see the world as humans do?
In our #NeurIPS2024 paper we show that aligning to human perceptual preferences can *improve* general-purpose representations!
📝: https://t.co/IPfJUos2O5
🌐: https://t.co/RWjqXmfUiy
💻: https://t.co/XsoJ2cbYDA
(1/n)
The BeeryLab is busy at @eccvconf today!
@juliachae_ and @EdwardVendrow co-organized the @CV4E_ECCV workshop, happening 8:30-1, where I'll moderate the panel.
Kai van Brunt and @__justinkay will present fish counting in sonar, and Jae Joong Lee will present 3D trees
...
Synthetic data has huge potential to drive new improvements in training and evaluation for computer vision. Interested in learning more about advancements and challenges? Join us at the SynData4CV Workshop at #CVPR2024 tomorrow (June 18)!
https://t.co/b2VVcAGETg
@CVPR AI for Conservation happy hour on Monday! Open to anyone working on or interested in the intersection of CV/ML and ecology, conservation, sustainability, climate, etc
Hosted w/ @sarameghanbeery @timmhaucke
Delighted to share one of my favorite pieces of work, now published: PURPLE, a method to estimate disparities in the prevalence of underreported outcomes, in women’s health and healthcare more broadly.
https://t.co/KNJqD1O2Xb