Asst. prof. @BrownCSDept. Working on improving how humans teach computers. Weak supervision, zero-shot learning, few-shot learning, and high-level knowledge.
@BrownCSDept Master's student Ilana Nguyen speaks at a United Nations panel focused on ensuring that AI expands opportunity. Learn more at Brown CS Blog: https://t.co/BfYrBh7AQZ
Last week we launched Muse Spark at an acceptable risk level under our Advanced AI scaling framework, after multiple mitigation iterations. Today we’re releasing its first Safety & Preparedness Report documenting that decision.
This was a long, cross-team effort — from catastrophic risk assessment to day-to-day model behavior. We hope this contributes to transparent discussion of responsible development of personal superintelligence. Running the evals, it was fascinating to watch the model’s safety profile take shape.
Under the new framework, we’re also introducing our first assessment of loss of control risks — built on extensive threat modeling that’s still evolving.
The report’s dense and there’s a lot of work ahead. You can find the full report here: https://t.co/erjgFHz4uc— we’re eager to hear feedback and improve.
The first @TheOfficialACM conference on agentic AI systems just got a boost. @SnorkelAI is joining as a sponsor of @CAISconf this May in San Jose. Stanford AI Lab roots, production AI focus, and a shared belief that this community needs a rigorous home. https://t.co/MxpkxyPWGO
I thought "AI for Science" was something like AlphaFold, ie. using AI to creatively address computational bottlenecks for well articulated scientific problems.
Now I'm seeing more of "AI slop cosplaying as research paper", where the problems are fake, methods unverified, etc.
1/7 🧵 The GPT-4 technical report featured detailed calibration curves.
Since then, not a single major model release has reported calibration. The field quietly stopped measuring whether models know what they don't know.
Our new position paper argues this is a mistake. Here's why.
Targeted instruction tuning for LLMs involves selecting a subset of instructions from a candidate pool using a small query set from target tasks. Despite growing interest, we still lack guidance on what to select. Our new preprint brings clarity to this space (thread 👇).
Simple (proposed!) rule for terminology around synthetic data:
If a "synthetic generation" method uses model A to generate data that leads to gains on model B, where A >> B - this is distillation, not synthetic generation :)
The true technical challenge of synthetic data is to use model A, plus some cleverness around system architecture and/or human-in-the-loop input (e.g. context eng, review/filtering, editing), to produce data that improves model B where B >= A.
I am saddened by the loss of Joe Halpern. I still remember taking his Reasoning About Uncertainty class during my first year as a PhD student at @Cornell. Joe leaves behind a tremendous legacy, not only in his research, but the lives of so many students he touched along the way.
https://t.co/l4AZmHNUxK
This week we launched the Open Benchmarks Grant with a $3M initial commitment from @SnorkelAI + partner support from @huggingface@togethercompute@PrimeIntellect@PyTorch@harborframework & others, in order to close the evaluation gap in AI.
Our ability to measure AI has been outpaced by our ability to develop it - and open benchmarks are one of several critical, complementary tools to fix this.
We're particularly interested in novel benchmarks that push and probe the frontier along three key vectors:
(1) Environment complexity
--> E.g. complex, domain-specific context and tool/action spaces, human interaction, world modeling)
(2) Autonomy horizon
--> E.g. long horizon, non-stationary goals
(3) Output complexity
--> E.g. complex outputs with nuanced, rubric-based evaluation / reward signals
Check out more detail + link to apply here! https://t.co/m1EftlAQTB
Awesome that @SnorkelAI is investing in open evaluation for agents! We’ve always said that data is the bottleneck. With increasing model capabilities, it’s often the *evaluation* data that limits progress now. Excited to see what gets built!
I may be biased because she's a friend, but this piece by @H_Lev is the best first-person account I've read summing up the mood in Providence right now https://t.co/7OsdfhS5AF
The Data Science Institute is pleased to announce our inaugural 2026 Early Career Breakthrough Research Award recipients! Congratulations to Ying Ma @yingma0107 (@BrownBiostats ), Loukas Gouskos (@brown_physics ), and Kim Fernandes (@BrownAnthro )! https://t.co/eC0EG7hQcd
I'm at NeurIPS this week! Excited to meet old/new friends and chat with people about training safer language models.
I'm presenting a few works on safety pretraining, measuring diversity in data curation, and monitoring model behaviors --- more info below 👇
⭐ New blog post!
Most people think activation steering ≈ a cheap version of finetuning. But why does it sometimes work, and sometimes fall flat?
We dug into this and found a surprisingly clear answer.
Full breakdown here 👇
https://t.co/bQRMdFf4Bm
How well do language models generalize to problems that are harder, or even easier, than the ones they’ve trained on?
We show that LLMs don’t generalize across difficulty levels quite as much as you might think. 🧵
I too am recruiting PhD students this year! things I think about: cognitively plausible LLMs, interpretability, evaluating and improving multi-turn interaction, LLMs for cognitive science and neuroscience, psycholinguistics... the deadline for Data Science is Dec 6 and for Linguistics Dec 18.
ARIA, a Brown-based research consortium supported by a $20 million grant from the National Science Foundation, welcomed scientists from across the U.S. to kick off its five-year program with a launch event in Providence. @BrownUniversity
https://t.co/fddeM3m2zn