Thanks for having me! I talked about our work on valid inference with synthetic data (https://t.co/yIzGKRL8Yq) and robust human-AI complementarity (ICML 2026, paper up soon), both with my PhD student @yewonbyun_
A recent study found an LLM scored 95% on a healthcare benchmark. Deployed with real patients, it dropped to 34%.
In our new work, we argue the problem isn't the benchmark, but the implicit assumptions buried in evaluation.
Paper: https://t.co/mi445QtJvM
🧵 1/n
Deploying algorithmic research in practice is an opaque process. We're organizing a workshop at EC to share behind-the-scenes stories and move the field forward. Call for submissions open! With @NikhGarg, @ericachiang_, @baileyflan
https://t.co/jRo0LlxfXr
@MiriZilka and I are co-chairing EAAMO'26 and we are *so* excited. (Abstract deadline May 1, paper deadline May 8).
The conf highlights work on the data-to-decision pipeline and on how to better design algorithmic systems for society. If this sounds like your work, please submit!
Interested in long-context audio LLMs and hallucinations? We released ~1,140 hrs of synthetic doctor-patient conversations with reference SOAP notes. BeTraC Challenge: build the best open end-to-end SOAP-note system. Two tracks: ≤6B and ≤36B params. https://t.co/UzTx4AHw9I
Our lab, within the Berkeley EECS department, is hiring a postdoc!
More info and quick application form: https://t.co/GgYiwvtDbH
Apply by May 1!
Please reshare :)
📢 Call for Papers: EAAMO 2026
We invite submissions to the 6th ACM Conference on Equity and Access in Algorithms, Mechanisms, and Optimization.
Abstract: May 1
Paper: May 8
https://t.co/HPJnV2JAFw
LLMs are increasingly used as agents for decisions under uncertainty, e.g. medical diagnosis. But is their behavior explained by coherent beliefs and preferences? We test a basic question: could beliefs stated by the model be the subjective probabilities of a rational agent?
We give one answer: test whether the model's stated outputs function like a rational agent's belief when compared with its decisions. We give empirically checkable conditions that don't require any assumptions about the model's "utility function"
In an exciting collaboration with MSR, we ask a simple question: do LLMs actually behave like rational agents? We test whether an LLM’s stated elicited beliefs can be treated as its true beliefs by checking decision-theoretic coherence. https://t.co/PYPa7uXPbx
For anyone who wants to understand how models make decisions, this "belief elicitation" is a central concern. How can we tell whether what the model tells us in response to a prompt represents its "real" belief?
At UAI: experimental design/causal inference with a hard budget on treatments. We show how negatively correlating treatment assignments does lots of good things, including substantially improving treatment effect estimates in small samples. See Khurram's thread below!
New UAI Causality Paper! We turn fractional assignment probs into an exact number of treated units (without changing anyone's marginal prob of treatment) and drastically reduce estimator variance (3 to 5 times less empirically + theoretical guarantee) https://t.co/y6F2O7aa4r
I’m at NeurIPS this week (12/2-12/8) to present our work on when/how synthetic data (e.g., LLM simulations) can help scientists make inferences with less real data, improving the efficiency of costly experiments. Come by Poster #904 on Thursday 4:30PM (Exhibit Hall C,D,E)!
I gave talks at MIT and Harvard this week about "Science with synthetic data". How can generative models help us learn about the world (e.g., social systems) in a principled way? Lots of interesting conversations; more convinced than ever that there are nuanced issues to navigate
EAAMO is a thoughtful, interdisciplinary community genuinely committed to bridging research and practice.
We have an open call for institutional hosts for the 6th annual ACM @EAAMO_ORG conference, following EAAMO'25 in Pittsburgh next week.
Learn more and get in touch!
We're in the process of selecting the location for next year's ACM EAAMO conference! If you're interested in bringing the EAAMO community to your institution, please check out the open call here and get in touch. https://t.co/vsQQMA8yk7
Looking forward to this visit to UMich tomorrow! I'll talk about how to build causal goals into machine learning systems, and about collaborations bringing these ideas to real policy settings.
Excited for the next #AI Seminar with Prof. @brwilder (CMU), exploring how AI can better guide interventions in #health, #education & social policy.
"Predictive vs Causal Targeting of Social Interventions"
OCT. 21 | 4PM ET
https://t.co/yWZWxERNh4
Shape the future of equitable and responsible tech at #EAAMO25!
Nov 5–7, 2025 | Univ. of Pittsburgh
Early bird registration closes Oct 15.
https://t.co/MylMq5IbQ5