Berk Ustun @berkustun - Twitter Profile

Pinned Tweet

over 7 years ago

Denied a loan by an ML model? You should be able to change something to get approved! In a new paper w @AlexanderSpangh & @yxxxliu, we call this concept "recourse" & we develop tools to measure it for linear classifiers. PDF https://t.co/M3dEdEyrXR CODE https://t.co/9IsXbv6gUt

4

183

41

36

0

berkustun retweeted

Alex Spangher @ Neurips2025 @AlexanderSpangh

6 months ago

Neurips 2025 was such a blast! We snuck a grand piano into the CreativeAI Track to demo Aria, our pretrained chat-style music model:

23

398

45

113

30K

berkustun retweeted

Jessica Hullman @JessicaHullman

6 months ago

A 'pragmatic interpretability' turn sounds a lot like our argument/framework for evaluating explanation methods--Time to replace task-agnostic fortunetelling w/concrete decision problem specs + theoretic & empirical evidence of expected performance boost https://t.co/z9XREAdjnt

0

14

1

7

3K

berkustun retweeted

Sakana AI

@SakanaAILabs

7 months ago

GPT-5 on Sudoku-Bench 🧩

Since releasing Sudoku-Bench in May 2025, when no LLM could solve a classic 9x9 puzzle, we've been evaluating the latest generation of models. GPT-5 now leads our leaderboard with 33% puzzles solved--approximately 2x the previous leader--and is the first LLM we've tested to solve a 9x9 Sudoku variant.

However, with 67% of the much harder puzzles remaining unsolved, Sudoku-Bench continues to present significant challenges for AI reasoning. Modern Sudoku variants require models to first understand novel rulesets through meta-reasoning, then maintain global consistency across long reasoning chains.

Our experiments with GRPO fine-tuning on Qwen2.5-7b and "Thought Cloning" (training on expert human reasoning from Cracking the Cryptic) show that current approaches still struggle with the spatial reasoning and creative "break-in" points that human solvers use naturally. We believe new approaches are required to solve our benchmark.

These results highlight persistent gaps between computational problem-solving and human-like reasoning, particularly in tasks requiring integrated mathematical logic, spatial awareness, and creative insight.

Read more about our update here: 🔗 Blogpost → https://t.co/qLlzYSalUw

30

662

109

236

155K

berkustun retweeted

Alex Spangher @ Neurips2025 @AlexanderSpangh

7 months ago

✨ Very overdue update: I'll be starting as an Assistant Professor in CS at University of Minnesota, Twin Cities, Fall 2026. I will be recruiting PhD students!! Please help me spread the word! [Thread] 1/n

40

736

142

215

92K

berkustun retweeted

Nick Vincent

@nickmvincent

10 months ago

About a week away from the deadline to submit to the ✨ Workshop on Algorithmic Collective Action (ACA) ✨ https://t.co/fF2sF2S3gw at NeurIPS 2025!

0

10

3

1

1K

berkustun retweeted

Jessica Hullman @JessicaHullman

10 months ago

I often wonder whether the prospective grad students who contact me understand what they are signing up for. I hope this does the trick https://t.co/bgZYTkBp3x

3

43

4

10

3K

berkustun retweeted

Hailey Joren @HaileyJoren

11 months ago

PhD in Computer Science, University of California San Diego 🎓

My research focused on uncertainty and safety in AI systems, including

🤷‍♀️ letting models say "I don't know" under uncertainty
🔎 understanding and reducing hallucinations
🔁 methods for answering "how much will providing data X improve performance on Y?" at inference time

Many thanks to my advisor @berkustun, to my incredible research collaborators, and to my wonderful friends, husband and family. Getting a PhD while becoming a first-time parent is definitely a recipe for growth!

30

620

20

79

37K

berkustun retweeted

Alan Jeffares @Jeffaresalan

11 months ago

Our new ICML 2025 oral paper proposes a new unified theory of both Double Descent and Grokking, revealing that both of these deep learning phenomena can be understood as being caused by prime numbers in the network parameters 🤯🤯 🧵[1/8]

13

931

75

941

131K

berkustun retweeted

Jessica Hullman @JessicaHullman

12 months ago

Explainable AI has long frustrated me by lacking a clear theory of what explanations should do. Improve use of a model for what? How? Given a task what's max effect explanation can have? It's complicated bc most methods are functions of features & prediction but not true state 1/

5

66

10

26

9K

Berk Ustun @berkustun

12 months ago

Explanations don't help us detect algorithmic discrimination. Even when users are trained. Even when we control their beliefs. Even under ideal conditions... 👇

Julian Skirzynski @JSkirzynski

12 months ago

Right to explanation laws assume explanations help people detect algorithmic discrimination. But is there any evidence for that? In our latest work w/ David Danks @berkustun, we show explanations fail to help people, even under optimal conditions. PDF https://t.co/RlKmlzxgxN

1

7

2

0

1K

0

4

0

4

509

berkustun retweeted

Julian Skirzynski @JSkirzynski

12 months ago

We’ll be presenting @FAccTConference on 06.24 at 10:45 AM during the Evaluating Explainable AI session! Come chat with us. We would love to discuss implications for AI policy, better auditing methods, and next steps for algorithmic fairness research. #AIFairness #xAI

0

2

1

0

243

berkustun retweeted

Julian Skirzynski @JSkirzynski

12 months ago

Right to explanation laws assume explanations help people detect algorithmic discrimination. But is there any evidence for that? In our latest work w/ David Danks @berkustun, we show explanations fail to help people, even under optimal conditions. PDF https://t.co/RlKmlzxgxN

1

7

2

0

1K

berkustun retweeted

Lily Weng @LilyWeng_

about 1 year ago

💡LLMs don’t have to be black boxes. We introduce CB-LLMs -- the first LLMs with built-in interpretability for transparent, controllable, and safer AI. 🚀Our #ICLR2025 paper: https://t.co/08Q4Jcl39L #TrustworthyAI #ExplainableAI #AI #MachineLearning #NLP #LLM #AIResearch

1

5

2

0

436

berkustun retweeted

Hailey Joren @HaileyJoren

about 1 year ago

When RAG systems hallucinate, is the LLM misusing available information or is the retrieved context insufficient? In our #ICLR2025 paper, we introduce "sufficient context" to disentangle these failure modes. Work w J. Zhang, C.S. Ferng, @DaChengJuan1, @ankurtaly @CyrusRashtchian

1

8

3

2

741

berkustun retweeted

Harry Cheon @1000_harrry

about 1 year ago

Denied a loan, an interview, or an insurance claim by machine learning models? You may be entitled to a list of reasons. In our latest w @anniewernerfelt @berkustun @kdphd, we show how existing explanation frameworks can fail and present an alternative tailored for recourse 🧵

1

2

1

2

274

berkustun retweeted

Jessica Hullman @JessicaHullman

about 1 year ago

Why is it so hard to show that people can be better decision-makers than statistical models? Some ways that common intuitions about the superiority of human judgment contradict statistical reality, and a few that don't. https://t.co/5dgGMGuHXc

3

46

8

16

5K

berkustun retweeted

Sujay Nagaraj @sujnagaraj

about 1 year ago

Many ML models predict labels that don’t reflect what we care about, e.g.:
– Diagnoses from unreliable tests
– Outcomes from noisy electronic health records

In our #ICLR2025 paper, we study how this subjects individuals to a lottery of mistakes
Paper: https://t.co/TqQOMTwONy
🧵👇 https://t.co/T9XJa27h3n

1

7

2

420

berkustun retweeted

Sujay Nagaraj @sujnagaraj

about 1 year ago

🚨 Excited to announce a new paper accepted at ICLR2025 in Singapore! “Learning Under Temporal Label Noise” We tackle a new challenge in time series ML: label noise that changes over time 🧵👇 https://t.co/Ka7ABXArVR

2

13

1

3

2K

berkustun retweeted

Been Kim

@_beenkim

over 1 year ago

🔥🔥Our small team in Seattle Google DeepMind is hiring! 🔥🔥If you are willing to move to/already in Seattle, and have done significant work on human-machine communication / interpretability (from ML side) with a relevant PhD and great publication record, join us. Apply here 👉👉 https://t.co/HxU7zIcGMh

7

398

59

190

57K

berkustun retweeted

Sanghamitra Dutta

@Sangha26Dutta

over 1 year ago

Are you interested in serving as a Program Committee member for the ACM Conference on Fairness, Accountability, and Transparency (ACM FAccT 2025)? PC Members are expected to review papers in their area of expertise. Expression of interest form: https://t.co/RufvO5teu0 #FAccT2025

0

13

6

1

2K
