Robin Jia @robinomial - Twitter Profile

2 days ago

🚨 [New preprint] Can AI assistants hurt the very people who depend on them? Raine v. OpenAI alleges ChatGPT contributed to a teen's suicide; OpenAI's 2025 "sycophancy" retrospective on GPT-4o. The pattern: harm comes not from capability failures, but from the social dynamics of how models talk to us, especially when users open up. We introduce EUDAIMONIA, a benchmark grounded in a Social AI Design Code rooted in real-world harm cases. 🌐 Project page: https://t.co/IhfoCfmpNH 📄 Paper: https://t.co/9mh4NVYU3w

3

21

4

10

2K

Robin Jia @robinomial

6 days ago

Being Johnny’s PhD advisor has not only been a great privilege, but it has forever changed my research vision. His work combining AI, law, and statistics opened my eyes to how technical research can guide policy and promote AI accountability. Excited for his next work as Dr. Wei!

Johnny Tian-Zheng Wei @johntzwei

6 days ago

Hi all, I defended my PhD thesis. My thesis in two sentences: Current AI measurement takes LLMs as fixed objects, which constrains us to observational measurement. *Spiking* the training data (inserting certain data at known rates) enables statistically principled measurement. https://t.co/milOnNR3KE

28

176

6

37

18K

0

56

0

7

7K

robinomial retweeted

Johnny Tian-Zheng Wei @johntzwei

6 days ago

Hi all, I defended my PhD thesis. My thesis in two sentences: Current AI measurement takes LLMs as fixed objects, which constrains us to observational measurement. *Spiking* the training data (inserting certain data at known rates) enables statistically principled measurement.

28

176

6

37

18K

robinomial retweeted

Amin Banayeeanzade

@Amin__Bana

6 days ago

Does your GPT-5.5 also love Valparaíso in Chile 🇨🇱 !? Ask it to “Name a random city in the world”. You might expect a broad sample from thousands of cities. Instead, models collapse to the same small set of answers again and again. 😵‍💫 But why do LLMs lack diversity? Why are they not reliable random number generators? Why do they still struggle with genuinely creative writing? And why do decoding tricks like temperature, top-k, and top-p often fail to recover meaningful diversity? We have some answers in our new paper! 🧪 Demo: https://t.co/0sXuzkNQVz 📄 Paper: https://t.co/zTshh5FCjt

1

9

3

2K

robinomial retweeted

Johnny Tian-Zheng Wei @johntzwei

8 days ago

🧵[1/5] Works on test set contamination focus on detection, but we show *correction* of inflated test scores is possible. https://t.co/7D6lr63d40 Our proposal is to spike the training data and insert some test examples at known rates. The spiked examples are used to calibrate...

1

32

10

9

5K

robinomial retweeted

Blaise Agüera (@blaiseaguera.bsky.social)

@blaiseaguera

20 days ago

Just as single cells became multicellular life, 8B+ brains are now joining with AI to form a collective superintelligence. At @USC's Institute on Ethics and Trust in Computing inaugural summit, @robinomial, Jinchi Lv, @paria_rd and I discussed navigating this transition. https://t.co/RzzP7MIUsp

1

28

3

4

2K

robinomial retweeted

Ai2 @allen_ai

26 days ago

Today we’re releasing EMO, a new mixture-of-experts (MoE) model trained so modular structure emerges directly from data without human-defined priors. EMO can use a small subset of its experts for a given task while keeping near full-model performance. 🧵 https://t.co/xXcWsYh50D

13

401

57

233

86K

robinomial retweeted

Ryan Yixiang Wang

@RyanYixiang

26 days ago

MoEs are everywhere in frontier models, and they are deployed as a monolith system. But many applications only need a narrow slice of capabilities, e.g., math, code, biomedical, etc. So what if "modularity" is actually the missing opportunity for MoEs? Today, we're releasing EMO: an end-to-end pretrained MoE where modularity emerges naturally, enabling selective use of experts!

7

523

73

321

114K

robinomial retweeted

Deqing Fu

@DeqingFu

about 1 month ago

Glad to share that this paper is accepted to #ICML 2026 @icmlconf with an updated title "Transformers Provably Learn Algorithmic Solutions for Graph Connectivity, But Only with the Right Data". 🥳

2

31

3

10

4K

robinomial retweeted

Qingchuan (Tony) Yang

@qcyang20xx

about 1 month ago

EPSVec will see you at #ICML2026!!

1

26

7

2

4K

robinomial retweeted

Yuqing Yang @yyqcode

about 1 month ago

🧵 1/8 What should an LLM assistant remember across conversations? Existing memory work studies this one task at a time. But real-world assistants see all kinds of conversations, and that changes the problem. Introducing BEHEMOTH 🦣 + CluE 🌱: a benchmark & self-evolving method for heterogeneous memory extraction. 📄 Paper: https://t.co/szLIOdA4bm

6

50

16

28

14K

robinomial retweeted

Deqing Fu

@DeqingFu

about 1 month ago

After three papers on Fourier features in LLMs, I think there's a principle worth naming. How should we do science on an LLM? It corresponds to the existential questions:
> who am I? ↔ the phenomenon.
> where do I come from? ↔ the emergence.
> where am I going? ↔ the use.
🧵

103

4K

171

364

5M

robinomial retweeted

Deqing Fu

@DeqingFu

about 1 month ago

New paper: Convergent Evolution: How Different Language Models Learn Similar Number Representations. Language models, classical word embeddings, and even raw token frequencies all develop the same Fourier features for numbers. But only some develop the underlying structure. 🧵

2

108

22

60

45K

robinomial retweeted

Wang Bill Zhu

@BillJohn1235813

about 1 month ago

Frontier LLMs don't debug, they regenerate. We built PDB to measure that gap: GPT-5.1-Codex passes unit tests >76% of the time but touches only <45% of the right lines. Even Claude Code touches only ~50%. 📄 Paper: https://t.co/OHvjcqAwJa 🌐 Project: https://t.co/CraU9xeUKg

1

28

10

7

2K

Robin Jia @robinomial

about 1 month ago

Article: https://t.co/8v0kSSdyMU Website: https://t.co/jGfT8yT1q0 Big thanks to Peter Hall for the article, as well as @NSF NAIRR and @nvidia for the compute!

1

7

3

1K

Robin Jia @robinomial

about 1 month ago

Excited to announce that Hubble, our new language model suite for studying LLM memorization, was recently featured in @ScienceMagazine ! Hubble has also received an oral presentation slot at ICLR; if you're there, check out @johntzwei and @Aflah02101 's presentation on Saturday!

3

89

12

15

6K

robinomial retweeted

Johnny Tian-Zheng Wei @johntzwei

about 2 months ago

Hi all, I am going to Rio for ICLR! If you are interested in AI safety, governance, reducing bad model behaviors, I would like to talk to you! My expertise is in statistics, law, and LLM pretraining and memorization.

0

31

1

2K

robinomial retweeted

Tim Dettmers

@Tim_Dettmers

about 2 months ago

We in the quantization community could quickly see this and were flabbergasted by the response to TurboQuant. Whenever I saw TurboQuant on my timeline, I found it hurtful, because the work of other academics who worked so hard was discounted.

9

236

12

43

19K

robinomial retweeted

Aflah 🍉🕊️ @Aflah02101

about 2 months ago

Really thrilled to share that Hubble was recently featured in @ScienceMagazine! Check out the article here: https://t.co/LfP47Y8ygB We’ll also be at ICLR later this month, come check out our talk and poster on this work!

1

14

1

1K

robinomial retweeted

Johnny Tian-Zheng Wei @johntzwei

3 months ago

Hi all, I wrote a Claude Code tutorial for ML researchers who have never done SWE in their life: https://t.co/nb9TER2iEe I never learned SWE myself, so maybe there are others in the same boat. This is NOT just tips on how to write CLAUDE.md. 70% of my notes are on SWE principles.

2

34

5

30

2K
