Virginia Smith @gingsmith - Twitter Profile

@fjzzq2002

about 2 months ago

1/ 🌳 We are releasing Pando, a benchmark to climb for interpretability methods. We created 720+ model organisms with known ground-truth decision rules planted inside, then asked: do white-box tools actually beat just asking the model?

2

49

9

25

4K

gingsmith retweeted

Simons Institute for the Theory of Computing @SimonsInstitute

2 months ago

Next week at the Simons Institute, a workshop on Agency in Collaborative Learning. Join us! https://t.co/VcDcJA8fWX

0

11

2

3

11K

gingsmith retweeted

Niloofar ✈️ icml

@niloofar_mire

3 months ago

Excited to give a talk at @SimonsInstitute Trust in Decentralized Systems Workshop on Tuesday at 11am! Title: "2026 Is the New 2016" — on federated memory, contextual privacy, and personalized agents. The privacy conversation has moved way past training data memorization. With persistent memory, tool use, and agents acting on your behalf (#clawdbot👀), the real risk is what models do with the data you feed them at inference time. Context window is the new attack surface! I'll talk about our new benchmark CIMemories, where we test whether models can actually make context-dependent decisions about what to share with whom from memory. Turns out they really can't, up to 69% violation rates, and it only gets worse the more you use them. Link to slides🔻🔻🔻

5

108

13

41

7K

gingsmith retweeted

Neil Kale

@neilkale

3 months ago

[1/n] Open Problems in AI Child Safety: AI is misused to generate CSAM at alarming scale. 400% increase in AI-generated CSAM since 2024 (IWF). 1 in 17 teens are victimized by deepfake nudes. We outline 15 open problems where AI safety research can help. 🔗https://t.co/2BGTSsPj53

1

15

2

6

4K

Virginia Smith

@gingsmith

6 months ago

🎅 A good reminder about LLM biases and a fun holiday project led by @neilkale, @prthaker_ & James Wedgwood

Neil Kale

@neilkale

6 months ago

We asked LLMs: Is Santa real? 🎅 GPT-4o says Yes at any age. Claude tells 5-year-olds the truth. What does this reveal about invisible assumptions in AI? Do LLMs believe in the tooth fairy or the Illuminati? New holiday post from @mlcmublog 🔗⬇️

1

22

5

3

3K

0

2

0

279

gingsmith retweeted

Chhavi Yadav

@chhaviyadav_

6 months ago

Ending the year on a high note, with 2 papers accepted at @satml_conf 🎉 check them out! 📄 One is an interview study on cross-silo Federated Learning, highlighting a misalignment between real-world challenges & current research focus. (https://t.co/5osTNA6FHL) 📄 The other evaluates machine unlearning methods on a harder, but realistic setting involving multi-hop knowledge. (https://t.co/SK8qibGKvf)

6

43

5

18

5K

gingsmith retweeted

Amrith Setlur

@setlur_amrith

7 months ago

🔥 New blog on scaling RL📈 by unlocking a *new* way to explore on hard problems.
- ☹️Currently scaling compute on hard problems is futile.
- ☠️Turns out exploration during RL is the main bottleneck!
- 👎Classical exploration methods like token entropy fail.
- 💡We need a new paradigm of exploration that goes beyond simply sharpening or chaining (vg-gap) capabilities in the base model.
- 🗝️The key lies in the instruction-following abilities of base models to unlock a new regime of guided exploration.
More here 👇:

2

27

5

13

3K

gingsmith retweeted

Chhavi Yadav

@chhaviyadav_

8 months ago

🚀 Federated Learning (FL) promises collaboration without data sharing. While Cross-Device FL is a success and deployed widely in industry, we don’t see Cross-Silo FL (collaboration between organizations) taking off despite huge demand and interest. Why could this be the case? 🤔 We conduct an interview study to dig deeper into the real-world barriers to cross-silo FL adoption! 👇 (1/3)

1

22

12

15

5K

gingsmith retweeted

Transactions on Machine Learning Research @TmlrOrg

8 months ago

As Transactions on Machine Learning Research (TMLR) grows in number of submissions, we are looking for more reviewers and action editors. Please sign up! Only one paper to review at a time and <= 6 per year; reviewers report greater satisfaction than when reviewing for conferences!

2

64

27

20

56K

gingsmith retweeted

Steven Kolawole @_stevenkolawole

10 months ago

1/ Can model agreement replace LLM cascade routers for efficient inference? Formalizing this, we found that under broad conditions, ensembles provide provably safe, training-free routing rules. We call the method *Agreement-Based Cascading (ABC)*, and it's live now at TMLR! 🧵

1

35

15

5

2K

gingsmith retweeted

Pratiksha Thaker @prthaker_

11 months ago

I'm very excited to share some new work https://t.co/0rADO9DMCM. This work started out in conversations with @thorn where we realized that shadow model MIAs couldn't be used to audit models for harmful content of children. See 🧵 for why, and our progress on solving this...

1

25

7

6

4K

gingsmith retweeted

ICML Conference @icmlconf

12 months ago

ICML offers an optional poster printing service https://t.co/Y2w0rw1KDn Orders can be picked up the day at the Vancouver Convention Centre in West MR 104 during the following hours: Monday - Friday: 7:30 am - 5:00 pm Saturday: 8:00 am - 1:00 pm

0

13

4

8K

gingsmith retweeted

Matthew Yang

@_matthewyang

about 1 year ago

🚨 NEW PAPER: What if LLMs could tackle harder problems - not by explicitly training on longer traces, but by learning how to think longer? Our recipe e3 teaches models to explore in-context, enabling LLMs to unlock longer reasoning chains without ever seeing them in training. 🤯 Website: https://t.co/si6sTgpbtZ Paper: https://t.co/NmzUSH2P4y 🧵[1/8]

1

10

3

4

940

gingsmith retweeted

ML@CMU @mlcmublog

about 1 year ago

https://t.co/Jsl6oztcSF Are your LLMs truly forgetting unwanted data? In this new blog post authored by @shengyuan_26734, Yiwei Fu, @zstevenwu, and @gingsmith, we discuss how benign relearning can jog an unlearned LLM's memory to recover knowledge that is supposed to be forgotten.

0

6

4

3

1K

gingsmith retweeted

Ameet Talwalkar

@atalwalkar

about 1 year ago

I’m excited to share new work from Datadog AI Research! We just released Toto, a new SOTA (by a wide margin!) time series foundation model, and BOOM, the largest benchmark of observability metrics. Both are available under the Apache 2.0 license. 🧵

5

242

52

213

38K

gingsmith retweeted

ICML Conference @icmlconf

about 1 year ago

Invited talks are announced. https://t.co/912HjtLsPE Jon Kleinberg, Pamela Samuelson, Frauke Kreuter, Anca Dragan, Andreas Krause

1

62

5

16K

gingsmith retweeted

Aashiq Muhamed @AashiqMuhamed

about 1 year ago

Thrilled to share our new work on improving LLM unlearning! 🚀 Gradient-based unlearning struggles with high cost, instability & lack of precision. We introduce Dynamic SAE Guardrails (DSG): an activation-based approach using SAEs for targeted, efficient knowledge removal.

4

45

8

25

7K

gingsmith retweeted

ML@CMU @mlcmublog

about 1 year ago

https://t.co/0t37BZIOlf 📈⚠️ Is your LLM unlearning benchmark measuring what you think it is? In a new blog post authored by @prthaker_, @shengyuan_26734, @neilkale, @yash_maurya01, @zstevenwu, and @gingsmith, we discuss why empirical benchmarks are necessary but not sufficient measures of success (SaTML 2025).

0

12

3

2K

gingsmith retweeted

Andrew Gordon Wilson

@andrewgwils

about 1 year ago

The ICML 2025 workshops list is online! https://t.co/jNYm5qEZSm. Many exciting topics, spanning multi-agent systems, world models, test-time adaptation, actionable interpretability, and much more.

4

115

17

31

14K

gingsmith retweeted

Amrith Setlur

@setlur_amrith

about 1 year ago

How to effectively unlearn finetuning data? ❌ Approx. methods leak sensitive data ✅ Exact unlearning (e.g. retraining) is secure 🔒 but inefficient 🚨 New paper: *efficient* & *exact* unlearning (led by Kevin) 🗝️ Idea: model merging at scale https://t.co/Skks6H8Mcm 🧵⤵️

1

54

10

30

4K
