CogInterp Workshop @ NeurIPS 2025 @coginterp - Twitter Profile

Pinned Tweet

CogInterp Workshop @ NeurIPS 2025 @CogInterp

11 months ago

We’re excited to announce the first workshop on CogInterp: Interpreting Cognition in Deep Learning Models @ NeurIPS 2025! 📣 How can we interpret the algorithms and representations underlying complex behavior in deep learning models? 🌐 https://t.co/sKn7LYWtR7 1/

1

76

20

27

17K

CogInterp retweeted

NYU Center for Data Science

@NYUDataScience

4 months ago

Can LLMs evolve human-like semantic categories? CDS-affiliated @NogaZaslavsky and PhD student Nathaniel Imel show that, via simulated cultural transmission, LLMs reorganize color categories toward efficient compression. 🔗https://t.co/Dhiy0r1ScL

2

29

4

16

9K

CogInterp retweeted

Ari Holtzman

@universeinanegg

5 months ago

this slide is solid gold

2

52

4

34

6K

CogInterp retweeted

Goodfire

@GoodfireAI

6 months ago

Our last Stanford guest lecture - @EkdeepL on what counts as an explanation & a neuro-inspired "model systems approach" to interp. Plus, how in-context learning and many-shot jailbreaking are explained by LLM representations changing in-context (as a case study for that approach).

00:33 - What counts as an explanation?
04:47 - Levels of analysis & standard interpretability approaches
18:19 - The "model systems" approach to interp
[Case study on in-context learning]
23:36 - How LLM representations change in-context
44:10 - Modeling ICL with rational analysis
1:10:54 - Conclusion & questions

Thanks again to @SuryaGanguli for having us in his class!

3

136

27

135

31K

CogInterp retweeted

Christopher Potts

@ChrisGPotts

6 months ago

Safety-oriented interpretability researchers should be focused on AI systems, not individual model artifacts. A snippet from the NeurIPS CogInterp workshop panel on Sunday:

6

168

19

79

16K

CogInterp retweeted

Noga Zaslavsky @NogaZaslavsky

6 months ago

Honored and thrilled that our work received the @CogInterp best paper award! 💫 📄 Extended paper: https://t.co/TrQcZgrygv 🧵 Highlights: https://t.co/5xRT08StX3 @NeurIPSConf #NeurIPS2025

2

35

6

7

4K

CogInterp retweeted

Ari Holtzman

@universeinanegg

6 months ago

this was so awesome. Jay still killin' it five decades later

3

39

1

11

7K

CogInterp Workshop @ NeurIPS 2025 @CogInterp

6 months ago

Our Best Paper Award goes to Nathaniel Imel and Noga Zaslavsky @NogaZaslavsky for their excellent paper “Culturally transmitted color categories in LLMs reflect a learning bias toward efficient compression”!

0

11

1

5K

CogInterp retweeted

Justin Angel

@JustinAngel

6 months ago

At the @CogInterp workshop at NeurIPS. https://t.co/owG9CXhXj3 This slide explains MechInterp vs. CogInterp:

0

11

4

8

644

CogInterp Workshop @ NeurIPS 2025 @CogInterp

6 months ago

We are about to start our panel discussion, join us for some hot takes about what cognitive interpretability should be about.

0

7

1

0

342

CogInterp Workshop @ NeurIPS 2025 @CogInterp

6 months ago

Our final speaker @sydneymlevine makes a radical proposal: building computational models of human moral judgements to use as an AI system for making moral judgements.

0

3

0

2

208

CogInterp Workshop @ NeurIPS 2025 @CogInterp

6 months ago

Jay proposes shifting from representing context as a sequence of tokens to a sequence of thoughts. The model learns a latent 'thought gestalt' from previous sentences to guide downstream prediction.

0

4

0

2

258

CogInterp Workshop @ NeurIPS 2025 @CogInterp

6 months ago

Jay McClelland opens with a question: "Do LMs have thoughts?" Are LMs stochastic parrots, or is there some understanding?

3

18

1

12

8K

CogInterp Workshop @ NeurIPS 2025 @CogInterp

6 months ago

Visualizing how LLMs handle object-property binding, he argues that even with scale, transformers might not be forming the kind of 'integrated representations' that human cognition relies on.

1

0

2

274

CogInterp Workshop @ NeurIPS 2025 @CogInterp

6 months ago

A big crowd for Jay McClelland’s talk!

0

3

0

195

CogInterp Workshop @ NeurIPS 2025 @CogInterp

6 months ago

Swing by a super happening poster session where ML and CogSci meet!

0

6

1

3K

CogInterp Workshop @ NeurIPS 2025 @CogInterp

6 months ago

In our fourth spotlight talk, neural network legend Paul Smolensky uses symbolic programs such as production systems to understand how neural networks process symbols

0

21

3

2

3K

CogInterp Workshop @ NeurIPS 2025 @CogInterp

6 months ago

For our third spotlight talk, Sonia Murthy @soniakmurthy uses probabilistic cognitive models to understand value trade-offs in LLMs that enable pragmatic reasoning about politeness in speech acts

0

3

0

166

CogInterp Workshop @ NeurIPS 2025 @CogInterp

6 months ago

Erin Grant @ermgrant discusses dissociations between function and representation, and asks whether representational alignment is enough for understanding deep neural networks

1

10

1

0

455

CogInterp retweeted

Sonia Murthy @soniakmurthy

6 months ago

Excited to be presenting our work on using cognitive models to interpret pluralistic values in LLMs once again as a spotlight talk 🌟 at the NeurIPS CogInterp workshop! Come by upper level room 5AB today and check out the paper here: https://t.co/feDaH3RvKY

0

8

2

0

996
