Tal Haklay @tal_haklay - Twitter Profile

3 days ago

At CVPR this week for a talk on neural geometry of large vision models. If you’re interested in interpretability or joining @GoodfireAI, come say hi. 🤠

thomas_fel_'s tweet photo: https://t.co/guuZQGb3YQ

2

87

14

26

7K

tal_haklay retweeted

Tamar Rott Shaham @TamarRottShaham

4 days ago

Looking forward to giving a keynote at the @WiCVworkshop dinner tonight! If you're attending, come say hi!

0

30

3

1

1K

tal_haklay retweeted

Ekdeep Singh Lubana @EkdeepL

6 days ago

Very excited to have this paper out! We show by having more parameters, larger models see reduced interference between updates. This allows them to retain memories of rarely observed samples of a task, eventually allowing them to learn even the tail-end of the distribution. (1/3)

4

185

19

90

16K

tal_haklay retweeted

Goodfire

@GoodfireAI

6 days ago

New research from Goodfire and collaborators: why do larger models learn more tasks? (spoiler: it’s bottlenecked by data)

3

180

14

113

21K

tal_haklay retweeted

Christopher Potts

@ChrisGPotts

6 days ago

We take for granted that larger models are better than smaller ones, but why is this so? Our new paper, led by Jing Huang and @EkdeepL, traces this to a data-induced competition for resources (neurons), using formal analysis, idealized tasks, and real pretraining.

ChrisGPotts's tweet photo: https://t.co/vqRUUe6whP

20

892

134

817

127K

Tal Haklay @tal_haklay

11 days ago

CFP is here >> https://t.co/gnMONhsAAm

0

2

208

Tal Haklay @tal_haklay

11 days ago

Submit your work! The 2nd Workshop on 𝐀𝐜𝐭𝐢𝐨𝐧𝐚𝐛𝐥𝐞 𝐈𝐧𝐭𝐞𝐫𝐩𝐫𝐞𝐭𝐚𝐛𝐢𝐥𝐢𝐭𝐲 will be held at COLM 2026 in San Francisco! Submission Deadline: June 21, 2026 @ActInterp

tal_haklay's tweet photo: https://t.co/xL8DoI7t3t

1

29

2

4

1K

tal_haklay retweeted

Tamar Rott Shaham @TamarRottShaham

11 days ago

What is the role of text tokens in diffusion? Do they carry anything beyond the text prompt? We study this in FLUX.2 @bfl_ml for the task of reference-guided generation, and found that text tokens hold visual information from the reference image!

2

20

7

4

3K

tal_haklay retweeted

Chris Ge @ChrisGe05

12 days ago

FLUX.2's @bfl_ml text tokens aren't just holding your prompt. During image editing, they absorb reference image content, and some of that absorbed content, like color and style, causally drives the output appearance. New paper 🧵👇

7

202

35

145

26K

tal_haklay retweeted

Hadas Orgad @OrgadHadas

15 days ago

Submit your work! The 2nd Workshop on 𝐀𝐜𝐭𝐢𝐨𝐧𝐚𝐛𝐥𝐞 𝐈𝐧𝐭𝐞𝐫𝐩𝐫𝐞𝐭𝐚𝐛𝐢𝐥𝐢𝐭𝐲 will be held at COLM 2026 in San Francisco! Submission Deadline: June 21, 2026 @ActInterp

OrgadHadas's tweet photo: https://t.co/HF2lJczx8T

2

131

18

76

13K

tal_haklay retweeted

Goodfire

@GoodfireAI

17 days ago

SAEs remain useful, as long as we’re aware of their limitations. And we have new techniques in the works that recover manifolds more directly, allowing us to understand models better and control them more effectively! Read the full post here: https://t.co/KIDlJl19mW

0

94

7

41

4K

tal_haklay retweeted

Goodfire

@GoodfireAI

17 days ago

This helps explain why SAEs can feel both illuminating and unsatisfying! Looking at SAE features one-by-one is like trying to understand the proverbial elephant by talking with each of the blind men: each label may be locally accurate, but the global structure is missing. (5/7)

1

56

1

3

3K

tal_haklay retweeted

Ryan Peters

@ryanpirl

16 days ago

This would provide a great explanation for why there is so much redundancy in SAE features at any given layer (observation made by @Sauers_ ). For example, if you search through the Qwen3-4b transcoder feature labels provided by Neuronpedia, there are 139 features generically related to the concept of 'color' in just layer 14. There are even more if you consider specific colors such as 'blue' or 'green', and this redundancy is repeated across layers... making it very annoying to interpret raw circuit graphs without performing some form of clustering.

4

69

6

42

19K

tal_haklay retweeted

Goodfire

@GoodfireAI

17 days ago

Consider the parable of the blind men encountering an elephant for the first time. Each touches a different part—the trunk, the tusk, the leg—and comes to a different conclusion about the elephant: one says it's like a tree, another says it’s like a rope, and so on. (2/7)

GoodfireAI's tweet photo: https://t.co/XUdwokrPHh

2

81

2

9

5K

Tal Haklay @tal_haklay

16 days ago

You should read our paper — and stay tuned 👀

Goodfire

@GoodfireAI

17 days ago

The most popular way to interpret AI is missing the bigger picture. Models think in curved shapes. But sparse autoencoders (SAEs) work with straight lines. Can they still capture models’ curved neural geometry? Yes, but not how you might think! (1/7)

24

1K

150

761

169K

0

67

3

30

6K

tal_haklay retweeted

Ekdeep Singh Lubana @EkdeepL

17 days ago

Super excited to have this paper finally out! So many nuggets here, but a critical highlight: you should *not* interpret SAE features in isolation. The population geometry is where it's all at! Similar to this image of us @GoodfireAI folks playing out the elephant parable. :P

EkdeepL's tweet photo: https://t.co/ZE1kikyJDD

2

141

14

41

8K

tal_haklay retweeted

Thomas Fel

@thomas_fel_

17 days ago

How do SAEs capture concept manifolds? 🍩 I think this is important work. we study how SAEs handle the geometric structures we've identified and find they tile/shatter them in a particular way we characterize, letting us recast unsupervised manifold discovery as inverse Ising

1

82

11

32

5K

tal_haklay retweeted

Goodfire

@GoodfireAI

17 days ago

The most popular way to interpret AI is missing the bigger picture. Models think in curved shapes. But sparse autoencoders (SAEs) work with straight lines. Can they still capture models’ curved neural geometry? Yes, but not how you might think! (1/7)

24

1K

150

761

169K

tal_haklay retweeted

Adam Belfki @adambelfki

19 days ago

When @sheridan_feucht first told me about these results I was kind of skeptical, until they mentioned I can trace the modulo base-10 addition in Llama 8B just using Logit Lens. 🔍 So I opened https://t.co/h9fWNZMGY4 to check it out myself, and this is what I saw:

adambelfki's tweet photo: https://t.co/An7AWLN6nu

1

19

5

3

3K

tal_haklay retweeted

Goodfire

@GoodfireAI

19 days ago

We're in Forbes today! @ericho_goodfire and @banburismus_ spoke to @RashiShrivast18 about why understanding AI is one of the most important problems in the world right now.

GoodfireAI's tweet photo: https://t.co/oe7uk7YRKq

3

82

14

12

5K
