Michael Pearce

27 days ago

Neural networks do math by rotating shapes. We found a shape-rotating calculator hidden inside an LLM – and it’s used for more than just math! (1/6)

_MichaelPearce retweeted

about 1 month ago

Neural networks might speak English, but they think in shapes. Understanding their rich *neural geometry* is key to understanding how they work – and to debugging and controlling them with precision. Starting today, we’re releasing a series of posts on this research agenda. 🧵

about 2 months ago

biorxiv: https://t.co/odVJJ9uRK7 Great work by @nkwang24, @RyoYbioinfo, @thomasdooms, and the teams at @GoodfireAI and @MayoClinic !

about 2 months ago

Excited to share our work using @arcinstitute’s Evo2 to achieve state-of-the-art pathogenicity prediction for genomic variants.

about 2 months ago

We achieved state-of-the-art performance in predicting which of 4.2 million genetic variants cause diseases by interpreting a genomics model, in a new preprint with @MayoClinic. We're now releasing an open-source database for all variants in the NIH's ClinVar database. 🧵(1/8)

https://t.co/PTrRAqjDMA

about 2 months ago

Our predictions use covariance probes as a lightweight improvement over mean-pooling that can handle the large variability in gene lengths and can avoid the complexities of full sequence-to-sequence autoencoders. Shout out to @thomasdooms for developing the approach.

2 months ago

New research: we propose *covariance pooling* as a better replacement for mean pooling that improves probing for sequence-level properties. E.g., genomic model embeddings are often mean-pooled to understand genes - but that throws away all info about feature co-occurrence! (1/3)

https://t.co/GtjxmJgFL1
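To make the contrast with mean pooling concrete, here is a minimal NumPy sketch. It assumes the generic second-order definition of covariance pooling (the feature-by-feature covariance of the per-position embeddings); the thread does not spell out the exact formulation used in the research, so the function names and toy data below are illustrative only.

```python
import numpy as np


def mean_pool(x: np.ndarray) -> np.ndarray:
    """Average per-position embeddings of shape (length, dim) into (dim,)."""
    return x.mean(axis=0)


def covariance_pool(x: np.ndarray) -> np.ndarray:
    """Pool (length, dim) embeddings into a (dim, dim) covariance matrix.

    Assumption: plain population covariance across positions. The output
    size depends only on dim, so sequences of any length are comparable.
    """
    centered = x - x.mean(axis=0, keepdims=True)
    return centered.T @ centered / x.shape[0]


# Two toy "sequences" with two features each (hypothetical data).
# In seq_a the features always fire together; in seq_b they never do.
seq_a = np.array([[1.0, 1.0], [0.0, 0.0], [1.0, 1.0], [0.0, 0.0]])
seq_b = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]])

mean_a, mean_b = mean_pool(seq_a), mean_pool(seq_b)
cov_a, cov_b = covariance_pool(seq_a), covariance_pool(seq_b)

print("means equal:", np.allclose(mean_a, mean_b))
print("off-diagonal covariance:", cov_a[0, 1], cov_b[0, 1])
```

Both toy sequences have the same mean-pooled vector, but the off-diagonal covariance entries have opposite signs, which is the feature co-occurrence information that mean pooling discards. A probe would typically be trained on the flattened (or upper-triangular) covariance matrix, though that step is not shown here.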

_MichaelPearce retweeted

Garyk Brixi

@garykbrixi

3 months ago

Evo 2 is out in Nature today, showing that genome language models can predict and design across the full complexity of life, from phages to eukaryotes. A few surprises from the project, including how ignoring trillions of nucleotides was key to getting a good model. 🧵

https://t.co/fcp70p4i7h

_MichaelPearce retweeted

4 months ago

We raised a $150M Series B at a $1.25B valuation to fundamentally change the field of AI. Scaling is powerful, but we can't intentionally design what we don't understand.

_MichaelPearce retweeted

4 months ago

We've identified a novel class of biomarkers for Alzheimer's detection - using interpretability - with @PrimaMente. How we did it, and how interpretability can power scientific discovery in the age of digital biology: (1/6)

https://t.co/SHBawjo7qi

_MichaelPearce retweeted

7 months ago

Why use LLM-as-a-judge when you can get the same performance for 15–500x cheaper?

Our new research with @RakutenGroup on PII detection finds that SAE probes:
- transfer from synthetic to real data better than normal probes
- match GPT-5 Mini performance at 1/15 the cost

(1/6) https://t.co/GQXvUA8Nl4

_MichaelPearce retweeted

Wes Gurnee @wesg52

8 months ago

New paper! We reverse engineered the mechanisms underlying Claude Haiku’s ability to perform a simple “perceptual” task. We discover beautiful feature families and manifolds, clean geometric transformations, and distributed attention algorithms!

https://t.co/AA2mYPZgY4

_MichaelPearce retweeted

8 months ago

Agents for experimental research != agents for software development. This is a key lesson we've learned after several months refining agentic workflows! More takeaways on effectively using experimenter agents + a key tool we're open-sourcing to enable them: 🧵

https://t.co/Azo3pNrMU9

_MichaelPearce retweeted

9 months ago

We're excited to announce a collaboration with @MayoClinic! We're working to improve personalized patient outcomes by extracting richer, more reliable signals from genomic & digital pathology models. That could mean novel biomarkers, personalized diagnostics, & more.

https://t.co/fx89w5mOJT

_MichaelPearce retweeted

9 months ago

Does making an SAE bigger let you explain more of your model's features? New research from @ericjmichaud_ models SAE scaling dynamics, and explores whether SAEs will pack increasingly many latents onto a few multidimensional features, rather than learning more features.

https://t.co/itzr8Iy7Fw

10 months ago

@ATinyGreenCell @pdhsu @GoodfireAI Good questions! We used a set of 2,400 prokaryote genomes with complete assemblies that are representative genomes in both the GTDB and NCBI databases. No viruses or metagenomic assemblies!

10 months ago

@PhilEmmanuele @GoodfireAI The flow of gathering activations from random genomic regions and averaging is described in the post. We’re happy to share the resulting data—the species-averaged embeddings and the phylogenetic distances between species—for you to play with. Will let you know when it's available

10 months ago

@J33P4 @GoodfireAI We indeed expected it to be there! But what's novel is understanding how the model represents the tree of life (manifold structure and low-dim subspace) and the techniques we developed to isolate the representations, which we plan to extend to more complex bioinformatic questions
