Transluce @transluceai - Twitter Profile

about 1 month ago

Proud to partner with @CommonSense to help develop a rigorous science around youth AI safety. Millions of kids are already using AI every day, and our understanding of these systems and their impacts has to catch up. https://t.co/Ot7uoH3qlO

1

13

3

2K

TransluceAI retweeted

Aryaman Arora

@aryaman2020

about 2 months ago

This paper is now a spotlight at ICML! https://t.co/wdUtUas7mb

10

316

31

166

33K

TransluceAI retweeted

Jacob Steinhardt @JacobSteinhardt

4 months ago

New blog post: "Building Technology to Drive AI Governance". I argue that many governance challenges are fundamentally bottlenecked by technical gaps, and consider case studies from other fields (food safety, climate change) that illustrate this dynamic. https://t.co/cRgTVXfyPX

4

123

29

68

16K

Transluce

@TransluceAI

4 months ago

Use Docent to analyze your own traces: https://t.co/pBFq06qJjI Read our Blog: https://t.co/lPULGJnDOK

0

8

0

1

1K

Transluce

@TransluceAI

4 months ago

Why does GPT-5.1 Codex score 6.5% worse than GPT-5 Codex on Terminal-Bench, with the same scaffold? 🧵 GPT-5.1 times out at ~2x the rate of GPT-5. Excluding timeouts, GPT-5.1 wins by 7.2%. We analyzed 256M+ tokens of traces and found this in under an hour. Here’s how 👇

2

74

15

19

10K

Transluce

@TransluceAI

4 months ago

You can replicate our full analysis with 5 min of setup. Clone our Terminal-Bench data & follow along: https://t.co/lPULGJnDOK

1

10

2

1

1K

Transluce

@TransluceAI

5 months ago

See the full post and apply here: https://t.co/2y1xk7Km6A

1

15

2

33

3K

Transluce

@TransluceAI

5 months ago

We're hiring a Governance & Policy Fellow to help define how independent AI evaluation works in practice—setting standards, supporting mental health evals, and supporting government evaluators. Hybrid technical + policy background, $200K–$300K. Link in replies.

5

241

43

226

26K

TransluceAI retweeted

Aryaman Arora

@aryaman2020

5 months ago

our circuit tracing codebase from this project is public now! https://t.co/w7ieuPgcpn please try it out and ping me if you have any questions 😄 and expect more updates soon!

0

146

17

96

15K

TransluceAI retweeted

Jacob Austin @jacobaustin132

6 months ago

I admire the folks at Transluce a lot. They're super smart and have a good model for how to do useful AI oversight work without being embedded in (read: beholden to) any big AI labs. Read their stuff and consider supporting!

0

16

2

3

6K

TransluceAI retweeted

Ethan Perez

@EthanJPerez

6 months ago

Transluce is a top-tier AI safety research lab - I follow their work as closely as work from our own safety teams at Anthropic. They're also well-positioned to become a strong third-party auditor for AI labs. Consider donating if you're interested in helping them out!

2

157

7

39

14K

TransluceAI retweeted

Sarah Schwettmann

@cogconfluence

6 months ago

All @TransluceAI work that I described in my NeurIPS mech interp workshop keynote is now out! ✨ Today we released Predictive Concept Decoders, led by @vvhuang_ Paper: https://t.co/fhAK9VozDZ Blog: https://t.co/53t4oenA1N And here's @damichoi95's work on scalably extracting latent representations of users from model internals: https://t.co/F8fs7rhaX7

1

88

17

61

10K

Transluce

@TransluceAI

6 months ago

Paper: https://t.co/NGMyCALgD4 Blog: https://t.co/ZFGGAXzfjQ Authors: @vvhuang_, @damichoi95, @_ddjohnson, @cogconfluence, @JacobSteinhardt If you’re excited about building scalable interpretability assistants, visit https://t.co/DQZsWs3bPh

0

13

0

7

2K

Transluce

@TransluceAI

6 months ago

Transluce is developing end-to-end interpretability approaches that directly train models to make predictions about AI behavior. Today we introduce Predictive Concept Decoders (PCD), a new architecture that embodies this approach.

2

165

33

67

37K

Transluce

@TransluceAI

6 months ago

Chat with a live version of our PCD at https://t.co/hCnfYwtPq6. Try testing whether the decoder can accurately predict Llama-3.1-8B’s behavior, and check whether the decoder’s response is consistent with the encoder’s active concepts! https://t.co/fT5njeGOoZ

1

15

0

4

4K
