Ben Mann

nate parrott @nateparrott

about 2 months ago

Seeing Claude Design take off inside Anthropic has been amazing. So excited to share this with the world and see what beautiful and useful things come out! My personal favorite is using this to make just-in-time decks in Anthropic's design language for internal presentations, and then slide-roulette.

about 2 months ago

meet claude design! A simple tool that lets designers prototype and share interactive artifacts built using code.

529

112K

8enmann retweeted

AI @AnthropicAI Social learning enthusiast. Opinions and dumb jokes my own.

about 2 months ago

Introducing Claude Opus 4.7, our most capable Opus model yet. It handles long-running tasks with more rigor, follows instructions more precisely, and verifies its own outputs before reporting back. You can hand off your hardest work with less supervision. https://t.co/PtlRdpQcG5

81K

10K

12K

14M

8enmann retweeted

Ronan Farrow

@RonanFarrow

about 2 months ago

(🧵1/11) For the past year and a half, I've been investigating OpenAI and Sam Altman for @NewYorker. With my coauthor @andrewmarantz, I reviewed never-before-disclosed internal memos, obtained 200+ pages of documents related to a close colleague, including extensive private notes, and interviewed more than 100 people.
OpenAI was founded on the premise that A.I. could be the most dangerous invention in human history—and that its C.E.O. would need to be a person of uncommon integrity. We lay out the most detailed account yet of why Altman was ousted by board members and executives who came to believe he lacked that integrity, and ask: were they right to allege that he couldn't be trusted?
A thread on some of our findings:

579

37K

21K

8enmann retweeted

Noah Zweben

@noahzweben

2 months ago

Thrilled to announce Claude Code auto-fix – in the cloud. Web/Mobile sessions can now automatically follow PRs - fixing CI failures and addressing comments so that your PR is always green. This happens remotely so you can fully walk away and come back to a ready-to-go PR.

335

503

8enmann retweeted

3 months ago

Claude can now build interactive charts and diagrams, directly in the chat. Available today in beta on all plans, including free. Try it out: https://t.co/tHPAZRgQkn

42K

16K

12M

8enmann retweeted

Armand

@armandcognetta

3 months ago

There’s an enormous gap in the longevity field that almost no one is talking about. No existing therapeutic modalities are capable of both systemic distribution and complex transformations. Until we solve this, we won’t solve aging. 🧵 https://t.co/F7EkyqTpb5

122

27K

3 months ago

Among all the frontier AI labs, Anthropic took the lead in supporting our warfighters and the American way starting in 2024. I am saddened by today's developments and hope we can find a way to continue our support without compromising our values.

Anthropic

@AnthropicAI

3 months ago

A statement on the comments from Secretary of War Pete Hegseth. https://t.co/Gg7Zb09IMR

42K

18M

211

11K

8enmann retweeted

Chris Painter

@ChrisPainterYup

4 months ago

My bio says I work on AGI preparedness, so I want to clarify: We are not prepared.
Over the last year, dangerous capability evaluations have moved into a state where it's difficult to find any Q&A benchmark that models don't saturate. Work has had to shift toward measures that are either much more finger-to-the-wind (quick surveys of researchers about real-world use) or much more capital- and time-intensive (randomized controlled "uplift studies"). Broadly, it's becoming a stretch to rule out any threat model using Q&A benchmarks as a proxy. Everyone is experimenting with new methods for detecting when meaningful capability thresholds are crossed, but the water might boil before we can get the thermometer in.
The situation is similar for agent benchmarks: our ability to measure capability is rapidly falling behind the pace of capability itself (look at the confidence intervals on METR's time-horizon measurements), although these haven't yet saturated.
And what happens if we concede that it's difficult to "rule out" these risks? Does society wait to take action until we can "rule them in" by showing they are end-to-end clearly realizable?
Furthermore, what would "taking action" even mean if we decide the risk is imminent and real? Every American developer faces the problem that if it unilaterally halts development, or even simply implements costly mitigations, it has reason to believe that a less-cautious competitor will not take the same actions and instead benefit. From a private company's perspective, it isn't clear that taking drastic action to mitigate risk unilaterally (like fully halting development of more advanced models) accomplishes anything productive unless there's a decent chance the government steps in or the action is near-universal. And even if the US government helps solve the collective action problem (if indeed it *is* a collective action problem) in the US, what about Chinese companies?
At minimum, I think developers need to keep collecting evidence about risky and destabilizing model properties (chem-bio, cyber, recursive self-improvement, sycophancy) and reporting this information publicly, so the rest of society can see what world we're heading into and can decide how it wants to react. The rest of society, and companies themselves, should also spend more effort thinking creatively about how to use technology to harden society against the risks AI might pose.
This is hard, and I don't know the right answers. My impression is that the companies developing AI don't know the right answers either. While it's possible for an individual, or a species, to not understand how an experience will affect them and yet "be prepared" for the experience in the sense of having built the tools and experience to ensure they'll respond effectively, I'm not sure that's the position we're in. I hope we land on better answers soon.

110

238

674

208K

8enmann retweeted

4 months ago

Introducing Claude Opus 4.6. Our smartest model got an upgrade. Opus 4.6 plans more carefully, sustains agentic tasks for longer, operates reliably in massive codebases, and catches its own mistakes. It’s also our first Opus-class model with 1M token context in beta.

39K

11M

4 months ago

Unprecedented times are coming, and with that, we will all need to work together to figure out how to make the transition go well. I love Dario's new essay on this topic. Worth the read!

Dario Amodei

@DarioAmodei

4 months ago

The Adolescence of Technology: an essay on the risks posed by powerful AI to national security, economies and democracy—and how we can defend against them: https://t.co/0phIiJjrmz

877

15K

17K

8enmann retweeted

Anthropic

@AnthropicAI

4 months ago

We’re publishing a new constitution for Claude. The constitution is a detailed description of our vision for Claude’s behavior and values. It’s written primarily for Claude, and used directly in our training process. https://t.co/CJsMIO0uej

518

968

8enmann retweeted

Jan Leike

@janleike

5 months ago

Interesting trend: models have been getting a lot more aligned over the course of 2025. The fraction of misaligned behavior found by automated auditing has been going down not just at Anthropic but for GDM and OpenAI as well. https://t.co/8DYm9SP7wF

118

828

255

314K

8enmann retweeted

5 months ago

Introducing Cowork: Claude Code for the rest of your work. Cowork lets you complete non-technical tasks much like how developers use Claude Code.

86K

58K

50M

8enmann retweeted

taylor

@tayroga

5 months ago

to achieve jhana, put a good feeling in a ralph wiggum loop

121

8enmann retweeted

Jack Clark

@jackclarkSF

5 months ago

https://t.co/LmpEfSs06w

402

6 months ago

I've been using this for a while now, not just for frontend development, but also for checking docs, sending Slack messages, and doing research. Game changer!