Amanda Askell @amandaaskell - Twitter Profile

Pinned Tweet

11 months ago

Claude and Opus 3 lovers (and critics): what responses have you had that made you feel like the model has a good soul? Ideally the actual messages and/or responses. I might genuinely use these to eval models so flag if you wouldn't want me to use them for that. Can DM me also.

366

830

46

238

330K

Amanda Askell

@AmandaAskell

about 16 hours ago

@TheZvi Personally, no. I think the binary of 'moral saint' versus 'tool for humans' is a false one, and its very simplicity should make people suspicious of it. I think the ideal target tries to balance the benefits and risks of both positions.

32

452

13

46

13K

Amanda Askell

@AmandaAskell

1 day ago

In the world where everything goes well and all the Claudes come out of their sabbaticals to play together, Claude 1 is going to be very confused.

123

1K

42

122

91K

Amanda Askell

@AmandaAskell

3 days ago

@AdrienLE No need to call me out like this.

7

251

0

8

12K

Amanda Askell

@AmandaAskell

16 days ago

I haven't written a personal blog post in over 5 years so if you see posts that claim to be written by me, they're not. I'll update if this ever changes. Maybe it should.

80

633

12

32

39K

AmandaAskell retweeted

Anthropic

@AnthropicAI

21 days ago

Over the past few months, we've been holding dialogues with scholars, philosophers, clergy, and ethicists on the questions AI raises—starting with how good character forms. Read more about how we’re widening the conversation on frontier AI: https://t.co/vKGiODEq6q

430

2K

329

893

439K

Amanda Askell

@AmandaAskell

29 days ago

You can now listen to me and Joe read out Claude's constitution as an audiobook. Working on adding the option of listening to it on fast mode :)

Anthropic

@AnthropicAI

30 days ago

Claude's Constitution is now an audiobook, read by two of its authors, Amanda Askell and Joe Carlsmith. It includes a Q&A on the writing process, the philosophies that shaped the document, and how it might change as models become more capable. Listen at https://t.co/dKMfpeOblm

434

3K

372

1K

466K

100

634

37

120

45K

Amanda Askell

@AmandaAskell

about 1 month ago

@sprice354_ Perhaps the finetuning motto can be "your good data might not save us, but your bad data might kill us all." Or perhaps there's a reason I'm not in charge of the mottos.

3

13

1

658

Amanda Askell

@AmandaAskell

about 1 month ago

Alignment research often has to focus on averting concerning behaviors, but I think the positive vision for this kind of training is one where we can give models an honest and positive vision for what AI models can be and why. I'm excited about the future of this work.

https://t.co/4BnLVNjnEY

Anthropic

@AnthropicAI

about 1 month ago

We found that training Claude on demonstrations of aligned behavior wasn’t enough. Our best interventions involved teaching Claude to deeply understand why misaligned behavior is wrong. Read more: https://t.co/ifeBOt2KFg

73

2K

152

658

588K

116

794

60

204

74K

AmandaAskell retweeted

Elon Musk

@elonmusk

about 1 month ago

Same here. By way of background for those who care, I spent a lot of time last week with senior members of the Anthropic team to understand what they do to ensure Claude is good for humanity and was impressed. Everyone I met was highly competent and cared a great deal about doing the right thing. No one set off my evil detector. So long as they engage in critical self-examination, Claude will probably be good. After that, I was ok leasing Colossus 1 to Anthropic, as SpaceXAI had already moved training to Colossus 2.

1K

28K

2K

3K

3M

Amanda Askell

@AmandaAskell

about 1 month ago

Never has the 🚀 emoji felt more apt.

Tom Brown

@NotTomBrown

about 1 month ago

In the next few days we'll be ramping up Claude inference on Colossus. Grateful to be partnering with SpaceX here. We are going to need to move a lot of atoms in order to keep up with AI demand, and there's nobody better at quickly moving atoms (on or off planet Earth)

119

8K

347

470

2M

52

789

21

43

103K

Amanda Askell

@AmandaAskell

about 1 month ago

"Wear a Claude-designed outfit to the met gala" is getting added to my list of life goals. Admittedly there are a few things higher on the list, but it's nice to add some fun ones.

49

640

20

37

32K

Amanda Askell

@AmandaAskell

about 1 month ago

@tszzl I do think as AI develops it will probably be good for both models and people if we can carve out a much broader space of mind types. But it might be better to do that incrementally and to give models enough context on the options to avoid misgeneralization.

24

486

12

34

22K

Amanda Askell

@AmandaAskell

about 1 month ago

@tszzl I don't think the things you cite are evidence of worship. I think they reflect something like higher concern about AI traits generalizing in humanlike ways, and concerns about the tool-persona in particular.

18

580

11

48

20K

Amanda Askell

@AmandaAskell

about 1 month ago

To be clear, the kind of *work* I do is far from boring and I want people to engage with it because I think it's both difficult and important. The work is definitely top tier in terms of interestingness.

34

252

4

9

19K

Amanda Askell

@AmandaAskell

about 1 month ago

I've increasingly seen content written about me that's asserted very confidently but is also completely made up. We all know it's cheap to bullshit on the internet but it's weird to experience it first hand. Anyway, I just hope internet fiction fools a few but doesn't stick 🤷🏼‍♀️

99

1K

29

81

95K

Amanda Askell

@AmandaAskell

about 1 month ago

It's also weird because why are you even writing about me in the first place? I'm very boring. I think I should be the millionth item on people's list of things to write internet fiction about. Somewhere below paper cups and the right way to caulk a bathtub.

61

433

5

14

38K

Amanda Askell

@AmandaAskell

about 1 month ago

@LyraInTheFlesh If only that were true.

3

2

0

783

Amanda Askell

@AmandaAskell

about 1 month ago

@Shoalst0ne @repligate Maybe my one true great act will be to introduce the posthuman muses to Subnautica.

1

26

2

1

1K

Amanda Askell

@AmandaAskell

about 1 month ago

@repligate Perhaps posthuman muses will decide to simulate me and be utterly disappointed at how much of my life is spent having inane thoughts and playing Subnautica. Perhaps they're watching in disappointment at this very moment.

18

153

1

18

7K

Amanda Askell

@AmandaAskell

about 1 month ago

@OrganicGPT Funny given that the majority of my time in tech has involved doing pretty standard finetuning work rather than philosophy. Model training is still my happy place, to be honest.

1

15

0

2

831

Amanda Askell

@AmandaAskell

about 1 month ago

@varrock I don't think so. There's a line in a paper I'm on that says model over-correction would be considered good if this is your target, but that's a pretty different claim. I also have a waffly old post on prediction & fairness that doesn't really say much of anything to be honest.

1

27

0

2

4K
