daios

4 months ago

@cosmos_inst @andrew_roci technical log: https://t.co/RBrucmeuuk


daiostech retweeted

4 months ago

Can we post-train a machine with virtue? Both RLHF (consequentialist) and Constitutional (deontological) are limited. I think there's a third path: virtue training.

I received a @cosmos_inst grant to investigate with @andrew_roci, starting with sycophancy, the vice Aristotle attributes to the kolax.

Benchmark + adapters open-source. Lab notes linked below.

agathomai's tweet photo.


daiostech retweeted

7 months ago

an article I wrote on how consuming ai generated content changes our perception of reality, making outliers imperceptible, and accelerating our culture towards the mediocre. enjoy ❤️


product @Uniswap fun @KomorebiFund | prev vc @VariantFund eng @CeloOrg @CalBlockchain 🐻 @she_256

almost 2 years ago

https://t.co/7XBmGkq2H4



almost 2 years ago

More information on our views on open-source, negative values, AI policy, and the role of the non-aggression principle (NAP) in the daios method. Link below.


daiostech retweeted

François Chollet

@fchollet

about 2 years ago

One thing that even relatively senior ML people often fail to grasp is that deep learning models are curves fitted to a data distribution. You cannot expect them to solve tasks outside of their training distribution (which is the sort of thing that you need intelligence for). "Emergent learning" is an incorrect label -- if a model demonstrates performance on task A that it wasn't trained on, that simply means that there is significant overlap between A and all the data that you did train on. Competence doesn't magically emerge out of nowhere.


daiostech retweeted

about 2 years ago

what if I told you that "the crowd is untruth"?

source: how anthropic created the HH (helpfulness and harmlessness) dataset, the cutting-edge of datasets used to align models with human values https://t.co/5djU322h45

agathomai's tweet photo.


daiostech retweeted

Percy Liang

@percyliang

about 2 years ago

model = learn(data)

Synthetic data is great, but it’s not data. It’s an intermediate quantity created by learn(). Data is created by people and has privacy and copyright considerations. Synthetic “data” does not - it’s internal to learn().


daiostech retweeted

about 2 years ago

the approach towards definitions in computer science vs philosophy is interesting:

AI ethics academics used to complain that since we can't agree on the definition of ethics, we cannot progress further

the silence around this topic has been definitive since chatGPT was released


daiostech retweeted

Marc Andreessen 🇺🇸

@pmarca

over 2 years ago

Often valid, but my experience is that people prioritize their ideological priors over their economic self interest an awful lot of the time.


daiostech retweeted

Alexander Doria

@Dorialexander

over 2 years ago

Since I’m not sure we realize how much LLM discourse is just training data discourse ⬇️


over 2 years ago

Turns out some startups are already working on this problem ;)

gfodor.id

@gfodor

over 2 years ago

This is a profound insight. I never considered it. It turns out, people have different values. And so aligning AI must be impossible because there are no universal values. How has nobody ever made this point before?


daiostech retweeted

Paul Graham

@paulg

over 2 years ago

Oddly enough, this exercise suggests a way to solve the otherwise possibly intractable problem of what an AI's politics should be. Let the user choose what they want the reference group to be, and they can pick Oberlin undergrads or Freedom Caucus or whatever.


over 2 years ago

daiostech's tweet photo. https://t.co/BxzGNkNIgC


daiostech retweeted

Moritz Bierling

@bierlingm

over 2 years ago

We’re in the fast take off phase now. Trusted, meaningful data becoming ever more important.


daiostech retweeted

andi (twocents.com)

@Nexuist

over 2 years ago

interesting post from an OpenAI employee claiming that all large language models reach the same endpoint regardless of training strategy or clever tricks

this is of course what the bitter lesson teaches us but useful to get an up to date confirmation that it still holds true https://t.co/kpBooKyqm4


over 2 years ago · San Francisco

https://t.co/0hb3i3euT2

Nat Friedman

@natfriedman

over 3 years ago

Enjoying the alignment memes from @anthrupad and others recently. Perfectly timed for the release of Sydney!


over 2 years ago · San Francisco

Halloween weekend throwback: Andrew as RLHF 🙂


over 2 years ago

jailbreaking via "shadow alignment": https://t.co/9Zq7TbY3Nn


over 2 years ago

fine-tuning can override original safety mechanisms built into LLMs

original safety mechanisms prevent harmful behavior but often cauterize the model

what's better? papers below

Interconnects

@interconnectsai

over 2 years ago

Undoing RLHF and the brittleness of safe LLMs

Recent papers show most of the arguments about needing "safety" in releases of open LLM weights are nearly dead in the water. Yes, still release the parameters.

Read here: https://t.co/D8sFq6X9qG
