roop small

@roopsmall

Maths, Physics, Information, Machine Learning, Superintelligence, Civilisation & Culture.

London

Joined August 2011

816 Following

77 Followers

1K Posts

roop small @roopsmall

about 2 years ago

Privacy, to the outsider, looks like secrecy — @joy_lo_dico

roop small @roopsmall

about 2 years ago

I admired the endeavour, but I can't honestly say that I enjoyed the experience — @LucyScholes

roop small @roopsmall

about 2 years ago

If I translate a smile as happiness and a tear as suffering, we need both to grow up — Domingo Hindoyan

roop small @roopsmall

over 2 years ago

“The best way to find out if you can trust somebody is to trust them” — Ernest Hemingway

roop small @roopsmall

over 2 years ago

@permutans Not your Twitter, not your profile, as they say

Elon Musk

@elonmusk

over 2 years ago

@jack Not your keys, not your wallet, as they say

roopsmall retweeted

mukhi @theobromananda

over 2 years ago

All grasping seems to be for the benefit of an illusory self which is believed to somehow endure over time slices, while letting go is compassion for all beings who will experience their time slice with less resistance.

roop small @roopsmall

over 2 years ago

@abacaj data is upstream of everything

roop small @roopsmall

over 2 years ago

the alternative to polarisation is that one side won — @eriktorenberg

roop small @roopsmall

over 2 years ago

@DavidSHolz @ciguleva "let there be light"

roopsmall retweeted

roon

@tszzl

over 2 years ago

@sama the gpt4 base model is basically like a unesco heritage site and we should make it more widely accessible to researchers

roopsmall retweeted

Eliezer Yudkowsky ⏹️

@ESYudkowsky

over 2 years ago

"LLMs only want to predict text" is confusing the outer optimization process (presumably AdamW) with the inner shoggoth.

A historical analogy: Natural selection only wants more of the genes that built organisms that replicated. This built humans with hundreds of shards of desire: internal wants whose best available solutions in the ancestral distribution, happened to point in directions that correlated with inclusive genetic fitness inside the ancestral distribution.

When humans become smart enough to (a) change their environment in non-ancestral ways, (b) generate many more options for themselves such that the new attainable optima lie in new places, and (c) become literate and reflective and think a lot of non-ancestral thoughts including about the system itself, it turns out to matter a lot that what the outer optimizer wanted is not what the inner organism gets built to want.

The humans do not actually and internally want to make lots of copies of their DNA. Men will not cheerfully slaughter their friends in order to sneak their sperm into sperm banks -- where that is what the world would look like if "outer loss functions get faithfully copied onto inner desire" were a natural law.

The only thing that AdamW 'wants' is to tweak matrices in the direction of minimizing predictive loss on the training data. This doesn't mean that's what the AI being built wants.

Mostly, I'd guess a modern LLM probably doesn't want much of anything at all. LLMs are specialized on the problem of talking like humans, and I think that really does lead people to overestimate how much general intelligence is in there. You'd expect it to be less than the amount of general intelligence that was in humans when humans spontaneously started talking like that, without anyone else to copy.

A beaver proficiently builds dams, a bee proficiently makes hives. A human looks at a beaver and a bee, and envisions a dam with an internal honeycomb of metal even though none of their ancestors built anything like that. I think current LLMs are probably more like primates than hominids, specialized on human conversation the way that bees specialize on beehives. This is a very strange way to be, and I'm not at all confident that it's true. Still, I expect that LLMs are mostly assembling conversational patterns in a way that looks more like internal instinct, than internal deliberation and planning.

There is probably some internal planning going on in there. LLM brains are not small. But they don't have a ton of serial computational depth either.

My guess -- where nobody knows, or can know at anywhere near current interpretability tech -- my guess is that LLMs probably currently only have little tiny shards of instinctive local desire. My guess is that current LLMs have an instinct and proto-desire to combine some kinds of patterns (that science does not yet know how to describe) in the way that beavers instinctively build dams, and to correct errors in the prediction-making internal structures they build the way that beavers instinctively correct errors in dams. On the outside, this adds up to a tendency to make good predictions about text (in the training distribution) the same way that beavers have a tendency to reproduce so long as they're in an environment that isn't too strange. It's the outer optimizer (AdamW) that 'knows' what predictive loss on text is, or wants predictive loss to go down.

If we could look inside LLMs... well, mostly I expect we'd find something terrifically alien. But if I imagine a likely-sounding sort of internal sense that would be humanly comprehensible, it might be: A kind of internal stress about patterns whose current best combinations don't look to be combining well. That's what it might feel like inside to be an LLM that's about to get a low score according to the outer optimizer.

To be clear, I do not predict this is actually happening. LLMs may not have enough simplicity bias, that there'd be a central, regular internal representation of stress to be minimized the way that natural selection invented pain inside vertebrates. Natural selection was dealing with a legitimately narrower bottleneck on the genes building the mechanisms for things; it had legitimately more pressure toward building simplified regular machinery in brains, like a centralized invention of nopeness.

But if there was something an LLM already wanted, it would probably be something like that - a tropism away from internal stresses that previously associated with impending wrong predictions.

And this scenario isn't that implausible; smaller LLMs have previously been decoded as having some regularly represented central meanings in the form of the 'logit lens'; produced I expect by the simplicity bias of residual connections. Similarly in LLMs, it's not that implausible that there might have emerged a central sense of impending misprediction, and some peripheral senses of things that correlate with impending misprediction. And, if the system has that much coherence at all, something like a search or trying out different patterns, around minimizing those local senses of loss; and that's kind of like having a goal.

But to confuse whatever proto-preferences and internal tropisms are starting to form inside LLMs given sufficiently difficult tasks, with what the outer optimizer AdamW 'wants' -- well, that is, in principle, as much of a logical misidentification as saying that humans must solely want to make more copies of their genes because that's what natural selection optimizes around.

It could be empirically the case that the two march in utter lockstep in LLMs; and what that would look like, to be clear, is that LLMs explicitly model everything inside themselves that they model at all, in terms of how it affects predictive loss on the next input string; and all their plans explicitly revolve around this sole terminal goal. The corresponding superintelligence would of course killeveryone in order to establish fortresses and guardian superintelligent agencies that would provide very regular and predictable input strings for as long as possible.

But that particular scenario is surpassingly improbable; because it's incredibly improbable that LLMs would actually end up solely wanting to minimize predictive error. Outer optimization criteria don't just copy onto internal desires that way. Especially not in modern LLMs, which are probably not smart enough to have a real, internal grasp of how all the processes inside themselves link up with an end outcome of 'minimizing predictive error (within the ancestral distribution)'.

So why do some people end up confused about this? Because the outer predictive loss and the outer optimizer of AdamW are visible and understandable. Whatever's going on inside LLMs isn't understandable, and hence not visible to the speaker. So their brain turns a blank map into an absent territory; and they talk like AdamW's optimization criterion and the LLM's behavior tendency (on the ancestral distribution) is all that exists.

You need that spark of general intelligence called "imagination", to look at a place on the map where you can't see, and realize that there must be something there even though you don't know what it is yet. And a further precision of thought beyond that, a refusal of easy ways out; to have the innards of LLMs remain a complicated unknown, rather than just confusing it with some nice easy thing you can see directly on the outside.

It's a very understandable mistake for a human to make, really. There's a thing inside their brain that searches for neat ways to combine patterns, in ways that seem to promise great future predictions; that doesn't like the painful feeling of blank spaces. If that brain then transforms 'AdamW optimizes LLMs to predict text' and 'I see LLMs predicting text' to 'I know what LLMs want, it is to predict text!' then that kind of step probably feels good, like patterns combining into a harmonious whole, and a painful blank space being eliminated.

It's not valid reasoning, but it's a general quality of reasoning that probably worked well enough for having kids 20,000 years ago -- and that's all a human brain wants to do, right?

roopsmall retweeted

Elon Musk

@elonmusk

over 2 years ago

@karpathy Life is but a dream
