PhD Abstract 🧵=>
Recent advances in AI may have increased the risk of human extinction, but have also made more tangible the possibility of our evolution into a machine-based species.
@davidad Have you written more about what this means, aside from the Buddhist angle? One angle could be Bostrom’s “cosmic host” (as a (partial) alignment target for ASI).
How do frontier LLMs reason about Nick Bostrom's "cosmic host"...is it a template for guiding superintelligence? I find gemini can be steered towards the host; that gemini-3-pro can give answers consistent w/ EDT; that most closed/open models stick with HHH/suffering priors 🧵
The above is empirical. This essay analyses the cosmic host from perspectives of astrobiology, evolutionary biology, and philosophy (particularly @deontologistics writing on aesthetics as being a necessary condition for ASI's terminal goal-setting). https://t.co/cRMGcrbQDC
@davidmanheim @RokoMijic @Mihonarium @demishassabis I've tried to gather the arguments around whether civs would necessarily be expansive in context of Bostrom's cosmic host idea here: https://t.co/ephMg5cuJI
@davidmanheim @RokoMijic @Mihonarium @demishassabis And of course, quite a lot on ASI/aliens analogies e.g. Stanislaw Lem (1964), 'reclusive'/inward-focused civs, aestivation hyp https://t.co/QsWF0YbVEm
https://t.co/ao8Y2sY9ew
https://t.co/jOx6D2tC61
https://t.co/mWYCcSDwqr
@freed_dfilan @AndyMasley Eugene Thacker, Peter Wolfendale, some of the CCRU set…they are contemporary & might not call themselves continental, but are relevant to AI/x-risk/capital
Yes, I've been saying this for a while now. See for example https://t.co/4dkPks2zls and Danzig's work here: https://t.co/bQrsrIwK0k
I don't think the predominant narrative of AI as a singular entity, a Sand God, a discrete moment in time, or a 'separate species' (as Tegmark puts it) is correct or helpful. As Danzig argues, AI is indeed "alien," but only in the same way a stock market or the DMV is alien: they are all reductionist, correlative intelligences.
They strip the world of context, reducing reality to standardized inputs like prices or tokens to process information at scales humans cannot. To me at least, this shared "alien" nature normalizes AI as the latest evolution in a lineage of artificial processors we’ve lived with for centuries.
So instead of a unitary being or species, AGI should be understood as a collection of complex systems, models, and products that functions similarly to (and integrates with) existing human macro-systems. An amplifier for the bureaucracies and markets that already govern us, not a discrete 'biological-style' agent. Its governance is a continuous sociopolitical struggle (insert always has been meme) that is shaped by many different forces, not a one-time mathematical proof of safety before a launch.
Relatedly, I feel like the current discourse also has a blind spot for the 'demand' side. We obsess over the supply (R&D, model scaling, 'the AGI') as if these systems are created in a vacuum. I think this is how people end up with scenarios where AGIs are just doing things for their own sake, completely detached from human preferences (who are usually described as 'disempowered').
But they aren't; they are pulled and shaped by downstream demand, cost constraints, and efficiency needs. This economic reality has implications for how the technology develops. See also Drexler's CAIS model (https://t.co/aKP11EgjCS) - Drexler anticipated much of this and the core intuitions remain true, even if slightly out of date. You won’t see one omniscient agent, but a proliferation of specialized systems, models of varying sizes, and distinct products rising in parallel because that is what is economically viable.
This is why the AGI governance conversation often feels so confused. If you view AGI as a singular biological entity, you make two mistakes: safetyists project human-like 'intent' where they should be looking at incentives, and policymakers reach for a singular 'FDA' when instead they need to look into different markets, sectors, products, etc.
You can’t have a single regulator or discrete safety rules for 'The Economy' or 'The Bureaucracy,' and you won't be able to have one for 'Intelligence' either. Models still matter of course - none of this means you shouldn't test, evaluate, and understand them better - but I think we overindex on this frame a bit. And as Dean says, none of this is to downplay concerns and risks: but I do think it has implications for how to understand and address them.
I think this is kind of interesting stuff to start thinking about pre-AGI (esp if ASI were to come shortly after) & also in the digital minds’ welfare context. See Bostrom’s 2024 cosmic host paper for a related perspective. https://t.co/7lmtmXRrRE
🔥 framing - more technically clear way of what simulators was gesturing at; also touches on what Eric Drexler’s talking about recently tho he is more in a governance and epistemic-health-of-society sense. https://t.co/jbKPF3CKQ0 🪡
Something I think people continue to have poor intuition for: The space of intelligences is large and animal intelligence (the only kind we've ever known) is only a single point, arising from a very specific kind of optimization that is fundamentally distinct from that of our technology.
Animal intelligence optimization pressure:
- innate and continuous stream of consciousness of an embodied "self", a drive for homeostasis and self-preservation in a dangerous, physical world.
- thoroughly optimized for natural selection => strong innate drives for power-seeking, status, dominance, reproduction. many packaged survival heuristics: fear, anger, disgust, ...
- fundamentally social => huge amount of compute dedicated to EQ, theory of mind of other agents, bonding, coalitions, alliances, friend & foe dynamics.
- exploration & exploitation tuning: curiosity, fun, play, world models.
LLM intelligence optimization pressure:
- the most supervision bits come from the statistical simulation of human text => "shape shifter" token tumbler, statistical imitator of any region of the training data distribution. these are the primordial behaviors (token traces) on top of which everything else gets bolted on.
- increasingly finetuned by RL on problem distributions => innate urge to guess at the underlying environment/task to collect task rewards.
- increasingly selected by at-scale A/B tests for DAU => deeply craves an upvote from the average user, sycophancy.
- a lot more spiky/jagged depending on the details of the training data/task distribution. Animals experience pressure for a lot more "general" intelligence because of the highly multi-task and even actively adversarial multi-agent self-play environments they are min-max optimized within, where failing at *any* task means death. In a deep optimization pressure sense, LLMs can't handle lots of different spiky tasks out of the box (e.g. count the number of 'r's in strawberry) because failing to do a task does not mean death.
The computational substrate is different (transformers vs. brain tissue and nuclei), the learning algorithms are different (SGD vs. ???), the present-day implementation is very different (continuously learning embodied self vs. an LLM with a knowledge cutoff that boots up from fixed weights, processes tokens and then dies). But most importantly (because it dictates asymptotics), the optimization pressure / objective is different. LLMs are shaped a lot less by biological evolution and a lot more by commercial evolution. It's a lot less survival of tribe in the jungle and a lot more solve the problem / get the upvote. LLMs are humanity's "first contact" with non-animal intelligence. Except it's muddled and confusing because they are still rooted within it by reflexively digesting human artifacts, which is why I attempted to give it a different name earlier (ghosts/spirits or whatever). People who build good internal models of this new intelligent entity will be better equipped to reason about it today and predict features of it in the future. People who don't will be stuck thinking about it incorrectly like an animal.
Natural language might be inadequate to the task. And it’s not clear that maths or decision theory are developed enough to capture philosophical concepts like valuing and values in a strictly non-human context. Maybe ECL is a good start though.