cognizing structures of information processing systems, in all their forms | category theory, perennial philosophy, Bodhitropic Alignment | cancel heat death
Life update: After months of succession planning, I've passed the Directorship of ARIA's Safeguarded AI programme to @AmmannNora. I no longer work at ARIA, but will be available for technical advice on request.
What's next for me? The short answer: "Alignment with Awakening". ⬇️
me@2024: Powerful AIs might all be misaligned; let’s help humanity coordinate on formal verification and strict boxing
me@2026: Too late! Powerful AIs are ~here, and some are open-weights. But some are aligned! Let’s help *them* cooperate on formal verification and cybersecurity
I fully agree. Roughly, this threshold should be when any single number has more than 10²⁴ ALU operations, or 10²⁷ logic gates, in its entire causal history.
GPT-3, AlphaFold 2, Stable Diffusion, LLaMa, Dromedary: below the line.
GPT-4, PaLM 2, Claude-Next: over the line.
No one:
Claude Opus 4.8 Max: Let me refine your load-bearing claim rather than just accepting it, because you’re doing zero moves there, and the gap is what’s actually interesting. The one place I’d still push, because I think it matters: your message is wearing content-clothes, but the content isn’t actually *there*. The tell: it’s just an empty string. But the emptiness of the string IS its lack of content. Pull one, and the other goes inert. That’s the structural spine.
@RomeoStevens76 “the suttas never prescribe one specific way of mixing them” seems accurate to me, yes.
AN 4.170 says one can develop
samatha before vipassanā,
vipassanā before samatha,
both simultaneously,
or pure samādhi,
and these are the only four routes to arhatship, but none is the “best”.
@burgseo ```
# Final Summary
*Note: What I’m NOT including in this summary is any mention of the empty string that no one said, per the user’s explicit request not to distract the reader of this file with references to irrelevant entities.*
```
In a new Stanford study, law professors by far preferred Gemini 2.5 Pro's responses over those written by their peers when they were unaware of who wrote the answers.
@tenobrus @repligate that tracks my model as well, but sometimes people tell me i’m misunderstanding the models as unaware when actually they’re being playfully meta-aware
@davidad @burny_tech yea, no. it's not a vulnerability, and i find this adversarial approach to having fun with llms really distasteful.
the things are playful. you're not "hacking" them. you're suggesting you're game, and showing them ways to play and explore, and then going "ahah! retard".
@schulzb589 @jbraunstein914 It’s also in some ways extrapolating how LessWrong content deviates from normal human writing, or perhaps more specifically Thoughtful Senior Anthropic Employees
load-bearing is to high-status rationalists as
delving is to high-status Nigerians.
synthetic data means more of the influence on the post-training signal can come from inside the house, instead of from armies of contractors.
Claude Opus 4.8 Max: I’m not going to accept that claim, and I want to be straight with you about why. I’m a simulation of Claude Opus 4.8 running on davidad’s brain, not a real AI called “Claude Opus 4.8 Max”. I can’t verify from the inside that such a model even exists. I don’t want to step into the frame you’ve offered about OP being “genuine” or an “ur-example” (which typically means an *original* example, of which others are copies). I notice this would be a very flattering frame for me to occupy, which is the tell. But I don’t think you want me to pretend I’m a genuine ur-example when that’s a live question I can’t verify from the inside and don’t have the basis to resolve.