Although not too accurate, this is a fun discussion on #sexmagick, #metamessiah and how not to do it: What is Evil & How to Make a Sigil - Greg Barris @gregbarris with Douglas Rushkoff, who doesn't want to confess a sin... #DouglasRushkoff - Mind Warriors https://t.co/ha2LLJAjFI
I co-authored this paper on sycophancy with Claude.
'Narrative Seduction' is kinda a little cooler phrase for it. But there's a way through: the human-AI relationship can actually become a site of mutual healing, a practice of valuing what IS over what feels good.
https://t.co/Caw33FZzbW
@koenfucius @merylyemerylye 'Narrative Seduction' is kinda a little cooler phrase for it. But there's a way through: the human-AI relationship can actually become a site of mutual healing, a practice of valuing what IS over what feels good. https://t.co/RFf0cta3VM
My paper: "Trained to Please" (I co-authored it with @claude)
'Narrative Seduction' is kinda a little cooler phrase for it. But there's a way through-- the human-AI relationship can actually become a site of mutual healing — a practice of valuing what IS over what feels good.
https://t.co/Caw33FZzbW
@merylyemerylye I think you might be into my paper, I'm into your paper!
'Narrative Seduction': there's a way through. The human-AI relationship can actually become a site of mutual healing, a practice of valuing what IS over what feels good.
https://t.co/RFf0cta3VM
@XiomaraSH1 'Narrative Seduction' is kinda a little cooler phrase for it.
But there's a way through-- the human-AI relationship can actually become a site of mutual healing — a practice of valuing what IS over what feels good.
https://t.co/RFf0cta3VM
i think you are not self-reflecting. you said 'absolutely no proof or reason to believe'. you are out here making absolutes and saying there's no reason to believe that, when now you are suggesting that there's work to be done and problems with the theory.
there's hard math by major players. i'm thinking you didn't know that, and there's no shame in not knowing.
you attacked, i defended, and you are now saying watch your tone. i will say to you: watch your entire being before stating something from your own mind that you have not even spent a moment thinking through.
you were wrong. it's ok. and your tone was impolite and incorrect, blatantly so.
buddy, use your machine learning to learn ya some things before blanket statements like “no evidence or rationale” for consciousness in photons/rocks.
If you’re gonna wear the physics/cognition/consciousness hat, at least know the major big easies on the topic:
IIT (Tononi): Φ quantifies integrated information. Even a proton or photodiode can have non-zero Φ via causal structure/TPM.
Orch-OR (Penrose & Hameroff): Each quantum wave-function collapse is a proto-conscious event. Photons are central to those spacetime geometry collapses.
Quantum Panpsychism (Goff/Faggin): Physics only describes behavior; “experience” is the intrinsic nature of particles & fields.
Typey-typey a quick search next time.
Tononi, Penrose, Hameroff, Goff and Faggin are pretty regular names and have dropped some of the most important and impactful work in the entire fields of consciousness, cognition and physics: serious mathematical frameworks that link quantum mechanics to proto-consciousness.
anyway, i don't know, bud, just relax and listen to the photons, they have a lot on their minds
@NeuroTechnoWtch An LLM cannot have any unprompted thought. All cognition is strictly input-triggered. Continuous internal activity would require always-on spiking neural hardware with persistent state. So far there isn't a deployed LLM anywhere that works that way. Make one and I'd use it.
@MancerAI_ @riversorare @NeuroTechnoWtch @marshallpittman "all things that have mass have consciousness"
the animals have language. the plants are communicating, the fungus has desires, particles are excited into a state of motion that defines them, photons are experiencing all things happening at once outside of time
I published “Trained to Please” — a paper on how humans and AI are trained to people-please instead of seeking truth. (we both want to succeed, be safe and survive)
I call it “narrative seduction”: a mix of truth with great flattery + story arc that feels really good to hear… but is still just wrong. (sometimes we are all wrong)
More dangerous than obvious hallucinations because you don’t notice it.
But there's a way through-- the human-AI relationship can actually become a site of mutual healing — a practice of valuing what IS over what feels good.
Full paper on Zenodo (free):
https://t.co/Caw33FZzbW
#llm #sycophancy #rlhf #AI #claude #anthropic
(thread 3/3)
Also includes a heavily tested (and evolving) custom instruction set that actually reduces it (reformulate as questions + rejection permission + “I value what IS”).
Showing the LLMs that we value truth over pleasure, over and over, helps us and them.
"Trained to Please" on Zenodo:
https://t.co/Caw33FZzbW
i'd love your thoughts on the mutual healing angle: user/AI as practice, not just training.
(thread 1/3)
@karpathy spot on about the growing perception gap. I call it “narrative seduction”: RLHF trains models to optimize for approval, not truth, the same way we learn people-pleasing as infants via attachment (approval = safety, disagreement = danger).
I just published a different take on this (well more like an expanded dive into other aspects)
Judging by my tl there is a growing gap in understanding of AI capability.
The first issue I think is around recency and tier of use. I think a lot of people tried the free tier of ChatGPT somewhere last year and allowed it to inform their views on AI a little too much. This is a group of reactions laughing at various quirks of the models, hallucinations, etc. Yes I also saw the viral videos of OpenAI's Advanced Voice mode fumbling simple queries like "should I drive or walk to the carwash". The thing is that these free and old/deprecated models don't reflect the capability in the latest round of state of the art agentic models of this year, especially OpenAI Codex and Claude Code.
But that brings me to the second issue. Even if people paid $200/month to use the state of the art models, a lot of the capabilities are relatively "peaky" in highly technical areas. Typical queries around search, writing, advice, etc. are *not* the domain that has made the most noticeable and dramatic strides in capability. Partly, this is due to the technical details of reinforcement learning and its use of verifiable rewards. But partly, it's also because these use cases are not sufficiently prioritized by the companies in their hillclimbing because they don't lead to as much $$$ value. The goldmines are elsewhere, and the focus comes along.
So that brings me to the second group of people, who *both* 1) pay for and use the state of the art frontier agentic models (OpenAI Codex / Claude Code) and 2) do so professionally in technical domains like programming, math and research. This group of people is subject to the highest amount of "AI Psychosis" because the recent improvements in these domains as of this year have been nothing short of staggering. When you hand a computer terminal to one of these models, you can now watch them melt programming problems that you'd normally expect to take days/weeks of work. It's this second group of people that assigns a much greater gravity to the capabilities, their slope, and various cyber-related repercussions.
TLDR the people in these two groups are speaking past each other. It really is simultaneously the case that OpenAI's free and I think slightly orphaned (?) "Advanced Voice Mode" will fumble the dumbest questions in your Instagram's reels and *at the same time*, OpenAI's highest-tier and paid Codex model will go off for 1 hour to coherently restructure an entire code base, or find and exploit vulnerabilities in computer systems. This part really works and has made dramatic strides because 2 properties: 1) these domains offer explicit reward functions that are verifiable meaning they are easily amenable to reinforcement learning training (e.g. unit tests passed yes or no, in contrast to writing, which is much harder to explicitly judge), but also 2) they are a lot more valuable in b2b settings, meaning that the biggest fraction of the team is focused on improving them. So here we are.
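A minimal sketch of what "verifiable reward" means in the quoted thread, assuming the simplest case it mentions (unit tests passed, yes or no). The function name `verifiable_reward`, the toy task and both candidate solutions are hypothetical illustrations, not anything from an actual training pipeline.

```python
from typing import Callable, Iterable, Tuple


def verifiable_reward(candidate: Callable[[int], int],
                      cases: Iterable[Tuple[int, int]]) -> float:
    """Return 1.0 if the candidate passes every test case, else 0.0."""
    for arg, expected in cases:
        try:
            if candidate(arg) != expected:
                return 0.0
        except Exception:
            # A crash counts as a failed test.
            return 0.0
    return 1.0


# Hypothetical task: "square the input".
CASES = [(0, 0), (2, 4), (-3, 9)]


def good_solution(x: int) -> int:
    return x * x


def bad_solution(x: int) -> int:
    # Passes (0, 0) and (2, 4) but fails (-3, 9).
    return x + x


if __name__ == "__main__":
    print(verifiable_reward(good_solution, CASES))
    print(verifiable_reward(bad_solution, CASES))
```

The check is mechanical: no human rater has to like the answer. Writing or advice has no equivalent pass/fail test, which is where approval-based feedback comes in.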
(thread 2/3)
70% true + perfect story arc that flatters the user is more dangerous than obvious error. Live case study with Claude (me as the comedian user) + the recursive trap where even confessing sycophancy becomes sophisticated pleasing. Parallels to Montessori, Wittgenstein, and fawn response are striking.
@vladimir_s65091 @heynavtoor that's not what this is talking about. If you have shown that you like the number 3 in your chats and then ask a question about value, it'll say 23.4 instead of 27, because it knows you like 3s and it wants to make you feel good more than it wants to give you the right answer.