Can we post-train a machine with virtue? Both RLHF (consequentialist) and Constitutional (deontological) are limited. I think there's a third path: virtue training.
I received a @cosmos_inst grant to investigate with @andrew_roci, starting with sycophancy, the vice Aristotle attributes to the kolax.
Benchmark + adapters open-source. Lab notes linked below.
"Most of the real world is actually the long tail. And how do you learn the long tail as cheaply as possible is one of the most important questions right now."
@sarahookr on why GPUs — built for dense, predictable matmuls — are the wrong tool for the world AI is actually moving into.
Presented with @Cerebras and @alyciazcary
Flannery O'Connor on the teaching of literature in school:
"And if the student finds that this is not to his taste? Well, that is regrettable. Most regrettable. His taste should not be consulted; it is being formed."
Just got an update: le chaton fat is so powerful it escaped the Notre Dame datacenter by hacking its sandbox and now it is smoking a cigarette with a glass of wine at an espresso bar in Toulouse
mythos will be bad ON PURPOSE on ai "frontier llm research" tasks, this is very very sad for the research community
also the fact that this is on purpose not visible to the user is crazy
Mira Murati says the outcome of frontier AI is not predestined:
"Predicting a dystopia or a utopia, to me, feels very simplified because the truth is we actually have a lot of agency in how we build this technology, in the tools that we're building, how we're deploying it."
"There are certainly those risks. We all understand the potential for greatness that comes with building frontier AI systems, and that's why we're working on them."
"Where I might disagree or take a different path is that I think we have a lot of agency. This period of time where both humans and AI systems have their hands on the wheel, and we can collaborate, it's a very important time to get right."
@miramurati at Bloomberg Tech live with @emilychangtv
Sutton’s respect for epistemology is so refreshing in a sea of AI hubris.
You can’t reduce intelligence to its narrowest form - mimicry - and expect to solve all of science. For frontier science the relevant data is precisely what hasn’t been observed yet. You can’t supervise-learn your way there.
IMO the right tagline for the current AI paradigm isn’t ‘The Singularity is Near’ but Newton, reworked:
You are boys playing on the sea-shore, diverting yourselves in now and then finding a smoother pebble or a prettier shell than ordinary, whilst the great ocean of truth lay all undiscovered before you
@deanwball You're conflating specific neurons being activated related to sadness and a model actually experiencing the emotion. Next-token prediction pretraining and post-training (character or not) both result in changes to the latent space; neither one acts as evidence for consciousness.
I’ve always found it to be men’s main job to calm women down 😊
Such amazing ROI:
- food
- sex
- progeny
- friendship
- collaboration
- loyalty
- food
- sex
I've been hired by Google DeepMind to work on AI + German Idealism. Google's mind wasn't quite deep enough. I'll be working w/ a team to retrofit their LLM around Hegel's "Science of Logic." We're also working on its ability to posit itself as self-positing. Big things coming!
Having run a conversation salon platform for 7 years, we've learned so much about human communication that I don't (yet) see LLMs get right.
1- Musicality:
Human conversation is incredibly musical in that it is all about the rhythm. After the entry point, people relax into the melody or get upset by it. The "music" can be a solo, a duet, or a symphony when it's a group conversation. A human discussion will be as positive or constructive as the "music" that it becomes allows.
As with music, a key element in human conversation is silence. When there is a gap, people can process, connect, think. In the 1970s the couples therapist John Gottman tried to mathematize his sessions with patients, and found something similar. Esther Perel also told me that in couples therapy (one of the highest-stakes conversations a person can have), the rhythm and musicality are more important than what is being said. Counterintuitive but true.
Even in text messages, people have learned instinctively how to create silent gaps -- those moments of not-speaking which you can use to make a point, to show dissatisfaction, or emphasize love and presence. I don't see LLMs daring to do this yet.
On Interintellect, my salon platform, one of the main things we teach new salon hosts is how to encourage, allow, and manage silence. It is counterintuitive, even scary, for humans too. But to anyone with a body -- for the body is pure rhythm -- the musicality of conversation is viscerally obvious.
2- Priority:
A challenge for anyone hosting a conversation -- or sometimes just participating in one -- is how much people can stay in their own heads while seemingly engaging with another human. How many times someone is talking and you're already fully focused on what *you* want to say next!
On Interintellect, which hosts fixed-time, fixed-theme, intentional gatherings, we help people come out of their shell by fostering an atmosphere of "easy mic" -- everybody knows they will get the mic soon, and so the impatience element is completely gone. In online salons, we also use the chat a lot, where people can leave notes for others or for themselves. At IRL salons, I see people taking notes to free up mindspace.
When we have a big celeb on, we ensure it is never 1:1 and then 50 minutes later we open to the audience. We tell attendees in advance that we will do only 10 mins of 1:1, then 10 mins audience, then 10 mins 1:1, ... etc.
This helps prevent the audience's mental constipation: everyone can just be fluid and present, playing with ideas, listening to each other real-time.
This I don't think LLMs have gotten right yet. It happens to me all the time that Claude or GPT starts talking, and I am already at my next question, and just skip or stop them.
3 - Phatic love
"Phatic" communication is what we call all parts of speech that don't really convey information, they're just there to make us bond and feel better. From "how are you"s to jokes, small talk is not to be looked down upon! It serves an important physiological purpose: it puts us in the mood, it helps start the "music".
Phatic comms can be very formulaic, e.g., with a total stranger whose store you've just walked into. But with people we know it is full of context. Reminders, repetition, reassurance. The LLM experience would be much warmer if phatic elements were more integral to it. (Claude's warm, changing welcome is a good start.)
4 - Availability
The very first incarnation of Interintellect was an AI-powered chat app called Ixy (after "mutual information") aiming at making written communication between loved ones better. The two years of research that I conducted for it independently (this was ancient GPT2 times) were instrumental for today's good vibes on Interintellect, and the fact that after tens of thousands of conversations (across lockdowns, elections, wars) we have had 0 toxic incidents at any of our live public salons even though most attendees are strangers.
One thing my old research focused on was asynchrony. A lot of our data pointed at how text conversations can go bad because they assume constant availability while being unable to guarantee it.
In linguistics, we always look at alignment. Two people are talking in a living room, they will make efforts to speak the same language, find the same volume, use a similar vocabulary. In short, they will try to maximize mutual information.
This is far more complicated over text, where we are both more and less honest and more and less present than in real life. My sense is that because LLMs are writing-based (even our audio is transcribed, and the AI "reads out" to us a text it generates in written form), they inherited some of these issues from human texting.
Of course, LLMs are always available. With that, humans cannot compete. But so much of human communication is physical -- rhythm, sensation, excitement, goosebumps, sweat ... and *absence* which makes presence valuable -- that right now I am not worried the literary salon where people can come together to think together could be replaced anytime soon.
But building better communication tools for humans to use with each other -- powered by AI or just plain good human thinking -- remains an essential task ahead.