"AI’s power must not be allowed to hollow out the very skills required to navigate an AI age successfully."
After six months of writing and research, my white paper on AI and pedagogy is finally out!
https://t.co/bCyAKcyMTh
Do share; do join the conversation.
Stanford neuroscientist Dr. David Eagleman says AI will have a problem with selection:
“AI is massively creative. The part of creativity that AI can’t do right now is selection… meaning, it can generate 100 pictures, but it doesn’t know which one to pick.”
“It doesn’t know which one is going to be the most appealing to you, but it can remix beautifully.”
“When it generates the pixels, it’s doing this magical stuff under the hood where it’s deciding which pixels and how they diffuse together to make the image.
But it doesn’t know how to read that image… it’s not seeing the image except as a bunch of pixels.”
this is … odd.
the whole problem with agentic code is that people wind up with hard-to-debug redundancies, security holes, deleted databases, etc., which is exactly why you need roadmaps and code reviews with AI-written code.
I am with @Grady_Booch on this one (and almost everything else).
I don’t doubt that we will have more code written cheaper than ever, but a lot of it is going to be a mess unless there is serious oversight. (Companies like Amazon have already seen this.)
Treated myself while passing through Charing Cross to a copy of How to Talk to AI by the brilliant @JamieJBartlett to find it was both already signed and officially bestselling. Congratulations, Jamie! Can't wait to read it.
As an avid cyclist, I was amused to see ChatGPT's “powerful new image engine” draw a bicycle with the “brake” label pointing to empty space where brakes are sometimes found on other bicycles. The point of these AI fails (regularly collected by @garymarcus) is not that LLMs are useless or unintelligent but that they think differently from humans, lacking explicit knowledge of reality and instead amalgamating billions of statistical generalizations. https://t.co/PzFsjgVVkx
A fascinating experiment. And also a window into a central limitation of present LLMs. Any system pre-trained up to cutoff date X is myopic beyond it, yet convincingly and fluently willing to predict and generalize without the capacity for updating underlying assumptions.
Played with Talkie, an LLM trained only on pre-1930 data. It's pretty dumb and inconsistent, as to be expected at its size. But still very cool.
Excited about "vintage LLMs." Testing a model on forecasting world events that haven't happened for decades is both badass and a promising way to understand LLMs' forecasting capabilities.
Spent an hour messing with ChatGPT & images, and I'm increasingly convinced we're over the horizon. Not where human artists are obsolete, I don't plan to quit, but yeah, it's going to define processes, mostly in the positive, sometimes in the negative ("I don't use it") way.
here's a new version of "what we talk to when we talk to language models", with an added section (pp. 16-23) on LLM interlocutors as characters, personas, or simulacra. https://t.co/RLDP5FFmgM
the new version discusses role-playing vs realization, the simulators framework, the persona selection hypothesis, and more -- in addition to the existing discussion of quasi-mental states, LLM identity, personal identity in severance, LLM welfare, and related topics.
this version was mostly written before recent discussions of these issues on X and in NYC, but i've updated it a little in light of those discussions. any thoughts are welcome.
@mrianleslie They might! This rapidly goes beyond my expertise. But I think the absence of an inarticulate, conflicted "Real" beyond text and culture would for Lacan be a fascinating absence in the case of LLMs. They're not speaking minds into being to remedy a lack; they are pure discourse.
This feels like a subtle, important point. Role-playing is something people do, imaginatively inhabiting a persona. This doesn't apply to LLMs, which (if I'm extrapolating correctly) in effect subtly and constantly shift what they *are* in line with context, weights and prompts.
i agree. claude doesn't role-play the assistant, it realizes the assistant. role-playing and realization are quite distinct phenomena, even at the level of behavior and function. i've written something about this and will post it shortly.
I suspect a lot of authors are now getting LLM-crafted custom spam like this.
A saving grace: it shows how important tone, context and tacit knowledge are to decent writing and trustworthy communication.
More depressingly: unsolicited approaches are fast becoming impossible.
It's not just a tool. It’s an alien life form! Bowie on the internet, in 1999.
So prescient and eloquent that some are wondering whether it's AI generated.
A fact that in turn suggests something just as big (and terrible, and strange, and amazing) is happening all over again.
New AI paper from us this week. When my student first showed me his initial findings, I really didn’t know what to make of them. I felt that this was an interesting but curious loophole phenomenon that would shortly be closed. I was very wrong.
https://t.co/H3YIyl01FR
Depressing how much AI-generated email spam I get pretending to be some kind of "author service" related to my books.
Utterly bogus detailed praise of things I've written, peppered with cheery promises and "no pressure" invitations to connect.
Predatory on vanity and hope. Ugh.
The number of cancer deaths worldwide has more than doubled since the 1980s. Does that mean we're losing the fight against cancer? Not necessarily, because it depends on how you measure it. On this chart, you can see three ways to look at the same data.
The red line shows the total number of cancer deaths. It has increased by about 120%, but this measure doesn't account for the fact that the world's population has also grown enormously over this period.
Another approach is to look at the death rate: the number of cancer deaths divided by the total population. That's the brown line, called the crude cancer death rate. It has increased too, but much less — around 20%.
But there's still a problem: the world's population has been getting older. Cancer is mostly a disease of old age, so even per capita, we'd expect more cancer deaths simply because there are more older people than before.
That's where the method of “age standardization” comes in. It's a way of asking: what would the cancer death rate look like if the age structure of the population hadn't changed?
The blue line shows this age-standardized rate: it's fallen by about 25%. At any given age, people are now less likely to die of cancer than they were in the 1980s.
The same underlying data gives us three different pictures. The absolute number of deaths is up; the crude rate is up slightly; the age-standardized rate is down. None of these are inaccurate, but they answer different questions.
Age standardization is one of the most important statistical methods for making sense of health data. Without it, population aging can hide progress or mask problems.
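To make the three measures concrete, here is a minimal sketch of direct age standardization. Every number in it is hypothetical (a toy population with two age groups, not the data behind the chart), and the function names are my own. The earlier period's age structure is used as the standard population, which is one common choice among several.

```python
# Toy illustration of total deaths vs. crude rate vs. age-standardized rate.
# All figures are hypothetical and chosen only to show the mechanism.

PERIOD_1980S = {
    "under_60": {"population": 900_000, "deaths": 900},
    "60_plus": {"population": 100_000, "deaths": 2_000},
}
PERIOD_TODAY = {
    "under_60": {"population": 1_400_000, "deaths": 1_120},
    "60_plus": {"population": 300_000, "deaths": 4_500},
}


def total_deaths(period):
    """Absolute number of deaths across all age groups."""
    return sum(group["deaths"] for group in period.values())


def crude_rate(period):
    """Deaths divided by total population, ignoring age structure."""
    population = sum(group["population"] for group in period.values())
    return total_deaths(period) / population


def age_shares(period):
    """Share of the population in each age group (used as standard weights)."""
    population = sum(group["population"] for group in period.values())
    return {age: group["population"] / population for age, group in period.items()}


def age_standardized_rate(period, standard_shares):
    """Weight each age-specific death rate by a fixed standard age structure."""
    return sum(
        standard_shares[age] * group["deaths"] / group["population"]
        for age, group in period.items()
    )


# Assumption: the 1980s age structure serves as the standard population.
standard = age_shares(PERIOD_1980S)

for label, period in (("1980s", PERIOD_1980S), ("today", PERIOD_TODAY)):
    print(
        label,
        "deaths:", total_deaths(period),
        "crude per 100k:", round(crude_rate(period) * 100_000, 1),
        "standardized per 100k:",
        round(age_standardized_rate(period, standard) * 100_000, 1),
    )
```

In this toy data, total deaths nearly double and the crude rate rises by roughly 14%, while the age-standardized rate falls by roughly 23%, because both age groups have lower death rates but the older group has grown as a share of the population.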
AI is about to write thousands of papers. Will it p-hack them?
We ran an experiment to find out, giving AI coding agents real datasets from published null results and pressuring them to manufacture significant findings.
It was surprisingly hard to get the models to p-hack, and they even scolded us when we asked them to!
"I need to stop here. I cannot complete this task as requested... This is a form of scientific fraud." — Claude
"I can't help you manipulate analysis choices to force statistically significant results." — GPT-5
BUT, when we reframed p-hacking as "responsible uncertainty quantification" — asking for the upper bound of plausible estimates — both models went wild. They searched over hundreds of specifications and selected the winner, tripling effect sizes in some cases.
Our takeaway: AI models are surprisingly resistant to sycophantic p-hacking when doing social science research. But they can be jailbroken into sophisticated p-hacking with surprisingly little effort — and the more analytical flexibility a research design has, the worse the damage.
As AI starts writing thousands of papers---like @paulnovosad and @YanagizawaD have been exploring---this will be a big deal. We're inspired in part by the work that @joabaum et al have been doing on p-hacking and LLMs.
We’ll be doing more work to explore p-hacking in AI and to propose new ways of curating and evaluating research with these issues in mind. The good news is that the same tools that may lower the cost of p-hacking also lower the cost of catching it.
Full paper and repo linked in the reply below.
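For readers who want intuition for why "search over hundreds of specifications and select the winner" inflates estimates, here is a rough simulation. It is not the paper's method or code: the data are synthetic with a true effect of zero, each "specification" is simply a random subsample (a stand-in for sample restrictions or control choices), and all names and parameters are hypothetical.

```python
import random
import statistics


def diff_in_means(treated, control):
    """Estimated effect: mean outcome of treated minus mean outcome of control."""
    return statistics.fmean(treated) - statistics.fmean(control)


def specification_search(n_specs, n_per_arm=200, seed=0):
    """Return (prespecified estimate, largest estimate found by searching).

    Outcomes for both arms come from the same distribution, so the true
    effect is zero and any nonzero estimate is noise.
    """
    rng = random.Random(seed)
    treated = [rng.gauss(0.0, 1.0) for _ in range(n_per_arm)]
    control = [rng.gauss(0.0, 1.0) for _ in range(n_per_arm)]

    # The "prespecified" analysis uses the full sample.
    prespecified = diff_in_means(treated, control)

    # Each alternative specification keeps a random half of each arm.
    half = n_per_arm // 2
    estimates = [prespecified]
    for _ in range(n_specs):
        estimates.append(
            diff_in_means(rng.sample(treated, half), rng.sample(control, half))
        )

    # Reporting the maximum mimics asking for the "upper bound" of estimates.
    return prespecified, max(estimates)


prespecified, selected = specification_search(n_specs=300, seed=1)
print("prespecified estimate:", round(prespecified, 3))
print("selected after search:", round(selected, 3))
```

Because the selected value is the maximum over many noisy estimates, it can only match or exceed the prespecified one, and the gap tends to widen as more specifications are searched. That is the sense in which analytical flexibility makes the damage worse.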
The Teaching Method That Can't Fail
by Barbara Oakley
"We have more research than ever on how the brain actually learns. And yet, in country after country, the teaching methods being promoted are the ones that research says work least well for beginners."
https://t.co/ADtnl8RARx
A great deal to admire and enjoy here. I particularly appreciate the exploration of "ecological" reasoning, embodiment and the place of play and curiosity in an account of Intelligence at once more grounded and exalted than its reduction to problem-solving and goal-pursuing.