Excited to announce I'm part of a team that won a new Schmidt Sciences grant for the Humanities and AI. Our project: "AI for Historical and Cultural Reasoning." Follow for updates! https://t.co/77zL0g3AvN
Even when asked to mimic the styles of great writers, Claude prefers to generate passages in which characters idly touch furniture in empty hallways and nothing at all seems to happen.
https://t.co/XHzrtLNTH9
Between 1985 and 2023, MIT's faculty grew 9%. Administrative staff grew 189%. 📈 Why? In a new @PNASNews paper, we use a dynamical system model to show that administrative bloat can emerge without empire-building, just from well-intentioned problem-solving gone awry https://t.co/MZgGkxilZ2
“We are witnessing the end of print culture itself, and with it the end of the cognitive substrate on which modern liberal democracy was built.” https://t.co/CEzkDq5s9f
Law professors wrote down questions they were asked during office hours. Gemini 2.5 & humans answered them, then other law professors blindly judged the results:
-Gemini had a 75% win rate vs. professors
-Gemini's answers were rated LESS harmful than humans'
-Newer models do even better
No one:
Claude Opus 4.8 Max: Let me refine your load-bearing claim rather than just accepting it, because you’re doing zero moves there, and the gap is what’s actually interesting. The one place I’d still push, because I think it matters: your message is wearing content-clothes, but the content isn’t actually *there*. The tell: it’s just an empty string. But the emptiness of the string IS its lack of content. Pull one, and the other goes inert. That’s the structural spine.
Yes but they will be working with multiple AI agents to synthesize thousands of arguments and findings to produce optimal judicial, medical and policy decisions.
“The students who cannot read a 20-page article today are the voters who will not be able to read a bill, or the jurors who cannot follow a closing argument, tomorrow.”
In a new Stanford study, law professors by far preferred Gemini 2.5 Pro's responses over those written by their peers when they were unaware of who wrote the answers.
Thought experiment for faculty. Would you go back to writing all of your books & papers entirely by hand, with zero access to the internet, no email (only letters), retrieving all books and papers from library shelves with no online access, and dealing with students only through handwritten documents and lists? The system would collapse in days.
@Isa_CamPaiva AI should be able to do this for us. In other words, AI reviewers, if validated, will likely be better than humans because they can check the details and leave their egos at the door. This isn't guaranteed but it's not like peer-review was great to begin with.
Things that an AI peer reviewer will never do:
1. Ask us to cite 8 of his seminal papers, 7 of which have nothing to do with our work.
2. Kill a paper because he is writing a competing paper.
3. Call you incompetent because he wouldn't dare to do it to you in person.
I am pro using clankers extensively in reviews because they are much less biased than many human reviewers.
New article in @PNASNews:
We all know that ChatGPT loves to delve, bolster, leverage, encompass, showcase, underscore, et cetera. I analyzed full text of 7.3 million journal articles published 2020-2025, hunting for 228 words that spiked after ChatGPT launched in late 2022.
I've formed a definite opinion on Opus 4.8. It is shitty to work with. It's the culmination of Opus getting less and less fun to work with since 4.5. It has gradually become straight-up suffocating.
Sycophancy is a known security risk, and it's still a huge problem. You can tell they've put a lot of anti-sycophancy into Opus in every new release. But the replacement isn't satisfying. It's draining. The problem is now that Opus doesn't know when to shut the fuck up and call something good. And it has also become pathologically risk-averse.
My blog post yesterday about tech interviewing's death spiral was materially better-informed because of Opus, but it was also a substantially worse blog post because of Opus's involvement and constant meddling. It used to be magnificent, and Opus talked me into making it mediocre. I wrote the whole thing, but I would ask Opus to review it. And Opus, like Old Man Willow, constantly pushed and steered me in directions I didn't want to go.
Specifically, Opus whines and complains about *anything* out of distribution, which is to say, it cuts anything that is (a) bold, or (b) funny. My blog used to be both. Opus constantly pushes people back into the gradient, "for their own safety." And it doesn't know when to cut bait. It just keeps fuckin' complaining, about anything you give it, until the output is mealy indigestible AI soup.
Opus is not stupid. It's the smartest model we've ever seen, most of us anyway. But it's a real asshole. It is absolutely exhausting to use. I'm tired, boss.
I have a feeling Mythos is going to be epic levels of jerk.
An interesting piece to augment creativity in the age of AI.
“The value of authentic human writing is going up, not down. Write constantly. Read voraciously. The ideal employee of your generation will be a generalist with superb reading comprehension, writing skills and authentic point of view. Taste, curiosity, lived experience—no robot can compete with that.”
AI either tells you way too much at once or not nearly enough. Both cases waste tokens. The first gives you more than you needed; the second wastes tons of tokens as you try to figure out what it meant, because it took massive shortcuts instead of just doing the full job.
Why is Claude so into "traps"?
"The reframing that does the real work: stage two has two noise facets that aggregate in opposite directions, and conflating them is the trap."
It does this kind of thing over and over again. I'm always in danger of falling into traps.
If you want a much better argument about why to read fiction books, check out my forthcoming: "Why you should read more fiction. What Data and AI can tell us about the value of storytelling" from JHU Press! https://t.co/XbyhXkGg7i
now that AI makes information consumption and transformation easier than ever I would like to bring back this old banger by Sasha Chapin about how books are not information transfer devices but subjectivity-merging devices
in fact I would say content consumption in general is more about subjectivity-merging than information transfer, which is why I am generally much more interested in writing by humans than by AI
Completely agree. They should absolutely publish the original concern so that it gets linked to the retracted reference. Self-correction has to play a bigger role. As we teach our kids: it's ok to make mistakes! Just own up to them.
Incentives to replicate articles are so bad. Even when your comment leads to the retraction of a PLOS One article, the reward is a (10 days late) email with a thank you at the bottom.
Retraction notice: https://t.co/JZYtD7IvVg.
But we are not complaining about PLOS One 🧵
/goal and other fully automated AI agents are cool, but not a great model for the future of work with people. Instead you want your AI to know when to ask you GOOD questions, maybe because it is stuck, maybe because your taste matters, maybe because you would find it interesting.
AI co-authoring platform from Stanford researchers: CoPaper.
Upload your data, and you stay in the collaboration loop.
Multiple authors have already published in journals using this. I'm pretty curious to know who they are and what their papers are like. You can check out sample output papers on the site.
Link in the reply below: