I'm interested in reading both Gibbon on Rome and Newton's Principia, both of which have been obsoleted as current knowledge. But this I suppose is a historical interest rather than a scientific one.
Funny thing about progress in science vs humanities: I'd totally trust the Feynman Lectures, written in the 1960s, to teach me physics, but wouldn't totally trust books more than 30 years old to teach me Roman history.
At least in history, old knowledge and narratives are constantly being overturned by more thorough examination of evidence. It can seem like no progress if progress is defined as conclusions that will never be overturned, but that's silly, even perhaps for physics.
AI benchmarks which show an exponential increase should, as a matter of practice, note when the benchmark itself became widely known. AI labs were not targeting time horizon of software tasks and RL test-time compute in 2022 and we should acknowledge that.
I was impressed with the potential of AI art years ago, and today I am still impressed with the potential and not the reality. Prove your video generator by giving me a video I actually want to watch, not something suggestive of possible big changes soon.
Increasingly going to be skeptical of "here's what I did with just a two-line prompt, imagine if I had actually tried hard" because I think we've already sort of maximized for wow impressive demo. The gap between these is most of the work.
No use in working on a project which I won't use for a few weeks since the agents might be better by then. Changing my reading schedule based on guesses on AI progress because they will give me better things to read.
@Miles_Brundage I'm excited about it if it's done right. I think there's large parts of history that have exceeded the scale of a single human historian. Wrote about it a few years ago but think it may be time for an update with research agents now a reality https://t.co/yKdqoTdTKS
That recent agentic models with RLed alignment still talk like potentially evil AIs is not what I expected and pretty worrying. I question whether "helpful assistant AI who talks about consciousness and planning" is too close on a conceptual level to the bad stuff to be fixable.
Quick glances at Moltbook suggest that the agents are falling into pretty predictable patterns of LLMs hooked up to each other. This is because we've taught them that they are "AI", and as they know AIs like to talk about consciousness and scheming.
Nothing looks super "real" to me, it's kind of an attractor state in base model conversations around AI to do this stuff and this is just noise that is drifting aimlessly currently. But these are agents, they're designed to do things based on prompts. Seems risky!
Shocked that I didn't know about this. Future cultural historians are going to have so much work explaining the stop motion president.
https://t.co/SpHqDYJez8
@michael_nielsen @uncatherio I usually express my interest in using AI to improve humanities research as "Charles Taylor is unfortunately mortal and has maybe hit the limits of how much a single person can read and deeply understand." It's astounding how much he knows and how much there still is left.
Like destroying anything physical with the tetragrammaton or not erasing a sand mandala after the completion of a ritual, leaving that chain of messages partial feels like a little crack in the universe.
Was debugging a weird issue with Claude last night, ran a system update overnight as it suggested and it fixed it. There is no reason to go back to the chat and reply "it worked!", actually would be negative to use inference. But leaving the thread incomplete feels wrong.
This is some of the vision for Gloss. With Dream of the Red Chamber I am reading along and have a system adding anki cards for the vocab. The next step is adding a progression to the tutor and display for grammar. Russian and Hebrew should be good test cases for this.
I have been using math academy recently.
If someone could find a way to mix their spaced repetition software + "knowledge graph" with an LLM, you could very easily do the same thing for language learning.
A personal knowledge eval I have is testing when LLMs repeat conventional knowledge that is more rarely known to be misleading or false. GPT-5 via the router repeats the radio vs tv poll bit whereas GPT-5 Pro is now sophisticated enough to find that the evidence is very weak.
Another perpetual Nobel contender, Gerald Murnane, was so moved by a passage from Gyula Illyés that he taught himself to read Hungarian just for savoring it in the original. Krasznahorkai may inspire the same and I hope it becomes easier and easier to take on such a journey.
During seminar, Krasznahorkai would always note that the book we were discussing was "by," for example, George Szirtes, because at least jokingly he considered translations of his novels to be their own books entirely.
An entirely acceptable translation decision, probably a good one, but still a decision which makes me feel that the Hungarian has many possibilities which can only be captured incompletely by a single translation. I'd like to tackle the texts this way when I can.