Makes me idly suspect a few things:
1. Biology is going to be very hard for LLMs, even given the massive corpus. I expect slow progress here, outside of things like protein folding.
2. Hallucination is a fundamental issue of the architecture, and can't be fully tuned away.
I've been paying more attention to LLM hallucination, and my main conclusion is that it happens A LOT, we just learn to filter it out and/or not ask questions it has a hard time answering. (This is all just like... vibe-tweeting so take it with a grain of salt)
I think one of the reasons LLMs are so effective at software is that the code IS the reasoning. We're tapping pretty directly at the underlying model of how a program works with language. There's little hidden state. There's little emergent behavior in a well-designed program.
These are all hallucinations, but over time I've learned to completely ignore them to the point where I'm not even consciously aware of it most of the time. It just gets eliminated in the first skim.
I wonder how much of my perception of AIs hallucinating less is that factor.
I realize that lots of my experience of AIs hallucinating less over time is actually me just getting used to filtering through the output. They still hallucinate like crazy, but I just habitually ignore it.
"Would you like me to draw it out? It really starts to make sense when you see a diagram of the whole process." Like does it actually? What are you basing that off of? Have you seen a diagram or are you just saying things?
"Most people underestimate..." Do they really?
Claude one-shotting a software project for me (Nov last year): Wait we're moving this is insane. 3 years max.
Claude superspeeding an AI research stab for me: Okay we're actually deep in the feedback loop. 1 year.
Today: Uhh... hmm....
My gut beliefs about AGI timeline over time:
Pre-2019: Not in my lifetime, reading Yudkowsky is fun.
GPT-2 Release (2019): In my lifetime holy fuck okay let's try to remain stable.
Agentic loop demos (2021): In some sense we're farther than I realized. 15 years.
ChatGPT (2022): No real update
Me seeing ChatGPT solve a game it couldn't on its own by writing an RL model (2023): Okay 10 years.
ChatGPT 4 (2023): Impressive, not crazy. Let's say 7 years.
[a whole lot of therapy]
Cursor (2025 for me): This isn't very good. 5 years still.
last night i had a dream that i learned a bunch of weird, neat theory about transmission lines. Today i woke up to find...
that it's all true!!! λ/4 z transformers are real!!! line capacity is often reactive power-limited!!! IT'S COOL ALRIGHT>????
i realize the overarching feeling I've been living with for the past 5 years technologically is waiting waiting waiting for the other shoe to drop. What a way to spend a life.