And today's minimum viable product of my full day's market research:
The Chief Unit Economist GPT
It helps forward-deployed engineers answer prompts like the following:
1. Analyze this workflow economically
2. Should we hire or automate?
SITUATION DETECTED: Antares Nuclear’s Mark-0 reactor has achieved initial criticality at Idaho National Laboratory, a milestone the U.S. Department of Energy is calling one of the most significant technological achievements in nuclear energy in over 40 years.
A new study showed that those who ate the highest amount of ultraprocessed foods—like packaged cookies, hot dogs and chips—had a 58% higher risk of later developing dementia. https://t.co/7mAsq3lOSS
Hermes agent has been the single best remedy for my ADHD. I created a random thoughts vault in Obsidian, and now whenever the flood of a million invasive thoughts comes I just add them to this folder via speech-to-text.
I then created a markdown file called life priorities that explains to Hermes exactly what’s important to me, in order of priority, personally and professionally. I’ve instructed Hermes to review the Obsidian vault multiple times a day, map it to calendar, email, projects in progress, etc., and then cross-reference the priorities and deliver a few things to me.
1. A cron that acts on all of this context to deliver a beginning-of-day and end-of-day (prep for tomorrow) newsletter
2. An agent that also uses this context to send me texts throughout the day about relevant things I may need to consider (this has taken a ton of tweaking but is better now)
What I want to build next? A listener, coach
A tool that has access to my screen 24/7 and is watching what I am doing and delivering proactive insights via voice/audio as I’m working
Pulled the trigger today and switched 100% of Lindy traffic to DeepSeek v4, churning from Anthropic models.
Saves us millions of $ and we're actually seeing an *increase* in performance on many core use cases. Transformative for the business.
Now we know why Peter Thiel packed his bags for Argentina.
Milei just submitted his AI legislative framework to Congress, where he proposes:
- zero regulation on AI development,
- a brand-new "non-human corporation" category for AI/robot-operated entities with limited liability,
- a low-tax regime with flexible governance rules.
The Dutch East India Company gave the world the limited liability company in 1602. Milei wants Argentina to do the same for autonomous AI agents in 2026.
Far from reality, but the best way to reproduce IMHO is intelligent IVF. At the very least, we should fix diseases before they manifest. Ideally, we should evolve to be a better species.
BEAUTIFUL!
This anti-vax study that I showed was fatally flawed has now been retracted.
It was garbage and never should've been published. It's been used to dissuade parents from getting their kids vaccinated for a potentially deadly disease.
Glad it's gone.
Jensen Huang: Because of the way I was raised, I have no trouble taking criticism
"Asian parents' way of showing love is to criticize you.
That's their way of saying I love you, I want you to be better, it looks like you did the best you could, but that's not good enough."
Brutal 😂
Melinda French Gates will expand her giving to improve women’s health globally, pledging another $215 million to support contraceptive access and maternal care, as well as initiatives aimed at middle-aged women, including further study of menopause. https://t.co/t1uZECKti8
We just published internal data on how much of Claude's development is already being done by Claude:
- Over 80% of all code merged into our codebase is now written by Claude
- It's been months since many researchers at Anthropic hand-wrote code
- The typical Anthropic engineer ships 8x as much code as they did in 2024
- On the most open-ended engineering tasks, Claude's success rate jumped from ~26% to 76% in 6 months
- When research sessions went off-track, Claude proposed a better next step than the human took 64% of the time
We're not at recursive self-improvement yet, but it could come sooner than most expect. I highly recommend reading the full blog post.
We're fixing a Codex bug today that was causing us to undercount tokens being served to some Pro and Plus accounts by a small amount. This impacted < 15% of accounts.
Not the kind of bug you want us to fix, but didn't want to do this silently and thought you should know.
I think it is really worth reading this piece on RSI at Anthropic.
There is a bit of navel-gazing, some marketing, and a lot of very sincere beliefs about what Anthropic thinks is likely in the near future of AI that you probably want to be aware of. https://t.co/A5yxryBjHv
Our internal data shows Claude is accelerating AI development—a possible path to recursive self-improvement, or AI autonomously building a more capable successor.
It’s happening faster than we thought, and the implications deserve greater attention. https://t.co/OVVPJO7VQx
Holy moly, Anthropic is getting very serious about recursive self-improvement!
One word: acceleration.
Insane blog article.
Tl;dr:
•We are close to an AI capable of fully autonomously designing and building its own successor
•They stress this isn’t here yet and isn’t inevitable, but could arrive sooner than most institutions are ready for
•Anthropic engineers now ship on average 8x as much code per quarter as they did in 2021–2025
•Task length AI can reliably complete is doubling roughly every 4 months (up from every 7 months)
•Opus 3 (Mar 2024) handled ~4-minute tasks; Sonnet 3.7 (a year later) ~90-minute tasks; Opus 4.6 (a year after that) 12-hour tasks
•SWE-bench went from low single digits to saturated in two years; CORE-bench (research reproduction) went ~20% to saturated in 15 months
•METR found Claude Mythos Preview could work “at least” 16 hours, at the top of what they can currently measure
•As of May 2026, Claude authored 80%+ of code merged into Anthropic’s codebase (low single digits before Claude Code launched in Feb 2025)
•A March 2026 poll of 130 research staff: median respondent estimated ~4x output with Mythos Preview
•One April 2026 example: Claude shipped 800+ fixes cutting a class of API errors 1,000x, work an engineer estimated would have taken a human four years
•Claude-written code quality: worse than human in late 2025, roughly at parity now, expected to be strictly better within the year
•On the hardest open-ended tasks, Claude’s success rate hit 76% in May 2026, up 50 points in six months
•Code-speedup test: Opus 4 averaged ~3x speedup (May 2025), Mythos Preview ~52x (April 2026); a skilled human needs 4–8 hours to hit 4x
•In an AI-safety research project, Claude agents recovered 97% of a performance gap (vs ~23% for two human researchers in a week), over 800 compute-hours and ~$18K
•On picking the better “next step” in research sessions, the best model beat the human choice 51% (Nov 2025, Opus 4.5) rising to 64% (April 2026, Mythos Preview)
•Human comparative advantage, for now: research taste and judgment, i.e. choosing which problems matter and when an approach is a dead end
Three possible futures
•The trend stalls (S-curve), but today’s capabilities still diffuse widely; they consider this least likely
•Compounding efficiency gains, with humans still setting direction; 100-person firms doing the work of 10,000+; they think this is the likely path
•Full recursive self-improvement, where AI builds its successors and pace is set by compute; the alignment outcome here is what they’re least certain about
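
A quick sanity check on the doubling-time claim, using only the task lengths quoted in the Tl;dr (~4 minutes, ~90 minutes, 12 hours). This is a rough sketch, not the blog post's method: it assumes each gap is exactly 12 months and that growth within a gap is a clean exponential. The function name is made up for illustration.

```python
import math

def doubling_time_months(start_minutes, end_minutes, months_elapsed):
    """Implied doubling time if task length grew exponentially over the gap."""
    doublings = math.log2(end_minutes / start_minutes)
    return months_elapsed / doublings

# Figures quoted in the thread; the 12-month gaps are an assumption
# ("a year later" taken literally).
opus3_to_sonnet37 = doubling_time_months(4, 90, 12)
sonnet37_to_opus46 = doubling_time_months(90, 12 * 60, 12)

print(f"Opus 3 -> Sonnet 3.7: {opus3_to_sonnet37:.1f} months per doubling")
print(f"Sonnet 3.7 -> Opus 4.6: {sonnet37_to_opus46:.1f} months per doubling")
```

Under those assumptions the first gap works out to roughly 2.7 months per doubling and the second to 4 months, so the "roughly every 4 months" figure lines up with the more recent pair of data points.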