One aha from class with @alighodsi at MS&E 435 this wk:
Open source closing the gap with closed source may be inevitable.
Why? Distillation.
The old training substrate was the internet. Common Crawl is roughly 2T tokens (10^12).
The new training substrate is AI-generated output. If OpenAI + Anthropic have produced ~$50B of tokens at ~$5 per million tokens, that's ~10^16 proprietary-model tokens in the wild.
That is roughly four orders of magnitude more than Common Crawl. (~10^4)
At some point, every training run is learning not just from humans, but from the exhaust of the best closed models.
The gap may close because *the teacher cannot stop teaching*.
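A quick back-of-envelope check of the arithmetic above, as a minimal sketch. The inputs are only the rough figures quoted in the post (~$50B of spend, ~$5 per million tokens, ~2T Common Crawl tokens); the function and variable names are made up for illustration.

```python
import math

# Rough figures quoted in the post; all three are estimates, not measured values.
SPEND_USD = 50e9               # ~$50B of tokens sold
USD_PER_MILLION_TOKENS = 5.0   # ~$5 per million tokens
COMMON_CRAWL_TOKENS = 2e12     # ~2T tokens


def generated_tokens(spend_usd, usd_per_million_tokens):
    """Convert total spend into a token count at a flat per-million price."""
    return spend_usd / usd_per_million_tokens * 1e6


tokens = generated_tokens(SPEND_USD, USD_PER_MILLION_TOKENS)
ratio = tokens / COMMON_CRAWL_TOKENS
orders = math.log10(ratio)

print(f"{tokens:.0e} tokens, {ratio:,.0f}x Common Crawl, ~{orders:.1f} orders of magnitude")
```

Under these assumptions the ratio is about 5,000x, so "four orders of magnitude" is a rounded-up figure.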
@martin_casado there are two large, capable, and well-resourced entities with clear strategic interests in ensuring open models keep up: China and Nvidia
preventing distillation and capturing market share are in tension. it'll be hard to distill GPT-7-BioChem, easy to distill Default Claude.
Lesson on power laws: My essay on High Agency is getting ~200,000 monthly readers, mainly from group chats, 12 months later.
It's outperformed everything I've posted in the last 10 years *combined* by 100x.
I think there are two ways to create content right now:
1. The slop zone - Increase quantity of posts. Post 10-20x per day. Churn out clip farms. Win the war of throwing as much shit at the algorithm as possible. People don't remember what you post, but they remember that you post.
2. The golden hippos on unicycles zone - Increase quality of posts. Post something awesome 1-4x per year. Whilst everyone is speeding up, slow down, and spend 100x more time on the quality. Focus on making something people remember and share one year later.
The slop zone is entering the Red Queen moment. You have to post more to reach the same people. It will soon go from 10-20 posts per day, to 40-50 posts per day, to 400-500 posts per day, to maintain the same impressions. Whoever has the biggest army of fifteen-year-olds, using an army of AI agents, to chop up the most sensationalist clips will win.
The slop arms race means the golden hippos on unicycles have an advantage, because people are craving content that wasn't made in a day more than ever. It took seven months to write High Agency, and I didn't post once the entire time.
In an age of online dementia, where people can't remember a single post from yesterday's hours of scrolling, the best question to ask is: What can I make that people will still remember one year from now?
[ DOOMER ]
ANTHROPIC LAUNCHES "AGENTIC ECONOMY FUND" FOCUSED ON INVESTING IN AREAS OF CRYPTO THAT ENABLE AGENTS TO PARTICIPATE IN THE GLOBAL ECONOMY: BBG
I'm convinced that adding "Open-" to your company name instantly 10x's your odds of success.
OpenAI
OpenEvidence
OpenTable
OpenRouter
OpenCode
OpenDoor
OpenGov
OpenWeb
OpenText
OpenView
OpenSea
OpenStore
OpenFX
OpenSpace
OpenArt
OpenHands
OpenPipe
OpenNote
In 1945, Vannevar Bush imagined a machine to extend a scientist's memory. He called it the MemEx.
80 years later, we built one for LLM agents.
Tool outputs become Python objects; only print statements reach the model's context.
🧵 https://t.co/YyrGsn3TB7
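One way to picture "tool outputs become Python objects; only print statements reach the model's context" is the minimal sketch below. It is not the implementation from the thread: `fetch_rows`, `Workspace`, and `run` are hypothetical names invented for illustration, and the sketch simply assumes the agent writes Python snippets that execute in a persistent namespace.

```python
import contextlib
import io


def fetch_rows(n):
    """Hypothetical tool: returns a large result as a plain Python object."""
    return [{"id": i, "value": i * i} for i in range(n)]


class Workspace:
    """Hypothetical memory: tool results live on as variables between steps."""

    def __init__(self, tools):
        # Persistent namespace shared by every snippet the agent runs.
        self.namespace = dict(tools)

    def run(self, code):
        """Execute a snippet and return only what it printed."""
        buffer = io.StringIO()
        with contextlib.redirect_stdout(buffer):
            exec(code, self.namespace)
        return buffer.getvalue()


ws = Workspace({"fetch_rows": fetch_rows})

# Step 1: the tool output is stored as an object; nothing is printed.
out1 = ws.run("rows = fetch_rows(10_000)")

# Step 2: only the printed summary would be passed to the model's context.
out2 = ws.run("print(len(rows), rows[-1]['value'])")
```

Here the 10,000-row result stays in `ws.namespace` and the only text returned is the short printed line. A real system would need sandboxing, since `exec` runs arbitrary code.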