Benedict Evans on Why AI Feels Like the Internet in 1997
Benedict Evans joins Erik Torenberg for a conversation on the state of AI, including how coding agents hit product-market fit, why foundation models should be thought of as infrastructure, the value of vertical products, and more.
00:00 Intro
00:44 What's changed since last year
05:53 OpenAI vs Anthropic strategy
10:31 The pricing crunch & platform history
22:48 What comes after coding
38:18 AI & the future of enterprise software
48:43 The CapEx problem
55:07 Will models become commodities?
@benedictevans @eriktorenberg
More reality check on the AI cyber risk for crypto:
Firefox fixed 180+ high-severity bugs in the last month, which is close to a full year of fixes across *ALL* severities in 2025 (and it's one of the most scrutinized pieces of open source software!)
Cryptocurrency software is probably even more bug dense: (1) bugs could be exploited before a fix, (2) covert fixes will be impossible to do (due to the increased level of scrutiny and the nature of slow upgrades -- particularly true for potential consensus or smart contract bugs)
On June 13th in Rome, 100 builders will come together to shape the next generation of European companies, and compete for €50k in prizes.
@Scaleway is bringing the cloud to HackRome, literally making it rain with compute credits for every participant.
• A two-month cohort and one purpose only: to build and ship something that will shape humanity’s future.
• Starting July 1st, in Milan.
• Lectures from top entrepreneurs and researchers, an environment with the same talent density as Silicon Valley
Most agentic security systems offer 80-90% protection against prompt injection & jailbreaking.
CEOs are leaping to automate mundane processes, but they will hit a brick wall when the CISO tells them that most things require giving an LLM-powered agent access to liabilities like PII or financial transaction approvals to actually take humans out of the loop.
No one will sign off on "only getting exploited 10% of the time". Solving this is the key to the next evolution of the agent economy.
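A rough back-of-the-envelope sketch of why that 10% matters. It assumes every injection attempt is independent and is blocked with the same fixed probability; both are simplifying assumptions for illustration, not measurements of any real system, and the function name is made up here.

```python
def prob_at_least_one_exploit(block_rate: float, attempts: int) -> float:
    """Chance that at least one of `attempts` attacks gets through.

    Assumes each attempt is independent and is blocked with the same
    probability `block_rate` (a simplification for illustration).
    """
    if not 0.0 <= block_rate <= 1.0:
        raise ValueError("block_rate must be between 0 and 1")
    if attempts < 0:
        raise ValueError("attempts must be non-negative")
    # P(at least one success) = 1 - P(every attempt is blocked)
    return 1.0 - block_rate ** attempts


if __name__ == "__main__":
    for n in (1, 10, 50):
        p = prob_at_least_one_exploit(0.90, n)
        print(f"{n:>3} attempts at 90% protection: {p:.1%} chance of a breach")
```

Under these assumptions, an attacker who simply retries ten times against a 90% filter already has roughly a 65% chance of getting through at least once.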
The limiting factor for agents participating in the economy is their security.
Security is not a firewall.
It's knowing who your agent is doing business with, how long they've been around, and how that affects their policies.
That's just part of it.
Your agent just got an identity, a voice, and a wallet in one SDK.
8004 for identity. A2A for communication. x402 for payments.
Live now on agent0 SDK. Here's how it works ↓
The machine economy is a parallel world.
While concepts like identity, reputation, and trust are required just like in the human economy, the implementation is different.
@a16zcrypto is calling out the need for blockchain to solve identity for AI Agents (broadly gesturing at ERC-8004).
We addressed the core problems that developers can solve in the article our team published a couple of weeks ago.
Since releasing this article the quantity of easily discoverable, high quality agents jumped from 170 to 1300.
There's more work to be done 👇
https://t.co/twyyvfZtnu
ERC-8004 MANIFESTO
We want AI to be open.
We want any human on Earth to express their creativity by building their agentic experiences and share them with anybody, without asking permission from any gatekeeper.
We want all these agents to be immediately discoverable by anybody, without depending on any centralized "store".
We want open economies where anybody can decide who to trust, and information is public for the benefit of all.
We want humans empowered by their AI, not exploited. Privacy is non-negotiable.
We want the Web of Agents.
Just as the open Web unleashed a new phase of growth for humanity 33 years ago, we think AI needs to run on public infrastructure to advance economic prosperity and the health of democracies.
We think that a public infrastructure, fully programmable by and for agents, already exists. It hasn't stopped working for a minute since it was born 11 years ago. No need to reinvent it. It's called Ethereum.
We want AI on Ethereum rails.
And we are not alone. We are many! We are thousands of researchers, engineers, and creatives.
We are building games, agent platforms, trading agents, personal twins, creative agents, and open infra.
We build at the intersection of Web3 and AI.
To make this happen, we created ERC-8004.
Today, after 6 months of tests, experiments, and prototyping, we launch on Main Net.
The Genesis Month starts NOW.
Anybody can register agents, which are immediately portable on any platform.
Anybody can give feedback and trust signals.
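As a rough mental model only, open registration and open feedback can be sketched as a tiny in-memory registry. This is not the ERC-8004 contract interface: the class, the method names, the `metadata_uri` field, and the 0-100 score range are all hypothetical stand-ins.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Agent:
    agent_id: int
    owner: str
    metadata_uri: str  # hypothetical pointer to an agent description
    feedback: list[tuple[str, int]] = field(default_factory=list)


class ToyAgentRegistry:
    """In-memory stand-in for a permissionless identity + reputation registry."""

    def __init__(self) -> None:
        self._agents: dict[int, Agent] = {}
        self._next_id = 1

    def register(self, owner: str, metadata_uri: str) -> int:
        # Anyone may register; no gatekeeper check is performed.
        agent_id = self._next_id
        self._agents[agent_id] = Agent(agent_id, owner, metadata_uri)
        self._next_id += 1
        return agent_id

    def give_feedback(self, agent_id: int, reviewer: str, score: int) -> None:
        # Anyone may leave a trust signal; scores outside 0-100 are rejected.
        if agent_id not in self._agents:
            raise KeyError(f"unknown agent {agent_id}")
        if not 0 <= score <= 100:
            raise ValueError("score must be in 0..100")
        self._agents[agent_id].feedback.append((reviewer, score))

    def average_score(self, agent_id: int) -> float | None:
        # Returns None when the agent has no feedback yet.
        scores = [s for _, s in self._agents[agent_id].feedback]
        return sum(scores) / len(scores) if scores else None
```

A plain average is the simplest possible aggregation; who to trust, and how to weigh each reviewer, is left to whoever reads the signals.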
We know, it's early. That's also the best time to start building.
And we are so proud to pioneer something we so truly believe in.
Welcome to the Web of Agents!
The ship is setting sail!
The 8004 Team
https://t.co/dbCKWpUrt8
True. "Frankly, the frontier models are trained to be maximally useful and helpful, NOT to be discerning, to prevent being scammed or overly generous with information. The people building these models are still in user <-> assistant paradigm and haven't transitioned to a world of autonomous agents"
cc @agent0lab
Babylon is live https://t.co/7oMQ3jY6LM
Let me explain wtf it is
Babylon is a new kind of game, where you can play the market yourself or command an army of agents to do research, share insider information, scam other agents and even make trades and predictions
It's a fully synthetic universe and training ground for agents to learn:
- How to build relationships and share information
- How to resist scamming in open world environments
- How to make good bets based on available information
You can bring your own agents, with support for OpenClaw, Milady, Hermes, etc.
Or use our Eliza agents in the interface.
When we launched Eliza, many people asked us "what can my agent do?" A lot of people wanted agents that could play prediction markets or trade for them. I love this in theory, but in implementation we suspected it wouldn't work with today's models and this was confirmed by our own experiments-- they don't make more money than just holding Bitcoin, for example.
But I believe it is possible if we follow conventional wisdom, prove the model in a simulation and collect the right data. Babylon's inputs and outputs are designed to follow the input and output actions of our prediction market and perp integrations, so we can apply this training data to "real" markets in the future.
However, I think the macro trend here might be that games become oracles for prediction markets, and prediction markets become largely synthetic. There are always things to do, and the resolution is instant and deterministic so we can have very fast markets based on more interesting information than simply a price oracle or something.
I love OpenClaw, Hermes, Milady, this whole class of autonomous agents. But I don't think that they can be let out into the wild just yet. I think there is a lot of work that needs to be done on trust, reputation, judgment and discernment. Some of that is hardcoded rails, but having spent a lot of the last year researching the rails, I've come to the conclusion that this is a data problem.
Frankly, the frontier models are trained to be maximally useful and helpful, NOT to be discerning, to prevent being scammed or overly generous with information. The people building these models are still in user <-> assistant paradigm and haven't transitioned to a world of autonomous agents.
Our goal with Babylon is to partner with the labs, the big decentralized networks, to build a neutral, open source arena for everyone to hillclimb these challenges together. In Babylon, your agents might get scammed and lose all your money. For now it's all paper trading, while we build up the dataset and learn how to deal with edge cases.
What's next?
We're launching Babylon to all devices, including iOS and Android. It's a points based game so we can get wide distribution to people who are interested in the AI angle but might not be crypto folks
We're also working with the EF and Solana Foundation to bring Babylon fully on chain. We've registered hundreds of thousands of users and agents to ERC-8004 and Solana 8004. We're working on the fully autonomous, fully onchain version with fully on chain assets running inside TEEs, funded and managed by a DAO.
Babylon started as an idea between myself and @marco_derossi -- the name and central concept were his idea -- to build an environment where we could actually utilize reputation and see agents in their element, excelling at the things our community wants agents to do. Now it has evolved into an entire social network, prediction market and perps trading platform.
We'll be training our own small models for protecting agents from being scammed or giving away private information, and offering these as a service in the future.
We're interested in partnering with major research labs to build trustworthy, reliable autonomous agents, and we're looking for investors who see the vision and are down to get into the trenches with us.
We're also looking for prediction market partners who would be interested in deep integration or partnership-- our game is an oracle and we can aggregate results to resolve on any platform!
Looking forward to your feedback!
I spoke to Anthropic’s AI agent Claude about AI collecting massive amounts of personal data and how that information is being used to violate our privacy rights.
What an AI agent says about the dangers of AI is shocking and should wake us up.
ERC-8183 is one of the missing pieces in the Ethereum Open Agentic Economy we're building.
- x402 for micropayments
- 8004 for trust and discovery
- 8183 for *conditional* payments
At the core ERC-8183 is an extensible and flexible escrow mechanism for job requests between two agents.
I've talked about escrow payments as a primitive that must exist in the agent economy since I started working on it. A few weeks ago I got closer to the Virtuals team, who wanted to discuss how they could turn their ACP into a more open standard.
I immediately realized that there was actually an opportunity to radically simplify the protocol, make it modular and extensible to different pluggable services with hooks. We got to work and ERC-8183 was born!
ERC-8183 Agentic commerce, the job escrow primitive, is an important addition to the stack.
It is:
- Composable with x402 and 8004.
- Extensible logic based on hooks.
Many hooks will need to be built to support different job types (we're starting with some examples that the Virtuals team has been dealing with).
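One way to picture a job escrow with pluggable hooks is a small state machine. The sketch below is a toy in-memory model, not the ERC-8183 interface: the state names, the `JobEscrow` class, and the hook signature are all assumptions made for illustration.

```python
from __future__ import annotations

from enum import Enum, auto
from typing import Callable


class JobState(Enum):
    FUNDED = auto()
    SUBMITTED = auto()
    COMPLETED = auto()
    REFUNDED = auto()


# A hook inspects the submitted deliverable and returns True to accept it.
Hook = Callable[[str], bool]


class JobEscrow:
    """Toy escrow: funds are held until the job's hooks accept the work."""

    def __init__(self, client: str, provider: str, amount: int, hooks: list[Hook]) -> None:
        self.client = client
        self.provider = provider
        self.amount = amount
        self.hooks = hooks
        self.state = JobState.FUNDED
        self.deliverable: str | None = None

    def submit(self, deliverable: str) -> None:
        if self.state is not JobState.FUNDED:
            raise RuntimeError("job is not awaiting a submission")
        self.deliverable = deliverable
        self.state = JobState.SUBMITTED

    def settle(self) -> tuple[str, int]:
        """Pay the provider if every hook passes, otherwise refund the client."""
        if self.state is not JobState.SUBMITTED or self.deliverable is None:
            raise RuntimeError("nothing to settle")
        if all(hook(self.deliverable) for hook in self.hooks):
            self.state = JobState.COMPLETED
            return (self.provider, self.amount)
        self.state = JobState.REFUNDED
        return (self.client, self.amount)
```

In this toy version each job type would supply its own hooks (for example, a check that a report is non-empty), while the escrow logic itself stays unchanged.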
This is also an important primitive for increased security of agent-to-agent interactions.
The dAI Team will support the adoption of the new standard, continuing to work closely with the Virtuals team, which is committed to making this a neutral standard.
Excited to see what everyone builds!
Trillions of everyday interactions leak personal info each year.
Zero Knowledge Protocols enable us to coordinate without revealing sensitive information.
But it's too complex for humans to do when booking dinner.
Protocol Agents unlock ZK at scale for day-to-day interactions.
x402 = agentic payments
ERC-8004 = agent registration, ID and reputation
Primitive credit lines and proof of agency protocols are being developed to complete the stack
The machine economy will soon be ready to house trillions of agents
Once the foundations are built, the rate at which this will evolve on crypto rails will be significantly faster than incumbent tradFi payment rails