CEOs are uniquely prone to AI psychosis because they’re sufficiently distant from the last mile of work that still has to happen to generate most value with AI.
So when they play with AI, they see the happy path results, often not considering the next 10 or 20 things that have to happen to get sustainable results from agents.
“Look, I made this awesome product prototype.” Yes, but you didn’t have to review the code before it went into production and fix a bunch of issues.
“Look, I generated a contract.” Yes, but you didn’t have to verify all the terms before it went out to the counterparty, or wire up all the past contracts for it to work with.
The best thing you can do as a CEO is to use AI a *ton* to figure out the real implications of agents in the enterprise, and come out the other side with an appreciation for both the upside and the real work that goes into them.
everyone wants their product to work with AI agents. but no one wants their product to be abused. and let's be honest: AI means more + better "bad bots"
we are building a product that can help you, by letting you verify there's a real human behind AI agents. we think it'll be really useful for:
- dev products
- e-ticketing
- agent credit cards
- social networks
we are starting a private Beta of Human Principal with a few companies that want to integrate and give us feedback.
interested? https://t.co/l5lml4DIEq
@thegeeknarrator I disagree. Code is slow for humans. The more we read or write it the slower we go. To gain productivity from AI we need to disengage from code and put our energies into managing the structure, not the syntax, of the code.
@ASpittel It's honestly been a renaissance for legacy Linux features: grep, find, tmux, sed.
It shows how simple and well thought out these tools were in their day.
Software engineers don't get paid to write code; they get paid to solve problems.
The faster you realize this, the sooner you'll stop being afraid that AI will replace you and the better your career will be.
🎙️Introducing Max Agency
Max Agency is a new podcast where we go deep on how the best agents are actually being built: architecture decisions, tradeoffs, evals, and everything in between. Each episode, I sit down with engineering leaders who are doing this work in production.
Our first episode features Izzy Miller (@isidoremiller), AI Engineer at Hex (@_hex_tech). Hex has been shipping data agents since before most teams were even thinking about them, starting with single-cell text-to-SQL and graduating to a full Notebook agent that can work autonomously for 20 minutes on a complex analysis.
Izzy has a lot of perspective on what it actually takes to get agents working well in production, and what breaks along the way.
A few takeaways from our conversation:
- Keep your eval sets small enough to hold in your head: Izzy runs 30-50 handcrafted "traps" with multiple repetitions, rather than hundreds of variants. If you can't explain why your agent fails each one, your eval set is too big
- Day zero performance is almost irrelevant: The more interesting question is how the agent compounds. Izzy is building a 90-day simulation where the warehouse evolves and the agent has to accumulate understanding
- You can catch agent errors without seeing the raw outputs: By running an LLM-as-a-judge over production usage and clustering the results, you can surface places where something likely went wrong, without needing to read individual conversations
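The first takeaway (a small set of handcrafted traps, each run several times) could be sketched roughly like this. Everything here is an assumption for illustration: `run_traps`, the toy agent, and the two trap names are hypothetical and not Hex's actual harness.

```python
from itertools import count


def run_traps(agent, traps, repetitions=5):
    """Run every trap `repetitions` times and return {trap name: pass rate}."""
    rates = {}
    for name, (prompt, check) in traps.items():
        passes = sum(bool(check(agent(prompt))) for _ in range(repetitions))
        rates[name] = passes / repetitions
    return rates


def make_flaky_agent():
    """Hypothetical stand-in for a text-to-SQL agent that fails intermittently."""
    calls = count()

    def agent(prompt):
        n = next(calls)
        if "revenue" in prompt:
            return "SELECT SUM(amount) FROM orders"
        # Forgets the soft-delete filter on every other call.
        if n % 2 == 0:
            return "SELECT * FROM users WHERE deleted_at IS NULL"
        return "SELECT * FROM users"

    return agent


# Each trap is (prompt, check); the name says which mistake it is meant to catch.
TRAPS = {
    "soft_deleted_users": ("list active users", lambda out: "deleted_at IS NULL" in out),
    "revenue_total": ("total revenue", lambda out: "SUM(amount)" in out),
}

rates = run_traps(make_flaky_agent(), TRAPS, repetitions=4)
for name, rate in rates.items():
    print(f"{name}: {rate:.0%}")
```

Repetitions are what make a flaky trap show up as a pass rate instead of a coin flip, and keeping the set small means each named trap maps to one failure you can explain.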
Watch the full episode on:
- YouTube: https://t.co/AdkQbV3Pq2
- Apple Podcasts: https://t.co/1MKF7mcYSr
- Spotify: https://t.co/DxACw24oob
Thrilled to announce Claude Code auto-fix – in the cloud. Web/Mobile sessions can now automatically follow PRs - fixing CI failures and addressing comments so that your PR is always green.
This happens remotely so you can fully walk away and come back to a ready-to-go PR.
Many think AI will automate away knowledge workers.
Yet if you use these tools daily, it’s obvious that AI *increases* how much time you spend working.
Why? There’s infinite work to be done. Work used to stall on expertise gaps and turnaround times, and models have now massively reduced both.
Unpopular Opinion: We aren't building the future 10x faster with AI. We are just generating legacy code 10x faster.
Everyone is currently bragging about developer velocity. "I built this entire backend in a weekend!" "AI wrote 80% of my codebase!"
But here is the reality check we are ignoring: Code is a liability, not an asset.
If an AI tool spits out 1,000 lines of functional boilerplate in five seconds, that is still 1,000 lines that a human being has to read, review, secure, and maintain when the dependencies inevitably break next year.
We are treating code generation like a pure productivity win, but we are optimizing for the wrong metric. The bottleneck in software engineering was never how fast we could type. The bottleneck has always been comprehension, architecture, and maintenance.
If we don't shift our focus from "generation speed" to "architectural sanity," the tech debt of the next five years is going to be an absolute, unmaintainable nightmare.
auth is hard
1. google first told developers API keys could be public
2. google then changed what each key was allowed to do (more permissions). this seems related to an AI adoption push
3. now leaked keys can result in lots of $$
🚨big MCP news!
new auth spec is in. how does it work? 4️⃣ steps
the MCP server is now a "resource server" in oauth parlance (think API), so:
1️⃣ MCP client makes first request to MCP server
2️⃣ MCP server tells the client how it can authenticate to it with a file like this 👇 at a well-known location
3️⃣ the MCP client then reaches out to authorization_servers to authenticate and obtain credentials (think a jwt access token, could be others)
side note: @auth0 we are looking forward to being used as the authorization server for a lot of MCP servers. if you are interested in protecting your MCP server with @auth0 DM me :)
4️⃣ the MCP client then calls MCP server tools authenticating with the credentials from 3️⃣
this was a great industry wide collaboration that greatly improves the protocol! big 👏 to @dsp_ for shepherding this through
blog post from Den Delimarsky with more details about the protocols involved in reply (I took the screenshot from it)
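For anyone who prefers code to prose, the 4 steps can be simulated in memory as below. This is a sketch under assumptions, not the spec itself: the URLs, the `/tools/echo` path, and both fake servers are hypothetical, and the metadata field names and well-known path are assumed to follow OAuth 2.0 Protected Resource Metadata (RFC 9728). No network calls are made.

```python
import json

# Hypothetical hosts, for illustration only.
MCP_SERVER = "https://mcp.example.com"
AUTH_SERVER = "https://auth.example.com"
METADATA_PATH = "/.well-known/oauth-protected-resource"  # assumed, per RFC 9728

# The file from step 2: the MCP server describes itself as a resource server.
RESOURCE_METADATA = json.dumps({
    "resource": MCP_SERVER,
    "authorization_servers": [AUTH_SERVER],
})

VALID_TOKENS = set()  # tokens the fake authorization server has issued


def mcp_server(path, token=None):
    """Fake MCP server: returns (status, body)."""
    if path == METADATA_PATH:
        return 200, RESOURCE_METADATA
    if token not in VALID_TOKENS:
        return 401, "authentication required"
    return 200, json.dumps({"tool": "echo", "result": "hello"})


def authorization_server(issuer):
    """Fake authorization server. A real one would run an OAuth flow
    and could return a JWT access token instead of this opaque string."""
    token = f"token-from-{issuer}"
    VALID_TOKENS.add(token)
    return token


def call_tool_with_auth(path="/tools/echo"):
    # 1. First request carries no credentials, so the server refuses it.
    status, _ = mcp_server(path)
    if status != 401:
        raise RuntimeError("expected an unauthenticated request to be rejected")
    # 2. Read the metadata file to learn where to authenticate.
    _, raw = mcp_server(METADATA_PATH)
    issuers = json.loads(raw)["authorization_servers"]
    # 3. Obtain credentials from one of the listed authorization servers.
    token = authorization_server(issuers[0])
    # 4. Repeat the tool call, this time with the credentials.
    return mcp_server(path, token=token)


status, body = call_tool_with_auth()
print(status, body)
```

The point of the split is that the MCP server only validates tokens; issuing them is delegated to whatever authorization server the metadata file names.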