A Google Brain co-founder said something on stage that most AI teams do not want to hear.
Andrew Ng:
"You need 5 functions in a team of 2 humans. Each person covers more than one role."
AI fills the other three.
30+ minutes on self-improving loops, the product management bottleneck, and why trying to improve by 2% is harder than improving by 50%.
He validated the solopreneur model.
More signal in 30 minutes than most AI courses deliver in a week.
At HumanLayer, we’re on a mission to solve the AI slop code problem.
In 2025 we open-sourced our Research, Plan, Implement framework, now deployed inside Fortune 500s like Block and Uber - places where shipping slop is just not an option.
And that was just the beginning.
Today, we’re opening access to HumanLayer - an Agentic IDE, collaboration platform, and building blocks for your software factory.
HumanLayer enables engineers solving hard problems in complex codebases to:
> move 2-3x faster across the entire SDLC (not just coding)
> maintain rigorous standards for system architecture and program design
Hundreds of engineers at companies of all sizes are already using HumanLayer to ship fast without sacrificing quality.
I'm excited to invite you to try HumanLayer today at https://t.co/cQ648EkrnG, and I'm even more excited to see what you build.
@0xblacklight and I are deeply grateful to our team, our customers who give us so much incredible energy and feedback, our investors who have always been in our corner, and our friends and family who have supported us along this crazy journey
if you're a staff or principal engineer trying to make AI coding work at scale for your team, we'd love to hear from you
as @swyx likes to say - let's make this the year of no more slop
I have a deep distrust of almost any 'self-improvement' loop in coding agents
I.e. automatically created memories, CLAUDE.md suggestions applied after every session
Often the suggestions themselves are shit
But even if they're good, the agent often over-indexes on them in a way that's super unhelpful.
It makes the agent impossible to steer. And often because these memories are scoped per-project, each project is unsteerable in its own way.
What's the right name for this? Instruction rot?
In light of what happened, I'm doubling down on skills like /improve.
A frontier model got pulled. If it happened once, it's gonna happen again. Fable today. 4.9 tomorrow or maybe gpt 6 one day.
So, treat intelligence as borrowed. Drain intelligence when it's available. Build a catalog of plans today. Then implement later with a cheaper model, an open-source model, or a model you control.
Build the backlog now.
https://t.co/rqHw0fPv4G
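One way to picture "build the backlog now" is a directory of plan files that any later model can pick up. The sketch below is only an illustration under stated assumptions: the `Plan` fields, the JSON file layout, and the helper names `save_plan` and `pending_plans` are hypothetical, and none of it describes how the /improve skill actually works.

```python
import json
import tempfile
from dataclasses import asdict, dataclass
from pathlib import Path


@dataclass
class Plan:
    """A plan written while a strong model is available (hypothetical schema)."""
    slug: str
    title: str
    steps: list[str]
    status: str = "planned"  # flipped to "implemented" once some model ships it


def save_plan(backlog_dir: Path, plan: Plan) -> Path:
    """Persist one plan as a JSON file so it outlives the model that wrote it."""
    backlog_dir.mkdir(parents=True, exist_ok=True)
    path = backlog_dir / f"{plan.slug}.json"
    path.write_text(json.dumps(asdict(plan), indent=2))
    return path


def pending_plans(backlog_dir: Path) -> list[Plan]:
    """Load every plan that still needs implementing, in filename order."""
    plans = [Plan(**json.loads(p.read_text())) for p in sorted(backlog_dir.glob("*.json"))]
    return [p for p in plans if p.status == "planned"]


# Demo in a throwaway directory: one plan is still open, one is already done.
with tempfile.TemporaryDirectory() as tmp:
    backlog = Path(tmp) / "plans"
    save_plan(backlog, Plan("retry-queue", "Add a retry queue", ["design schema", "write worker"]))
    save_plan(backlog, Plan("dark-mode", "Add dark mode", ["audit colors"], status="implemented"))
    todo = [p.slug for p in pending_plans(backlog)]

print(todo)  # ['retry-queue']
```

The point of keeping plans as plain files is that the implementing model does not need to be the planning model; anything that can read JSON can work through `pending_plans`.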
Open-source AI MUST WIN
OpenAI / Anthropic winning is
- At best a world we can tolerate
- At worst a dystopia
We lose in all scenarios; the degree of that loss is a matter of how things play out + being AT THEIR MERCY
This is our MOST IMPORTANT FIGHT
Existential.
I believe on-prem and local AI - based on @huggingface open-source models - will be an important answer to the GPU shortages this year (because they are cheaper, faster, safer than cloud APIs)!
Great collaboration between @huggingface & @MichaelDell @Dell to make this a reality for enterprise today. Announced at the main keynote of Dell Technologies World.
hey pi kids. if you use @opencode zen or go for their free models, expect them to rate limit you hard. pi will not add the special headers needed for that not to happen.
gentlemen's agreement.
I strongly believe there are entire companies right now under heavy AI psychosis and it's impossible to have rational conversations about it with them. I can't name any specific people because they include personal friends I deeply respect, but I worry about how this plays out.
I lived through the great MTBF vs MTTR (mean-time-between-failure vs. mean-time-to-recovery) reckoning of infrastructure during the transition to cloud and cloud automation. All those arguments are rearing their ugly heads again but now it's... the whole software development industry (maybe the whole world, really).
It's frightening, because the psychosis folks operate under an almost absolute "MTTR is all you need" mentality: "it's fine to ship bugs because the agents will fix them so quickly and at a scale humans can't do!" We learned in infrastructure that MTTR is great but you can't yeet resilient systems entirely.
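To make the trade-off concrete, here is a minimal sketch using the standard steady-state availability formula, availability = MTBF / (MTBF + MTTR). The numbers are illustrative assumptions, not measurements from any real system: they only show that a much faster recovery time can still lose to a much higher failure rate.

```python
def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability: the fraction of time the system is up."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


# Baseline (assumed): fails about once a month (720 h), takes 1 h to recover.
baseline = availability(mtbf_hours=720, mttr_hours=1.0)

# "MTTR is all you need" (assumed): recovery is 10x faster,
# but failures are 20x more frequent because bugs ship freely.
fast_fix = availability(mtbf_hours=36, mttr_hours=0.1)

print(f"baseline: {baseline:.5f}")  # baseline: 0.99861
print(f"fast_fix: {fast_fix:.5f}")  # fast_fix: 0.99723
```

In this toy scenario the fast-fix system is down roughly twice as much as the baseline despite the tenfold improvement in recovery time, and the formula says nothing about the harder-to-measure costs such as latent risk.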
The main issue is I don't even know how to bring this up to people I know personally, because bringing this topic up leads to immediate dismissals like "no no, it has full test coverage" or "bug reports are going down" or something, which just don't paint the whole picture.
We already learned this lesson once in infrastructure: you can automate yourself into a very resilient catastrophe machine. Systems can appear healthy by local metrics while globally becoming incomprehensible. Bug reports can go down while latent risk explodes. Test coverage can rise while semantic understanding falls. Changes happen so fast that nobody notices the underlying architecture decaying.
I worry.
AI slop is good, actually. Slop is what enables fast parallel experimentation. The etiquette and skill is understanding the boundaries of where slop exists and the extent to which it should be cleaned up and how.
A few examples:
I’m working on the internals of some system right now. The API and GUI of this thing are fully zero-shame slop. It’s horrible. But it lets me focus on the core quality while shipping a usable piece of alpha-quality software to testers (transparent about the slop frontend).
Similarly, this system has plugins. We sent agents in Ralph loops overnight to generate dozens of plugins. The plugins are slop. The quality is bad. The plugin API/SDK is absolutely not done.
But we can test a full GUI with a full plugin ecosystem. When we change the API, we can regenerate them all. The cost of change is just tokens, the velocity is incomparable to before.
I built Terraform. We tested and shipped TF 0.1 with about 3 very weak providers. Because we ran out of time. Building was slow. And when we changed our SDK the cost was immense. Totally different today, 10 years later. Today, I would’ve slop generated 100 providers (again, with transparency and cleanup later, but just to prove it out).
As an anti-example, I would not PR this (without prior warning) to another project. I would not throw this onto customers without full review or transparency (as I’m already doing). I would not accept first-pass slop. It’s almost never right.
Slop is a tool. And like anything else it’s not blanket bad or good. The context is everything.
@AlanOliverDev @guiassisbrasil It was a quick experiment, so I simply downloaded the artifacts and moved them into a "mockups" directory in my project. I then asked Claude to use the mockups and to implement them on my stack (Backstage plugin).
@mattpocockuk I love Domain Events: they help me reason about the system (and provide a nice connection with event storming practices) and tie in nicely with automated testing.
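As a rough illustration of why Domain Events pair well with automated testing, here is a minimal sketch in which an aggregate records what happened as plain event objects and a test asserts on that list. The `Order` aggregate and the `OrderPlaced` / `OrderCancelled` events are hypothetical names chosen for the example, not anything from the original thread.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class OrderPlaced:
    """Domain event: a fact about something that happened (hypothetical)."""
    order_id: str
    total_cents: int


@dataclass(frozen=True)
class OrderCancelled:
    order_id: str
    reason: str


@dataclass
class Order:
    """Aggregate that records a domain event for every state change."""
    order_id: str
    total_cents: int
    status: str = "new"
    events: list = field(default_factory=list)

    def place(self) -> None:
        if self.status != "new":
            raise ValueError("order already placed")
        self.status = "placed"
        self.events.append(OrderPlaced(self.order_id, self.total_cents))

    def cancel(self, reason: str) -> None:
        if self.status != "placed":
            raise ValueError("only placed orders can be cancelled")
        self.status = "cancelled"
        self.events.append(OrderCancelled(self.order_id, reason))


# The test reads like the business story: placed, then cancelled.
order = Order("o-1", 4200)
order.place()
order.cancel("customer request")
assert order.events == [
    OrderPlaced("o-1", 4200),
    OrderCancelled("o-1", "customer request"),
]
```

Because the events are immutable value objects, the assertion compares them by content, so the test checks behavior without reaching into the aggregate's internals.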
The team reached out to me, and it seems the diff is actually my bad and comes down to my configuration.
It is in fact the same model on both Claude Max and the API.
@thsottiaux Does this not apply to users on a business plan? Also, the "up to $500 in credits" promo is not working for us.
Btw, since this week, all I can do in a 5-hour window is a single feature... In my case, the price hike is from 20 USD / month to 100+ USD / day.