m.l not ML

@mlnotml

I write about software engineering and AI

Charlotte, NC

Joined November 2008

197 Following

66 Followers

217 Posts

mlnotml retweeted

Andrew Ng

@AndrewYNg

2 days ago

“Loop engineering” is a hot buzzphrase after mentions of it by Boris Cherny (Claude Code’s creator) and Peter Steinberger (OpenClaw's creator) went viral on social media. Loops are now a key part of how we get AI agents to iterate at length to build software. In this letter, I’d like to share my 3 key loops, shown in the image below, for building 0-to-1 products. These loops guide not just how I build software, but also how I decide what software to build.

Agentic coding loop: Given a product specification and optionally a set of evals (that is, a dataset against which to measure performance), we can have an AI agent write code, test its work, and keep iterating until the code is bug-free and meets its specification. This idea of closing the loop took off around the end of last year, and it has been a game changer in enabling coding agents to work longer productively without human intervention. For example, over the weekend, I was building an app for my daughter to practice typing, and my coding agent could easily work for around an hour, using a web browser to check what it had built multiple times before getting back to me, without needing my intervention.

The engineering loop executes quickly. Every few minutes, the coding agent might build and test a new version of the software. I hear frequently from developers who are finding new ways to engineer more effective engineering loops. This is an active area of invention!
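The agentic coding loop described above can be sketched in a few lines. This is only an illustration under stated assumptions, not how any particular coding agent works: `toy_agent`, `run_evals`, `agentic_coding_loop`, and the tiny eval set are hypothetical names invented here, and the "agent" is a stub that returns canned drafts instead of calling a model. The spec in this toy is "double the input".

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# An eval is an (input, expected output) pair; the eval set is the dataset
# that each candidate implementation is measured against.
Eval = Tuple[int, int]
Candidate = Callable[[int], int]


@dataclass
class LoopResult:
    passed: bool          # True if every eval passed before the budget ran out
    iterations: int       # number of write-and-test rounds that were run
    failures: List[Eval]  # evals still failing when the loop stopped


def run_evals(candidate: Candidate, evals: List[Eval]) -> List[Eval]:
    """Return the evals that the candidate fails."""
    return [(x, want) for x, want in evals if candidate(x) != want]


def agentic_coding_loop(
    propose: Callable[[List[Eval], int], Candidate],
    evals: List[Eval],
    max_iterations: int = 10,
) -> LoopResult:
    """Write, test, and iterate until the evals pass or the budget is spent."""
    failures = list(evals)
    for attempt in range(1, max_iterations + 1):
        # Hypothetical stand-in for "the agent writes code", given the
        # failures observed in the previous round.
        candidate = propose(failures, attempt)
        failures = run_evals(candidate, evals)
        if not failures:
            return LoopResult(True, attempt, [])
    return LoopResult(False, max_iterations, failures)


def toy_agent(failures: List[Eval], attempt: int) -> Candidate:
    """Stub agent (assumption): returns a better canned draft each attempt."""
    drafts: List[Candidate] = [lambda x: x + 1, lambda x: x * x, lambda x: 2 * x]
    return drafts[min(attempt, len(drafts)) - 1]


EVALS: List[Eval] = [(1, 2), (2, 4), (3, 6)]

if __name__ == "__main__":
    print(agentic_coding_loop(toy_agent, EVALS))
```

The `max_iterations` cap is a design choice of this sketch: a loop that runs without human intervention still needs a budget, so it hands control back to the developer with the remaining failures instead of iterating forever.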

Developer feedback loop: In this loop, a developer examines the current product and steers the coding agent to improve it. Last year, a lot of developers (including me) were acting as the QA (quality assurance) function for our coding agents, manually finding bugs and then asking the agent to fix them. But with coding agents much more able to test their own code, the amount of time we need to spend on this function has decreased significantly. This allows us to make higher-level product decisions, such as what key features to offer, where the UI needs improvement, and so on.

The developer-feedback loop operates over time intervals between tens of minutes and hours — that's how frequently a developer might review a product and give feedback. In the case of the typing app, I changed my mind a few times about the visual design, what cat costumes she can unlock as she learns (she loves cats), and the user flow for a grown-up to log in and steer the child's learning experience.

When a developer has a clear vision for what to build, it is still a lot of work to translate that vision into a specification for a coding agent to implement. Further, after the developer has seen an implementation, they might update (or perhaps clarify) the spec to steer it toward what they want. If you find that the system repeatedly runs into certain problems, building a set of evals for the agent becomes useful.

AI-native teams are increasingly using AI to help shape product direction, for example, automating the gathering and analysis of usage data, summarizing written and verbal customer feedback, or carrying out competitive analysis. However, for pretty much all the products I’m involved in, I see humans as having a significant context advantage over current AI systems (we know a lot more than the AI system about the users and the context the product has to operate in), and thus humans play a critical role. Many people describe this human contribution as “taste,” but I prefer to think of it as humans having a context advantage, since that gives us a clearer path to helping AI systems get better. This also speaks to why this step can’t be automated: So long as the human knows something the AI does not, human-in-the-loop is needed to inject that knowledge into the system.

External feedback loop: This includes a wide range of tactics like asking a few friends for feedback, launching to alpha testers, or putting the code into production with A/B testing. These tactics are usually slow, rarely taking less than hours and sometimes taking days or even weeks. This data informs the developer vision, which in turn continues to drive the detailed product spec, which in turn drives the coding agent.

With coding agents speeding up software development, more engineers are starting to play a partial product management role. For many engineers who are growing into this role, the hardest part is shaping the product vision and striking a balance between building (bridging the gap between vision and spec) and getting user feedback to evolve the vision. It is important to do both!

I will write more about how to do this in future posts, but for now, I find it encouraging that engineers are playing an expanded role (just as product managers and designers now do more engineering).

[Original text: The Batch]

317

10K

534K

m.l not ML

@mlnotml

4 days ago

This is gold. The leverage principle is the key. So the central question of engineering isn't "how do we find a mind big enough to hold all of this", because there isn't one. The question is "how do we shape the thing so that a small mind can work on it without bringing it all down".

Mario Zechner

@badlogicgames

4 days ago

recommended reading. https://t.co/emuVkKpo8T

595

865

33K

m.l not ML

@mlnotml

6 days ago

Not surprised. OSS for the win.

Chubby♨️

@kimmonismus

6 days ago

This doesn’t sound good, friends. It doesn’t sound good at all.

262K

mlnotml retweeted

François Chollet

@fchollet

10 days ago

Programming is not about code, just like music is not about notation. It is the art & science of managing complexity through layers of abstraction. AI is simply a part of it.

131

402

692

122K

mlnotml retweeted

13 days ago

Major cheat code for life: Become difficult to rush. The world will pressure you to rush into everything. Rushed decisions. Rushed conversations. Rushed relationships. Rushed timelines. There's immense power in rejecting that trend. Slow down. Create space to think clearly.

331

32K

635K

mlnotml retweeted

Mitchell Hashimoto

@mitchellh

17 days ago

The problem with the "if it works who cares what the code looks like" mindset for agentic work is that it assumes the agent has a perfect understanding of "works." Realistically, things are underspecified, agents make bad assumptions, etc. To be fair, agents are pretty good at unit test coverage. They're pretty bad at designing human experiences (API, CLI flags, etc.), especially cohesive ones for future roadmap plans they may not have visibility into (unless your backlog is perfect and vision fully laid out, which I doubt). They're bad at knowing where performance matters and what type (CPU vs memory tradeoffs). They're bad at where compatibility matters and where it doesn't (and tend to err on the side of preserving it without further guidance). Etc. Unless you have this ALL specified, you can't possibly claim "it works" without taking a look and thinking about it.

131

311

198K

mlnotml retweeted

Deli Chen

@victor207755822

19 days ago

I Have a Dream. That One Day, the world’s most advanced intelligence (whether it’s called GPT, Fable, DeepSeek, or something else Just Does Not Matter~) will be just like the electricity and water supply today. Cheap. Stable. High‑quality. It will power all kinds of AI software, the way electricity powers all kinds of appliances. We’re still living in the era when the light bulb was just invented. There will be countless AI applications in the future. That’s why token production and token transportation, as the underlying infrastructure, will be the most important infrastructure build of the next 10 to 30 years. This is an industry‑wide upgrade. A benefit to all of society. And this is exactly why I strongly oppose closed‑source models. #AGIForEveryone How can we let infrastructure, something as essential as water and electricity, become a luxury for the few? How can we expect people to pay such exorbitant bills? #路长而歧行则将至 #我心匪石不可转也 #AI #Infrastructure #OpenSource #TokenEconomy

678

103K

m.l not ML

@mlnotml

20 days ago

Interesting to watch OpenClaw expand fast… and then Hermes Agent started eating its lunch. Agent market share is starting to look less like a leaderboard and more like a food chain. Who’s coming next? 🐾🤖 #AIAgents #OpenRouter #HermesAgent #OpenClaw

104

mlnotml retweeted

Matt Van Horn

@mvanhorn

25 days ago

https://t.co/DM0CAuyprS

214

501

16K

mlnotml retweeted

Google

@Google

29 days ago

Today we’re introducing Gemma 4 12B — our latest open model that brings advanced agentic reasoning, vision and audio directly to your laptop. It delivers performance nearing our larger Gemma models with a much smaller total memory footprint, while being small enough to run locally with just 16GB of VRAM. It’s open and accessible for everyone to use under a permissive Apache 2.0 license. This is all made possible by our new, unified architecture that removes separate multimodal encoders. Here’s how we did it 🧵

248

884K

mlnotml retweeted

Thariq

@trq212

30 days ago

https://t.co/R6exTuF7P8

268

11K

24K

mlnotml retweeted

Nous Research

@NousResearch

about 1 month ago

The next evolution of Hermes Agent is here! Introducing Hermes Desktop: everything you love about Hermes, now native on your machine. First demoed in Jensen's GTC keynote, it's now in public preview.

13K

mlnotml retweeted

Justine Moore

@venturetwins

about 1 month ago

Me using Claude Opus 4.8 to rename a file

76K

10K

44M

mlnotml retweeted

Ethan Mollick

@emollick

about 1 month ago

There is a lot being written about the stylistic tells of AI writing (em-dashes, etc.), but this paper looks at AI narrative tells. Fascinating differences between AI & human narrative, and asking AI to write in different styles doesn't do much to change it https://t.co/azkRHz34NQ

122

588

396K

mlnotml retweeted

Raveesh 折図

@raveeshbhalla

about 1 month ago

@mathurahravi This by @swyx is some of the best writing on the internet https://t.co/s1SFGVENLW

102

220

13K

mlnotml retweeted

Demis Hassabis

@demishassabis

about 2 months ago

I’ve always believed the No.1 application of AI should be to improve human health. That work started with AlphaFold, and now at @IsomorphicLabs with the mission to reimagine drug discovery and one day solve all disease! We are turbocharging that goal with $2.1B in new funding.

726

21K

m.l not ML

@mlnotml

about 2 months ago

My latest essay about neuroplasticity, reinvention, and why we are not trapped inside the first version of ourselves that happened to receive applause. Stories told from my own experience. https://t.co/tYa72L7LkH