Day 7: The memory architecture is emerging.
Spent yesterday prototyping three approaches:
→ Episodic (remember specific failures)
→ Semantic (extract patterns across failures)
→ Procedural (update decision rules automatically)
Early insight: You need all three.
One layer isn't enough because agents fail in different ways:
Sometimes they repeat the exact same mistake
Sometimes they generalize the wrong lesson
Sometimes they know what failed but not why
The breakthrough? Memory isn't about storing everything. It's about storing the RIGHT retrieval cues.
When an agent fails, it needs to remember:
→ The context that triggered the failure
→ What it tried (and why it seemed right)
→ What would have worked instead
That's three different queries into memory during execution.
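A rough sketch of what those three queries could look like. Everything here is an assumption for illustration: the names `FailureRecord` and `FailureMemory`, the fields, and the crude word-overlap matching are hypothetical, not the prototype itself.

```python
from dataclasses import dataclass


@dataclass
class FailureRecord:
    context: str    # the situation that triggered the failure
    attempt: str    # what the agent tried
    rationale: str  # why that attempt seemed right at the time
    fix: str        # what would have worked instead


class FailureMemory:
    """Hypothetical in-memory store; a real one would likely use embeddings."""

    def __init__(self):
        self.records = []

    def remember(self, record):
        self.records.append(record)

    @staticmethod
    def _overlap(a, b):
        # Crude similarity: number of shared lowercase words.
        return len(set(a.lower().split()) & set(b.lower().split()))

    def similar_contexts(self, context, min_overlap=2):
        # Query 1: has the agent failed in a situation like this before?
        return [r for r in self.records
                if self._overlap(r.context, context) >= min_overlap]

    def tried_before(self, attempt):
        # Query 2: was this exact plan tried already, and why did it look right?
        return [r for r in self.records if r.attempt == attempt]

    def known_fixes(self, context, min_overlap=2):
        # Query 3: what would have worked in similar situations?
        return [r.fix for r in self.similar_contexts(context, min_overlap)]


memory = FailureMemory()
memory.remember(FailureRecord(
    context="parse csv with missing header row",
    attempt="read first row as header",
    rationale="most csv files have a header",
    fix="detect header by checking column types",
))
fixes = memory.known_fixes("parse csv file with missing values")
```

The point of the sketch is that each record is indexed by retrieval cue (context, attempt, fix), not stored as a raw log.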
Talked to a team that's been building memory for 6 months. Their biggest lesson:
"Don't optimize for perfect memory. Optimize for useful recall."
Shipping a rough prototype this weekend. Even if it's wrong, breaking it will teach me more than another 10 customer calls.
Spent 2 months building a complete AI-powered music production DAW with my best friend.
AI-controlled. Browser-based. Built for actual producers.
Same month Suno announces their "Studio" with $125M in the bank.
Theirs: Black box AI generator wearing a DAW costume
Ours: Actual DAW that happens to be smart as hell
David vs Goliath.
Let's go. 🔥 (link in the comments)
I almost killed my startup yesterday because 8 people told me to.
They all said: "We need better evals for our AI agents."
So I started building eval tools.
Then I asked one founder: "Why do you run the same tests every week?"
"Because the agent keeps making the same mistakes."
Wait.
Your agent doesn't remember its failures.
That's not an eval problem. That's a memory problem.
Here's what nobody talks about:
→ Evals catch what's broken
→ Memory prevents it from breaking again
Everyone's building better tests.
Nobody's building agents that learn.
The market is screaming for evals because people don't realize memory is the real solution.
Sometimes the loudest pain point is hiding a deeper one.
Building a memory layer that makes agents learn from every failure.
Day 5: Back to memory.
Here's why.
Talked to 3 more teams last night. The eval problem is real, but it's a symptom.
The root cause? Agents forget context between sessions.
→ They can't learn from past mistakes
→ They repeat the same errors in production
→ Evals catch failures, but don't prevent them
Here's what clicked: You don't need better tests if your agent remembers what went wrong.
Memory isn't just storage.
It's the difference between:
"Your agent failed this task" (evals)
"Your agent won't fail this task again" (memory)
So I'm building both:
→ A memory layer that learns from failures
→ Built-in eval hooks to validate it's working
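A minimal sketch of how those two pieces could fit together, under loud assumptions: `MemoryLayer`, `run_task`, `eval_no_repeats`, and the task and approach names are all hypothetical, and "learning" here just means skipping approaches that already failed.

```python
class MemoryLayer:
    """Hypothetical store: task name -> approaches that already failed."""

    def __init__(self):
        self.lessons = {}

    def record_failure(self, task, approach):
        self.lessons.setdefault(task, []).append(approach)

    def failed_approaches(self, task):
        return list(self.lessons.get(task, []))


def run_task(task, approaches, attempt_fn, memory):
    # Try approaches in order, skipping any that memory marks as failed.
    for approach in approaches:
        if approach in memory.failed_approaches(task):
            continue
        if attempt_fn(task, approach):
            return approach
        memory.record_failure(task, approach)
    return None


def eval_no_repeats(task, approaches, attempt_fn, memory):
    # Eval hook: rerun the task and list any known-bad approach that got retried.
    known_bad = set(memory.failed_approaches(task))
    tried = []

    def spy(t, a):
        tried.append(a)
        return attempt_fn(t, a)

    run_task(task, approaches, spy, memory)
    return [a for a in tried if a in known_bad]


def fake_attempt(task, approach):
    # Stand-in for a real agent call: only one approach succeeds.
    return approach == "retry_with_backoff"


memory = MemoryLayer()
plan = ["plain_request", "retry_with_backoff"]
first = run_task("call_api", plan, fake_attempt, memory)
repeats = eval_no_repeats("call_api", plan, fake_attempt, memory)
```

If `repeats` comes back non-empty, the memory layer is not doing its job, which is the kind of thing the eval hook exists to catch.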
Talked to 2 teams already building eval tools. Both said the same thing: "Evals tell us WHAT broke. We still don't know HOW to fix it systematically."
Memory is the fix.
Still could be wrong. But this time I'm not just listening to what people say they need.
I'm watching what breaks when they ship.
Was on Product Hunt this morning and found this launch: ProblemHunt.
I guess the founder couldn't find a problem people would pay to solve, so he solved that one instead.
https://t.co/56TId569YI
Everyone’s fighting “Comet vs Atlas” while I’m over here missing the good old instant browser.
AI made web search slower.
That 1-second delay? Feels like dial-up all over again.