Even rn, I think we’re trending toward harnesses that enable agents to self improve its own harness! Coding agents can improve its own agents md file and write scripts (more or less tools) on the fly
Whether a model can effectively create just-in-time harnesses for niche tasks in one shot, idk if we can get there.
But just like how agents have gotten better at tool calling through training, it’d be super interesting to see models that are trained at harness improvement (agent engineering, whatever you wanna call it) - iteratively getting better at building its own harness through past sessions and traces
still true
just-in-time harnesses
bespokely engineered for your exact task
verified with evals
this is a recipe for today - models will alter how we build harnesses in the future, but we need to build today :)
The fallacy of this is that more creates more. More hours, more hiring, more something.
And it is true in a sense. If you put in more work, more work will happen. But I think for most startups, the leverage is really in how differently you approach the problem, how well you cultivate your team, and the strategy.
Any large company can outspend you on hours. They have thousands or tens of thousands more people, spending more hours. If hours worked were the metric, every large company and government organization would always win and do the best work. More hours, better output.
This thinking is often representative of younger founders, where the startup becomes their identity and life. They have a hard time doing anything else, and cannot understand that your work is not the person that is you. But activities outside of work can grow you as a person too and make you do better work.
I’ve never worked this way. As a designer, I always saw the need to take a step back, to take a break. At times, I might work 12 hours or 16 hours, or whatever amount was needed, but it wasn’t the norm. You just can't grind design, you need inspiration. But taking that step away from the work, would give me more perspective, inspiration and I could approach the problem differently or I could just see the solution.
Grinding is never good for any creative problem, and startups or creating new products are often mostly about creative problem solving. Grinding works ok for email jobs, or where you just executing on very clear playbook.
With Linear, we’ve never worked this way. We work reasonable hours, 5 days a week. All of us founders have families. Many of our employees have families. I personally stop every evening, spend time with the family, cook dinner for the family, eat dinner together, and focus on things outside of work. Sometimes I work in the late evenings or weekends, but to me the pride is that I don’t need to. Company should be succesful without it.
My goal is to build a company that is sustainable in the long term, and doesn’t require heroics or personal sacrifices every single day.
There are times when our team is heroic. Launches, incidents, some other work that just needs to be done. They will work late into the night because they know it is the right thing. But we don’t require that every day or every week, and the more this happens, the more I think it is a failure of our company and leadership. The team and the leaders should always keep a reserve to use when something is needed.
Our thinking was also that quality, which we value, doesn’t emerge from working more or stressing people more. It emerges when you create the conditions for it to emerge. Often it is the appreciation, space, time, and how the person feels. A person who is rested will do better work.
I wouldn’t attribute much of our success to working a lot. The success came from having clear thinking, ideas, and focus to do the right things.
I sometimes wish we could move the culture more toward a Zen master.
Real mastery is not exerting the most effort. It is achieving the outcome with the least necessary effort.
Reading this, I was thinking "how is he writing everything i'm thinking!" I've been using ChromaDB as my local vector db, paired with SQLite to build a lot of the use cases you've been working on with SQLite AI to solve.
It's awesome to hear how your team is building all of this with SQLite Extensions! (do I have that right?)
Not sure if I want to refactor my code, but I'll definitely keep in mind for my next project : )
Just learned that they call Utah, silicon slopes. Silicon bay, forrest, slopes, beach.
It’s ok guys, not every place with a tech company needs to be silicon-based
FULL EPISODE
00:05:35 - Intro
00:07:40 - Timeline
00:11:23 - OpenAI V Anthropic
00:16:36 - The Psychosis
00:24:35 - The Signs
00:31:35 - Working day without Psychosis
00:35:31 - AI Psychosis Detector
00:42:12 - Real Questions
00:44:03 - Advice
00:52:21 - Is AI really a multiplier
In practice, I’ve been using agents to search for trends from meeting transcripts. I’ve been playing with vector searching with connections, and brute force crawling to build agents context
For vector searching, I try to find “starting points” and then use preprocessed tags so an agent can explore other documents with the same tags (I think tags are a bit more flexible than tree based structures like fs)
For certain tasks, I will use subagents to brute force read every document in a given set of documents and build a working memory for a main agent, but this is only workable when working with a smaller set of documents.
I’ve definitely been thinking about connecting session context for automated tasks, especially cron-based tasks. But I think for non-cadenced tasks, I’ve personally been happier not connecting session context for the reasons I mentioned on “rabbit holes”
I’m sure there are plenty of use cases that are useful for shared memory across tasks tho! What type of tasks are you working on solving?
I haven't explored having a personal AI memory, but I do using agents to search my notes and transcripts. For this use case, I still prefer having memory tied to session/task. Every new task for an agent gets its own session. And every session gets its own memory.
I really like you're pre-processing ideas you have in you blog! so I've done some task-agnostic processing per note/document. But I do prefer an agent's memory to be "cleared" for every task.
There's less risk that "rabbit-holes" an agent goes down will pollute future tasks this way
@NewAgeRetroNerd What’s interesting is the difference between “understanding” and “mastering” the fundamentals. The latter can only be done with intentional practice. ai has empowered generalists who can now learn and “understand” a broader surface, but idk if it’s helped as much in mastery
@BenjDicken This may be oldschool, but I use tmux with neovim and opencode. tmux lets me split panes and tabs with new sessions. And then I'll just use git diff and bash to navigate my files and directories