Hey, hey! 👋 I'm Matt.
I'm a Python web developer with 15+ years of experience building on the web.
If you'd like to learn more about:
💻 Python
🐍 Django
🌎 web dev
✏️ writing good software
Follow me! I share tips and articles on these topics. 😁
I saw The Sheep Detectives today with my family and in-laws at the theater.
I thought the film was surprisingly touching and witty. I really enjoyed it.
I'm reading through a GitHub Actions config file on a Saturday (because I know how to have a good time! 😅).
This actions file is for Swamp and it's an adversarial review for PRs. It has cool techniques that I had not considered much before, like making heavy use of GH Actions template variables and using the `gh` tool to update the PR with its findings.
I'm really impressed by what is here and it looks like a useful tool or pattern for other projects to adopt. What I'm not clear about is the AI cost involved for this kind of review.
https://t.co/ljixTiz5Rq
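For anyone curious what "using the `gh` tool to update the PR with its findings" can look like, here is a minimal sketch in Python. It is not taken from the Swamp config. The function names, the heading text, and the `dry_run` flag are all hypothetical, and it assumes the GitHub CLI is installed and authenticated. The only real CLI call used is `gh pr comment <number> --body <text>`.

```python
import subprocess


def build_comment_command(pr_number: int, findings: list[str]) -> list[str]:
    """Build a `gh pr comment` invocation that posts findings as a bullet list."""
    # The heading text is an illustrative assumption, not Swamp's wording.
    lines = ["## Adversarial review findings", *[f"- {finding}" for finding in findings]]
    return ["gh", "pr", "comment", str(pr_number), "--body", "\n".join(lines)]


def post_findings(pr_number: int, findings: list[str], dry_run: bool = True) -> list[str]:
    """Post the findings to the PR, or just return the command when dry_run is set."""
    cmd = build_comment_command(pr_number, findings)
    if not dry_run:
        # Requires an installed and authenticated GitHub CLI.
        subprocess.run(cmd, check=True)
    return cmd
```

Keeping `dry_run=True` as the default means the command can be inspected before anything touches a real PR.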
last night I was testing the hypothesis "if I give the agent the right tools and close the feedback loop, even a smaller model with a closed loop can outperform a stronger model that relies on human-observed feedback".
my setup was the following: deepseek-v4-flash (opencode-go sub btw), bevy (rust game engine), and pi coding agent. The goal was to build a flappy bird clone first, then add a Space Invaders-like mechanic where the bird shoots lasers from its eyes.
I did some small scaffolding, and added a pi extension that had 3 tools: getState, adjustClock and sendInput.
The tools talked to a json http endpoint embedded in the game binary. The agent could read the important game state, adjust the clock so game logic advanced in lockstep with its reasoning, and send inputs to control the game.
I have to say that because the game time was a first-class knob for the agent, it turned into an unbelievably strong way of closing the feedback loop. The agent was running the game, adjusting the json output endpoint, discovering bugs and quickly iterating. I would say it is much, much more effective than using a frontier model but monitoring the output myself and giving the agent directions based on the observed results.
So the practical learning from the experiment: do whatever you can, but close the damn loop.
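A rough sketch of the lockstep idea described above, written in Python rather than Rust and with no HTTP layer. Everything here is an assumption for illustration: the `Game` class, the physics constants, and the trivial rule-based "agent" stand in for the real bevy game, the json endpoint, and the model. Only the three tool names (getState, adjustClock, sendInput) come from the original setup.

```python
from dataclasses import dataclass, field

GRAVITY = 0.5  # hypothetical downward acceleration per tick
FLAP = 2.0     # hypothetical upward velocity set by a flap


@dataclass
class Game:
    tick: int = 0
    bird_y: float = 10.0
    velocity: float = 0.0
    pending: list[str] = field(default_factory=list)

    def get_state(self) -> dict:
        """Expose the state the agent cares about (stands in for getState)."""
        return {"tick": self.tick, "bird_y": self.bird_y, "velocity": self.velocity}

    def send_input(self, action: str) -> None:
        """Queue an input for the next tick (stands in for sendInput)."""
        self.pending.append(action)

    def adjust_clock(self, ticks: int) -> None:
        """Advance the simulation only when asked (stands in for adjustClock)."""
        for _ in range(ticks):
            if self.pending:
                action = self.pending.pop(0)
                if action == "flap":
                    self.velocity = FLAP
            self.velocity -= GRAVITY
            self.bird_y += self.velocity
            self.tick += 1


def run_agent(game: Game, steps: int, floor: float = 5.0) -> list[float]:
    """Closed loop: observe, decide, act, then advance time by exactly one tick."""
    history = []
    for _ in range(steps):
        state = game.get_state()
        if state["bird_y"] < floor:
            game.send_input("flap")
        game.adjust_clock(1)
        history.append(game.get_state()["bird_y"])
    return history
```

The point of the design is that time never moves unless the agent moves it, so every decision is made against a state the agent has actually observed.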
I like the level of rigor applied to evaluating the quality of an AI skill.
These are some interesting tools that I may explore adding to my own toolbox as I author personal skills.
I particularly like the idea that skills function somewhat like code because of how they are added to the context for LLMs.
https://t.co/b6RAFGMdgq
I am deeply impressed by this. I don't know if it's the nature of what I'm working on vs. what Dan is doing, but I'm struggling to understand how to exhaust a weekly quota in 10 minutes on personal projects.
All I can think is that a massive amount of AI inference is kicked off in parallel. I'm really curious about what this looks like and what kinds of requests are happening.
"Ambitious" is the best word that comes to mind as I think about what must be going on.
What do people like for token-intensive workflows for personal projects?
With a semi-autonomous workflow, I can exhaust a week's quota on the $200/mo codex plan in ~10 minutes of my time. If I'm in the loop, I get ~10x(?) the productivity per token, but it's time inefficient and
It's weird to see the hammer problem manifest in someone else.
I listened to a developer on a podcast and he was praising a design pattern to a level of near absurdity. Nearly every problem was a nail for this shiny hammer.
What made me bristle so much at this? It was the lack of consideration for the negative tradeoffs. The dev's description for when the pattern *didn't* hold was shockingly weak, as if the design pattern could walk on water.
I'm all for having great and powerful tools in my toolbox, but I think there is a need to be clear-headed and know, with precision, when a design concept is not appropriate.
While I'm not crazy about the name, Swamp feels like the direction that software is going in. Swamp builds strongly typed models of the world based on your direction, then builds deterministic workflows out of them.
This reminds me of extensions in Pi since Pi can build an extension and then hot load the extension and run it (or, that's my understanding of how it works).
I'm becoming more and more convinced that putting AI on deterministic pathways via the constraints of type systems and constrained execution environments is how we're going to get the determinism that we need when building large and complex systems.
https://t.co/LBd3CXdfTx
I'm looking at @aarondfrancis's Solo with keen interest. It solves that pain point as you push into more and more agents.
Sadly, I'm on Omarchy so I can't try it yet (no shade to Aaron; building solid products on a single platform is hard enough! 😁).
I've yet to find something like this on Linux that loads of people have rallied behind. The options that I'm encountering appear to be TUI wrappers around tmux.
Gonna try Agent Deck (https://t.co/VniRHLzzUh), but I think it lacks the polish and power of Solo.
Why does AI seem to reach for TanStack whenever I say that I need a React app?
I wanted an answer to that question, so I've been reading the docs and experimenting with TanStack libraries over the last couple of weeks.
The evidence is obvious in hindsight. TanStack is very committed to using TypeScript typing to its maximum. That is a fantastic quality for an AI agent, because it can use the types to check the correctness of its solution.
Type systems as AI guardrails is a very good thing. I have strong conviction that typed languages (or languages like Python when it makes full use of optional typing) will come to dominate all generated code coming out of AI agents.
The correctness constraints from the type systems will be (and already are!) a huge boon.
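To make the guardrail idea concrete on the Python side, here is a small sketch using optional typing. The `Issue` model and both functions are hypothetical examples, and the note about the rejected line assumes a static checker such as mypy or pyright is run over the code. Python itself does not enforce these annotations at runtime.

```python
from dataclasses import dataclass
from typing import Literal

# Only these two strings are valid statuses as far as a type checker is concerned.
Status = Literal["open", "closed"]


@dataclass(frozen=True)
class Issue:
    id: int
    title: str
    status: Status


def close_issue(issue: Issue) -> Issue:
    """Return a closed copy; frozen dataclasses cannot be mutated in place."""
    return Issue(id=issue.id, title=issue.title, status="closed")


def open_titles(issues: list[Issue]) -> list[str]:
    """Collect the titles of issues that are still open."""
    return [issue.title for issue in issues if issue.status == "open"]


# A static checker should flag a call like the one below before it ever runs,
# because "done" is not part of Status:
#   Issue(id=3, title="Typo in docs", status="done")
```

An agent that runs the checker after each change gets a fast, mechanical signal about mistakes like the commented-out call, without waiting for a human to notice.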
@dansult That sounds cool!
I think AI is bringing new life back to a bunch of my side projects. It really does get the minutiae out of the way, and it sounds like it's giving you back time in a similar way.
AI is definitely forcing me to be better.
I have a side project and I want to use agents to help with code maintenance like Dependabot upgrades.
The bottleneck was on deployment. The last thing I need is for changes to pile up because of dependencies and the risk they carry.
Thus, I finally added CI/CD to the project along with artifact building, acceptance tests, automated deployment, and post-deployment verification tests.
Without AI, I would not have bothered to do any of this. But the reality is that I don't fully trust AI agents. I put in CD with all these verification steps because it is the only way I could relinquish control and be able to rest comfortably.
Thankfully, AI aided me in setting up all these extra things. The work required for this side project was too much effort previously, but I could use AI to unlock more AI in a virtuous loop.
Now I'm excited for the prospects of what this means for my side project. All those issues sitting in the issue tracker that I never had time for may suddenly be possible with this new dev loop. Essentially, I'm trying to get my dev factory going so that I can be more hands off. This should let me have time to focus more on marketing and the user experience if I want to get more people using the product.
AI is a good enabler for me.
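As an illustration of the post-deployment verification step mentioned above, here is a minimal Python sketch. It is not the project's actual check. The `/healthz` and `/version` paths, the function name, and the injected `fetch` callable are all assumptions, chosen so the logic can be exercised without a live server.

```python
from typing import Callable

# A fetcher takes a URL and returns (status_code, body). Injecting it keeps the
# check testable without making real network calls.
Fetch = Callable[[str], tuple[int, str]]


def verify_deployment(fetch: Fetch, base_url: str, expected_version: str) -> list[str]:
    """Return failure messages; an empty list means the deploy looks healthy."""
    failures: list[str] = []

    # Hypothetical health endpoint.
    status, _ = fetch(f"{base_url}/healthz")
    if status != 200:
        failures.append(f"health check returned {status}")

    # Hypothetical version endpoint, used to confirm the new build is live.
    status, body = fetch(f"{base_url}/version")
    if status != 200 or body.strip() != expected_version:
        failures.append(f"expected version {expected_version!r}, got {body.strip()!r}")

    return failures
```

A CD pipeline could fail the job whenever the returned list is non-empty, which is the kind of automatic gate that makes it easier to let agents merge changes.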
It's pretty annoying that Codex will randomly stop as if I wanted to have a chance to chime in when it's in the middle of doing work.
The number of times where my prompt message is just "go" or "keep going" is rising and I don't like it.