7 days, even at 12 hr/day, is much easier than it was a few years ago.
Not only do I *want* to do that now, I actually *can*.
Working with agents feels much easier cognitively. It’s a series of short mental sprints with waiting/rest periods in between instead of constant focus for hours. Using voice input helps a lot with that too.
More AX ergonomics: Agents must create a `.spec.ts.md` sidecar for each test.
They’re even closer to pure intent, easier for me to skim, and much less noisy for agents than the full `.spec.ts`.
Those also make the auto-feedback loop very easy to discover:
code change -> affected tests -> rerun those tests
The actual `.spec.ts` can then be generated from the `.spec.ts.md` via a skill pretty reliably.
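A rough sketch of the "code change -> affected tests -> rerun" loop, not the actual setup. It assumes a hypothetical `Coverage` map from each sidecar to the source files its test exercises; the map and the names `affectedSpecs` and `specFileFor` are made up for illustration.

```typescript
// Hypothetical shape: sidecar path -> source files its test exercises.
type Coverage = Record<string, string[]>;

// Return the sidecars whose tests touch at least one changed file.
function affectedSpecs(coverage: Coverage, changed: string[]): string[] {
  const changedSet = new Set(changed);
  return Object.keys(coverage)
    .filter((spec) => (coverage[spec] ?? []).some((f) => changedSet.has(f)))
    .sort();
}

// Map a sidecar back to its generated test file by dropping the trailing ".md".
function specFileFor(sidecar: string): string {
  return sidecar.replace(/\.md$/, "");
}
```

With a map like `{ "checkout.spec.ts.md": ["src/cart.ts"] }`, a change to `src/cart.ts` selects `checkout.spec.ts` for the rerun. How the map is built (by hand, from the sidecar text, or from coverage data) is left open here.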
Taking this to the extreme: one seam, and it’s the UI.
I’ve moved almost entirely to full E2E tests: user-perspective only, against a real pre-filled test DB that gets restored between each test. No mocks.
Yes, they’re heavy and slow. But the verbosity doubles as specs, and agents have a much easier time picking up intent vs implementation.
I’d love a real benchmark for the “dumb zone” @mattpocockuk talks about.
Not needle/haystack retrieval. More like: give the model real tasks while progressively filling context with semi-relevant prior conversation.
Where does problem-solving quality actually start dropping, and how steep is the curve?
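One way such a benchmark could be framed, as a minimal sketch: sweep over filler sizes, average the task score at each level, and report the first level where quality falls clearly below the no-filler baseline. The `Solver` callback is a stand-in for a real model call (which would be async), and the scoring scale and `tolerance` default are assumptions.

```typescript
// One measurement: filler size in context and the average task score (0..1).
interface Sample {
  fillerTokens: number;
  score: number;
}

// Hypothetical solver: runs a task with `fillerTokens` of semi-relevant
// prior conversation in context and returns a quality score in [0, 1].
type Solver = (task: string, fillerTokens: number) => number;

// Average score per filler level across all tasks.
function sweep(tasks: string[], levels: number[], solve: Solver): Sample[] {
  return levels.map((fillerTokens) => {
    const total = tasks.reduce((sum, t) => sum + solve(t, fillerTokens), 0);
    return { fillerTokens, score: tasks.length ? total / tasks.length : 0 };
  });
}

// First level whose score is more than `tolerance` below the first sample.
function dropOff(samples: Sample[], tolerance = 0.1): number | null {
  const baseline = samples[0];
  if (!baseline) return null;
  const hit = samples.find((s) => baseline.score - s.score > tolerance);
  return hit ? hit.fillerTokens : null;
}
```

The steepness question is then just the slope between adjacent samples around the level `dropOff` returns.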
@Dimillian In my experience, auto-running linters, IDE indexing, and tests are the biggest resource hogs.
The agent’s actual code modifications shouldn’t be enough to saturate the SSD or CPU on their own, right?
@benjamincowen Actually, that would be a very interesting tool:
Redraw the price history and see how the MAs / BMSB would respond.
What if we had an Oct 2025 blow-off top to $170k and then continued from there?
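A toy version of that tool, assuming nothing beyond a plain price array: overwrite one point with a hypothetical top and recompute a simple moving average. Only an SMA is shown; the other indicators, the window, and all prices are placeholders rather than real data.

```typescript
// Simple moving average; entries before a full window are null.
function sma(prices: number[], window: number): (number | null)[] {
  if (window < 1) throw new Error("window must be >= 1");
  let sum = 0;
  return prices.map((p, i) => {
    sum += p;
    // Drop the value that just slid out of the window.
    if (i >= window) sum -= prices[i - window] ?? 0;
    return i >= window - 1 ? sum / window : null;
  });
}

// Copy of the series with one point overwritten, e.g. a hypothetical top.
function redraw(prices: number[], index: number, price: number): number[] {
  return prices.map((p, i) => (i === index ? price : p));
}
```

Comparing `sma(prices, n)` against `sma(redraw(prices, i, top), n)` shows how far a single redrawn point drags the average, and for how many periods.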
@benjamincowen It’s already been very different: we never had a blow-off top.
All of your MAs would look dramatically different now if we had. Given how atypical the "top" was, it feels like a pretty shaky basis for assuming the bottom will match prior cycles.
Being new on Twitter is wild.
Days of talking to an empty room with zero reaction, then suddenly hundreds of people are actively agreeing with you.
And this tweet? Probably straight back to the empty room.
https://t.co/dnptFBEmaO
@TrentKelly2472 @thsottiaux The main point here is to use the existing batch API: they can process the work whenever spare capacity is available off-peak, and in return you get lower usage. /goal does not do this. An /overnight + /goal combination would be sweet, though.
Is there already a good tool for composing AGENTS.md / CLAUDE.md from multiple sources?
Thinking: shared team rules + personal prefs + repo-specific rules + tool-specific overrides, then generate/sync clean AGENTS.md, CLAUDE.md etc.
Any recommendations?
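To make the idea concrete, a minimal sketch of the composing step, not an existing tool: each source is a named block of rules, and the output is their concatenation in the order given. The `RuleSource` shape and `composeRules` name are hypothetical, and real override semantics (later layers replacing earlier rules) are not handled.

```typescript
// One input layer, e.g. team rules or personal prefs (hypothetical shape).
interface RuleSource {
  name: string;
  body: string;
}

// Concatenate non-empty sources in the order given, each under its own heading.
function composeRules(sources: RuleSource[]): string {
  return sources
    .filter((s) => s.body.trim().length > 0)
    .map((s) => `## ${s.name}\n\n${s.body.trim()}\n`)
    .join("\n");
}
```

The same composed string could then be written to both AGENTS.md and CLAUDE.md, with tool-specific sources appended only for the file they apply to.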