Rare to hear a founder of an AI coding harness be this honest. A reason I respect @thdxr
"We’re shipping way more hacks where we should have just rethought the whole system from the ground up, or redesigned it to make it more flexible.
So I think our judgement is just off."
Anthropic engineer:
"You're not supposed to prompt Claude. You're supposed to build a system that prompts itself."
this is one of the best workflows I've seen in a long time
in this video she breaks down exactly how most people are using Claude:
- the 14% you lose to CLAUDE.md before typing a word
- the automation workflows most users don't know exist
- the daily task pipelines that run without touching the keyboard
- the daily workflows Anthropic's own engineers automated first
if you've been using Claude for more than a month and never left the chat window, you've been using one agent when you could be running a team of them
instead of another show tonight, watch this
make sure to bookmark it before it gets lost in your feed
the guide is in the article below
Anthropic engineer:
"You can build 5 assistants in one afternoon. Each one handles a task you've been doing manually every single day."
In 45 minutes he builds 5 focused agents from scratch on camera.
Most people are still doing code review, testing, and documentation by hand every single day
Watch the session, then save all templates below 👇
introducing vyupgrade
it automatically rewrites old vyper contracts into modern vyper, then proves the rewrite is safe by checking it compiles under both source and target versions, diffing the abi, method ids and storage layout.
it supports all syntax changes from 0.2.1 to 0.4.3, as well as picking up dependencies and modules like snekmate.
some rewrites are not inherently safe and need your judgement. the tool flags those and shows you exactly what to look at.
$ uvx vyupgrade contracts/
https://t.co/c2seMgoxh0
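To make the "diffing the abi, method ids and storage layout" idea concrete, here is a minimal Python sketch. It is not vyupgrade's code. The function names, the sample data, and the layout shape (variable name mapped to `slot` and `type`) are assumptions for illustration. A method id is the first four bytes of the Keccak-256 hash of the canonical signature, so the sketch compares signature strings: identical signatures imply identical method ids, and this avoids needing a Keccak library. Tuple arguments are not expanded, which real tooling would have to handle.

```python
from typing import Any


def signature(entry: dict[str, Any]) -> str:
    """Build a signature string such as 'transfer(address,uint256)'."""
    args = ",".join(arg["type"] for arg in entry.get("inputs", []))
    return f"{entry['name']}({args})"


def abi_signatures(abi: list[dict[str, Any]]) -> set[str]:
    """Collect function and event signatures from a JSON ABI list."""
    return {signature(e) for e in abi if e["type"] in ("function", "event")}


def diff_abi(old: list[dict[str, Any]], new: list[dict[str, Any]]) -> dict[str, list[str]]:
    """Report signatures that disappeared or appeared after the rewrite."""
    old_sigs, new_sigs = abi_signatures(old), abi_signatures(new)
    return {"removed": sorted(old_sigs - new_sigs), "added": sorted(new_sigs - old_sigs)}


def diff_layout(old: dict[str, dict], new: dict[str, dict]) -> list[str]:
    """Return variable names whose slot or type changed, or that exist on one side only."""
    changed = []
    for name in sorted(old.keys() | new.keys()):
        a, b = old.get(name), new.get(name)
        if a is None or b is None or (a["slot"], a["type"]) != (b["slot"], b["type"]):
            changed.append(name)
    return changed


# Hypothetical compiler outputs for the same contract before and after a rewrite.
old_abi = [{"type": "function", "name": "transfer",
            "inputs": [{"name": "_to", "type": "address"},
                       {"name": "_value", "type": "uint256"}]}]
new_abi = [{"type": "function", "name": "transfer",
            "inputs": [{"name": "to", "type": "address"},
                       {"name": "value", "type": "uint256"}]}]
old_layout = {"owner": {"slot": 0, "type": "address"},
              "total": {"slot": 1, "type": "uint256"}}
new_layout = {"owner": {"slot": 1, "type": "address"},
              "total": {"slot": 0, "type": "uint256"}}

abi_report = diff_abi(old_abi, new_abi)
layout_report = diff_layout(old_layout, new_layout)
```

In this made-up example the ABI is unchanged (only argument names differ, and those do not affect the signature), while both storage variables are reported because their slots were swapped. That second kind of change is the sort of thing a tool would need to surface for human review.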
We used these lessons to enforce a much tighter review process for our own tasks.
Dive deeper in our blog: https://t.co/ogKsyB31FQ
Check out the dataset on GitHub: https://t.co/SXVUL3kwBd
Today we’re releasing DeepSWE, a new standard for agentic coding benchmarks.
On public leaderboards, top models often look relatively close in capability. DeepSWE shows where they actually diverge, reflecting the realistic experience of developers in their day-to-day work.
Introducing: Browser Use Terminal. A Rust harness + TUI that gets real work done in the browser.🦉
Browser Harness gave LLMs freedom in Chrome.
We built a full LLM harness around it - in Rust.
> Direct CDP — raw browser control
> Real Chrome — use your logged-in browser
> Rust TUI — watch, steer, stop, resume
> 2x cheaper, 2x faster than Browser Harness
> GPT, Sonnet/Opus, Kimi, GLM, Qwen, DeepSeek
I’ve used it for SF parking permits and Azure admin permissions. It asked me to log in, then finished end-to-end. What are you going to use it for?
Automate the boring stuff in the browser. 🔥
100% open source ↓
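For readers wondering what "Direct CDP" means on the wire: Chrome DevTools Protocol commands are JSON objects with an `id`, a `method`, and `params`, sent over a WebSocket to the browser. The sketch below only builds those messages. It is not Browser Use Terminal's code (which is written in Rust), and the helper name `cdp_command` is made up for illustration. `Page.navigate` and `Runtime.evaluate` are real CDP methods.

```python
import itertools
import json

# Each command needs a unique id so responses can be matched to requests.
_ids = itertools.count(1)


def cdp_command(method: str, **params) -> str:
    """Serialize one CDP command as the JSON text sent over the WebSocket."""
    return json.dumps({"id": next(_ids), "method": method, "params": params})


# Navigate the current page, then read its title via JavaScript evaluation.
navigate = cdp_command("Page.navigate", url="https://example.com")
read_title = cdp_command(
    "Runtime.evaluate", expression="document.title", returnByValue=True
)
```

A real harness would open the WebSocket URL that Chrome exposes when started with remote debugging enabled, send these strings, and match replies by `id`.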
Shopify's Head of Engineering: "If you don't figure out how to harness agents in 2026, you'll be behind."
This interview is the most practical breakdown of enterprise AI coding I've seen this year.
Farhan Thawar explained the full Shopify AI playbook here.
Watch the interview, then grab the exact template below 👇
Singapore’s Foreign Minister, Dr Balakrishnan, casually explaining how he built his own AI agent (a 2nd brain for diplomacy) using Claude, WhatsApp integration, etc. on a Raspberry Pi
“You cannot govern a technology you have only been briefed on.” 🇸🇬
This is a worthwhile read from Meta engineer @championswimmer (who I met last time I was in London - great guy)
His point is that a lot of these “AI layoffs” could well be backwards: they are probably happening because more AI spend doesn’t correlate with better business results…
There's more detail and honesty in this note on layoffs than in 10 other ones combined, whether you like it or not. A few:
- Crypto is in a down market, as is Coinbase's business right now
- Fear of doing nothing is losing
- A big gamble in a structure that has not really been tested
Two Anthropic engineers spent 24 minutes exposing every Claude Code feature you didn't know existed.
Most people will scroll past this. Don't be most people.
Imagine every pixel on your screen, streamed live directly from a model. No HTML, no layout engine, no code. Just exactly what you want to see.
@eddiejiao_obj, @drewocarr and I built a prototype to see how this could actually work, and set out to make it real. We're calling it Flipbook. (1/5)
I know a number of British people who lived in Italy for 1 to 2 years and then came back. The constant is that they have young children. Whatever they tell you at first, if you ask them about the Italian school system, they will eventually admit that it was, if not the main reason for returning, one of the critical ones.
Italian primary school is much harder than the British one. An awful lot of Italian parents cope with that by literally abandoning their children to their own devices. Most take a more proactive stance, so they either start tutoring their children themselves (a couple of hours a day per child starting in year 1) or pay for tutors to do it in their stead. In primary school, British kids have homework once per week. Italian kids have homework once per day, doubled over the weekend. If you visit Italian homes in the afternoon and they have children, it is pretty standard to see the kids sitting at the main table with books and notebooks spread all around, with a parent or a tutor sitting with them for the whole session.
Also, the number of books they have to carry to school every day is borderline unbelievable. You would think they are training them to carry legionary backpacks.
For people accustomed to the gentle British primary schooling, the Italian system feels borderline insane. Note also that it has massively eased up: in my childhood, we had to memorise a long poem every weekend (which back then meant Sunday, as Saturday was a school day).
h/t @GroovySciFi