We talk a lot about how important it is to set up self-verification loops. Especially in the age of powerful models that can run for long periods of time, self-verification is a key ingredient that enables the model to run for much longer, delivering a result that is closer to what you intended, so you can do more without having to constantly check in on Claude as it works.
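For anyone who wants the shape of a self-verification loop in code, here is a minimal sketch in Python. It is not how Claude Code implements anything; `run_with_verification`, `toy_attempt`, and `toy_verify` are hypothetical names, and the "model" is a toy stand-in that corrects itself once it is told what failed.

```python
from typing import Callable, Optional


def run_with_verification(
    attempt: Callable[[Optional[str]], str],
    verify: Callable[[str], Optional[str]],
    max_rounds: int = 5,
) -> str:
    """Call attempt() until verify() finds no problem or the rounds run out.

    attempt receives the previous failure message (None on the first try).
    verify returns None when the result passes, else a description of the failure.
    """
    feedback: Optional[str] = None
    for _ in range(max_rounds):
        result = attempt(feedback)
        feedback = verify(result)
        if feedback is None:
            return result
    raise RuntimeError(f"still failing after {max_rounds} rounds: {feedback}")


# Toy stand-ins (hypothetical): the "model" fixes its answer after feedback.
def toy_attempt(feedback: Optional[str]) -> str:
    return "2 + 2 = 5" if feedback is None else "2 + 2 = 4"


def toy_verify(result: str) -> Optional[str]:
    left, right = result.split("=")
    total = sum(int(term) for term in left.split("+"))
    return None if total == int(right) else f"{left.strip()} is not {right.strip()}"
```

The point of the design is that the check is a function the loop can call on its own, so the run keeps going without a human reading each intermediate result.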
@delba_oliveira gives a great breakdown of what that looks like and why it matters.
At Anthropic's event, a Metaview engineer:
"We stopped fixing our prompts. The system reviews its own output and rewrites its own instructions now."
In 16 minutes, he shows the Claude Code loop running in production on thousands of reviews, not in a demo.
Watch the talk, then grab the full loop setup below👇
The dark web has massive marketplaces like Amazon. Ratings, reviews, product listings — except the products are stolen identities and login details. Yours might be listed right now.
Serus scans dark web marketplaces for your data and alerts you if it surfaces.
Explore for free.
Steps to become a senior programmer:
1. Install my /teach skill
npx skills add mattpocock/skills --skill teach
2. Create a new working directory on your laptop
mkdir junior-to-senior
cd junior-to-senior
3. Kick off your coding agent in the directory
claude
4. Copy this prompt
/teach me how to be a great strategic programmer. My opinion is that AI is eating 'tactical, on-the-ground' programming. The day-to-day work of a developer involves not only coding, but also planning, QA, codebase design, and much more. I'm interested in learning the strategic skills - that, in a previous era, would take me from junior to senior - but in this era are table stakes.
5. Paste it into the coding agent
Below is an example of what the first output will look like. I used Opus 4.8, medium effort.
6. Continue working with the agent until you're a senior
I've read every Obsidian guide on the internet.
Most of them cover one piece. Folder structure. Daily notes.
A specific plugin.
Nobody had written the complete guide that takes you from zero to a vault that actually runs your life.
So I wrote it.
Philosophy, architecture, linking system, Claude integration, five automated workflows, the daily practice, and what month six actually looks like.
Everything in one place.
Read this and bookmark it now.
4 MAC MINIS RUNNING A 671B PARAMETER MODEL AS A CLUSTER
No data center, no cloud, no expensive hardware and not a single API call.
Just 4 Mac minis connected through EXO running DeepSeek v3.1 671b locally and actually fast.
The part nobody talks about is that you don’t need one monster machine. You can cluster old computers you already have and split the load between them.
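To make "split the load" concrete, here is a rough sketch of one way to do it: give each machine a contiguous slice of the model's layers in proportion to its memory. This is a generic memory-weighted split written for illustration, not EXO's actual code. `split_layers` is a hypothetical helper and the layer count and memory sizes in the usage note are made up.

```python
from typing import List


def split_layers(num_layers: int, memory_gb: List[float]) -> List[range]:
    """Give each machine a contiguous slice of layers sized by its share of memory."""
    total = sum(memory_gb)
    slices: List[range] = []
    start, used = 0, 0.0
    for i, mem in enumerate(memory_gb):
        used += mem
        # The last machine always ends at num_layers so rounding never drops a layer.
        is_last = i == len(memory_gb) - 1
        end = num_layers if is_last else round(num_layers * used / total)
        slices.append(range(start, end))
        start = end
    return slices
```

With four identical machines, `split_layers(60, [64, 64, 64, 64])` hands each one 15 layers. A machine with less memory simply gets a shorter slice.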
The full breakdown of tools that make this possible is in the article below.
Save and read it today ↓
about loop engineering.
everyone's saying the same thing this week. you don't prompt agents anymore, you design loops that prompt them.
here's the job that loop hands right back to you.
a loop running unattended is also a loop failing unattended.
loop engineering takes you off prompting. it takes you off curating context. it takes you off babysitting a single run. it does not take you off debugging. it just moves the debugging somewhere worse, into runs you were never watching, with far too much of it to read through by hand.
even the loop engineering posts admit this themselves, usually somewhere near the end. you can only walk away from a loop if you trust the thing checking it. a checker you don't trust drops you right back into reading every output by hand, which is the exact work the loop was supposed to take off you.
so stack the layers up, prompt, context, harness, loop, and one job survives all of them. closing the loop on failure. the leverage point moved. debugging stayed exactly where it was.
i was writing about this exact gap yesterday, before the loop talk picked up today. the idea was simple. make debugging its own loop. a failure leads to a root cause, a proposed fix, a rerun against the exact inputs that broke, and a test that locks it out for good. the checker gets built from your real failures instead of guessed at up front.
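a rough sketch of that debugging loop in Python, to show the shape of it. this is not Opik's API. `RegressionSuite` and `close_the_loop` are hypothetical names, and the agent and check are plain functions standing in for a real run and a real evaluator.

```python
from dataclasses import dataclass, field
from typing import Callable, List

Agent = Callable[[str], str]
Check = Callable[[str, str], bool]   # (input, output) -> did it pass?


@dataclass
class RegressionSuite:
    """Inputs that once broke the agent, replayed on every change."""
    cases: List[str] = field(default_factory=list)

    def add_failure(self, failing_input: str) -> None:
        if failing_input not in self.cases:
            self.cases.append(failing_input)

    def run(self, agent: Agent, check: Check) -> List[str]:
        """Return the recorded inputs that still fail."""
        return [x for x in self.cases if not check(x, agent(x))]


def close_the_loop(failing_input: str, agent: Agent, patched: Agent,
                   check: Check, suite: RegressionSuite) -> Agent:
    """Accept a fix only if it passes the exact input that broke and every older failure."""
    if check(failing_input, agent(failing_input)):
        raise ValueError("input does not reproduce the failure")
    suite.add_failure(failing_input)          # lock the failure in as a test
    still_failing = suite.run(patched, check)  # rerun against the real inputs
    return patched if not still_failing else agent
```

the suite only ever grows from inputs that really broke, which is the "checker built from real failures" idea in miniature.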
Opik, the tool i was writing about, does exactly this. a built-in agent reads the trace, finds the root cause, proposes a diff, you approve it, and that failure becomes a permanent regression test. every break you debug makes the loop a little harder to break next time, which is the kind of checker the loop engineering crowd keeps saying you need before you walk away.
if you're designing loops you actually plan to walk away from, it's worth a look.
Opik is 100% open-source under the Apache-2.0 license.
GitHub repo: https://t.co/MEC26owCdo
(don't forget to star 🌟)
loop engineering moved the leverage point. it didn't remove the engineer who still has to close the loop when something breaks.
the full article, Your Agent Harness Should Repair Itself, is quoted below.
Erik, Multi-Agent Research at Anthropic, breaks down the common failure modes and the context engineering that prevents them.
a persistent agent fails differently than a chat: no escalation rule and it floods you, no trigger and it never fires.
so each of my 17 prompts is written as a job: a trigger, a body, a rule for when to interrupt me. that took it from noisy to 5 weeks running unattended.
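here's roughly what "a prompt written as a job" could look like as a data structure. a sketch only: `Job`, `tick`, and the field names are hypothetical, not the actual prompts or harness from the article.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Job:
    name: str
    trigger: Callable[[dict], bool]          # when should this job fire?
    body: str                                # the prompt the agent runs
    should_interrupt: Callable[[str], bool]  # when is the result worth a ping?


def tick(job: Job, event: dict,
         run_agent: Callable[[str], str],
         notify: Callable[[str], None]) -> str:
    """Handle one event and return 'skipped', 'logged', or 'interrupted'."""
    if not job.trigger(event):
        return "skipped"        # no trigger match, so the job never fires
    result = run_agent(job.body)
    if job.should_interrupt(result):
        notify(f"[{job.name}] {result}")
        return "interrupted"
    return "logged"             # ran quietly, nothing worth a ping
```

leaving out `trigger` gives you the job that never fires. leaving out `should_interrupt` gives you the one that floods you.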
they name the failure modes.
the article has the prompts that dodge them.
you know what to do.
firewalls can't stop this.
A developer just open sourced a tunnel that smuggles your entire internet through port 53, the port every router on earth is forced to leave open.
It's called MasterDnsVPN. It hides your traffic inside DNS queries, the one type of packet no network can block without breaking itself.
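For a feel of how data can ride inside a DNS query at all, here is a minimal sketch of the general technique. It is not MasterDnsVPN's wire format, which I haven't inspected. The function names and the example domain are hypothetical. The only hard facts used are the DNS limits of 63 bytes per label and 253 characters per dotted name.

```python
import base64

MAX_LABEL = 63    # DNS limit on a single label, in bytes
MAX_NAME = 253    # DNS limit on a full name in dotted text form


def encode_query_name(payload: bytes, domain: str) -> str:
    """Pack payload bytes into the subdomain labels of a DNS query name."""
    text = base64.b32encode(payload).decode("ascii").rstrip("=").lower()
    labels = [text[i:i + MAX_LABEL] for i in range(0, len(text), MAX_LABEL)]
    name = ".".join(labels + [domain])
    if len(name) > MAX_NAME:
        raise ValueError("payload too large for one query name")
    return name


def decode_query_name(name: str, domain: str) -> bytes:
    """Recover the payload on the server that is authoritative for domain."""
    text = name[: -len(domain) - 1].replace(".", "").upper()
    text += "=" * (-len(text) % 8)   # restore the base32 padding that was stripped
    return base64.b32decode(text)
```

Base32 is used here because DNS names are case-insensitive. The 253-character ceiling is why a tunnel like this has to care so much about how much payload fits in each query.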
Every firewall on earth has to allow DNS. Schools, airports, hotel WiFi, entire countries running ISP-level censorship: all of them keep port 53 open or nothing on the network resolves. This repo turns that loophole into a full encrypted tunnel.
Here's what makes it different from every other DNS tunnel that came before:
→ Custom ARQ layer gives you TCP-level reliability over UDP DNS, so nothing drops even on garbage networks
→ Sends every packet through up to 12 different resolver paths at the same time; if 11 fail, the packet still arrives
→ Auto probes the maximum DNS payload your path can handle, then locks in the fastest MTU possible
→ AES-256-GCM, ChaCha20, AES-128, AES-192 all built in, pick your encryption
→ SOCKS5 proxy on 127.0.0.1:1080; point any browser or app at it and you're through
Killed: $12/mo Mullvad, $10/mo NordVPN, $15/mo Astrill, every commercial DNS tunnel charging monthly fees for the exact same idea.
Pre-built binaries for Windows, Linux AMD64, Linux ARM64, macOS ARM64. No Python install needed. Configure two DNS records, drop in the encryption key, run the executable.
Works in environments where every other VPN protocol is dead on arrival.
MIT License. 100% open source.
A CHINESE TRADER BUILT A SECOND BRAIN IN OBSIDIAN THAT GENERATES 3 TRADING IDEAS EVERY MORNING AT 6AM AND MADE $180,000 IN 6 MONTHS.
No Bloomberg terminal.
No analytics desk.
No team of analysts.
A Mac Mini by the wall.
An iPhone in his pocket.
One local Obsidian vault.
Six N8N pipelines running 24/7, pulling every article he reads, every podcast he listens to, and every voice note he drops into a Telegram bot—directly into the vault.
Every night, a neural network reads across 4,000 connected notes and finds the strongest connections between fresh information and old theses.
Every morning at 6AM, a brief lands in his inbox:
- 3 trading ideas with confidence scores
- The emerging thesis of the week
- Any note that contradicts an active position
The system only wakes him up when a fresh note contradicts his thesis, or when an idea breaks 90% confidence.
Everything else runs without him.
The monthly bill: $120 in API costs.
The monthly return: approximately $30,000 into the account.
Traditional quant funds pay teams of 8 people to produce the same flow of insights.
He pays $120 and a Mac Mini.
The full system breakdown is in the article below.
Bookmark this before you pay for a Bloomberg subscription.
Follow @cyrilXBT for every solo operator setup that changes what one person can build.
Vector databases are no longer a cloud product. They're becoming a pip install.
A new open-source project called turbovec just crossed 10K stars on GitHub. And once you understand what it does, you understand why.
It's a Rust vector index with Python bindings, built on Google Research's TurboQuant algorithm, a quantizer accepted at ICLR 2026 that compresses embeddings to within a hair of the theoretical Shannon limit.
No codebook training. No train phase. No rebuilds as your corpus grows. You add vectors, they're indexed. Done.
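To show what "no train phase" means in practice, here is a deliberately simple stand-in: a fixed-grid 4-bit scalar quantizer. This is not TurboQuant and not turbovec's code. It only illustrates that when the quantization grid is fixed up front, a new vector can be encoded the moment it arrives, with nothing to fit or rebuild. It assumes coordinates already lie in [-1, 1], as they do for unit-normalized embeddings.

```python
from typing import List

BITS = 4
LEVELS = (1 << BITS) - 1   # 15 steps between -1.0 and 1.0


def quantize(vec: List[float]) -> bytes:
    """Map each coordinate in [-1, 1] to a 4-bit code and pack two codes per byte."""
    codes = [round((min(1.0, max(-1.0, x)) + 1.0) / 2.0 * LEVELS) for x in vec]
    if len(codes) % 2:
        codes.append(0)     # pad odd-length vectors to a whole byte
    return bytes((codes[i] << 4) | codes[i + 1] for i in range(0, len(codes), 2))


def dequantize(packed: bytes, dim: int) -> List[float]:
    """Unpack the 4-bit codes and map them back onto the fixed grid."""
    codes: List[int] = []
    for b in packed:
        codes += [b >> 4, b & 0x0F]
    return [c / LEVELS * 2.0 - 1.0 for c in codes[:dim]]
```

Each coordinate comes back within half a grid step (1/15) of its original value. A real quantizer like the one described gets far better accuracy per bit, but the add-and-it's-indexed property comes from the same idea.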
The headline number: A 10 million document corpus takes 31 GB of RAM as float32. turbovec fits it in 4 GB and searches it faster than FAISS.
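The headline number is easy to sanity-check with a few lines of arithmetic. The post doesn't state the embedding dimension or the bit width, so the 768 dimensions and 4 bits per coordinate below are my assumptions, picked because they land close to the quoted figures. `index_size_gb` is a hypothetical helper that counts raw vector storage only, with no index overhead.

```python
def index_size_gb(num_vectors: int, dim: int, bits_per_coord: float) -> float:
    """Raw storage for the vectors alone, in decimal gigabytes."""
    return num_vectors * dim * bits_per_coord / 8 / 1e9


N = 10_000_000
DIM = 768            # assumption: not stated in the post

float32_gb = index_size_gb(N, DIM, 32)   # about 30.7 GB
four_bit_gb = index_size_gb(N, DIM, 4)   # about 3.8 GB
```

Under those assumptions the float32 corpus comes to roughly 31 GB and the 4-bit version to just under 4 GB, an 8x reduction that matches the claim.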
Read that again. Faster than FAISS. The library Meta has tuned for a decade. Hand-written NEON and AVX-512 kernels beat FAISS FastScan by 12–20% on ARM and match or beat it on x86.
(And the recall benchmarks are published openly against FAISS as the baseline, including the configs where it loses. That honesty alone is rare in this space.)
But the speed isn't even the strategic part. The strategic part is what this enables:
Fully local, air-gapped RAG.
10M documents in 4 GB means your entire company knowledge base fits in the RAM of a MacBook. Pair it with an open-source embedding model and nothing (not a query, not a vector, not a document) ever leaves your machine.
It also ships drop-in replacements for the vector stores inside LangChain, LlamaIndex, and Haystack. Swap one import, keep your pipeline. The switching cost is approximately zero.
The obvious comparison is SQLite.
Databases used to be servers you provisioned and paid for. Then SQLite made the database a file inside your app, and an entire category of managed infrastructure became optional for most use cases. The same compression-driven collapse is now coming for vector search.
Every startup selling "managed vector search" as a line item should be paying attention. When the index fits in laptop RAM, runs faster than the industry standard, and installs in one line, the moat was never the database.
The vector database is becoming an embedded library, not a cloud service. And the frontier of RAG just moved on-device.
Really cool to see.