AI coding agents are getting better at writing tests that look convincing.
That doesn't make false PASSes rarer. It makes them harder to see.
I wrote about green CI, hollow tests, circular validation, and why agents need delivery systems.
https://t.co/e7yWOVWV0I
There's an issue with the selected model (claude-fable-5). It may not exist or you may not have access to it. Run /model to pick a different model.
I'm so sad.
As a result of a US government directive, we are suspending access to Claude Fable 5 for all users. You can continue to use all other Claude models.
Here’s what this means for you:
Across Claude products, new sessions will run on your selected default model or Opus 4.8, and existing Fable 5 sessions will end with an error.
On the Claude Platform, requests to Fable 5 will also return an error. Please update your integrations to other Claude models.
We know this is a disruption to your workflows; we appreciate your patience and support.
gpt-5.3-codex is gone from the freshly fetched codex debug models catalog. It now lists only gpt-5.5, gpt-5.4, gpt-5.4-mini, and gpt-5.3-codex-spark. But the old models_cache.json still contains gpt-5.3-codex, so picking it from a stale catalog or session makes the server return a 400 error.
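A minimal sketch of guarding against that, assuming both the cached and the fresh catalog can be reduced to plain lists of model id strings. The real models_cache.json layout isn't shown here, so `pick_model` and its arguments are hypothetical:

```python
def pick_model(requested: str, fresh_catalog: list[str], fallback: str) -> str:
    """Return `requested` only if the freshly fetched catalog still lists it.

    Hypothetical helper: checking a cached choice locally avoids sending a
    retired model id to the server and getting a 400 back.
    """
    if requested in fresh_catalog:
        return requested
    return fallback


# Model ids are the ones named in the post; the fallback choice is arbitrary.
fresh = ["gpt-5.5", "gpt-5.4", "gpt-5.4-mini", "gpt-5.3-codex-spark"]
stale_choice = "gpt-5.3-codex"  # still present in the old cache
chosen = pick_model(stale_choice, fresh, fallback="gpt-5.5")
print(chosen)
```

The point is only that the fresh catalog, not the cache, decides what gets sent.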
Do not overthink CLAUDE.md.
It can be as simple as:
“I write practical business copy.
Keep it plain.
Avoid vague corporate language.
If something is unclear, say so.
Check the file before answering.
Tell me what changed after edits.”
That is enough to start.
Make it useful before you make it perfect.
If you use ChatGPT, Claude, or coding agents every day, make one file for them.
Not a complex system.
Just a plain note that says:
how you work
what AI may do
what AI must not decide
what tone to avoid
what to verify before replying
This is the simplest version of CLAUDE.md.
Start with one page.
You can expand the context window to 1M tokens in Claude Code by running:
/model opus[1m]
or
/model claude-opus-4-6[1m]
This gives you a much larger context window compared to the default 200k, which means Claude can hold more of your codebase, conversation history, and tool results in memory without compaction kicking in.
This is part 2 of the False PASS series.
Part 1 (the problem): why AI agents produce tests that pass but verify nothing
Part 2 (the system): delivery gates — and why they fail too
For hands-on tools: False PASS Detection Kit ($19)
https://t.co/jyvUdObSUl
Your AI agent says "done." Tests pass. CI is green. Everything looks fine.
Then an independent reviewer finds 59 defects.
New essay: what we built after that — and how it broke, too.
A delivery gate has three rules:
1. Pass criteria defined before the agent starts — not after
2. The verifier never shares context with the builder
3. Physical execution — run the software, don't read the report
Simple rules. Hard to follow. Here's what happens when you don't:
https://t.co/JcgD357TTU
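The three rules can be sketched in a few lines. This is an illustration under assumptions, not the system from the essay: `PassCriterion` and `run_gate` are hypothetical names, and "the software" is stood in for by one-line Python programs.

```python
import subprocess
import sys
from dataclasses import dataclass


@dataclass(frozen=True)
class PassCriterion:
    """One check, written down before the builder agent starts (rule 1)."""
    name: str
    command: list[str]      # what to actually run (rule 3)
    expected_stdout: str    # exact output that counts as a pass


def run_gate(criteria: list[PassCriterion]) -> dict[str, bool]:
    """Run each criterion in a fresh process and compare the real output.

    The gate receives only commands and expected output. It never sees the
    builder's conversation or its self-reported results (rule 2).
    """
    results = {}
    for c in criteria:
        proc = subprocess.run(c.command, capture_output=True, text=True, timeout=30)
        results[c.name] = (
            proc.returncode == 0 and proc.stdout.strip() == c.expected_stdout
        )
    return results


# Hypothetical criteria: the second program only *claims* success.
criteria = [
    PassCriterion("adds", [sys.executable, "-c", "print(2 + 3)"], "5"),
    PassCriterion("report_says_ok", [sys.executable, "-c", "print('all tests passed')"], "5"),
]
results = run_gate(criteria)
print(results)
```

A program that prints "all tests passed" fails this gate, because the gate compares observed behavior against a criterion fixed in advance rather than trusting a report.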
@jun_song The scary part is not that AI might be conscious.
It’s that our definition of “conscious” has been running on vibes longer than anyone wanted to admit.
@embirico The underrated part is not “future work.”
It’s outsourced anxiety.
A tiny ops manager living next to the repo, tapping your shoulder before the kitchen catches fire.
@thdxr Haiku at $0.62 is like that one seasoned kitchen hand who does all the prep work perfectly for the price of a coffee. Unpretentious and essential.
The backstory: I've been running multi-agent AI development daily. In the past year I've caught this pattern more times than I'd like to admit.
The free essay explains the problem: https://t.co/ftAZZM2nlm
The kit is the solution you use Monday morning.
I just published my first paid product.
"False PASS Detection Kit" — 3 PDF tools for catching the moment when your AI agent's tests all pass but nothing works.
$19 on Gumroad 🧵👇
What's inside:
- Detection Checklist — 5 checks to run before merging any AI-generated PR
- Destructive Verification Runbook — prove a test has value by proving it can fail
- Agent Review Separation Guide — 3 models for splitting implementation from verification
👉 https://t.co/5IL6dMS7AZ
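"Prove a test has value by proving it can fail" can be shown with a toy example. This is a sketch of the general idea, not the contents of the runbook; every function name below is made up for illustration.

```python
def apply_discount(price: float, percent: float) -> float:
    """Correct implementation: take `percent` off `price`."""
    return price * (1 - percent / 100)


def broken_discount(price: float, percent: float) -> float:
    """Deliberately broken variant: ignores the discount entirely."""
    return price


def hollow_test(fn) -> bool:
    # Looks like a test, but only checks that a float comes back.
    return isinstance(fn(200.0, 25.0), float)


def real_test(fn) -> bool:
    # Checks the actual value: 25% off 200 is 150.
    return abs(fn(200.0, 25.0) - 150.0) < 1e-9


def can_fail(test, good, broken) -> bool:
    """A test has value only if it passes on good code and fails on broken code."""
    return test(good) and not test(broken)


hollow_has_value = can_fail(hollow_test, apply_discount, broken_discount)
real_has_value = can_fail(real_test, apply_discount, broken_discount)
print(hollow_has_value, real_has_value)
```

Both tests are green against the correct code. Only the second one turns red when the code is broken on purpose, which is the whole difference between a PASS and a false PASS.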