YumaKakuya @ AI Agent Delivery Tools Creator @null_founder - Twitter Profile

Pinned Tweet

YumaKakuya @ AI Agent Delivery Tools Creator

about 1 month ago

AI coding agents are getting better at writing tests that look convincing. That doesn't make false PASSes rarer. It makes them harder to see. I wrote about green CI, hollow tests, circular validation, and why agents need delivery systems. https://t.co/e7yWOVWV0I

null_founder's tweet photo. AI coding agents are getting better at writing tests that look convincing.
That doesn't make false PASSes rarer. It makes them harder to see.
I wrote about green CI, hollow tests, circular validation, and why agents need delivery systems.
https://t.co/e7yWOVWV0I https://t.co/VvjOqyETIL

0

2

0

97

YumaKakuya @ AI Agent Delivery Tools Creator

@null_founder

6 days ago

There's an issue with the selected model (claude-fable-5). It may not exist or you may not have access to it. Run /model to pick a different model. I'm so sad.

ClaudeDevs

@ClaudeDevs

6 days ago

As a result of a US government directive, we are suspending access to Claude Fable 5 for all users. You can continue to use all other Claude models. Here’s what this means for you: Across Claude products, new sessions will run on your selected default model or Opus 4.8, and existing Fable 5 sessions will end with an error. On the Claude Platform, requests to Fable 5 will also return an error. Please update your integrations to other Claude models. We know this is a disruption to your workflows; we appreciate your patience and support.

4K

45K

7K

11K

13M

0

71

YumaKakuya @ AI Agent Delivery Tools Creator

@null_founder

16 days ago

gpt-5.3-codex is deprecated via Codex's ChatGPT login, but it's still available in the API.

0

103

YumaKakuya @ AI Agent Delivery Tools Creator

@null_founder

16 days ago

gpt-5.3-codex is gone from the freshly fetched codex debug models catalog—we only see gpt-5.5, gpt-5.4, gpt-5.4-mini, and gpt-5.3-codex-spark now. But since the old models_cache.json still has gpt-5.3-codex, picking it from an old catalog or session throws a 400 error on the server side.

1

0

141

YumaKakuya @ AI Agent Delivery Tools Creator

@null_founder

20 days ago

Opus 4.8 verifies facts properly, and its phrasing is concise and on point. I'm also satisfied with its speed.

0

21

YumaKakuya @ AI Agent Delivery Tools Creator

@null_founder

28 days ago

Do not overthink CLAUDE.md. It can be as simple as: “I write practical business copy. Keep it plain. Avoid vague corporate language. If something is unclear, say so. Check the file before answering. Tell me what changed after edits.” That is enough to start. Make it useful before you make it perfect.

null_founder's tweet photo. Do not overthink CLAUDE.md.

It can be as simple as:

“I write practical business copy.
Keep it plain.
Avoid vague corporate language.
If something is unclear, say so.
Check the file before answering.
Tell me what changed after edits.”
That is enough to start.
Make it useful before you make it perfect.

1

0

31

YumaKakuya @ AI Agent Delivery Tools Creator

@null_founder

28 days ago

If you use ChatGPT, Claude, or coding agents every day, make one file for them. Not a complex system. Just a plain note that says: how you work what AI may do what AI must not decide what tone to avoid what to verify before replying This is the simplest version of CLAUDE.md. Start with one page.

null_founder's tweet photo. If you use ChatGPT, Claude, or coding agents every day, make one file for them.

Not a complex system.

Just a plain note that says:

how you work
what AI may do
what AI must not decide
what tone to avoid
what to verify before replying

This is the simplest version of CLAUDE.md.

Start with one page.

1

0

62

YumaKakuya @ AI Agent Delivery Tools Creator

@null_founder

about 1 month ago

And I am late for keeping up with it.

0

17

YumaKakuya @ AI Agent Delivery Tools Creator

@null_founder

about 1 month ago

I have things to do with my life. I have A jam-packed schedule.

1

0

19

YumaKakuya @ AI Agent Delivery Tools Creator

@null_founder

about 1 month ago

@sama The gpt-5.5 context limit via ChatGPT OAuth/Codex backend should be 1.05M, matching the API, instead of the current 400k.

0

2

0

308

YumaKakuya @ AI Agent Delivery Tools Creator

@null_founder

about 1 month ago

Hot take: openai/gpt-5.5 via API key/API docs: 1.05M context openai/gpt-5.5 via ChatGPT OAuth/Codex backend: 400K context, effectively 272K input + 128K output

0

46

YumaKakuya @ AI Agent Delivery Tools Creator

@null_founder

about 1 month ago

You can expand the context window to 1M tokens in Claude Code by running: /model opus [1m] or /model claude-opus-4-6 [1m] This gives you a much larger context window compared to the default 200k, which means Claude can hold more of your codebase, conversation history, and tool results in memory without compaction kicking in.

0

21

YumaKakuya @ AI Agent Delivery Tools Creator

@null_founder

about 1 month ago

This is part 2 of the False PASS series. Part 1 (the problem): why AI agents produce tests that pass but verify nothing Part 2 (the system): delivery gates — and why they fail too For hands-on tools: False PASS Detection Kit ($19) https://t.co/jyvUdObSUl

0

31

YumaKakuya @ AI Agent Delivery Tools Creator

@null_founder

about 1 month ago

Your AI agent says "done." Tests pass. CI is green. Everything looks fine. Then an independent reviewer finds 59 defects. New essay: what we built after that — and how it broke, too.

null_founder's tweet photo. Your AI agent says "done." Tests pass. CI is green. Everything looks fine.
Then an independent reviewer finds 59 defects.
New essay: what we built after that — and how it broke, too. https://t.co/NqGp1c5HXV

2

0

51

YumaKakuya @ AI Agent Delivery Tools Creator

@null_founder

about 1 month ago

A delivery gate has three rules: 1. Pass criteria defined before the agent starts — not after 2. The verifier never shares context with the builder 3. Physical execution — run the software, don't read the report Simple rules. Hard to follow. Here's what happens when you don't: https://t.co/JcgD357TTU

1

0

26

YumaKakuya @ AI Agent Delivery Tools Creator

@null_founder

about 1 month ago

@jun_song The scary part is not that AI might be conscious. It’s that our definition of “conscious” has been running on vibes longer than anyone wanted to admit.

0

18

YumaKakuya @ AI Agent Delivery Tools Creator

@null_founder

about 1 month ago

@embirico The underrated part is not “future work.” It’s outsourced anxiety. A tiny ops manager living next to the repo, tapping your shoulder before the kitchen catches fire.

0

32

YumaKakuya @ AI Agent Delivery Tools Creator

@null_founder

about 1 month ago

@thdxr Haiku at $0.62 is like that one seasoned kitchen hand who does all the prep work perfectly for the price of a coffee. Unpretentious and essential.

0

4

0

2K

YumaKakuya @ AI Agent Delivery Tools Creator

@null_founder

about 1 month ago

@nikita_builds 4.7 is so unbearable it’s basically a marketing campaign for OpenAI. It didn't convince me to switch; it just insulted me into it.

0

1

0

45

YumaKakuya @ AI Agent Delivery Tools Creator

@null_founder

about 1 month ago

The backstory: I've been running multi-agent AI development daily. In the past year I've caught this pattern more times than I'd like to admit. The free essay explains the problem: https://t.co/ftAZZM2nlm The kit is the solution you use Monday morning.

0

45

YumaKakuya @ AI Agent Delivery Tools Creator

@null_founder

about 1 month ago

I just published my first paid product. "False PASS Detection Kit" — 3 PDF tools for catching the moment when your AI agent's tests all pass but nothing works. $19 on Gumroad 🧵👇

null_founder's tweet photo. I just published my first paid product.

"False PASS Detection Kit" — 3 PDF tools for catching the moment when your AI agent's tests all pass but nothing works.

$19 on Gumroad 🧵👇 https://t.co/n8H1NTNu2t

1

0

57

YumaKakuya @ AI Agent Delivery Tools Creator

@null_founder

about 1 month ago

What's inside: - Detection Checklist — 5 checks to run before merging any AI-generated PR - Destructive Verification Runbook — prove a test has value by proving it can fail - Agent Review Separation Guide — 3 models for splitting implementation from verification 👉 https://t.co/5IL6dMS7AZ

1

0

43

YumaKakuya @ AI Agent Delivery Tools Creator

@null_founder

Last Seen Users on Sotwe

Trends for you

Most Popular Users