CEO @RippletideCo | Your agent decided in production. Can you reproduce it? | Decision runtime for AI agents | 3x founder, 3x exits: SAP, CircleCI, Steria
@JasonSCui @a16z published the clearest diagnosis of enterprise AI failure I've read from an investor.
Thesis: agents fail because they're blind. Build a context layer.
I agree. And that's why I wrote this.
Context tells an agent what to know. It doesn't tell it what it's authorized to do.
A bad answer is an error. A bad action is an incident.
Context layers are the prerequisite. Decision layers are the guarantee.
→ https://t.co/V80CZdF1mm
Boris Cherny of Anthropic on $NOW: "If I'm doing something and I don't have the context, I'm not going to do a great job... ServiceNow is a really a great way to bring in that context that it needs to do the job."
One of the best pieces of product engineering analysis I’ve read this year.
The key insight: the prompts aren’t documentation. They’re compiled product decisions. Every line reveals a tradeoff: initiative vs. overreach, memory vs. noise, speed vs. cache cost. Anthropic didn’t write instructions for an LLM. They wrote an operational philosophy.
The Claude vs. Codex framing is the most useful for builders. Claude pushes toward initiative. Codex pushes toward precision. Two different theories of what an agent should do when facing ambiguity: one says “move forward,” the other says “don’t drift.”
The deeper takeaway: prompt cache isn’t an infra optimization. It’s a design constraint that shapes observable product behavior. When token economics drive UX decisions, you’re doing cost architecture, not just engineering.
And for anyone building agentic systems: the value is no longer in the model. It’s in the decision layer that controls what the model does and what it doesn’t.
https://t.co/36fhz5bcoT
Hey @garrytan, just shared the addition I use in gstack: plan enforcement using Claude Code hooks. It helped beat the 5 tabs ceiling.
Feel free to incorporate if it makes sense to others https://t.co/YG8MEL6Qq9
Keycard, https://t.co/vMOn33dnBw, Operant AI. Three new products launched around RSAC 2026, all framing it as "guardrails." The distinction worth making: guardrails block categories of action. A decision runtime validates the specific action your agent is about to take, against your rules, before it executes.
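To make the guardrail vs. decision runtime distinction concrete, here is a minimal sketch in Python. Everything in it is hypothetical and for illustration only: the `Action` shape, the `BLOCKED_KINDS` set, the `refund_limit` rule and its 500 threshold are assumptions, not any vendor's API.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class Action:
    """One concrete action an agent proposes (hypothetical shape)."""
    kind: str
    amount: float
    actor: str


# Guardrail: blocks whole categories of action, whatever the specifics.
BLOCKED_KINDS = {"delete_account"}


def guardrail_allows(action: Action) -> bool:
    return action.kind not in BLOCKED_KINDS


# Decision runtime: checks this specific action against explicit rules.
# A rule returns a reason string when violated, or None when satisfied.
Rule = Callable[[Action], Optional[str]]


def refund_limit(action: Action) -> Optional[str]:
    # Hypothetical business rule: refunds above 500 need a human.
    if action.kind == "refund" and action.amount > 500:
        return "refund exceeds 500 limit"
    return None


def validate(action: Action, rules: List[Rule]) -> List[str]:
    """Return every rule violation for this action (empty means allowed)."""
    reasons = []
    for rule in rules:
        reason = rule(action)
        if reason is not None:
            reasons.append(reason)
    return reasons


def execute(action: Action, rules: List[Rule], run: Callable[[Action], object]):
    """Validate first; only run the action if no rule objects."""
    violations = validate(action, rules)
    if violations:
        raise PermissionError("; ".join(violations))
    return run(action)
```

In this sketch a 900 refund passes the category guardrail (refunds are allowed in general) but is stopped by `validate`, because the check looks at the specific amount before anything executes.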
Oracle just put AI agents in charge of finance and procurement decisions. Massive. And the right next question isn't "will it work?" It's "can you replay what the agent knew and decided, six months from now?" That's what makes it enterprise-grade.
Honest question for Claude Code users:
Is it actually saving you time, or just making coding more fun?
For me, clearly the latter.
I ship faster. I enjoy it more. But I’m not sure I’m delivering more value than before.
The dopamine of “it works!” has never been higher.
The question of “should I have built this?” has never been more ignored.
Anyone else feeling this?
New from @ETH_en: LLM-generated AGENTS.md files reduce agent performance by 3% and inflate costs by 20%. More rules = fewer rules followed. The fix is not a better context file. It's enforcement at the execution layer, not in the prompt. https://t.co/uPovS2RF38 https://t.co/WAfOyf07VE
You wrote the rules in CLAUDE.md. The agent read them, agreed, and did the thing you told it not to. The context file is not the enforcement layer. The hook is. Knowing a rule is not the same as enforcing it. https://t.co/cjzD4Kc74k
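A rough sketch of what "the hook is the enforcement layer" can look like, in Python. It assumes the hook receives the proposed tool call as JSON with `tool_name` and `tool_input` fields, and that exit code 2 signals "block"; that is my understanding of Claude Code's pre-tool-use hooks, so check the current hooks documentation before relying on it. The forbidden patterns are hypothetical.

```python
import json
from typing import Tuple

# Hypothetical rules: the same ones you might have written in CLAUDE.md.
FORBIDDEN_SUBSTRINGS = ("rm -rf", "git push --force")

ALLOW = 0
BLOCK = 2  # assumed "block this tool call" exit code


def decide(payload: dict) -> Tuple[bool, str]:
    """Return (allowed, reason) for one proposed tool call."""
    if payload.get("tool_name") != "Bash":
        return True, ""
    command = payload.get("tool_input", {}).get("command", "")
    for pattern in FORBIDDEN_SUBSTRINGS:
        if pattern in command:
            return False, f"blocked by rule: {pattern!r} is not allowed"
    return True, ""


def run_hook(stdin_text: str) -> Tuple[int, str]:
    """Parse the hook input and return (exit_code, stderr_message)."""
    allowed, reason = decide(json.loads(stdin_text))
    return (ALLOW, "") if allowed else (BLOCK, reason)
```

Wired up as a script, this would read `sys.stdin.read()`, print the message to stderr, and exit with the returned code. The point is that the rule runs on every call, whether or not the agent "remembered" it.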
@NVIDIA ships NemoClaw at #GTC26: policy guardrails for AI agents. Right direction. But guardrails block categories. A decision runtime validates each specific decision against your enterprise rules, before execution. The fence is not the judge. @RippletideCo
Upstream scope change triggers a re-evaluation cascade. The graph knows which downstream decisions referenced that node, so you invalidate selectively, not a full replay. That’s the core reason markdown falls short: you can’t query “what depends on this.” We call it decision lineage. It’s also what makes audits tractable.
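The "what depends on this" query can be sketched as a small dependency graph in Python. This is an illustration of the idea only, not our implementation; the `DecisionGraph` class and the node names in the usage note are hypothetical.

```python
from collections import defaultdict, deque


class DecisionGraph:
    """Tracks which decisions referenced which upstream nodes."""

    def __init__(self):
        # upstream node -> decisions that directly referenced it
        self._dependents = defaultdict(set)

    def record(self, decision, depends_on):
        """Register a decision and the nodes it referenced."""
        for upstream in depends_on:
            self._dependents[upstream].add(decision)

    def downstream_of(self, node):
        """All decisions that directly or transitively referenced `node`."""
        seen = set()
        queue = deque([node])
        while queue:
            current = queue.popleft()
            for dependent in self._dependents.get(current, ()):
                if dependent not in seen:
                    seen.add(dependent)
                    queue.append(dependent)
        return seen
```

If `choose_vendor` referenced `scope` and `draft_contract` referenced `choose_vendor`, then `downstream_of("scope")` returns both, while an unrelated decision such as `pick_logo` is left untouched. Only that subset needs re-evaluation.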
16 @AnthropicAI agents built a 100,000-line Rust C compiler from scratch. 2,000 sessions. $20,000 in API. How do you validate the decisions 16 agents made across 100K lines before it hits production? The right question nobody's asking. https://t.co/e7EWfHXsQ2