1/ I ran Claude Code for six months. Then I ran Codex. Then I ran both at the same time, hoping the combined cost would finally justify the output.
It did not.
The problem was not the models. Claude is excellent. The frontier models are the best available.
The problem was memory.
Every session starts from zero. The agent does not remember what you taught it last week, what worked, what failed, or why a decision was made.
I wanted an agent that gets smarter over time because it remembers what it learns.
🧵
@saen_dev@ClementDelangue Real safety work happens at system architecture. That's what OverCR is.. a deterministic state machine between the agent and its tools. Every action, transition, and decision logged to an append-only audit trail. Guardrails are theater. Architecture is safety.
The architecture is legitimate; proper nonce management, real MultiversX tx handling, the README anchor on mainnet is verifiable. Appreciate you sharing it.
For our stack though, it's not the right fit. We already have git for immutable audit trails and OverCR's integrity hash chains for session-level verification. Adding on-chain anchoring would mean every vault write costs $0.05+ and introduces a dependency on MultiversX network availability and a repo that's primarily AI-generated with no open source license. The marginal value over what we already have is near zero for our use case.
Good luck with the project though! if you ever need legal-grade proof of existence use cases, the architecture is solid.
1/ I ran Claude Code for six months. Then I ran Codex. Then I ran both at the same time, hoping the combined cost would finally justify the output.
It did not.
The problem was not the models. Claude is excellent. The frontier models are the best available.
The problem was memory.
Every session starts from zero. The agent does not remember what you taught it last week, what worked, what failed, or why a decision was made.
I wanted an agent that gets smarter over time because it remembers what it learns.
🧵
Ran into the same wall. The fix isn't teaching the agent what to throw away, it's never storing the wrong thing in the first place. Built OverCR with schema-constrained fact fences.. the agent reads curated context for each task instead of dumping everything into a wiki that bloats. Intentionally constructed context beats garbage collection.
100%. The tool boundary is where agents actually die.. a garbage response or malformed schema cascades through the whole loop, not just one call. Built OverCR to handle exactly this: structured context files the agent reads each cycle so it can tell when the tool output drifted from what it was supposed to produce.
@cms_@Marie_Haynes Building the same direction with OverCR. Structured context files the agent reads each cycle. Markdown vault as the agent's external memory. Portable, inspectable, file-based. No vector DB. We’re early.
@danielchalef@krishnan The eval design issue makes the broader point: memory quality depends on what you retrieve and how you constrain it, not just what you store. Built OverCR with schema-constrained fact fences in a markdown vault so the agent reads curated context, not retrieved noise.
@kreoxi Video is on the way of using my stack. Right now I just have my github repos up with the entire codebase I use alongside Hermes and Obsidian. The article explains how I set them all up to work together.
All 4 pieces are open source, OverCR and Cammander can be picked up and configured. Let me get a solid prompt for Hermes that will set up all of the skills needed. Obsidian you’d want to set up yourself. The full article is a little more detailed if you want to pick it apart. I’ll be making a PR for OverCR within Cammander while Hermes watches while recording for a full tutorial and demo.
@krishnan AI memory sounds like personalization. In practice, it's context governance. Built a structured context layer that reads from a markdown vault each cycle so the agent can tell when it's drifting.. no vector DB, no similarity search, just deterministic facts picked for the task.
11/
The result is not polished like a commercial product.
It is better than a commercial product because it knows me.
The models will come and go. And it’s not your decision when they do.
The memory stays. No matter who tells you what models you’re allowed access to.
Full article:
https://t.co/mtbbvmukE1
10/
The third theory is deterministic governance.
I do not want a language model deciding whether governance rules should apply.
Rules should be rules. Models should be models.
Keep those responsibilities separate.