And yet we are civilized people! So-called society can’t see what is right. What happened to us? When did we become so numb and dumb that facts and truth don’t touch us?
Amazing how Tucker Carlson, of all people, can succinctly sum up what happened with Jeremy Corbyn in a way that the media and political class here will still ardently deny
Eid Mubarak!
This is my gift to the community
https://t.co/fVn97wiHZl
I’ll share more details in the future, but in short: if you want admissibility in your agent ecosystem, and want control and trust, then I have the answer for you! It’s an extension. @ollama @temporalio
The uncomfortable truths I’m finding:
We now have powerful frameworks, durable execution, checkpoints, time-travel debugging, observability, and even “agent harness” libraries.
Yet the majority of ambitious agent projects still collapse for the same three silent killers:
•Change or remove one node in the graph → downstream failures you can’t easily trace.
•An agent mutates state directly → replays become unreliable and side effects duplicate.
•The final answer looks confident… but there’s zero machine-checkable proof it’s actually safe, authorized, or grounded in what really happened.
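To make the second killer concrete, here is a minimal sketch of one common mitigation, assuming an append-only event log with idempotency keys. Every name in it (`Event`, `Journal`, `record`, `replay`) is hypothetical and not taken from any framework mentioned in this thread. The idea: agents never mutate state directly, side effects run at most once per key, and state is rebuilt only from the log.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass(frozen=True)
class Event:
    key: str       # idempotency key (hypothetical naming scheme)
    kind: str      # event type; only "set" is handled in this sketch
    payload: dict  # data merged into state on replay


@dataclass
class Journal:
    events: list[Event] = field(default_factory=list)
    _seen: set[str] = field(default_factory=set)

    def record(self, event: Event, effect: Callable[[Event], None]) -> bool:
        """Run the side effect once per key, then append the event."""
        if event.key in self._seen:
            return False  # retried step: skip the side effect entirely
        effect(event)
        self._seen.add(event.key)
        self.events.append(event)
        return True


def replay(events: list[Event]) -> dict:
    """Rebuild state from the log alone; no side effects run here."""
    state: dict = {}
    for e in events:
        if e.kind == "set":
            state = {**state, **e.payload}
    return state


# Usage: the same step is submitted twice, as a retry would do.
sent: list[Event] = []
journal = Journal()
step = Event(key="step-1", kind="set", payload={"ticket": "open"})
journal.record(step, sent.append)
journal.record(step, sent.append)
```

Because `replay` only reads the log, running it twice yields the same state, and the retried step does not repeat its side effect. This sketch does nothing for the first and third killers, and a real system would also need durable storage for the log.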
We keep throwing more orchestration, more prompts, and more observability at the problem.
But what if the real gap isn’t another framework or a better prompt library?
What if the missing piece is something more fundamental — a different way to think about the entire harness itself?
A way that lets the mesh evolve safely at scale instead of becoming more fragile every time you touch it.
I’ve been deep in this with multiple systems, and the more I look, the clearer it becomes: most of us are still building production agents on architectural sand.
Question for the AI engineering community:
•Are these failure modes inevitable with today’s tools, or are we collectively missing a more foundational layer?
•Is the industry over-investing in smarter agents… while under-investing in the harness that actually makes them trustworthy at scale?
•Who else is hitting these exact walls in production — and what’s your war story?
I’m genuinely curious. The best insights usually come from the comments.
#AgenticAI #AIEngineering #ProductionAI #AutonomousAgents #LLM
@bcherny @thsottiaux @trq212 I’m a big fan of you folks. Any thoughts or corrections appreciated here!
🚨This short film exposes the true story of Israel's creation, entirely through the words of its own founders
For decades, Israel’s lies have been carefully designed to demonize its victims. To ensure that no matter the crimes it commits, the children it slaughters, the world remains incapable of empathy for Palestinians, or at best, treats it with the same apathy that has allowed this to go on for as long as it has
On the 78th anniversary of its creation, it's time the world knew that Israel's past is not different from the brutal present the world is finally seeing
Built a local coding harness powered by Gemma 4.
It runs locally, connects to my model backend, starts coding sessions, streams responses, and uses tools through a CLI-style workflow.
Still early, but the learning has been incredible.
@googlegemma #llm @ollama
“In wrenching interviews, Palestinians have recounted to me a pattern of widespread Israeli sexual violence against men, women and even children,” our columnist Nicholas Kristof writes.
https://t.co/Q1DYudXMfc
When I switched to Codex, there were a couple of things I found:
1/ Codex is very good at execution. You do have to define the execution, and it needs a few guidelines, but it gets the job done. It covers edge cases.
2/ Codex is good at backend code. It solves complex logic really well.
3/ Codex is good at e2e testing and making things work.
So where does it fall short?
1/ Its design variation is very limited compared to Claude, meaning it lacks imagination and variety.
2/ Codex is weak at suggesting alternatives.
3/ Codex is weak at connecting with the user, psychologically speaking.