figured out I'd be the bottleneck a while ago... even before reasoning, tool calls, mcp. then it became about managing context windows. which, oddly, took me back to '96 at RPI — first time I tried coding recursive self-improvement loops in Lisp. one thing led to another... and here we are. me + the agent fleet + a context repo, in the same loops.
Koinos is a bank for what you know.
For 5,000 years it stayed inert — in your head, lost when you moved on. AI makes knowledge runnable, and compoundable.
We built the bank for it: one founder + an agent fleet, turning what you know into capability — and into companies. 🏦
"You can offload a task, even a job — but you can never offload your learning."
Yes. @KoinosVS is the bank I've spent the past year building for exactly this — one founder's knowledge + an agent fleet compounding into capability, and into companies. Loyal to you, not any one model. Operating from production.
A frontier without an ecosystem isn't stable — so we built the ecosystem piece.
The US government, citing national security authorities, has issued an export control directive to suspend all access to Fable 5 and Mythos 5 by any foreign national, whether inside or outside the United States, including foreign national Anthropic employees.
The net effect of this order is that we must abruptly disable Fable 5 and Mythos 5 for all our customers to ensure compliance.
Access to all other Claude models is not affected.
We apologize for this disruption to our customers. We believe this is a misunderstanding and are working to restore access as soon as possible.
Read our full statement: https://t.co/bwn0sximKZ
I have been working on a tool that runs on a compounding loop of knowledge and capability.
Here's the problem it solves: today you might use ChatGPT, Claude, Gemini, and other models or tools, but each one forgets what you taught it due to its nature, and none of them share what they know with each other. You repeat yourself constantly.
The loop fixes that with two parts that feed each other:
- a memory layer that remembers everything you've learned and decided, and
- a team of AI agents that does real work for you.
The magic is the loop between them: the memory makes the agents smarter, and the agents keep growing the memory. Every time around, both get better — so your knowledge and your AI helpers *compound*, like interest building on interest, instead of resetting to zero.
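To make the loop concrete, here is a minimal sketch, assuming a very simple design: the memory is a list of notes with keyword recall, and the "agent" is a plain function. The names `MemoryLayer`, `recall`, `remember`, and `run_agent` are hypothetical and are not taken from the tool itself.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class MemoryLayer:
    """Hypothetical shared store of things learned and decided."""
    notes: list[str] = field(default_factory=list)

    def recall(self, topic: str) -> list[str]:
        # Naive keyword match; a real system would likely use better retrieval.
        return [n for n in self.notes if topic.lower() in n.lower()]

    def remember(self, note: str) -> None:
        if note not in self.notes:
            self.notes.append(note)


def run_agent(task: str, topic: str, memory: MemoryLayer) -> str:
    """Stand-in for an AI agent: read context, do the work, write back what was learned."""
    context = memory.recall(topic)                   # memory makes the agent smarter
    result = f"{task} (used {len(context)} prior notes)"
    memory.remember(f"{topic}: finished '{task}'")   # the agent grows the memory
    return result


memory = MemoryLayer()
memory.remember("pricing: we decided on annual plans only")
first = run_agent("draft pricing page", "pricing", memory)
second = run_agent("write pricing FAQ", "pricing", memory)
print(first)   # draft pricing page (used 1 prior notes)
print(second)  # write pricing FAQ (used 2 prior notes)
```

The second run sees more context than the first because the first run wrote back to the same memory. That write-back is the compounding step.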
And it works inside the AI tools you already use — so your context and smarts follow you everywhere, even as new AI models come out.
The simple version: it turns everything you and your AI learn into capital that keeps growing — and that follows you across every AI tool.
Google has published a paper that might end the transformer era.
For the last 7 years, every major AI, ChatGPT, Claude, Gemini, has been built on the exact same architecture: The Transformer.
But Transformers have a fatal flaw.
To remember context, they have to process every single word against every other word. It’s called quadratic complexity. As your prompt gets longer, the compute cost explodes.
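A quick way to see what "quadratic" means here: count the token-to-token comparisons for a single attention head in a single layer. This is an illustrative sketch only; `attention_pair_count` is a made-up helper, and real models multiply this figure by heads, layers, and hidden size.

```python
def attention_pair_count(num_tokens: int) -> int:
    """Token-to-token scores computed by full self-attention (one head, one layer)."""
    return num_tokens * num_tokens


for n in (1_000, 2_000, 4_000):
    print(n, attention_pair_count(n))
# 1000 1000000
# 2000 4000000
# 4000 16000000
```

Doubling the prompt length quadruples the number of comparisons.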
The alternative is the old-school RNN (Recurrent Neural Network). RNNs are incredibly cheap and fast, but they have a fixed memory size. If you give them a long document, they get amnesia.
Until today.
Google researchers published Memory Caching: RNNs with Growing Memory.
And it fixes the biggest bottleneck in AI.
Instead of an RNN having a fixed, rigid memory that constantly overwrites itself, Google gave it a "save" button.
The technique allows the RNN to cache checkpoints of its hidden states as it reads.
The memory capacity of the RNN can now dynamically grow as the sequence gets longer.
They built four different variants, including sparse selective mechanisms where the AI actively chooses exactly which checkpoints matter most.
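The idea of caching checkpoints and then picking the ones that matter can be sketched with a toy recurrent update. This is not the paper's method. The update rule, the fixed segment length, and the dot-product scoring are all assumptions chosen for illustration, and `rnn_step`, `read_with_cache`, and `select_checkpoints` are hypothetical names.

```python
import math


def rnn_step(state: list[float], x: list[float], decay: float = 0.5) -> list[float]:
    """Toy recurrent update: the fixed-size state is partly overwritten at every step."""
    return [math.tanh(decay * s + xi) for s, xi in zip(state, x)]


def read_with_cache(inputs: list[list[float]], segment_len: int):
    """Run the toy RNN and save a copy of the state at the end of every segment."""
    state = [0.0] * len(inputs[0])
    cache = []
    for t, x in enumerate(inputs, start=1):
        state = rnn_step(state, x)
        if t % segment_len == 0:
            cache.append(list(state))  # the "save" button: checkpoint the hidden state
    return state, cache


def select_checkpoints(cache, query, top_k):
    """Sparse selection (assumed rule): keep the top_k checkpoints most similar to the query."""
    def score(checkpoint):
        return sum(c * q for c, q in zip(checkpoint, query))
    ranked = sorted(range(len(cache)), key=lambda i: score(cache[i]), reverse=True)
    return sorted(ranked[:top_k])


# Three segments, each dominated by a different signal.
inputs = [[1.0, 0.0]] * 4 + [[0.0, 1.0]] * 4 + [[-1.0, 0.0]] * 4
final_state, cache = read_with_cache(inputs, segment_len=4)
print(len(cache))                                           # 3
print(select_checkpoints(cache, query=[0.0, 1.0], top_k=1))  # [1]
```

The final state has mostly forgotten the middle segment, but the cached checkpoint for that segment still holds it, and the cache grows by one entry per segment as the input gets longer.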
The results rewrite the rules of efficiency.
On long-context understanding and recall-intensive tasks, these new Memory-Cached RNNs closed the gap with Transformers.
They achieved competitive accuracy without the explosive, quadratic compute cost. It perfectly bridges the gap between the cheap efficiency of an RNN and the massive capability of a Transformer.
We have spent billions scaling Transformers because we thought they were the only way an AI could remember a long conversation.
But Google just proved we don't need to process the whole history every single time.
We just needed a smarter cache.
AI today feels a bit like early mainframe computing.
Huge machines clustered far away.
Everyone else on terminals.
Yes, literally terminals again…
Only this time, the terminal talks back.
The next shift may be from cloud AI to personal inference.
From mainframes to PCs.
From AI data centers to PIs.
Personal Inference.
Oddly, the same company might do it again…
Apple Home Pro, anyone?
More on mainframes and terminals here...
https://t.co/wqTzgX1KyD
Dario Amodei: Ideology Won't Survive the Reality of AI
"We're going to find that ideology will not survive the nature of this technology. The things I'm talking about are gonna become bipartisan and universal because everyone will recognize the necessity of it." — @DarioAmodei
Personal update: I've joined Anthropic. I think the next few years at the frontier of LLMs will be especially formative. I am very excited to join the team here and get back to R&D. I remain deeply passionate about education and plan to resume my work on it in time.
Today is a hard day. I shared this note with the @linear team today: We’ve made the difficult decision to increase our workforce. This is not a cost-cutting exercise or a reflection of anyone’s performance. We’re simply reimagining every role for the agentic AI era. We’re hiring. We’re sorry about that.
Semantic decompression… “Your emphasis on a continuous zoom operator for knowledge graphs is the most novel and strategically valuable part of the agenda. The closest direct formalization I found is the March 2026 SLoD preprint, which defines a continuous zoom operator over graph representations via heat diffusion and detects abstraction boundaries from spectral structure. …” #deepresearch
“In academic terms, I think semantic decompression does deserve to be named as its own subproblem, but only with a precise definition. A workable definition is: semantic decompression is retrieval-guided, provenance-preserving, temporally scoped reconstruction of a finer-grained context view from a coarser semantic object by traversing explicitly maintained anchors into a richer underlying memory substrate, while prohibiting unsupported elaboration. That is not the same as summarization, not the same as prompt decompression, not the same as hierarchical retrieval, and not the same as GraphRAG alone. It is the missing decoder-side discipline that current literatures each touch but do not unify. (https://t.co/FGC3mk3llg)”
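One way to read that definition in code, as a rough sketch under stated assumptions: a summary carries explicit anchors to records in an underlying store, and decompression returns only anchored records inside a time window, each paired with its source. The types `Record` and `Summary` and the function `decompress` are hypothetical and do not come from the quoted research.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Record:
    """A fine-grained item in the underlying memory substrate (assumed shape)."""
    record_id: str
    text: str
    recorded_on: date
    source: str


@dataclass(frozen=True)
class Summary:
    """A coarse semantic object that keeps explicit anchors to the records behind it."""
    text: str
    anchors: tuple[str, ...]


def decompress(summary: Summary, store: dict[str, Record],
               since: date, until: date) -> list[tuple[str, str]]:
    """Rebuild a finer-grained view using only anchored records inside the time window."""
    view = []
    for record_id in summary.anchors:
        record = store.get(record_id)
        if record is None:
            continue  # no supporting record: emit nothing instead of elaborating
        if since <= record.recorded_on <= until:
            view.append((record.text, record.source))  # text travels with its provenance
    return view


store = {
    "r1": Record("r1", "Chose annual plans only", date(2025, 3, 1), "pricing-notes"),
    "r2": Record("r2", "Tested monthly plans", date(2023, 6, 1), "old-experiments"),
}
summary = Summary("Pricing is annual-only", anchors=("r1", "r2", "r9"))
view = decompress(summary, store, since=date(2025, 1, 1), until=date(2025, 12, 31))
print(view)  # [('Chose annual plans only', 'pricing-notes')]
```

Here `r2` is dropped because it falls outside the time window, and `r9` is dropped because nothing in the store supports it. Those two checks stand in for the "temporally scoped" and "prohibiting unsupported elaboration" parts of the definition.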