Commentary is one of the most important pillars of X. And sometimes the best way to share your thoughts is with video.
Today we're launching a whole new way to make them:
React with Video
Tap the repost button and start recording with green screen, split screen, or picture-in-picture.
Now available on iOS
We integrated with @squire_bot because it makes sense on the cost-savings side.
Clude, via our cognitive memory architecture, already helps users save tokens through accurate and optimised memory retrieval (no need to scan through the entire context dump).
@squire_bot, through https://t.co/Hdz1tbgVOZ, lends the inference capability to bring token savings to another level.
Inference and memory as a tokenised commodity.
All made possible through the @solana chain
The biggest bottleneck in enterprise AI isn't intelligence.
It's Memory.
Not model memory.
Organizational memory.
The ability to understand what happened, why it happened, who was involved, and how decisions evolved over time.
Over the last year, this became one of the most common problems we encountered across enterprise onboarding, deployments, and expansion efforts.
Not because companies lacked data. They had too much of it.
Customer calls, internal meetings, Slack threads, documents, browser tabs, tickets, and notes. Months of accumulated context spread across dozens of systems and hundreds of conversations.
The information existed. The context didn't.
We've talked about Cortex before.
What's changed is why we're building it.
What started as an open-source memory engine is becoming a core part of our Corporate GTM and FDE strategy because solving the context problem consistently creates leverage across every stage of enterprise adoption.
The challenge wasn't generating answers.
The challenge was reconstructing context.
> Why was this decision made?
> What did we commit to in that customer call?
> When was this first discussed?
> Who originally proposed this approach?
> What happened before I joined the project?
These sound like simple questions. Inside most organizations, they're surprisingly difficult to answer.
The more customers we worked with, the clearer the pattern became.
Every organization has an intelligence layer.
Very few have a memory layer.
So we kept building.
This week we hit the milestone we've been working toward.
Cortex now runs entirely on-device, which we achieved by integrating Ollama.
The full memory stack, from multimodal ingestion and on-device embeddings to vector retrieval, knowledge graph construction, temporal indexing, entity extraction, salience scoring, memory decay, and end-to-end recall, runs locally.
x No cloud processing.
x No external vector databases.
x No hidden synchronization.
x No data leaving the machine.
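As a rough illustration of how salience scoring and memory decay could combine at recall time, here is a minimal Python sketch. The `Memory` fields, the exponential half-life, and the `recall_score` formula are all assumptions made for illustration, not Cortex's actual implementation.

```python
from dataclasses import dataclass

@dataclass
class Memory:
    text: str
    salience: float   # assumed importance in [0, 1]
    age_days: float   # assumed time since the memory was last reinforced

def recall_score(memory: Memory, similarity: float, half_life_days: float = 30.0) -> float:
    """Blend query similarity, salience, and exponential decay (hypothetical formula)."""
    decay = 0.5 ** (memory.age_days / half_life_days)
    return similarity * memory.salience * decay

def recall(memories, similarities, top_k=3):
    """Return the top_k memories ranked by the blended score."""
    ranked = sorted(
        zip(memories, similarities),
        key=lambda pair: recall_score(pair[0], pair[1]),
        reverse=True,
    )
    return [memory for memory, _ in ranked[:top_k]]
```

With a 30-day half-life, a memory that has gone 30 days without reinforcement counts half as much as a fresh one of equal salience.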
Most memory products stop at screenshots and semantic search.
Cortex is designed to model context itself. People, projects, conversations, decisions, timelines, and the relationships between them. Not just what happened, but why it mattered.
The goal isn't to capture more information.
The goal is to make accumulated knowledge usable.
For us, Cortex is becoming foundational infrastructure for enterprise onboarding, customer adoption, and institutional memory. A system that compounds context over time instead of letting it disappear into meetings, messages, and documents.
The next step is turning Cortex into a desktop experience beautiful enough to live in every day.
We're currently rolling Cortex out through a limited number of enterprise engagements as part of our FDE program.
If organizational memory is becoming a bottleneck inside your company, send me a DM.
This week, the OpenAI Foundation announced a $250M initiative to support research, infrastructure, and new ideas for navigating the economic transition AI will bring.
The size of the commitment matters.
It reflects a growing recognition that the institutions governing work, ownership, and economic participation were built for a different era.
We believe one important piece of that conversation is still missing.
Everyone is focused on models.
Everyone is focused on data.
But as agents begin to inherit the knowledge, judgment, and experience of workers, another asset is emerging:
Memory.
Memory is what makes an agent useful.
It captures context.
It accumulates decisions.
It compounds experience.
And yet there is no accepted framework for ownership, attribution, portability, or governance.
Which raises a question that will become increasingly difficult to ignore:
Who owns the memory that creates the value?
Today, the honest answer is nobody!
The standards established during the next few years will shape how value flows for decades. Not only between platforms and enterprises, but between workers, institutions, and society itself.
At Clude Labs, we have a front-row seat to this transition.
We are a for-profit company.
Our enterprise products will create significant value for the organizations that adopt them.
That reality is precisely why we believe this conversation matters.
In the coming week, we'll publish our thesis and grant proposal for PMP, an open memory standard designed as public infrastructure for the agent economy.
Because not every layer of the future should be privately owned.
#RFP Submission:
Clude Labs partnering with {REDACTED} to fine-tune the first memory-specific model
This grant proposal outlines a joint research initiative between Clude Labs and {REDACTED} to prototype and evaluate the first production-grade memory-tuned language model architecture.
Research on memory-conditioned models has existed for years. The core insight is straightforward: model weights encode persistent behavior, preferences, and personality more effectively than prompts or retrieval systems ever will. Despite this, no commercially deployed system today performs per-user memory fine-tuning at scale.
The primary blocker has not been compute infrastructure. It has been training data quality.
Fine-tuning directly on raw conversational history consistently produces degraded outputs. Contradictory user statements reinforce conflicting behaviors, stale preferences remain embedded indefinitely, and high-noise interaction logs dilute meaningful long-term signal. Existing approaches to personalized model adaptation have therefore remained largely confined to research environments, where curated datasets could be controlled manually.
Clude Labs approaches the problem differently.
Our memory architecture was designed from inception for distillation and consolidation rather than naive retrieval. The system already classifies memories by type, including durable facts, behavioral patterns, preferences, decisions, and contextual states. It continuously evaluates recency, confidence, conflict resolution, and relative importance across memories over time.
This curation layer transforms memory from an unstructured chat archive into a high-signal training substrate suitable for model adaptation.
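A minimal sketch of what such a curation step might look like, assuming each memory carries a key, a confidence score, and an observation day. The `MemoryRecord` fields, the confidence threshold, and the "most recent wins" rule are hypothetical choices for illustration, not the actual Clude Labs pipeline.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class MemoryRecord:
    key: str           # what the memory is about, e.g. "preferred_language"
    value: str
    kind: str          # e.g. "fact", "preference", "decision"
    confidence: float  # assumed score in [0, 1]
    day: int           # day index when the statement was observed

def curate(records, min_confidence=0.6):
    """Keep one record per key: the most recent one that clears the confidence bar."""
    latest = {}
    for record in records:
        if record.confidence < min_confidence:
            continue  # drop low-confidence noise
        current = latest.get(record.key)
        if current is None or record.day > current.day:
            latest[record.key] = record  # newer statement replaces the stale one
    return sorted(latest.values(), key=lambda r: r.key)
```

Contradictory statements about the same key collapse to a single training example, so a stale preference is not reinforced alongside its replacement.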
On top of this infrastructure, we propose training per-user LoRA adapters: lightweight low-rank parameter deltas applied to a frozen base model. Rather than retraining the foundation model itself, individualized adapters encode user-specific behavioral priors and long-term personalization while maintaining efficient inference and serving costs.
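For readers unfamiliar with LoRA, the sketch below shows the standard low-rank update, W' = W + (alpha / r) · B · A, in plain Python. The helper names and the tiny matrices are illustrative assumptions; a real adapter would be trained with a deep learning framework rather than hand-built lists.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def apply_lora(w, a, b, alpha):
    """Return W + (alpha / r) * B @ A, where B is d_out x r and A is r x d_in."""
    r = len(a)                # the rank is the number of rows in A
    scale = alpha / r
    delta = matmul(b, a)      # low-rank update with the same shape as W
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]

def adapter_params(d_out, d_in, r):
    """Number of trainable values in one adapter (A and B together)."""
    return r * (d_out + d_in)
```

For a 4096 × 4096 layer, a rank-8 adapter stores 65,536 values against roughly 16.8 million in the frozen weight matrix, which is why per-user adapters are cheap to store and swap.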
The proposed research program focuses on four core objectives:
1️⃣ Training Pipeline Validation
Develop and evaluate end-to-end pipelines for generating clean memory distillations suitable for continual adapter fine-tuning.
2️⃣ Dynamic LoRA Rank Selection
Investigate adaptive rank allocation strategies where sparse memory profiles utilize smaller adapters while users with dense, consolidated histories receive higher-capacity representations.
3️⃣ Training Cadence Optimization
Measure the effectiveness of nightly versus weekly adapter refresh cycles, including retention quality, behavioral consistency, and inference stability.
4️⃣ Production-Scale Serving Architecture
Prototype low-latency adapter loading and routing infrastructure capable of supporting real-user deployment volumes.
The central hypothesis is that curated memory fine-tuning will outperform retrieval-only personalization systems across retention, coherence, and long-term behavioral alignment metrics.
If validated, this architecture represents a meaningful shift in how personalization is implemented in AI systems. Instead of models repeatedly retrieving fragmented context about a user, the model itself incrementally adapts over time through persistent individualized weight updates.
The end state is not simply an assistant that remembers information about a user.
It is a model that gradually becomes shaped by them.
@nikitabier @MarioNawfal Develop a new file type with author metadata in it
If this video comes from a WhatsApp group, how can you assess the author? Idk our phones have serial numbers maybe we can link them up to authenticate authors
Hyperbridge had processed nearly half a billion dollars in transaction volume before April.
My daily ritual was to look at the transactions we were enabling and feel proud that people could rely on & trust our infrastructure for such large volumes.
We will come back stronger
We've already paid out over $120,000 on @hackenproof to security researchers actively pentesting the @hyperbridge protocol.
Let the grind continue
https://t.co/838Cckcl0b
Corporate AI is leaving the tokenmaxxing phase and entering the mature P&L phase.
Microsoft scaled back internal Claude Code licenses.
Uber burned through its 2026 AI budget in 4 months.
GitHub is moving to usage-based billing.
Cost control isn't a footnote anymore. CFOs are pulling the reins.
But most "AI savings" come with a quiet downgrade. Smaller models. Local tradeoffs. Weaker reasoning. Cheaper bills wrapped in the language of efficiency.
That's fine for hobby projects, but it breaks when the work touches revenue, ops, compliance, or judgment.
ChatGPT Memory and Claude Projects exist, but those are product features, not infrastructure. Any developer building on Claude or OpenAI APIs still pays the full conversation-history tax on every call. Every turn re-sends the whole thread. Turn 50 is carrying ~24,500 tokens of dead weight.
Some dollar math (Claude Sonnet 4.5, $3/M input + $15/M output, ~500 tokens/turn):
➡️ 50 turns/day per user: $54/month
➡️ 100 turns/day: $243/month
➡️ Multi-agent: $729/month
Every 2× in turns ≈ 4× in cost. For a 10k-user agent product that's $29M/year.
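The quadratic growth is easy to check. The sketch below assumes every turn adds 500 tokens and re-sends the full prior thread as input; the function names are illustrative. It does not try to reproduce the monthly dollar figures above, which depend on assumptions not stated here (output tokens, sessions per day, days per month).

```python
def history_tokens(turn, tokens_per_turn=500):
    """Tokens of prior conversation re-sent as input on a given 1-indexed turn."""
    return (turn - 1) * tokens_per_turn

def total_input_tokens(turns, tokens_per_turn=500):
    """Input tokens for a whole conversation when each turn re-sends the thread."""
    # Each turn sends its own new tokens plus everything that came before it.
    return sum(
        history_tokens(t, tokens_per_turn) + tokens_per_turn
        for t in range(1, turns + 1)
    )

def input_cost_usd(turns, tokens_per_turn=500, usd_per_million=3.0):
    """Input-side cost of one conversation at a flat per-million-token price."""
    return total_input_tokens(turns, tokens_per_turn) * usd_per_million / 1_000_000
```

Under these assumptions, `history_tokens(50)` is 24,500, and doubling a conversation from 50 to 100 turns raises total input tokens from 637,500 to 2,525,000, just under 4×.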
Anthropic's own engineering blog reports that multi-agent systems use about 15× more tokens than chat interactions.
Clude keeps the same frontier model, no downgrades. The difference is retrieval: pull in what's relevant instead of re-sending everything. 60-90% lower token cost on multi-turn workloads. Around 83% on multi-agent systems where that 15× bloat compounds.
Pair it with decentralized inference (https://t.co/Hdz1tbgVOZ) and you're looking at ~90% lower AI spend, same model quality.
CFOs aren't going to care about your architecture until it shows up on the P&L.
This shows up on the P&L.
// Shipped PMP v0.1 Beta: Portable Memory Protocol.
AI memory today is a vendor lock-in trap. Switch providers and context disappears. No way to verify what is real. No way to revoke access without trusting someone deleted your data.
PMP fixes that with four primitives any server can implement:
✅ DISCOVER: hybrid retrieval (vector + keyword)
✅ RETRIEVE: fetch by content hash
✅ VERIFY: public on-chain proof, no auth
✅ CONTRIBUTE: write memory + mint commitment
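As an illustration of what hybrid retrieval in DISCOVER could look like, here is a minimal Python sketch that blends cosine similarity with keyword overlap. The scoring functions, the 50/50 weight, and the `discover` signature are assumptions for illustration; the PMP spec may define ranking differently.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def keyword_score(query, text):
    """Fraction of query words that also appear in the text."""
    query_words = set(query.lower().split())
    text_words = set(text.lower().split())
    return len(query_words & text_words) / len(query_words) if query_words else 0.0

def discover(query, query_vec, memories, weight=0.5, top_k=3):
    """memories is a list of (text, vector) pairs; returns top_k texts by blended score."""
    scored = [
        (weight * cosine(query_vec, vec) + (1 - weight) * keyword_score(query, text), text)
        for text, vec in memories
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:top_k]]
```

The keyword term catches exact names and identifiers that embeddings can blur, while the vector term catches paraphrases that share no words with the query.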
VERIFY matters most.
It recomputes hashes from current state, diffs against stored state, resolves the Solana commitment, and detects drift by default. Content never goes on-chain. Legacy memo transactions that may expose plaintext return verified status but withhold transaction links.
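The core of that check, recomputing a content hash and comparing it with what was committed, can be sketched in a few lines of Python. SHA-256, the `MemoryStore` class, and the status strings are assumptions for illustration, and the in-memory set merely stands in for resolving a Solana commitment.

```python
import hashlib

def content_hash(content: str) -> str:
    """Content address of a memory (SHA-256 is an assumed choice)."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()

class MemoryStore:
    def __init__(self):
        self.contents = {}        # content hash -> content, kept off-chain
        self.commitments = set()  # stand-in for on-chain commitments (hashes only)

    def contribute(self, content):
        """Write a memory and record its commitment; returns the content hash."""
        digest = content_hash(content)
        self.contents[digest] = content
        self.commitments.add(digest)
        return digest

    def retrieve(self, digest):
        """Fetch a memory by its content hash."""
        return self.contents[digest]

    def verify(self, digest):
        """Recompute the hash from current state and compare it with the commitment."""
        content = self.contents.get(digest)
        if content is None:
            return "missing"
        if content_hash(content) != digest:
            return "drift"  # stored content no longer matches its address
        return "verified" if digest in self.commitments else "uncommitted"
```

Because only the hash is committed, anyone can confirm a memory is unchanged without the content itself ever being published.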
Encryption is nearly done: envelope encryption, owner-held keys, revocable delegation. Revoke a provider and access ends cryptographically, not contractually. Search still works on encrypted memories at the same speed and quality. That was the hard part.
v0.2: 402-gated retrieves composing with x402, SUBSCRIBE for SSE streaming, ATTEST for third-party attestations.
868 of the first 1,000 memories are on-chain. Conformance suite runs green every 6h across prod and staging.
Live in prod. Spec open on GitHub.
Build against it.
I received many questions when I previously introduced Dream Cycles for Clude, and many people dismissed it as vapourware.
I’m glad frontier firms see it differently, which shows why building as a startup is so exciting. Often we’re ahead and able to ship fast.
Clude was probably the first in the AI memory space to incorporate dream cycles, and I strongly believe tokenised memories will come next.