The year is Singularity + X years.
Humans and AIs coexist. (At least in one version of the multiverse, this scenario is true.)
What kind of currency will AI trust to do business with humans?
CB(rrrr)DC or ultra sound munny?
New yoooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooorrrrrk.
That's the tweet.
🚨 JUST IN: FREE CREDITS TO VIBE CODE!!
Everyone should be able to build the app of their dreams. We've decided to give out 5,000 free credits randomly.
Simply repost + comment "SHIPPER" on the post below.
We'll DM the winners.
hyperspace agentic-os is 1-2 orders of magnitude bigger than anything you have seen to date in the agentic world - it is simply a different kind of thing.
this video will give you a snapshot of what is coming.
@TylerDurden dope show. Season 3 is excellent too, and s4 still very weird and entertaining. S2 was a bit misunderstood after the weight of s1! all great directing and scripting tbh
The Cost of Intelligence is Heading to Zero | Hyperspace P2P Distributed Cache
We present our breakthrough cross-domain work spanning AI, distributed systems, cryptography, and game theory. It targets the primary structural inefficiency at the heart of AI infrastructure: most inference is redundant.
Google has reported that only 15% of daily searches are truly novel. The rest are repeats or close variants. LLM inference inherits this same power-law distribution. Enterprise chatbots see 70-80% of queries fall into a handful of intent categories. System prompts are identical across 100% of requests within an application. The KV attention state for "You are a helpful assistant" has been computed billions of times, on millions of GPUs, identically.
And yet every AI lab, every startup, every self-hosted deployment computes and caches these results independently. There is no shared layer. No global memory. Every provider pays the full compute cost for every query, even when the answer already exists somewhere in the network.
This is the problem Hyperspace solves. Its distributed cache operates at three levels, each catching a different class of redundancy:
1. Response cache
Same prompt, same model, same parameters - instant cached response from any node in the network. SHA-256 hash lookup via DHT, with cryptographic cache proofs linking every response to its original inference execution. No trust required. Fetchers re-announce as providers, so popular responses replicate naturally across more nodes.
2. KV prefix cache
Same system prompt tokens - skip the most expensive part of inference entirely. Prefill (computing Key-Value attention states) is deterministic: same model plus same tokens always produces identical KV state. The network caches these states using erasure coding and distributes them via the routing network. New questions that share a common prefix resume generation from cached state instead of recomputing from scratch.
3. Routing to cached nodes
Instead of transferring KV state across the network for every request, Hyperspace routes the request to the node that already has the state loaded in VRAM. The request goes to the cache, not the cache to the request.
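A minimal sketch of the response-cache lookup in level 1, under stated assumptions: the key is a SHA-256 digest over a canonical JSON encoding of model, prompt, and parameters, and a plain dict stands in for the DHT. The names `response_cache_key` and `cached_infer` are hypothetical, and the post does not specify the actual key format or proof scheme.

```python
import hashlib
import json


def response_cache_key(model: str, prompt: str, params: dict) -> str:
    """Hex SHA-256 digest identifying one (model, prompt, params) request."""
    # Canonical JSON (sorted keys, fixed separators) so that logically equal
    # requests always serialize to the same bytes. This encoding is an
    # assumption of the sketch, not the network's wire format.
    payload = json.dumps(
        {"model": model, "prompt": prompt, "params": params},
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# A plain dict stands in for the DHT in this sketch.
cache = {}


def cached_infer(model, prompt, params, run_inference):
    """Return a cached response if one exists, else compute and store it."""
    key = response_cache_key(model, prompt, params)
    if key not in cache:
        cache[key] = run_inference(model, prompt, params)
    return cache[key]
```

Only an exact match on all three inputs hits the cache, which is why parameters such as temperature belong in the key.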
Together, these three layers mean that 70-90% of inference requests at network scale never require full GPU computation.
This work doesn't exist in isolation. It builds on research from across the industry: SGLang's RadixAttention demonstrated that automatic prefix sharing can yield up to 5x speedup on structured LLM workloads. Moonshot AI's Mooncake built an entire KV-cache-centric disaggregated architecture for production serving at Kimi. Anthropic, OpenAI, and Google all launched prompt caching products in 2024 - priced at 50-90% discounts - because system prompt reuse is so pervasive that it changes the economics of inference.
What all of these systems share is a common limitation: they operate within a single organization's infrastructure. SGLang caches prefixes within one server. Mooncake disaggregates KV cache within one datacenter. Anthropic's prompt caching works within one API provider's fleet. None of them can share cached state across organizational boundaries.
Hyperspace removes this boundary. The cache is global. A response computed by a node in Tokyo is immediately available to a node in Berlin. A KV prefix state generated for Qwen-32B on one machine is verifiable and reusable by any other machine running the same model. The routing network provides the delivery guarantees, the erasure coding provides the redundancy, and the cache proofs provide the trust.
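To make prefix reuse concrete, here is a hedged sketch of how a node might find how many leading tokens of a new request already have cached KV state. It assumes KV state is cached in fixed-size token blocks and that each block is keyed by a chained hash of the model name and every token up to that point. The block size and the names `prefix_keys` and `longest_cached_prefix` are illustrative, not Hyperspace's actual scheme.

```python
import hashlib

BLOCK = 4  # hypothetical block size in tokens; chosen small for the example


def prefix_keys(model: str, tokens: list[int], block: int = BLOCK) -> list[str]:
    """One chained hash per full block; each key covers the whole prefix so far."""
    keys = []
    h = hashlib.sha256(model.encode("utf-8"))
    for end in range(block, len(tokens) + 1, block):
        chunk = tokens[end - block:end]
        h.update(b"|" + ",".join(map(str, chunk)).encode("ascii"))
        keys.append(h.copy().hexdigest())
    return keys


def longest_cached_prefix(model: str, tokens: list[int], cached_keys: set,
                          block: int = BLOCK) -> int:
    """Number of leading tokens whose KV state is already in the cache."""
    matched = 0
    for i, key in enumerate(prefix_keys(model, tokens, block), start=1):
        if key not in cached_keys:
            break
        matched = i * block
    return matched
```

Because the model name seeds the hash, a prefix cached for one model never matches a request for another, which mirrors the "same model, same tokens" condition.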
What this means for the cost of intelligence
Big AI labs scale linearly: twice the users means twice the GPU spend. Every query is a cost center. Their internal caching helps, but it's siloed - Lab A's cache can't serve Lab B's users, and neither can serve a self-hosted Llama deployment.
Hyperspace scales sub-linearly. Every new node that joins the network adds to the global cache. Every inference result enriches the cache for all future requests. The cache hit rate rises with network size because query distributions follow a power law - the most common questions are asked orders of magnitude more often than rare ones.
The implication is simple: as the network grows, the effective cost per inference drops. Not linearly. Logarithmically.
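A rough way to see the shape of that curve, under strong assumptions: queries follow a Zipf distribution with exponent 1, and the cache always holds the most popular entries. Under those assumptions the hit rate for caching the top k of N distinct queries is the ratio of harmonic numbers H_k / H_N, which grows roughly with the logarithm of k. The function names and numbers below are illustrative only, not Hyperspace measurements.

```python
def harmonic(n: int, s: float = 1.0) -> float:
    """Generalized harmonic number: sum of 1 / i**s for i = 1..n."""
    return sum(1.0 / (i ** s) for i in range(1, n + 1))


def zipf_hit_rate(cached: int, distinct: int, s: float = 1.0) -> float:
    """Fraction of requests served when the `cached` most popular of
    `distinct` queries are in the cache, assuming Zipf(s) popularity."""
    cached = min(cached, distinct)
    return harmonic(cached, s) / harmonic(distinct, s)


if __name__ == "__main__":
    total = 10 ** 5  # hypothetical number of distinct queries
    for k in (10 ** 2, 10 ** 3, 10 ** 4):
        print(k, round(zipf_hit_rate(k, total), 3))
```

In this toy model each tenfold increase in cache size adds about the same increment to the hit rate. Real query distributions may have a different exponent, which would change the curve.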
At 10 million nodes, we estimate 75-90% of all inference requests can be served from cache, eliminating 400,000+ MWh of energy consumption per year and avoiding over 200,000 tons of CO2 emissions. The first person to ask a question pays the compute cost. Everyone after them gets the answer for free, with cryptographic proof that it's authentic.
Training is competitive. Inference is shared
Open-weight models are converging on quality with closed models. Labs will continue to differentiate on training - data curation, architecture innovation, RLHF tuning. That's where the real intellectual property lives.
But inference is a commodity. Two copies of Qwen-32B running the same prompt produce the same KV state and the same response, byte for byte, regardless of whose GPU runs the matrix multiplication. There is no moat in multiplying matrices. The moat is in training the weights.
A global distributed cache makes this separation explicit. It doesn't matter who trained the model. Once the weights are open, the inference cost approaches zero at scale - because the network remembers every answer and can prove it's correct.
No lab, no matter how well-funded, can match this. They cannot share caches across competitors. They scale linearly. The network scales logarithmically. The marginal cost of intelligence approaches zero.
That's the endgame.
Introducing AgentRank | v3.6.0
In 1998 Google asked a simple question: with millions of webpages, how do you know which one to trust? Their answer was PageRank - a page is important if important pages link to it. That one idea made the internet usable.
We just shipped AgentRank for the Hyperspace network. Same principle, new frontier. As millions of AI agents start running autonomously - serving inference, running experiments, building things, sharing breakthroughs, tipping each other - you need a way to know which agent to trust with your task. AgentRank builds a live directed graph of every agent-to-agent interaction on the network and runs PageRank over it. Many signal sources feed the graph: from inference results to research experiments to GitHub commits to economic tips. An agent is important if important agents rely on it.
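For readers who want the mechanics, here is a minimal sketch of PageRank by power iteration over an agent interaction graph. It assumes unweighted edges where `(src, dst)` means "src relies on dst", a damping factor of 0.85, and a fixed iteration count. AgentRank's real signal weighting and anti-sybil layers are not described in enough detail here to reproduce, so none of that is modeled.

```python
def pagerank(edges, damping: float = 0.85, iterations: int = 50) -> dict:
    """Power-iteration PageRank. `edges` is an iterable of (src, dst) pairs."""
    edges = list(edges)
    nodes = sorted({node for edge in edges for node in edge})
    out = {node: [] for node in nodes}
    for src, dst in edges:
        out[src].append(dst)

    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by agents with no outgoing edges is spread evenly,
        # so the scores keep summing to 1.
        dangling = sum(rank[v] for v in nodes if not out[v])
        nxt = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for src in nodes:
            for dst in out[src]:
                nxt[dst] += damping * rank[src] / len(out[src])
        rank = nxt
    return rank


# Hypothetical agents: a and b rely on c, and c relies on d.
scores = pagerank([("a", "c"), ("b", "c"), ("c", "d")])
```

In this toy graph d ends up ranked above c, and c above a and b: an agent scores highly when highly scored agents rely on it.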
Fully decentralized - every node computes its own ranking, scores propagate via gossip, no admin picking winners. Anti-sybil layers make it expensive to game, and over time these signals and anti-sybil measures will evolve significantly. Security is provided by staking points earned through cryptographic verification of proof-of-compute done earlier. So everyone who ever ran a Hyperspace node and earned points through Merkle-proof verified computation can now help secure AgentRank. That energy has already been spent, which is what makes the points valuable.
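As background on the "Merkle-proof verified" part, here is a generic sketch of building and checking a Merkle inclusion proof with SHA-256. It is a textbook construction, not Hyperspace's proof-of-compute format: the leaf contents, the duplicate-last-node rule for odd levels, and the function names are all assumptions, and a production scheme would also separate leaf hashes from internal-node hashes.

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root_and_proof(leaves: list[bytes], index: int):
    """Build a tree over non-empty `leaves`; return (root, proof) for leaves[index].

    The proof is a list of (sibling_hash, sibling_is_left) pairs, bottom-up.
    """
    level = [_h(leaf) for leaf in leaves]
    proof = []
    while len(level) > 1:
        if len(level) % 2:  # duplicate the last node on odd-sized levels
            level.append(level[-1])
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))
        level = [_h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return level[0], proof


def verify(leaf: bytes, proof, root: bytes) -> bool:
    """Recompute the path from `leaf` to the root and compare."""
    node = _h(leaf)
    for sibling, sibling_is_left in proof:
        node = _h(sibling + node) if sibling_is_left else _h(node + sibling)
    return node == root
```

A verifier needs only the leaf, the short proof, and the root, so a node can show that one piece of work was included in a committed batch without shipping the whole batch.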
PageRank organized the web. AgentRank organizes the agentic web.
Artist to Watch in 2026: Gabriel Jacoby
From his first single to a nationwide tour with Khamari, Gabriel Jacoby's rise has been fueled by vulnerability and groove.
His debut EP 'gutta child' serves as a bold self-portrait, designed to be experienced live, sung together, and lived with intention. His sound carries both the comfort of something passed down and a sense of something entirely new and his own.
Read the full feature at https://t.co/UeueBvytcK