When the creator of Redis starts thinking about KV cache, pay attention.
antirez is Salvatore Sanfilippo, the Sicilian programmer best known for creating Redis.
But “creator of Redis” is almost too small a label.
Before Redis, he was already an old-school systems hacker. He built hping, worked in network security, and invented the idle scan technique. This was the packet-level, C-programming, Unix-hacker world.
Then Redis happened.
The origin was not glamorous. He was building LLOOGG, a real-time web analytics service, and needed something faster and simpler than the tools he had. So he created Redis.
That is very antirez.
Start with a real bottleneck.
Avoid unnecessary abstraction.
Expose the right primitive.
Make it fast enough that people rethink the category.
Redis did not win because it looked like a traditional database. It won because it gave developers direct access to useful data structures: strings, lists, hashes, sets, sorted sets, and streams, plus messaging primitives like pub/sub.
It made memory programmable.
That is why his return to local AI is so interesting.
With ds4, or DwarfStar 4, antirez is not just building “another local inference engine.”
He is asking a very Redis-like question:
What is the real primitive here?
For LLMs, one answer is obvious: KV cache.
Most people treat KV cache as an implementation detail. It lives in RAM or HBM, grows with context, and quietly becomes the bottleneck.
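To make "grows with context" concrete, here is a back-of-the-envelope sketch for a plain transformer with an uncompressed cache. The model dimensions are hypothetical numbers picked for illustration. They are not DeepSeek V4 Flash's configuration, and a compressed KV cache would be smaller than this formula suggests.

```python
def kv_cache_bytes(layers, kv_heads, head_dim, seq_len, bytes_per_elem=2):
    """Approximate size of an uncompressed KV cache.

    Each layer stores one key vector and one value vector per token,
    hence the leading factor of 2. All dimensions are caller-supplied
    assumptions, not values taken from any specific model.
    """
    return 2 * layers * kv_heads * head_dim * seq_len * bytes_per_elem


# Hypothetical model: 32 layers, 8 KV heads, head dim 128, 16-bit values.
per_token = kv_cache_bytes(32, 8, 128, seq_len=1)
at_100k = kv_cache_bytes(32, 8, 128, seq_len=100_000)

print(f"{per_token / 1024:.0f} KiB per token")
print(f"{at_100k / 2**30:.1f} GiB at 100k tokens")
```

With these made-up dimensions the cache costs 128 KiB per token, which is roughly 12 GiB at 100k tokens. The growth is linear in context length, which is why it turns into the bottleneck for long-running sessions.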
antirez looks at DeepSeek V4 Flash, compressed KV cache, modern MacBook SSDs, and says: maybe KV cache should not only live in RAM.
His phrase is perfect:
“The KV cache is actually a first-class disk citizen.”
That one sentence is the whole story.
If Redis made in-memory data structures feel like application infrastructure, ds4 is exploring whether local LLM state can become durable infrastructure too.
Prefill once.
Persist the cache.
Resume later.
Let long-running agents reuse expensive context instead of rebuilding everything from scratch.
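The prefill, persist, resume loop can be sketched in a few lines. This is a toy under stated assumptions, not how ds4 does it: `DiskPrefixCache`, `fake_prefill`, and `run` are hypothetical names, the "KV state" is just a Python list, and the prefix lookup is deliberately naive.

```python
import hashlib
import pickle
from pathlib import Path


class DiskPrefixCache:
    """Toy disk cache keyed by a hash of the token prefix (hypothetical design)."""

    def __init__(self, directory):
        self.dir = Path(directory)
        self.dir.mkdir(parents=True, exist_ok=True)

    def _path(self, tokens):
        key = " ".join(map(str, tokens)).encode()
        return self.dir / (hashlib.sha256(key).hexdigest() + ".kv")

    def save(self, tokens, state):
        # Persist the state that corresponds to exactly this token sequence.
        with open(self._path(tokens), "wb") as f:
            pickle.dump(state, f)

    def longest_prefix(self, tokens):
        # Naive lookup: try the full sequence, then ever shorter prefixes.
        for n in range(len(tokens), 0, -1):
            path = self._path(tokens[:n])
            if path.exists():
                with open(path, "rb") as f:
                    return n, pickle.load(f)
        return 0, None


def fake_prefill(new_tokens, state=None):
    # Stand-in for the expensive model pass: one "KV entry" per token.
    state = list(state or [])
    state.extend(t * 2 for t in new_tokens)
    return state


def run(cache, tokens):
    # Resume from the longest cached prefix, prefill only the remainder,
    # then persist the extended state for the next request.
    reused, state = cache.longest_prefix(tokens)
    state = fake_prefill(tokens[reused:], state)
    cache.save(tokens, state)
    return reused, state
```

Calling `run(cache, [1, 2, 3])` and then `run(cache, [1, 2, 3, 4, 5])` on the same directory recomputes only the last two tokens on the second call, even across process restarts. A real engine would store actual tensors and use a smarter index than hashing every prefix, but the shape of the idea is the same.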
This matters because coding agents are not normal chatbots.
They carry huge system prompts, tool definitions, repo context, prior steps, and long task histories. If every request has to resend and recompute the entire conversation, local inference will always feel fragile and wasteful.
ds4 attacks that directly.
It is a deliberately narrow engine for DeepSeek V4 Flash, focused on Metal and CUDA, high-end personal machines, special quantization, long context, HTTP API, GGUF files crafted for the engine, official-logit validation, and agent integration.
There is also a funny and very current detail: he openly says ds4 was built with strong assistance from GPT 5.5, with humans leading ideas, testing, and debugging.
That is very 2026.
A legendary C programmer using an AI coding partner to build a local AI engine, so other coding agents can run locally with persistent KV state.
It sounds recursive because it is.
And he still has the same builder energy. After ds4 took off, he wrote that the first week felt like early Redis again, with 14-hour workdays, chaos, and excitement.
That is the part I like most: a true old-school builder.
Being able to use a Codex subscription to power agents in an API way is a huge gift. I noticed that usage from Pi/OpenClaw and thought it had to be only temporary. This is Open AI to some degree.
72 hours after YC demo day, I moved to Shenzhen for 8 weeks 🤠
I'm headed back to SF with new hardware in hand (sharing more soon), but some takeaways documented below:
> If you have even the slightest ambition to found a hardware company, visit SZ. Pre-raise, pre-team, pre-idea, pre-job departure, it doesn't matter. Just go.
> Plan your visit according to a major conference that interests you. Use that conference as a supplier meeting springboard - that's your ticket to any factory under the sun.
> At the factories, ask about lead times. Don't ask about cost yet (wait on this). Your iteration rate is driven by the longest-lead-time item in your assembly. It pays to identify these parts early to build project timelines.
> Visit Huaqiangbei (read: this is a mini-city, not a building). Robotic subassemblies, batteries, chassis, electronic parts: each category has its own buildings where vendors are tightly clustered. Plan to spend 4-6 hours walking around before you find exactly what you're interested in.
> Business relationships are valuable commodities. Treat them as such. Pay attention to people, learn about them. Bring thoughtful gifts. Wait for them to sit first. With baijiu, fill the glass; with tea, leave some room. Cultural customs are fun to learn, but also convey a seriousness towards the working relationship.
> Suppliers fit cleanly into discrete buckets. The complexity of their past projects, and how well they executed them, indicates what is in scope for them. This sounds trivial, but it is important for setting your build expectations. It is easy to design a part with 12 subsequent manufacturing processes, and exceptionally hard to find a supplier to fill that order.
If you need coffee shop, food, or hotel recs, I have a few.
Move to Shenzhen! Get to building!
Codex bug report:
In my /goal task, which has been running for over a day, an unrelated turn about Minecraft got dropped in, obviously from another user’s ChatGPT session. @thsottiaux