Francois Kruta @fkruta - Twitter Profile

fkruta retweeted

about 10 hours ago

Google's tweet photo. Today we’re introducing Gemma 4 12B — our latest open model that brings advanced agentic reasoning, vision and audio directly to your laptop.

It delivers performance nearing our larger Gemma models with a much smaller total memory footprint, while being small enough to run locally with just 16GB of VRAM. It’s open and accessible for everyone to use under a permissive Apache 2.0 license.

This is all made possible by our new, unified architecture that removes separate multimodal encoders. Here’s how we did it 🧵

153

6K

750

2K

395K

Francois Kruta

@fkruta

1 day ago

@DjokovicFan_ Mensik is also a strong candidate.

0

21

fkruta retweeted

Michael Guo

@Michaelzsguo

2 days ago

When the creator of Redis starts thinking about KV cache, pay attention.

antirez is Salvatore Sanfilippo, the Sicilian programmer best known for creating Redis.

But “creator of Redis” is almost too small a label.

Before Redis, he was already an old-school systems hacker. He built hping, worked in network security, and invented the idle scan technique. This was the packet-level, C-programming, Unix-hacker world.

Then Redis happened.

The origin was not glamorous. He was building LLOOGG, a real-time web analytics service, and needed something faster and simpler than the tools he had. So he created Redis.

That is very antirez.

Start with a real bottleneck.
Avoid unnecessary abstraction.
Expose the right primitive.
Make it fast enough that people rethink the category.

Redis did not win because it looked like a traditional database. It won because it gave developers direct access to useful data structures: strings, lists, hashes, sets, sorted sets, streams, pub/sub.

It made memory programmable.

That is why his return to local AI is so interesting.

With ds4, or DwarfStar 4, antirez is not just building “another local inference engine.”

He is asking a very Redis-like question:

What is the real primitive here?

For LLMs, one answer is obvious: KV cache.

Most people treat KV cache as an implementation detail. It lives in RAM or HBM, grows with context, and quietly becomes the bottleneck.

antirez looks at DeepSeek V4 Flash, compressed KV cache, modern MacBook SSDs, and says: maybe KV cache should not only live in RAM.

His phrase is perfect:

“The KV cache is actually a first-class disk citizen.”

That one sentence is the whole story.

If Redis made in-memory data structures feel like application infrastructure, ds4 is exploring whether local LLM state can become durable infrastructure too.

Prefill once.
Persist the cache.
Resume later.
Let long-running agents reuse expensive context instead of rebuilding everything from scratch.
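
A minimal sketch of the "prefill once, persist, resume" idea, in Python with only the standard library. This is not how ds4 is implemented: `DiskPrefixCache`, `prefix_key`, and `demo` are hypothetical names, the "KV state" is a plain list standing in for real attention tensors, and scanning every prefix is a toy lookup strategy chosen for brevity.

```python
import hashlib
import json
import os
import tempfile


def prefix_key(tokens):
    """Return a stable file name for a token prefix (hypothetical helper)."""
    data = json.dumps(tokens).encode("utf-8")
    return hashlib.sha256(data).hexdigest()


class DiskPrefixCache:
    """Toy disk-backed cache: one JSON file per processed token prefix."""

    def __init__(self, directory):
        self.directory = directory
        os.makedirs(directory, exist_ok=True)
        self.tokens_processed = 0  # counts simulated prefill work

    def _path(self, tokens):
        return os.path.join(self.directory, prefix_key(tokens) + ".json")

    def _extend(self, state, new_tokens):
        # Stand-in for the model's prefill: one fake "KV entry" per token.
        for tok in new_tokens:
            state.append([tok, len(state)])
            self.tokens_processed += 1
        return state

    def _load_longest_prefix(self, tokens):
        # Try the longest prefix first and fall back to shorter ones.
        for n in range(len(tokens), 0, -1):
            path = self._path(tokens[:n])
            if os.path.exists(path):
                with open(path, "r", encoding="utf-8") as f:
                    return n, json.load(f)
        return 0, []

    def prefill(self, tokens):
        """Reuse the longest saved prefix, compute the rest, persist the result."""
        n, state = self._load_longest_prefix(tokens)
        state = self._extend(state, tokens[n:])
        with open(self._path(tokens), "w", encoding="utf-8") as f:
            json.dump(state, f)
        return state


def demo():
    """Simulate two separate sessions that share one cache directory."""
    with tempfile.TemporaryDirectory() as directory:
        first = DiskPrefixCache(directory)
        first.prefill(["system", "tools", "repo"])
        second = DiskPrefixCache(directory)  # fresh process, same disk
        second.prefill(["system", "tools", "repo", "new", "step"])
        return first.tokens_processed, second.tokens_processed
```

In this sketch the second session only processes the two new tokens, because the three-token prefix is loaded from disk instead of being recomputed.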

This matters because coding agents are not normal chatbots.

They carry huge system prompts, tool definitions, repo context, prior steps, and long task histories. If every request has to resend and recompute the entire conversation, local inference will always feel fragile and wasteful.

ds4 attacks that directly.

It is a deliberately narrow engine for DeepSeek V4 Flash, focused on Metal and CUDA, high-end personal machines, special quantization, long context, HTTP API, GGUF files crafted for the engine, official-logit validation, and agent integration.

There is also a funny and very current detail: he openly says ds4 was built with strong assistance from GPT 5.5, with humans leading ideas, testing, and debugging.

That is very 2026.

A legendary C programmer using an AI coding partner to build a local AI engine, so other coding agents can run locally with persistent KV state.

It sounds recursive because it is.

And he still has the same builder energy. After ds4 took off, he wrote that the first week felt like early Redis again, with 14-hour workdays, chaos, and excitement.

That is the part I like most: a true old-school builder.

13

209

25

114

12K

fkruta retweeted

MiniMax (official) @MiniMax_AI

3 days ago

Introducing MiniMax M3: The First Open-Weights Model to Combine Three Frontier Capabilities

- Coding & Agentic Frontier: 59.0% SWE-Bench Pro, 66.0% Terminal Bench 2.1, 34.8% SWE-fficiency, 28.8% KernelBench Hard, 74.2% MCP Atlas
- MiniMax Sparse Attention scales context to 1M
- Natively Multimodal from Step Zero

API: https://t.co/fHRdSV7BwZ
Token Plan: https://t.co/BDCycxepZw
🚀New! MiniMax Code: https://t.co/GvB4YiB6Ul

Weights & Tech Report in ~10 Days

528

8K

1K

3K

3M

I like challenges and to challenge. #web #javascript #vanilla #noframework #oop Author of @r_r_0 and @cosmochrony

5 days ago

@MatthieuBoeche Yes, the State should use AI to modernize itself and, above all, to cut spending… and then taxes.

0

1

0

22

Francois Kruta

@fkruta

5 days ago

@DjokerNole @rolandgarros Absolute legend 🐐. We all hope to see you at your best at Wimbledon. You so deserve the 25th Slam. You can make it. IDEMO Nole💪

0

137

5

1

5K

Francois Kruta

@fkruta

5 days ago

@SVolee But honestly, the result is painful to see. He so deserves his 25th title! He played a stratospheric match. 💪🐐

1

0

303

fkruta retweeted

Boris Cherny

@bcherny

6 days ago

Claude Opus 4.8 is out today. It's our strongest coding model yet: up on SWE-bench Pro (from 64.3 to 69.2) and noticeably more honest about its own work. It tells you when it's unsure and catches its own bugs instead of declaring victory early. Same price as 4.7.

432

6K

352

421

339K

Francois Kruta

@fkruta

6 days ago

@BenoitMaylin What a scandal… and on top of that, Djoko vs. Fonseca is going to be a great show…

0

108

Francois Kruta

@fkruta

7 days ago

@GrandjeanJ27374 @OlivierRoland Yes, yes, I understood perfectly. Thank you for your courteous and constructive remark 😉. These are real questions that, it seems to me, are not addressed in the video.

0

45

Francois Kruta

@fkruta

7 days ago

@Chess_Strategy Djoko is an example and an inspiration beyond sport 🐐🐐🐐🐐🐐🐐🐐

0

24

Francois Kruta

@fkruta

7 days ago

@EricLarch @_TBSO 👏🏻 Indeed, there is an opening today and an opportunity for transformation with AI.

0

75

fkruta retweeted

Logan Kilpatrick

@OfficialLoganK

9 days ago

We just launched the ability to build native Android apps directly in Google AI Studio for free!

Since launch last week, people have created more than 250,000 Android apps. Likely >99% of these folks never built an Android app before, everyone can now build, no coding required! https://t.co/3pNyVMfg56

352

5K

484

2K

529K

fkruta retweeted

antirez @antirez

11 days ago

You didn't see this coming in ds4-agent, did you?

27

437

15

134

39K

Francois Kruta

@fkruta

12 days ago

@antirez Indeed, DeepSeek is innovating a lot to bring down compute cost. An excellent post explaining this is here:

GDP @ COMPUTEX

@bookwormengr

12 days ago

https://t.co/KHnPhxJBiz

42

2K

315

3K

1M

0

169

Francois Kruta

@fkruta

12 days ago

Very deep and interesting article on DeepSeek's innovations and its likely long-term strategy.

GDP @ COMPUTEX

@bookwormengr

12 days ago

https://t.co/KHnPhxJBiz

42

2K

315

3K

1M

0

93

fkruta retweeted

DeepSeek

@deepseek_ai

12 days ago

We are making our discount permanent! 🎉 Enjoy building with DeepSeek-V4-Pro and bring your innovative ideas to life! 🚀

1K

23K

3K

6K

7M

Francois Kruta

@fkruta

12 days ago

@AgnesDupin Completely agree.

0

47

Francois Kruta

@fkruta

12 days ago

@namcios Let’s see if they stick to their resolution. ClickUp is clearly a SaaS that AI has made quite obsolete, and it is at the top of the list of SaaS tools we will stop using this year…

0

55
