DeepSeek v4 Flash with *local inference*, after 24h of playing with it: even with the 2-bit selective quantization GGUF, it is the FIRST time I feel I have a frontier model running on my computer. This is *crazy*, and probably a much stronger change in the landscape than PRO.
BTW, I vibe coded this LLM inference engine example from the official blog using Kimi K2.6 on my laptop😘.
I choose to use Zig, not because it is easy, but because it is hard.
I've never written any Zig or Metal code in my entire life, and I can just build whatever I imagine with Kimi K2.6.
https://t.co/CQM5UlYb2K
During the last week I executed very long autonomous sessions of Claude Code Opus 4.6 and Codex GPT 5.4 (both at max thinking budget), in cloned directories (refreshed every time one fell behind). I burned a lot of tokens (flat rate: my OSS free account + my PRO account)...