I asked myself: if I'm shipping with AI agents every day, why can't I see what I actually built this week?
So I built a command center for it. No React. No Node. No cloud. Just open index.html.
Here's what's inside 🧵
Open-weight models are getting scary good.
The edge isn’t always the smartest single model anymore. It’s the best system around it.
Builder + Inspector agents = magic. What are you building with this? Drop your results.
GLM 5.2 (text-only open model) just took #1 on Design Arena’s HTML web design benchmark — beating Claude Fable 5.
The wild part? It can’t even see what it builds. So we gave it eyes with Browser Use v2 multimodal QA agents.
https://t.co/zNxzYAZbpY
Try it yourself: Send this prompt to GLM 5.2 (in OpenCode/BrowserCode/etc):
https://t.co/wuy8XL9rrL
Pair any strong text model with Browser Use v2 and close the visual loop.
No more “it looks good in my head” — real verification.
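A rough sketch of what "closing the visual loop" can look like in code. Everything here is an assumption for illustration: `build_with_inspection`, `fake_build`, and `fake_inspect` are hypothetical names, not the Browser Use API and not the actual setup from the post. In a real stack the builder would call the text model and the inspector would drive a browser agent that looks at the rendered page.

```python
from typing import Callable, Optional


def build_with_inspection(
    build: Callable[[str, Optional[str]], str],
    inspect: Callable[[str], Optional[str]],
    task: str,
    max_rounds: int = 3,
) -> str:
    """Alternate a text-only builder and a visual inspector.

    `build` takes the task plus the last inspector feedback (or None)
    and returns HTML. `inspect` returns a complaint string, or None
    when the page looks fine. Both are hypothetical callables.
    """
    feedback: Optional[str] = None
    html = ""
    for _ in range(max_rounds):
        html = build(task, feedback)   # builder cannot see the page
        feedback = inspect(html)       # inspector can
        if feedback is None:           # nothing left to fix
            break
    return html


# Stand-ins so the sketch runs without any model or browser.
def fake_build(task: str, feedback: Optional[str]) -> str:
    if feedback is None:
        return "<h1>Hi</h1>"
    return "<h1 style='color:#111'>Hi</h1>"


def fake_inspect(html: str) -> Optional[str]:
    return None if "color" in html else "Heading has low contrast"


result = build_with_inspection(fake_build, fake_inspect, "landing page")
```

The `max_rounds` cap is a design choice in this sketch: it keeps a builder and inspector that never agree from looping (and billing) forever.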
I’ve been testing similar stacks — the cost drop is insane while output stays strong for most dev/agent tasks.
Perfect if you’re:
• Building agents
• Running heavy coding workflows
• Tired of API bills eating your margin
Who else is done with $20-50 daily spend?
R.I.P. $200+/mo on Claude Opus for everything.
If you're burning cash on frontier models for simple tasks, this changes the game:
local Ollama + cheap GLM 5.2 API + Claude Code for the hard stuff.
https://t.co/85k5xHCpmz
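One way to picture the three-tier split is a tiny router that sends each task to the cheapest tier that can handle it. This is a minimal sketch under stated assumptions: the tier names, the 1-to-3 difficulty scale, and `pick_route` are all made up for illustration, and how you score a task's difficulty is left open.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    name: str
    max_difficulty: int  # highest difficulty score this tier should take


# Hypothetical tiers, ordered cheapest first.
ROUTES = [
    Route("local-ollama", 1),  # simple tasks stay on the local model
    Route("glm-api", 2),       # mid-level tasks go to the cheap API
    Route("claude-code", 3),   # the hard stuff
]


def pick_route(difficulty: int) -> str:
    """Return the cheapest tier whose ceiling covers the task."""
    for route in ROUTES:
        if difficulty <= route.max_difficulty:
            return route.name
    # Anything off the scale falls through to the strongest tier.
    return ROUTES[-1].name
```

For example, `pick_route(1)` stays local while `pick_route(3)` escalates to the most capable tier.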
The real gold? Building autonomous loops with memory + verification. Stop copying prompts.
Start agents that run unsupervised, learn, and verify their own work.
This guide gives the exact 4-condition test + copy-paste setup to get your first one live this week.
@vschmalbach Decode speeds are night and day compared to the old llama.cpp backend.
It handles 8k–16k tokens very smoothly, but once you push past ~32k–40k the speed drops off noticeably (prefill especially). Still way better than before.
What context lengths are you typically running?
Why the hell am I still paying for ChatGPT/Claude when I can run frontier models locally?
MiniMax M3 (1M context)
GLM-5.2 & Kimi K2.7 (coding beasts)
Qwen3 / DeepSeek-R1
Blazing fast on Apple Silicon with the new MLX update
Ollama is straight-up winning in 2026
This is how you build a real AI Second Brain with Obsidian + Claude Desktop. Based on @karpathy’s LLM Wiki concept → video below
https://t.co/AnTbVD1qnE
@karpathy Why this hits different:
Your AI stops hallucinating generic answers and starts operating on your real context, history, and standards. It’s private, local-first, and actually yours.
The future of personal knowledge isn’t in the cloud — it’s in your graph.