Meet Gemma 4 12B!
A unified, encoder-free multimodal model designed to bring high-performance intelligence directly to your laptop, and released under an Apache 2.0 license.
Bridging the gap between edge efficiency and advanced reasoning. Here is what’s new with Gemma 4 12B: 👇
@forloopcodes parallel sub-agents can probably be made useful with enough guarding around file writes but people who literally have multiple harnesses open at once are stupid
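for what "guarding around file writes" could look like, here's a minimal sketch, assuming the sub-agents run as threads in one process. `GuardedWriter` is a made-up name, not part of any harness. it takes a per-path lock and writes through a temp file, so two agents can't interleave output in the same file:

```python
import os
import tempfile
import threading
from collections import defaultdict


class GuardedWriter:
    """Hypothetical helper: serialize writes per path and make each one atomic."""

    def __init__(self):
        self._locks = defaultdict(threading.Lock)  # one lock per absolute path
        self._registry_lock = threading.Lock()     # protects the lock table itself

    def _lock_for(self, path):
        key = os.path.abspath(path)
        with self._registry_lock:
            return self._locks[key]

    def write(self, path, text):
        with self._lock_for(path):
            directory = os.path.dirname(os.path.abspath(path))
            # write to a temp file in the same directory, then swap it in
            fd, tmp = tempfile.mkstemp(dir=directory)
            try:
                with os.fdopen(fd, "w", encoding="utf-8") as f:
                    f.write(text)
                os.replace(tmp, path)  # atomic rename over the target
            except BaseException:
                if os.path.exists(tmp):
                    os.unlink(tmp)
                raise
```

this only covers threads inside a single process. agents running as separate processes would need OS-level file locks or a single writer process instead.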
@analogalok i tested the full-precision BF16 weights with an F16 kv cache and stock llama.cpp inference settings @ 128k context (essentially a production deployment) under the pi harness. it struggles deeply with tool calls (outputting json sometimes?) and will often fail to close its own thinking block
@arm64le the models have executive dysfunction. if you give them a complex problem they're like "implementing the middleware... considering the architecture..." in the CoT, it's so bizarre. and they output that long before any tool calls. very hesitant models
@arm64le that being said though you should give qwen 3.6 27b a try at some point. really cool little model, incredibly capable if you don't let it stop until an independent verifier makes sure it's done a good job (i.e. i use a pi extension for /goal)
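the "don't let it stop until a verifier signs off" idea can be sketched as a plain loop. this is an illustration only, not how the pi /goal extension works. `run_until_verified`, `toy_worker` and `toy_verifier` are hypothetical names, with the toy functions standing in for the model call and the independent checker:

```python
from typing import Callable, Tuple


def run_until_verified(
    work: Callable[[str], str],
    verify: Callable[[str], Tuple[bool, str]],
    goal: str,
    max_rounds: int = 5,
) -> Tuple[bool, str, int]:
    """Re-prompt the worker until the verifier accepts or the rounds run out."""
    prompt = goal
    result = ""
    for round_no in range(1, max_rounds + 1):
        result = work(prompt)
        ok, feedback = verify(result)
        if ok:
            return True, result, round_no
        # feed the verifier's objection back instead of letting the model stop
        prompt = f"{goal}\n\nverifier feedback: {feedback}"
    return False, result, max_rounds


# toy stand-ins (assumptions, not real model calls)
def toy_worker(prompt: str) -> str:
    return "done, tests pass" if "verifier feedback" in prompt else "done"


def toy_verifier(result: str) -> Tuple[bool, str]:
    if "tests pass" in result:
        return True, ""
    return False, "no evidence the tests were run"


ok, result, rounds = run_until_verified(toy_worker, toy_verifier, "fix the bug")
```

the cap on rounds is there so a model that never satisfies the verifier can't loop forever.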