@demian_ai ok but the rectangular panels still have to come from a cylindrical ingot, right? So it's just a question of at what step you cut the circle into rectangles?
Nassim Taleb: pick two people at random
If their combined height is 4.1m, it's basically 2.05 + 2.05.
If their combined wealth is $36M, it's almost never 18 + 18 - it's ~$1,000 and ~$36M.
Height lives in "Mediocristan," where the average tells you everything.
Wealth - and markets - live in "Extremistan," where one event dominates the whole picture.
Ruin there never comes from a string of bad days.
It comes from a single one.
~1hr lecture, free. The Black Swan author at Cambridge on why the statistics you were taught break exactly where it matters.
Being right on average means nothing if one tail empties the account.
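A quick way to see the two regimes for yourself. This is a toy sketch, not anything from the lecture: the normal distribution for height (mean 1.70m, sd 0.10m), the Pareto tail with alpha 1.1 for wealth, and the helper name `top_share` are all assumptions picked for illustration. It draws random pairs, keeps the 1% of pairs with the largest combined total, and asks how much of that total the bigger member holds.

```python
import random

def top_share(draw, pairs=200_000, top_frac=0.01, seed=0):
    """Among the pairs with the largest combined total, return the
    average fraction of that total held by the larger member."""
    rng = random.Random(seed)
    samples = [(draw(rng), draw(rng)) for _ in range(pairs)]
    # Sort so the most extreme combined totals come first.
    samples.sort(key=lambda p: p[0] + p[1], reverse=True)
    extreme = samples[: int(pairs * top_frac)]
    return sum(max(a, b) / (a + b) for a, b in extreme) / len(extreme)

# Assumed toy distributions (illustrative only):
def height(rng):
    return rng.gauss(1.70, 0.10)      # thin-tailed, "Mediocristan"

def wealth(rng):
    return rng.paretovariate(1.1)     # heavy-tailed, "Extremistan"

height_share = top_share(height)
wealth_share = top_share(wealth)

print(f"height: larger person holds {height_share:.0%} of an extreme total")
print(f"wealth: larger person holds {wealth_share:.0%} of an extreme total")
```

Under these assumptions the height share should sit close to one half, while the wealth share should sit close to one: an extreme total in the heavy-tailed case is almost entirely one person.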
@mikesimonsen I think maybe you are right; the signage should be like “coffee in 117 seconds” because the impression is you are just going to stand around waiting
Gemma 4 dropped a 12B.
I put it on RTX 5090 against its 31B sibling.
when you cut a model from 31B to 12B, what do you actually lose?
~ reasoning barely moves
GSM8K (math) 97.5 → 96.4 (−1.1)
ARC-C (sci reasoning) 97.6 → 94.0 (−3.6)
~ knowledge falls off a cliff
MMLU (world knowledge) 87.8 → 78.9 (−8.9)
HellaSwag (commonsense) 92.0 → 81.6 (−10.4)
~~~
parameters store facts, not thinking. the 19B you delete is mostly where the model kept its trivia and world-priors. cut it and recall collapses, while the reasoning machinery stays nearly whole.
a 12B reasons almost like its big brother. It just knows less.
122 tok/s vs 53 (2.3x faster generation), ~10GB instead of ~24GB, which leaves 20GB+ free on a 32GB card for long context or a second model.
so it depends on your workload:
reasoning / math / agentic loops = the 12B is nearly free
broad-knowledge Q&A with no retrieval = the one job where the 31B is worth paying for.
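If you want to check the arithmetic, here is a minimal sketch that recomputes the deltas, the speedup, and the leftover VRAM from the numbers quoted in this thread. It only uses the figures above (including the rough ~10GB footprint); the variable names are mine.

```python
# Scores as quoted in the thread: (31B, 12B)
scores = {
    "GSM8K": (97.5, 96.4),
    "ARC-C": (97.6, 94.0),
    "MMLU": (87.8, 78.9),
    "HellaSwag": (92.0, 81.6),
}

# Change in score when going from the 31B to the 12B, in points.
deltas = {name: round(small - big, 1) for name, (big, small) in scores.items()}

# Generation speed: 122 tok/s (12B) vs 53 tok/s (31B).
speedup = round(122 / 53, 1)

# Approximate VRAM left on a 32GB card with the ~10GB 12B model loaded.
free_vram_gb = 32 - 10

for name, delta in deltas.items():
    print(f"{name}: {delta:+.1f}")
print(f"speedup: {speedup}x, free VRAM: ~{free_vram_gb}GB")
```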
"Gemma 4 12B delivers benchmark performance nearing our larger 26B model", so it's worse than 26B-A4B and strictly worse than 31B; I think it's great to improve efficiency, but I want smarter models, not smaller models that are not quite as smart as the existing ones.