With only ~428B params and ~23B activated params,
M3 still handles frontier coding + long-horizon agents + native multimodal (text, image, video) at 1M-token context.
Few open-weight models do any of this. M3 does all of it.
Thanks @baseten 🚀
The major bottleneck in software now is the step between 95% done and ready to ship.
Here you are battling:
1) spec drift
2) endless micro-bugs
3) absolutely necessary optimisations to monolithic slop code, and
4) unexpected, random behavior from absolutely unnecessary "fallbacks"
Yet, if you don't experience this, you are NGMI.