@irnlx83 hoping to release the new product this month. but we don't want to misspeak. once we have better certainty on timelines we will start sharing more publicly
NVIDIA just dropped benchmarks showing 4-bit inference loses less than 1 point vs BF16 on most tasks.
It's not accuracy per request that you should be measuring. It's tasks completed per dollar. And on that metric, 4-bit wins by a landslide.
Read the full blog
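A rough way to write down the "tasks completed per dollar" comparison, as a sketch rather than anything taken from the benchmarks: the symbols are assumptions introduced here, with p the fraction of attempted tasks that complete successfully and c the dollar cost per attempted task, subscripted by precision.

```latex
\[
\text{tasks per dollar} = \frac{p}{c},
\qquad
\frac{p_{4}}{c_{4}} > \frac{p_{16}}{c_{16}}
\iff
\frac{c_{4}}{c_{16}} < \frac{p_{4}}{p_{16}}
\]
```

If 4-bit gives up less than a point of accuracy, the ratio on the right sits just below 1, so under this simple model 4-bit comes out ahead whenever its cost per attempt is lower than BF16's by more than that small margin.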
@OpheliaMystic There are several other components to the stack that will be open-sourced in the coming weeks / months.
Stay tuned, and keep an eye on the repos
7/ Layout algebra is formalized in Lean 4. 26 theorems, 0 sorry.
Properties extracted to RapidCheck tests.
The art/ directory has 23 SVG visualizations - we drew pictures until we understood.
Open Source Release
mdspan-cute: a zero-overhead bridge between C++23 std::mdspan and CUTLASS cute layouts.
One header. Swizzled memory. No bank conflicts.
Read the blog and check out the repo (links in reply)
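For readers new to swizzling, here is a minimal host-side sketch of why an XOR swizzle removes bank conflicts. It is not the mdspan-cute API: the function names, the 32x32 tile of 4-byte elements, and the particular XOR pattern are all assumptions chosen for illustration, modeling shared memory as 32 banks that are each 4 bytes wide.

```cpp
#include <array>

constexpr int kBanks = 32;  // assumed: 32 shared-memory banks, 4 bytes wide
constexpr int kTile  = 32;  // assumed: 32x32 tile of 4-byte elements

// Plain row-major offset (hypothetical helper): (row, col) -> row * kTile + col.
constexpr int linear_offset(int row, int col) {
    return row * kTile + col;
}

// XOR swizzle (hypothetical helper): permute the column inside each row
// by the row index, so a logical column is spread across all banks.
constexpr int swizzled_offset(int row, int col) {
    return row * kTile + (col ^ row);
}

// Worst-case number of accesses landing in one bank when 32 threads
// each read one element of the same logical column (thread i reads row i).
template <typename OffsetFn>
int worst_column_conflict(OffsetFn offset) {
    int worst = 0;
    for (int col = 0; col < kTile; ++col) {
        std::array<int, kBanks> hits{};  // per-bank access counts, zeroed
        for (int row = 0; row < kTile; ++row) {
            hits[offset(row, col) % kBanks]++;
        }
        for (int h : hits) {
            if (h > worst) worst = h;
        }
    }
    return worst;
}
```

Calling `worst_column_conflict(linear_offset)` reports every row of a column mapping to the same bank, while `worst_column_conflict(swizzled_offset)` reports one access per bank, because XOR with a fixed column value is a permutation of the row indices 0..31.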
@cv_alphas @grok If their broad definition stands, it has implications for other industries, including drones, robotics, and more, not just the EVs that the patent later says it applies to
6/ On "bit augmentation":
Log/exp is a bijection. Information in = information out.
You can't create precision from a reversible transformation.
Thermodynamics doesn't allow it.
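One way to state the bijection argument in information-theoretic terms, as a sketch: X is assumed here to be the discrete set of stored bit patterns, f any invertible map such as a log/exp encoding, and Y whatever quantity you want to recover. None of these symbols appear in the original thread.

```latex
\[
H\bigl(f(X)\bigr) = H(X),
\qquad
I\bigl(f(X);\,Y\bigr) = I(X;\,Y)
\quad \text{for any bijection } f
\]
```

Both equalities hold because f only relabels the outcomes of X: the probabilities attached to them are unchanged, so neither the entropy nor the information shared with Y can increase.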