@NaveenGRao @jefrankle @Muhtasham9 @SquashBionic @MosaicML @neuralmagic Great, Naveen. Help people train on GPUs to get sparsity/quantization into the model, and then the NM DeepSparse engine will deliver the 10x speedup on CPUs. You don't need to do anything beyond training, just effective sparsity in models, which @jefrankle knows a thing or two about ...
@jefrankle @Muhtasham9 @NaveenGRao @SquashBionic @MosaicML @neuralmagic Isn't the $ savings https://t.co/xdtOQ4mFUe can deliver in inference cost per $ of training via your platform also a great measure? If you effectively deliver users a sparse-quantized model that is 10x faster, isn't that a worthy measure to consider?
@jefrankle @Muhtasham9 @NaveenGRao @SquashBionic @MosaicML @neuralmagic Jonathan, it could be interesting to think about the target hardware (CPU) independently of the hardware used for training. If you guys work on training sparse-quantized models (on CPUs or GPUs), we will deliver GPU latencies on CPUs with them...