@bcherny IDK if you work on the VS Code extension, but it's insanely laggy on high-latency connections with Remote - SSH. The UI specifically. It took 45 minutes the other day for the GUI to catch up to work Claude had already finished on a 600 ms RTT connection.
@DigThatData @epiqueras1 Surprisingly, I've had like no issues with fixed shapes? I guess it prevents you from doing dropless MoE in the obvious way, as someone mentioned, but you can turn it into a block-sparse matmul. And I like fixed shapes because they let you do cost-based graph optimizations.
I got my Colab notebook to work again. Sadly, with the limits on GPU time, Colab isn't really useful anymore (and they broke CLIP, somehow). But I guess you can use it with a local runtime. https://t.co/bC6sSqcvo3
Hourglass + Diffusion = ❤️
We introduce a new transformer backbone for diffusion models that can directly generate megapixel images without the need for multiple stages like latent diffusion.
Read here! → https://t.co/h6mCtajfUJ
Project page → https://t.co/7LUi8Uo8p3