@SubhoGhosh02 Wrote a blog post about this a week or so ago, actually. I found it mainly depends on whether or not the kernel is persistent. Hilbert seemed to win in the non-persistent case, but regular CTA swizzling works better in the persistent case
https://t.co/Kxo2icDt6i
Wrote about a few ways to add depth to a GEMM kernel. The mbarrier’s role in the producer/consumer pipeline is especially elegant, and I also talk about multi-stage pipelining, wgmma depth, subtiling, and more
Old but great paper talking about how any given thing becomes the focus of our attention, which has clear implications for getting a better handle on what it means for some person (or agent) to have taste
https://t.co/DTSrAlYLiQ
One must wonder what it’s like to be a Waymo, especially on those tight turns that have mirrors so drivers can see oncoming traffic. Does it know a car is oncoming? Or maybe it just thinks a tiny car is coming out of nowhere right at it
Wait so we’re all in accord that a large reason for pushing the Mythos-will-hack-the-world notion was to get the state department to make it harder (somehow?) for Chinese labs to distill frontier models and hamstring open models even more?
@nikitabier Articles feature is great. Would love a sort of “save my location” feature to be able to easily get back to where I was if I leave the article