Yes.
It’s not that we’ve discovered some magic bullet, but rather that JAX, or at least the open source version of it, is mostly optimized for small to medium-sized training runs on Google TPUs, whereas we need to massive training runs on Nvidia GPUs.
Pipeline parallelism is essential and crushes fully-sharded data parallelism at scale.
And C will compile to the most efficient binary short of assembly. Maybe we will do a little assembly too.
We can see 46 billion light years in every direction.
The universe is 13.8 billion years old.
Light has not had enough time to travel that far.
Something out there is moving away from us faster than light itself, and physics says that’s allowed.
I still don’t fully understand why.
@Kekius_Sage Isn’t this confusing time and space? 46 b.l.y. Is a distance. 13.8 b.y. is time. It’s consistent with expansion of the universe and the cosmological constant.