1.1M tokens/sec on just one rack of GB300 GPUs in our Azure fleet.
An industry record made possible by our longstanding co-innovation with NVIDIA and our expertise in running AI at production scale!
https://t.co/1qLvoS2z70
Falcon 9 lifts off from Florida, adding 21 @Starlink satellites to the constellation and completing our 400th overall mission with a flight-proven booster, approximately eight years after our first successful reflight.
Let me clear up a *huge* misunderstanding here.
The generation of mostly realistic-looking videos from prompts *does not* indicate that a system understands the physical world.
Generation is very different from causal prediction from a world model.
The space of plausible videos is very large, and a video generation system merely needs to produce *one* sample to succeed.
The space of plausible continuations of a real video is *much* smaller, and generating a representative chunk of those is a much harder task, particularly when conditioned on an action.
Furthermore, generating those continuations would be not only expensive but totally pointless.
It's much more desirable to generate *abstract representations* of those continuations that eliminate details in the scene that are irrelevant to any action we might want to take.
That is the whole point behind the JEPA (Joint Embedding Predictive Architecture), which is *not generative* and makes predictions in representation space.
Our work on VICReg, I-JEPA, V-JEPA, and the work of others show that Joint Embedding architectures produce much better representations of visual inputs than generative architectures that reconstruct pixels (such as Variational AE, Masked AE, Denoising AE, etc.).
When using the learned representations as inputs to a supervised head trained on downstream tasks (without fine-tuning the backbone), Joint Embedding beats generative.
See the results table from the V-JEPA blog post or paper:
https://t.co/mfLvtvk8jj
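The evaluation protocol mentioned above (a supervised head trained on top of learned representations, with the backbone left untouched) can be sketched roughly as follows. This is only a toy illustration under stated assumptions: `frozen_backbone`, `train_linear_probe`, the random projection standing in for a pretrained encoder, and the two-blob dataset are all hypothetical and are not taken from the V-JEPA code or paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def frozen_backbone(x, weights):
    """Hypothetical stand-in for a pretrained encoder; `weights` is never updated."""
    return np.tanh(x @ weights)

def train_linear_probe(feats, labels, n_classes, lr=0.5, steps=300):
    """Fit a softmax head on fixed features; the head is the only trainable part."""
    n, d = feats.shape
    head = np.zeros((d, n_classes))
    onehot = np.eye(n_classes)[labels]
    for _ in range(steps):
        logits = feats @ head
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        probs = np.exp(logits)
        probs /= probs.sum(axis=1, keepdims=True)
        # Gradient of the mean cross-entropy with respect to the head only.
        head -= lr * feats.T @ (probs - onehot) / n
    return head

# Toy data (assumption): two Gaussian blobs standing in for a downstream task.
x = np.concatenate([rng.normal(-1.0, 1.0, (200, 8)),
                    rng.normal(1.0, 1.0, (200, 8))])
y = np.concatenate([np.zeros(200, dtype=int), np.ones(200, dtype=int)])

backbone_weights = rng.normal(0.0, 0.5, (8, 16))
weights_before = backbone_weights.copy()

feats = frozen_backbone(x, backbone_weights)   # computed once, no gradients
head = train_linear_probe(feats, y, n_classes=2)
accuracy = float(((feats @ head).argmax(axis=1) == y).mean())
print(f"probe accuracy on toy data: {accuracy:.3f}")
```

Because only `head` is updated, the resulting accuracy reflects the quality of the representation itself rather than how well the backbone can be adapted to the task.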
A breathtaking celestial event is happening soon…
Mark your calendars for the total solar #eclipse crossing North America on April 8. Here are a few differences between the 2024 eclipse and the one in 2017. https://t.co/My3wfdNFIK
This month's full moon is also known as the Wolf Moon, the Ice Moon, or the Long Night Moon.
Will you be braving the cold to see our celestial neighbor tonight, or will you be enjoying the warm weather?
A sonification of two galaxies? Now, that’s a duet!
Scientists took data of this pair of interacting galaxies captured by @NASAHubble and converted it into sound, giving us a new way to observe #CosmicCollisions. Don’t miss a beat: https://t.co/dnXcRBUSEN