Meet GAIA-3, Wayve's most advanced generative world model yet.
Scaling the evaluation of autonomous driving systems is one of the toughest challenges in our industry. Real-world testing matters, but it is slow, costly, and rarely captures the safety-critical events that matter most. And traditional simulation has not delivered the realism or diversity needed.
GAIA 3 changes this. It reconstructs real environments with rich fidelity and generates realistic counterfactuals that unlock safe, repeatable, scalable evaluation for modern end-to-end driving systems.
See how GAIA 3 advances Wayve toward global autonomy: https://t.co/pIk8xG1ENe
#GAIA3 #EmbodiedAI #AISafety #GenerativeAI #AutonomousVehicles
So, in my experiments, RGB ViT beats latent ViT every time. I've tried stronger/weaker augs, bigger/smaller model sizes and image resolutions. Across my experiments, the RGB versions have ~3-5% higher accuracy than the latent ones. You also need to run "encode" each time if using augs.
Do you think a ViT trained from scratch on ImageNet would have higher accuracy in RGB or in SD's VAE latent space? In theory, the latent space should give some prior, or at least work as a smart downsampling. I'm really interested in what people think. I'll share results tomorrow!
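To put rough numbers on the "smart downsampling" idea, here is a small sketch. It assumes the usual SD 1.x VAE layout (8x spatial reduction, 4 latent channels) and a 224x224 ImageNet crop. The patch sizes (16 for RGB, 2 for latents) are my own illustrative choice so that both models see the same number of tokens, not necessarily what was used in the experiments.

```python
def vae_latent_shape(height, width, factor=8, latent_channels=4):
    """Shape (C, H, W) of the latent for an RGB image of the given size.

    factor and latent_channels default to the SD 1.x VAE layout (assumption).
    """
    return (latent_channels, height // factor, width // factor)


def vit_token_count(height, width, patch):
    """Number of non-overlapping patch tokens a ViT sees."""
    return (height // patch) * (width // patch)


def compression_ratio(height, width, factor=8, latent_channels=4):
    """How many RGB values are squeezed into one latent value."""
    c, h, w = vae_latent_shape(height, width, factor, latent_channels)
    return (3 * height * width) / (c * h * w)


if __name__ == "__main__":
    print(vae_latent_shape(224, 224))        # (4, 28, 28)
    print(vit_token_count(224, 224, 16))     # 196 tokens for RGB, patch 16
    print(vit_token_count(28, 28, 2))        # 196 tokens for latents, patch 2
    print(compression_ratio(224, 224))       # 48.0
```

So with these assumed settings the sequence length is identical, but each latent token carries 16 numbers (2x2x4) instead of 768 (16x16x3).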
@PDillis At the same time, the encoder has seen many more images than just 1M. If you encode/decode the images, they will look nearly the same after this compression.
Or it may act as a regularizer, preventing overfitting which ViTs are prone to
Accurate generation of the next frame from several previous ones makes it possible to predict the future or play Fortnite inside a neural network. The first baseline is simply to use pix2pixHD to predict the next frame from a few previous ones. High-quality generation is still far away.
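A rough sketch of how such a baseline could be set up, assuming the model is conditioned on a fixed window of previous frames. The function names, the window size and the toy predictor are all hypothetical. In a real setup `predict` would be the trained image-to-image network and the frames would be image tensors.

```python
def make_training_pairs(frames, context=3):
    """Slide a window over a clip: (previous `context` frames, next frame)."""
    return [
        (frames[i:i + context], frames[i + context])
        for i in range(len(frames) - context)
    ]


def rollout(predict, seed_frames, steps):
    """Autoregressive generation: feed each prediction back in as input."""
    context = len(seed_frames)
    history = list(seed_frames)
    generated = []
    for _ in range(steps):
        next_frame = predict(history[-context:])
        generated.append(next_frame)
        history.append(next_frame)
    return generated


def toy_predict(window):
    """Stand-in for the network: linear extrapolation on scalar 'frames'."""
    return window[-1] + (window[-1] - window[-2])


if __name__ == "__main__":
    clip = [0, 1, 2, 3, 4]                      # scalars instead of images
    print(make_training_pairs(clip, context=3))
    print(rollout(toy_predict, [0, 1, 2], steps=3))
```

Because every generated frame becomes input for the next step, small errors accumulate during rollout, which is one reason quality degrades quickly with this kind of baseline.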
Hey there! I'm Nikita Drobyshev, a dedicated Generative AI researcher with a Global Talent UK visa. I've been in London for a year, and today I'm opening up about my pursuit of knowledge.
Imagine dreaming of a UK PhD, full of ambition and credentials, only to hit a wall of unexplained refusals. Here's my reality: I've spent a year battling for my study permit (the UK's "ATAS certificate") for my PhD, facing repeated refusals without any reason. I just got my second refusal, and I'm clueless as to why.
Guess what? I'm not alone. Many peers are in the same boat. After 3-12 months of waiting, they get "sorry" letters without an explanation. It's like solving a puzzle with missing pieces.
What's really bugging me? The lack of transparency. Imagine driving without clear road signs.
Here's one more fact: I'm Russian. Not sure if that matters, but Russian pals have faced permit issues since the war began. If that's not a coincidence, it's unfair: education should be equal, no matter your origin. ⚪️🔵⚪️
Let's set things straight: I've never been part of any big political stuff in Russia. I'm against the war in Ukraine started by the Kremlin. I left my country last year because of my beliefs.
Let's discuss, demand clearer rules, and ensure education without barriers. Together, we can break these obstacles and prove diverse minds shape a better world.
#EducationMatters #breakingbarriers #UnityInDiversity #academia #academy #PhD #atas #Guardian #theguardian #thesun #thetimes #theindependent #ai #aicommunity #uk
@relnox @Norod78 Wow, looks cool. But if you already have "3D conditioning" in the form of depth, you probably wouldn't benefit from PanoramaPipeline. I mean, real 3D depth should already produce results without stitching artifacts.
A few months ago, I noticed stitching artifacts while using @BlockadeLabs. I implemented a fix that seamlessly transitions from the rightmost part to the leftmost part. Although I forgot about it for a while, it's now incorporated into the @huggingface https://t.co/0UPbjZJmZU
The original generation, without using circular_padding=True, resulted in a stitching artifact where the left and right parts didn't match seamlessly.
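A minimal sketch of the wrap-around idea, not the pipeline's actual code. MultiDiffusion-style generation denoises overlapping windows along the width of a wide latent. If windows are allowed to wrap past the right edge back to column 0, the seam gets covered by a window like any other region. The helper names below are hypothetical.

```python
def window_columns(width, window, stride, circular=False):
    """Column indices covered by each sliding window over a latent of `width`.

    With circular=True, windows may start anywhere and wrap around modulo
    `width`, so some of them span the right-to-left seam.
    """
    if circular:
        starts = range(0, width, stride)
    else:
        starts = range(0, width - window + 1, stride)
    return [[(s + k) % width for k in range(window)] for s in starts]


def seam_covered(windows, width):
    """True if some window holds the last and first columns side by side."""
    return any(
        w[i] == width - 1 and w[i + 1] == 0
        for w in windows
        for i in range(len(w) - 1)
    )


if __name__ == "__main__":
    plain = window_columns(8, window=4, stride=2)
    wrapped = window_columns(8, window=4, stride=2, circular=True)
    print(plain)                     # no window crosses the seam
    print(wrapped[-1])               # [6, 7, 0, 1] wraps around
    print(seam_covered(plain, 8), seam_covered(wrapped, 8))
```

Without wrapping, the first and last columns are never denoised together, so nothing forces the left and right edges to agree.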
It's just a small tweak for MultiDiffusion. For proper panoramas I'd use @BlockadeLabs (which seems to have resolved the artifact).