Exciting news: Gemini Omni Flash is now #1 in the Video Arena (both Text-to-Video and Image-to-Video)!
For Text-to-Video this is a massive +158 pt improvement over Veo 3.1 (1080p) and a large +61 pt lead over the next best model, Seedance 2.0.
Congrats @GoogleDeepMind for this huge milestone!
I will be in Denver for CVPR from June 3rd to 6th! ποΈ
I'm looking forward to connecting and discussing anything related to world models and verifiable rewards.
Or just even stress for graduation and job hunting, unhappiness of paper rejection, are all welcome.
I think people don't realize why Gemini Omni is different than other video AIs. It is fully multimodal, so it can edit video natively, too
I took the famous "train " movie from 1896 & made it a bullet train, LEGO, added a time traveler, a centipede, muppets... (see reflections?)
Congratulations to Dr. Yuyang Hu for successfully defending his PhD at WashU ESE. We are very impressed with all the research you have done over the past few years and wishing you even greater achievements in the years to come! π
Videoβs here to stay - introducing Veo 3.1 Lite, our most cost efficient video generation model to date, and on April 7th we are also reducing the price for Veo 3.1 Fast : )
Ten years ago, AlphaGoβs legendary match in Seoul heralded the start of the modern era in AI. Its famous βMove 37β signaled to us that AI techniques were ready to tackle real-world problems in areas like science - and ideas inspired by these methods are critical to building AGI
Everyone knows pretraining is powerful, but its potential in IR tasks beyond SFT or RL remains largely unexplored.
In this work, we show that reusing pretraining (e.g., Flux) is easier than you think!
π§± LEGO doesn't need pair-wise data and converts pretrained knowledge in an unsupervised (pseudo-supervised) way.
Huge congrats to our intern @YuyangHu_666 for this great work! This is my final paper of 2025. Can't wait to show you what we have for an even more exciting 2026! π
LEGO: our post-training framework adapts diffusion models to unseen domains without paired ground truth. Using large-scale image generation models as reference we synthesize "oracle" training pairs
LEGO creates sharp high quality results on real images where others fail
1/5
Honored to be named an IEEE Fellow for contributions to image processing, computer vision & biometrics. Also grateful to be an AAAI Senior Member and a 2025 Clarivate Highly Cited Researcher. Huge thanks to my students, mentors & collaborators! @jhuclsp@HopkinsEngineer
Hello from #NeurIPS2025 π
We're hosting a range of sessions at the @Google booth, including a Q&A with @JeffDean and the Gemini team, plus demos like SIMA 2 β our AI agent for 3D virtual worlds.
See the full schedule β https://t.co/B1TQ335HoS