Fun test of Gemini Omni's video editing and location knowledge ✨
I uploaded a video riding in a Waymo in Menlo Park.
Then I asked Omni to re-shoot the video in different locations based on screenshots from Google Maps.
It's seamless, as you can see from the transitions 👇
Gemini Omni is a major leap in world understanding & multimodal editing! It can take photos, video & audio and build entirely new scenes. Over time it’ll be able to handle any input & any output - starting w/ video
You can even give it your own videos & iterate on your ideas:
It's definitely informative / humbling to reach a plateau, but even more informative when you learn how to break through a rank you've been stuck at for months or even years. The most important and transferable skills are learned at these points.
it was an interesting and spiritually informative experience being young and playing dota or starcraft or what have you and realizing you actually hit your elo ceiling and haven’t been improving much
Agreed. Mass surveillance violates the Fourth Amendment and has a chilling effect on freedom of expression. Surveillance systems are prone to misuse for political or discriminatory purposes.
Thanks Jeff for standing up for what's right. We need to call out and denounce the absolutely sickening execution we witnessed today perpetrated by federal agents.
This is absolutely shameful. Agents of a federal agency unnecessarily escalating, and then executing a defenseless citizen whose offense appears to be using his cell phone camera. Every person regardless of political affiliation should be denouncing this.
Today, Veo 3.1 Ingredients to Video is getting more expressive + portrait mode has arrived (we heard you!)
You can now create vertical clips with livelier movement and better control directly in the @Geminiapp, @YouTube & Google Vids.
All this plus state-of-the-art upscaling to 1080p and 4K also coming to @flowbygoogle, the Gemini API and Vertex AI.
@davidpantera_ This is definitely feasible in the future! Spending time to simulate a few ways of doing a task in a world model to determine the best next step in non-time-sensitive tasks will be very valuable.
Veo is not just a video generation model, it's a zero-shot reasoner (https://t.co/i0noHrUFYM) and we've now shown it's also a capable world model.
We evaluated Gemini robotics policies in an action-conditioned Veo model and found that the results directly correlate to real-world performance. We can test rollouts entirely in simulation and expect that the most successful policies will also transfer to reality. Crazy.
Generalist robots need a generalist evaluator. But how do you test safety without breaking things? 💥
🌎 Introducing our new work from @GoogleDeepMind:
Evaluating Gemini Robotics Policies in a Veo World Simulator
https://t.co/ZjvpYXFddZ
🧵👇
Testing out Veo 3.1 Reference, you can upload any of your characters and prompt them into any scene. Just like that. And yes, I am aware this is nothing new. But it's a cool feature to have, and we need more of this. 2025 is the year we solved consistency. I'm all for it.
Symbols, space, and time can represent most of the "information". In this eval paper, we show how video models are generalist "space-time reasoners". It's like "let's think step by step" in LLMs in 2022. Veo3 is like GPT-3 in 2020, and can't wait for its thinking/RL moment.
Veo mentioned 👀
Excited to have integrated Veo into Gemini Robotics 1.5 and looking forward to seeing how world models can make robots more intelligent and useful in the real world!
Thanks @SeanKirmani @AllanZhou17 for making this collaboration happen.
We’re making robots more capable than ever in the physical world. 🤖
Gemini Robotics 1.5 is a levelled up agentic system that can reason better, plan ahead, use digital tools such as @Google Search, interact with humans and much more. Here’s how it works 🧵