@giffmana Yeah, agents really take away many headaches from Linux. Another awesome use case I recently discovered is letting them deal with tricky git issues like resolving rebase conflicts.
@giffmana I recently let Gemini guide me through setting up a Raspberry Pi as a Time Machine backup server, which was a total breeze! How much longer until we just give LLMs root and let them deal with all our sysadmin stuff?
Attending #ICCV2025? Come chat with us about our Minerva dataset that tests if models can truly reason about videos! 🕵️‍♂️
@ahmetius and @SachitMenon will be presenting the dataset at Poster Session 5 tomorrow (Thurs, Oct 23) morning. Find them at poster #391.
We're excited to release Minerva 🕵️‍♂️, a benchmark to evaluate if AI can truly reason about videos, from spotting game-changing moments in sports to understanding character motivations in short films 🍿. We provide the "why" behind the answers! Pointers below.
Our team is hiring! If you have experience in video understanding and/or generation, join us @GoogleDeepMind and help push the frontiers with Veo and Gemini!
We're hiring at @GoogleDeepMind! Looking for a talented Research Engineer to help build the future of video generation and understanding (Veo and Gemini).
Apply here: https://t.co/hYCj2jgvgw
Excited that our Minerva and Neptune datasets are both featured in the Gemini 2.5 tech report! Minerva is among the most challenging video benchmarks with a large gap between SotA (Gemini 2.5 Pro, 67.6%) and humans (92.5%).
https://t.co/mWROj5JXSz
The newly generally available Gemini 2.5 Flash and Pro are even better at video understanding than the versions we shared in the blog a month ago; see more details in the tech report.
Excited! VideoPrism-Base/Large are publicly available now: https://t.co/g5BNiA5O05
Check it out if you need a versatile video encoder for video-language or video-native tasks. Feedback appreciated!
Gemini 2.5 Pro sets the state of the art on our newly released Minerva video reasoning benchmark by scoring 63.5%.
Paper: https://t.co/nEWfr1SbqA
Dataset: https://t.co/JengJVEgH6
A lot of work went into making Gemini 2.5 SOTA at video understanding; check out this 🧵 for more details!
Looking back at where we were a year ago, the progress really feels phenomenal!
So many things to unlock and enable from video 🎥 and we are only getting started!
The newly released Gemini 2.5 Pro (Preview 05/06) sets the state of the art on Minerva with 63.5% accuracy. Human accuracy is 92.5%.
https://t.co/qrDY5qqp2P
Excited to share Long-Video Masked Autoencoder (LVMAE), which our team just published at @NeurIPSConf! We boost the context length of video models using an adaptive decoder and a dual-masking strategy, and achieve SotA on several video benchmarks.
Paper: https://t.co/XeBME5RvFX
Training video understanding models on longer contexts is computationally intensive. To address this, we present a novel approach that reduces the computational load while also improving the quality of the learned representations. More at: https://t.co/56Vj3kOzOl
A nice new benchmark for long video understanding by Tobias Weyand @0xtob and others. This is likely to be one of the new frontiers of capabilities for large-scale multimodal models, and it's great to have a new benchmark to assess models in this area.
Can #AI truly understand long videos? Tobias Weyand & the Google Research team are testing the limits w/ Neptune, an open-source benchmark for long video understanding. Dive into the details & see how AI tackles temporal reasoning, cause & effect, & more: https://t.co/jNkgEYkdFA
The other day I let my kids talk to Gemini live. Today my 3 year old asked my 6 year old: "Can you tell me a joke?" - 6 year old: "Sorry, I'm just a language model."