A much-needed step in world modeling. By generating 360° panos, world models will truly track everything happening around them, serving as a better form of memory.
Huge shoutout to @Dazitu_616 for spearheading this awesome work. It was amazing to work with you!
📢Introducing 360Anything, our method for lifting any perspective image or video to gravity-aligned 360° panoramas without using any camera or 3D information. This enables consistent novel view synthesis and 3D scene reconstruction.
Project page: https://t.co/qTOEip0Jw2
🧵
All results are unedited outputs from a single diffusion model. There are no NeRF or postprocessing steps involved. Please check out our website for many more samples! https://t.co/yojdRhgNEn
[[THREAD]] Happy to announce 4DiM, our diffusion model for novel view synthesis of scenes! 4DiM allows camera+time control with as few as one input image.
Joint work with @srbhsxn* @lala_yi_li* @taiyasaki @fleet_dj
*equal contribution
Excited to share the latest ✨Med-Gemini✨ additions - our new research unlocks possibilities in medical data analysis with 3 new models built upon Gemini 1.5 that can handle 2D medical images and, for the first time, genomic risk scores & 3D radiology scans.
https://t.co/QKz0NZqpDH
Excited to share our work on Video Interpolation with Diffusion Models.
https://t.co/Ckt7MtiTnf
VIDIM generates plausible short videos given a start and end frame.
Joint work with @watson_nn, @erictabellion, @holynski_, @poolio and @jannekontkanen
Very excited to share VIDIM, our diffusion model for video interpolation. Cheap and high quality video diffusion will become a reality. I have also believed for a while now that using pixels as inputs is severely underrated: there is so much more signal in there vs. text.
VIDIM: Video Frame Interpolation Using Diffusion Models. Watch out for our generative frame interpolation magic in CVPR 2024. With Siddhant Jain, Daniel Watson, Eric Tabellion, Aleksander Holynski, and Ben Poole https://t.co/3DUlBMdQ6Y
Fast sampling with 'Multistep Consistency Models': We get 1.6 FID on ImageNet64 in 4 steps and scale text-to-image models, generating 256x256 images with 16 steps.
Guess which row is distilled?
With @emiel_hoogeboom @TimSalimans
Arxiv: https://t.co/BH7HzIGsgI
# on shortification of "learning"
There are a lot of videos on YouTube/TikTok etc. that give the appearance of education, but if you look closely they are really just entertainment. This is very convenient for everyone involved: the people watching enjoy thinking they are learning (but actually they are just having fun). The people creating this content also enjoy it because fun has a much larger audience, fame and revenue. But as far as learning goes, this is a trap. This content is an epsilon away from watching the Bachelorette. It's like snacking on those "Garden Veggie Straws", which feel like you're eating healthy vegetables until you look at the ingredients.
Learning is not supposed to be fun. It doesn't have to be actively not fun either, but the primary feeling should be that of effort. It should look a lot less like that "10 minute full body" workout from your local digital media creator and a lot more like a serious session at the gym. You want the mental equivalent of sweating. It's not that the quickie doesn't do anything, it's just that it is wildly suboptimal if you actually care to learn.
I find it helpful to explicitly declare your intent up front as a sharp, binary variable in your mind. If you are consuming content: are you trying to be entertained or are you trying to learn? And if you are creating content: are you trying to entertain or are you trying to teach? You'll go down a different path in each case. Attempts to seek the stuff in between actually clamp to zero.
So for those who actually want to learn. Unless you are trying to learn something narrow and specific, close those tabs with quick blog posts. Close those tabs of "Learn XYZ in 10 minutes". Consider the opportunity cost of snacking and seek the meal - the textbooks, docs, papers, manuals, longform. Allocate a 4 hour window. Don't just read, take notes, re-read, re-phrase, process, manipulate, learn.
And for those actually trying to educate, please consider writing/recording longform, designed for someone to get "sweaty", especially in today's era of quantity over quality. Give someone a real workout. This is what I aspire to in my own educational work too. My audience will decrease. The ones that remain might not even like it. But at least we'll learn something.
Hiring Research Scientists within @GoogleDeepMind - Toronto to join our team & advance the next generation of medical AI, develop cutting-edge LLMs & multi-modal models to tackle real-world healthcare challenges. Please submit your interest through:
https://t.co/FJBs3h7Nvr
Words are carried off by the wind, Mr. President
In education: we keep failing the PISA tests
Transparency: no improvement since 2019 (Transparency International)
Poverty: rising since 2019 (World Bank)
Homicide: rising since 2019 (Attorney General's Office)
We have a student researcher opportunity in our team @GoogleDeepMind in Toronto 🍁
If you’re excited about research on diffusion models and generative video models, please fill out the form:
https://t.co/3svxJfm8nO
and apply here:
https://t.co/82FhJvhV4B
The Procurador General de la Administración has found SEVEN violations of the Political Constitution of the Republic of Panama in Law 406 (the mining contract) and masterfully closes his brief with this quote from the distinguished Carlos Iván Zúñiga: "there are supreme moments, moments of conscience, that must be faced... do you know what that moment is? It is the moment of DIGNITY; of self-esteem, of the principles that indicate that nothing exists above a state of conscience".