We built an AI model to simulate how a fruit fly walks, flies and behaves – in partnership with @HHMIJanelia. 🪰
Our computerized insect replicates realistic motion, and can even use its eyes to control its actions.
Here’s how we developed it – and what it means for science. 🧵
https://t.co/kWZUwWNOrz
Episode 11 of #umAnI 🤖🚀
We talk with Marc Mezard about the deep connection between physics and AI and how each improves the other.
How does a neural network such as a large language model "think"?
Can AI accelerate scientific discoveries?
This and much more :)
Episode #10 of the #umani podcast
We talk with Vera Gheno about:
- sociolinguistics,
- imagining futures with words,
- AI, language models, and their impact on theories of language,
- lots of other interesting things :)
Enjoy!
https://t.co/YDQMdNbkQC
The last paper of my PhD is finally out! Introducing
"Intuitive physics understanding emerges from self-supervised pretraining on natural videos"
We show that without any prior, V-JEPA -- a self-supervised video model -- develops an understanding of intuitive physics!
If you're interested in AI4Science, a research grant is available to work with me and @matteobriganti2 on a cool interdisciplinary project: "AI for the Design of Open-Shell Nanographenes for Quantum Computing Applications". Reach out if you're interested!
https://t.co/V8wjuCcuYN
Episode #9 of the #umani podcast
We talk with Giovanni Covone about:
- extrasolar planets,
- astrobiology,
- the search for life in space,
- the stories behind fundamental discoveries.
Enjoy!
https://t.co/sxtdLDbtbX
"Move 37" is the word-of-day - it's when an AI, trained via the trial-and-error process of reinforcement learning, discovers actions that are new, surprising, and secretly brilliant even to expert humans. It is a magical, just slightly unnerving, emergent phenomenon only achievable by large-scale reinforcement learning. You can't get there by expert imitation. It's when AlphaGo played move 37 in Game 2 against Lee Sedol, a weird move that was estimated to only have 1 in 10,000 chance to be played by a human, but one that was creative and brilliant in retrospect, leading to a win in that game.
We've seen Move 37 in a closed, game-like environment like Go, but with the latest crop of "thinking" LLM models (e.g. OpenAI-o1, DeepSeek-R1, Gemini 2.0 Flash Thinking), we are seeing the first very early glimmers of things like it in open world domains. The models discover, in the process of trying to solve many diverse math/code/etc. problems, strategies that resemble the internal monologue of humans, which are very hard (/impossible) to directly program into the models. I call these "cognitive strategies" - things like approaching a problem from different angles, trying out different ideas, finding analogies, backtracking, re-examining, etc. Weird as it sounds, it's plausible that LLMs can discover better ways of thinking, of solving problems, of connecting ideas across disciplines, and do so in a way we will find surprising, puzzling, but creative and brilliant in retrospect. It could get plenty weirder too - it's plausible (even likely, if it's done well) that the optimization invents its own language that is inscrutable to us, but that is more efficient or effective at problem solving. The weirdness of reinforcement learning is in principle unbounded.
I don't think we've seen equivalents of Move 37 yet. I don't know what it will look like. I think we're still quite early and that there is a lot of work ahead, both engineering and research. But the technology feels on track to find them.
https://t.co/JCxTdKpuzv
Episode #8 of the #umani podcast
We talk with Edwige Pezzulli about:
- the relationship between science and society,
- gender issues,
- the vision of a "plural" science,
- the importance of science communication understood as redistribution of the scientific common good.
https://t.co/zOAkGg0eV1
It’s been an amazing last couple of weeks, hope you enjoyed our end of year extravaganza as much as we did!
Just some of the things we shipped: state-of-the-art image, video, and interactive world models (Imagen 3, Veo 2 & Genie 2); Gemini 2.0 Flash (a highly performant and efficient foundation model); Gemini-Exp-1206 model (top of the Chatbot Arena leaderboard); 2.0 Flash Thinking (our first ‘thinking’ model, expect a lot more news on this soon - as many of you remember, we pioneered this type of model with AlphaGo, AlphaZero, AlphaProof…); upgrades to @GeminiApp with Deep Research and more, fantastic new NotebookLM features, a new image remixing tool (Whisk); and a series of agentic research prototypes that can help people get things done (Project Astra, Mariner, Jules) all built on Gemini 2.0.
And then on top of all of that, we dropped the world’s most accurate weather prediction model (GenCast), the world’s most advanced quantum chip that performed a computation that would have taken 10 septillion years (Willow - could come in handy for training AI one day!), and it was the honour of a lifetime to receive the Nobel Prize for AlphaFold, which has revolutionised structural biology and is being used by 2 million researchers around the world to understand disease and accelerate drug discovery.
I could not be more proud of all the exceptionally talented teams at @GoogleDeepMind & @Google who have worked so incredibly hard on all these amazing projects. It’s the greatest joy one can have professionally to get to explore the outer reaches of science and human knowledge with such wonderful colleagues.
We’ve been inventing the future of AI for well over a decade now, and we’ll continue to be a relentless engine of innovation. In many ways we’re only just getting started… if you want to be at the frontier of the most exciting scientific and technological journey ever, this is the place to be, come join us!
@pwk@OpenAI No, it doesn't give it a different flavor. That's how it works. The way ARC is designed, there is no information leak between train and test. See also Chollet's reply to the quoted post.
That said, I agree that "shouting AGI" is an exaggeration, but maybe we're close..
For those who didn't get it -- AlphaGo was an MCTS search process that made thousands of calls to two separate convnets in order to compute a single game move.
Something like o1 pro is also, best we can tell, a search process making thousands of calls to multiple LLMs to output a single answer.
Hence the 1:1 analogy here.
Totally agree! Human-written text naturally limits us to a human-driven condensed representation of the real world. Pre-training on videos is the way to go: 'next-frame prediction' :)
Brilliant talk by @ilyasut, but he's wrong on one point.
We are NOT running out of data. We are running out of human-written text.
We have more videos than we know what to do with. We just haven't solved pre-training in vision.
Just go out and sense the world. Data is easy.
I’m pleased to announce our work which studies complexity phase transitions in neural networks! We track the Kolmogorov complexity of networks as they “grok”, and find a characteristic rise and fall of complexity, corresponding to memorization followed by generalization.
🧵