Frank (@frankly1092) and I had such a fun time preparing and competing in this.
Special thanks to @GoogleDeepMind for their mentorship and @databricks for organizing!
Stanford University is the 2026 Databricks Grounded Reasoning Cup champion! 🏆
The inaugural Grounded Reasoning Cup at #DataAISummit 2026 is in the books, with Stanford taking 1st, UMass Amherst earning 2nd, and Yale University securing 3rd after six high-intensity rounds of live AI agent competition over 200+ years of U.S. Treasury data.
Teams mentored by @AnthropicAI, @OpenAI, and @GoogleDeepMind built agents that tackled messy historical PDFs, multimodal chart interpretation, and web-augmented reasoning on real government data.
Thank you to @USAFacts, the U.S. Treasury, our student teams from 11 universities, and everyone who joined live.
Thrilled to launch Project Genie, an experimental prototype of the world's most advanced world model. Create entire playable worlds to explore in real time just from a simple text prompt - kind of mind-blowing really! Available to Ultra subs in the US for now - have fun exploring!
We’ve pushed out the Pareto frontier of efficiency vs. intelligence again.
With Gemini 3 Flash ⚡️, we are seeing reasoning capabilities previously reserved for our largest models, now running at Flash-level latency. This opens up entirely new categories of near real-time applications that require complex thought.
It’s available in the API, and rolling out today as the default model in AI Mode in Search and Gemini app globally.
Read more on the blog at: https://t.co/Uw9bmlJvhI
More in thread ⬇️
Gemini 3 models from @Google @GoogleDeepMind have made a significant 2X SOTA jump on ARC-AGI-2 (Semi-Private Eval)
Gemini 3 Pro:
31.11%, $0.81/task
Gemini 3 Deep Think (Preview):
45.14%, $77.16/task
We’ve been intensely cooking Gemini 3 for a while now, and we’re so excited and proud to share the results with you all. Of course it tops the leaderboards, including @arena, HLE, GPQA etc, but beyond the benchmarks it’s been by far my favourite model to use for its style and depth, and what it can do to help with everyday tasks.
This is Gemini 3: our most intelligent model that helps you learn, build and plan anything.
It comes with state-of-the-art reasoning capabilities, world-leading multimodal understanding, and enables new agentic coding experiences. 🧵
C2S is now open for everyone.
The biological LLM that learns the language of cells. Free for academic and commercial use.
https://t.co/I2OYXmQ0x3
Join the growing community building with C2S. 🌱
An exciting milestone for AI in science: Our C2S-Scale 27B foundation model, built with @Yale and based on Gemma, generated a novel hypothesis about cancer cellular behavior, which scientists experimentally validated in living cells.
With more preclinical and clinical tests, this discovery may reveal a promising new pathway for developing therapies to fight cancer.
🚨 Thrilled to announce our paper “Non-Markovian Discrete Diffusion with Causal Language Models” was accepted at #NeurIPS2025! 🎉 @YaleCSDept @YaleMed @yaledatascience
We introduce CaDDi, a new framework that unifies discrete diffusion and causal LMs. A quick explainer 🧵👇
Excited to share #AlphaGenome, the start of our journey to decipher the regulatory genome! The model matches or exceeds top-performing external models on 24 out of 26 variant evaluations, across a wide range of biological modalities. 1/6
If you studied algorithms, I'm sure you've heard of Dijkstra’s algorithm for finding the shortest paths between nodes in a weighted graph. Super useful in scenarios such as road networks, where it can determine the shortest route from a starting point to various destinations. It's been the best-known approach since 1956!
Until now.
The O(E + V log V) complexity just went down to O(E log^(2/3) V) for sparse graphs.
It would be amazing if this kind of breakthrough came through AI that can code, but I guess we're not there yet...
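For reference, here is a minimal sketch of the classic algorithm mentioned above, using Python's built-in binary heap. It makes some assumptions: the `roads` graph and its weights are made up for illustration, and a binary heap gives roughly O((E + V) log V) rather than the O(E + V log V) bound quoted above, which needs a Fibonacci heap. It is not the new O(E log^(2/3) V) algorithm.

```python
import heapq


def dijkstra(graph, source):
    """Return the shortest distance from source to every reachable node.

    graph maps each node to a list of (neighbor, weight) pairs with
    non-negative weights (hypothetical input format for this sketch).
    """
    dist = {source: 0}
    heap = [(0, source)]  # (distance so far, node)
    while heap:
        d, u = heapq.heappop(heap)
        # Skip stale heap entries left behind by earlier improvements.
        if d > dist.get(u, float("inf")):
            continue
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


# Toy road network, invented for illustration only.
roads = {
    "A": [("B", 4), ("C", 1)],
    "B": [("D", 3)],
    "C": [("B", 2), ("D", 7)],
    "D": [],
}

print(dijkstra(roads, "A"))
```

On this toy graph the route A to C to B to D (total 6) beats both A to B to D (7) and A to C to D (8).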
Introducing AlphaEvolve: a Gemini-powered coding agent for algorithm discovery.
It’s able to:
🔘 Design faster matrix multiplication algorithms
🔘 Find new solutions to open math problems
🔘 Make data centers, chip design and AI training more efficient across @Google. 🧵
Theta (@trytheta) allows AI agents to learn from their mistakes in real-time. Their memory layer has already improved the accuracy of OpenAI Operator by 43% with 7x fewer steps taken.
https://t.co/9uI9vbSYLs
Congrats on the launch, @RayanGarg, @tsha444, and @_gurvir_!
We will be presenting Intelligence at the Edge of Chaos at #ICLR2025. Come visit our poster!
🖼️ Poster: https://t.co/R95o1WEj6r
📜 Paper: https://t.co/nnLMDkzwtx
We got a robot to clean up homes that were never seen in its training data! Our new model, π-0.5, aims to tackle open-world generalization.
We took our robot into homes that were not in the training data and asked it to clean kitchens and bedrooms. More below⤵️