Google I/O TLDR: Good improvements across the whole AI stack.
3.5 Flash (the model): More persistent than before and codes well. No wall yet.
Antigravity (the harness): Reliably runs for hours now. Early signs of hands-free self-improvement.
Spark (the interface): Finally connects a decent model and harness to your email, calendar and workspace. Instead of just answering questions it can actually do work for you. Skills and schedules and all the other claw goodness.
Omni (the future): Closes the gaps between Gemini for text and visual/audio generation variants. This is the way.
TPU8i (the hardware): Better chips to make all the above go faster.
1/ Today at #GoogleIO, we’re releasing Gemini 3.5, our latest family of models combining frontier intelligence with action.
We’re starting by releasing 3.5 Flash, which is built to help you execute complex, long-horizon agentic workflows.
Gemini 3.5 Flash is our strongest model for coding and agents. It outscores 3.1 Pro on agentic and coding benchmarks like Terminal-Bench and MCP Atlas, while running 4x faster than other frontier models.
Used in Google Antigravity, 3.5 Flash is even further optimized to be up to 12x faster. It’s a powerful engine to deploy sub-agents that collaborate, run high-frequency iterative loops, and solve real-world problems at scale.
Some highlights we’re excited about 🔽
Continual learning is bottlenecked by realistic evaluations
Introducing FutureSim, which replays real-world events in the temporal order they occurred
We benchmark frontier agents at updating predictions about how our world evolves, in native harnesses like Codex and Claude Code
@_arohan_ Somewhat related, I liked the design of seqax. Have sent it to folks who found it useful for learning LLM parallelism https://t.co/tRxkMdrfl1
Excited to launch Gemini 3.1 Pro! Major improvements across the board including in core reasoning and problem solving. For example scoring 77.1% on the ARC-AGI-2 benchmark - more than 2x the performance of 3 Pro. Rolling out today in @GeminiApp, @antigravity and more - enjoy!
I wrote about Clawdbot/Moltbot/OpenClaw and Moltbook, the fascinating, weird and sometimes even useful social network for digital assistants to swap tips and gossip with each other https://t.co/1udoHpxs5g
Apply to do research with me on emergence of agency/planning in LLMs, out-of-context reasoning, understanding generalization from data, or propose your own direction!
Very excited to be mentoring this spring 💫