TOP AI Network is a decentralized AI Model platform and also the infrastructure for AI Agents, utilizing sharding technology and a three-layer network.
Grok in PowerPoint is now available.
Go from prompt to PowerPoint with real-time data, connections to your apps and MCPs, and diagram or image generation directly in your slides.
https://t.co/KX6Ejz3HGc
Introducing GLM-5.2: Frontier Intelligence, Open Weights
- Significant improvements in coding and agentic tasks
- Strong long-horizon capabilities with a 1M context window
- Two levels of reasoning effort: GLM-5.2 (max) pushes the limits, while GLM-5.2 (high) strikes a strong balance between performance and token efficiency
- MIT-licensed open weights
- Same API pricing as GLM-5.1
Tech Blog: https://t.co/LAsxUdN0JZ
Weights: https://t.co/g0A1C4UWx4
API: https://t.co/Kc3E22cbN7
Coding Plan: https://t.co/Nk8Y98HNhU
Chat: https://t.co/WCqWT0qCQb
Announcing Artificial Analysis Intelligence Index v4.1: a shift toward agentic workloads, featuring upgraded benchmarks and new per-task metrics
The Artificial Analysis Intelligence Index is our synthesis metric for assessing model intelligence and tracking AI progress. v4.1 marks a broader shift toward agentic workloads, with three main changes:
Updated and reweighted evaluations toward agentic tasks:
1. We upgraded three evaluations, removed one, and reweighted the Intelligence Index:
➤ Upgraded Terminal-Bench Hard to Terminal-Bench 2.1 and τ²-Bench Telecom to τ³-Bench Banking. Both move to newer, more robust task sets with harder, more realistic agentic scenarios that better separate frontier models
➤ Upgraded GDPval-AA to GDPval-AA v2. The upgrade re-baselines Elo to human performance at 1000, introduces a rotating panel of frontier-model judges, and raises the turn limit from 100 to 250 for longer-horizon agent trajectories
➤ Removed IFBench due to saturation. The benchmark no longer distinguishes frontier models sufficiently, so we have removed it from the Intelligence Index. We will continue to run it and publish results on new model releases
2. Cost per Task, Time per Task, and Tokens per Task:
Three new per-task metrics, reported for every model and based on the Intelligence Index. We take the total cost, total time, and total output tokens for a model to run the Intelligence Index and divide by the number of tasks across its evaluations, giving the average cost, time, and output tokens to complete a single Intelligence Index task
3. Cached input token reporting:
We now report cached input tokens and their impact on cost, including the cost to run the Intelligence Index, to better reflect the real cost of running each model
Key Results:
➤ Leading models: Claude Fable 5 (with Opus 4.8 fallback, 60) leads the Artificial Analysis Intelligence Index v4.1 by four points but is currently unavailable, leaving Claude Opus 4.8 (max, 56) as the most intelligent available model, ahead of GPT-5.5 (xhigh, 55) ➤ Open weights leading models: Among open weights models, DeepSeek V4 Pro (max, 44) and MiniMax M3 (44) lead, followed by Kimi K2.6 (43) and MiMo-V2.5-Pro (42)
➤Cost per Task: Claude Opus 4.8 (max) is the most expensive available model at $1.78 per task, with Claude Fable 5 the highest overall at $3.25. GPT-5.5 (xhigh) scores within a point of Opus 4.8 on the Intelligence Index at $0.99 per task. DeepSeek V4 Pro (max) stands out on the Intelligence vs Cost per Task chart at $0.04 per task, with other leading proprietary models costing 20x to 45x more
➤Time per Task: time per task (inference decode time) ranges from 1.5 minutes for Grok 4.3 (high) to 13.5 for Claude Sonnet 4.6 (max), a roughly 9x spread. Claude Opus 4.8 (max) completes a task in 6.4 minutes and GPT-5.5 (xhigh) in 3.7, while Gemini 3.1 Pro Preview stands out on the Intelligence vs Time per Task chart at 1.6 minutes for a score of 46
New from Code with Claude Tokyo: scheduled deployments and environment variables in vaults are in public beta in Claude Managed Agents, and dynamic workflows in Claude Code are generally available.
Agents now run on a schedule, use your tools securely, and take on bigger jobs.
For over 20 years, we've dedicated ourselves to removing language barriers so people can learn, speak and connect more deeply than ever before.
Today, we’re taking our next step with the release of Gemini 3.5 Live Translate — our latest audio model for live, speech-to-speech translation across 70+ languages. 🧵
Introducing Claude Fable 5: a Mythos-class model that we’ve made safe for general use.
Its capabilities exceed those of any model we’ve ever made generally available.
Over the past two weeks, TOP AI Network continued building its AI Agent Platform, delivered multiple Agents for enterprise teams, and explored new productivity paths with #MCP and skills.
Find more:
https://t.co/NXPxHCt8Qu
Introducing a more powerful NotebookLM 🚀
Massive upgrades deliver agentic capabilities in chat, more advanced reasoning, and a suite of new output formats. Tackling complex, multi-step research problems has never been easier.
Rolling out now to Google AI Ultra subscribers.
Anton Osika (@antonosika) is the co-founder and CEO of @lovable, where anyone can build software through conversation.
His working thesis: the most underrated moat in AI is trust, and earning it takes craft, care, and obsession.
We’re making Codex more useful for your work by expanding plugins beyond individual tools.
These plugins turn Codex into a specialist for a specific role with a single install, no coding required.
Codex can access 62 popular apps and 110 skills for work across sales, data analytics, creative production, product design, and public equity investing.
https://t.co/nunrYP2uMI
Building apps has never been easier.
With Sites, Codex can turn your work, ideas, and plans into an interactive website or app your team can explore, use, and share with a URL.
Rolling out to Business and Enterprise plans, before expanding more broadly.
Windows users, this one’s for you.
Computer use now works on Windows, so Codex can take action on your Windows computer.
And with Windows support for Codex in the ChatGPT mobile app, you can start, review, and steer tasks on the go while work continues on your Windows machine.
An early experience, but we’re working on more ways to keep your work moving, wherever you are.
Introducing Claude Opus 4.8: it builds on Opus 4.7 with sharper judgment, more honesty about its own progress, and the ability to work independently for longer than its predecessors.
Available today at the same price.
Private MCP servers 🤝 OpenAI products
Your team can keep MCP servers inside your network while ChatGPT, Codex, and the Responses API connect through outbound-only HTTPS.
🔗 https://t.co/UVq0KpT0km
The team continues focusing on the Agent platform, enabling more workflows through AI #Agents to improve automation and productivity.
We're also continuing enterprise support, platform optimization, and multi-Agent collaboration development.
Find more👇
https://t.co/NtzyG8tERe
We’re dropping Gemini Omni: our first step towards a model that can create anything from anything - starting with video.
It combines Gemini’s intelligence with our generative media systems - representing a leap forward in world understanding, multimodality, and editing 🧵
Google’s new Gemini 3.5 Flash is the clear leader on the Intelligence vs Speed Pareto frontier and makes large gains on GDPval-AA (real-world agentic tasks), but is 5x the cost of Gemini 3 Flash
@GoogleDeepMind gave us pre-release access to Gemini 3.5 Flash, the latest model in its Flash family, which has traditionally has offered faster, lower-cost alternatives to Gemini Pro models. Gemini 3.5 Flash scores 55 on the Artificial Analysis Intelligence Index, up 9 points from Gemini 3 Flash, driven primarily by agentic performance gains and hallucination reduction. It achieves speeds of over 280 output tokens/s, but higher token usage and token pricing make it over 5x more costly to run the Intelligence Index than Gemini 3 Flash, and 75% more costly than Gemini 3.1 Pro. Gemini 3.5 Flash is $1.50/1M input and $9/1M output tokens, Gemini 3 Flash was $0.5/$3 per 1M input/output tokens, a 3x increase. The rest of the increase was driven by higher token usage when running our benchmarks
Key results for Gemini 3.5 Flash with ‘high’ thinking level:
➤ 9 point Intelligence Index improvement: Gemini 3.5 Flash scores 55 on the Artificial Analysis Intelligence Index, up 9 points from Gemini 3 Flash. This places it ahead of Grok 4.3 (high, 53) and Claude Sonnet 4.6 (max, 52). The model improves across nearly all evaluations, with the largest gains coming from agentic evaluations and AA-Omniscience (knowledge and hallucination). On AA-Omniscience, Gemini 3.5 Flash improves by 11 points, driven primarily by reduced hallucinations, with its hallucination rate falling to 61%, a 31 point decrease compared to Gemini 3 Flash
➤ Agentic capability improvements: Gemini 3.5 Flash improves substantially over Gemini 3 Flash across our agentic evaluations, in both GDPval-AA (real-world agentic tasks) and Tau2-Bench Telecom (agentic tool use). Its GDPval-AA result is especially notable, achieving an Elo of 1656, well ahead of Gemini 3 Flash (1204) and Gemini 3.1 Pro (1314), and just behind GPT-5.4 (xhigh, 1674). This represents a meaningful step forward for Google in agentic performance, which has historically been a relative weakness for Gemini models
➤ Speed-intelligence frontier: Gemini 3.5 Flash achieves speeds of over 280 output tokens per second, ~70% faster than Gemini 3 Flash and models such as gpt-oss-120b and GPT-5.4 mini (xhigh). With its 55 Intelligence Index score, this places Gemini 3.5 Flash on the speed-intelligence Pareto frontier alongside Gemini 3.1 Pro and Gemini 3.1 Flash-Lite, reinforcing Google’s strength in models balancing speed and intelligence
➤ 5.5x increase in cost to run: Gemini 3.5 Flash costs $1,552 to run the Artificial Analysis Intelligence Index, 5.5x more than Gemini 3 Flash and 75% more than Gemini 3.1 Pro. This is driven by increases in both token usage and token prices. Output token usage is broadly unchanged from Gemini 3 Flash (73M vs. 72M), but input token usage increases significantly, driven primarily by an increase in the number of turns in agentic evaluations. Gemini 3.5 Flash is priced 3x higher than Gemini 3 Flash at $1.50/$9.00 per 1M input/output tokens, with a 90% discount for cached input tokens
➤ Google continues to lead multimodal performance: Gemini 3.5 Flash is multimodal, supporting image, video, and speech input alongside text. This differs from many proprietary models, including Claude Opus 4.7, Grok 4.3, and GPT-5.5, which support image input only. In our multimodal evaluation, MMMU-Pro, Gemini 3.5 Flash scores 84% - the highest score recorded. This puts models from Google in the top two spots, with Gemini 3.1 Pro scoring 82%
Key model details:
➤ Context window: Retains the same 1M context window as Gemini 3 Flash
➤ Multimodality: Text, image, video and speech input with text output only
➤ Pricing: $1.50/$9.00 per million input/output tokens, with a 90% discount for cached input tokens
Congratulations @GoogleDeepMind , @sundarpichai and @demishassabis on the great release!
Three years ago, Gemini started by understanding the world.
With Gemini 2, models learned to think and reason.
Late last year, Gemini 3 brought any idea to life.
Today, we’re continuing that journey with our Gemini 3.5 series, starting with Gemini 3.5 Flash, delivering frontier performance for agents and coding.
You must be feeling lucky 🍀
Consider this your official reminder for #GoogleIO. Join us May 19 at 10 a.m. PT for a first look at our latest product updates and AI breakthroughs.