Introducing Cosmos 3: Our latest frontier model for Physical AI
Cosmos 3 is the world’s first fully open omnimodel with native vision reasoning, world and action generation.
Today we’re releasing Super (32B) and Nano (8B) variants.
Today we're shipping Nemotron 3 Ultra.
A 550B MoE frontier-intelligence open model built for long-running agents.
It delivers 5x faster inference and lowers the cost of complex agentic tasks by up to 30% versus other open frontier models.
GPUs in Mumbai, training data in Iowa? Cross-region reads tax every epoch.
We put @Alluxio NVMe caching in front of the bucket with Ray Data on Anyscale: 1TB warm reads went 20x faster.
https://t.co/FqomwsoqpS
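To illustrate why a cache in front of a remote bucket helps across epochs, here is a minimal read-through cache sketch. It is a toy, not Alluxio's or Ray Data's API: `ReadThroughCache`, `fetch_remote`, and `run_epochs` are hypothetical names, and an in-memory dict stands in for local NVMe.

```python
class ReadThroughCache:
    """Toy read-through cache: the first read of a key goes to the remote
    store, later reads are served locally. Hypothetical, for illustration."""

    def __init__(self, fetch_remote):
        self._fetch_remote = fetch_remote  # callable standing in for a cross-region read
        self._local = {}                   # dict standing in for local NVMe storage
        self.remote_reads = 0              # counts how often the slow path was taken

    def read(self, key):
        if key not in self._local:
            # Cold read: pay the cross-region cost once and keep a local copy.
            self.remote_reads += 1
            self._local[key] = self._fetch_remote(key)
        # Warm read: served from the local copy.
        return self._local[key]


def run_epochs(cache, keys, num_epochs):
    """Read every key once per epoch and return the total number of remote reads."""
    for _ in range(num_epochs):
        for key in keys:
            cache.read(key)
    return cache.remote_reads
```

With three keys and five epochs, only the first epoch touches the remote store; the remaining four are warm reads.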
The bottleneck in drug discovery isn't designing molecules, it's making them. onepot combines robotic synthesis with large-scale ML inference on Anyscale to predict which reactions will work before they run, achieving 3B+ compounds and 10B+ reactions scored.
Case study: https://t.co/z0YjQmBkQB
📣 Webinar: Anyscale on Azure — build and deploy AI at scale in your own tenant.
📅 Tue June 16, 8:30 AM PDT
Live with Daniel Arrizza (Anyscale) and Paul Yu (Microsoft): how to run production AI inside your own Azure tenant, plus a live build-train-serve demo.
Register 👉 https://t.co/kJQbUWA7c1
The co-founder of Kubernetes meets the co-creator of Ray.
@brendandburns and @robertnishihara on Anyscale on Azure — why AI belongs on AKS, and what it unlocks: a production layer for Ray, Azure Native, with 4x faster dev and 50% higher GPU utilization.
Watch 👇
🎬 New on Azure Friday: scaling Python AI workloads with managed Ray on AKS.
@shanselman talks with Anyscale's Omar Shorbaji on running build-train-serve on Ray, directly on AKS — no Kubernetes wrangling. Plus a live demo fine-tuning a robotics policy.
Catch it here 👇
https://t.co/FJnBvbNvoC
Huge congrats to the Microsoft AI team on MAI-Thinking-1.
Great to see large-scale RL systems converging around the SGLang + Ray ecosystem. Rocket’s design—async RL, separated rollout / inference / learner pools, router-based traffic control, prefix caching, and fault-tolerant inference—is very aligned with what we believe in slime: RL is not just an algorithm problem, but a full-stack infrastructure problem.
Excited to see more open RL infra ideas validated at frontier scale!
Congratulations to the Microsoft AI team on MAI-Thinking-1! Exciting to see Ray used in multiple parts of frontier-model development.
- Fast pre-training recovery via in-job restarts with hot standbys
- Async RL orchestration (managing learners, inference servers, rollout workers, and routers, each with distinct placement and fault-tolerance needs)
- A two-pool Ray cluster for building and grading SWE environments on 30K CPU cores
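The async RL pattern in the list above (rollout workers and a learner running as separate pools) can be sketched with plain `asyncio`. This is a toy under stated assumptions, not the MAI or Ray implementation: `rollout_worker`, `learner`, `train`, and the `policy` dict are hypothetical names, and `asyncio.sleep(0)` stands in for real environment and inference latency.

```python
import asyncio


async def rollout_worker(worker_id, policy, out_queue, num_episodes):
    """Produce episodes using whichever policy version is current,
    without waiting for the learner to finish a step."""
    for episode in range(num_episodes):
        await asyncio.sleep(0)  # stand-in for environment + inference latency
        out_queue.put_nowait({
            "worker": worker_id,
            "episode": episode,
            "policy_version": policy["version"],  # may be stale by training time
        })


async def learner(policy, in_queue, batch_size, total_samples):
    """Consume samples in batches; each batch bumps the policy version."""
    consumed = 0
    while consumed < total_samples:
        # Shrink the final batch so the learner never waits for samples
        # that will not arrive.
        size = min(batch_size, total_samples - consumed)
        batch = [await in_queue.get() for _ in range(size)]
        consumed += len(batch)
        policy["version"] += 1  # stand-in for a gradient step + weight sync
    return consumed


async def train(num_workers=4, episodes_per_worker=6, batch_size=8):
    """Run the rollout pool and the learner concurrently."""
    policy = {"version": 0}
    queue = asyncio.Queue()
    total = num_workers * episodes_per_worker
    workers = [
        asyncio.create_task(rollout_worker(i, policy, queue, episodes_per_worker))
        for i in range(num_workers)
    ]
    consumed = await learner(policy, queue, batch_size, total)
    await asyncio.gather(*workers)
    return consumed, policy["version"]
```

Calling `asyncio.run(train())` returns the number of samples consumed and the final policy version. In a real system each pool would also need its own placement and fault-tolerance handling, which this sketch leaves out.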
MAI-Thinking-1 is our first in-house reasoning model developed from scratch that is competitive with models of similar size on STEM reasoning and coding tasks.
35B active/1T total MoE.
💻 Coding: 52.8% on SWE Bench Pro, competitive with Opus 4.6
🧐 Reasoning: 97% on AIME 25
🤝 Preferred to Sonnet 4.6 on blind side-by-side tests
It's interesting to see that @MicrosoftAI uses Ray actors not just for controller and rollout workers but also for problem workers in the post-training of the MAI-Thinking-1 model.
Instead of introducing a third-party dependency like @modal for sandboxing, using Ray actors could provide finer granularity and control over heterogeneous compute, which could translate to better utilization of unused CPU resources in the GPU cluster and easier communication of agent execution results.
Also, the part of the work I did was to support @sgl_project with a @raydistributed backend, to better support RL infra, especially weight syncing.
New: Anyscale on Azure, now in public preview.
Build AI on your own data, inside infrastructure you govern — sovereign AI. It's how Wayve and Xoople train production AI, from autonomous driving to satellite imagery.
https://t.co/Sk8g3ASAYK
Today at #MSBuild, Anyscale on Azure is now in public preview.
Enterprises can run the full AI lifecycle inside their own Azure environment — data, models, and pipelines stay in your tenant, with governed compute instead of per-token API costs.
Learn more here: https://t.co/9yumJwc5Oc
In this webinar, Neil Wadhvana, Staff ML Engineer at @torc_robotics, will walk through how Torc consolidated its autonomy data processing stack to support multimodal AI at scale with Ray on Anyscale. He will cover:
- The trends driving growth in autonomous driving developments,
- An overview of Torc’s data loop from production to consumption,
- The internal trends in multimodal AI that drove the need for consolidation,
- The before and after of adopting Ray as a common compute framework.
Register: https://t.co/YdA0I8D8uf
The bottleneck in drug discovery isn't designing molecules, it's making them. onepot combines robotic synthesis with large-scale ML inference on Anyscale to predict which reactions will work before they run, achieving 3B+ compounds and 10B+ reactions scored.
Case study: https://t.co/KjN1EwD9NH
FWIW, I don't view Ray and Slurm as alternatives to each other, I think of them as solving different problems, e.g.,
Slurm is responsible for sharing compute resources among multiple workloads and multiple users. It provides workload multitenancy, queuing, prioritization, preemption, etc.
Ray is an actor framework and provides a distributed runtime for a single workload. It provides a single-controller programming model for distributed workloads, manages & coordinates processes, handles failures, etc.
It's very natural to run a Ray workload on top of Slurm, similar to how you'd run a Ray workload on top of Kubernetes.
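To make the "actor framework with a single controller" idea concrete, here is a toy sketch using only the Python standard library. It is not Ray's API: `ToyActor` and `run_controller` are hypothetical names, and threads stand in for the separate processes a real distributed runtime would manage.

```python
import queue
import threading
from concurrent.futures import Future


class ToyActor:
    """A stateful worker that handles one message at a time from its mailbox."""

    def __init__(self, name):
        self.name = name
        self.count = 0  # private state, only mutated by the actor's own thread
        self._mailbox = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        while True:
            item = self._mailbox.get()
            if item is None:  # shutdown sentinel
                break
            amount, future = item
            self.count += amount
            future.set_result(self.count)

    def add(self, amount):
        """Enqueue a request and return a Future immediately (non-blocking)."""
        future = Future()
        self._mailbox.put((amount, future))
        return future

    def stop(self):
        self._mailbox.put(None)
        self._thread.join()


def run_controller(num_actors=3, calls_per_actor=4):
    """Single controller: one script creates the actors, fans out calls,
    and gathers the results."""
    actors = [ToyActor(f"worker-{i}") for i in range(num_actors)]
    futures = [[a.add(1) for _ in range(calls_per_actor)] for a in actors]
    # Each mailbox is processed in order, so the last future holds the final count.
    totals = [fs[-1].result() for fs in futures]
    for a in actors:
        a.stop()
    return totals
```

Nothing here decides which users or jobs get the machines. That is the scheduling layer's job (Slurm or Kubernetes in the post above), which is why the two compose rather than compete.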
Ray Day: London is one week out.
June 2 at Convene 200 Aldersgate. Keynote from Philipp Moritz, user talks from BMW, Adyen, Criteo, Kaiko & Xoople, plus a hands-on workshop on scaling AI with Ray.
A few spots left → https://t.co/OuF8YcU0HH
Don't miss the opportunity to learn:
- Build and scale data pipelines with Ray
- What video data curation is
- Stream large datasets from remote sources at scale
- Run distributed GPU inference with Ray Data
- Scale embedding generation with CPU actor pools
- Compose CPU and GPU stages into one streaming pipeline with Ray
Register: https://t.co/Se9LZ5Tvvg