America's best open model shipped toda (550 billion parameters) and it's serving tokens on @nebiustf
Nemotron 3 Ultra is the most intelligent open-weights model the US has shipped. NVIDIA gave away the weights, the data, and the recipes. The intelligence is now a free, downloadable file. That is the headline, and it is real.
But a free file is not a product. This model was co-designed for NVFP4 on Blackwell. The thing that makes it fast is the format and the hardware path, and that only pays off on a serving stack tuned to exploit it. Download the weights and run them naive, you leave most of the model on the floor.
The work that turns the file into fast, cheap tokens is cache-aware routing, disaggregated prefill and decode, speculative decoding shaped to your traffic, dedicated capacity, regional isolation. The conversion layer. Nobody clones that in an afternoon.
That layer is what a token factory is. The model is the crude. The factory is the yield.
Nemotron 3 Ultra is live on Nebius Token Factory today, tuned to run the way NVIDIA built it to run. The weights are everyone's. The throughput is the part you come here for.
Jensen shipped the weights, we shipped the throughput.
Speaking at @pydatalondon this week - Saturday June 6.
Talk: "Using coding agents with open models"
Coding agents like Cursor and Claude Code are genuinely changing how we build software. But most teams still run them on proprietary models by default - not always because open models aren't good enough, but because pairing the two takes a bit of work to get right.
That is exactly what I’ll be showing.
We’ll use open models running on @nebiustf , plug them into coding agents, and look at what actually matters in practice: Developer experience, Model behavior, Setup patterns, Best practices.
Live demos - not just slides.
And yes, attendees will get platform credits so you can try it yourself after the session.
If you’re at PyData London, come say hi. Bring your questions, your coding-agent war stories, and your opinions on open models.
📍 Convene Sancroft, St. Paul's · Grand Hall 2
🗓️ June 6 · 16:15–17:00
🔗 https://t.co/OjkcyMEbj0
#pydata #pydatalondon
🚀 New #1 on the @ArtificialAnlys Kimi K2.6 leaderboard: Eigen AI at 265 tok/s — in collaboration with @Nebius Token Factory. @nebiustf
On B200. Not GB300.
Topping the chart without Blackwell Ultra isn't a silicon story — it's a serving stack story. Every layer of EigenInference is co-designed for trillion-param MoE.
More coming 🔥
https://t.co/YiUoasxFko
Your production logs aren't just logs...
they are TRAINING DATA.
see how @nebiustf Data Lab feature allows you to capture / curate production logs and use them for training.
links below:
I love it when a room full of hands goes up when I ask, “who’s using open models?”
That was the vibe at https://t.co/V6khm3jm3V (@nebiusai) event yesterday.
@demian_ai and I spoke about "Engineering open models for production at scale"
We covered some of the work in Nebius Token Factory (@nebiustf )
- KV cache management across GPU + CPU memory
- speculative decoding (training draft models)
- disaggregated prefill (decoupling prefill and decode to better utilize compute and memory)
- strong support for post-training
Great questions, great discussions.
Thanks to everyone who showed up and made it awesome.
PS: Was a perfect sunny ☀️ spring day in Berlin
PPS: got legit good chocolate 🍫 - top-tier swag
If you’re at HumanX conf in San Francisco ...
Find the 'hidden' @nebiusai Speakeasy
Reward: a delicious drink 🍹
Stop by Nebius booth #1127
Reward: Nebius Token Factory @nebiustf credits 🎁
Headed to HumanX? Be sure to join our Masterclass Session on April 8 at 4:30 PM (Hall D-10) to break down the full engineering stack behind production LLMs.
📌 Plus don't miss:
• Live demos at our Booth #1127
• Meeting the team at our speakeasy
See you there!
Remember when we were all reworking websites to actually function on smartphones?
Now we’re doing it again - but this time, for agents 🤖
At our https://t.co/pSyO4rLwrj.SF Hackathon, the Injestor team - Benjamin Shyong, Vishal Verma , Alex Shirazi - built a tool to convert websites (originally made for humans) work better with agents - and took 1st place in a seriously competitive field (200+ attendees, 70+ projects).
Their stack:
- Models on Nebius Token Factory @nebiustf (they liked Nemotron3-Super for speed and good performance)
- @tavilyai for pulling web content
- And (my favorite) a Karpathy-style agent loop to iteratively optimize sites for agent use
Agents building sites for agents - Pretty meta 😄
As part of the win, we invited them join us at the @nebiusai booth at #nvidiagtc. It was great fun!
Checkout Benjamin's post for more details and a cool video : https://t.co/EuRFXKMDe5 ; And https://t.co/30jjKEJ8at
Great geeking out with you all - excited to see where this goes next.
I'm hiring a (US-based) Hackathons Lead to turn hackathons into a repeatable growth engine at @Nebius. Goal is not just more hackathons, it’s to systemize, scale and deliver the best hackathon program in the industry. If that's you (or know someone) pls DM or tag them here
Your LLM logs = training data 👀
But most teams don’t use them effectively.
See how Nebius Token Factory (@nebiustf) turns production logs into better models.
🗓️ March 26
🕐 10AM PDT / 12PM EDT / 6 PM CET
👉 https://t.co/h9HuB23cKX
Can’t make it live? No worries - register anyway and we’ll send you the recording + a 🎁.
Packed workshop on "synthetic data generation using Nvidia data designer" at Nvidia GTC.
Using open models (gpt-oss-120b, nemotron-3-super, kimi-2.5) powered by Nebius Token Factory (@nebiustf)
Great work @johnnypgreco and team 👏
🐍 Snakes at the Nebius booth @ Nvidia GTC '26🐍
…not that kind of snakes 😄
It’s the classic Snake game -
but each snake is controlled by an LLM in real time.
Powered by Nebius Token Factory @nebiustf
📍 Last day of the conference
🎮 Try different models and watch them compete live
Stop by the @nebiusai booth 👀
#Snakegame #LLMGames
“Nebius will take care of you,” — Jensen Huang, @NVIDIA’s Founder and CEO, in conversation with our CRO Marc Boroditsky at the Nebius booth here at #NVIDIAGTC.
We will.