Most language models only generate one token at a time.
We just released Nemotron-Labs-Diffusion, a family of diffusion language models that take a different approach, generating multiple tokens in parallel within a single model. Rather than committing to each token permanently, these models can revise as they go, resulting in faster inference that better utilizes modern GPUs.
The full model family ranges from 3B to 14B, including vision-language variants. Available now: https://t.co/L1Tp2aQDLJ
NVIDIA’s Ian Buck hand-delivered the first-ever NVIDIA Vera CPUs to our partners @AnthropicAI, @OpenAI, @SpaceX, and @OracleCloud. 🎉
Vera is NVIDIA's first custom CPU, purpose-built for the age of agentic AI. This is just the beginning. The road to Vera-powered systems starts here.
Thank you to our partners for being on this journey with us. The best is yet to come. 💚
Agentic AI is changing the rules for inference. With DeepSeek V4, NVIDIA Blackwell delivered 20x lower cost per token out of the box, running a 1.6T parameter MoE model with a 1M token context on day one.
But the real story is how:
NVIDIA is the only platform co-designed end-to-end across five rack-scale systems—engineered to operate as a unified AI factory rather than a collection of discrete components.
That’s what enables:
→ Higher throughput for agentic workloads
→ Lower latency across multi-step reasoning loops
→ Sustained improvements in token economics over time
As AI factories scale, cost per token becomes the metric that matters and extreme co-design is the advantage that compounds.
📗 https://t.co/389KyfelQ0
this is actually insane
> be tech guy in australia
> adopt cancer riddled rescue dog, months to live
> not_going_to_give_you_up.mp4
> pay $3,000 to sequence her tumor DNA
> feed it to ChatGPT and AlphaFold
> zero background in biology
> identify mutated proteins, match them to drug targets
> design a custom mRNA cancer vaccine from scratch
> genomics professor is “gobsmacked” that some puppy lover did this on his own
> need ethics approval to administer it
> red tape takes longer than designing the vaccine
> 3 months, finally approved
> drive 10 hours to get rosie her first injection
> tumor halves
> coat gets glossy again
> dog is alive and happy
> professor: “if we can do this for a dog, why aren’t we rolling this out to humans?”
one man with a chatbot and $3,000 just outperformed the entire pharmaceutical discovery pipeline.
we are going to cure so many diseases.
I don't think people realize how good things are going to get
1/4 We see no wall in post-training. Scaling RL software, infra, and data keeps yielding major capability gains.
We trained across 30 RL environments with up to 4,000 instances per batch — math, code, STEM, agentic tool use, SWE, terminal, safety — all in a unified multi-environment RLVR setup.
Of course that's your contention. You're a first-time SaaS bear. You just got finished listening to some podcast, Dario on Dwarkesh, probably. Now you think it’s the end of white collar work and seat-based pricing is screwed. You're gonna be convinced of that til tomorrow when you get to “Something Big is Happening”. Then you’ll install ClawdBot on a Mac Mini, vibe code a dashboard on top of a postgres database and say we’re all just a couple ralph loops away from building a Salesforce competitor. That’s gonna last until next week when you discover context graphs, and then you're gonna be talking about how the systems of record will be disintermediated by an agentic layer and reposting OAI marketing graphics.
“Well, as a matter of fact, I won't, because ultimately the application layer is just ….”
The application layer is just business logic on top of a CRUD database. You got that from Satya’s appearance on the BG2 pod, December 2024, right? Yeah, I saw that too. Were you gonna plagiarize the whole thing for us? Do you have any thoughts of your own on this matter? Or...is that your thing? You get into the replies of anyone posting a SaaS ticker. You watch some podcast and then pawn it off as your own idea just to impress some VCs and embarrass some anon who’s long SaaS? See the sad thing about a guy like you is in a couple years you're gonna start doing some thinking on your own and you're gonna come up with the fact that there are two certainties in life. One: don't do that. And two: you dropped thirty grand on Mac Minis and LLM API calls to come to the same conclusion you could’ve got for free by following a handful of VC accounts.
The #NVIDIARubin platform utilizes extreme co-design across hardware and software to deliver inference at 1/10th the token cost and trains MoE models with 1/4th the number of GPUs.
NVIDIA Rubin is designed to deliver unprecedented efficiency for training, inference, and advanced reasoning at scale. #CES2026
Starting the day seeing NVIDIA Nemotron 3 Nano trending at #1 on @huggingface🤗
Huge thank you to our developer community! Tell us what you are building with it.
➡️https://t.co/u9Ax9Tiqgj
🚀 Nemotron 3 Nano 30B-A3B is here! Open weights + open data + open source.
AA Intelligence Index: 52 (@ArtificialAnlys)
✅ 1M‑token context
✅ up to 3.3× higher throughput vs similarly sized open models
✅ stronger reasoning/agentic + chat
Details + links in the thread 🧵
Deeply amused by all the confident commentary that datacenters in space do not work from a physics and engineering perspective.
Elon operates two of the largest coherent GPU clusters in the world, SpaceX is responsible for over 90% of mass to orbit and SpaceX operates the largest satellite constellation in the solar system. More than 10 years later, no other company or country can consistently land and reuse orbital rockets.
He publicly stated that the “lowest cost way to do AI compute will be with solar powered satellites.”
Maybe, just maybe, his “pencil and paper analysis of the physics or the economics at play” is superior to yours. There might have even been more than just a “pencil and paper analysis” of the subject done by some of the best engineers in the world. Perhaps they have thought of a cooling solution that has not occurred to the galaxy brain accounts here even after they took several minutes to carefully think about the problem.
The CEO of Google also agrees that data centers in space will be “normal” within a decade.
If you are not currently operating a large AI datacenter, a large satellite cluster and have not landed a rocket, maybe be a little less quick to confidently assume that Elon and Google are *both* wrong on this topic.
Especially when there is a working, albeit very small, datacenter in space *today* - Starcloud’s orbital setup just successfully trained an LLM. Great name btw.
Yes, I am biased on these topics and as ever, time will tell.
What’s wrong with this world?
Kharkiv. Several air-dropped bombs. On a residential neighborhood. In broad daylight.
If it were New York or Paris — the world would burn with outrage.
But since it’s just Ukraine?
Silence.
F*ck everyone who stays silent and pretends not to see.
In this debate over AI replacing jobs, Arvind accurately sums it up: the nuances that lie in between the tasks (even if the tasks themselves are automatable) are the most difficult parts of any job, and the parts that AI cannot automate.
I find the story of AI and radiology fascinating. Of course, Hinton's prediction was wrong* and tech advances don't automatically and straightforwardly cause job replacement — that's not the interesting part.
Radiology has embraced AI enthusiastically, and the labor force is growing nevertheless. The augmentation-not-automation effect of AI is despite the fact that AFAICT there is no identified "task" at which human radiologists beat AI. So maybe the "jobs are bundles of tasks" model in labor economics is incomplete. Paraphrasing something @MelMitchell1 pointed out to me, if you define jobs in terms of tasks maybe you're actually defining away the most nuanced and hardest-to-automate aspects of jobs, which are at the boundaries between tasks.
Can you break up your own job into a set of well-defined tasks such that if each of them is automated, your job as a whole can be automated? I suspect most people will say no. But when we think about *other people's jobs* that we don't understand as well as our own, the task model seems plausible because we don't appreciate all the nuances.
If this is correct, it is irrelevant how good AI gets at task-based capability benchmarks. If you need to specify things precisely enough to be amenable to benchmarking, you will necessarily miss the fact that the lack of precise specification is often what makes jobs messy and complex in the first place. So benchmarks can tell us very little about automation vs augmentation.
* Hinton insists that he was directionally correct but merely wrong in terms of timing. This is a classic motte-and-bailey retreat of forecasters who get it wrong. It has the benefit of being unfalsifiable! It's always possible to claim that we simply haven't waited long enough for the claimed prediction to come true.
I believe there are better opportunities for market-beating returns and compound wealth over the long run, without the constant threat of tech disruption - Companies like $DNP or $JDG.
That’s where my focus is now.
Have a great weekend!