🚀 Nemotron 3 Nano is live! Had a blast post-training this model with a cracked team. It's strong for its size, and highly efficient at inference.
And true to @nvidia's open release style: weights (BF16/FP8/base) + training recipes + code + datasets.
HF: https://t.co/IE7OCOU2qd
Blog + Nano tech report: https://t.co/Qioq9N2ewg
Today, we report a method for designing active enzymes, RFdiffusion2, in @naturemethods. For the first time, we are able to design enzymes with native-range catalytic activity.
We are also releasing our next frontier model, RFdiffusion3. Code 👇
I am recruiting 2 PhD students to work on LM interpretability at UMD @umdcs starting in fall 2026!
We are #3 in AI and #4 in NLP research on @CSrankings.
Come join us in our lovely building just a few miles from Washington, D.C. Details in 🧵
[1/9] While pretraining data might be hitting a wall, novel methods for modeling it are just getting started!
We introduce future summary prediction (FSP), where the model predicts future sequence embeddings to reduce teacher forcing & shortcut learning.
📌 Predict a learned embedding of the future sequence, not the tokens themselves
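A rough sketch of the idea, not the thread's actual implementation: at each position the model gets an auxiliary target that summarizes the upcoming tokens, and is penalized for the distance between its prediction and that summary. Everything below is an assumption made for illustration: the function names, the `window` parameter, the squared-error loss, and especially the mean-pool, which stands in for the learned summary encoder the thread describes.

```python
# Hypothetical sketch: all names are illustrative, not taken from the FSP work.

def mean_pool(vectors):
    """Average a non-empty list of equal-length vectors."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def future_summary_targets(token_embeddings, window):
    """For each position t, summarize the next `window` embeddings.

    Assumption: a mean-pool replaces the learned summary encoder.
    Positions with no future tokens get None.
    """
    targets = []
    for t in range(len(token_embeddings)):
        future = token_embeddings[t + 1 : t + 1 + window]
        targets.append(mean_pool(future) if future else None)
    return targets


def summary_loss(predictions, targets):
    """Squared L2 distance, averaged over positions that have a target."""
    total, count = 0.0, 0
    for pred, target in zip(predictions, targets):
        if target is None:
            continue  # last position: nothing left to summarize
        total += sum((p - q) ** 2 for p, q in zip(pred, target))
        count += 1
    return total / count if count else 0.0
```

In a real setup this term would presumably be added to the usual next-token loss with some weight, and the predictions would come from a small head on the model's hidden states; both of those are guesses rather than details given in the thread.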
Considering a PhD/MSc in NLP?
I’m hiring students this cycle!
If you are passionate about making language models reliable and safe, eager to understand and control language models, and would like to add some multilingual flavor to your research, apply to my group! 👇