In our first talk of the summer, we had @kothasuhas and @konwookim talk about their incredible work on scaling synthetic data with MegaDocs. See thread below for link to the video.
@lilianweng In our work -- https://t.co/O5bllH62er
We formally show the common theme connecting all the scaling laws -- Submodularity and Diminishing Returns.
I'm joining OpenAI next week!🥹 The job search turned out to be really challenging but also super rewarding, so I wrote a small blog to share what I learned along the way and hopefully make the process a little less mysterious for the next person. https://t.co/6FigSBdenD
1/ We fine-tune a lot of customer models, so we decided to systematically try and figure out some best practices for finetuning. SFT isn't sexy, but it's still important. We vary one SFT lever at a time across 2 model families, dense + MoE to 235B, on 4 real-world customer datasets.
What makes this clean is that each dataset is paired with an eval that took weeks to build with the customer, and the training outputs were generated to pass that eval. So the supervised target and the thing we measure downstream are the same criterion, which strips out the usual confounders
Most companies that say they want to own their model are going to fail at it. I like the conversation about "owning vs renting" intelligence. It's the right frame, and it's about to become the defining decision for most companies, because AI will be core for pretty much everyone, even if they don't realize it yet. For those where the model is the business today, these questions hit close to home.
Renting was the right call for the last three years. Call an API, ship, don't think about infrastructure. But the ground is shifting. The frontier is quietly closing up. Meta moved its newest flagship work to closed models under its Superintelligence Lab, and the strongest Chinese models, like Qwen's top tier, are now API-only. Open weights aren't going away, but counting on the best ones being there for you isn't a safe bet anymore. And the closed labs are compute-constrained enough that access itself is becoming something you reserve years in advance: OpenAI is already selling multi-year "Guaranteed Capacity" contracts.
So serious companies are deciding to own their models rather than rent them. Here's where almost everyone gets it wrong. They treat it as a compute problem, or a talent problem: get the GPUs, hire the team, and you can build a great model. They line up the compute, put a date on the calendar for the model, and then hit the real blocker. Data. Their proprietary data isn't ready for training. There isn't enough data to train on in the areas they really care about. That's the part almost nobody budgets for: data quality in a shape that can train the model your business needs.
Getting the data right also flips the economics that we have come to expect from the past three years. A small, domain-specific model built on the right data can go toe-to-toe with the best the frontier labs can build, and you keep your data, own your roadmap, control your costs, and build a moat that's actually yours.
The future worth betting on isn't three labs renting the same model to everyone. It's thousands of companies building their own domain-specific models, each better at its job than any general model could be. The frontier used to look like the shining house on the hill. Lately, it looks more like a landlord, happy to keep you renting as long as you never price out what owning could really look like.
1/ 🌞 Our Summer of Data Seminar brought together some of the sharpest minds in data curation last year. We are bringing it back in 2026! Let's recap the great talks from 2025!
1/ 🌞 Our Summer of Data Seminar brought together some of the sharpest minds in data curation last year. We are bringing it back in 2026! Let's recap the great talks from 2025!
🤔 What if scaling LLMs isn't just about adding parameters but about finding the right program hidden inside the model you already have?
If every layer block is viewed as a callable function, can we program an LLM for each input by choosing to keep a layer, skip it, or loop it?
ResNet and YOLO received to the Longuet-Higgins Test of Time award. Congrats!
Three thoughts:
- very “difficult” job for the committee this year
- people are still using both quite a bit
- time for an additional generation to feel old — I already got used to that 😅 #cvpr2026
NVIDIA Nemotron 3 Ultra is now live!
Frontier accuracy, 5X greater speed, 30% lower cost.
Deploy however you need - on-premise, on the cloud, or at the edge.
Model is live on HuggingFace under the OpenMDW 1.1 license.
https://t.co/IOfAwv3jB6
Embodied agents are getting better at reasoning… but they still make surprisingly brittle mistakes.
A robot that can “bring me a banana”
may completely fail at: “bring me a yellow curved fruit.”
Why?
Because current agents usually commit to the first action they come up with.
We introduce VeGAS: Verifier-Guided Action Selection for embodied agents. 🤖
Instead of acting immediately, the agent:
• samples multiple candidate actions
• verifies them
• executes the best one
Accepted as a #CVPR 2026 Findings paper.
🧵
What makes a dataset valuable? And when is "more data" not the same as "better data" in machine learning and AI? Read more to find out: https://t.co/Q0wPOtfm5d
The evidence for specialized pretraining keeps growing.
This really nice study shows how early exposure leads to robustness to forgetting.
Enterprises serious about AI use cases should start thinking about training custom models from scratch, not just post-training or RL.