Our co-founder and CEO, @robrombach, sat down with President Trump, President von der Leyen, President Macron, and other world leaders at the G7 to stress the vital role of open innovation in AI. With openness under pressure around the world, Robin urged governments and industry to make open and responsible development the norm, not the exception. Check out his speech below!
https://t.co/REOofUXPaM
Welp, that happened faster than I predicted. Thought it would be end of 2027, then early 2027, but agentic traffic is growing so fast that bots have now passed human traffic online for the first time in the Internet's history. https://t.co/2zX5bHdhsa
"The fact that someone like Martin Scorsese - one of the greatest, most impressive filmmakers to exist - is using our technology and curious about exploring it...it's such a great proof point that this works."
- our CEO @robrombach in an interview with @brooksbarnes for the @nytimes. They discussed why Martin Scorsese joined BFL as an advisor.
At Black Forest Labs, we're building visual intelligence: AI models that can understand and reason in the physical and digital worlds. Scorsese is helping shape how our models serve creators who care deeply about their craft, whether they're storytellers, designers, engineers, or roboticists.
Link to article in the thread.
How does frontier training use 2,048 GPUs?
Because there are five dimensions you can split work across - and at scale, you use all of them at once.
Hope this helps demystify some of the model training frameworks out there:
@immortaldip @PandaAshwinee Of the 5 parallelism methods, DP has the highest ROI, and TP and CP come into play under HBM constraints. PP has the lowest ROI imo, given the eng complexity, but frontier labs still use it, as evidenced by DeepSeek's DualPipe schedule. EP is MoE-specific. Different trade-offs.
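A rough sketch of how the five dimensions can factor a 2,048-GPU job. The degrees (TP=8, CP=2, PP=4, EP=8), the rank ordering, and the choice to carve EP out of the DP group are all assumptions for illustration, not taken from any specific framework or from the thread; `plan_layout` and `rank_to_coords` are hypothetical helper names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ParallelLayout:
    tp: int  # tensor parallel degree
    cp: int  # context parallel degree
    pp: int  # pipeline parallel degree
    dp: int  # data parallel degree (derived from the others)
    ep: int  # expert parallel degree (assumed to subdivide dp)


def plan_layout(world_size: int, tp: int, cp: int, pp: int, ep: int = 1) -> ParallelLayout:
    """Derive the DP degree from the world size and the model-parallel degrees."""
    model_parallel = tp * cp * pp
    if world_size % model_parallel != 0:
        raise ValueError("tp * cp * pp must divide world_size")
    dp = world_size // model_parallel
    if dp % ep != 0:
        # Assumption: expert-parallel groups are carved out of the DP group.
        raise ValueError("ep must divide the derived dp degree")
    return ParallelLayout(tp=tp, cp=cp, pp=pp, dp=dp, ep=ep)


def rank_to_coords(rank: int, layout: ParallelLayout) -> dict:
    """Map a global rank to per-dimension indices.

    Assumed ordering: tp varies fastest, then cp, then dp, then pp.
    """
    tp_idx = rank % layout.tp
    rest = rank // layout.tp
    cp_idx = rest % layout.cp
    rest //= layout.cp
    dp_idx = rest % layout.dp
    pp_idx = rest // layout.dp
    return {
        "tp": tp_idx,
        "cp": cp_idx,
        "dp": dp_idx,
        "pp": pp_idx,
        "ep": dp_idx % layout.ep,  # position inside the assumed EP subgroup
    }


layout = plan_layout(2048, tp=8, cp=2, pp=4, ep=8)
print(layout)
print(rank_to_coords(2047, layout))
```

With these made-up degrees, 8 x 2 x 4 = 64 GPUs hold one model replica, leaving a DP degree of 32. Putting TP on the fastest-varying index is a common choice because TP traffic is the heaviest and benefits from staying within a node, but real frameworks differ in how they order and overlap these groups.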
Will be at #MLSys2026! Would love to catch up with familiar faces and meet new ones. Especially keen to geek out on highly scalable, fault-tolerant training. If you're going, drop a reply or DM. Let's connect!
Decoupled DiLoCo is one of the most practically important training papers in a while. Spent a lot of time working through the systems implications. A few things I had to figure out myself that might save you time 🧵
10/ Overall: Decoupled DiLoCo is one of the most practically important training papers in a while.
The core insight - prioritize availability and partition tolerance over consistency - will matter more as training scales to geo-distributed clusters.