Introducing our first set of Llama 4 models!
We’ve been hard at work doing a complete re-design of the Llama series. I’m so excited to share it with the world today and mark another major milestone for the Llama herd as we release the *first* open source models in the Llama 4 collection 🦙. Here are some highlights:
📌 The Llama series has been re-designed to use a state-of-the-art mixture-of-experts (MoE) architecture and is natively trained with multimodality. We’re dropping Llama 4 Scout & Llama 4 Maverick, and previewing Llama 4 Behemoth.
📌 Llama 4 Scout is the highest-performing small model, with 17B activated parameters and 16 experts. It’s crazy fast, natively multimodal, and very smart. It achieves an industry-leading 10M+ token context window and can also run on a single GPU!
📌 Llama 4 Maverick is the best multimodal model in its class, beating GPT-4o and Gemini 2.0 Flash across a broad range of widely reported benchmarks, while achieving comparable results to the new DeepSeek v3 on reasoning and coding – at less than half the active parameters. It offers a best-in-class performance-to-cost ratio, with an experimental chat version scoring an ELO of 1417 on LMArena. It can also run on a single host!
📌 Previewing Llama 4 Behemoth, our most powerful model yet and among the world’s smartest LLMs. Llama 4 Behemoth outperforms GPT-4.5, Claude Sonnet 3.7, and Gemini 2.0 Pro on several STEM benchmarks. Llama 4 Behemoth is still training, and we’re excited to share more details about it even while it’s still in flight.
A big thanks to all of our launch partners (full list in blog) for helping us bring Llama 4 to developers everywhere including @huggingface, @togethercompute, @SnowflakeDB, @ollama, @databricks and many others👏 This is just the start, we have more models coming and the team is really cooking – look out for Llama 4 Reasoning 😉
A few weeks ago, we celebrated Llama being downloaded over 1 billion times. Llama 4 demonstrates our long-term commitment to open source AI, the entire open source AI community, and our unwavering belief that open systems will produce the best small, mid-size and soon frontier models. Llama would be nothing without the global open source AI community & we are so ready to begin this next chapter with you. 🦙
Read more about the release here: https://t.co/7mbK3uggjO, and try it in our products today.
Wow, diffusion models (used in AI image generation) are also game engines - a type of world simulation.
By predicting the next frame of the classic shooter DOOM, you get a playable game at 20 fps without any underlying real game engine.
This video is from the diffusion model.
3D capture is dope for media & entertainment, but the utilitarian applications might be even more impactful - literally x-ray vision making the unseen seen.
With tools like Pix4D + RTK GPS (cm-level accuracy) you can capture critical infrastructure and overlay it at its real-world location in the future - allowing you to 'see' through concrete for future maintenance and construction.
Elon nails exactly what's wrong with education here and what needs to change.
You may not know he actually started a remarkable school at SpaceX for his kids based on his principles...
The school was called Ad Astra, Latin for “to the stars”.
Elon hired @josh_dahn to build and run the school. When Elon’s kids graduated, Josh and I cofounded @synthesischool together.
Our goal is to take the best parts of the SpaceX school and make them accessible to everyone.
Elon famously follows a “first principles” approach. The primary principle of the school:
A good education should produce people who can SOLVE PROBLEMS — real, complex problems with a team. Like how to colonize Mars. Not fake school worksheet problems.
This principle was embodied in a class at the school called “Synthesis”. In the class, kids worked in teams to solve problems via complex games and simulations.
The skills they build in Synthesis — working with a team, making tradeoffs, risk assessment and critical thinking — are the foundational skills they’ll need to solve problems in any domain.
1/2