Today, we’re excited to announce our $50M Series B, led by @GreenfieldVC (formerly TPG Capital), with participation from @lightspeed and @notablecap. 🚀
At @PatronusAI, we develop simulations and evals to train and improve AI. The first phase of AI was built on static benchmarks, but that era is over now. As agents are used to solve longer and longer tasks, they need to practice in dynamic, living worlds to get better. Simulations are the critical infrastructure powering this next phase.
As a company, we’re behind the most influential research and products in AI evaluation, like FinanceBench, Lynx, and Percival. And things have moved at the speed of light since. ⚡ We partner with the world's leading frontier AI labs and enterprises, and our revenue has grown more than 15x over the past year.
Additionally, today, we’re introducing a preview of the first Digital World Model for AI agent training and simulation: Patronus-DWM.
Digital World Models are language diffusion world models that predict realistic environment behaviors and steer agent actions across digital workflows. Just as physical world models predict how objects move through space, we’re developing the equivalent for the digital world: predicting how agents act in digital workflows, then using that to scale the creation of high-quality training data for LLMs.
Digital World Models help us push the frontier of ultra long horizon workflows, and unlock a new class of self-improving RL environments. This is our scalable approach to simulating all of the world’s intelligence.
The round was also joined by @datadoghq, @SamsungVentures, @gokulr, @factorialcap, and a large cohort of amazing AI leaders and researchers across @AnthropicAI, @OpenAI, @GoogleDeepMind, @nvidia, @Recursive_SI, and more. ✨
It has been the ride of a lifetime. But we’re just getting started. The best is yet to come.
"Do not go gentle into that good night,
Rage, rage against the dying of the light"
- Dylan Thomas (1951)
AGI isn’t a bigger next token.
It’s agents that experience consequences, adapt strategies, and generalize across tasks.
Build better environments → get better agents.
Environments are the next step.
The path from AI to AGI won’t be “just add parameters.”
It’s richer RL environments where agents can act, get feedback, and improve.
Closing the loop from prediction → decision → consequence.
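To make the act → feedback → improve loop concrete, here is a minimal sketch using tabular Q-learning on a toy corridor world. Everything in it (the `Corridor` class, the reward values, the hyperparameters) is a hypothetical illustration, not any particular lab's or product's environment.

```python
import random


class Corridor:
    """Hypothetical toy environment: walk from cell 0 to the last cell."""

    def __init__(self, length=5):
        self.length = length
        self.pos = 0

    def reset(self):
        self.pos = 0
        return self.pos

    def step(self, action):
        # action 1 moves right, anything else moves left; walls clamp the move.
        move = 1 if action == 1 else -1
        self.pos = min(max(self.pos + move, 0), self.length - 1)
        done = self.pos == self.length - 1
        # Consequence: small cost per step, payoff on reaching the goal.
        reward = 1.0 if done else -0.01
        return self.pos, reward, done


def train(env, episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning: act, observe the consequence, update the estimate."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(env.length)]
    for _ in range(episodes):
        state = env.reset()
        for _ in range(100):  # cap the episode length
            # Decision: explore sometimes, otherwise follow current estimates.
            if rng.random() < epsilon:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = env.step(action)
            # Improvement: move the estimate toward reward plus discounted future value.
            future = 0.0 if done else gamma * max(q[next_state])
            q[state][action] += alpha * (reward + future - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy_rollout(env, q, max_steps=20):
    """Follow the learned values with no exploration; return (steps, reached_goal)."""
    state, steps, done = env.reset(), 0, False
    while not done and steps < max_steps:
        action = 0 if q[state][0] > q[state][1] else 1
        state, _, done = env.step(action)
        steps += 1
    return steps, done


q_table = train(Corridor())
steps_taken, reached_goal = greedy_rollout(Corridor(), q_table)
```

The agent is never told the rules. It only sees states and rewards, and the shortest path falls out of repeated interaction with the environment.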
Alignment is part of it:
RLHF showed that aligning models to human intent via feedback works in pure language. Now put that feedback inside interactive environments, so agents learn both what to do and how to behave.
Open-ended worlds matter:
MineDojo (Minecraft) blends thousands of tasks + internet-scale knowledge, even learning rewards from video-language priors, exactly the “learn from the world” recipe.
Bridge to real user tasks:
On WebArena, GPT-4-based agents hit ~14% E2E success vs ~78% for humans. Great reality check and a north star for progress. We need better environments, tools, and credit assignment.
Scale the worlds too:
XLand shows agents getting broadly capable via open-ended play across many games, not memorized tasks. Curriculum emerges from environment design.
Evidence we’re on the right track:
• MuZero learns a world model and plans at a superhuman level on Atari/Go/Chess/Shogi without being given the rules. Planning + learning in one agent.
Fighting Hallucinations is one of the most important features for a RAG system to have! ⚔️
SUPER excited to share a bit of what we've been cooking up with our friends @PatronusAI! 🚀
> The team at Patronus has created Lynx, a custom, state-of-the-art model for Hallucination Detection!
> On the Weaviate side of the coin, we have engineered the Query Agent to "cite its sources".
> This recipe illustrates how you can connect the `sources` response from the Query Agent to Patronus' Lynx evaluator!
The recipe is linked below, I hope this inspires your trust in responses from the Query Agent!
I also really hope you will check out Patronus AI, incredible team! 🔥
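The general shape of "feed the cited sources to a hallucination evaluator" might look roughly like the sketch below. This is not the linked recipe and uses neither the Weaviate nor the Patronus client: `Source`, `check_answer`, and `overlap_evaluator` are hypothetical names, and the evaluator is a toy lexical-overlap stand-in where a real setup would call a model such as Lynx.

```python
import re
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Source:
    """Hypothetical stand-in for one cited source returned alongside an answer."""
    object_id: str
    text: str


# An evaluator takes (question, context, answer) and returns True if the answer is supported.
Evaluator = Callable[[str, str, str], bool]


def overlap_evaluator(question: str, context: str, answer: str, threshold: float = 0.6) -> bool:
    """Toy stand-in for a hallucination detector (NOT Lynx).

    Passes when enough of the answer's longer words also appear in the context.
    """
    context_tokens = set(re.findall(r"[a-z0-9]+", context.lower()))
    answer_tokens = [t for t in re.findall(r"[a-z0-9]+", answer.lower()) if len(t) > 3]
    if not answer_tokens:
        return True
    supported = sum(1 for t in answer_tokens if t in context_tokens)
    return supported / len(answer_tokens) >= threshold


def check_answer(question: str, answer: str, sources: List[Source], evaluator: Evaluator) -> bool:
    """Join the cited sources into one context string and hand it to the evaluator."""
    context = "\n\n".join(s.text for s in sources)
    return evaluator(question, context, answer)


# Made-up example data.
sources = [Source("doc-1", "The support desk is open from 9 AM to 5 PM on weekdays.")]
question = "When is the support desk open?"

grounded = check_answer(question, "The support desk is open on weekdays.", sources, overlap_evaluator)
ungrounded = check_answer(question, "The support desk is open every Sunday night.", sources, overlap_evaluator)
```

The point of the design is that the evaluator only ever sees what the agent actually cited, so an answer that drifts from its sources gets flagged.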
1/ Introducing Glider - the smallest model to beat GPT-4o-mini on eval tasks ⚡🚀
- Open source, open weights, open code
- Explainable evaluations by nature
- Trained on 183 criteria and 685 domains
Try it out for free at https://t.co/ZZai84VulJ 🔥
@__tinygrad__ I would highly recommend @LevyOperations
They fit in your budget, and they are experts in getting the work done so you can focus on more important tasks.
Join the industry experts in a #virtual panel discussion on “Stress Test Your P&L & Managing Liquidity”. Register now: https://t.co/j0dLftTqak
Live at 11 AM, April 6th, 2020
In these times of crisis let’s get together and support each other!
5. This is how Jio paved the way for Reliance 2.0: by getting the biggest market-share holders, the kirana stores, online to sell, and the biggest chunk of the Indian population, the rural population, online to buy.
What will be the secret sauce for Reliance Retail (Reliance 2.0)? How big of a role will Jio play? Was Jio the foundation of a bigger picture that only Mukesh Ambani could see? #jiodhandhanadhan #reliance2.0 #relianceretail #MukeshAmbani
4. Now that every small retailer has uninterrupted internet access, they can use advanced software to optimise their work and increase revenue by selling online. The rural population can now buy everything online, too.