It's also what lets us keep Evidence free for every clinician, not just the systems that can afford it. The same safe, trusted answers, wherever they are.
This is step one. Out-of-session today, in-session next.
Full write-up: https://t.co/HbwmMMhbgy
Credit to the Heidi models team. Six weeks to parity is no small feat.
There’s been debate in the last couple days about whether general models beat specialized medical AI. It's the wrong question. This is an argument about how to measure.
You don't need frontier scale to reach frontier quality. Six weeks ago we matched the best frontier model in Heidi Evidence with a model of our own, a fraction of the size.
Here's how. 🧵
Owning the model matters more the closer we get to real clinical care. A medical device has to behave inside known error bars, every time, right down to the inference. You can't outsource that to someone else's API and still stand behind it. Owning the model is how we make that promise.
Introducing Agora-1, a world model that's learned to simulate multi-agent experiences. It's so fun.
Today we're launching a playable research preview, where you can relive your childhood and enjoy a multiplayer simulation of GoldenEye.
So excited about this new capability!
@BoWang87 This is incredible. If you guys ever want to do this at scale with our own hardware you should join Heidi. We have a Toronto team. Email me [email protected]
The study design caused these results. “AI scribes” /= Abridge, DAX and Ambience. If you study identical products you’ll get disappointing results. All of them are super embedded with no native interfaces outside the EHR, so it’s not a shock they’re demonstrating modest results for the few clinicians that adopt them.
We need new interfaces, integrate of course but we can’t be buried in legacy systems. I’ve been dogmatic on this point and it will play out, mark my words.
Medicine is no different to engineering, agent-first medicine is coming, it’s just taking everyone longer to figure it out. These products probably won’t exist long enough to build it.
#AI scribe adoption across 5 academic centers was associated with modest reductions in EHR and documentation time, plus a slight increase in weekly visit volume, especially for primary care and female clinicians.
https://t.co/S2TNIOmhyO
We need some sort of clinical olympics for AI teams. Every 4 months.
If I see another tech bro say we beat the USMLEs, I might have to start them.
Wdyt @ambossmed@doximity@EvidenceOpen@tryheidi@MedwiseAI anyone can join, medical establishment sets the gauntlet.
@elonmusk Is it in the base model or with search?
We do millions of health queries a week, we’d swap if it’s in the base model. Search optimisation doesn’t count it’s cheating haha.