Khaled Saab

Multimodal AMIE now published @NatureMedicine! This is from past work @GoogleDeepMind where we studied patients uploading images during diagnostic dialogue. We found that a multimodal reasoning harness that tracks a patient’s state greatly improves history taking and clinical accuracy. We also surpassed doctors across many evaluation axes in diverse primary care settings. https://t.co/kn71IF9cNN

Khaled Saab

@_khaledsaab

about 1 month ago

Audio-video can be a powerful and more accessible interface for clinical AI. Congrats @RyutaroTanno and team!

Google DeepMind @GoogleDeepMind

about 1 month ago

AI co-clinician is our new research initiative to help explore how multimodal agents could better support healthcare workers and patients. 🩺 Here’s a snapshot of our progress 🧵

227

566

361K

817

Who to follow

Tri Dao

@tri_dao

Asst. Prof @PrincetonCS, Chief Scientist @togethercompute. Machine learning & systems.

Albert Gu

@_albertgu

assistant prof @mldcmu. chief scientist @cartesia_ai. leading the ssm revolution.

Eric Nguyen

@exnx

Co-founder @ Radical Numerics PhD @ stanford @HazyResearch

_khaledsaab retweeted

Shiv Rao, MD

@ShivdevRao

about 1 month ago

GPT-5.5 early access results have been impressive. 25% lift in clinical quality. 30% less verbose. We orchestrate hundreds of AI tasks, some powered by our proprietary data flywheels, others by frontier models like GPT-5.5. We test rigorously for clinical accuracy, completeness, reasoning, real-world because benchmarking is a big deal in healthcare. Grateful to the OpenAI team for the early access and partnership.

ShivdevRao's tweet photo. GPT-5.5 early access results have been impressive.

25% lift in clinical quality.
30% less verbose.

We orchestrate hundreds of AI tasks, some powered by our proprietary data flywheels, others by frontier models like GPT-5.5.

We test rigorously for clinical accuracy, completeness, reasoning, real-world because benchmarking is a big deal in healthcare.

Grateful to the OpenAI team for the early access and partnership.

Khaled Saab

@_khaledsaab

about 1 month ago

@cyrilzakka Congrats!!

331

Khaled Saab

@_khaledsaab

about 1 month ago

Doctors are increasingly using ChatGPT to supercharge their workflows in delivering care. To support this high impact domain, we take steps to make ChatGPT more accessible, accurate, and measurable for clinical work. https://t.co/L6onbKavNT

Karan Singhal

@thekaransinghal

about 1 month ago

Today we’re introducing two big steps for health at OpenAI: - ChatGPT for Clinicians, a free version of ChatGPT designed for clinical work - HealthBench Professional, a new benchmark to evaluate real clinician chat tasks We’re excited about what this can unlock for care. ❤️

thekaransinghal's tweet photo. Today we’re introducing two big steps for health at OpenAI:

- ChatGPT for Clinicians, a free version of ChatGPT designed for clinical work
- HealthBench Professional, a new benchmark to evaluate real clinician chat tasks

We’re excited about what this can unlock for care. ❤️ https://t.co/FeBWhHQPiw

263

561

_khaledsaab retweeted

Karan Singhal

@thekaransinghal

3 months ago

https://t.co/sbZncI4gF9

304

391

65K

_khaledsaab retweeted

Tibo

@thsottiaux

3 months ago

Codex at 2M+ active users up 25% week over week... and that was before we launched the app on Windows and GPT-5.4!

80K

_khaledsaab retweeted

OpenAI

@OpenAI

3 months ago

Yesterday we reached an agreement with the Department of War for deploying advanced AI systems in classified environments, which we requested they make available to all AI companies. We think our deployment has more guardrails than any previous agreement for classified AI deployments, including Anthropic's. Here's why: https://t.co/k1Ge2MqqPr

583

_khaledsaab retweeted

Boaz Barak @boazbaraktcs

3 months ago

There is this narrative that up until this week, Anthropic had this wonderful contract that prevented the U.S. government from doing mass domestic surveillance or autonomous lethal weapons, and now all hell will break lose. As I wrote, I am not a fan of accelerating AI specifically in the national security space. If I had been an Anthropic employee at the time they signed their original deal with the DoW, I would have probably opposed it, especially given the reduced control since they worked through Palantir. And I don't think having some terms of use in the contract is what we can rely on to protect us. I believe the drama of the last week about these terms of use is more about politics than substance. The substance is about the details, which I hope more of which will come out soon. But it is wrong to present the OAI contract as if it is the same deal than Anthropic rejected, or even as if it is less protective of the red lines than the deal Anthropic already had in place before. Obviously I don't know all details of what Anthropic had before, but based on what I know, it is quite likely that the contract OAI signed gives *more* guarantees of no usage of models for mass domestic surveillance or autonomous lethal weapons than Anthropic ever had.

239

122K

_khaledsaab retweeted

Nathan Labenz

@labenz

3 months ago

230M people use ChatGPT for health & wellness questions every week. 📈 A recent RCT showed that it improves patient outcomes. ⚕️ And soon... it will be FREE for ALL 👏 (with no ads!) - @thekaransinghal, @OpenAI Health Lead, on "Universal Medical Intelligence"

Khaled Saab

@_khaledsaab

3 months ago

@flowersslop If she finds a case please share it with me

Khaled Saab

@_khaledsaab

4 months ago

@SRSchmidgall @taotu831 It’s near instant but also a prerequisite that has near term impact in health AI. What are your thoughts on how medical AGI evals should evolve? Cc @taotu831

213

Khaled Saab

@_khaledsaab

4 months ago

We’re at an inflection point of AGI evaluation where verification gets much harder. We started with multiple choice (instant verification) to research-grade problems (multi-day by a few experts). And soon it will be multi-month and multi-year (e.g., tape-out, clinical trials).

Sam Altman

@sama

4 months ago

We went from AI systems that struggled to do grade school math to AI systems that can solve research-level math problems in just a few years. I agree with Jakub this is perhaps the most important eval now. I am also pretty sure the main reaction will be "it's not that hard" :)

324

685

710

_khaledsaab retweeted

Jakub Pachocki

@merettm

4 months ago

Very excited about the "First Proof" challenge. I believe novel frontier research is perhaps the most important way to evaluate capabilities of the next generation of AI models. We have run our internal model with limited human supervision on the ten proposed problems. The problems require expertise in their respective domains and are not easy to verify; based on feedback from experts, we believe at least six solutions (2, 4, 5, 6, 9, 10) have a high chance of being correct, and some further ones look promising. We will only publish the solution attempts after midnight (PT), per the authors' guidance - the sha256 hash of the PDF is d74f090af16fc8a19debf4c1fec11c0975be7d612bd5ae43c24ca939cd272b1a . This was a side-sprint executed in a week mostly by querying one of the models we're currently training; as such, the methodology we employed leaves a lot to be desired. We didn't provide proof ideas or mathematical suggestions to the model during this evaluation; for some solutions, we asked the model to expand upon some proofs, per expert feedback. We also manually facilitated a back-and-forth between this model and ChatGPT for verification, formatting and style. For some problems, we present the best of a few attempts according to human judgement. We are looking forward to more controlled evaluations in the next round! https://t.co/jtLCOhJftv #1stProof

244

348

761

Khaled Saab

@_khaledsaab

4 months ago

Codex team is the best example of product research co-design I’ve witnessed.

Sam Altman

@sama

4 months ago

From how the team operates, I always thought Codex would eventually win. But I am pleasantly surprised to see it happening so quickly. Thank you to all the builders; you inspire us to work even harder.

271

627

_khaledsaab retweeted

Vivek Natarajan

@vivnat

4 months ago

Scientific discovery and clinical medicine are often treated as distinct phases. But for patients with rare, complex, and undiagnosed diseases, this separation is a luxury they cannot afford. The timeline from understanding a genetic mechanism to accessing subspecialist care is often too long and too fragmented. Two new @GoogleDeepMind @GoogleResearch collaborations with @StanfordMed , published in Advanced Science and @NatureMedicine respectively last week, demonstrate how AI can bridge this gap. 1. Accelerating discovery (the science) In Advanced Science, we present one of the first wet-lab validated examples of AI-assisted genetic discovery. Our AI identified a novel genetic factor for hearing loss (Crym) in mice, which Dr Gary Peltz and team validated using CRISPR knock-in experiments to restore the wild-type gene and rescue the phenotype. We applied this agentic AI scaffold to human patients with complex, undiagnosed conditions in a retrospective manner. The system analyzed genomic data for rare diseases, such as IRAK4 deficiency and ODC1 mutations, successfully identifying causative variants that matched expert clinical assessments. 2. Scaling expertise (the medicine) Discovery is only the first step; patients then need access to specialized care. As we note in our Nature Medicine paper, hypertrophic cardiomyopathy (HCM) is a leading cause of sudden cardiac death, yet ~60% of patients remain undiagnosed due to a lack of specialist centers . In our RCT (one of the first of its kind) using our research AI system AMIE, we showed AI could help bridge this gap. General cardiologists using AMIE reported the system helped their assessments in 57.0% of cases, missed no clinically significant findings in 93.5% of cases and reduced assessment time in 50.5% of cases. This suggests the AI can act as a helpful co-pilot and help generalists bridge the gap to specialists. Worth noting that these studies used models like Med-PaLM 2, Gemini 2.0 Flash, and Gemini 2.5 Pro with simple agentic scaffolds. The potential for Gemini 3 and AI co-scientist to accelerate both the biology of discovery and the delivery of care is profound and we will share more soon. Its a true privilege to collaborate with @euanashley , Jack W O'Sullivan, Dr Gary Peltz and their teams at Stanford Medicine. With incredible team mates at Google including @taotu831 @apalepu13 , @alan_karthi , @Mysiak and many more. Advanced Science paper - https://t.co/lkj0soEGuj Nature Medicine paper - https://t.co/G38tExCakN AI co-scientist blog - https://t.co/Br5JegvcF6 AMIE blog - https://t.co/iLErUQRsSW

vivnat's tweet photo. Scientific discovery and clinical medicine are often treated as distinct phases. But for patients with rare, complex, and undiagnosed diseases, this separation is a luxury they cannot afford. The timeline from understanding a genetic mechanism to accessing subspecialist care is often too long and too fragmented.

Two new @GoogleDeepMind @GoogleResearch collaborations with @StanfordMed , published in Advanced Science and @NatureMedicine respectively last week, demonstrate how AI can bridge this gap.

1. Accelerating discovery (the science)
In Advanced Science, we present one of the first wet-lab validated examples of AI-assisted genetic discovery.

Our AI identified a novel genetic factor for hearing loss (Crym) in mice, which Dr Gary Peltz and team validated using CRISPR knock-in experiments to restore the wild-type gene and rescue the phenotype.

We applied this agentic AI scaffold to human patients with complex, undiagnosed conditions in a retrospective manner. The system analyzed genomic data for rare diseases, such as IRAK4 deficiency and ODC1 mutations, successfully identifying causative variants that matched expert clinical assessments.

2. Scaling expertise (the medicine)
Discovery is only the first step; patients then need access to specialized care.

As we note in our Nature Medicine paper, hypertrophic cardiomyopathy (HCM) is a leading cause of sudden cardiac death, yet ~60% of patients remain undiagnosed due to a lack of specialist centers .

In our RCT (one of the first of its kind) using our research AI system AMIE, we showed AI could help bridge this gap. General cardiologists using AMIE reported the system helped their assessments in 57.0% of cases, missed no clinically significant findings in 93.5% of cases and reduced assessment time in 50.5% of cases. This suggests the AI can act as a helpful co-pilot and help generalists bridge the gap to specialists.

Worth noting that these studies used models like Med-PaLM 2, Gemini 2.0 Flash, and Gemini 2.5 Pro with simple agentic scaffolds.

The potential for Gemini 3 and AI co-scientist to accelerate both the biology of discovery and the delivery of care is profound and we will share more soon.

Its a true privilege to collaborate with @euanashley , Jack W O'Sullivan, Dr Gary Peltz and their teams at Stanford Medicine.

With incredible team mates at Google including @taotu831 @apalepu13 , @alan_karthi , @Mysiak and many more.

Advanced Science paper - https://t.co/lkj0soEGuj

Nature Medicine paper - https://t.co/G38tExCakN

AI co-scientist blog - https://t.co/Br5JegvcF6

AMIE blog - https://t.co/iLErUQRsSW

_khaledsaab retweeted

Jerry Tworek

@MillionInt

4 months ago

Run fewer experiments and think about them more

560

51K

_khaledsaab retweeted

Tao Tu @taotu831

4 months ago

Excited to share our latest research published today in @NatureMedicine, demonstrating how Large Language Models (LLMs) can help bridge the critical shortage of subspecialist medical expertise. https://t.co/5OE5tYw3dE

Khaled Saab

@_khaledsaab

Who to follow

Last Seen Users on Sotwe

Trends for you

Most Popular Users