Chris Kelly

@chrisck

Health at Microsoft AI. Previously @GoogleHealth, @GoogleDeepMind, @KingsImaging. Paediatric doctor at @EvelinaLondon. Created

London, UK

Joined March 2009

955 Following

2.2K Followers

1.2K Posts

chrisck retweeted

elie

@eliebakouch

3 days ago

microsoft MAI tech report is a gold mine, one of the most transparent for a model at this scale.

this model uses zero synthetic data or distillation from previous models. this means reasoning, agentic behavior, tool use are all learned fully during post-training with no cold start. bold choice that makes it harder and requires more iterations to reach sota, but you get FULL control over your model series and it proves they are serious about being a frontier lab.

the tech report is insanely detailed and precise about numbers. to give an example, they give the exact MFU across all the iterations of the model, with the exact changes etc. they also share the full scaling ladder recipe, to my knowledge this is the first time i've seen this in a tech report at this scale

let's look at all of this in this likely very long thread 🧵

261

264K

Chris Kelly

@chrisck

about 2 months ago

@Geiger_Capital From trainers to training AI... :)

492

Chris Kelly

@chrisck

3 months ago

We've all long imagined a health companion that truly knows you - not a chatbot that gives generic advice, but something that accumulates understanding of your health over time, and gets smarter the more you use it.

Today we're launching Copilot Health: a secure, separate space within Copilot where you can bring together your health records from 50,000+ US provider orgs, data from 50+ wearable devices, lab results, and health history all into one place. Copilot Health then applies medical intelligence to make sense of it all, giving you real agency over your own health. Your data, understood in context, patterns surfaced that you might not spot alone, alongside the confidence to walk into your doctor's appointment more informed, prepared, and empowered.

We are at a unique inflection point in history - AI is reaching extraordinary capabilities in medicine, while health data is finally being mobilised...and almost everyone has a phone in their pocket (or knows someone with one).

We're starting in the US, but ultimately want to bring medical expertise to the billions of people who have never had access to it.

Read more here (https://t.co/Kcoif1sIT1) and you can sign up to the waitlist to help shape what comes next!

186

Chris Kelly

@chrisck

3 months ago

Really excited 🥳 to share two breast cancer AI papers from my time at Google, published jointly in Nature Cancer today!

We set out in 2021 to answer a question that matters to millions of women: can AI safely improve breast cancer screening in the NHS? Five years, five organisations, and 125,000+ women’s scans later, here's what we found.

1️⃣ Our first paper (https://t.co/zIDDCGIF1x) evaluated Google's mammography AI across five NHS screening services, with 39-month follow-up including interval + next-round cancers:

→ AI achieved superior sensitivity to human readers (54% vs 44%, P<0.001) with non-inferior specificity
→ 25% of future interval + next-round cancers detected = potential for earlier diagnosis
→ Reading time reduced by 32% while cancer detection increased by 18%
→ No systematic disparities across age, ethnicity, deprivation, or breast density
→ Prospective deployment at 12 sites confirmed feasibility but revealed distribution shift requiring recalibration - a critical lesson for implementation

2️⃣ Our second paper (https://t.co/zIDDCGIF1x) tackled what happens when AI becomes the second reader. When readers disagree today, a specialist panel "arbitrates". We studied 50,000 women's screens with 22 readers, with and without AI as the second reader:

→ End-to-end including arbitration, our AI-enabled arm was non-inferior to standard double reading (P<0.001)
→ Human reading workload reduced by 46%
→ AI flagged far more interval + next-round cancers before arbitration, but many were overruled, even when the AI correctly localised the cancer
→ Future: better explainability, prior image integration, reader training, and new pathways to maximise AI success (e.g. supplemental imaging for high risk normal cases)

An editorial from Allan Hackshaw and Rosalind Given-Wilson (https://t.co/Z8u2dlzGJT) covers this work really well - thank you!

Conclusion: The AI works, and it can find cancers earlier. But how we integrate it into clinical workflows will determine whether that potential translates into better outcomes for women.

This collaboration between @GoogleResearch, @imperialcollege, @RoyalSurrey, @stgeorgeshospital, St George's University Hospitals NHS Foundation Trust, and Imperial College Healthcare NHS Trust was funded by the NHS AI Award. We are deeply grateful to everyone involved. Thank you to @skourti_elena at Nature Cancer.

Congratulations Lucy Warren, Marc Wilson, Jenny Venton, Ken Young, Mark Halling-Brown, Megumi Morigami, Lisanne Khoo, Deborah Cunningham, Richard Sidebottom, Reddy Mamatha, Hema Purushothaman, Delara Khodabakhshi, Lesley Honeyfield, Amandeep Hujan, Tsvetina Stoycheva, Andy Joiner, Reena Chopra, Aminata Sy, Dominic Ward, Lin Yang, Rory Sayres, Daniel Golden, Namrata Malhotra, Rachita Mallya, Lihong Xi, Della Ogunleye, Charlotte Purdy, Alistair Mackenzie, Jane Chang, Jonathan Dixon, Elzbieta Gruzewska, Emma Lewis, Marcin Sieniek, Shawn Xu, @DrSusanThomas, @shravyas, @fjg28_fiona, @Ara_Darzi, Hutan Ashrafian 🎉

chrisck retweeted

Nathan Benaich

@nathanbenaich

8 months ago

🪩The one and only @stateofai 2025 is live! 🪩 It’s been a monumental 12 months for AI. Our 8th annual report is the most comprehensive it's ever been, covering what you *need* to know about research, industry, politics, safety and our new usage data. My highlight reel:

319

507K

Chris Kelly

@chrisck

9 months ago

@AndrewLBeam Congrats Andrew! You’ll have to see if you can reserve a little space for neonatal discoveries while you’re there :)

Chris Kelly

@chrisck

9 months ago

@anothercohen This is absolutely genius! The swinging arms, the hands in pockets, everything - I love it! Best of luck with everything :)

Chris Kelly

@chrisck

11 months ago

Thanks to our great team at Microsoft AI - @HarshaNori, @mayankdaswani, @chrisck, @scottlundberg, @marcotcr, Marc Wilson, @DrXiaoLiu, @viknesh_s, @jcarlsonmsr, @mattlungrenMD, @baygross, @phames, @mustafasuleyman, @Dominic1King, @erichorvitz

274

Chris Kelly

@chrisck

11 months ago

New paper today! 🥳 How good is generative AI at diagnosis compared to human doctors? We introduce a novel, interactive medical benchmark (SDBench) for “sequential diagnosis”, and an orchestrator (MAI-DxO) that achieved over 4x higher diagnostic accuracy vs experienced physicians who played the benchmark. 🧵

Chris Kelly

@chrisck

11 months ago

Preprint: https://t.co/fuiS8INrG1
Blog: https://t.co/EAKKPHORVX

PS we plan to release the full benchmark soon. Stay tuned. Subscribe to this thread for updates! 👀

304

chrisck retweeted

Nicole Chiou (she/her)

@nicole_chiou

about 1 year ago

Our recent journal article compares the effectiveness of objective vs. subjective labels for AI-based detection of fetal hypoxia from CTGs. Key takeaway? Objective cord pH labels demonstrate greater robustness to temporal shifts. 🔗 Read more: https://t.co/oLF0AgPCLQ #MedicalAI

315

chrisck retweeted

Eric Topol

@EricTopol

over 1 year ago

The largest medical #AI randomized controlled trial yet performed, enrolling >100,000 women undergoing mammography screening, was published today @LancetDigitalH. The use of A.I. led to 29% higher detection of cancer, no increase in false positives, and reduced workload compared with radiologists without A.I. https://t.co/GGrLqdG5yq

907

982K