Very happy (and relieved) to see our work on multimodal conversational medical AI accepted in @NatureMedicine
https://t.co/HgaVwaBxKx
In the published version, we have substantially expanded on the analysis and evaluation. Kudos to @_cjpark @timstro @JanFreyberg @_khaledsaab
This work also formed an important precursor to our more recent work, where we explored a similar problem but in real-time interaction: https://t.co/LjKswtYFrN
Both modes of UX (synchronous and asynchronous) are useful but in different ways.
Also a nice reminder that prospective evaluation remains important future work.
Soonly article on electric fences hit @hackernews front page.
This comment destroyed me, a 9-year-old's advice about reconnecting after 20 years: "Maybe you should just write to her"
@AdamMGrant this echoes the research on dormant ties. Kids get it right!
HN Discussion: https://t.co/ush3WiWpR1
✨New study from our team @GoogleDeepMind @GoogleAI - AMIE goes Multimodal✨
Our research conversational diagnostic AI now fluently considers photos and test results. In a randomized OSCE study, AMIE outperformed PCPs in simulated consultations in which patients uploaded photos of skin concerns, ECG tracings or lab tests. Medical dialogue can hinge critically on multimodal tests like these, so AI systems need to expertly reason about this complex information during a diagnostic conversation. 👀More here: https://t.co/UglrGlueSY (1/n)
Gemini powers our multimodal health research! 💙
In our new paper on multimodal AMIE, we're pushing conversational diagnostic AI beyond text to handle images such as skin photos, ECGs, and clinical docs, which provide crucial context in healthcare.
Blog: https://t.co/VAlKoR53Il
Paper: https://t.co/2zHQT0H5Pv
How do we make an AI reason like a clinician during a dynamic, multimodal conversation? One of our key contributions is multimodal state-aware reasoning, built on @GoogleDeepMind Gemini 2.0 Flash.
Instead of just reacting turn-by-turn, AMIE maintains an internal "understanding" of the consultation:
✅ What is known about the patient?
✅ What are the likely diagnoses?
✅ What information (text or visual) is missing?
This internal state allows AMIE to:
👉 Intelligently guide the conversation through phases like history-taking & diagnosis.
👉 Strategically ask for relevant images (like skin photos or screenshots of ECGs/docs) when its internal state shows uncertainty.
👉 Accurately interpret multimodal data and weave the findings back into the ongoing dialogue and diagnostic process.
Essentially, it mimics the adaptive reasoning clinicians use, leading to a more structured and effective consultation.
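To make the idea concrete, here is a minimal Python sketch of a state-driven turn policy. It is not AMIE's implementation: `ConsultState`, `next_action`, and the 0.7 confidence threshold are hypothetical names and values chosen for illustration, and hand-written rules stand in for what the real system does with Gemini.

```python
from dataclasses import dataclass, field


@dataclass
class ConsultState:
    """Hypothetical internal state, mirroring the three questions above."""
    known_facts: list = field(default_factory=list)    # what is known about the patient
    differential: dict = field(default_factory=dict)   # diagnosis -> rough confidence in [0, 1]
    missing_info: list = field(default_factory=list)   # (kind, description), kind is "text" or "image"


def next_action(state, confidence_threshold=0.7):
    """Pick the next conversational move from the state (illustrative rules only)."""
    if state.differential:
        top_dx, top_conf = max(state.differential.items(), key=lambda kv: kv[1])
        if top_conf >= confidence_threshold:
            # Confident enough: move from history-taking to the diagnosis phase.
            return ("give_diagnosis", top_dx)

    # Still uncertain: ask for a missing image first, if one is outstanding.
    for kind, description in state.missing_info:
        if kind == "image":
            return ("request_image", description)

    # Otherwise ask for the first missing piece of text information.
    if state.missing_info:
        return ("ask_question", state.missing_info[0][1])

    # Nothing left to gather: summarise whatever differential exists.
    return ("give_diagnosis", None)


state = ConsultState(
    known_facts=["itchy rash on forearm"],
    differential={"eczema": 0.4, "psoriasis": 0.35},
    missing_info=[("text", "duration of symptoms"), ("image", "photo of the rash")],
)
print(next_action(state))
```

In this toy version the state is updated by hand between turns; the point is only that the next move is chosen from the accumulated state rather than from the last message alone.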
We evaluated multimodal AMIE against primary care physicians (PCPs) in a demanding, blinded OSCE study using 105 diverse multimodal scenarios.
The results demonstrate clear progress: AMIE achieved similar or superior performance when compared to PCPs across a wide range of metrics, including diagnostic accuracy, empathy, and, critically, the handling of and reasoning about multimodal data.
While the OSCE results are very promising, it's important to remember this was a test environment with patient actors! Real-world care is more complex. Making sure it's safe, reliable, and actually helpful in the real world needs more work, starting with our upcoming study with Harvard BIDMC.
The work would not have been possible without an amazing team @GoogleAI, @GoogleDeepMind: @RyutaroTanno, @alan_karthi, @vivnat, @AdamRodmanMD, @timstro, @taotu831, @hardyshakerman, @JanFreyberg, @_cjpark, @yasharmaa, @apalepu13, @arkitus, @weballergy, @valentinlievin, @ckbjimmy, @davidstutz92, @dgtbarrett, @yongcheng16, @SaraM66905, @dr2w, @ymatias
Sharing progress: Our research AI agent, AMIE, now interprets visual medical information (images, test results) within diagnostic conversations.
We introduce a multimodal state-aware reasoning framework, built on @GoogleDeepMind's Gemini models, that aims to better handle complex clinical information.
In simulated clinical evaluations (OSCEs), AMIE met or exceeded human physicians on a broad range of benchmarks, including visual reasoning, diagnostic accuracy, management reasoning, and empathy.
Crucially, these results are from a controlled simulation using patient actors (see paper for full limitations). Proving safety, reliability, and utility requires rigorous testing in real-world settings. Our upcoming study with Harvard BIDMC is the first step in that essential validation.
Blog: https://t.co/MSodxG64aZ
Paper: https://t.co/bqALRUsaNs
A foundational step by a dedicated team.
@GoogleAI, @GoogleDeepMind: @RyutaroTanno, @alan_karthi, @vivnat, @AdamRodmanMD, @KhaledSaab11, @taotu831, @hardyshakerman, @JanFreyberg, @_cjpark, @yasharmaa, @apalepu13, @arkitus, @weballergy, @valentinlievin, @ckbjimmy, @davidstutz92, @dgtbarrett, @yongcheng16, @SaraM66905, @dr2w, @ymatias
In 2018, I started calling one friend a day.
What followed was 24 months of social exhaustion.
Result? Broke my scroll addiction and built real connections.
Here's my hard-won playbook:
[1/5]
How to build deep connections one pebble at a time:
1. Notice small gestures
2. Share meaningful moments
3. Be consistent
4. Build on foundations
We're showing people how to transform relationships right now. Want to join? Learn the Penguin Pebbling Principle:
A window into a micro-mobility future:
Living in Santiago, Chile at the moment and I've seen more professionals commuting via electric scooter (often personally owned) than any other place I've visited.
Could Arizona follow in these footsteps?
@ryanmjohnson @culdesac