Reserach scientists at Google just tested an AI symptom checker on 14,000 real patients over 9 months via Fitbit.
In blinded evaluation, clinicians ranked the AI diagnosis as #1 in 53% of cases. Independent physicians: 24%.
But the real finding isn't "AI beats doctors.", but when users just type their symptoms and get an answer (the default mode of every consumer LLM right now), diagnostic accuracy drops ~27% compared to a structured AI-led interview.
ChatGPT, Claude, Gemini, none of them systematically interview users about their symptoms. They just respond. This study shows that's a measurable failure mode.
And then there's the second breakthrough: Fitbit data showed physiological shifts DAYS before users reported symptoms. Heart rate up, sleep disrupted, steps down, all visible before patients even opened the app.
Conversational AI that asks the right questions + wearable sensors that detect illness before you feel it. That's the exciting find here.