Are open source pathology models ready for the clinic? Despite significant progress - not quite yet! These models show increased accuracy on retrospective datasets but struggle to deal with real-world variation as outlined on the @arteraAI tech blog: https://t.co/1Py9Y97Sv2
ArteraAI Announced as the First-and-Only Predictive Test for Therapy Personalization in the 2024 NCCN Guidelines for Prostate Cancer
NCCN guidelines are widely considered the gold standard for clinical practice in oncology.
https://t.co/QO0WXsQbLA
My final PhD chapter on improving seizure detection with @HazyResearch and @rubinqilab was just published @npjDigitalMed.
TL;DR We found that scaling two dimensions of model supervision, (1) coverage of training data and (2) granularity of class labels, has a large impact on model performance AND subgroup robustness for seizure detection. The best part? We do it using freely available notes produced in routine clinical workflows!
Clinical importance of building reliable seizure detection tools.
Detecting seizures, their types, duration, etc. is a critical healthcare task in diagnosing and managing epilepsy. The best way to do that is EEG analysis (reading brain recordings). But EEG analysis is a BIG pain! These recordings can be very long (hours-days per patient) and demand a scarce resource: deep neurologic-epileptologic expertise. So we have a strong need to develop reliable tools to help clinicians analyze EEG more efficiently.
Why aren’t existing models widely used?
A big reason is trust. ML models often fool us on aggregate metrics, where they show expert-level performance on average and then, whoops, they turn out to rely on non-causal features and do a lot worse for certain subgroups. We can’t have that for high-stakes healthcare settings. Another major reason is a high false-alarm rate on abnormalities that may look like seizures, which leads to alarm fatigue.
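To make the aggregate-vs-subgroup point concrete, here is a minimal sketch in Python. The data is a made-up toy example (not from the paper), and the function names `recall` and `recall_by_subgroup` and the group labels are hypothetical. It only illustrates how a decent overall number can hide a subgroup where the model misses every seizure.

```python
from collections import defaultdict


def recall(y_true, y_pred):
    """Fraction of true seizures (label 1) that the model flagged."""
    hits = [p == 1 for t, p in zip(y_true, y_pred) if t == 1]
    if not hits:
        return float("nan")  # no positives in this slice
    return sum(hits) / len(hits)


def recall_by_subgroup(y_true, y_pred, groups):
    """Compute recall separately for each subgroup label."""
    buckets = defaultdict(lambda: ([], []))
    for t, p, g in zip(y_true, y_pred, groups):
        buckets[g][0].append(t)
        buckets[g][1].append(p)
    return {g: recall(ts, ps) for g, (ts, ps) in buckets.items()}


# Toy data (assumption): 10 true seizures, 8 in adults and 2 in children.
y_true = [1] * 10
y_pred = [1] * 8 + [0] * 2
groups = ["adult"] * 8 + ["pediatric"] * 2

overall = recall(y_true, y_pred)
per_group = recall_by_subgroup(y_true, y_pred, groups)
print(overall)    # aggregate recall looks fine
print(per_group)  # but one subgroup is missed entirely
```

Reporting the per-group dictionary next to the overall number is one simple way to surface this kind of gap before deployment.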
Workflow notes: a hidden goldmine for supervision.
Standard seizure detection models rely on manual labels from experts, but this approach is too expensive to scale. Luckily, routine clinical EEG monitoring leaves a trail of helpful annotations from techs, fellows, & docs. These workflow notes provide an opportunity to freely scale supervision for seizure detection models.
Scaling coverage is not enough.
Using workflow notes, we scaled our training data to include ~70k hours of EEG from ~12k patients. While this gave us impressive overall performance, we found significant performance gaps among certain patient age groups and seizure subtypes. We also found many false positives on non-epileptic abnormalities.
Scaling granularity of class labels is also needed.
Since workflow notes also include events beyond seizures (e.g., spikes, patient movement), we trained a multilabel model to predict 26 classes (including seizure). The intuition here is that increasing class granularity teaches the model to differentiate between seizures and other non-seizure abnormalities, lowering false positives. We found that our multilabel model improved overall performance, and importantly, had no significant performance gaps among subgroups.
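As a rough illustration of what "multilabel" means here, the sketch below turns a set of workflow-note events into a multi-hot target and scores it with a per-class binary cross-entropy. This is an assumption for illustration only: the thread does not say which loss was used, the three class names stand in for the 26 real classes, and `multi_hot` and `multilabel_bce` are hypothetical helpers.

```python
import math

# Hypothetical stand-ins for 3 of the 26 classes mentioned above.
CLASSES = ["seizure", "spike", "patient_movement"]


def multi_hot(events, classes=CLASSES):
    """Encode the events noted for one EEG clip as a 0/1 vector."""
    present = set(events)
    return [1.0 if c in present else 0.0 for c in classes]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def multilabel_bce(logits, targets, eps=1e-12):
    """Mean binary cross-entropy, one independent sigmoid per class."""
    total = 0.0
    for z, t in zip(logits, targets):
        p = sigmoid(z)
        total -= t * math.log(p + eps) + (1.0 - t) * math.log(1.0 - p + eps)
    return total / len(targets)


# A clip annotated with a spike and patient movement, but no seizure.
targets = multi_hot(["spike", "patient_movement"])

# Logits that call the non-seizure events correctly vs. logits that also flag a seizure.
loss_correct = multilabel_bce([-4.0, 4.0, 4.0], targets)
loss_false_alarm = multilabel_bce([4.0, 4.0, 4.0], targets)
print(targets, loss_correct, loss_false_alarm)
```

Under this setup, a false seizure alarm on a spike-only clip is penalized directly, which matches the intuition that naming the look-alike events gives the model a reason to tell them apart from seizures.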
Concluding thoughts on supervision.
It’s amazing how supervision has such a large impact on model reliability. Since supervision in healthcare is scarce, we should always keep an eye out for how we can leverage existing routine workflows to supervise our models – I am sure many more exist that we aren’t taking advantage of yet!
—
This work was done alongside amazing collaborators @SiyiTang_, Mohamed Taha, and @ChrisLeeMesser
It was also inspired by earlier explorations with @jdunnmon and @ajratner
And was supported by @StanfordBrain and @StanfordHAI
Check out the paper for details, including how we used SSMs for modeling, and improved on clinical utility metrics.
https://t.co/ypZI5uo0on
It's truly heartening to witness patients sharing stories of how @arteraAI's tests have made a positive impact on their lives: https://t.co/j7viKXfur7
If you share a similar passion for personalizing medical decisions through #ArtificialIntelligence, please join us!
Thrilled and proud to announce that @arteraAI has received Medicare Reimbursement for our prostate cancer test - the first-ever test for localized prostate cancer that can predict therapy benefit and prognosticate long-term outcomes.
https://t.co/8qBsHXkrxC
Quadratic attention has been indispensable for information-dense modalities such as language... until now.
Announcing Mamba: a new SSM arch. that has linear-time scaling, ultra long context, and most importantly--outperforms Transformers everywhere we've tried.
With @tri_dao 1/
We are beyond excited to announce the publication of data in @NEJMEvidence, validating the first-ever predictive #AI biomarker of androgen deprivation therapy (ADT) benefit in #prostatecancer.
Read the full announcement as reported by @AP: https://t.co/N7yCz6reil
This report used data derived and validated from completed randomized phase III trials to demonstrate that an AI-based predictive model could identify pts with predominantly intermediate-risk #ProstateCancer who are likely to benefit from short-term androgen deprivation therapy.
This Thursday, @SiyiTang_ from ArteraAI will be joining us to talk about modeling multivariate biosignals with graph neural networks and structured state space models. Catch it at 1-2pm PT this Thursday on Zoom! Subscribe to https://t.co/Tr8mryt5m1 #ML #AI #medicine #healthcare
(7/8) I will be presenting this work as a spotlight presentation at the #ICLR2023 #TSRL4H workshop on May 5 (Friday). The full paper has been accepted to #CHIL2023 as an oral presentation.