Siyi Tang

@SiyiTang_

Machine Learning Scientist @arteraAI | #MachineLearning for Medicine | PhD @Stanford

United States

Joined June 2018

290 Following

296 Followers

85 Posts

Pinned Tweet

Siyi Tang @SiyiTang_

about 3 years ago

(1/8) 🥳Introducing GraphS4mer, a #GNN for modeling multivariate biosignals (e.g., #EEG, #ECG). Paper: https://t.co/eEQyFYBXDR Code: https://t.co/8ujTdaTMxU 🧵

Siyi Tang @SiyiTang_

over 1 year ago

Check out our tech blog on evaluating open source pathology foundation models on real-world image variations! https://t.co/sKUxrnXeNi

Nathan Silberman @NCSilberman

over 1 year ago

Are open source pathology models ready for the clinic? Despite significant progress - not quite yet! These models show increased accuracy on retrospective datasets but struggle to deal with real-world variation as outlined on the @arteraAI tech blog: https://t.co/1Py9Y97Sv2

SiyiTang_ retweeted

Andre Esteva, PhD

@AndreEsteva

over 2 years ago

ArteraAI Announced as the First-and-Only Predictive Test for Therapy Personalization in the 2024 NCCN Guidelines for Prostate Cancer

NCCN guidelines are widely considered the gold standard for clinical practice in oncology. https://t.co/QO0WXsQbLA

SiyiTang_ retweeted

Khaled Saab

@_khaledsaab

over 2 years ago

My final PhD chapter on improving seizure detection with @HazyResearch and @rubinqilab was just published @npjDigitalMed.

TL;DR We found that scaling two dimensions of model supervision, (1) coverage of training data and (2) granularity of class labels, has a large impact on model performance AND subgroup robustness for seizure detection. The best part? We do it using freely available notes produced in routine clinical workflows!

Clinical importance of building reliable seizure detection tools.
Detecting seizures, their types, duration, etc. is a critical healthcare task in diagnosing and managing epilepsy. The best way to do that is EEG analysis (reading brain recordings). But EEG analysis is a BIG pain! These recordings can be very long (hours-days per patient) and demand a scarce resource: deep neurologic-epileptologic expertise. So we have a strong need to develop reliable tools to help clinicians analyze EEG more efficiently.

Why aren’t existing models widely used?
A big reason is trust. ML models often fool us on aggregate metrics, where they show expert-level performance on average and then, whoops, they turn out to rely on non-causal features and do a lot worse for certain subgroups. We can’t have that for high-stakes healthcare settings. Another major reason is high false-alarms on abnormalities that may look like seizures, leading to alarm fatigue.

Workflow notes: a hidden goldmine for supervision.
Standard seizure detection models rely on manual labels from experts, but this approach is too expensive to scale. Luckily, routine clinical EEG monitoring leaves a trail of helpful annotations from techs, fellows, & docs. These workflow notes provide an opportunity to freely scale supervision for seizure detection models.

Scaling coverage is not enough.
Using workflow notes, we scaled our training data to include ~70k hours of EEG from ~12k patients. While this gave us impressive overall performance, we found significant performance gaps among certain patient age groups and seizure subtypes. We also found many false positives on non-epileptic abnormalities.

Scaling granularity of class labels is also needed.
Since workflow notes also include events beyond seizures (e.g., spikes, patient movement), we trained a multilabel model to predict 26 classes (including seizure). The intuition here is that increasing class granularity teaches the model to differentiate between seizures and other non-seizure abnormalities, lowering false positives. We found that our multilabel model improved overall performance, and importantly, had no significant performance gaps among subgroups.

Concluding thoughts on supervision.
It’s amazing how supervision has such a large impact on model reliability. Since supervision in healthcare is scarce, we should always keep an eye out for how we can leverage existing routine workflows to supervise our models; I am sure many more exist that we aren’t taking advantage of yet!

This work was done alongside amazing collaborators @SiyiTang_, Mohamed Taha, and @ChrisLeeMesser
It was also inspired by earlier explorations with @jdunnmon and @ajratner
And was supported by @StanfordBrain and @StanfordHAI

Check out the paper for details, including how we used SSMs for modeling, and improved on clinical utility metrics.
https://t.co/ypZI5uo0on

Siyi Tang @SiyiTang_

over 2 years ago

We have two open positions on the AI team @arteraAI:
- ML Engineer: https://t.co/zihVNFWCDR
- ML Manager: https://t.co/yGnbLftoul

Siyi Tang @SiyiTang_

over 2 years ago

It's truly heartening to witness patients sharing stories of how @arteraAI's tests have made a positive impact on their lives: https://t.co/j7viKXfur7 If you share a similar passion for personalizing medical decisions through #ArtificialInteligence, please join us!

SiyiTang_ retweeted

Andre Esteva, PhD

@AndreEsteva

over 2 years ago

Thrilled and proud to announce that @arteraAI has received Medicare Reimbursement for our prostate cancer test - the first-ever test for localized prostate cancer that can predict therapy benefit and prognosticate long-term outcomes. https://t.co/8qBsHXkrxC

SiyiTang_ retweeted

Albert Gu

@_albertgu

over 2 years ago

Quadratic attention has been indispensable for information-dense modalities such as language... until now. Announcing Mamba: a new SSM arch. that has linear-time scaling, ultra long context, and most importantly--outperforms Transformers everywhere we've tried. With @tri_dao 1/

SiyiTang_ retweeted

NEJM Evidence @NEJMEvidence

almost 3 years ago

Original Article: "Artificial Intelligence Predictive Model for Hormone Therapy Use in Prostate Cancer," by @DrSpratticus et al. https://t.co/mSWzYTRIC8 #AIinMedicine #Oncology #PCSM @felixfengmd @NRGonc @RadOncUH @CWRUSOM @UHhospitals @cwru @SiyiTang_ @stanford_ee @yilun_sun

SiyiTang_ retweeted

ArteraAI @arteraAI

about 3 years ago

We are beyond excited to announce the publication of data in @NEJMEvidence, validating the first-ever predictive #AI biomarker of androgen deprivation therapy (ADT) benefit in #prostatecancer. Read the full announcement as reported by @AP: https://t.co/N7yCz6reil

SiyiTang_ retweeted

NEJM Evidence @NEJMEvidence

about 3 years ago

This report used data derived and validated from completed randomized phase III trials to demonstrate that an AI-based predictive model could identify pts with predominately intermediate-risk #ProstateCancer who are likely to benefit from short-term androgen deprivation therapy.

Siyi Tang @SiyiTang_

about 3 years ago

Excited to share that our GraphS4mer paper won the Best Paper Award (Models & Methods Track) at #CHIL23 🥳

SiyiTang_ retweeted

chilconference @CHILconference

about 3 years ago

Huge congrats to our best paper award winners! @SiyiTang_ @w_la_cava @lehmer16

SiyiTang_ retweeted

chilconference @CHILconference

about 3 years ago

GraphS4mer is a #GNN that (1) captures temporal dependencies, (2) represents non-Euclidean spatial correlations, and (3) learns graph structures evolving over time. https://t.co/ZtWjNlXWFm @SiyiTang_, @jdunnmon, @vickyqu0, @KhaledSaab11, @TinaBaykaner45, @ChrisLeeMesser, @rubinqilab

SiyiTang_ retweeted

MedAI Group @MedaiStanford

about 3 years ago

This Thursday, @SiyiTang_ from ArteraAI will be joining us to talk about modeling multivariate biosignals with graph neural networks and structured state space models. Catch it at 1-2pm PT this Thursday on Zoom! Subscribe to https://t.co/Tr8mryt5m1 #ML #AI #medicine #healthcare

Siyi Tang @SiyiTang_

about 3 years ago

(8/8) Joint work with the amazing co-authors: @jdunnmon, @vickyqu0, @KhaledSaab11, @TinaBaykaner45, @ChrisLeeMesser, @rubinqilab.

Siyi Tang @SiyiTang_

about 3 years ago

(7/8) I will be presenting this work as a spotlight presentation at #ICLR2023 #TSRL4H workshop on May 5 (Friday). The full paper has been accepted to #CHIL2023 as an oral presentation.
