@CalebLareau is doing some of the most compelling work on viral contributions to human disease.
Here showing herpesvirus reactivation in Alzheimer’s disease.
https://t.co/lnvQJ04xCb
As someone who has covered cancer drug development for 25 years, one of the few things I am sure of is that the odds of technology folks thinking they understand biology are much higher than the odds they actually do.
Pinpointing the exact noncoding variants behind disease associations is challenging. Using MPRA, @LaylaSiraj, @julirsch, @ReillyLikesIt, @r_tewhey et al tested >220K variants in 5 cell types, resolving >13K & revealing mechanisms behind many disease traits https://t.co/I3WecxPl3U
⚠️ If you’re reading this, you’ve been infected* ⚠️
*~95% of the human population has been infected by the Epstein-Barr Virus (EBV).
Today in @Nature with @nyeo_sherry, @EMC22381830, @RyanDhindsa, @SlavePetrovski, we shed some light on what happens next.
https://t.co/Db1j4aHeFO
Ever wondered how to choose weights for rare variant burden/SKAT tests? @jschwart37 and I did too! Check out our (@illumina) straightforward new approach that improves discovery by ~50% or more relative to STAAR, Regenie, and DeepRVAT!
Let’s talk about rare variant association tests - we have a new method that helps you discover more with the same data! It’s called FlexRV to signal our approach: Flexibly modeling rare variant pathogenicity improves gene discovery for complex traits. https://t.co/1sMgxUXwkK
I am a strong believer that depth of data can be just as powerful as scale of data. Deep analysis of small groups of humans, and even individuals, can lead to novel biological insights and principles.
Today I am the guinea pig. In my own blood, @CalebLareau finds strong evidence that a synonymous genetic variant (no amino acid change in the protein) can impact T cell function via altered codon syntax.
https://t.co/Gw92eFvKnX
Excited to share my first contribution here at Illumina! We developed PromoterAI, a deep neural network that accurately identifies non-coding promoter variants that disrupt gene expression.🧵 (1/)
Though the coarse tree topologies may appear similar between versions, the relative placement of cells (color bar) in the two trees is ~random! We therefore disagree that ReDeeM is “robust and reliable” between versions as edge variants in -1 remain a large driving signal. 5/n
In addition to the technical points raised by @CalebLareau, several biological questions remain after this preprint. We would appreciate your thoughts as these were not addressed @chenweng1991, @jswlab, @bloodgenes. 1/n
Tools that predict the causal genes in GWAS loci are too complex! Many use XGBoost models with >45 features (e.g. L2G, Ei).
CALDERA achieves better performance despite using LASSO regression and only 12 features. This makes it much easier to understand why a gene was chosen.