Excited to share my first contribution here at Illumina! We developed PromoterAI, a deep neural network that accurately identifies non-coding promoter variants that disrupt gene expression.🧵 (1/)
A new study uses PromoterAI to show that disruption of key transcription factor motifs reduces gene expression, further strengthening the experimental results (Figure 7).
https://t.co/xZyhondX7L
Just want to give a shout-out to David Kelley @drklly who I think often does not get the credit he deserves (outside our core community).
I want to highlight why I think he is such a fantastic scientist and leader in regulatory genomics. 1/
This administration's entire policy is to torture hardworking people who actually contribute to the nation. This will lead to a faster decline & push even more skilled immigrants to other nations. A special congrats to tech/biotech MAGA.
Rare disease diagnoses can rely on exome sequencing, but answers may be hiding in noncoding regions. 🧬 PromoterAI is a new deep learning tool that identifies pathogenic promoter variants, which may account for up to 6% of rare disease genetic burden 🔍
https://t.co/1GoPVcAwNW
In a saturation MPRA of the MAPT promoter, PromoterAI tracked measured variant effects, supporting its use for prioritizing pathogenic promoter variants.
https://t.co/rw2eUT8Rr5
Excited to share our RegVelo paper in Cell
https://t.co/ZAnQphaXsg
We unify RNA velocity + GRNs into one model → better OOD prediction of perturbations (e.g. gene KOs), with examples incl. neural crest KO predictions 🔬
Big thanks to W Wang, Z Hu & T Sauka-Spengler 🙏
Exciting new insights on CpG island (CGI) regulation by transcription factors (TFs)! CGIs drive most transcription initiation, yet how they are regulated is unclear. We find that chromatin-opening TFs are key players, following a surprisingly simple rule.
https://t.co/mXkDodTALR
1/9
Our Human Multiomic Development Atlas paper is out in Nature today! A heartfelt "thank you" to all co-authors for their tireless work on this complex yet exciting project! Congrats all! https://t.co/iUiZz00KOt
It’s well known that inflammation increases cancer risk, but how?
The answer: the epigenome "remembers" inflammation and primes stem cells for cancer.
Here is our paper: https://t.co/FcnkLdpiKZ
And a special shoutout to the lead author @snaga13
A 🧵
We are also releasing self-contained lecture notes that explain flow matching and diffusion models from scratch. This goes from "zero" to the state-of-the-art in modern Generative AI.
📖 Read the notes here: https://t.co/RULWDgn9pm
Joint work with @EErives40101.
Models are typically specialized to new domains by finetuning on small, high-quality datasets.
We find that repeating the same dataset 10–50× starting from pretraining leads to substantially better downstream performance, in some cases outperforming larger models. 🧵
Great to see the flurry of single-gene knockdown Perturb-seq-like atlases from cell lines, mouse brain, etc. over the last few days. These are undoubtedly very valuable datasets. I just want to reiterate a few other very important expt. design considerations 1/
@SashaGusevPosts @javier_maravall Gaps *this* wide have been shown before, in Figure 2D, for splice variant effect prediction (SpliceAI is 700K parameters). The x-axis ranges from 0 to 1 here so it may not be immediately apparent, but it's the same 0.6 to 0.9 gap.
Can we simulate realistic evolutionary trajectories and “replay the tape of life”? In this work, we propose a flexible, generalizable framework for modeling how the entire protein seq evolves over time while capturing complex interactions across sites. 1/n
https://t.co/DIwQ40C478
AlphaGenome is out in @nature today along with model weights! 🧬
📄 Paper: https://t.co/1fHzSPiY1x
💻 Weights: https://t.co/z6JWLT4Mpv
Getting here wasn’t a straight path. We sat down @googledeepmind to discuss the story behind the model, paper & API: https://t.co/cT8CiXfnxQ
Molecular Genetics @ University of Toronto 🇨🇦 @UofT 🦠 is recruiting an Assistant Professor in #Virology!!
Come build your dream lab!
🔗https://t.co/MMq1l7y1PA