1/🚨 A bit late to the party, but excited to share our paper at #ICLR2026:
AutoSP — the first compiler-based solution to unlock long-context LLM training.
✅ Up to 2.7× longer contexts on NVIDIA (2.5× on AMD)
✅ Negligible runtime overhead
✅ Merged into DeepSpeed 🚀
🧵👇
I’ll be at NeurIPS from Dec 1-8! 🌊🏖️
Excited to catch up with old and new friends! Let’s grab coffee ☕️ or have a quick chat if you’re attending! #NeurIPS2025
Excited to finally share our #NeurIPS2025 paper "🔮PurpCode: Reasoning for Safer Code Generation"! 🙌
👐 First post-training recipe for training safe code reasoning models
🚀 SOTA for cybersafety + utility, outperforming Sonnet 4, o4-mini, R1
🥇 Winner of 2025 Amazon Nova AI Challenge
📝 Paper: https://t.co/9aOoiL5zgJ
🧵👇 1/11
Excited to finally share our new paper titled "CPU Autoscaling With a Kernel of Truth" at APSys 2025.
Huge thanks to my amazing collaborators and advisors @tianyin_xu, @SaugataGhose
Catch my virtual talk tonight if you're in Seoul!
Paper: https://t.co/xqwwu6NWiF
It’s wild how much ChatGPT has supercharged learning… I use it regularly to understand topics across fields like biology, economics, and politics, which would ordinarily take me hours of research and browsing
I'm excited to announce NN-CIFT got into @NeurIPSConf 2025 (featuring a fancy new title) 💃💃🌴 Can't wait to discuss it with everyone!!
Thank you @dilekhakkanitur and @convai_uiuc 🎉🎉
I've successfully defended my PhD thesis on automated information seeking! Extremely grateful to my advisor @hengjinlp, committee members and all collaborators.
Next, I'll be joining @GoogleDeepMind as a research scientist!
Link to defense slides: https://t.co/O4y30tnncC
🔬 Interested in training AlphaFold3 faster, at scale, and beyond NVIDIA GPUs? Now you can.
AlphaFold3 is a major leap in biomolecular modeling, but behind the scenes, it introduces severe system bottlenecks:
🧠 2D EvoAttention spikes memory usage
📉 Retrieval-augmented training pipeline causes long GPU idle time
⛔ Frequent but memory-intensive ops slow everything down
Today, I'm excited to announce MegaFold, a fully open-source system to make AlphaFold3 training fast, scalable, and cross-platform on both NVIDIA and AMD GPUs.
MegaFold delivers:
⚡ Up to 1.73× / 1.62× faster training on NVIDIA H100 / AMD MI250
🧬 Up to 1.35× longer sequences compared to PyTorch baseline
Key features:
🚀 Memory-Efficient EvoAttention via portable Triton kernels
💡 Ahead-of-Time Caching to eliminate GPU idle time in retrieval pipelines
🔗 DeepFusion for reducing overhead of small but frequent memory-intensive AF3 ops
📘 Project page: https://t.co/sSaxDtBT1O
📄 Paper: https://t.co/2sCArFoq82
💻 Code: https://t.co/gn7HR5kcQC
🤝 MegaFold is a collaboration between the UIUC SSAIL Lab and researchers from the University of Missouri and Lawrence Berkeley National Laboratory.
Kudos to the brilliant team: Hoa La, Ahan Gupta, Alex Morehead, Jianlin Cheng
#AlphaFold3 #AI #ProteinFolding #Bioinformatics #AMD #Triton #CrossPlatform #OpenSource
@AkshayGoindani1 Thanks! Looks like we might have been too quick to assume these gains, since the evals did not reproduce the baselines effectively
https://t.co/0LEygcoxJl
Confused about recent LLM RL results where models improve without any ground-truth signal? We were too, until we looked at the reported numbers for the pre-RL models and realized they were severely underreported across papers. We compiled the discrepancies in a blog post below 🧵👇
Would models know more about Indian food in Hindi and Turkey’s history in Turkish? Does the language of a question affect an LLM’s answer?
✨Yes!✨
@nbbozdag and I are excited to announce our newest preprint in which we explore “Language Specific Knowledge (LSK)”.