PyTorch 2.12 introduces major updates across compilation, export, distributed training, and accelerator support.
Highlights include up to 100x faster batched linalg.eigh on CUDA, the new torch.accelerator.Graph API, Microscaling quantization support in torch.export.save, and fused Adagrad.
The release includes 2,926 commits from 457 contributors since PyTorch 2.11.
Have questions? Join @AndreyTalman (@Meta), @albanDesmaison (@Meta), and @joespeez (@reflection_ai), moderated by @Chris_AI_HPC (@Meta), on May 20 at 10:00 AM PT for a live Q&A covering the release and answering questions from the community.
🔗 Read the release blog and register for the webinar: https://t.co/lSkHPD3FQR
#PyTorch #OpenSourceAI #MachineLearning #AIInfrastructure
Bacteria move around using a molecular machine called the flagellar motor that rotates faster than the flywheel of a race car engine and switches directions in an instant. After 50 years, scientists have finally figured out how it works. “My lifelong quest is now fulfilled.”
PyTorch 2.10 is now available, with updates focused on performance, determinism, and numerical debugging for modern training and post-training workflows.
Highlights include Python 3.14 support for torch.compile(), reduced kernel launch overhead in TorchInductor, a new varlen_attn() op for variable-length sequences, and improved tools for tracking numerical divergence.
🖇️ 🔥 Read the PyTorch 2.10 release blog and release notes: https://t.co/xavEMkPvxp
#PyTorch #OpenSourceAI #AIInfrastructure
PyTorch 2.9 is now available, introducing key updates to performance, portability, and the developer experience.
This release includes a stable libtorch ABI for C++/CUDA extensions, symmetric memory for multi-GPU kernels, expanded wheel support to include ROCm, XPU, and CUDA 13, and enhancements for Intel, Arm, and x86 platforms.
With 3,216 commits from 452 contributors, PyTorch 2.9 continues to advance open source AI for developers worldwide.
🔗 Read the full release blog: https://t.co/nGgOlzVvE0
#PyTorch #OpenSourceAI #AI #Performance
Compiling large #PyTorch models at Meta could take an hour+. Engineers cut PT2 compile time by 80% with parallel Triton compilation, dynamic shape marking, autotuning config pruning, and cache improvements now integrated into the stack.
🔗 https://t.co/x43aXjLIUe
As #training jobs grow, failures like preemptions and crashes cause costly delays. Efficient distributed #checkpointing is key. #PyTorch and @Google built a local checkpointing solution using DCP to cut overhead, reduce rollbacks, and boost training goodput.
🔗 https://t.co/RV702mS43P
🖋️ @meta & @Google
Introducing DINOv3: a state-of-the-art computer vision model trained with self-supervised learning (SSL) that produces powerful, high-resolution image features. For the first time, a single frozen vision backbone outperforms specialized solutions on multiple long-standing dense prediction tasks.
Learn more about DINOv3 here: https://t.co/lQpKhJLTZQ
The *full* Python Documentary will be released this Thursday (Aug 28) at 10am PDT / 19:00 CET. More at https://t.co/ifkBoVOkxX Don't miss the online release party / chat! @TECHDOCU
Update from #PyTorch maintainers: 2.8 is out now.
🔹 A limited stable libtorch ABI for third-party C++/CUDA extensions
🔹 High-performance quantized LLM inference on Intel CPUs with native PyTorch
& more!
📄 Release notes: https://t.co/O0cIdlhgL1
🔗 Blog: https://t.co/4duhixafzH
Update from the PyTorch ecosystem: The latest @nvidia DALI release adds DALI Proxy—making it easier to accelerate parts of your PyTorch DataLoader pipeline without a full refactor.
Highlights:
- Better GPU use in multiprocess mode
- Selective pipeline offloading
- New video decoding features
🔗 https://t.co/HHYk4caH1r
#PyTorch #OpenSourceAI #DataPipelines #DeepLearning
Update from the PyTorch maintainers: 2.7 is out now.
🔹 Support for NVIDIA Blackwell (CUDA 12.8)
🔹 Mega Cache
🔹 torch.compile for Function Modes
🔹 FlexAttention updates
🔹 Intel GPU perf boost
🔗 Blog: https://t.co/yvtI2Qh9eo
📄 Release notes: https://t.co/dWqpzRO6Jb
#PyTorch #OpenSourceAI
👀 3.6 billion medical imaging tests are performed globally each year.
See how @databricks Pixels 2.0 and #MONAI are reducing data labeling time by up to 75% using active learning. #NVIDIAhealthcare
Get the details 👉 https://t.co/3IXduYKpzF
AI made in 🇪🇺
OpenEuroLLM, the first family of open source Large Language Models covering all EU languages, has earned the first STEP Seal for its excellence.
It brings together EU startups, research labs and supercomputing hosts to train AI on European supercomputers ↓