Autotuning is the backbone of Helion, PyTorch's DSL for performance-portable ML kernels. Helion's search currently uses Likelihood-Free Bayesian Optimization (LFBO) to find the most performant configs. While LFBO works well, it requires grinding through hundreds of compile-and-benchmark cycles per kernel.
What if, instead of starting the search blindly, you could ask an LLM to reason about the kernel and propose configurations?
In this blog post, we look at how LLM-guided autotuning offers a practical path to dramatically faster kernel tuning at production quality.
Click the link in the comments section to learn more.
@JongsokC @oguz_ulgen
Ray (@raydistributed) Serve LLM and @vllm_project enable high-performance distributed inference at scale. Awesome to see Foundation-hosted projects working together to advance the open source AI stack.
Learn more: https://t.co/ZabYFSVSbk
Ray Serve LLM now offers 4.4x higher request throughput on prefill-heavy workloads, and 24.8x higher request throughput on decode-heavy workloads!
Three major optimizations:
- Direct streaming, bypassing an intermediate Ray Serve deployment on the response path with a new, control plane-only endpoint picker
- A new, Ray V2 executor backend in vLLM, enabling optimizations such as async scheduling
- HAProxy ingress, for ingress request routing at the speed of C
All available in Ray 2.56. This is awesome work with @googlecloud and @vllm_project!
Prove you can work with the technology behind modern AI:
We are excited to announce the launch of PyTorch Certified Associate (PTCA).
As AI adoption continues to accelerate across industries, the demand for practical, hands-on engineering skills is growing rapidly. The PTCA is designed to validate your foundational ability to design, train, and deploy machine learning models within real-world environments.
Link to learn more in the comments section.
The schedule is now available for KubeCon + CloudNativeCon + OpenInfra Summit + PyTorch Conference China, September 7-9 in Shanghai.
AI is transforming how we build, deploy, and operate technology. Open source is making it possible. The agenda features engineers, maintainers, researchers, and technology leaders advancing cloud native infrastructure, open infrastructure, and AI.
Explore the sessions: https://t.co/PV5nDI4JTV
@CloudNativeFdn, @openinfradev
#PyTorchCon #KubeCon #CloudNativeCon #OpenInfraSummit
Nominations are officially open for the 2026 PyTorch Foundation Contributor Awards.
These awards celebrate outstanding impact across PyTorch Foundation hosted projects, including PyTorch, vLLM, DeepSpeed, Ray, Helion, and Safetensors.
Whether through code development, documentation improvements, mentorship, or community leadership, we want to recognise the individuals who help move our ecosystem forward.
Submit your nomination by July 17. Link in comments.
We're seeking a Senior Cloud Operations Engineer who will play a pivotal role in the PyTorch Foundation, leading cloud infrastructure and DevOps initiatives. Apply at: https://t.co/4GCuyBacyn
Built something with #PyTorch?
Show it off at the Poster Sessions during #PyTorchCon North America, October 20-21 in San Jose, CA.
Poster submissions are due July 26: https://t.co/M7WmgnmuKJ
24 hours left to nominate yourself or someone else as a PyTorch Foundation Ambassador 🔥
The PyTorch Foundation Ambassador Program highlights and supports passionate community leaders who organize events, create technical content, mentor new users, and contribute to the open source ecosystem.
As we continue to expand representation across local PyTorch communities, we especially welcome applications from contributors in Africa, Latin America, the Middle East, Oceania, Southeast Asia, and Eastern Europe.
Learn more and apply before June 18, 2026 - link in comments.
Submit your talk for OSPOlogy + #OSPOSummit China, taking place September 7 alongside #KubeCon + #CloudNativeCon + #OpenInfraSummit + #PyTorchCon China in Shanghai.
Share lessons learned from building, scaling, and supporting #OpenSource programs.
CFP closes July 12: https://t.co/NemIeieN4v
Bridging the gap between model optimization and production deployment
This tutorial walks through a typical end-to-end quantization workflow, from a PyTorch model through exporting and compiling an NVIDIA TensorRT engine for real-world inference speedups.
Read the full post: https://t.co/IuaJlNuUEi
#AI frameworks evolve. Communities move them forward.
Meet the developers, researchers, & practitioners shaping the future of #PyTorch at #PyTorchCon North America.
San Jose, California
October 20-21
Save $400 through July 31:
https://t.co/AVHdaIFT20
In this clip from his PyTorch Conference Europe 2026 keynote, Patrick von Platen (@MistralAI) discusses why real-world machine interaction requires models that can take continuous input and produce continuous output.
Using live transcription as an example, he explains how streaming architectures differ from traditional speech recognition approaches that process larger chunks of audio at once.
Watch the full keynote: https://t.co/PmM1ak29ay
#PyTorchCon
The inaugural PyTorch Meetup Singapore brought together engineers, researchers, and community builders to talk about everything from vLLM project updates to the broader question of sovereign intelligence.
Read the full technical recap and find presentation slides in our latest blog: https://t.co/tYRSHBXdXi
📣 New CNCF-hosted co-located event added!
Join OSPOlogy + #OSPOSummit China on September 7 alongside #KubeCon + #CloudNativeCon + #OpenInfraSummit + #PyTorchCon China in Shanghai.
Share your insights with the #OpenSource program office community. Submit to speak by July 12.
Add OSPOlogy + OSPO Summit China to your event pass by June 30 for just ¥70 (USD$10) before prices increase.
Learn more: https://t.co/UQ1VLsAZgW
Enable smarter, longer-thinking agents
Scale agentic AI and reinforcement learning by shortening CPU execution time, increasing task throughput, and improving overall AI factory output.
The @nvidia custom Olympus core in the NVIDIA Vera CPU uses a neural branch predictor to reduce stalls in branch-heavy code. Combined with other prediction mechanisms, it can sustain two taken branches per cycle with zero penalty, maintaining throughput for deep software stacks such as PyTorch, graph workloads, and scripting engines.
Read the complete blog post:
https://t.co/L6NmBJWgDY
Submit your poster proposal for #PyTorchCon North America! Share your latest #AI, ML, #PyTorch, tooling, infrastructure, or research work with the community October 20-21 in San Jose, CA.
Poster #CFP closes July 26: https://t.co/M7WmgnmuKJ