Thrilled to share that our paper with Tatsuya @yokotatsuya has been accepted to TMLR 🙌🙌🙌 We introduce the broadcast product, a new operator that provides a mathematically rigorous formulation of broadcasting in NumPy, PyTorch, and Julia! https://t.co/G5ldiIyzim #TMLR
Fast matrix multiplication on GPUs has traditionally meant wrestling with threads, shared memory, and low-level hardware details. This webinar explores how NVIDIA’s CUDA Tile model—and its Julia port, cuTile.jl—makes high-performance GPU programming more accessible. Join Dr. Andy Terrel of NVIDIA and Dr. Tim Besard of JuliaHub to see real examples across linear algebra, AI inference, and HPC. Register here - https://t.co/SXn4kAmfTL
#JuliaLang #GPUProgramming #CUDA #HPC #AIInfrastructure
What does it take to translate GPU kernels across languages without introducing silent errors? This post explores how #AI agents helped automate translation from cuTile Python to cuTile.jl, turning a one-off porting task into a repeatable, validated workflow. From matrix multiplication to softmax, it shows how structured skills, rules, and testing can make #GPU kernel translation faster, safer, and more reusable in Julia. https://t.co/E9MQF9fD5B
#JuliaLang #GPUProgramming #CUDA #AIEngineering #HPC
How do you make high-performance #GPU #programming more accessible in Julia? This #webinar explores cuTile.jl, NVIDIA’s CUDA Tile model in Julia, through real-world examples in linear algebra, AI inference, and HPC, showing how tile-based abstractions can simplify Tensor Core programming. https://t.co/upwvt4wijj
#JuliaLang #GPUProgramming #CUDA #HPC #AIInfrastructure
Writing high-performance #GPU kernels in Julia is getting more intuitive. cuTile.jl v0.2 introduces native Julia for loops, improved floating-point control, stronger debugging, and major performance gains—making advanced GPU programming more accessible for #AI and HPC workloads. https://t.co/5pHs5TsbqB
#JuliaLang #GPUProgramming #CUDA #HPC #AIInfrastructure #HighPerformanceComputing #DeveloperTools #MachineLearning #ScientificComputing #NVIDIA
#GPU #programming is getting simpler. cuTile.jl brings NVIDIA’s tile-based programming model to Julia, making it easier to write high-performance #CUDA kernels without managing low-level hardware details. A big step toward more accessible, productive GPU development in Julia. https://t.co/5lOFJmLl49
#JuliaLang #AIInfrastructure #HighPerformanceComputing #GPUComputing #DeveloperTools