Goopax 6.1.0 is now available for download on our website.
With this version, most warp-level matrix features are supported and easily accessible: sparse matrices, block scaling, and the fp8, fp6, and fp4 data types.
Dedicated tensor memory, as used by the Nvidia B100 and B200 GPUs and Jetson Thor, is not yet supported. If you need it, or if another feature is still missing, let us know so that we can adjust our priorities.
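For readers unfamiliar with block scaling: the idea is that a small block of values shares one scale factor, so that each element can be stored in a very narrow type such as fp4. The following is a minimal plain C++ sketch of that idea, not Goopax code and not the exact rule of any hardware format. The names `ScaledBlock`, `round_to_fp4`, `quantize_block`, and `dequantize` are hypothetical, and the choice of a power-of-two scale that maps the largest magnitude to at most 6 (the largest E2M1 value) is an assumption made for illustration.

```cpp
#include <algorithm>
#include <array>
#include <cmath>
#include <cstddef>
#include <vector>

// Magnitudes representable by a 4-bit E2M1 float (sign is handled separately).
constexpr std::array<float, 8> kFp4Grid = {0.0f, 0.5f, 1.0f, 1.5f,
                                           2.0f, 3.0f, 4.0f, 6.0f};

// Hypothetical container: one shared scale 2^exponent plus the grid values.
struct ScaledBlock {
    int exponent;
    std::vector<float> elements;  // every entry lies on the fp4 grid
};

// Round x to the nearest magnitude on the fp4 grid, keeping its sign.
inline float round_to_fp4(float x) {
    const float mag = std::fabs(x);
    float best = kFp4Grid[0];
    for (float g : kFp4Grid) {
        if (std::fabs(mag - g) < std::fabs(mag - best)) best = g;
    }
    return std::copysign(best, x);
}

// Pick a power-of-two scale so the largest magnitude maps to at most 6,
// then quantize every element of the block with that shared scale.
inline ScaledBlock quantize_block(const std::vector<float>& block) {
    float max_mag = 0.0f;
    for (float v : block) max_mag = std::max(max_mag, std::fabs(v));

    int exponent = 0;
    if (max_mag > 0.0f) {
        exponent = static_cast<int>(std::ceil(std::log2(max_mag / 6.0f)));
    }
    const float scale = std::ldexp(1.0f, exponent);

    ScaledBlock out{exponent, {}};
    out.elements.reserve(block.size());
    for (float v : block) out.elements.push_back(round_to_fp4(v / scale));
    return out;
}

// Recover an approximate original value: element * 2^exponent.
inline float dequantize(const ScaledBlock& b, std::size_t i) {
    return std::ldexp(b.elements[i], b.exponent);
}
```

Calling `quantize_block({12.0f, 6.0f, -3.0f, 0.0f})` picks a shared scale of 2, stores the elements as 6, 3, -1.5, and 0, and `dequantize` returns the original values exactly. Inputs that do not land on the grid are rounded to the nearest grid point.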
Also available:
- asynchronous memory copy between device memory and workgroup memory.
- CPU conversion functions for all data types.
We also fixed a number of bugs from the previous release, especially ones related to the new PTX generator. If you are using Goopax 6.0.0, please update to 6.1.0.
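To illustrate what a CPU conversion function for a reduced-precision type involves, here is a minimal plain C++ sketch that converts between `float` and the 16-bit bfloat16 bit pattern. It does not use or reproduce the Goopax API. The function names `float_to_bf16` and `bf16_to_float` are hypothetical, and round-to-nearest-even is an assumed rounding mode.

```cpp
#include <cstdint>
#include <cstring>

// Convert a float to a bfloat16 bit pattern (the upper 16 bits of the
// IEEE 754 binary32 encoding), using round-to-nearest-even.
inline std::uint16_t float_to_bf16(float f) {
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);

    // NaN: truncate and force a mantissa bit so the result stays a NaN.
    if ((bits & 0x7fffffffu) > 0x7f800000u) {
        return static_cast<std::uint16_t>((bits >> 16) | 0x0040u);
    }

    // Add 0x7fff plus the lowest kept bit: ties round to the even value.
    const std::uint32_t rounding = 0x7fffu + ((bits >> 16) & 1u);
    bits += rounding;
    return static_cast<std::uint16_t>(bits >> 16);
}

// Convert a bfloat16 bit pattern back to float. This direction is exact,
// because every bfloat16 value is representable as a float.
inline float bf16_to_float(std::uint16_t h) {
    const std::uint32_t bits = static_cast<std::uint32_t>(h) << 16;
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}
```

Narrower types such as fp8 or fp4 need the same ingredients (rounding, special-value handling, and an exact widening path), but with a re-biased exponent and explicit saturation, which this sketch does not cover.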
Goopax 5.8.2 is now available!
New features in this release:
- more efficient register use
- bfloat16 type in Vulkan
- Vulkan/CUDA interoperability
- additional buffer creation flags for Vulkan
- warp_barrier
- Vulkan particle renderer now available in the example programs
We also did a major rewrite of our cosmology example program, which is now a lot faster.
Meet us at SC25 for a live demonstration (November 18-20, St. Louis).
Are you developing HPC applications? Try Goopax, our language-embedded GPU programming solution. Simply include it as a library and compile with a C++ compiler.
Goopax is highly portable and runs on a wide range of target platforms, from mobile devices to supercomputers.
We just released Goopax 5.8.1!
New features in this release:
- warp_matrix now available with the Vulkan backend
- warp_matrix: new data type tf32
- new example program cosmology.cpp
- iOS simulator
- gpu_type<> runtime type deduction
- improved line number output in generated kernel files
- more efficient packing of local memory allocations
https://t.co/QeBXVENkRa
Goopax 5.8.0 is now available!
New features:
- Vulkan backend! This will give us better hardware access and improve performance on non-Nvidia, non-Apple devices.
- warp_matrix now available for CPU backend (useful for debugging).
- atomic_thread_fence.
https://t.co/Vp0PCT1erX
Coming soon: Vulkan backend
In addition to our existing CUDA, Metal, OpenCL and CPU backends, we are currently working on a Vulkan backend.
Vulkan will allow us to access some hardware features that are not available in OpenCL. It will also increase the list of supported devices, particularly in the embedded sector.
The Vulkan backend will be available in our next release, which should be ready around May.
Goopax 5.7.0 is now available!
New features in this release:
- tensor core support for the CUDA and Metal backends
- bfloat16_t is now available on all platforms
https://t.co/zI0ykp9OAz
Goopax is available for Linux, Windows, macOS, iOS, and Android, and runs with almost any graphics card.
You can download it for testing at https://t.co/2qGa9HViP9.