By: Blog on JuliaGPU
Re-posted from: https://juliagpu.org/2021-01-08-cuda_2.4_2.5/
CUDA.jl v2.4 and v2.5 are two almost-identical feature releases, respectively for Julia 1.5
and 1.6. These releases feature a greatly improved findmin
and findmax
kernels, an
improved interface for kernel introspection, support for CUDA 11.2, and of course many bug
fixes.
Improved findmin
and findmax
kernels
Thanks to @tkf and @Ellipse0934,
CUDA.jl now uses a single-pass kernel for finding the minimum or maximum item in a
CuArray. This fixes compatibility with
NaN
-valued elements, while on average improving performance. Depending on the rank, shape
and size of the array these improvements vary from a minor regression to order-of-magnitude
improvements.
New kernel introspection interface
It is now possible to obtain a compiled-but-not-launched kernel by passing the
launch=false
keyword to @cuda
. This is useful when you want to reflect, e.g., query the
amount of registers, or other kernel properties:
julia> kernel = @cuda launch=false identity(nothing)
CUDA.HostKernel{identity,Tuple{Nothing}}(...)
julia> CUDA.registers(kernel)
4
The old API is still available, and will even be extended in future versions of CUDA.jl for
the purpose of compiling device functions (not kernels):
julia> kernel = cufunction(identity, Tuple{Nothing})
CUDA.HostKernel{identity,Tuple{Nothing}}(...)
Support for CUDA 11.2
CUDA.jl now supports the latest version of CUDA, version 11.2. Because CUDNN and CUTENSOR
are not compatible with this release yet, CUDA.jl won’t automatically switch to it unless
you explicitly request so:
julia> ENV["JULIA_CUDA_VERSION"] = "11.2"
"11.2"
julia> using CUDA
julia> CUDA.versioninfo()
CUDA toolkit 11.2.0, artifact installation
CUDA driver 11.2.0
NVIDIA driver 460.27.4
Alternatively, if you disable use of artifacts through JULIA_CUDA_USE_BINARYBUILDER=false
,
CUDA 11.2 can be picked up from your local system.
Future developments
Due to upstream compiler changes, CUDA.jl 2.4 is expected to be the last release compatible
with Julia 1.5. Patch releases are still possible, but are not automatic: If you need a
specific bugfix from a future CUDA.jl release, create an issue or PR to backport the change.