Hi, I just tested the code from the README, but it errors with CUDA.jl:

```julia
using CUDA
using ImplicitBVH
using ImplicitBVH: BBox, BSphere

# Generate some simple bounding spheres; save them in a GPU array
bounding_spheres = CuArray([
    BSphere{Float32}([0., 0., 0.], 0.5),
    BSphere{Float32}([0., 0., 1.], 0.6),
    BSphere{Float32}([0., 0., 2.], 0.5),
    BSphere{Float32}([0., 0., 3.], 0.4),
    BSphere{Float32}([0., 0., 4.], 0.6),
])

# Build BVH
bvh = BVH(bounding_spheres, BBox{Float32}, UInt32)
```
Error info on the main branch (v0.5.0):

```
ERROR: GPU compilation of MethodInstance for AcceleratedKernels.gpu__foreachindex_global!(::KernelAbstractions.CompilerMetadata{…}, ::ImplicitBVH.var"#12#13"{…}, ::Base.OneTo{…}) failed
KernelError: passing and using non-bitstype argument
Argument 3 to your kernel function is of type ImplicitBVH.var"#12#13"{CuDeviceVector{BBox{Float32}, 1}, CuDeviceVector{BSphere{Float32}, 1}, CuDeviceVector{Int32, 1}, Int32, Int32, Type{Int32}}, which is not isbits:
  .I is of type Type{Int32} which is not isbits.
Stacktrace:
  [1] check_invocation(job::GPUCompiler.CompilerJob)
    @ GPUCompiler ~/.julia/packages/GPUCompiler/2CW9L/src/validation.jl:92
  [2] macro expansion
    @ ~/.julia/packages/GPUCompiler/2CW9L/src/driver.jl:92 [inlined]
  [3] macro expansion
    @ ~/.julia/packages/TimerOutputs/NRdsv/src/TimerOutput.jl:253 [inlined]
  [4]
    @ GPUCompiler ~/.julia/packages/GPUCompiler/2CW9L/src/driver.jl:90
  [5] codegen
    @ ~/.julia/packages/GPUCompiler/2CW9L/src/driver.jl:82 [inlined]
  [6] compile(target::Symbol, job::GPUCompiler.CompilerJob; kwargs::@Kwargs{})
    @ GPUCompiler ~/.julia/packages/GPUCompiler/2CW9L/src/driver.jl:79
  [7] compile
    @ ~/.julia/packages/GPUCompiler/2CW9L/src/driver.jl:74 [inlined]
  [8] #1145
    @ ~/.julia/packages/CUDA/2kjXI/src/compiler/compilation.jl:250 [inlined]
  [9] JuliaContext(f::CUDA.var"#1145#1148"{GPUCompiler.CompilerJob{GPUCompiler.PTXCompilerTarget, CUDA.CUDACompilerParams}}; kwargs::@Kwargs{})
    @ GPUCompiler ~/.julia/packages/GPUCompiler/2CW9L/src/driver.jl:34
 [10] JuliaContext(f::Function)
    @ GPUCompiler ~/.julia/packages/GPUCompiler/2CW9L/src/driver.jl:25
 [11] compile(job::GPUCompiler.CompilerJob)
    @ CUDA ~/.julia/packages/CUDA/2kjXI/src/compiler/compilation.jl:249
 [12] actual_compilation(cache::Dict{…}, src::Core.MethodInstance, world::UInt64, cfg::GPUCompiler.CompilerConfig{…}, compiler::typeof(CUDA.compile), linker::typeof(CUDA.link))
    @ GPUCompiler ~/.julia/packages/GPUCompiler/2CW9L/src/execution.jl:237
 [13] cached_compilation(cache::Dict{…}, src::Core.MethodInstance, cfg::GPUCompiler.CompilerConfig{…}, compiler::Function, linker::Function)
    @ GPUCompiler ~/.julia/packages/GPUCompiler/2CW9L/src/execution.jl:151
 [14] macro expansion
    @ ~/.julia/packages/CUDA/2kjXI/src/compiler/execution.jl:380 [inlined]
 [15] macro expansion
    @ ./lock.jl:273 [inlined]
 [16] cufunction(f::typeof(AcceleratedKernels.gpu__foreachindex_global!), tt::Type{…}; kwargs::@Kwargs{…})
    @ CUDA ~/.julia/packages/CUDA/2kjXI/src/compiler/execution.jl:375
 [17] macro expansion
    @ ~/.julia/packages/CUDA/2kjXI/src/compiler/execution.jl:112 [inlined]
 [18] (::KernelAbstractions.Kernel{…})(::Function, ::Vararg{…}; ndrange::Int64, workgroupsize::Nothing)
    @ CUDA.CUDAKernels ~/.julia/packages/CUDA/2kjXI/src/CUDAKernels.jl:103
 [19] _foreachindex_gpu(f::Function, itr::UnitRange{Int32}, backend::CUDABackend; block_size::Int64)
    @ AcceleratedKernels ~/.julia/packages/AcceleratedKernels/gP4jS/src/foreachindex.jl:17
 [20] _foreachindex_gpu
    @ ~/.julia/packages/AcceleratedKernels/gP4jS/src/foreachindex.jl:8 [inlined]
 [21] #foreachindex#9
    @ ~/.julia/packages/AcceleratedKernels/gP4jS/src/foreachindex.jl:160 [inlined]
 [22] foreachindex
    @ ~/.julia/packages/AcceleratedKernels/gP4jS/src/foreachindex.jl:141 [inlined]
 [23] aggregate_last_level!
    @ ~/.julia/packages/ImplicitBVH/0kmLq/src/build.jl:383 [inlined]
 [24] aggregate_oibvh!(bvh_nodes::CuArray{…}, bvh_leaves::CuArray{…}, tree::ImplicitTree{…}, order::CuArray{…}, built_level::Int32, options::BVHOptions{…})
    @ ImplicitBVH ~/.julia/packages/ImplicitBVH/0kmLq/src/build.jl:310
 [25] BVH(bounding_volumes::CuArray{…}, node_type::Type{…}, morton_type::Type{…}, built_level::Int64; options::BVHOptions{…})
    @ ImplicitBVH ~/.julia/packages/ImplicitBVH/0kmLq/src/build.jl:205
 [26] BVH
    @ ~/.julia/packages/ImplicitBVH/0kmLq/src/build.jl:160 [inlined]
 [27] BVH(bounding_volumes::CuArray{BSphere{Float32}, 1, CUDA.DeviceMemory}, node_type::Type{BBox{Float32}}, morton_type::Type{UInt32})
    @ ImplicitBVH ~/.julia/packages/ImplicitBVH/0kmLq/src/build.jl:160
 [28] top-level scope
    @ REPL[5]:1
Some type information was truncated. Use `show(err)` to see complete types.
```
Error info on v0.4.1:

```
ERROR: Scalar indexing is disallowed.
Invocation of getindex resulted in scalar indexing of a GPU array.
This is typically caused by calling an iterating implementation of a method.
Such implementations *do not* execute on the GPU, but very slowly on the CPU,
and therefore should be avoided.

If you want to allow scalar iteration, use `allowscalar` or `@allowscalar`
to enable scalar iteration globally or for the operations in question.
Stacktrace:
  [1] error(s::String)
    @ Base ./error.jl:35
  [2] errorscalar(op::String)
    @ GPUArraysCore ~/.julia/packages/GPUArraysCore/GMsgk/src/GPUArraysCore.jl:155
  [3] _assertscalar(op::String, behavior::GPUArraysCore.ScalarIndexing)
    @ GPUArraysCore ~/.julia/packages/GPUArraysCore/GMsgk/src/GPUArraysCore.jl:128
  [4] assertscalar(op::String)
    @ GPUArraysCore ~/.julia/packages/GPUArraysCore/GMsgk/src/GPUArraysCore.jl:116
  [5] getindex
    @ ~/.julia/packages/GPUArrays/qt4ax/src/host/indexing.jl:50 [inlined]
  [6] bounding_volumes_extrema(bounding_volumes::CuArray{BSphere{Float32}, 1, CUDA.DeviceMemory})
    @ ImplicitBVH ~/.julia/packages/ImplicitBVH/00zCQ/src/morton.jl:125
  [7] #morton_encode!#5
    @ ~/.julia/packages/ImplicitBVH/00zCQ/src/morton.jl:216 [inlined]
  [8] morton_encode!
    @ ~/.julia/packages/ImplicitBVH/00zCQ/src/morton.jl:209 [inlined]
  [9] BVH(bounding_volumes::CuArray{…}, node_type::Type{…}, morton_type::Type{…}, built_level::Int64; num_threads::Int64)
    @ ImplicitBVH ~/.julia/packages/ImplicitBVH/00zCQ/src/build.jl:170
 [10] BVH
    @ ~/.julia/packages/ImplicitBVH/00zCQ/src/build.jl:134 [inlined]
 [11] BVH(bounding_volumes::CuArray{BSphere{…}, 1, CUDA.DeviceMemory}, node_type::Type{BBox{…}}, morton_type::Type{UInt32})
    @ ImplicitBVH ~/.julia/packages/ImplicitBVH/00zCQ/src/build.jl:134
 [12] top-level scope
    @ REPL[11]:1
Some type information was truncated. Use `show(err)` to see complete types.
```
Versions:
- Julia v1.11.1
- CUDA v5.5.2
- ImplicitBVH v0.5.0 (`https://github.com/StellaOrg/ImplicitBVH.jl.git#main`) or ImplicitBVH v0.4.1
Hi, thanks for the report! I finally got onto a cluster with Nvidia GPUs. Julia was doing some odd lambda capturing (it was trying to pass a `Type{Int32}` to the kernel); separating that code into another function fixed it.

I tested and benchmarked the GPU backend via CUDA for building, contact detection, and ray tracing, and it seems to work.

Would you mind testing the #main version again?

For the other GPU backends, we are waiting on atomics support for KernelAbstractions.jl kernels; it should already work on AMD GPUs (though I don't have one on hand to try). On Apple Metal it will be available in the next version of Atomix, after a pull request I made; we are also waiting on the oneAPI atomics extension to be merged, and then all backends should work :)
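For anyone hitting a similar `KernelError: passing and using non-bitstype argument`, here is a minimal, hypothetical Julia sketch (not the actual ImplicitBVH code) of why a captured type object breaks kernel arguments, and the usual fix of carrying the type as a static parameter instead of a field:

```julia
# A struct (or an anonymous-function closure, which is lowered to a
# struct) that stores a type *object* as a field is not isbits, so the
# GPU backend refuses to pass it to a kernel:
struct BadKernelArgs
    T::Type                       # stores e.g. Type{Int32} as runtime data
end

# Carrying the type as a static parameter leaves no runtime field:
struct GoodKernelArgs{T} end

isbitstype(BadKernelArgs)           # false: has a non-isbits Type field
isbitstype(GoodKernelArgs{Int32})   # true: safe as a kernel argument
```

The closure in the error above (`ImplicitBVH.var"#12#13"{…, Type{Int32}}`, with its `.I` field) was analogous to `BadKernelArgs`: it had captured `Type{Int32}` as a field, which is why moving that code into a separate function, where the type becomes a compile-time parameter, resolves the `KernelError`.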