
BVH building is not compatible with CUDA.jl #10

Closed
ZenanH opened this issue Nov 21, 2024 · 2 comments
ZenanH commented Nov 21, 2024

Hi, I just tested the code from the README, but it errors with CUDA.jl:

using CUDA

using ImplicitBVH
using ImplicitBVH: BBox, BSphere

# Generate some simple bounding spheres; save them in a GPU array
bounding_spheres = CuArray([
    BSphere{Float32}([0., 0., 0.], 0.5),
    BSphere{Float32}([0., 0., 1.], 0.6),
    BSphere{Float32}([0., 0., 2.], 0.5),
    BSphere{Float32}([0., 0., 3.], 0.4),
    BSphere{Float32}([0., 0., 4.], 0.6),
])

# Build BVH
bvh = BVH(bounding_spheres, BBox{Float32}, UInt32)

Error info on the main branch (v0.5.0):

ERROR: GPU compilation of MethodInstance for AcceleratedKernels.gpu__foreachindex_global!(::KernelAbstractions.CompilerMetadata{…}, ::ImplicitBVH.var"#12#13"{}, ::Base.OneTo{…}) failed
KernelError: passing and using non-bitstype argument

Argument 3 to your kernel function is of type ImplicitBVH.var"#12#13"{CuDeviceVector{BBox{Float32}, 1}, CuDeviceVector{BSphere{Float32}, 1}, CuDeviceVector{Int32, 1}, Int32, Int32, Type{Int32}}, which is not isbits:
  .I is of type Type{Int32} which is not isbits.


Stacktrace:
  [1] check_invocation(job::GPUCompiler.CompilerJob)
    @ GPUCompiler ~/.julia/packages/GPUCompiler/2CW9L/src/validation.jl:92
  [2] macro expansion
    @ ~/.julia/packages/GPUCompiler/2CW9L/src/driver.jl:92 [inlined]
  [3] macro expansion
    @ ~/.julia/packages/TimerOutputs/NRdsv/src/TimerOutput.jl:253 [inlined]
  [4] 
    @ GPUCompiler ~/.julia/packages/GPUCompiler/2CW9L/src/driver.jl:90
  [5] codegen
    @ ~/.julia/packages/GPUCompiler/2CW9L/src/driver.jl:82 [inlined]
  [6] compile(target::Symbol, job::GPUCompiler.CompilerJob; kwargs::@Kwargs{})
    @ GPUCompiler ~/.julia/packages/GPUCompiler/2CW9L/src/driver.jl:79
  [7] compile
    @ ~/.julia/packages/GPUCompiler/2CW9L/src/driver.jl:74 [inlined]
  [8] #1145
    @ ~/.julia/packages/CUDA/2kjXI/src/compiler/compilation.jl:250 [inlined]
  [9] JuliaContext(f::CUDA.var"#1145#1148"{GPUCompiler.CompilerJob{GPUCompiler.PTXCompilerTarget, CUDA.CUDACompilerParams}}; kwargs::@Kwargs{})
    @ GPUCompiler ~/.julia/packages/GPUCompiler/2CW9L/src/driver.jl:34
 [10] JuliaContext(f::Function)
    @ GPUCompiler ~/.julia/packages/GPUCompiler/2CW9L/src/driver.jl:25
 [11] compile(job::GPUCompiler.CompilerJob)
    @ CUDA ~/.julia/packages/CUDA/2kjXI/src/compiler/compilation.jl:249
 [12] actual_compilation(cache::Dict{…}, src::Core.MethodInstance, world::UInt64, cfg::GPUCompiler.CompilerConfig{…}, compiler::typeof(CUDA.compile), linker::typeof(CUDA.link))
    @ GPUCompiler ~/.julia/packages/GPUCompiler/2CW9L/src/execution.jl:237
 [13] cached_compilation(cache::Dict{…}, src::Core.MethodInstance, cfg::GPUCompiler.CompilerConfig{…}, compiler::Function, linker::Function)
    @ GPUCompiler ~/.julia/packages/GPUCompiler/2CW9L/src/execution.jl:151
 [14] macro expansion
    @ ~/.julia/packages/CUDA/2kjXI/src/compiler/execution.jl:380 [inlined]
 [15] macro expansion
    @ ./lock.jl:273 [inlined]
 [16] cufunction(f::typeof(AcceleratedKernels.gpu__foreachindex_global!), tt::Type{…}; kwargs::@Kwargs{})
    @ CUDA ~/.julia/packages/CUDA/2kjXI/src/compiler/execution.jl:375
 [17] macro expansion
    @ ~/.julia/packages/CUDA/2kjXI/src/compiler/execution.jl:112 [inlined]
 [18] (::KernelAbstractions.Kernel{…})(::Function, ::Vararg{…}; ndrange::Int64, workgroupsize::Nothing)
    @ CUDA.CUDAKernels ~/.julia/packages/CUDA/2kjXI/src/CUDAKernels.jl:103
 [19] _foreachindex_gpu(f::Function, itr::UnitRange{Int32}, backend::CUDABackend; block_size::Int64)
    @ AcceleratedKernels ~/.julia/packages/AcceleratedKernels/gP4jS/src/foreachindex.jl:17
 [20] _foreachindex_gpu
    @ ~/.julia/packages/AcceleratedKernels/gP4jS/src/foreachindex.jl:8 [inlined]
 [21] #foreachindex#9
    @ ~/.julia/packages/AcceleratedKernels/gP4jS/src/foreachindex.jl:160 [inlined]
 [22] foreachindex
    @ ~/.julia/packages/AcceleratedKernels/gP4jS/src/foreachindex.jl:141 [inlined]
 [23] aggregate_last_level!
    @ ~/.julia/packages/ImplicitBVH/0kmLq/src/build.jl:383 [inlined]
 [24] aggregate_oibvh!(bvh_nodes::CuArray{…}, bvh_leaves::CuArray{…}, tree::ImplicitTree{…}, order::CuArray{…}, built_level::Int32, options::BVHOptions{…})
    @ ImplicitBVH ~/.julia/packages/ImplicitBVH/0kmLq/src/build.jl:310
 [25] BVH(bounding_volumes::CuArray{…}, node_type::Type{…}, morton_type::Type{…}, built_level::Int64; options::BVHOptions{…})
    @ ImplicitBVH ~/.julia/packages/ImplicitBVH/0kmLq/src/build.jl:205
 [26] BVH
    @ ~/.julia/packages/ImplicitBVH/0kmLq/src/build.jl:160 [inlined]
 [27] BVH(bounding_volumes::CuArray{BSphere{Float32}, 1, CUDA.DeviceMemory}, node_type::Type{BBox{Float32}}, morton_type::Type{UInt32})
    @ ImplicitBVH ~/.julia/packages/ImplicitBVH/0kmLq/src/build.jl:160
 [28] top-level scope
    @ REPL[5]:1
Some type information was truncated. Use `show(err)` to see complete types.
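For context, a minimal sketch (not ImplicitBVH code) of what the `KernelError` above is complaining about: GPU kernel arguments must be "isbits", i.e. immutable values with no heap references. Plain numeric types qualify, but a `Type` object does not, so a closure capturing one cannot be passed to the GPU.

```julia
# GPU kernels can only take isbits arguments; a `Type` object is a
# heap-allocated singleton and does not qualify.
println(isbitstype(Int32))        # true: plain bits data
println(isbitstype(Type{Int32}))  # false: a type object, not isbits

# A closure that captures a type *value* carries it as a non-isbits
# field, which is exactly the `.I is of type Type{Int32}` complaint
# in the error message above.
```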

Error info on v0.4.1:

ERROR: Scalar indexing is disallowed.
Invocation of getindex resulted in scalar indexing of a GPU array.
This is typically caused by calling an iterating implementation of a method.
Such implementations *do not* execute on the GPU, but very slowly on the CPU,
and therefore should be avoided.

If you want to allow scalar iteration, use `allowscalar` or `@allowscalar`
to enable scalar iteration globally or for the operations in question.
Stacktrace:
  [1] error(s::String)
    @ Base ./error.jl:35
  [2] errorscalar(op::String)
    @ GPUArraysCore ~/.julia/packages/GPUArraysCore/GMsgk/src/GPUArraysCore.jl:155
  [3] _assertscalar(op::String, behavior::GPUArraysCore.ScalarIndexing)
    @ GPUArraysCore ~/.julia/packages/GPUArraysCore/GMsgk/src/GPUArraysCore.jl:128
  [4] assertscalar(op::String)
    @ GPUArraysCore ~/.julia/packages/GPUArraysCore/GMsgk/src/GPUArraysCore.jl:116
  [5] getindex
    @ ~/.julia/packages/GPUArrays/qt4ax/src/host/indexing.jl:50 [inlined]
  [6] bounding_volumes_extrema(bounding_volumes::CuArray{BSphere{Float32}, 1, CUDA.DeviceMemory})
    @ ImplicitBVH ~/.julia/packages/ImplicitBVH/00zCQ/src/morton.jl:125
  [7] #morton_encode!#5
    @ ~/.julia/packages/ImplicitBVH/00zCQ/src/morton.jl:216 [inlined]
  [8] morton_encode!
    @ ~/.julia/packages/ImplicitBVH/00zCQ/src/morton.jl:209 [inlined]
  [9] BVH(bounding_volumes::CuArray{…}, node_type::Type{…}, morton_type::Type{…}, built_level::Int64; num_threads::Int64)
    @ ImplicitBVH ~/.julia/packages/ImplicitBVH/00zCQ/src/build.jl:170
 [10] BVH
    @ ~/.julia/packages/ImplicitBVH/00zCQ/src/build.jl:134 [inlined]
 [11] BVH(bounding_volumes::CuArray{BSphere{…}, 1, CUDA.DeviceMemory}, node_type::Type{BBox{…}}, morton_type::Type{UInt32})
    @ ImplicitBVH ~/.julia/packages/ImplicitBVH/00zCQ/src/build.jl:134
 [12] top-level scope
    @ REPL[11]:1
Some type information was truncated. Use `show(err)` to see complete types.
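As the error text says, scalar indexing can be re-enabled for debugging via `CUDA.@allowscalar`. A hedged sketch of that escape hatch (requires a CUDA-capable GPU, and it is a diagnostic aid, not a fix; the proper fix is for the library to avoid per-element host indexing):

```julia
using CUDA  # assumes a working CUDA.jl setup

xs = CUDA.zeros(Float32, 4)
CUDA.@allowscalar xs[1] = 1.0f0       # scalar write permitted inside the macro
val = CUDA.@allowscalar xs[1]         # scalar read permitted inside the macro
```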
Julia Version 1.11.1
CUDA v5.5.2
ImplicitBVH v0.5.0 `https://github.com/StellaOrg/ImplicitBVH.jl.git#main`
or ImplicitBVH v0.4.1
@anicusan
Member

Hi, thanks for the report! I finally got on a cluster with Nvidia GPUs. Julia had some odd lambda capturing (it was trying to pass Type{Int32} to the kernel); separating that into another function fixed it.

I tested and benchmarked the GPU backend via CUDA for building, contact detection, and ray tracing, and it seems to work.
Would you mind testing the #main version again?

For the other GPU backends, we are waiting on atomics for KernelAbstractions.jl kernels; it should already work on AMD GPUs (though I don't have one on hand to try). On Apple Metal it will be available in the next version of Atomix, after a pull request I made; we are also waiting for the oneAPI atomics extension to be merged. Then all backends should work :)
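A hedged sketch (not ImplicitBVH's actual code) of the kind of "separate it into another function" fix described above: a closure capturing a `Type` as a runtime value is not isbits, while a function barrier with a `::Type{I} where I` signature moves the type into the closure's type parameters instead of a field. Exact behavior depends on Julia's specialization heuristics and may vary across versions.

```julia
# Problematic pattern: `I` is stored as a runtime field of the closure
# (its type is a non-isbits `DataType`), so the closure cannot be sent
# to a GPU kernel.
make_bad(I) = i -> I(i)

# Function-barrier pattern: `::Type{I} where {I}` forces specialization,
# so `I` lives in the closure's *type*, leaving no non-isbits field.
make_good(::Type{I}) where {I} = i -> I(i)

println(isbits(make_bad(Int32)))   # expected false: captures a DataType field
println(isbits(make_good(Int32)))  # expected true: type fully specialized away
```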


anicusan commented Dec 2, 2024

I will close this issue now. If you find any other problems, please feel free to reopen this issue or open a new one.

@anicusan anicusan closed this as completed Dec 2, 2024