This repository has been archived by the owner on May 27, 2021. It is now read-only.

Arithmetic bug with 64-bit integers #58

Closed
maleadt opened this issue Apr 27, 2017 · 4 comments

@maleadt
Member

maleadt commented Apr 27, 2017

Computing 2^19 + 2 * 1 fails when working with 64-bit integers, but works with 32-bit ones. Consistently reproducible with a GeForce GTX TITAN on drivers 378.13 (current short-lived) and 381.09 (current beta), but works properly on 375.39 (current long-lived). SASS code is identical across driver versions, so this looks to be an even lower-level bug than #4.

Reproduced from a bogus OOB error (64-bit indices...) encountered while working on Rodinia/needle.
Looked something like:

function oob(reference)
    index    = 4130784 - 32784 * blockIdx().x + 16 * blockIdx().x + threadIdx().x + 2051

    ref = @cuStaticSharedMem(Int32, (16, 16))
    for ty = 0:15
        i = index + 2049 * ty + 1
        @inbounds ref[threadIdx().x, ty + 1] = reference[i]
    end

    return nothing
end

array = CuArray{Int32}(2049, 2049)
@cuda (1,1) oob(array)
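
To see why the OOB is bogus, the indices can be evaluated on the host. This is a minimal sketch, assuming the single thread launched by `@cuda (1,1)` sees `blockIdx().x == 1` and `threadIdx().x == 1` (1-based indexing); `max_index` is a hypothetical helper introduced here for illustration and is not part of the original code.

```julia
# Host-side check of the largest index the kernel above reads.
# Assumption: bx and tx stand in for blockIdx().x and threadIdx().x.
function max_index(bx, tx)
    index = 4130784 - 32784 * bx + 16 * bx + tx + 2051
    # Largest value of `i` over the loop `for ty = 0:15`.
    return maximum(index + 2049 * ty + 1 for ty in 0:15)
end

len = 2049 * 2049                     # number of elements in `array`
println(max_index(1, 1), " <= ", len) # prints "4130804 <= 4198401"
```

With correct 64-bit arithmetic every access stays inside the 2049 x 2049 array, so an out-of-bounds report can only come from a miscomputed index.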

Full repro:

using CUDAdrv, CUDAnative

function kernel{T}(one::T, ptr::Ptr{T})
    val = T(524288) + T(2) * one
    Base.pointerset(ptr, val, 1, 8)
    return nothing
end

dev = CuDevice(0)
ctx = CuContext(dev)

function test(name, T)
    ref = CuArray{T}(1)
    @cuda (1,1) kernel(T(1), pointer(ref))
    println("$name: ", Array(ref)[1])

    if !isfile("$name.ll")
        open("$name.ll", "w") do io
            CUDAnative.code_llvm(io, kernel, Tuple{T, Ptr{T}};
                                 dump_module=true, cap=capability(dev))
        end
    end
end

test("32bit", Int32)
test("64bit", Int64)

destroy(ctx)

32bit: 524290
64bit: 2
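
For comparison, the same expression can be evaluated on the host, where both integer widths should agree. This is a minimal sketch; `expected` is a hypothetical name, and the `where` syntax assumes Julia 0.6 or later (the kernel above uses the older `kernel{T}` form).

```julia
# Host-side reference for the value the kernel computes.
expected(one::T) where {T} = T(524288) + T(2) * one

for T in (Int32, Int64)
    println(T, ": ", expected(T(1)))
end
# prints:
# Int32: 524290
# Int64: 524290
```

The faulty 64-bit result of 2 equals `2 * 1` alone, which suggests the `2^19` term is what gets lost on the affected drivers.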

Bug has been filed with NVIDIA.

cc @cfoket

@maleadt
Member Author

maleadt commented Apr 27, 2017

> SASS code is identical across driver versions, so this looks to be an even lower-level bug than #4.

Unless of course the ptxas I use (part of the CUDA toolkit, not the driver) to get a hold of the SASS code contains its own copy of the assembler.

@maleadt
Member Author

maleadt commented Apr 27, 2017

Filed with NVIDIA as bug 200303640. Not that I have access to the bug tracker...

@vchuravy vchuravy assigned vchuravy and unassigned vchuravy May 8, 2017
@maleadt
Member Author

maleadt commented Jul 3, 2017

Fixed on 384.47 (current beta), 381.22 (current short-lived) and 375.66 (current long-lived).

Not sure how we should deal with this. Warn when detecting a bad driver version? But we lack libNVML on e.g. macOS, so that'd be annoying to detect.
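
A rough sketch of what such a check could look like, assuming the driver version has already been obtained somehow (which is exactly the hard part on platforms without libNVML). `KNOWN_BAD` and `is_known_bad` are hypothetical names, and the list only contains the two versions reported in this issue as reproducing the bug; the full affected range is not known.

```julia
# Driver versions on which this issue reports the bug reproducing
# (378.13 and 381.09). Assumption: this list is incomplete.
const KNOWN_BAD = (VersionNumber(378, 13), VersionNumber(381, 9))

# `driver` is assumed to be supplied by the caller; querying it is not covered here.
is_known_bad(driver::VersionNumber) = driver in KNOWN_BAD

for v in (VersionNumber(375, 39), VersionNumber(378, 13), VersionNumber(381, 22))
    println(v.major, ".", v.minor, " known bad: ", is_known_bad(v))
end
```

Matching exact versions avoids guessing at a range, at the cost of missing any other affected releases.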

@maleadt
Member Author

maleadt commented Nov 6, 2017

Seeing how this is fixed on current long-lived, and NVIDIA hasn't provided me with any details about the versions that contain this bug, I guess there's nothing much we can do about this except for advising users to be running the latest long-lived driver...

@maleadt maleadt closed this as completed Nov 6, 2017