
[bug] Type cast problem when using torch tensor as kernel args #7915

Closed
Dinngger opened this issue Apr 30, 2023 · 3 comments · Fixed by #7944

@Dinngger

Describe the bug
When using a PyTorch tensor as a kernel argument, ti.cast produces incorrect results.

To Reproduce

import torch
import taichi as ti

ti.init(arch=ti.gpu)

@ti.kernel
def normal_cast(index: ti.types.vector(3, ti.i32)):
    ui = ti.cast(index, ti.u32)
    print(index[0], ' ', index[1], ' ', index[2], ' -> ', ui[0], ' ', ui[1], ' ', ui[2])

@ti.kernel
def torch_cast(points: ti.types.ndarray()):
    voxel_f = ti.Vector([points[0], points[1], points[2]], ti.f32)
    voxel = ti.floor(voxel_f, ti.i32)
    ui = ti.cast(voxel, ti.u32)
    print(voxel[0], ' ', voxel[1], ' ', voxel[2], ' -> ', ui[0], ' ', ui[1], ' ', ui[2])

if __name__ == "__main__":
    points = torch.Tensor([-3, -31, -23])
    torch_cast(points.contiguous()) # this output is wrong
    voxel = ti.Vector([-3, -31, -23], dt=ti.i32)
    normal_cast(voxel)  # this output is ok

Log/Screenshots

$ python3 ti_cast.py 
[Taichi] version 1.5.0, llvm 15.0.4, commit 7b885c28, linux, python 3.8.10
[Taichi] Starting on arch=cuda
-3   -31   -23  ->  0   0   0
-3   -31   -23  ->  4294967293   4294967265   4294967273

Additional comments
It behaves correctly when changing torch.Tensor to torch.IntTensor.
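
For reference, the workaround described above looks like this (a minimal sketch reusing the torch_cast kernel from the repro):

# Workaround from the report: pass an integer tensor instead of a float tensor.
points_i = torch.IntTensor([-3, -31, -23])
torch_cast(points_i.contiguous())  # reported to print the expected values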

@ailzhang
Contributor

ailzhang commented May 6, 2023

FYI, a simpler repro for this is:

import torch
import taichi as ti

ti.init(arch=ti.cuda, print_ir=False)

@ti.kernel
def my_cast(x: ti.f32):
    print(ti.cast(x, ti.u32))

if __name__ == "__main__":
    my_cast(-1)

The CPU backend produces the expected result, but I haven't looked into the root cause yet.
Discussed with @lin-hitonami on this issue; we found that the LLVM code generated for CPU & CUDA is the same even though they produce different results.
Note that casting a negative floating-point value to an unsigned int is undefined behavior (https://stackoverflow.com/questions/25046164/llvm-intermediate-representation-fptoui-vs-fptosi), so it may actually make sense...
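
To spell that out with plain Python arithmetic (an illustration of the two cast chains only, not Taichi's implementation): float -> i32 -> u32 first truncates to a signed integer and then wraps it modulo 2**32, while a single float -> u32 step corresponds to LLVM's fptoui, which has no defined result for a negative input.

x = -1.0

# float -> i32: truncate toward zero to a signed integer
signed = int(x)            # -1
# i32 -> u32: reinterpret the signed value modulo 2**32
unsigned = signed % 2**32  # 4294967295, the value the two-step chain should give
print(unsigned)

# A direct float -> u32 conversion (fptoui) is undefined for negative inputs,
# which is why the zeros seen in the original report are technically "allowed".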

@Dinngger
Author

Dinngger commented May 6, 2023

In the algebraic simplification pass, Taichi combines two cast statements when they both cast to integral types. However, (float -> signed int -> unsigned int) is not equivalent to (float -> unsigned int). So maybe you can change the is_redundant_cast function to solve this issue?
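
A rough sketch of the rule being suggested, in plain Python rather than Taichi's actual C++ pass (the function and parameter names here are made up for illustration): merging is only safe when dropping the intermediate integer type cannot change the result.

# Hypothetical helper illustrating the idea; not Taichi's real is_redundant_cast.
def casts_can_be_merged(src_is_float: bool,
                        mid_is_signed: bool,
                        dst_is_signed: bool) -> bool:
    # float -> i32 -> u32 wraps the signed value, while float -> u32 is
    # undefined for negative inputs, so the two chains are not equivalent
    # and the intermediate cast must be kept.
    if src_is_float and mid_is_signed != dst_is_signed:
        return False
    return True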

With advanced optimization disabled, the code works as expected:

import taichi as ti

ti.init(arch=ti.gpu, advanced_optimization=False)

@ti.kernel
def my_cast(x: ti.f32):
    y = ti.floor(x, ti.i32)
    print(ti.cast(y, ti.u32))

if __name__ == "__main__":
    my_cast(-1)

The output is

$ python3 ti_cast.py 
[Taichi] version 1.5.0, llvm 15.0.4, commit 7b885c28, linux, python 3.8.10
[Taichi] Starting on arch=cuda
4294967295

@ailzhang
Contributor

ailzhang commented May 6, 2023

@Dinngger Aha nice catch, thanks! Let me take a look there!

lin-hitonami pushed a commit that referenced this issue May 8, 2023
Fixes #7915 

### Brief Summary

Fix a bug in `alg_simp` that removed casts between signed and unsigned
integers. Add a test case in `test_optimization` to check the cast
simplification.

### Walkthrough

* Fix a bug in algebraic simplification that incorrectly removed some
casts between signed and unsigned integers
([link](https://github.com/taichi-dev/taichi/pull/7944/files?diff=unified&w=0#diff-77d8ca8e4dc6081988bd6dddb74bb9a5485af28ce3e0b43bc06d123256695513L63-R66))
* Add a test case to verify the correctness of the cast simplification
after the bug fix
([link](https://github.com/taichi-dev/taichi/pull/7944/files?diff=unified&w=0#diff-b8b031f0789413acece482512df4af5b8419a2a2dea3624b26114bbb9b57d334R146-R155))
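
A regression test along these lines would exercise the same cast chain (a sketch only; the actual test added in #7944 may differ, and it assumes a CUDA device is available):

import taichi as ti

def test_signed_to_unsigned_cast_not_merged():
    ti.init(arch=ti.cuda)
    res = ti.field(ti.u32, shape=())

    @ti.kernel
    def chain_cast(x: ti.f32):
        # float -> i32 -> u32 must not be folded into a single float -> u32
        res[None] = ti.cast(ti.cast(x, ti.i32), ti.u32)

    chain_cast(-1.0)
    assert res[None] == 4294967295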