
[bug] Type cast problem when using torch tensor as kernel args #7915

Closed
Dinngger opened this issue Apr 30, 2023 · 3 comments · Fixed by #7944

@Dinngger

Describe the bug
When using a PyTorch tensor as a kernel argument, ti.cast produces incorrect results.

To Reproduce

import torch
import taichi as ti

ti.init(arch=ti.gpu)

@ti.kernel
def normal_cast(index: ti.types.vector(3, ti.i32)):
    ui = ti.cast(index, ti.u32)
    print(index[0], ' ', index[1], ' ', index[2], ' -> ', ui[0], ' ', ui[1], ' ', ui[2])

@ti.kernel
def torch_cast(points: ti.types.ndarray()):
    voxel_f = ti.Vector([points[0], points[1], points[2]], ti.f32)
    voxel = ti.floor(voxel_f, ti.i32)
    ui = ti.cast(voxel, ti.u32)
    print(voxel[0], ' ', voxel[1], ' ', voxel[2], ' -> ', ui[0], ' ', ui[1], ' ', ui[2])

if __name__ == "__main__":
    points = torch.Tensor([-3, -31, -23])
    torch_cast(points.contiguous()) # this output is wrong
    voxel = ti.Vector([-3, -31, -23], dt=ti.i32)
    normal_cast(voxel)  # this output is ok

Log/Screenshots

$ python3 ti_cast.py 
[Taichi] version 1.5.0, llvm 15.0.4, commit 7b885c28, linux, python 3.8.10
[Taichi] Starting on arch=cuda
-3   -31   -23  ->  0   0   0
-3   -31   -23  ->  4294967293   4294967265   4294967273

Additional comments
It behaves correctly when changing torch.Tensor to torch.IntTensor.
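
For reference, the workaround described above looks like this (a minimal sketch reusing the torch_cast kernel from the repro):

# Workaround from the report: pass an integer tensor instead of a float tensor.
points_i = torch.IntTensor([-3, -31, -23])
torch_cast(points_i.contiguous())  # reported to print the expected values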

@ailzhang
Contributor

ailzhang commented May 6, 2023

FYI, a simpler repro for this is:

import torch
import taichi as ti

ti.init(arch=ti.cuda, print_ir=False)

@ti.kernel
def my_cast(x: ti.f32):
    print(ti.cast(x, ti.u32))

if __name__ == "__main__":
    my_cast(-1)

The CPU backend produces the expected result, but I haven't looked into the root cause yet.
Discussed with @lin-hitonami on this issue; we found that the LLVM code generated for CPU & CUDA is the same even though they produce different results.
Note that casting a negative floating-point value to an unsigned int is undefined behavior (https://stackoverflow.com/questions/25046164/llvm-intermediate-representation-fptoui-vs-fptosi), so it may actually make sense...
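
To spell that out with plain Python arithmetic (an illustration of the two cast chains only, not Taichi's implementation): float -> i32 -> u32 first truncates to a signed integer and then wraps it modulo 2**32, while a single float -> u32 step corresponds to LLVM's fptoui, which has no defined result for a negative input.

x = -1.0

# float -> i32: truncate toward zero to a signed integer
signed = int(x)            # -1
# i32 -> u32: reinterpret the signed value modulo 2**32
unsigned = signed % 2**32  # 4294967295, the value the two-step chain should give
print(unsigned)

# A direct float -> u32 conversion (fptoui) is undefined for negative inputs,
# which is why the zeros seen in the original report are technically "allowed".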

@Dinngger
Author

Dinngger commented May 6, 2023

In the algebraic simplification pass, Taichi combines two cast statements when they both cast to integral types. However, (float -> signed int -> unsigned int) is not equivalent to (float -> unsigned int). So maybe you can change the is_redundant_cast function to solve this issue?
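
A rough sketch of the rule being suggested, in plain Python rather than Taichi's actual C++ pass (the function and parameter names here are made up for illustration): merging is only safe when dropping the intermediate integer type cannot change the result.

# Hypothetical helper illustrating the idea; not Taichi's real is_redundant_cast.
def casts_can_be_merged(src_is_float: bool,
                        mid_is_signed: bool,
                        dst_is_signed: bool) -> bool:
    # float -> i32 -> u32 wraps the signed value, while float -> u32 is
    # undefined for negative inputs, so the two chains are not equivalent
    # and the intermediate cast must be kept.
    if src_is_float and mid_is_signed != dst_is_signed:
        return False
    return True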

With advanced optimization disabled, the code works as expected:

import taichi as ti

ti.init(arch=ti.gpu, advanced_optimization=False)

@ti.kernel
def my_cast(x: ti.f32):
    y = ti.floor(x, ti.i32)
    print(ti.cast(y, ti.u32))

if __name__ == "__main__":
    my_cast(-1)

The output is

$ python3 ti_cast.py 
[Taichi] version 1.5.0, llvm 15.0.4, commit 7b885c28, linux, python 3.8.10
[Taichi] Starting on arch=cuda
4294967295

@ailzhang
Contributor

ailzhang commented May 6, 2023

@Dinngger Aha nice catch, thanks! Let me take a look there!

lin-hitonami pushed a commit that referenced this issue May 8, 2023
Fixes #7915 

### Brief Summary

Fix a bug in `alg_simp` that removed casts between signed and unsigned
integers. Add a test case in `test_optimization` to check the cast
simplification.

### Walkthrough

* Fix a bug in algebraic simplification that incorrectly removed some
casts between signed and unsigned integers
([link](https://github.com/taichi-dev/taichi/pull/7944/files?diff=unified&w=0#diff-77d8ca8e4dc6081988bd6dddb74bb9a5485af28ce3e0b43bc06d123256695513L63-R66))
* Add a test case to verify the correctness of the cast simplification
after the bug fix
([link](https://github.com/taichi-dev/taichi/pull/7944/files?diff=unified&w=0#diff-b8b031f0789413acece482512df4af5b8419a2a2dea3624b26114bbb9b57d334R146-R155))
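
A regression test along these lines would exercise the same cast chain (a sketch only; the actual test added in #7944 may differ, and it assumes a CUDA device is available):

import taichi as ti

def test_signed_to_unsigned_cast_not_merged():
    ti.init(arch=ti.cuda)
    res = ti.field(ti.u32, shape=())

    @ti.kernel
    def chain_cast(x: ti.f32):
        # float -> i32 -> u32 must not be folded into a single float -> u32
        res[None] = ti.cast(ti.cast(x, ti.i32), ti.u32)

    chain_cast(-1.0)
    assert res[None] == 4294967295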