NaN values #33

Closed
ShengyuH opened this issue Oct 22, 2019 · 1 comment

Comments

@ShengyuH

hi @HuguesTHOMAS
Thanks for your work and open-source code!

With CUDA 10.2, Ubuntu 18.04.3, TensorFlow 1.12.0, and a GeForce GTX 1080 Ti, I successfully compiled the cpp wrappers and tf_ops by removing the tag as mentioned. However, when I run train_ModelNet.py, everything goes well for roughly the first two epochs; then, after around 2000 steps, the loss and accuracy become NaN. I compiled TensorFlow from source, and other TensorFlow user ops that I compiled in the same environment work without problems.
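For anyone trying to pin down when the NaN values first show up, here is a minimal sketch in plain Python. It is not part of this repository: the function name `first_non_finite_step` and the sample loss history are hypothetical, and it assumes the per-step loss values have been collected into a list of floats.

```python
import math


def first_non_finite_step(losses):
    """Return the index of the first NaN or infinite loss, or None if all are finite."""
    for step, loss in enumerate(losses):
        if not math.isfinite(loss):
            return step
    return None


# Hypothetical loss history; real values would come from the training loop.
history = [2.31, 1.87, 1.42, float("nan"), float("nan")]
print(first_non_finite_step(history))  # prints 3
```

Knowing the first bad step makes it possible to stop training there and inspect the batch and the outputs of the custom ops at that point, rather than only noticing the NaN in later logs.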

@HuguesTHOMAS
Owner

Hi @HenrryBryant,

As explained here, it seems that CUDA 10 has internal issues that lead to the appearance of NaN values. Although these issues have only been observed with an RTX 2080 Ti, I don't recommend using this version of CUDA.

Best,
Hugues
