Description
When the DS launches a remote training run, the DO side reports "TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first."
How to Reproduce
Run the code line by line. Everything works until PART 3: Training. (I have a GPU and CUDA.)
Training stops at epoch 1 and makes no further progress.
On the DO side, the error quoted above is reported: "TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first."
Expected Behavior
This is a classic issue in general ML, and I can find a solution for that case. But how should it be handled when using the FL library, where the training actually happens on the DO side?
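For reference, this is roughly the fix I mean for plain PyTorch, as a minimal sketch. The helper name `to_numpy` is my own and is not part of any FL library; the snippet falls back to the CPU when CUDA is unavailable. It only helps when I control the code that touches the tensor, which is exactly what I don't see how to do when that code runs on the DO side.

```python
import torch


def to_numpy(t: torch.Tensor):
    """Copy a tensor to host memory and return it as a NumPy array.

    Hypothetical helper for illustration only.
    """
    # detach() drops the autograd history (required before .numpy()),
    # cpu() copies from the GPU and is a no-op if the tensor is already on the host.
    return t.detach().cpu().numpy()


# Use the GPU if one is present, otherwise run the same code on the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# A stand-in for a loss value produced during training.
loss = torch.tensor([0.5, 0.25], device=device, requires_grad=True).sum()

# Calling loss.numpy() directly fails for a CUDA tensor with the TypeError
# above; going through the helper works on both devices.
print(to_numpy(loss))

# For a scalar such as a loss, item() returns a plain Python float.
print(loss.item())
```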
System Information
OS: Ubuntu 18.04
Language Version: Python 3.7.10, torch 1.8.1, torchvision 0.9.1