@williamFalcon I'm struggling to reproduce this. PL does not even recognize the TPUs in the runtime: XLA_AVAILABLE gets set to False (MNIST TPU Colab). I tried different PyTorch and XLA versions, and the result is the same every time.
Could you share the colab in which you observed the hangs?
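For anyone else hitting this, here is a quick sanity check to run in the Colab before starting PL. It is only a sketch, and it assumes XLA_AVAILABLE simply reflects whether the `torch_xla` package can be imported in the runtime (I have not confirmed that this is the only condition). `module_available` is a hypothetical helper name, not something from PL.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if a top-level module with this name can be found.

    Hypothetical helper for illustration; it only looks the module up
    and does not import it, so it has no TPU side effects.
    """
    return importlib.util.find_spec(name) is not None


# Assumption: a False here would explain XLA_AVAILABLE being False.
print("torch_xla found:", module_available("torch_xla"))
```

If this prints False, the problem is with the runtime's XLA install rather than with PL itself.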
@williamFalcon I checked again, and I am now able to run the MNIST Colab without validation; it no longer hangs (latest master). I'm not sure what fixed it.
I think it's somehow related to checkpointing.
The easiest way to debug is to get on Colab.