Hey,
I'm trying to use the forward_state function. From time to time, I get
this error:
RuntimeError: CUDA error: an illegal memory access was encountered
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1
The traceback points to:
File "/media/data2/ethan_baron/state-spaces-improv/src/models/sequence/ss/kernel.py", line 434, in _setup_linear
R = torch.linalg.solve(R.to(Q_D), Q_D) # (H r N)
That is, from these lines (433-436) in the NPLR kernel:
try:
    R = torch.linalg.solve(R.to(Q_D), Q_D)  # (H r N)
except torch._C._LinAlgError:
    R = torch.tensor(np.linalg.solve(R.to(Q_D).cpu(), Q_D.cpu())).to(Q_D)
For debugging, I changed these lines only slightly, to:
try:
    R = torch.linalg.solve(R.to(Q_D), Q_D)  # (H r N)
except:
    x1 = R.to(Q_D).cpu()  # left-hand side, moved to the CPU
    x2 = Q_D.cpu()  # right-hand side, moved to the CPU
    R = torch.tensor(np.linalg.solve(x1, x2)).to(Q_D)
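For reference, the try/except fallback quoted above can be sketched as a standalone helper. This is only an illustration: `solve_with_cpu_fallback` is a hypothetical name that does not appear in the repository, and the `.detach()` call is an added assumption so that the NumPy conversion also works on tensors that require gradients. Note that `except torch._C._LinAlgError` only catches linear-algebra failures. The `RuntimeError: CUDA error` reported in this issue is a different exception and would still propagate.

```python
import numpy as np
import torch


def solve_with_cpu_fallback(A: torch.Tensor, B: torch.Tensor) -> torch.Tensor:
    """Solve A X = B with PyTorch, retrying with NumPy on the CPU on failure.

    Hypothetical helper for illustration; not part of the state-spaces code.
    """
    try:
        return torch.linalg.solve(A, B)
    except torch._C._LinAlgError:
        # Only linear-algebra errors are caught here. Other errors, such as
        # CUDA runtime errors, are not handled and will propagate.
        A_np = A.detach().cpu().numpy()
        B_np = B.detach().cpu().numpy()
        X = np.linalg.solve(A_np, B_np)
        # Move the result back to the dtype and device of B.
        return torch.from_numpy(X).to(B)


# Small CPU example: a well-conditioned system takes the first branch.
A = 2.0 * torch.eye(3)
B = torch.ones(3, 2)
X = solve_with_cpu_fallback(A, B)
```

Because the NumPy path detaches the inputs, gradients would not flow through the fallback branch in this sketch.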
EDIT: Removed stacktrace (was quite unhelpful and long) and edited the code to be in code snippets.
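The error message suggests `CUDA_LAUNCH_BLOCKING=1` so that kernel errors are reported at the call that caused them. A minimal sketch of setting it from Python, assuming this runs at the very top of the entry script before anything initializes CUDA:

```python
import os

# Assumption: this executes before torch is imported and before any CUDA call,
# since the variable is expected to be read when CUDA is initialized.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"

# Import torch and run the failing code only after this point.
```

Setting the variable in the shell when launching the script should have the same effect. Synchronous launches can slow training noticeably, so this is meant for debugging runs only.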
I looked into this recently and also found the same issue, which wasn't present before. I wasn't able to figure out why. It's weird that it happens randomly.
Regardless, the implementation of "state forwarding" (README) is currently unoptimized for S4 so it is not recommended to use this. If you want this functionality, it should work with S4D. Feel free to file another issue if something comes up.
Finally, could you please edit the original issue here to be shorter, and in particular remove at least the last part of the stack trace? It might also help to put the whole thing in a code block. The last few lines are all parsed in a way that references other issues, which is confusing.
Yeah, I tried to look into it for a couple of days and couldn't figure out what was happening.
I'm now using the S4D forward_state version, and so far it works quite well.
Edited the issue, hopefully to be more readable.
Thanks!