Getting a SIGBUS on the cluster #4
Comments
OK, I managed to get an interactive shell on the cluster with a Tesla K40, yay! I have compiled this on the node itself, which is nice. It compiles and runs to a point, but then:
Note that this is suspiciously close to what I'm getting on my GTX940MX:
So I am guessing this is similar/close? Except in my case, I get to run a bit then bump into

I hope this helps! Thanks in advance for helping debug this,
Mate

PS: Note to self. To create an interactive GPU shell on the cluster, use:
Yaaaaay! It's working now! Ah, I think this was the one that was causing issues, in particular:

So that would explain it maybe? Anyway, it's all good now! And it just finished solving an instance. Amazing! So we'll be able to use the cluster to run tons of experiments with 24 real cores! This is fantastic. Thank you so much for fixing this issue.

I'm about to schedule a run, but I see there is too much verbosity by default -- there is slow IO on the cluster, and IO costs space which is scarce. I'm opening a new issue about that and the configs you'd like me to run. Then we'll be good to go :)

Looking forward to running this on the cluster,
Mate
Hi Nicolas,
I'm getting a SIGBUS on the cluster. Apparently: "The BUS signal is sent to a process when it causes a bus error, such as an incorrect memory access alignment or non-existent physical address."
Output:
That's it, it exits there. I run everything under
/usr/bin/time -v
so I get some kernel info about the process that ran. This is:

I guess I'd need to have access to an interactive shell to be able to debug? Do you think we could up the verbosity somehow so we could get some info on where it's dying? I think that might help with #1 as well. In the meantime, I'm filing a support request with the cluster developers to learn how I can get an interactive GPU-enabled shell :)
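In case it's useful for batch jobs where no interactive shell is available: here is a minimal sketch of a wrapper that at least records which signal killed the process. It is not part of this project. The names `describe_exit` and `run_and_report` are made up for illustration, and it assumes a POSIX system, where Python's `subprocess` reports death by signal as a negative return code.

```python
import signal
import subprocess
import sys


def describe_exit(returncode):
    """Turn a subprocess return code into a readable description."""
    if returncode < 0:
        # On POSIX, a negative return code is the negated signal number.
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = "unknown signal"
        return f"killed by {name} (signal {-returncode})"
    return f"exited with status {returncode}"


def run_and_report(cmd):
    """Run cmd (a list of arguments), then log and return how it ended."""
    proc = subprocess.run(cmd)  # the child's stdout/stderr pass straight through
    summary = describe_exit(proc.returncode)
    print(summary, file=sys.stderr)
    return summary
```

Calling `run_and_report` with the solver's command line as a list of strings would leave one line in the job's stderr log saying whether the run exited normally or was killed by a signal such as SIGBUS. It does not say where in the code the fault happened; for that, a core dump or a debugger on the node would still be needed.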