parsl provider error messages are lost #679
Labels: bug
Comments
I've recreated this in my dev environment by replacing the For point 2, I have discussed internally with @sirosen about logging parsl (and more) error messages to the endpoint logs.
benclifford added a commit that referenced this issue Feb 2, 2022
This tries to find the provider label inside self.config.provider, which does not exist. In this interchange, the provider is directly available as an attribute. Tested by: modify my local kube provider to return None on all submits, see that the issue #679 stack trace appears. Make this change in this commit, and see that a ScalingFailed correctly appears. This addresses the first bullet point in issue #679.
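The failure mode described in this commit message can be sketched roughly as follows. This is a minimal, self-contained illustration and not the actual interchange code: `Interchange`, `Config`, `BrokenProvider`, the method names, and the `ScalingFailed` constructor signature used here are all stand-ins assumed for the example. Only the general shape (label looked up via `self.config.provider` versus directly via `self.provider`, and a provider that returns None on submit) comes from the message above.

```python
class ScalingFailed(Exception):
    """Stand-in for the ScalingFailed error named in the commit message."""

    def __init__(self, provider_label, reason):
        self.provider_label = provider_label
        self.reason = reason
        super().__init__(f"Provider {provider_label} failed to scale: {reason}")


class BrokenProvider:
    """Hypothetical provider that returns None on every submit."""

    label = "broken-kube"

    def submit(self, command, tasks_per_node):
        return None


class Config:
    """Hypothetical config object that has no `provider` attribute."""


class Interchange:
    """Hypothetical interchange holding the provider as a direct attribute."""

    def __init__(self, config, provider):
        self.config = config
        self.provider = provider

    def scale_out_buggy(self):
        job_id = self.provider.submit("launch-worker", 1)
        if job_id is None:
            # Bug: self.config has no `provider`, so building the error
            # raises AttributeError and the real failure is masked.
            raise ScalingFailed(self.config.provider.label, "submit returned None")
        return job_id

    def scale_out_fixed(self):
        job_id = self.provider.submit("launch-worker", 1)
        if job_id is None:
            # Fix: read the label from the attribute that actually exists.
            raise ScalingFailed(self.provider.label, "submit returned None")
        return job_id
```

With a provider that always returns None, `scale_out_buggy()` ends in an `AttributeError` stack trace, while `scale_out_fixed()` raises `ScalingFailed` naming the provider, which mirrors the before/after behavior described in the test notes.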
benclifford added a commit that referenced this issue Feb 15, 2022
This tries to find the provider label inside self.config.provider, which does not exist. In this interchange, the provider is directly available as an attribute. Tested by: modify my local kube provider to return None on all submits, see that the issue #679 stack trace appears. Make this change in this commit, and see that a ScalingFailed correctly appears. This addresses the first bullet point in issue #679.
benclifford added a commit that referenced this issue Mar 8, 2022
This tries to find the provider label inside self.config.provider, which does not exist. In this interchange, the provider is directly available as an attribute. Tested by: modify my local kube provider to return None on all submits, see that the issue #679 stack trace appears. Make this change in this commit, and see that a ScalingFailed correctly appears. Fixes issue #679
Describe the bug
This is based on a report in the #help Slack channel.
When the slurm provider fails to scale out, the code that is supposed to report that to the user fails in potentially several ways:
config
.parsl.providers.slurm
but the endpoint admin was unable to find the relevant log message - maybe it should appear around the same place as the above report? The relevant parsl log line is:
To Reproduce
Get endpoint to try to scale out with a broken provider/provider configuration
Expected behavior
The errors coming from parsl.providers should lead the user towards fixing the problem (in the example user's case, a quota exhaustion reported by sbatch) rather than being hidden.
Environment
slurm
other component versions unknown
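One way the expected behavior above might look in practice is sketched below, assuming the provider's submit call either raises or returns None on failure. Everything here is hypothetical: `submit_with_reporting`, `ProviderSubmitError`, the `"endpoint"` logger name, and the simulated error text are invented for illustration and are not parsl or endpoint APIs. The point is only that the underlying message is both logged and carried in the raised error instead of being dropped.

```python
import logging

# Assumed logger name; the real endpoint logger may be named differently.
logger = logging.getLogger("endpoint")


class ProviderSubmitError(Exception):
    """Hypothetical error that carries the scheduler's own message."""


def submit_with_reporting(submit, command):
    """Call a provider-style submit function and surface any failure.

    `submit` is any callable taking a command string and returning a job id.
    """
    try:
        job_id = submit(command)
    except Exception as exc:
        # Log the original message and traceback where the admin will look.
        logger.exception("Provider submit failed: %s", exc)
        raise ProviderSubmitError(f"scale-out failed: {exc}") from exc
    if job_id is None:
        msg = "provider returned no job id"
        logger.error("Provider submit failed: %s", msg)
        raise ProviderSubmitError(f"scale-out failed: {msg}")
    return job_id


def failing_submit(command):
    # Simulated failure; this message text is made up for the example.
    raise RuntimeError("quota exhausted (simulated scheduler error)")
```

Calling `submit_with_reporting(failing_submit, "launch-worker")` logs the simulated quota message at error level and raises `ProviderSubmitError` with the same text, so the cause is visible both in the log and to the caller.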