-
Notifications
You must be signed in to change notification settings - Fork 4.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
xds: Poor RPC error messages when unable to get listener resource #5020
Comments
It didn't change the state of this issue. I had re-tested after that PR and it was realized to be an accident to close it. |
This is now failing with
In |
I think it's OK to close this issue now. |
@menghanl, "per-RPC creds failed due to error: context canceled" makes it seem like a timeout. How is it clear there is a permission issue? We shouldn't be losing the ADS failure status. It seems even low-level debugging doesn't include it. It looks like in Java we deliver ADS stream closures as errors to all the watchers. We might want to discuss if that is appropriate cross-language. |
@menghanl, in my test I had local credentials available (for the wrong project). In your test it is probably failing to talk to the metadata server or something. |
Using
So nothing has changed. (Changing to insecure should have similar results to running on GCE or setting GOOGLE_APPLICATION_CREDENTIALS) |
I will circle back to this once the xdsClient API changes are fully merged. |
I created a trash bootstrap config with server_uri set to
When I increase the RPC deadline to something like
At this point, the xDS resolver sends an empty service config on the channel, which results in the
This then immediately leads to RPC failure:
|
@ejona86 @dfawley
What do you think? |
That goes into a previous conversation about how grpc-go should not be using wait-on-ready for the ADS stream because it can't fail with useful error messages (nor promptly).
Timeouts and "not found" aren't helpful and are really misleading. The "error reading server preface" is the error that is useful. |
A57 changes are currently in review here: #5996, and there we are getting rid of the wait-for-ready call option.
Are you of the opinion that we should not be logging the timeout and "not found" errors? |
My main opinion is those errors are wrong today. Once you fix the wait-for-ready you should no longer time out and convert that to not found. It isn't so much "log or don't log" but "the code should stop going through those flows." |
After the A57 changes were merged, the client does not time out, and instead throws the following error:
Thanks @ejona86 !! I'm closing this now since I feel like things are working as expected. Please let us know if you feel otherwise. FYI @arvindbr8 |
This is related to #4939 (comment).
I created a trash bootstrap (td-grpc-bootstrap --gcp-project-number 5) and ran the xds client on my local machine without any credentials.
I'd have hoped the RPC wouldn't need to timeout, because xds client is unable to do a successful RPC. I thought that was supposed to produce an error quickly from xds client. It looks like Java does produce an error quickly ("Permission denied on resource project #5"). So that seems like a bug, but let's work around it.
Changing the RPC's timeout in main.go to 30s, let's try again:
That's... not that helpful. You have to be a grpc developer to understand that error message properly (others might think they understand it, but they'd be wrong). I could accept that error if the server was slow or something. But that's not the case.
Given that this situation is super-likely for new users, it is very important that we promptly provide helpful error messages in situations like this one.
The text was updated successfully, but these errors were encountered: