Avoid discarding SRE state for IO cause #18836
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View this failed invocation of the CLA check for more information. For the most up to date status, view the checks section at the bottom of the pull request.
93419c4 to f2651d5
Unwrapping every StatusRuntimeException (SRE) in ReferenceCountedChannel when its cause is an IOException discards critical tracing and retriability information. The Retrier's evaluation may then find no SRE in the causal chain and presume the exception is unretriable. In general, IOExceptions are unsuitable containers for SREs and are routinely misused, either for identification (grpc aware status) or for capture (handleInitError).
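The effect described above can be illustrated with a minimal sketch. It is not Bazel's Retrier or ReferenceCountedChannel code: `FakeStatusRuntimeException`, `Code`, `findStatus`, and `isRetriable` are hypothetical stand-ins (the first replaces `io.grpc.StatusRuntimeException` so the example has no gRPC dependency), and the policy "only UNAVAILABLE is retriable" is an assumption chosen for illustration.

```java
import java.io.IOException;

// Hypothetical sketch: shows why unwrapping an SRE down to its IOException
// cause hides the status from a retry check that walks the causal chain.
class CausalChainSketch {
  enum Code { UNAVAILABLE, INVALID_ARGUMENT }

  // Stand-in for io.grpc.StatusRuntimeException (assumption, not the real class).
  static final class FakeStatusRuntimeException extends RuntimeException {
    final Code code;

    FakeStatusRuntimeException(Code code, Throwable cause) {
      super(code.name(), cause);
      this.code = code;
    }
  }

  // Walks the causal chain and returns the first status-bearing exception, or null.
  static FakeStatusRuntimeException findStatus(Throwable t) {
    for (Throwable cur = t; cur != null; cur = cur.getCause()) {
      if (cur instanceof FakeStatusRuntimeException) {
        return (FakeStatusRuntimeException) cur;
      }
    }
    return null;
  }

  // Assumed policy: retriable only if a status is found and it is UNAVAILABLE.
  static boolean isRetriable(Throwable t) {
    FakeStatusRuntimeException sre = findStatus(t);
    return sre != null && sre.code == Code.UNAVAILABLE;
  }

  public static void main(String[] args) {
    IOException io = new IOException("SSL handshake timed out");
    FakeStatusRuntimeException sre =
        new FakeStatusRuntimeException(Code.UNAVAILABLE, io);

    // Unwrapped: only the IOException is left, so no status is visible.
    System.out.println(isRetriable(sre.getCause())); // prints "false"

    // Preserved: the status is still in the chain.
    System.out.println(isRetriable(sre)); // prints "true"
  }
}
```

In this sketch the same failure is classified differently depending only on whether the status-bearing exception is still reachable from the throwable handed to the retry check.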
Unwrapping every StatusRuntimeException (SRE) in ReferenceCountedChannel when its cause is an IOException discards critical tracing and retriability information. The Retrier's evaluation may then find no SRE in the causal chain and presume the exception is unretriable. In general, IOExceptions are unsuitable containers for SREs and are routinely misused, either for identification (grpc aware status) or for capture (handleInitError).
Partially addresses bazelbuild#18764 (retries will occur with SSL handshake timeout, but the actual connection will not be retried)
Closes bazelbuild#18836.
PiperOrigin-RevId: 546037698
Change-Id: I7f6efcb857c557aa97ad3df085fc032c8538eb9a
* Include stack trace in all gRPC errors when --verbose_failures is set.

  Also refactor a couple of places where the stack trace was included in an ad-hoc manner, and force Utils.grpcAwareErrorMessage callers to be explicit to avoid future instances.

  Closes #16086.
  PiperOrigin-RevId: 502854490
  Change-Id: Id2d6a1728630fffea9399b406378c7f173b247bd

* Avoid discarding SRE state for IO cause

  Unwrapping every StatusRuntimeException (SRE) in ReferenceCountedChannel when its cause is an IOException discards critical tracing and retriability information. The Retrier's evaluation may then find no SRE in the causal chain and presume the exception is unretriable. In general, IOExceptions are unsuitable containers for SREs and are routinely misused, either for identification (grpc aware status) or for capture (handleInitError).

  Partially addresses #18764 (retries will occur with SSL handshake timeout, but the actual connection will not be retried)

  Closes #18836.
  PiperOrigin-RevId: 546037698
  Change-Id: I7f6efcb857c557aa97ad3df085fc032c8538eb9a

* Operation stream termination is not an error

  According to the GrpcRemoteExecutor, this holds when termination occurs after a !done operation response. Remove the error from the ExperimentalRemoteGrpcExecutor and reinforce both with tests.

  Update the FakeExecutionService to generate compatible error responses that appear in the ExecuteResponse, rather than the operation error field, per the REAPI spec. Made required adjustments to ExGRE Test invocations to avoid the ExecutionStatusException interpretation of DEADLINE_EXCEEDED -> FAILED_PRECONDITION in ExecuteResponse.

  Closes #18785.
  PiperOrigin-RevId: 546925894
  Change-Id: I7a489c8bc936a83cfd94e0138437f3fe6d152da8

* Done operations must be reexecuted

  Any operation with done == true as reported by the server is not expected to change its result on subsequent waitExecution calls. To properly retry, this action must be reexecuted, if it was truly transient, to achieve a definitive result.

  Submit a transient status for retry, disallow special behaviors for NOT_FOUND as covered by done observation, and consider method type when handling operation streams.

  Closes #18943.
  PiperOrigin-RevId: 548680656
  Change-Id: Ib2c9887ead1fbd3de97761db6e8b4077783ad03c

---------

Co-authored-by: Tiago Quelhas <[email protected]>
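The rules in the last two commits can be summarized as a small decision function. This is a hedged sketch only, not the GrpcRemoteExecutor or ExperimentalRemoteGrpcExecutor implementation: the `Action` enum, the `next` method, and its boolean parameters are hypothetical names, and resuming with waitExecution after a stream ends on a !done operation is an assumption about the follow-up step rather than something the commit message states.

```java
// Hypothetical sketch of the retry decision described in the commit messages.
class OperationRetrySketch {
  enum Action { RETURN_RESULT, WAIT_EXECUTION, REEXECUTE }

  // done:           the server reported the operation as done
  // transientError: the completed result carries a transient failure status
  static Action next(boolean done, boolean transientError) {
    if (!done) {
      // Stream ended after a !done response: not an error.
      // Assumption: the client resumes by waiting on the same operation.
      return Action.WAIT_EXECUTION;
    }
    if (transientError) {
      // A done operation will not change its result on later waitExecution
      // calls, so a retry has to execute the action again.
      return Action.REEXECUTE;
    }
    return Action.RETURN_RESULT;
  }

  public static void main(String[] args) {
    System.out.println(next(false, false)); // prints "WAIT_EXECUTION"
    System.out.println(next(true, true));   // prints "REEXECUTE"
    System.out.println(next(true, false));  // prints "RETURN_RESULT"
  }
}
```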
The changes in this PR have been included in Bazel 6.4.0 RC1. Please test out the release candidate and report any issues as soon as possible. If you're using Bazelisk, you can point to the latest RC by setting USE_BAZEL_VERSION=last_rc.