A42 update: fix ring_hash connectivity state aggregation rules #296
CC @dfawley
prevent the failover timer in the `priority` policy from working
correctly, because the timer will be started when the child is created
but then immediately cancelled when it reports `IDLE`. To address this,
we will change the `priority` policy to restart the failover timer when a
Just to confirm, consider the following case:
- ring hash subchannels are all unreachable in an unresponsive way (so it takes e.g. 30 seconds to notice each connection failure)
- ring hash first under a `priority` policy, and a working failover priority follows it
- channel and the LB policy tree are just starting up, i.e. first RPC is made on a new channel

Does this mean we can now rely on the `priority` failover timer to fail over after 10 seconds? (where before we needed to wait for ring hash to enter TF)
Yes, it does mean that we'll fail over faster in the case where all backends are blackholed.
The down-side of this is that we may wind up failing over without ever having tried more than one address. So while this will fail over more quickly in the case where all of the addresses are blackholed, if there is a case where only the first address we try is blackholed but all of the other addresses are working fine, we will wind up failing over when we ideally shouldn't.
If that becomes a problem, and if we can ever agree on the semantics for happy eyeballs, we could potentially use something like that to improve this in the future.
A42-xds-ring-hash-lb-policy.md
Outdated
`READY` or `IDLE`. This means that for `ring_hash` to function as a child
of the `priority` policy, it needs to report `TRANSIENT_FAILURE` when its
subchannels are not reachable. However, because `ring_hash` attempts to
connect only to those subchannels that pick requests hash to, there are
Is this accurate, that there can be addresses which ring hash will never attempt?
I might be wrong, but I thought that if every connection attempt failed, ring hash would eventually circle around and try every address?
I'm asking because, if ring hash will eventually attempt every address, then given the new priority changes to properly handle the failover timeout with ring hash, I'm wondering if this heuristic to enter TRANSIENT_FAILURE after encountering two connection failures is still worth having, i.e. what problem would it solve that isn't solved by priority failover?
Yes, you're right, it will eventually try all of the subchannels. However, this change is still needed for two reasons:
- The `priority` policy uses an idempotent algorithm for choosing a priority that is triggered every time a child's state changes, and that algorithm treats `IDLE` the same as `READY` -- i.e., the state of the failover timer matters only when the child is reporting `CONNECTING`, so if `ring_hash` continues to report `IDLE` after the failover timer fires, we'll wind up re-selecting it as soon as the newly created next priority reports `CONNECTING`, which is not what we want. (To say this another way, the priority policy assumes that a policy in `IDLE` will transition to `CONNECTING` as soon as it gets a pick.)
- The `ring_hash` policy is designed to not be xDS-specific; eventually, we'd like to be able to use it as a top-level policy even without xDS. At that point, it would still be necessary to have the same heuristic here so that the channel will properly go into `TRANSIENT_FAILURE` when things are not working, rather than staying in `IDLE` indefinitely. If things are not working, we want non-wait_for_ready RPCs to fail quickly.

I've updated the wording here to try to clarify this a bit.
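To make the first reason concrete, here is a minimal sketch of a selection loop of the kind described above. It is not the actual `priority` implementation: the `Child` type, its field names, and `choose_priority` are hypothetical, and the loop is simplified to only the behavior the comment describes (`IDLE` treated like `READY`, the failover timer consulted only for `CONNECTING`).

```python
from dataclasses import dataclass
from typing import List, Optional

READY = "READY"
IDLE = "IDLE"
CONNECTING = "CONNECTING"
TRANSIENT_FAILURE = "TRANSIENT_FAILURE"


@dataclass
class Child:
    name: str                     # hypothetical label for the priority's child policy
    state: str                    # last connectivity state the child reported
    failover_timer_pending: bool  # True while the failover timer has not yet fired


def choose_priority(children: List[Child]) -> Optional[Child]:
    """Re-run on every child state change; children are in priority order."""
    for child in children:
        # IDLE is treated the same as READY: the child is considered usable.
        if child.state in (READY, IDLE):
            return child
        # The timer only matters while the child is CONNECTING.
        if child.state == CONNECTING and child.failover_timer_pending:
            return child
    return None  # nothing usable; a real policy would report TRANSIENT_FAILURE


# ring_hash still reports IDLE after its failover timer fired, so it is
# re-selected even though the next priority has started connecting.
stuck = [Child("ring_hash", IDLE, False), Child("fallback", CONNECTING, True)]
print(choose_priority(stuck).name)

# Once ring_hash reports TRANSIENT_FAILURE, selection moves to the next priority.
failed = [Child("ring_hash", TRANSIENT_FAILURE, False), Child("fallback", CONNECTING, True)]
print(choose_priority(failed).name)
```

Because the loop is idempotent and stateless across calls, the only way for the higher priority to stop being selected in this sketch is for it to report something other than `READY` or `IDLE`.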
Copying over context from chat FTR:
When `ring_hash` is under `priority`, it could stay in connecting for however long it took until reaching TF, and priority would do the right thing.
However, when ring_hash is a top-level policy, we need to balance the goals of failing fast in some cases and using all failover addresses in others.
My only thought about waiting until every subchannel enters TF is that it may be easier for users to understand - that time taken for ring hash to reach TF is proportional to ring hash's address list size (i.e. we remove the need to understand this heuristic). But OTOH, taking forever to reach TF introduces its own problems.
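The two options being compared here can be sketched as one check with a configurable threshold. This is a simplified illustration, not the gRFC's full aggregation rules: the function name and the `tf_threshold` parameter are hypothetical, and the assumption that any `READY` subchannel takes precedence is mine.

```python
from typing import Sequence


def reports_transient_failure(states: Sequence[str], tf_threshold: int) -> bool:
    """Return True if the policy would report TRANSIENT_FAILURE.

    Assumption: a READY subchannel always wins, so the policy is not in
    TRANSIENT_FAILURE while any subchannel is READY.
    """
    if "READY" in states:
        return False
    failed = sum(1 for s in states if s == "TRANSIENT_FAILURE")
    return failed >= tf_threshold


# Five addresses; two attempts have failed, the rest have not been tried yet.
states = ["TRANSIENT_FAILURE", "TRANSIENT_FAILURE", "IDLE", "IDLE", "IDLE"]

# Heuristic from the proposal: two failures are enough.
print(reports_transient_failure(states, tf_threshold=2))

# Alternative discussed above: wait until every subchannel has failed.
print(reports_transient_failure(states, tf_threshold=len(states)))
```

With the threshold set to the list size, the time to reach TF grows with the number of addresses, which is the trade-off raised in this comment.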
Yeah, we really can't wait until every single subchannel has been attempted before going TF, since that would basically turn every RPC into a wait_for_ready RPC, which is definitely not what we want.
This is a heuristic, and like any heuristic, it's an imperfect attempt to balance between competing objectives. No matter what value we choose, someone can argue that a different value would be better in their case, but we can't choose the value they want without making it worse in some other case.
At the end of the day, I think two is the right number here. We definitely don't want to do just one, because having one individual backend unreachable is probably not uncommon, and we don't want to fail all RPCs in that case. But it's fairly unlikely that the first two backends that we happen to try are both independently down at the same time without there being some broader reachability issue that will also affect any other backend we might try. Increasing the number to three might give us slightly more confidence that there is a broader reachability issue that will also affect any other backend, but that slight boost in confidence doesn't seem like enough to justify the additional delay in failing non-wait_for_ready RPCs. And the trade-off gets dramatically worse as you go up from three: the increased confidence becomes smaller and smaller, and the delay before failing non-wait_for_ready RPCs becomes larger and larger. So I think two is the right sweet-spot here.
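The shape of this trade-off can be illustrated with a rough back-of-the-envelope model. Everything in it is an assumption made for illustration: backends are taken to be down independently with probability `p_down`, connection attempts are taken to happen one after another, and the 1% and 30-second figures are hypothetical (the 30 seconds echoes the example earlier in this thread).

```python
def tradeoff(p_down: float, connect_timeout_s: float, k: int):
    """Model a threshold of k failed attempts before reporting TRANSIENT_FAILURE.

    Returns a pair:
      - the chance that the first k backends tried are all down purely by
        coincidence (no broader reachability problem), assuming independence
      - the worst-case delay before non-wait_for_ready RPCs start failing,
        assuming the k attempts are made sequentially
    """
    coincidence = p_down ** k
    delay_s = k * connect_timeout_s
    return coincidence, delay_s


for k in (1, 2, 3, 4):
    coincidence, delay_s = tradeoff(p_down=0.01, connect_timeout_s=30.0, k=k)
    print(f"k={k}: coincidence probability {coincidence:.0e}, delay up to {delay_s:.0f}s")
```

Under these assumptions the absolute reduction in coincidence probability shrinks sharply with each step beyond two, while the delay keeps growing linearly, which is the pattern the comment describes.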
… child switch to Connecting from non-transient-failure state See grpc/proposal#296 for context. After this change, priority will restart the failover timer when a child reports Connecting, if that child hasn't reported TF more recently than it reported Ready or Idle. Also changed the priority policy to always call the centralized function `syncPriority` to handle child switching.
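The timer rule in this commit message can be sketched as a small state tracker. This is an illustration only, not the code from the referenced change: the class and attribute names are hypothetical, and it assumes the timer is started when the child is created and cancelled whenever the child reports `READY`, `IDLE`, or `TRANSIENT_FAILURE`.

```python
READY = "READY"
IDLE = "IDLE"
CONNECTING = "CONNECTING"
TRANSIENT_FAILURE = "TRANSIENT_FAILURE"


class FailoverTimerTracker:
    """Tracks whether one child's failover timer should be running."""

    def __init__(self) -> None:
        # Assumption: the timer is started when the child is created.
        self.timer_running = True
        # True if TRANSIENT_FAILURE was reported more recently than READY/IDLE.
        self.tf_is_most_recent = False

    def on_child_state(self, state: str) -> None:
        if state in (READY, IDLE):
            self.tf_is_most_recent = False
            self.timer_running = False  # child is usable; cancel the timer
        elif state == TRANSIENT_FAILURE:
            self.tf_is_most_recent = True
            self.timer_running = False  # child has failed; no need to keep waiting
        elif state == CONNECTING:
            # Restart only if the child has not reported TF more recently
            # than it reported READY or IDLE.
            if not self.tf_is_most_recent:
                self.timer_running = True


tracker = FailoverTimerTracker()
history = []
for state in (IDLE, CONNECTING, TRANSIENT_FAILURE, CONNECTING):
    tracker.on_child_state(state)
    history.append((state, tracker.timer_running))
print(history)
```

In this sequence the timer is cancelled by the initial `IDLE` report, restarted by the first `CONNECTING`, and left stopped by the `CONNECTING` that follows a failure.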
This PR contains the following updates: | Package | Type | Update | Change | |---|---|---|---| | [io.grpc:grpc-testing](https://togithub.com/grpc/grpc-java) | dependencies | minor | `1.15.1` -> `1.57.1` | --- ### Release Notes <details> <summary>grpc/grpc-java (io.grpc:grpc-testing)</summary> ### [`v1.57.1`](https://togithub.com/grpc/grpc-java/releases/tag/v1.57.1) ### Bug fixes - Fix compatibility with Java 8. This fixes the `NoSuchMethodError` for `ByteBuffer` methods present in 1.57.0 ([#​10441](https://togithub.com/grpc/grpc-java/issues/10441)) - xds: Remove debug assert in WeightedRoundRobinLoadBalancer. The assert was to detect breakages in the static stride algorithm causing too much looping. However, with multithreading it is possible to trigger even in legitimate scenarios ([#​10437](https://togithub.com/grpc/grpc-java/issues/10437)) ### [`v1.57.0`](https://togithub.com/grpc/grpc-java/releases/tag/v1.57.0) #### gRPC Java 1.57.0 Release Notes This release accidentally broke Java 8: `NoSuchMethodError` for some ByteBuffer methods. The issue is tracked in [https://github.com/grpc/grpc-java/issues/10432](https://togithub.com/grpc/grpc-java/issues/10432) ##### API Changes - Use fully qualified java.lang.String in all cases in generated code. This fixes compilation if a protobuf message is named “String”. - Stabilize two io.grpc.Status methods (asRuntimeException & trailersFromThrowable) - Stabilize io.grpc.ManagedChannelBuilder.useTransportSecurity ([#​10244](https://togithub.com/grpc/grpc-java/issues/10244)) - Stabilize io.grpc.util.MutableHandlerRegistry ([#​10348](https://togithub.com/grpc/grpc-java/issues/10348)) ##### Behavior Changes - xds: Handle loops and duplicates in xds Aggregate clusters - core: Change delay for hedging retry after a non-fatal error to be 0 to match the gRFC ([A6](https://togithub.com/grpc/proposal/blob/master/A6-client-retries.md)). 
- api: CheckedForwardingClientCall now passes trailers from the caught exception - xds: require EDS service name in CDS resources with an xdstp name - xds: Use Rule order instead of RuleChain - Wrap other name resolver types in a RetryingNameResolver . Previously, if authority was not overridden, then some name resolvers (such as grpclb) had no retry. - xds: Environment variable "GRPC_XDS_EXPERIMENTAL_SECURITY_SUPPORT" is no longer respected, so xDS security cannot be disabled any more ([#​10243](https://togithub.com/grpc/grpc-java/issues/10243)) - context, api: Package io.grpc is now consolidated into a single artifact grpc-api by moving classes from grpc-context to grpc-api. grpc-context now has a dependency on grpc-api (but excludes other dependencies of grpc-api) so any application previously using only grpc-context will now also bring in grpc-api. This fixes [#​3522](https://togithub.com/grpc/grpc-java/issues/3522) which was the major issue preventing support of Java modules. We are not done fixing support, as some artifacts need to be split and Automatic-Module-Name needs to be added. The next release is likely to be more stable for modules. - ##### New Features - binder: Add `UserHandle` and `BinderChannelCredentials` to support cross-user communication ([#​10197](https://togithub.com/grpc/grpc-java/issues/10197)) - xds,orca: LRS named metrics support ##### Improvements - core: Resolve isAndroid only once on class loading. This can improve channel creation performance on Android. - xds: Pick a subchannel with new static stride scheduler in WeightedRoundRobinLoadBalancer ##### Bug Fixes - xds: Fix the server sending a GOAWAY when an LDS update with no changes other than ordering is received. - netty: Fix NPE when a header with errors is received with endStream=true. This was causing logs to be filled with errors when health checkers didn’t specify a content type. 
- okhttp: Fix the Socket data race when shutdown/closed during connecting that was causing a significant delay ##### Dependencies - Upgraded Netty to 4.1.93-Final - Update guava dependency to 32.0.1 to address CVE-2023-2976 ##### Acknowledgements - Benjamin Peterson - Masakuni Oishi - Philip K. Warren - Stephane Landelle ### [`v1.56.1`](https://togithub.com/grpc/grpc-java/releases/tag/v1.56.1) ##### Bug fixes - core: Fix regression in 1.54.0 where polling NameResolvers would not refresh after a resolution error ([#​10328](https://togithub.com/grpc/grpc-java/issues/10328)). The symptom is a transient failure like "UNAVAILABLE: Unable to resolve host" continuing potentially forever. This did not impact DnsNameResolver, but it did impacted GrpclbNameResolver which is the dns name resolver used when `grpc-grpclb` is in the classpath. So even users that think "I don't use grpclb" may have been impacted. `round_robin` is mainly impacted on startup, but if the error happened afterward it would commonly fix itself for short transient DNS failures. `pick_first` is impacted at all times; any failed DNS resolution could cause all future RPCs on the channel to fail. ### [`v1.56.0`](https://togithub.com/grpc/grpc-java/releases/tag/v1.56.0) ##### API Changes - api: Stabilize the `SynchronizationContext` class ([#​10130](https://togithub.com/grpc/grpc-java/issues/10130)). - api: Stabilize `io.grpc.CallCredentials` ([#​10208](https://togithub.com/grpc/grpc-java/issues/10208), [#​10211](https://togithub.com/grpc/grpc-java/issues/10211)). `thisUsesUnstableApi()` is `@Deprecated` and has a default implementation. `CallCredentials` implementations should delete their implementation or remove `@Overrides`, as the method will be deleted in the future. - api: Stabilize the `ProxyDetector` hierarchy and `ManagedChannelBuilder.proxyDetector` method. 
##### Behavior Changes - core: Sticky `TRANSIENT_FAILURE` in `PickFirstLoadBalancer` ([#​10106](https://togithub.com/grpc/grpc-java/issues/10106)). See [gRFC A62](https://togithub.com/grpc/proposal/blob/master/A62-pick-first.md#sticky-transient-failure). If it can't connect, pick-first will now immediately fail RPCs until after it successfully connects. RPCs will no longer be delayed while it performs those attempts, which previously could cause significant (error) latency. It now also performs reconnect attempts after failure and backoff without prompting; previously it required an RPC to trigger the reconnect. `ManagedChannel.idleTimeout` (defaults to 30 minutes) still applies and forces the channel idle after a period of no RPCs. - stub: Add a null check for `responseObserver` into the methods for initiating a call that takes a `responseObserver` argument. This ensures a fail fast with a clearer cause instead of an NPE when the observer is first used. - xds: Flip default for RLS being enabled to true for XDS ([#​10248](https://togithub.com/grpc/grpc-java/issues/10248)) ([#​10252](https://togithub.com/grpc/grpc-java/issues/10252)). If there are no RLS configurations in your XDS or you already enabled it with the environment variable this will have no effect. To disable it, set the flag `GRPC_EXPERIMENTAL_XDS_RLS_LB` to false. - xds: Rename `weighted_round_robin_experimental` LB Policy to `weighted_round_robin` ([#​10162](https://togithub.com/grpc/grpc-java/issues/10162)). ##### New Features - protobuf,protobuf-lite: Allow to configure protobuf recursion limit ([#​10094](https://togithub.com/grpc/grpc-java/issues/10094)). - core: Optional address shuffle in `PickFirstLoadBalancer` ([#​10110](https://togithub.com/grpc/grpc-java/issues/10110)). - xds: `pick_first` LB configuration ([#​10181](https://togithub.com/grpc/grpc-java/issues/10181)). 
##### Improvements - xds: Add `error-per-second` in weight formula for client-side WRR ([#​10177](https://togithub.com/grpc/grpc-java/issues/10177)). - xds: Use` application_utilization ` and fallback to `cpu_utilization` if unset in weight formula for client-side WRR. ([#​10256](https://togithub.com/grpc/grpc-java/issues/10256)). - bazel: The README now mentions Bazel and where to find the example. ([#​10217](https://togithub.com/grpc/grpc-java/issues/10217)). ##### Bug Fixes - binder: Handle unexpected exceptions on binder threads. ([#​10092](https://togithub.com/grpc/grpc-java/issues/10092). - android,binder,cronet: `.aar` file when publishing. ([#​10138](https://togithub.com/grpc/grpc-java/issues/10138)). - api: Fix boundary check in `Status.fromCodeValue()`. ([#​10155](https://togithub.com/grpc/grpc-java/issues/10155)). - core: Don't use system Locale for content-type matching. ([#​10097](https://togithub.com/grpc/grpc-java/issues/10097)). - okhttp: Fix signed-byte comparison in server when checking for ASCII in header ([#​10151](https://togithub.com/grpc/grpc-java/issues/10151)). Without fix, authority could contain utf-8. ##### Dependencies - Version pinning (e.g., `[1.56.0]` instead of `1.56.0`) has been removed from POMs, for both Netty and gRPC dependencies. The pinning was unreliable in Maven and ignored in Gradle, yet caused downloads during the build to fetch the version list. For a while we've had a BOM that helps reduce version skew. ([#​10175](https://togithub.com/grpc/grpc-java/issues/10175)). - bazel: Add java toolchain type to all rules using `java_common`. ([#​10225](https://togithub.com/grpc/grpc-java/issues/10225)). - Upgraded `netty-tcnative-boringssl-static` in `grpc-netty-shaded` to 2.0.61.Final ([#​10260](https://togithub.com/grpc/grpc-java/issues/10260)). Netty itself was not updated. - Upgraded AndroidX Annotation to 1.6.0 ([#​10178](https://togithub.com/grpc/grpc-java/issues/10178)). 
- Upgraded AndroidX Core to 1.10.0 ([#​10178](https://togithub.com/grpc/grpc-java/issues/10178)). - Upgraded AndroidX Lifecycle-Common to 2.6.1 ([#​10178](https://togithub.com/grpc/grpc-java/issues/10178)). - Upgraded OpenCensus to 0.31.1 ([#​10178](https://togithub.com/grpc/grpc-java/issues/10178)). - Upgraded Cronet API to 108.5359.79 ([#​10178](https://togithub.com/grpc/grpc-java/issues/10178)). - Upgraded `proto-google-common-protos` to 2.17.0 ([#​10178](https://togithub.com/grpc/grpc-java/issues/10178)). - Upgraded Gson to 2.10.1 ([#​10178](https://togithub.com/grpc/grpc-java/issues/10178)). - Upgraded PerfMark API to 0.26.0 ([#​10178](https://togithub.com/grpc/grpc-java/issues/10178)). - Upgraded RE2/J to 1.7 ([#​10178](https://togithub.com/grpc/grpc-java/issues/10178)). ##### Acknowledgements - [@​chenwei321](https://togithub.com/chenwei321) - [@​cushon](https://togithub.com/cushon) - [@​kloyan](https://togithub.com/kloyan) - [@​kotlaja](https://togithub.com/kotlaja) - [@​vorburger](https://togithub.com/vorburger) ### [`v1.55.3`](https://togithub.com/grpc/grpc-java/releases/tag/v1.55.3) ##### Bug fixes - core: Fix regression in 1.54.0 where polling NameResolvers would not refresh after a resolution error ([#​10328](https://togithub.com/grpc/grpc-java/issues/10328)). The symptom is a transient failure like "UNAVAILABLE: Unable to resolve host" continuing potentially forever. This did not impact DnsNameResolver, but it did impacted GrpclbNameResolver which is the dns name resolver used when `grpc-grpclb` is in the classpath. So even users that think "I don't use grpclb" may have been impacted. `round_robin` is mainly impacted on startup, but if the error happened afterward it would commonly fix itself for short transient DNS failures. `pick_first` is impacted at all times; any failed DNS resolution could cause all future RPCs on the channel to fail. ### [`v1.55.1`](https://togithub.com/grpc/grpc-java/releases/tag/v1.55.1) The 1.55.0 release failed. 
There were no artifacts published for it. ##### API Changes - services: Rename `MetricRecorder.setQps`/`clearQps` to `setQpsMetric`/`clearQpsMetric` ([#​10031](https://togithub.com/grpc/grpc-java/issues/10031)) ##### Behavior Changes - gcp-observability: Remove monitored resource detection for logging ([https://github.com/grpc/grpc-java/pull/10020](https://togithub.com/grpc/grpc-java/pull/10020)). The cloud libraries will fill in these details instead - protoc-gen-grpc-java: binaries for Linux ARM and PPC are now built using Ubuntu 18.04. They will no longer work on Ubuntu 16.04 and Debian 9 ##### New Features - api: Stabilize the frequently used compression APIs ([#​9942](https://togithub.com/grpc/grpc-java/issues/9942)): `CallOptions.withCompression`, `CallOptions.getCompressor`, `AbstractStub.withCompression`, `ServerCall.setCompression`, `ServerCall.setMessageCompression` - api: Stabilize `Detachable` and `HasByteBuffer` - gcp-observability: Stabilize `GcpObservability` ([https://github.com/grpc/grpc-java/pull/10024](https://togithub.com/grpc/grpc-java/pull/10024)). The GcpObservability API provides a simple way to export logging, tracing, and metrics to Google Cloud Operations. See [the Google Cloud blog post](https://cloud.google.com/blog/products/networking/introducing-grpc-observability-for-microservices). - census: Add new tracer annotation to indicate the time when name resolution completed for those RPCs that experienced name resolution delay, or the time when picking subchannel completed for those RPCs that experienced picking subchannel delay. ([#​10014](https://togithub.com/grpc/grpc-java/issues/10014), [#​10044](https://togithub.com/grpc/grpc-java/issues/10044)) - protoc-gen-grpc-java: binary for s390x is now published ([#​9455](https://togithub.com/grpc/grpc-java/issues/9455)). 
The glibc version used is available in Ubuntu 20.04, Debian 11, and CentOS 9 and later - authz: Added `FileWatcherAuthorizationServerInterceptor` ([#​9775](https://togithub.com/grpc/grpc-java/issues/9775)) - services: Added `OrcaMetricReportingServerInterceptor.create(MetricRecorder)` which adds common metrics per-RPC ([#​9902](https://togithub.com/grpc/grpc-java/issues/9902)) - android: Add `UdsChannelBuilder` for using LocalSocket an Android ([#​8418](https://togithub.com/grpc/grpc-java/issues/8418)) - alts: Observe the `GRPC_ALTS_MAX_CONCURRENT_HANDSHAKES` environment variable user to adjust the max number of concurrent ALTS handshakes ([#​10016](https://togithub.com/grpc/grpc-java/issues/10016)) - binder: Expose client identity via `PeerUid` and `PeerUids` ([#​9952](https://togithub.com/grpc/grpc-java/issues/9952)) - binder: Add `BindServiceFlags.setAllowActivityStarts()` for `BIND_ALLOW_ACTIVITY_STARTS` added in Android U ([#​10008](https://togithub.com/grpc/grpc-java/issues/10008)) ##### Bug Fixes - core: Fix NPE race during hedging ([https://github.com/grpc/grpc-java/pull/10007](https://togithub.com/grpc/grpc-java/pull/10007)), fixing a Netty buffer memory leak for cancelled RPCs - core: Allow transparent retries after a retry attempt and the configured max retries was 1 ([#​10066](https://togithub.com/grpc/grpc-java/issues/10066)) - okhttp: properly implement `OkHttpServerBuilder.maxConnectionAgeGrace()` ([#​9968](https://togithub.com/grpc/grpc-java/issues/9968)) - xds: Enable federation support. See [gRFC A47](https://togithub.com/grpc/proposal/blob/master/A47-xds-federation.md) - xds: Enable Weighted Round Robin LB policy support. See [gRFC A58](https://togithub.com/grpc/proposal/blob/master/A58-client-side-weighted-round-robin-lb-policy.md) - xds: Avoid ClassCastException if the control plane changes the top-level policy ([#​10091](https://togithub.com/grpc/grpc-java/issues/10091)). 
This is expected to be unlikely, but is possible - xds: Fix `java.util.NoSuchElementException: SecurityProtocolNegotiators$ClientSdsHandler#0` ([#​10118](https://togithub.com/grpc/grpc-java/issues/10118)). This error did not cause any problems, other than unnecessary logging - xds: Avoid using the default locale for case insensitive path matching ([#​10148](https://togithub.com/grpc/grpc-java/issues/10148)) - googleapis: Enable ignore_resource_deletion for `google-c2p:` resolver’s default xds bootstrap ([#​10121](https://togithub.com/grpc/grpc-java/issues/10121)) - rls: Refresh name resolution on rejected addresses ([#​10032](https://togithub.com/grpc/grpc-java/issues/10032)) ##### New Examples - Keepalive ([#​9956](https://togithub.com/grpc/grpc-java/issues/9956)) - Cancellation ([#​9962](https://togithub.com/grpc/grpc-java/issues/9962)) - Deadline ([#​9958](https://togithub.com/grpc/grpc-java/issues/9958)) - Using waitForReady ([#​9960](https://togithub.com/grpc/grpc-java/issues/9960)) - Client and Server sharing ([#​9969](https://togithub.com/grpc/grpc-java/issues/9969)) - Reflection ([#​9955](https://togithub.com/grpc/grpc-java/issues/9955)) - Doing debug ([#​9957](https://togithub.com/grpc/grpc-java/issues/9957)) - Health service ([#​9991](https://togithub.com/grpc/grpc-java/issues/9991)) - Error details ([#​9997](https://togithub.com/grpc/grpc-java/issues/9997)) - Custom load balancing ([#​9951](https://togithub.com/grpc/grpc-java/issues/9951)) - gRPC-level reverse proxy ([#​10059](https://togithub.com/grpc/grpc-java/issues/10059)) ##### Dependencies - protobuf-java and protobuf-java-util upgraded to 3.22.3 ([#​10045](https://togithub.com/grpc/grpc-java/issues/10045)) ##### Acknowledgements - [@​carl-mastrangelo](https://togithub.com/carl-mastrangelo) - [@​haubenr](https://togithub.com/haubenr) - [@​jpd236](https://togithub.com/jpd236) - [@​kenk42292](https://togithub.com/kenk42292) ### [`v1.54.2`](https://togithub.com/grpc/grpc-java/releases/tag/v1.54.2) 
##### Bug Fixes - core: Fix regression in 1.54.0 where polling NameResolvers would not refresh after a resolution error ([https://github.com/grpc/grpc-java/pull/10328](https://togithub.com/grpc/grpc-java/pull/10328)). The symptom is a transient failure like "UNAVAILABLE: Unable to resolve host" continuing potentially forever. This did not impact DnsNameResolver, but it did impacted GrpclbNameResolver which is the dns name resolver used when grpc-grpclb is in the classpath. So even users that think "I don't use grpclb" may have been impacted. round_robin is mainly impacted on startup, but if the error happened afterward it would commonly fix itself for short transient DNS failures. pick_first is impacted at all times; any failed DNS resolution could cause all future RPCs on the channel to fail. - xds: Avoid using the default locale for case insensitive path matching ([#​10149](https://togithub.com/grpc/grpc-java/issues/10149)) - xds: Avoid potential channel panic when control plane changes the field used to configure load balancing ([#​10103](https://togithub.com/grpc/grpc-java/issues/10103)) - core: Allow transparent retries after a retry attempt and the configured max retries was 1 ([#​10080](https://togithub.com/grpc/grpc-java/issues/10080)) ### [`v1.54.1`](https://togithub.com/grpc/grpc-java/releases/tag/v1.54.1) #### Bug Fixes - core: Fix NPE race during hedging ([https://github.com/grpc/grpc-java/pull/10046](https://togithub.com/grpc/grpc-java/pull/10046)), fixing a Netty buffer memory leak for cancelled RPCs #### Behavior Changes - gcp-observability: Remove monitored resource detection for logging ([https://github.com/grpc/grpc-java/pull/10026](https://togithub.com/grpc/grpc-java/pull/10026)). 
The cloud libraries will fill in these details instead #### API stabilizations - Stabilize GcpObservability ([https://github.com/grpc/grpc-java/pull/10027](https://togithub.com/grpc/grpc-java/pull/10027)) - The GcpObservability API provides users with a simple way to export logging, tracing, and metrics to Google Cloud Operations. For more information, please see [this blog post](https://cloud.google.com/blog/products/networking/introducing-grpc-observability-for-microservices). ### [`v1.54.0`](https://togithub.com/grpc/grpc-java/releases/tag/v1.54.0) ##### New Features - xds: Add weightedRoundRobin LB policy. The WRR policy allows picking the subchannel by weight based on the metrics feedback from the backend using ORCA API. See gRFC A58: Weighted Round Robin LB Policy. ([#​9873](https://togithub.com/grpc/grpc-java/issues/9873)) - census: Add per call latency metric which is latency across all attempts ([#​9906](https://togithub.com/grpc/grpc-java/issues/9906)) - Generated code now has an interface named `AsyncService` that the `<service-name>ImplBase` class implements. This allows you to provide your own base class when used with the static `<service-name>Grpc.bindService(AsyncService)` method([#​9688](https://togithub.com/grpc/grpc-java/issues/9688)). 
##### Examples - Add examples for gcp observability ([#​9967](https://togithub.com/grpc/grpc-java/issues/9967)) ##### Bugfixes - rls:Fix throttling in route lookup where success and error metrics had been inverted ([b/262779100](https://b.corp.google.com/262779100)) ([#​9874](https://togithub.com/grpc/grpc-java/issues/9874)) - protobuf: update external javadoc link ([#​9890](https://togithub.com/grpc/grpc-java/issues/9890)) - core: fix outlier detection default ejection time ([#​9889](https://togithub.com/grpc/grpc-java/issues/9889)) - xds: deletion only to watchers of same control plane ([#​9896](https://togithub.com/grpc/grpc-java/issues/9896)) - api: Target scheme is now properly case insensitive ([#​9899](https://togithub.com/grpc/grpc-java/issues/9899)). `NameResolverProvider`s, however, are expected to return the scheme used for registration in lower-case - api: ForwardingServerCall now forwards getMethodDescriptor(). Previously only SimpleForwardingServerCall forwarded the method ##### Behavior Changes - xds:Allow a cluster’s sum of weights to exceed the maximum signed integer up to a limit of max unsigned integer ([#​9864](https://togithub.com/grpc/grpc-java/issues/9864)) - grpclb: no SRV lookup for "metadata.google.internal." ##### Improvements - xds, orca: Allow removing OobLoadReportListener from a subchannel in OrcaOobUil. ([#​9881](https://togithub.com/grpc/grpc-java/issues/9881)) - services: ORCA API change to allow recording QPS in MetricRecorder and CallMetricRecorder. 
([#​9866](https://togithub.com/grpc/grpc-java/issues/9866)) - Move name resolution retry from managed channel to name resolver (take [#​2](https://togithub.com/grpc/grpc-java/issues/2)) ([#​9812](https://togithub.com/grpc/grpc-java/issues/9812)) - Rename AbstractXdsClient to ControlPlaneClient ([#​9934](https://togithub.com/grpc/grpc-java/issues/9934)) - all: fix build with errorprone 2.18 ([#​9886](https://togithub.com/grpc/grpc-java/issues/9886)) - build: allow Java 11+ to use modern error prone - errorprone: enable UnnecessaryAnonymousClass ([#​9927](https://togithub.com/grpc/grpc-java/issues/9927)) - core: add logger to OutlierDetectionLoadBalancer ([#​9880](https://togithub.com/grpc/grpc-java/issues/9880)) - census: add trace annotation to report received message sizes ([#​9944](https://togithub.com/grpc/grpc-java/issues/9944)) - gcp-observability: emit latency and payload size metrics by default when monitoring is enabled ([#​9893](https://togithub.com/grpc/grpc-java/issues/9893)) - gcp-observability: add trace information like TraceId and SpanId in logs for log correlation when both logging and traces are enabled ([#​9963](https://togithub.com/grpc/grpc-java/issues/9963)) - gcp-observability: close() will take longer, to ensure metrics and traces are flushed ([#​9972](https://togithub.com/grpc/grpc-java/issues/9972)) - gcp-observability: update status code type in logs to Google RPC code instead of an integer ([#​9959](https://togithub.com/grpc/grpc-java/issues/9959)) - gcp-observability: retain default opencensus-task identifier even when custom labels are specified in the configuration ([#​9982](https://togithub.com/grpc/grpc-java/issues/9982)) - Build Improvements ([#​9855](https://togithub.com/grpc/grpc-java/issues/9855)) - Fixes MethodDescriptor java documentation ([#​9860](https://togithub.com/grpc/grpc-java/issues/9860)) - api: forward getSecurityLevel on PartialForwardingServerCall ([#​9912](https://togithub.com/grpc/grpc-java/issues/9912)) - 
Updating ServerInterceptors.java to support different marshallers for Request and Response messages. ([#​9877](https://togithub.com/grpc/grpc-java/issues/9877)) ##### API stabilizations - Stabilize method ServerBuilder.intercept which had previously been marked experimental. ([#​9894](https://togithub.com/grpc/grpc-java/issues/9894)) - api:stabilize offloadExecutor usage in ManagedChannelBuilder and NameResolver. ([#​9931](https://togithub.com/grpc/grpc-java/issues/9931)) ##### Dependencies - netty:Upgrade Netty from 4.1.79 to 4.1.87, tcnative from 2.0.54 to 2.0.56 ([#​9784](https://togithub.com/grpc/grpc-java/issues/9784)) - gcp-observability: Transitive gRPC components now have the same gRPC version - gcp-observability : Google cloud logging updated to 3.14.5 ##### Acknowledgements [@​benjaminp](https://togithub.com/benjaminp) [@​s-matyukevich](https://togithub.com/s-matyukevich) [@​Faqa](https://togithub.com/Faqa) [@​antechrestos](https://togithub.com/antechrestos) [@​carl-mastrangelo](https://togithub.com/carl-mastrangelo) [@​ioanbsu](https://togithub.com/ioanbsu) ### [`v1.53.0`](https://togithub.com/grpc/grpc-java/releases/tag/v1.53.0) ##### New Features - googleapis: Allow user set c2p bootstrap config ([#​9856](https://togithub.com/grpc/grpc-java/issues/9856)) - xds: Add contain and stringMatcher in `RouteConfiguration` ([#​9845](https://togithub.com/grpc/grpc-java/issues/9845)) - core: Add `grpc-previous-rpc-attempts` to the initial response metadata ([#​9686](https://togithub.com/grpc/grpc-java/issues/9686)) - servlet: Implement gRPC server as a Servlet ([#​8596](https://togithub.com/grpc/grpc-java/issues/8596)) - authz: Implement static authorization server interceptor ([#​8934](https://togithub.com/grpc/grpc-java/issues/8934)) ##### Examples - servlet: Add servlet example ([#​8596](https://togithub.com/grpc/grpc-java/issues/8596)) ##### Bug Fixes - xds: Update xds error handling logic. 
Specifically: - When the ads stream is closed only send errors to subscribers that haven't yet gotten results - Timers to detect missing resources don’t start until the adsStream is ready ([#​9745](https://togithub.com/grpc/grpc-java/issues/9745)) - Call subscriber onError callback when xds client fails to connect to server ([#​9827](https://togithub.com/grpc/grpc-java/issues/9827)) - core: Delay retriable stream master listener close until all sub streams are closed. This fixes the call executor lifecycle and prevents potential `RejectedExecutionException`. ([#​9754](https://togithub.com/grpc/grpc-java/issues/9754)) - core: Free unused `MessageProducer` in `RetriableStream` ([#​9853](https://togithub.com/grpc/grpc-java/issues/9853)), fixing a Netty buffer memory leak for cancelled RPCs - api: Fail with `NullPointerException` when a Metadata.Marshaller returns null bytes ([#​9781](https://togithub.com/grpc/grpc-java/issues/9781)). This would previously cause a `NullPointerException` later during the RPC. Now the return value of the Marshaller is checked immediately, to help find the broken Marshaller ##### Behavior Changes - xds: Disallow duplicate addresses in the RingHashLB. ([#​9776](https://togithub.com/grpc/grpc-java/issues/9776)) - xds: EDS weight sums are allowed up to max unsigned int (was max signed int) ([#​9765](https://togithub.com/grpc/grpc-java/issues/9765)) - xds: Drop xds v2 support ([#​9760](https://togithub.com/grpc/grpc-java/issues/9760)) ##### Dependencies - JUnit upgraded to 4.13.2 - bazel: Dropped support for Bazel 4. We track the two most recent major versions of Bazel, Bazel 5 and 6. Bazel 4 may still work, but we are no longer testing it - bazel: Include Tomcat annotations dependency for `@Generated` as used by autovalue ([#​9762](https://togithub.com/grpc/grpc-java/issues/9762)). 
Necessary for building xds and rls on Java 9+ - bazel: Export deps from Maven Central-specific stand-in targets ([#​9780](https://togithub.com/grpc/grpc-java/issues/9780)). Some Maven Central artifacts are a combination of multiple Bazel targets, like grpc-core is composed of //core:inprocess, //core:internal, //core:util, //api. There is a “//core:core_maven” target used by maven_install that uses the other targets. Previously the target used `runtime_deps` to discourage their use by Bazel users, but that could cause compilation failures from lack of hjars. These targets now use `exports` ##### Acknowledgement [@​cpovirk](https://togithub.com/cpovirk) [@​niloc132](https://togithub.com/niloc132) [@​stephenh](https://togithub.com/stephenh) [@​olderwei](https://togithub.com/olderwei) [@​pandaapo](https://togithub.com/pandaapo) [@​panxuefeng](https://togithub.com/panxuefeng) ### [`v1.52.1`](https://togithub.com/grpc/grpc-java/releases/tag/v1.52.1) ##### Bug Fixes - xds: Fix an internal bug in xds resource subscription that might cause xds stream not accepting response update for that resource type entirely. ([#​9810](https://togithub.com/grpc/grpc-java/issues/9810)) ### [`v1.52.0`](https://togithub.com/grpc/grpc-java/releases/tag/v1.52.0) #### gRPC Java 1.52.0 Release Notes **grpc-xds starting with 1.51.0 had a regression where resources might stop receiving updates. The trigger could happen hours or days after the binary had started. xDS users should avoid this release and use 1.50.x until patch releases with the fix are available. [https://github.com/grpc/grpc-java/pull/9809](https://togithub.com/grpc/grpc-java/pull/9809)** ##### API Changes - Fix CallOptions to be properly `@Immutable` ([#​9689](https://togithub.com/grpc/grpc-java/issues/9689)) - binder: Promote out of experimental status ([#​9669](https://togithub.com/grpc/grpc-java/issues/9669)). 
Much of the API is now stable ##### New Features - xds: Support localities in multiple priorities ([#​9683](https://togithub.com/grpc/grpc-java/issues/9683)) - xds: Log xDS node ID with verbosity INFO when environment variable GRPC_LOG_XDS_NODE_ID=true ([#​9731](https://togithub.com/grpc/grpc-java/issues/9731)) ##### Examples - Add examples for name resolver and load balancer ([#​9700](https://togithub.com/grpc/grpc-java/issues/9700)) - Swap to ChannelCredentials/ServerCredentials API, as it is preferred ##### Bug Fixes - xds:Fix ConcurrentModificationException in PriorityLoadBalancer ([#​9728](https://togithub.com/grpc/grpc-java/issues/9728)) - ManagedChannelImpl.SubchannelImpl fix args check to avoid NPE ([#​9651](https://togithub.com/grpc/grpc-java/issues/9651)) - okhttp: Add missing server support for TLS ClientAuth ([#​9711](https://togithub.com/grpc/grpc-java/issues/9711)) - binder: Ensure the security interceptor is always closest to the actual transport ([#​9716](https://togithub.com/grpc/grpc-java/issues/9716)) - bazel: Include [@​Generated](https://togithub.com/Generated) dep for autovalue. This fixes builds of xds and rls using Java 9+ - xds: Nack xds response when weighted cluster total weight sums zero ([#​9738](https://togithub.com/grpc/grpc-java/issues/9738)) - core: Fix a bug about a retriable stream lifecycle. It stops using the call executor resource in a retriable stream when the client call is closed, thus preventing potential channel panics. ([#​9626](https://togithub.com/grpc/grpc-java/issues/9626)) ##### Behavior Changes - binder: Set default idle timeout to 60 seconds, and enable "strict lifecycle management". ([#​9486](https://togithub.com/grpc/grpc-java/issues/9486)) - xds: Limit ring hash max size to 4K instead of 8M ([#​9709](https://togithub.com/grpc/grpc-java/issues/9709)). 
`RingHashOptions.setRingSizeCap()` can increase the limit - binder: Set default idle timeout to 60 seconds, and add `BinderChannelBuilder.strictLifecycleManagement()` which disables idle timeout and prevents it from being changed ([#​9486](https://togithub.com/grpc/grpc-java/issues/9486)). Disabling idle timeout can be useful to find bugs in applications that fail to promptly shut down the channel and are particularly sensitive to keeping Binder instances alive. - bazel: Replace ctx.host_configuration.host_path_separator with ctx.configuration.host_path_separator ([#​9742](https://togithub.com/grpc/grpc-java/issues/9742)). This changes no behavior today, but improves future compatibility with newer versions of Bazel - xds: Refactor internal logics about LDS and CDS resource handling. It may cause minor log line changes about corresponding RDS and EDS subscriber event notification, but it should not change xds name resolution and LB behavior. ([#​9724](https://togithub.com/grpc/grpc-java/issues/9724)) ##### Dependencies ##### Acknowledgement [@​RapperCL](https://togithub.com/RapperCL) [@​Smityz](https://togithub.com/Smityz) [@​pandaapo](https://togithub.com/pandaapo) ### [`v1.51.3`](https://togithub.com/grpc/grpc-java/releases/tag/v1.51.3) ##### Bug Fixes - xds: Fix an internal bug in xds resource subscription that might cause xds stream not accepting response update for that resource type entirely. ([https://github.com/grpc/grpc-java/pull/9811](https://togithub.com/grpc/grpc-java/pull/9811)) ### [`v1.51.1`](https://togithub.com/grpc/grpc-java/releases/tag/v1.51.1) **grpc-xds starting with 1.51.0 had a regression where resources might stop receiving updates. The trigger could happen hours or days after the binary had started. xDS users should avoid this release and use 1.50.x until patch releases with the fix are available. 
[https://github.com/grpc/grpc-java/pull/9809](https://togithub.com/grpc/grpc-java/pull/9809)** ##### Bug Fixes - xds: Fix ConcurrentModificationException in PriorityLoadBalancer. ([#​9744](https://togithub.com/grpc/grpc-java/issues/9744)) ### [`v1.51.0`](https://togithub.com/grpc/grpc-java/releases/tag/v1.51.0) **grpc-xds starting with 1.51.0 had a regression where resources might stop receiving updates. The trigger could happen hours or days after the binary had started. xDS users should avoid this release and use 1.50.x until patch releases with the fix are available. [https://github.com/grpc/grpc-java/pull/9809](https://togithub.com/grpc/grpc-java/pull/9809)** ##### Bug Fixes - grpclb: Fix a debug logging message which incorrectly logged loadbalancer addresses under backend addresses. ([#​9602](https://togithub.com/grpc/grpc-java/issues/9602)) ##### New Features - okhttp: okhttp server now supports maxConnectionAge and maxConnectionAgeGrace configuration for improved connection management. ([#​9649](https://togithub.com/grpc/grpc-java/issues/9649)) ##### Behavior Changes - netty: switch default cumulation strategy from MERGE to ADAPTIVE. When accumulating incoming network data, Adaptive cumulator dynamically switches between MERGE and COMPOSE strategies to minimize the amount of copying while also limiting per-buffer overhead. ([#​9558](https://togithub.com/grpc/grpc-java/issues/9558)) ##### Acknowledgements [@​TrevorEdwards](https://togithub.com/TrevorEdwards) ### [`v1.50.3`](https://togithub.com/grpc/grpc-java/releases/tag/v1.50.3) #### Bug Fixes - core: Free unused MessageProducer in RetriableStream ([https://github.com/grpc/grpc-java/pull/9853](https://togithub.com/grpc/grpc-java/pull/9853)), fixing a Netty buffer memory leak for cancelled RPCs ### [`v1.50.2`](https://togithub.com/grpc/grpc-java/releases/tag/v1.50.2) ##### Bug fixes gcp-observability: Supports period(.) 
in the service name part of regular expression for a fully-qualified method to accept "package.service" ### [`v1.50.1`](https://togithub.com/grpc/grpc-java/releases/tag/v1.50.1) gcp-observability: support new configuration defined in grpc-gcp-observability public preview user guide ### [`v1.50.0`](https://togithub.com/grpc/grpc-java/releases/tag/v1.50.0) #### New Features - okhttp: Added connection management features to okhttp server, including maxConnectionIdle(), permitKeepAliveTime(), and permitKeepAliveWithoutCalls() ([#​9494](https://togithub.com/grpc/grpc-java/issues/9494), [#​9544](https://togithub.com/grpc/grpc-java/issues/9544)) - binder: Add `SecurityPolicies` for checking device owner/profile owner ([#​9428](https://togithub.com/grpc/grpc-java/issues/9428)) #### API Changes - api: Add LoadBalancer.acceptResolvedAddresses() ([#​9498](https://togithub.com/grpc/grpc-java/issues/9498)). The method is like `handleResolvedAddresses()` but returns a `boolean` of whether the addresses and configuration were accepted. Not accepting the update triggers the NameResolver to retry after a delay. We are not yet encouraging migration to this method, as there is still a second future API change - core: add CallOptions to CallCredentials.RequestInfo ([#​9538](https://togithub.com/grpc/grpc-java/issues/9538)) #### Bug Fixes - auth: Fix AppEngine failing while retrieving access token when instantiating a blocking stub using AppEngineCredentials ([#​9504](https://togithub.com/grpc/grpc-java/issues/9504)) - core: Ensure that context cancellationCause is set ([#​9501](https://togithub.com/grpc/grpc-java/issues/9501)) - core: Update outlier detection max ejection logic to allow exceeding the limit by one, to match Envoy. 
([#9489](https://togithub.com/grpc/grpc-java/issues/9489), [#9492](https://togithub.com/grpc/grpc-java/issues/9492))
- core: outlier detection to honor min host request volume ([#9490](https://togithub.com/grpc/grpc-java/issues/9490))
- okhttp: Add timeout for HTTP CONNECT proxy handshake ([#9586](https://togithub.com/grpc/grpc-java/issues/9586))
- xds: the ring hash policy in TRANSIENT_FAILURE should not attempt to connect while a subchannel is already CONNECTING ([#9535](https://togithub.com/grpc/grpc-java/issues/9535)). With workloads where most requests have the same hash, ring hash should then behave more like pick-first, slowly trying backends

#### Dependencies

- netty: upgrade netty from 4.1.77.Final to 4.1.79.Final and tcnative from 2.0.53 to 2.0.54 ([#9451](https://togithub.com/grpc/grpc-java/issues/9451))

#### Acknowledgements

[@cpovirk](https://togithub.com/cpovirk) [@prateek-0](https://togithub.com/prateek-0) [@sai-sunder-s](https://togithub.com/sai-sunder-s)

### [`v1.49.2`](https://togithub.com/grpc/grpc-java/releases/tag/v1.49.2)

#### Dependencies

- Bump protobuf to 3.21.7

### [`v1.49.1`](https://togithub.com/grpc/grpc-java/releases/tag/v1.49.1)

#### Bug Fixes

- xds: Fix a bug in the ring-hash load balancing policy that, while in the `TRANSIENT_FAILURE` state, could cause unnecessary internal connection requests on subchannels. ([#9537](https://togithub.com/grpc/grpc-java/issues/9537))
- auth: Fix AppEngine failing while retrieving access token when instantiating a blocking stub using AppEngineCredentials ([#9524](https://togithub.com/grpc/grpc-java/issues/9524))

#### Behavior Changes

- core: Update outlier detection max ejection logic and min host request volume logic.
([https://github.com/grpc/grpc-java/pull/9550](https://togithub.com/grpc/grpc-java/pull/9550), [#​9551](https://togithub.com/grpc/grpc-java/issues/9551), [#​9552](https://togithub.com/grpc/grpc-java/issues/9552)) ### [`v1.49.0`](https://togithub.com/grpc/grpc-java/releases/tag/v1.49.0) ##### New Features - okhttp: Add `OkHttpServerBuilder`. The server can be used directly, but is not yet available via `ServerBuilder.forPort()` and `Grpc.newServerBuilderForPort()`. It passes our tests, but has seen no real-world use. It is also lacking connection management features - okhttp: Add support for byte-based private keys via TlsChannelCredentials and TlsServerCredentials - core: New outlier detection load balancer - googleapis: google-c2p resolver is now stabilized ##### Bug Fixes - core: Fix retry causing memory leak for canceled RPCs. ([#​9360](https://togithub.com/grpc/grpc-java/issues/9360)) - core: Use SyncContext for InProcess transport callbacks to avoid deadlocks. This fixes the long-standing issue [#​3084](https://togithub.com/grpc/grpc-java/issues/3084) which prevented using directExecutor() in some tests using streaming RPCs - core: Disable retries with in-process transport by default ([#​9361](https://togithub.com/grpc/grpc-java/issues/9361)). In-process does not compute message sizes so can retain excessive amounts of memory - bazel: Use valid target name for services and xds when overriding Maven targets ([#​9422](https://togithub.com/grpc/grpc-java/issues/9422)). This fixes an error of the form `no such target '@​io_grpc_grpc_java//services:services'` for services and missing ORCA classes for xds. The wrong target names were introduced in 1.47.0 - xds: channel_id hash policy now uses a random per-channel id instead of an incrementing one. 
The incrementing id was the same for every process of a binary, which was not the intention ([#​9453](https://togithub.com/grpc/grpc-java/issues/9453)) - core: Fix a bug that the server stream should not deliver halfClose() when the call is immediately canceled. The bug causes a bad message INTERNAL, desc: Half-closed without a request at server call. ([#​9362](https://togithub.com/grpc/grpc-java/issues/9362)) - xds: Remove shaded orca proto dependency in ORCA api. The shading was broken and couldn't really be used. ([#​9366](https://togithub.com/grpc/grpc-java/issues/9366)) ##### Behavior Changes - gcp-observability: Interceptors are now injected in more situations, including for non-Netty transports and when using transport-specific APIs like NettyChannelBuilder. ([#​9309](https://togithub.com/grpc/grpc-java/issues/9309) [#​9312](https://togithub.com/grpc/grpc-java/issues/9312) [#​9424](https://togithub.com/grpc/grpc-java/issues/9424)) - gcp-observability: custom tags now extended to metrics and traces ([#​9402](https://togithub.com/grpc/grpc-java/issues/9402) [#​9407](https://togithub.com/grpc/grpc-java/issues/9407)) - gcp-observability: excludes RPCs into Google Cloud Ops backend for instrumentation ([#​9436](https://togithub.com/grpc/grpc-java/issues/9436)) - xds: xdsNameResolver now matches channel overrideAuthority in virtualHost matching ([#​9405](https://togithub.com/grpc/grpc-java/issues/9405)) ##### Acknowledgement [@​benjaminp](https://togithub.com/benjaminp) [@​j-min5u](https://togithub.com/j-min5u) ### [`v1.48.2`](https://togithub.com/grpc/grpc-java/releases/tag/v1.48.2) #### Bug Fixes - xds: Fix a bug in ring-hash load balancing policy that, during TRANSIENT_FAILURE state, it might cause unnecessary internal connection requests on subchannels. 
([https://github.com/grpc/grpc-java/pull/9537](https://togithub.com/grpc/grpc-java/pull/9537)) - auth: Fix AppEngine failing while retrieving access token when instantiating a blocking stub using AppEngineCredentials ([https://github.com/grpc/grpc-java/pull/9524](https://togithub.com/grpc/grpc-java/pull/9524)) - xds: channel_id hash policy now uses a random per-channel id instead of an incrementing one. The incrementing id was the same for every process of a binary, which was not the intention ([https://github.com/grpc/grpc-java/pull/9453](https://togithub.com/grpc/grpc-java/pull/9453)) - bazel: Use valid target name for services and xds when overriding Maven targets ([https://github.com/grpc/grpc-java/pull/9422](https://togithub.com/grpc/grpc-java/pull/9422)). This fixes an error of the form no such target '@​io_grpc_grpc_java//services:services' for services and missing ORCA classes for xds. The wrong target names were introduced in 1.47.0 #### Dependencies - Bump protobuf to 3.21.7 ### [`v1.48.1`](https://togithub.com/grpc/grpc-java/releases/tag/v1.48.1) #### New Features ORCA provides APIs to inject custom metrics at a gRPC server, and consume them at a gRPC client. It implements [A51: Custom Backend Metrics Support](https://togithub.com/grpc/proposal/blob/master/A51-custom-backend-metrics.md). We changed the ORCA APIs; they had broken shading and couldn't really be used, so we fixed them in the patch release. #### Bug Fixes - core: Fix a bug that the server stream should not deliver halfClose() when the call is immediately canceled. The bug causes a bad message `INTERNAL, desc: Half-closed without a request` at server call. ([#​9362](https://togithub.com/grpc/grpc-java/issues/9362)) - core: Fix retry causing memory leak for cancelled RPCs. 
([#​9415](https://togithub.com/grpc/grpc-java/issues/9415)) - core: Disable retry by default for in-process transport's channel.([#​9368](https://togithub.com/grpc/grpc-java/issues/9368)) ### [`v1.48.0`](https://togithub.com/grpc/grpc-java/releases/tag/v1.48.0) ##### Bug Fixes - Removed the Class-Path manifest entry from jars generated with the gradle shadow plugin ([#​9270](https://togithub.com/grpc/grpc-java/issues/9270)). This should prevent “\[WARNING] \[path] bad path element” compilation warnings - Fix Channelz HTTP/2 window reporting. Previously the sender and receiver windows were reversed - Service config parse failures should be UNAVAILABLE, not INVALID_ARGUMENT ([#​9346](https://togithub.com/grpc/grpc-java/issues/9346)). This bug could cause RPCs to fail with INVALID_ARGUMENT if the service config was invalid when the channel started. RPCs were not failed if the channel had previously received no config or a valid config. Channels using xds were not exposed to this issue ##### New Features - xds: implement ignore_resource_deletion server feature as defined in the gRFC [A53: Option for Ignoring xDS Resource Deletion](https://togithub.com/grpc/proposal/blob/master/A53-xds-ignore-resource-deletion.md). ([#​9339](https://togithub.com/grpc/grpc-java/issues/9339)) - bazel: Support maven_install's strict_visibility=True by including direct dependencies explicitly ##### Improvements - Changed the debug strings for many `Attributes.Key`s to reference the API of the key. This should make it easier to find the API the key is exposed when using `attributes.toString()` - api: Document `Attributes.Key` uses reference equality. This is to make it clear the behavior is on purpose, and mirrors other Key types in the API - api: Explain security constraints of `EquivalentAddressGroup.ATTR_AUTHORITY_OVERRIDE`, to avoid misuse by `NameResolver`s ([#​9281](https://togithub.com/grpc/grpc-java/issues/9281)) - testing: `GrpcCleanupRule` now extends `ExternalResource`. 
This makes it usable with JUnit 5 - core: Clear ConfigSelector when the channel enters panic mode ([#​9272](https://togithub.com/grpc/grpc-java/issues/9272)). This prevents hanging RPCs if panic mode is entered very early in the channel lifetime and makes panic mode more predictable when xds is in use. Panic mode is a Channel feature used when a bug causes an unrecoverable error - core: Avoid unnecessary flushes for unary responses. It optimizes the response flow ([#​9273](https://togithub.com/grpc/grpc-java/issues/9273)) - core: Use the offload executor in CallCredentials rather than the executor from CallOptions ([#​9313](https://togithub.com/grpc/grpc-java/issues/9313)) - compiler: support protoc compiling on loongarch\_64 and ppc64le platform ([#​9178](https://togithub.com/grpc/grpc-java/issues/9178) [#​9284](https://togithub.com/grpc/grpc-java/issues/9284)) - binder: Add security Policy for verifying signature using sha-256 hash ([#​9305](https://togithub.com/grpc/grpc-java/issues/9305)) - xds: clusterresolver reuses child policy names for the same locality to avoid subchannel connection churns ([#​9287](https://togithub.com/grpc/grpc-java/issues/9287)) - xds: Fail RPCs with error details when resources are deleted instead of “NameResolver returned no usable address errors” ([#​9337](https://togithub.com/grpc/grpc-java/issues/9337)) - xds: Support least_request LB in LoadBalancingPolicy ([#​9262](https://togithub.com/grpc/grpc-java/issues/9262)) - xds: weighted target to delay picker updates while updating children ([#​9306](https://togithub.com/grpc/grpc-java/issues/9306)) - xds: delete the permanent error logic in processing LDS updates in XdsServerWrapper ([#​9268](https://togithub.com/grpc/grpc-java/issues/9268)) - xds: when delegate server throws on start communicate the error to statusListener ([#​9277](https://togithub.com/grpc/grpc-java/issues/9277)) ##### Dependencies - Bump Guava to 31.1 - Bump protobuf to 3.21.1 
([#​9311](https://togithub.com/grpc/grpc-java/issues/9311)) - Bump Error Prone annotations to 2.14.0 - Bump Animal Sniffer annotations to 1.21 - Bump Netty to 4.1.77.Final and netty_tcnative to 2.0.53.Final - protobuf: Bump `com.google.api.grpc:proto-google-common-protos` to 2.9.0 - alts: Bump Conscrypt to 2.5.2 - xds: Bump RE2J to 1.6 - xds: Remove unused org.bouncycastle:bcpkix-jdk15on dependency - xds: Update xDS protos ([#​9223](https://togithub.com/grpc/grpc-java/issues/9223)) ##### Acknowledgements [@​mirlord](https://togithub.com/mirlord) [@​zhangwenlong8911](https://togithub.com/zhangwenlong8911) [@​adilansari](https://togithub.com/adilansari) [@​amirhadadi](https://togithub.com/amirhadadi) [@​jader-eero](https://togithub.com/jader-eero) [@​jvolkman](https://togithub.com/jvolkman) [@​sumitd2](https://togithub.com/sumitd2) ### [`v1.47.1`](https://togithub.com/grpc/grpc-java/releases/tag/v1.47.1) #### Bug Fixes - core: Fix retry causing memory leak for canceled RPCs. ([#​9416](https://togithub.com/grpc/grpc-java/issues/9416)) #### Behavior Changes - xds: Remove permanent error handling in LDS update in XdsServerWrapper. Also notify `OnNotServing` on `StatusListener` when the delegated server initial start fails. ([#​9276](https://togithub.com/grpc/grpc-java/issues/9276), [#​9279](https://togithub.com/grpc/grpc-java/issues/9279)) #### Dependencies - Bump protobuf to 3.19.6 ### [`v1.47.0`](https://togithub.com/grpc/grpc-java/releases/tag/v1.47.0) ##### Bug Fixes - api: Ignore `ClassCastExceptions` for hard-coded providers on Android ([#​9174](https://togithub.com/grpc/grpc-java/issues/9174)). 
This avoids `ServiceConfigurationError` in certain cases when an “SDK” includes a copy of gRPC that was renamed with Proguard-like tools that do precise class name rewriting (versus something like Maven Shade Plugin which uses coarse pattern matching)
- binder: respect requested message limits when providing received messages to the listener ([#9163](https://togithub.com/grpc/grpc-java/issues/9163))
- binder: Avoid an ISE from `asAndroidAppUri()` ([#9169](https://togithub.com/grpc/grpc-java/issues/9169))
- okhttp: Use the user-provided `ScheduledExecutorService` for keepalive if provided. Previously the user-provided executor was used for deadlines, but not keepalive. Keepalive always used the default executor ([#9073](https://togithub.com/grpc/grpc-java/issues/9073))
- bom: Reverted “bom: Removed protoc-gen-grpc-java from the BOM” in v1.46.0. There was a way to use it with Gradle ([#9154](https://togithub.com/grpc/grpc-java/issues/9154))
- build: fix grpc-java build against protobuf 3.21 ([#9218](https://togithub.com/grpc/grpc-java/issues/9218))
- grpclb: Adds missing META-INF resources to `libgrpclb.jar` produced by bazel `//grpclb:grpclb` target ([#9156](https://togithub.com/grpc/grpc-java/issues/9156))
- xds: Protect xdstp processing with federation env var. If the xds server uses xdstp:// resource names it was possible for federation code paths to be entered even without enabling the experimental federation support. This is now fixed and it is safe for xds servers to use xdstp:// resource names. ([#9190](https://togithub.com/grpc/grpc-java/issues/9190))
- xds: fix bugs in the ring-hash load balancer's subchannel picking behavior so that it follows the gRFC. The bugs could prevent connections from failing over out of the `TRANSIENT_FAILURE` state.
([#​9085](https://togithub.com/grpc/grpc-java/issues/9085)) - xds: NACK EDS resources with duplicate localities in the same priority ([#​9119](https://togithub.com/grpc/grpc-java/issues/9119)) ##### New Features - api: Add connection management APIs to `ServerBuilder` ([#​9176](https://togithub.com/grpc/grpc-java/issues/9176)). This includes methods for keepalive, max connection age, and max connection idle. These APIs have been available on NettyServerBuilder since v1.4.0 - api: allow `NameResolver` to influence which transport to use ([#​9076](https://togithub.com/grpc/grpc-java/issues/9076)) - api: New API in ServerCall to expose SecurityLevel on server-side ([#​8943](https://togithub.com/grpc/grpc-java/issues/8943)) - netty: Add `NameResolver` for `unix:` scheme, as defined in [gRPC Name Resolution](https://togithub.com/grpc/grpc/blob/master/doc/naming.md) ([#​9113](https://togithub.com/grpc/grpc-java/issues/9113)) - binder: add `allOf` security policy, which allows access iff ALL given security policies allow access. ([#​9125](https://togithub.com/grpc/grpc-java/issues/9125)) - binder: add `anyOf` security policy, which allows access if ANY given security policy allows access. ([#​9147](https://togithub.com/grpc/grpc-java/issues/9147)) - binder: add `hasPermissions` security policy, which checks that a caller has all of the given package permissions. ([#​9117](https://togithub.com/grpc/grpc-java/issues/9117)) - build: Add Bazel build support for xds, googleapis, rls, and services. grpc-services previously had partial bazel support, but some parts were missing. These artifacts are now configured via `IO_GRPC_GRPC_JAVA_OVERRIDE_TARGETS` so maven_install will not use the artifacts from Maven Central ([#​9172](https://togithub.com/grpc/grpc-java/issues/9172)) - xds: New ability to configure custom load balancer implementations via the xDS `Cluster.load_balancing_policy` field. 
This implements [gRFC A52: gRPC xDS Custom Load Balancer Configuration](https://togithub.com/grpc/proposal/blob/master/A52-xds-custom-lb-policies.md). ([#​9141](https://togithub.com/grpc/grpc-java/issues/9141)) - xds, orca: add support for custom backend metrics reporting: allow setting metrics at gRPC server and consuming metrics reports from a custom load balancing policy at the client. This implements [gRFC A51: Custom Backend Metrics Support](https://togithub.com/grpc/proposal/blob/master/A51-custom-backend-metrics.md). - xds: include node ID in RPC failure status messages from the XdsClient ([#​9099](https://togithub.com/grpc/grpc-java/issues/9099)) - xds: support for the `is_optional` logic in Cluster Specifier Plugins: if an unsupported Cluster Specifier Plugin is optional, don't NACK, and skip any routes that point to it. ([#​9168](https://togithub.com/grpc/grpc-java/issues/9168)) ##### Behavior Changes - xds: Allow unspecified listener traffic direction, to match other languages and to work with Istio ([#​9173](https://togithub.com/grpc/grpc-java/issues/9173)) - xds: change priority load balancer failover time behavior and `ring_hash` LB aggregation rule to better handle transient_failure channel status ([#​9084](https://togithub.com/grpc/grpc-java/issues/9084), [#​9093](https://togithub.com/grpc/grpc-java/issues/9093)) ##### Dependencies - Bump GSON to 2.9.0. Earlier versions of GSON are affected by [CVE-2022-25647](https://nvd.nist.gov/vuln/detail/CVE-2022-25647). gRPC was not impacted by the vulnerability. 
([#​9215](https://togithub.com/grpc/grpc-java/issues/9215)) - gcp-observability: add grpc-census as a dependency and update opencensus version ([#​9140](https://togithub.com/grpc/grpc-java/issues/9140)) ##### Acknowledgements [@​caseyduquettesc](https://togithub.com/caseyduquettesc) [@​cfredri4](https://togithub.com/cfredri4) [@​jvolkman](https://togithub.com/jvolkman) [@​mirlord](https://togithub.com/mirlord) [@​ovidiutirla](https://togithub.com/ovidiutirla) ### [`v1.46.1`](https://togithub.com/grpc/grpc-java/releases/tag/v1.46.1) #### Behavior Changes - xds: Remove permanent error handling in LDS update in XdsServerWrapper. Also notify `OnNotServing` on `StatusListener` when the delegated server initial start fails. ([#​9278](https://togithub.com/grpc/grpc-java/issues/9278), [#​9280](https://togithub.com/grpc/grpc-java/issues/9280)) - xds: Protect xdstp processing with federation env var. If the xds server uses xdstp:// resource names it was possible for federation code paths to be entered even without enabling the experimental federation support. This is now fixed and it is safe for xds servers to use xdstp:// resource names. ([https://github.com/grpc/grpc-java/pull/9190](https://togithub.com/grpc/grpc-java/pull/9190)) #### Dependencies - Bump protobuf to 3.19.6 ### [`v1.46.0`](https://togithub.com/grpc/grpc-java/releases/tag/v1.46.0) ##### Bug Fixes - netty: Fixed incompatibility with Netty 4.1.75.Final that caused COMPRESSION_ERROR ([#​9004](https://togithub.com/grpc/grpc-java/issues/9004)) - xds: Fix LBs blindly propagating control plane errors ([#​9012](https://togithub.com/grpc/grpc-java/issues/9012)). This change forces the use of UNAVAILABLE for any xDS communication failures, which otherwise could greatly confuse an application. This is essentially a continuation of the fix in 1.45.0 for XdsNameResolver, but for other similar cases - xds: Fix ring_hash reconnecting behavior. 
Previously a TRANSIENT_FAILURE subchannel would remain failed forever
- xds: Fix ring_hash defeating priority’s failover connection timeout. [grpc/proposal#296](https://togithub.com/grpc/proposal/issues/296)
- binder: Work around an Android Intent bug for consistent AndroidComponentAddress hashCode() and equals() ([#9061](https://togithub.com/grpc/gr

</details>

---

### Configuration

📅 **Schedule**: Branch creation - "before 4am on the first day of the month" (UTC), Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check this box

---

This PR has been generated by [Renovate Bot](https://togithub.com/renovatebot/renovate).