Improve "active" metrics handling in WebClient observations #31702
Labels
in: web
Issues in web modules (web, webmvc, webflux, websocket)
theme: observability
An issue related to observability and tracing
type: enhancement
A general enhancement
Current behavior:
The WebClient long task timer "active" metrics do not show the destination host in the "client_name" label; the label is always "none".
In a service with more than one WebClient instance, each calling a different remote host, all long task timer active metrics are aggregated, so the number of pending requests per destination host cannot be determined. This decreases the value of the metrics, because they cannot be broken down by destination host.
When MVC is used instead of WebFlux, the "client_name" label contains the correct destination host.
Expected behavior:
WebClient should report the "client_name" label correctly, matching the remote destination address shown in the normal metrics.
Potential root cause:
DefaultWebClient exchange() stores the request builder as the context carrier before the observation starts, but the "request" is stored in the context only after the observation has started, just before the filter is applied.
DefaultClientRequestObservationConvention/clientName() reads "client_name" from the context when the long task timer is started and gets "null", because the request has not yet been set in the context. As a result, "client_name" is not available when the long task timer active metric starts; it is only available when the normal metrics are recorded at stop.
MVC works because "client_name" is read from the carrier.
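The ordering described above can be modeled with a minimal, self-contained sketch. All class names here (`ClientNameOrdering`, `DemoObservationContext`, `DemoRequestBuilder`) and the example URL are hypothetical stand-ins that only mimic the sequence of events; they are not the Spring or Micrometer classes.

```java
import java.net.URI;

// Hypothetical model of the ordering only; not Spring's actual implementation.
final class ClientNameOrdering {

    // Mimics a convention that derives the client name from the request in the context.
    static String clientName(DemoObservationContext context) {
        URI request = context.getRequest();
        return (request != null && request.getHost() != null) ? request.getHost() : "none";
    }

    public static void main(String[] args) {
        // The carrier (request builder) is available before the observation starts.
        DemoObservationContext context = new DemoObservationContext(
                new DemoRequestBuilder(URI.create("https://api.example.com/orders")));

        // Observation start: the "active" long task timer tags are computed here.
        String atStart = clientName(context);

        // Only after start is the built request stored in the context.
        context.setRequest(context.getCarrier().build());

        // Observation stop: the regular timer tags are computed here.
        String atStop = clientName(context);

        System.out.println("active (start): " + atStart);
        System.out.println("timer (stop):   " + atStop);
    }
}

// Stand-in for a request builder; "building" simply yields the target URL.
final class DemoRequestBuilder {
    private final URI url;
    DemoRequestBuilder(URI url) { this.url = url; }
    URI build() { return url; }
}

// Stand-in for an observation context holding a carrier and, later, a request.
final class DemoObservationContext {
    private final DemoRequestBuilder carrier;
    private URI request; // null until set after the observation has started

    DemoObservationContext(DemoRequestBuilder carrier) { this.carrier = carrier; }
    DemoRequestBuilder getCarrier() { return carrier; }
    URI getRequest() { return request; }
    void setRequest(URI request) { this.request = request; }
}
```

In this model the value computed at start is "none" and the value computed at stop is the host, which mirrors the mismatch between the active metrics and the normal metrics reported in this issue.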
There were some other changes in the DefaultWebClient blame, but I am not sure whether those changes broke "client_name", because the context is still set after start.
A possible customization workaround is to read the client name from context.getCarrier().build() instead of from the context. However, it is not clear whether this is safe if the WebClient implementation changes again.
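As a rough illustration of that workaround, the sketch below falls back to the carrier when the request has not been set yet. It is plain Java with hypothetical stand-in types (`CarrierFallbackClientName`, `FakeObservationContext`, `FakeRequestBuilder`), not the real Spring classes, so it only shows the shape of the logic.

```java
import java.net.URI;

// Hypothetical sketch of the fallback logic; the types below are stand-ins, not Spring classes.
final class CarrierFallbackClientName {

    // Prefer the request from the context; if it is not set yet, build it from the carrier.
    static String clientName(FakeObservationContext context) {
        URI request = context.getRequest();
        if (request == null && context.getCarrier() != null) {
            // At observation start the request is still null, so use the carrier instead.
            request = context.getCarrier().build();
        }
        String host = (request != null) ? request.getHost() : null;
        return (host != null) ? host : "none";
    }

    public static void main(String[] args) {
        FakeObservationContext context = new FakeObservationContext(
                new FakeRequestBuilder(URI.create("https://api.example.com/orders")));

        // The request has not been set, yet the host is resolved through the carrier.
        System.out.println("active (start): " + clientName(context));
    }
}

// Stand-in for a request builder; "building" simply yields the target URL.
final class FakeRequestBuilder {
    private final URI url;
    FakeRequestBuilder(URI url) { this.url = url; }
    URI build() { return url; }
}

// Stand-in for an observation context holding a carrier and, later, a request.
final class FakeObservationContext {
    private final FakeRequestBuilder carrier;
    private URI request; // null until set after the observation has started

    FakeObservationContext(FakeRequestBuilder carrier) { this.carrier = carrier; }
    FakeRequestBuilder getCarrier() { return carrier; }
    URI getRequest() { return request; }
    void setRequest(URI request) { this.request = request; }
}
```

In a real application, the equivalent logic would presumably live in a custom convention that overrides clientName(); the concern above still applies, since it depends on the carrier being populated before the observation starts.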
Environment:
Java: 17/21
Spring Boot: 3.1.5; same issue in 3.1.6 and 3.2.0