-
Notifications
You must be signed in to change notification settings - Fork 927
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Micrometer Observation instrumentation #4980
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry about the delay! Left some questions and comments but honestly I think this PR is already close to completion 🚀
Also pushed some small commits which
- Fixes failing tests
- Hides some public classes
Feel free to revert any changes I made, or just let me know and I'll revert it back 😄
if (!ctx.config().transientServiceOptions().contains(TransientServiceOption.WITH_TRACING) && | ||
!ctx.config().transientServiceOptions().contains( | ||
TransientServiceOption.WITH_METRIC_COLLECTION)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Transient services signify services that may not be of interest for most users. For instance, HealthCheckService
is a transient service where users may not be interested in recording metrics/logs.
By default, user defined services have all TransientServiceOption
s enabled so most times I guess this won't be a problem.
return TransientServiceOption.allOf(); |
I guess we don't want to collect observation traces/metrics for transient services, so what do you think of just disabling if either option doesn't exist?
Even if WITH_TRACING
may be disabled, it is currently possible that traces are still recordeddue to WITH_METRIC_COLLECTION
being enabled.
if (!ctx.config().transientServiceOptions().contains(TransientServiceOption.WITH_TRACING) && | |
!ctx.config().transientServiceOptions().contains( | |
TransientServiceOption.WITH_METRIC_COLLECTION)) { | |
if (!ctx.config().transientServiceOptions().contains(TransientServiceOption.WITH_TRACING) || | |
!ctx.config().transientServiceOptions().contains( | |
TransientServiceOption.WITH_METRIC_COLLECTION)) { |
I realize the behavior depends on how users customize their ObservationRegistry
, but I guess we don't have a good way to determine what kind of handlers users added at the moment 😅
enrichObservation(ctx, httpServerContext, observation); | ||
|
||
return observation.scopedChecked( | ||
() -> unwrap().serve(ctx, req)); // TODO: Maybe we should observation stopping here |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// TODO: Maybe we should observation stopping here
As it stands, the most reliable way to know whether a response is finished is to subscribe to ctx.log
in armeria.
I think the current approach is fine, but let me know if I misunderstood your comment 😄
// TODO: ClientConnectionTimings - no hook to be there at the | ||
// moment of those things actually hapenning |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ClientConnectionTimings
represents timing information for client-side, whereas this class is for server side.
Would it be OK to remove this comment?
observation/src/main/java/com/linecorp/armeria/client/observation/HttpClientContext.java
Outdated
Show resolved
Hide resolved
return DefaultHttpClientObservationConvention.class; | ||
} | ||
|
||
// TODO: Figure what should be low and what high cardinality |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess HTTP_METHOD
, STATUS_CODE
can be low cardinality and we can probably add more keys later if needed.
For reference, here are some tags that are added by default in Armeria's MetricCollecting[Service|Client]
Lines 53 to 66 in ea3b063
public MeterIdPrefix activeRequestPrefix(MeterRegistry registry, RequestOnlyLog log) { | |
/* hostname.pattern, method, service */ | |
final Builder<Tag> tagListBuilder = ImmutableList.builderWithExpectedSize(3); | |
addActiveRequestPrefixTags(tagListBuilder, log); | |
return new MeterIdPrefix(name, tagListBuilder.build()); | |
} | |
@Override | |
public MeterIdPrefix completeRequestPrefix(MeterRegistry registry, RequestLog log) { | |
/* hostname.pattern, http.status, method, service */ | |
final Builder<Tag> tagListBuilder = ImmutableList.builderWithExpectedSize(4); | |
addCompleteRequestPrefixTags(tagListBuilder, log); | |
return new MeterIdPrefix(name, tagListBuilder.build()); | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess SESSION_PROTOCOL
and SERIALIZATION_FORMAT
can also be considered low cardinality
import io.micrometer.observation.Observation.Context; | ||
import io.micrometer.observation.ObservationConvention; | ||
|
||
interface HttpClientObservationConvention extends ObservationConvention<HttpClientContext> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question) Is it possible to remove this interface and just add supportContext
directly into DefaultHttpClientObservationConvention
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can do whatever you want :) This is how we're suggesting that the users should proceed https://micrometer.io/docs/observation#_observation_observationconvention_example
|
||
import io.micrometer.observation.transport.RequestReplySenderContext; | ||
|
||
final class HttpClientContext extends RequestReplySenderContext<RequestHeadersBuilder, RequestLog> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't see much value in exposing HttpClientContext
so I pushed a commit which hides this class to package-private along with some other classes (Sorry, I probably should've asked for your opinion before doing so! 😅 Let me know if you feel like it's better to expose this though )
If users want to add their own handlers/conventions, I guess they can store additional information via. Observation.Context.put(key, val)
to access from handlers/conventions.
I think it would also be awesome if there was an interface for Observation.Context
so that we could do something like:
// this is public
public interface ArmeriaObservationContext implements Observation.Context {
RequestContext requestContext();
}
// this is package-private
class HttpClientContext extends RequestReplySenderContext<RequestHeadersBuilder, RequestLog>
implements ArmeriaObservationContext {
}
// and then users can do something like this...
registry.observationConfig().observationHandler(new ObservationHandler<ArmeriaObservationContext>()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't see much value in exposing HttpClientContext so I pushed a commit which hides this class to package-private along with some other classes (Sorry, I probably should've asked for your opinion before doing so! Let me know if you feel like it's better to expose this though )
If you make that package-private users will not be able to see it and won't be easily be able to provide their own customizations (here you have an example https://micrometer.io/docs/observation#_observation_observationconvention_example)
observation/src/main/java/com/linecorp/armeria/client/observation/HttpClientContext.java
Outdated
Show resolved
Hide resolved
@Nullable HttpClientObservationConvention | ||
httpClientObservationConvention) { | ||
super(delegate); | ||
this.observationRegistry = observationRegistry; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you do a null
check for public APIs so that users can know right away when their input is incorrect?
this.observationRegistry = observationRegistry; | |
this.observationRegistry = requireNonNull(observationRegistry, "observationRegistry"); |
// Make the span the current span and run scope decorators when the ctx is pushed. | ||
ctxExtension.hook(observation::openScope); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice 👍
// TODO: ClientConnectionTimings - no hook to be there | ||
// at the moment of those things actually hapening |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this information isn't really critical to merge this PR, but still is useful for users going through traces.
Would it be difficult to also provide the option to pass a timestamp
such that observation.event(Event, long)
like brave and otel do? (maybe DefaultMeterObservationHandler
can just ignore the timestamp)
Adding such hooks require some degree of synchronization which imposes a performance penalty, and I'm not sure if it's feasible to add hooks for all events we want to add in the future 😅
Actually, I guess this doesn't make much sense if users want to just collect metrics... let me think about this a little more
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added comments to your comments and will upload some code very soon
|
||
import io.micrometer.observation.transport.RequestReplySenderContext; | ||
|
||
final class HttpClientContext extends RequestReplySenderContext<RequestHeadersBuilder, RequestLog> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't see much value in exposing HttpClientContext so I pushed a commit which hides this class to package-private along with some other classes (Sorry, I probably should've asked for your opinion before doing so! Let me know if you feel like it's better to expose this though )
If you make that package-private users will not be able to see it and won't be easily be able to provide their own customizations (here you have an example https://micrometer.io/docs/observation#_observation_observationconvention_example)
observation/src/main/java/com/linecorp/armeria/client/observation/HttpClientContext.java
Outdated
Show resolved
Hide resolved
import io.micrometer.observation.Observation.Context; | ||
import io.micrometer.observation.ObservationConvention; | ||
|
||
interface HttpClientObservationConvention extends ObservationConvention<HttpClientContext> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can do whatever you want :) This is how we're suggesting that the users should proceed https://micrometer.io/docs/observation#_observation_observationconvention_example
145bc0c
to
a21e66a
Compare
I've also added support for Micrometer Docs Generation https://micrometer.io/docs/observation#_automated_documentation_generation. Under |
@@ -82,7 +69,7 @@ public HttpResponse serve(ServiceRequestContext ctx, HttpRequest req) throws Exc | |||
|
|||
final HttpServerContext httpServerContext = new HttpServerContext(ctx, req); | |||
final Observation observation = ServiceObservationDocumentation.OBSERVATION.observation( | |||
this.serviceObservationConvention, DefaultServiceObservationConvention.INSTANCE, | |||
null, DefaultServiceObservationConvention.INSTANCE, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So I understand that you're not allowing users to modify the default tagging by injecting the custom convention?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On second thought, I think it's a good idea to allow users to inject custom conventions.
Ended up making HttpClientContext
, HttpServerContext
public
😄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll take another look in the morning but I think this pull request is pretty much done!
I've pushed some changes which
- Makes
HttpClientContext
,HttpServerContext
public along with some test code- Also renamed some variables
- Removed the default
Http[Client|Server]ObservationConvention
interfaces - Added some more documentation
Let me know if any changes don't make sense 😄
Also, what do you think of changing the status of this PR to ready for review now?
this.clientRequestContext = clientRequestContext; | ||
this.httpRequest = httpRequest; | ||
setCarrier(carrier); | ||
updateRemoteEndpoint(this, clientRequestContext); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On a second look, I think it's not possible to know what the remote address is at this stage unless the client is explicitly connecting to an ip address from the start. (this is before a connection is acquired/determined)
We can know this information for sure later when ctx.log().whenComplete()
is invoked.
Is there any way we can update remote endpoint information later and have it exposed in the spans?
If not, I think it might be better to just remove this since we can't reliably supply an ip here 😅
@@ -47,7 +47,7 @@ Now, you can specify <type://BraveService> using [Decorating a service](/docs/se | |||
```java | |||
import com.linecorp.armeria.common.HttpResponse; | |||
import com.linecorp.armeria.server.Server; | |||
import com.linecorp.armeria.server.brave.BraveService; | |||
import com.linecorp.armeria.server.observation.BraveService; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you revert this since the class doesn't exist? 😅
I'll be adding a separate page on tracing later
OK, I've removed setting of the remote address - and applied all suggested changes. Also, the PR is now officially ready for review :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @marcingrzejszczak 🙇 👍 🚀
...ation/src/main/java/com/linecorp/armeria/client/observation/MicrometerObservationClient.java
Outdated
Show resolved
Hide resolved
Thank you, my pleasure @jrhee17 ! :) |
// TODO: ClientConnectionTimings - there is no way to record events | ||
// with a specific timestamp for an observation |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you elaborate on this comment? What prevents us from recording connection timings?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Micrometer Observation doesn't have the API to record an event with a timestamp. Current Armeria API gives you information about a fact happening at a given timestamp. With Micrometer Observation you can react to a situation happening at a given point in time. So Armeria API would have to give a handle to call observation.event(...)
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK. Any chance Micrometer Observation to have such an API in the future? We worked with Brave team to add such a feature in the past.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could add that but the problem I see is that we can't set the timestamp for a counter. So whenever we have an event with Micrometer Observation what we do for metrics is we create a counter and for tracing we annotate the span. What we could do is ignore the timestamp information for the metrics and simply increment the counter when the event method was called. For tracing we would set an event with a given timestamp. Does it make sense? cc @shakuzen @jonatan-ivanov
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For tracing we would set an event with a given timestamp.
Yeah, this is probably what we want.
.whenAvailable(RequestLogProperty.RESPONSE_FIRST_BYTES_TRANSFERRED_TIME) | ||
.thenAccept(requestLog -> { | ||
if (requestLog.responseFirstBytesTransferredTimeNanos() != null) { | ||
observation.event(Events.WIRE_RECEIVE); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a way to specify the timestamp? I'm pretty sure there will be latency between the actual recording and observation.event()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No there is none. That's why this (#4980 (comment)) is not possible to be done
.../src/test/java/com/linecorp/armeria/server/observation/MicrometerObservationServiceTest.java
Outdated
Show resolved
Hide resolved
settings.gradle
Outdated
@@ -63,6 +63,7 @@ includeWithFlags ':kafka', 'java', 'publish', 'rel | |||
includeWithFlags ':kotlin', 'java', 'publish', 'relocate', 'kotlin' | |||
includeWithFlags ':logback', 'java', 'publish', 'relocate' | |||
includeWithFlags ':oauth2', 'java', 'publish', 'relocate' | |||
includeWithFlags ':observation', 'java', 'publish', 'relocate' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we keep this as a separate module or make it part of the core? We already have Micrometer in our core, so I think we can move it into core if it doesn't pull in other dependencies such as Brave.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense. We always release Micrometer Observation with Micrometer Core (that's the same project) and Core depends on Observation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Talked with the team and moving the implementation to the :core
module
Is there anything else I should to merge this? |
Sorry about the delay 😅 Let me update the PR to address @trustin's comments tonight, and then ping the other maintainers |
Oh absolutely no problem, I was just wondering if someone is waiting on me and I have missed the message. Take your time! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall looks good! Left comments for code style, micro optimizations, and suggestions.
core/src/main/java/com/linecorp/armeria/client/observation/HttpClientContext.java
Outdated
Show resolved
Hide resolved
core/src/main/java/com/linecorp/armeria/client/observation/MicrometerObservationClient.java
Outdated
Show resolved
Hide resolved
core/src/main/java/com/linecorp/armeria/client/observation/MicrometerObservationClient.java
Outdated
Show resolved
Hide resolved
core/src/main/java/com/linecorp/armeria/client/observation/MicrometerObservationClient.java
Outdated
Show resolved
Hide resolved
core/src/main/java/com/linecorp/armeria/client/observation/HttpClientContext.java
Outdated
Show resolved
Hide resolved
...rc/main/java/com/linecorp/armeria/server/observation/HttpServerObservationDocumentation.java
Outdated
Show resolved
Hide resolved
core/src/main/java/com/linecorp/armeria/server/observation/MicrometerObservationService.java
Outdated
Show resolved
Hide resolved
core/src/main/java/com/linecorp/armeria/server/observation/MicrometerObservationService.java
Outdated
Show resolved
Hide resolved
core/src/main/java/com/linecorp/armeria/server/observation/MicrometerObservationService.java
Outdated
Show resolved
Hide resolved
core/src/main/java/com/linecorp/armeria/server/observation/MicrometerObservationService.java
Outdated
Show resolved
Hide resolved
Hey I think I've applied all the changes. Tell me if I missed sth |
f55ad42
to
3b3ed20
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
High-quality work! Many thanks, @marcingrzejszczak. 🚀🙇♂️
...ain/java/com/linecorp/armeria/client/observation/DefaultHttpClientObservationConvention.java
Outdated
Show resolved
Hide resolved
core/src/main/java/com/linecorp/armeria/client/observation/ClientObservationContext.java
Outdated
Show resolved
Hide resolved
core/src/main/java/com/linecorp/armeria/client/observation/ObservationClient.java
Outdated
Show resolved
Hide resolved
core/src/main/java/com/linecorp/armeria/client/observation/ObservationClient.java
Outdated
Show resolved
Hide resolved
...c/main/java/com/linecorp/armeria/server/observation/DefaultServiceObservationConvention.java
Outdated
Show resolved
Hide resolved
core/src/main/java/com/linecorp/armeria/server/observation/ObservationService.java
Outdated
Show resolved
Hide resolved
core/src/main/java/com/linecorp/armeria/server/observation/ObservationService.java
Outdated
Show resolved
Hide resolved
…ion/MicrometerObservationClient.java
…rometerObservationService.java Co-authored-by: Ikhun Um <[email protected]>
…rometerObservationClient.java Co-authored-by: Ikhun Um <[email protected]>
Co-authored-by: Ikhun Um <[email protected]>
…aultHttpClientObservationConvention.java Co-authored-by: Ikhun Um <[email protected]>
…entObservationContext.java Co-authored-by: Ikhun Um <[email protected]>
…ervationClient.java Co-authored-by: Ikhun Um <[email protected]>
…ervationClient.java Co-authored-by: Ikhun Um <[email protected]>
…aultServiceObservationConvention.java Co-authored-by: Ikhun Um <[email protected]>
…ervationService.java Co-authored-by: Ikhun Um <[email protected]>
…ervationService.java Co-authored-by: Ikhun Um <[email protected]>
Co-authored-by: minux <[email protected]>
67d7a72
to
868c8ea
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot for your patience, @marcingrzejszczak. Looks great to me. It'd be awesome if Micrometer Observation exposes an API that allows us to specify timestamps explicitly, though. Any chance we can have them in the future?
Thanks! I left a comment on that comment :) I'd like @shakuzen and @jonatan-ivanov to chime in on that cause if we don't care that the moment in which the counter got incremented is not the same as the one from the timestamp then we should be alright and could add this feature in the future |
Thanks a lot for your high quality contribution and patience, @marcingrzejszczak! Let us stay tuned to micrometer-metrics/micrometer#4032 🙇 |
Sure @trustin, we had a conversation inside the team about this and we see no problem in adding that feature to Micrometer . Since I have your attention here, what do you think of netty/netty#8546 ? :) |
fixes #4659