Metrics #98

NiteshKant · 2014-04-21T07:27:54Z

In order to use RxNetty effectively in production, we would require to have some metrics providing insights into the clients and servers created by RxNetty. Currently, there isn't any metrics published by RxNetty.

Pluggability

Since, RxNetty strives to be not opinionated on many aspects but for the fact that it uses Netty for the network stack and RxJava for async interfaces. It would not be prudent for us to bind to a particular metrics framework like servo, yammer metrics, etc.
Instead we should provide hooks into the system for anyone to be able to provide a plugin for various metrics framework.

Hooks

There are a few ways of providing hooks for the metrics system.

Event based

This essentially means that we publish events for any interesting state change inside RxNetty via Observables, from where people can infer these metrics.
Current RxClient connection pool implementation provides such a hook here

Advantages

The advantage of this approach is that we do not have to maintain any state inside RxNetty and would not face problems around holding this metrics around overflow which without a metrics framework would create more work for us.
One subtle advantage is that we do not have to upfront decide for everyone, which metrics are useful, people can infer newer metrics from the available events.

Disadvantages

The disadvantage is that for every plugin, people have to re-invent the wheel from the point of view of understanding these events and correlating them to infer these metrics. This gets complex when the metrics are to be inferred from a set of events instead of one.

Data based

We can also provide a data based plugin mechanism, wherein we store all the metrics of interest and provide a hook to transform and send these metrics to the metric framework of choice.

Advantages

Lesser things to do for every plugin provider.
No one has to understand the infrastructure to create the metrics.

Disadvantages

We would have to create basic metrics constructs like rolling counters, percentiles, etc. which comes for free with
any metrics framework.

Required metrics

Client

All required client side metrics are defined in the issue #96

Server

All required server side metrics are defined in the issue #97

benjchristensen · 2014-04-21T19:31:48Z

The disadvantage is that for every plugin
We would have to create basic metrics constructs like rolling counters

For both of these we can have contrib modules that provide all of this by default.

For example, a default metrics implementation can use Numerus for counters, percentiles etc (https://github.com/Netflix-Skunkworks/Numerus) and then we can have publishers for Servo and Yammer Metrics.

NiteshKant · 2014-04-21T20:27:38Z

@benjchristensen sounds good. So your preference is the event based mechanism?

allenxwang · 2014-04-21T20:34:28Z

It seems to me that the event based solution is better, provided that we provide default implementations based on a metrics system like Servo in an extension. Then there is no need to re-invent the wheel.

For example, TrackableStateChangeListener can be easily extended into a class for pool stats once we add Servo annotations and we can put such class under rxnetty-servo.

benjchristensen · 2014-04-21T20:47:19Z

I think the event based one makes sense. Keep in mind too that I found Servo to not have all of the types needed for things like RollingPercentile, so Numerus was created with these types.

It was done separate from Servo instead of adding to Servo so they could be used with or without the library for publishing, which is what we use Servo and Yammer Metrics for on Hystrix.

Mantis is also using Numerus for metrics, then allows publishing of metrics via Servo and Yammer, same as we're discussing here.

NiteshKant · 2014-04-21T21:01:55Z

Awesome. I will work on the event based system then.

NiteshKant · 2014-05-19T21:58:04Z

Moving this to milestone 0.3.5.
Had to invest time on issue #117 which requires a release now.

NiteshKant · 2014-05-26T21:10:45Z

This design has been updated a few times based on the discussion/feedback in the following comments

Design Proposal

After a few preliminary discussions on the design in issue #96 I am proposing the following design.
(The code for the proposal can be found here)

Metric Events

Code

Any metric can be derived from these events. These events should always be referred to as constants (enum pattern) to avoid creation of new objects per event callback.

public interface MetricsEvent<T extends Enum> {

    T getType();

    boolean isTimed();

    boolean isError();
}

Metric Event Listeners

Code

A listener that can be added to a Client/Server to listen for metrics related events.

public interface MetricEventsListener<E extends MetricsEvent<?>> {

    int NO_DURATION = -1;
    Object NO_VALUE = null;
    Throwable NO_ERROR = null;
    TimeUnit NO_TIME_UNIT = null;

    /**
     * Event callback for any {@link MetricsEvent}. The parameters passed are all the contextual information possible for
     * any event. There presence or absence will depend on the type of event.
     *
     * @param event Event for which this callback has been invoked. This will never be {@code null}
     * @param duration If the passed event is {@link MetricsEvent#isTimed()} then the actual duration, else
     * {@link #NO_DURATION}
     * @param timeUnit The time unit for the duration, if exists, else {@link #NO_TIME_UNIT}
     * @param throwable If the passed event is {@link MetricsEvent#isError()} then the cause of the error, else
     * {@link #NO_ERROR}
     * @param value If the passed event requires custom object to be passed, then that object, else {@link #NO_VALUE}
     */
    void onEvent(E event, long duration, TimeUnit timeUnit, Throwable throwable, Object value);

    /**
     * Marks the end of all event callbacks. No methods on this listener will ever be called once this method is called.
     */
    void onCompleted();
}

Metric Event Publishers

Code

Source of Metric Events that can have zero or more listeners registered with them.

public interface MetricEventsPublisher<E extends MetricsEvent<?>> {

    Subscription subscribe(MetricEventsListener<? extends E> listener);

}

Registering Metric Event Listeners

There are two primary ways of registering Metric Event Listeners with the targeted Clients and Servers.

Global Associations

One can register a MetricEventsListenerFactory with RxNetty to associate a new listener created from this factory for every client and server created by RxNetty.
This would usually be done at the application level (for instance any application created using karyon) to provide a metrics framework like servo, yammer metrics

The Factory definition looks like this:

public interface MetricEventsListenerFactory {

    MetricEventsListener<ClientMetricsEvent<ClientMetricsEvent.EventType>> forClient(@SuppressWarnings("rawtypes")RxClient client);

    MetricEventsListener<HttpClientMetricsEvent<HttpClientMetricsEvent.HttpEventType>> forHttpClient(@SuppressWarnings("rawtypes")HttpClient client);

    MetricEventsListener<UdpClientMetricsEvent<UdpClientMetricsEvent.UdpEventType>> forUdpClient(@SuppressWarnings("rawtypes")UdpClient client);

    MetricEventsListener<ServerMetricsEvent<ServerMetricsEvent.EventType>> forServer(@SuppressWarnings("rawtypes")RxServer server);

    MetricEventsListener<HttpServerMetricsEvent<HttpClientMetricsEvent.HttpEventType>> forHttpServer(@SuppressWarnings("rawtypes")HttpServer server);

    MetricEventsListener<UdpServerMetricsEvent<UdpServerMetricsEvent.UdpEventType>> forUdpServer(@SuppressWarnings("rawtypes")UdpServer server);
}

Instance level associations

In addition to global associations one can also register a metric events listener per client/server as all the Clients and Servers contracts will also implement the Metric Event Publishers interface defined above.

allenxwang · 2014-05-27T19:01:50Z

I am wondering if it is necessary to have boolean isTimed(); and boolean isError(); as part of the MetricsEvent interface, as MetricEventsListener will have different callbacks for timed events and errors.

benjchristensen · 2014-05-28T17:02:18Z

I think MetricEventsListener is generally the right pattern. I like that is lets the values stay as primitives.

A few comments:

Duplication

There appear to be 3 event types and then duplicate overloads of each with long duration:

    void onEvent(E event, Object value);

    void onEvent(E event, Object value, long duration);

Can we eliminate the one without duration so implementers don't have to deal with the either/or invocation of these?

Object

What is the reason for this event type? I'd like to know the use case that requires sending the Object and seeming to confuse with onEvent(E event, long duration);

void onEvent(E event, Object value, long duration);

Events

Are they going to be defined as an enum somewhere or left completely dynamic? I imagine that these won't change much so should be defined.

NiteshKant · 2014-05-28T20:23:53Z

Thanks @allenxwang for the comment.

I am wondering if it is necessary to have boolean isTimed(); and boolean isError(); as part of the MetricsEvent interface, as MetricEventsListener will have different callbacks for timed events and errors.

Yeah its more like a metadata for human consumption (to know what data will be accompanied with that event) and not for RxNetty to make any decisions on those attributes.

In the below example looking at the different event, it is quiet evident what accompanying data one can expect while listening for these events:

public class ClientMetricsEvent extends AbstractMetricsEvent {

    public static final ClientMetricsEvent NEW_REQUEST = new ClientMetricsEvent("new-request", false, false);
    public static final ClientMetricsEvent REQUEST_SUCCEEDED = new ClientMetricsEvent("request-succeeded", true, false);
    public static final ClientMetricsEvent REQUEST_FAILED = new ClientMetricsEvent("request-failed", true, true);

    /*Always refer to as constants*/protected ClientMetricsEvent(String name, boolean isTimed, boolean isError) {
        super(name, isTimed, isError);
    }
}

OTOH, if this metadata is absent, someone has to look for code that publishes these events to know what data accompanies these events.

NiteshKant · 2014-05-28T20:36:42Z

Thanks @benjchristensen for your comments.

Let me take this example to explain the reasoning for the design.

public class ClientMetricsEvent extends AbstractMetricsEvent {

    public static final ClientMetricsEvent NEW_REQUEST = new ClientMetricsEvent("new-request", false, false);
    public static final ClientMetricsEvent REQUEST_SUCCEEDED = new ClientMetricsEvent("request-succeeded", true, false);
    public static final ClientMetricsEvent REQUEST_FAILED = new ClientMetricsEvent("request-failed", true, true);

    /*Always refer to as constants*/protected ClientMetricsEvent(String name, boolean isTimed, boolean isError) {
        super(name, isTimed, isError);
    }
}

Duplication

For NEW_REQUEST there would be only the event but for REQUEST_SUCCEEDED there will be event and duration.
Similarly, for REQUEST_SUCCEEDED there would just be duration but for REQUEST_FAILED there would be duration and exception.
These usecases were the reason for the methods in the listener.

Object

Now, if we plan to also pass Http Status code with REQUEST_SUCCEEDED then we would need this method:

void onEvent(E event, Object value, long duration);

and pass the Status code for the value parameter. Thinking about this now, I think if the Event itself defines the type of value as a generic parameter, it will be more intutive.

Events

Yes we will have classes like the above example which will define all events published by RxNetty.

headinthebox · 2014-05-28T21:02:02Z

No way to remove listeners?

NiteshKant · 2014-05-28T21:43:41Z

@headinthebox Good point :) Added removeListener() to MetricEventsPublisher and updated my previous comment.

NiteshKant · 2014-05-29T05:17:22Z

Added removeListener() to MetricEventsPublisher and updated my previous comment.

On thinking more about it , this feels better:

public interface MetricEventsPublisher<E extends MetricsEvent> {

    Subscription subscribe(MetricEventsListener<? extends E> listener);

}

returned Subscription can be unsubscribed to remove the listener. Thoughts?

NiteshKant · 2014-06-03T21:09:47Z

Updated the MetricEventsPublisher interface in the original design to use Subscription as proposed in my previous comment.

@benjchristensen @allenxwang @headinthebox I am starting work on this now, so let me know if you have any more comments about the design.

NiteshKant · 2014-06-04T23:53:43Z

As I started implementing this design I found that there were multiple things that were wrong :). I am stating them below:

The time unit for the duration parameter in MetricEventsListener interface is missing.
The lack of an enum in the event is quiet painful as listeners can not do a switch-case while handling these events. So, instead of MetricsEvent.name() I propose Metrics.getType() which returns an enum. So, the MetricEvent interface looks like:

public interface MetricsEvent<T extends Enum> {

    T getType();

    boolean isTimed();

    boolean isError();
}

Since, the Enum is different for different events, the MetricsEvent class is parameterized.

There is no onComplete() on MetricEventsListener so listeners can not do any cleanup when the associated client/server shutdown. It is more appropriate for client listeners as they can be large in number and have a lifecycle.
The multiple methods in MetricEventsListener are quiet error prone for event consumers. The biggest issue being that if the consumer started listening to an event in the wrong method, he will never get the event. The penalty of this inadvertent mistake is quiet high and tough to debug.
I think @benjchristensen you were referring to this in your feedback.
In order to eliminate this issue, I can think of the approach where we fold all these variations into a single method and make all other arguments but for the Event as optional.
Based on the above, I can think of the following listener interface.

public interface MetricEventsListener<E extends MetricsEvent<?>> {

    void onEvent(E event, long duration, TimeUnit timeUnit, Throwable throwable, Object value);

    void onComplete();
}

All arguments to onEvent() above are optional but for event. duration will be -1 if the event is not timed. All other parameters will be null when not applicable.

Thoughts?

Did a few changes in the design as discussed in the github issue. This code change suffices for the current Pool State change events. I will be adding the remaining events and migrate the tests to use the new model.

NiteshKant · 2014-06-09T06:34:19Z

Updated the original design above with the proposed changes here

allenxwang · 2014-06-09T16:07:36Z

It seems that Event isTimed() and isError() API is somewhat redundant now that duration and Throwable is available in the onEvent() method. It could lead to ambiguity which one should take precedence for event consumer.

NiteshKant · 2014-06-10T06:03:31Z

@allenxwang as mentioned before, the isTimed() and isError() APIs are for human consumption and not really to make a code decision. If a consumer is consuming a particular event type, the data published with the event must be known to the consume before hand and RxNetty will make sure that the metadata for the event type (isTimed(), isError()) is consistent with what is published.

The documentation in the MetricEventsListener.onEvent() is also pretty clear in suggesting when should the parameters be available.

So, if a consumer is consuming an event publishing duration and Throwable, the code should be able to directly access duration and Throwable without doing existence checks.

Metrics (Fixes Issues #96 #97 #98)

mikeweisskopf · 2015-09-29T16:30:52Z

At one point you mentioned you were adding removeListener() to MetricEventsPublisher, but later changed to use Subscription instead. The rational was "returned Subscription can be unsubscribed to remove the listener". However, it seems to me thats not what actually happens.....instead calling unsubscribe only changes the state of the subscription (for example BooleanSubscription.isUnsubscribed() will return true). However, the listener (MetricsEventsPublisher) remains unchanged....it still has the listener and the listener still receives events. Am I missing something here or is there actually no way to remove / stop a listener?

NiteshKant · 2015-09-29T16:51:55Z

@mikeweisskopf that sounds like a bug, this piece of code is expected to remove the listener, but there is mismatch between the type of item removed and added. Can you file a bug or if you are interested in sending a PR, you are welcome too :)

mikeweisskopf · 2015-09-29T20:01:15Z

Sure, I will file a bug and followup with a PR. Thanks for the quick reply :-)

Pull request is (here) [https://github.com//pull/428]

mikeweisskopf · 2015-12-11T23:44:40Z

Wondering about another perspective on metrics. The approach here generally aggregates metrics on a per client basis. But what if one needs to track metrics on a per request basis? For example, suppose one needs to know how long it took for a response to a specific request? It seems like we'd need to be able to pass some sort of "Request Context" into the MetricEventListeners to make that kind of granularity possible. In a way we have a little of that already .... for example StartTime is request specific. Do you have any thoughts on how something like this might be done, or am I missing something that is already present?

NiteshKant · 2015-12-12T06:36:22Z

@mikeweisskopf I have tried to address "general" requirements around what timings one would need to capture for HTTP/TCP interactions in 0.5.x. There is also a facility to pass custom events to the listeners.

There is an open request around what you are asking and one way of doing it would be to have an ability to attach a listener per request.

mikeweisskopf · 2015-12-14T16:49:38Z

Thank very much @NiteshKant , we'll take a look at listener per request as well as request context. Any thoughts on when there will be a release version of 0.5.x?

NiteshKant added the enhancement label Apr 21, 2014

NiteshKant self-assigned this Apr 21, 2014

NiteshKant added this to the 0.3.4 milestone Apr 28, 2014

NiteshKant modified the milestones: 0.3.5, 0.3.4 May 19, 2014

NiteshKant mentioned this issue May 23, 2014

Required Client side metrics #96

Closed

NiteshKant mentioned this issue May 26, 2014

Required server side metrics #97

Closed

NiteshKant modified the milestones: 0.3.7, 0.3.6 Jun 24, 2014

NiteshKant pushed a commit to NiteshKant/RxNetty that referenced this issue Jun 24, 2014

Fixes issue ReactiveX#96 ReactiveX#97 ReactiveX#98

5733172

NiteshKant added a commit that referenced this issue Jun 25, 2014

Merge pull request #149 from NiteshKant/metrics

4e26dd7

Metrics (Fixes Issues #96 #97 #98)

NiteshKant closed this as completed Jun 25, 2014

NiteshKant mentioned this issue Jun 25, 2014

Remove deprecated pool stats #150

Closed

NiteshKant removed their assignment Aug 19, 2014

mikeweisskopf mentioned this issue Sep 29, 2015

(0.4.x) Can't remove a listener from a MetricEventsPubliser #427

Closed

mikeweisskopf mentioned this issue Aug 10, 2016

allow MetricEventsListener to be specified for a particular request #386

Open

NiteshKant mentioned this issue Nov 5, 2016

[0.5.x-events] Events interface rsocket/rsocket-java#189

Merged

dbacinski mentioned this issue Aug 30, 2017

Metrics reactor/reactor-netty#157

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Metrics #98

Metrics #98

NiteshKant commented Apr 21, 2014

benjchristensen commented Apr 21, 2014

NiteshKant commented Apr 21, 2014

allenxwang commented Apr 21, 2014

benjchristensen commented Apr 21, 2014

NiteshKant commented Apr 21, 2014

NiteshKant commented May 19, 2014

NiteshKant commented May 26, 2014

allenxwang commented May 27, 2014

benjchristensen commented May 28, 2014

NiteshKant commented May 28, 2014

NiteshKant commented May 28, 2014

headinthebox commented May 28, 2014

NiteshKant commented May 28, 2014

NiteshKant commented May 29, 2014

NiteshKant commented Jun 3, 2014

NiteshKant commented Jun 4, 2014

NiteshKant commented Jun 9, 2014

allenxwang commented Jun 9, 2014

NiteshKant commented Jun 10, 2014

mikeweisskopf commented Sep 29, 2015

NiteshKant commented Sep 29, 2015

mikeweisskopf commented Sep 29, 2015

mikeweisskopf commented Dec 11, 2015

NiteshKant commented Dec 12, 2015

mikeweisskopf commented Dec 14, 2015

Metrics #98

Metrics #98

Comments

NiteshKant commented Apr 21, 2014

Pluggability

Hooks

Event based

Data based

Required metrics

Client

Server

benjchristensen commented Apr 21, 2014

NiteshKant commented Apr 21, 2014

allenxwang commented Apr 21, 2014

benjchristensen commented Apr 21, 2014

NiteshKant commented Apr 21, 2014

NiteshKant commented May 19, 2014

NiteshKant commented May 26, 2014

Design Proposal

Metric Events

Metric Event Listeners

Metric Event Publishers

Registering Metric Event Listeners

Global Associations

Instance level associations

allenxwang commented May 27, 2014

benjchristensen commented May 28, 2014

NiteshKant commented May 28, 2014

NiteshKant commented May 28, 2014

headinthebox commented May 28, 2014

NiteshKant commented May 28, 2014

NiteshKant commented May 29, 2014

NiteshKant commented Jun 3, 2014

NiteshKant commented Jun 4, 2014

NiteshKant commented Jun 9, 2014

allenxwang commented Jun 9, 2014

NiteshKant commented Jun 10, 2014

mikeweisskopf commented Sep 29, 2015

NiteshKant commented Sep 29, 2015

mikeweisskopf commented Sep 29, 2015

mikeweisskopf commented Dec 11, 2015

NiteshKant commented Dec 12, 2015

mikeweisskopf commented Dec 14, 2015