Expose a request ID to help with debug logging #407

markt-asf · 2021-07-09T16:18:19Z

Inspired by what Netty currently does.

To assist with debug logging - particularly in the context of asynchronous I/O, HTTP upgrade and HTTP/2 - it would be helpful if the Servlet API exposed some form of tracking ID to help track the following:

network connection
- identify requests that shared a connection (HTTP keep-alive, HTTP upgrade)
request within a connection
- particularly for async requests that move between threads, link activity associated with the same request
- must be unique for the current OS process when combined with the network connection ID
- may have additional uniqueness properties
protocol specific IDs
- HTTP/2 connection and stream
- WebSocket sesison ID
local address and port
- for correlation with network dumps, other processes
remote address and port
- for correlation with network dumps, other processes
container specific
- is there a requirement for this?

The intention is that the ID would correlate with container specific logging, assumming the container uses the same IDs for its internal logging.

"request within a connection" and "protocol specific" look to be equivalent.

The network address and port info could be relatively expensive to obtain. Container specific solutions often have short and long IDs. The would give us:

getShortID(), returns <connectionID>-<requestID>
getLongID(), returns <connectionID>-<requestID>-<localAddressAndPort>-<remoteAddressAndPort>[-<customContainer>]

Where:

connectionID is 8 hex digits (hash code of appropriate object as hex - avoids negative hashCodes as I want to use "-" as the separator)
requestID depends on protocol
- HTTP - simple counter for the connection
- HTTP/2 - connectionID:streamID
- WebSocket - session ID
- Other - "NONE" or protocol dependent
localAddressAndPort is IP:port in same format as URL
remoteAddressAndPort is IP:port in same format as URL
customContainer
- optional

On ServletRequest or HttpServletRequest? I'm leaning towards ServletRequest.

gregw · 2021-07-10T00:44:32Z

Firstly, I'm not sure that a hashCode is sufficient as an ID, since it is not required to be unique. Most implementations will provide a practically unique hashcode, but is that sufficient?

Then, I'm not sure that providing an API that does the combination is the right thing to do. You've already had to suggest 2 combinations, but there will always be users that want other combinations. I fear that we will see applications parsing these IDs to access the individual components. Thus my preference would be to provide API to the raw components (connectionID, requestID, connectionRequestCount, etc.) and then applications would be free to combine however they want to get an ID.

It would also need to be made clear if this was just a debugging API (and thus perhaps optional or no great need to enforce uniqueness) or are they APIs that could be used for real application work (perhaps caching security credentials by connectionID for example, and thus needing a guarantee of uniqueness?). I'm not sure we can stop users using a debugging API for real application work, so I think the API would have to be the later.

stuartwdouglas · 2021-07-12T02:22:59Z

This also sounds relatively expensive, so it needs to be able to be calculated in a lazy manner.

markt-asf · 2021-07-12T10:17:32Z

The two combinations were based on what Netty provides but I take the point that an API that provides the components offers more flexibility.
If we take the component approach, the only elements missing on ServletRequest are:

String getConnectionID()
String getRequestID()

Definitions something long the lines of:

Connection ID: case insensitive alphanumeric identifier that is unique within the lifetime of the JVM that represents the network connection.
Request ID: case insensitive alphanumeric identifier that is unique for a given network connection.

Thinking about it some more, keeping these distinct from any protocol specific IDs offers some benefits - particularly around HTTP upgrade if you want to track a connection that starts as HTTP and then upgrades to something else. However, there is no API to obtain either the h2 connection ID or stream ID. How about exposing these via request attributes?

If containers and/or frameworks opt to provide some form of combined ID then I'd expect that implementation to utilise lazy creation and caching as appropriate.

gregw · 2021-09-02T05:24:50Z

Note that with a connectionID then we may need some kind of Listener so we can inform the application that a connection has closed.

stuartwdouglas · 2021-09-02T06:12:12Z

I am not super enthusiastic about exposing connection related information to the user, as it can lead to the user making assumptions that are not always true (e.g. assuming that a connection is 1:1 with a user, assuming only one request active on the connection at a time).

Connection listeners are also really problematic, in terms of how their lifecycle is managed. If every request adds a connection listener and that connection lasts for a long time you have a potential memory leak.

gregw · 2021-09-02T06:45:48Z

@stuartwdouglas I'm kind of on the fence. Probably users shouldn't know about connections and may make wrong assumptions and listeners are complex..... but then you also get users that use other mechanism to reverse engineer connections... so there is a use-case. E.g. some large company app store is using HTTP/2 connections for authentication, so that one authed request over the connection makes the whole connection OK. Do we say to such users that servlets will never be for them?

stuartwdouglas · 2021-09-02T07:22:21Z

I am not 100% against it, but we need to make sure all the pitfalls are well documented, and that whatever listener API we come up with makes it easy to avoid possible concurrency issues you can get with HTTP/2.

We would also need to specify the threading requirements for listeners, as connection close can happen as an async notification. We should add a requirement that all request processing is complete (committed and threads returned to container) before the listeners are invoked.

markt-asf · 2021-09-02T07:24:16Z

The primary purpose of exposing these IDs is for debugging. I think connection tracking and associated listeners deserves a separate issue and discussion.

gregw · 2021-09-02T23:29:23Z

I'm not sure we can stop the IDs being used for uses other than debugging, specially as there is demand for connection management. The APN protocol is an example of something that uses connection state for a key part of their protocol. We currently could not implement APN in servlets because we have no way to know about connections nor to control parameters like h2 streams per connection.
Similarly many proxy style implementation in servlets could benefit from knowing about connections for significant performance improvements.

So I think this is precisely the kind of thing we need to consider in a major release... although I realise time is very short for 6.0 and it would be better to do nothing rather than get this wrong. But it is at least worth a couple of moments to think about what a full solution could look like and then make sure that any baby steps we take now are in the right direction.

Rather than just adding a connectionID method, when it is possible that we will need more methods later, perhaps we need to add a ServletConnection ServletRequest.getServletConnection() method?

public interface ServletConnection {
    String getConnectionID();
    InetSocketAddress getRemoteAddr();
    InetSocketAddress getLocalAddr();
    InetSocketAddress getRemote();
    InetSocketAddress getLocal();
    boolean isSecure();
    boolean isPersistent();
    int getRequestCount();
    int getMaxMultiplex();
    void setMaxMultiplex();
    void addListener(Listener listener);
    Object getAttribute(String name);
    void setAttribute(String name, Object value);

    interface Listener {
        onClose();
    }
}

Notes:

the difference between getLocal/Remote() and getLocal/RemoteAddr() is that the former is logical and the later actual. These may differ if PROXY protocol is used and h3 allows different ports to be used.
a connection may non be secure (ie not SSL), but individual requests might be secure because they have proxy headers saying they are secure. This together with making SSL certificates available in the attributes would go some way to letting a server know if a connection could be shared or not.

joakime · 2021-09-03T11:47:50Z

Don't use InetSocketAddress.
That's incompatible with things like UnixSocket.

Mark, String getConnectionID(), what would that represent in HTTP/2 (the physical connection? the stream?) and with HTTP/3 it falls apart again.

gregw · 2021-09-03T21:37:05Z

@joakim, there is a well define connection concept in both h2 and h3.

gregw · 2021-09-04T23:56:19Z

Note also that I'm asking about our destination, which may not be the next step. If the destination might look something like what i purposed then it may be that the next step is just:

Interface ServletConnection {
  String getId();
}

markt-asf · 2021-09-16T10:19:59Z

The first step is probably slightly more than that but not much. I like the general idea. I'll work up a PR for this as I think we are at the point where having something more concrete to discuss would be helpful.

markt-asf added a commit to markt-asf/servlet-api that referenced this issue Sep 16, 2021

Fix jakartaee#407 add identifiers for debug logging

fdda8c8

markt-asf closed this as completed in 40906cf Sep 20, 2021

markt-asf mentioned this issue Nov 10, 2022

Access to connector/socket container info #45

Closed

grgrzybek mentioned this issue Jun 12, 2023

Prepare Pax Web version using JakartaEE 10+ API ops4j/org.ops4j.pax.web#1802

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Expose a request ID to help with debug logging #407

Expose a request ID to help with debug logging #407

markt-asf commented Jul 9, 2021 •

edited

Loading

gregw commented Jul 10, 2021

stuartwdouglas commented Jul 12, 2021

markt-asf commented Jul 12, 2021

gregw commented Sep 2, 2021

stuartwdouglas commented Sep 2, 2021

gregw commented Sep 2, 2021

stuartwdouglas commented Sep 2, 2021

markt-asf commented Sep 2, 2021

gregw commented Sep 2, 2021

joakime commented Sep 3, 2021

gregw commented Sep 3, 2021

gregw commented Sep 4, 2021

markt-asf commented Sep 16, 2021

Expose a request ID to help with debug logging #407

Expose a request ID to help with debug logging #407

Comments

markt-asf commented Jul 9, 2021 • edited Loading

gregw commented Jul 10, 2021

stuartwdouglas commented Jul 12, 2021

markt-asf commented Jul 12, 2021

gregw commented Sep 2, 2021

stuartwdouglas commented Sep 2, 2021

gregw commented Sep 2, 2021

stuartwdouglas commented Sep 2, 2021

markt-asf commented Sep 2, 2021

gregw commented Sep 2, 2021

joakime commented Sep 3, 2021

gregw commented Sep 3, 2021

gregw commented Sep 4, 2021

markt-asf commented Sep 16, 2021

markt-asf commented Jul 9, 2021 •

edited

Loading