Reprioritization frames #1021
Comments
Can you expand on what you mean by "complex"? Parsing strings is more difficult and error-prone than parsing fixed-width binary fields. I think you might be concerned about ordering problems w.r.t. other frames for the same stream, but I'm not sure.
Funny you wrote that, because I have the exact opposite opinion. Priority is a relative concept. We must have some way to group requests before we can speak meaningfully about the priority of any individual request. This proposal does not define an explicit grouping of requests -- the grouping is implicitly per connection. Whether we communicate priorities with a header or a frame does not change this fact. It follows that this proposal defines hop-by-hop priorities. We can claim they are "end-to-end" but that is a lie. The proposal admits as much when it refers to proxies needing to coalesce priorities before they are forwarded to backends.

I can imagine true end-to-end priorities. One strawman proposal is to say that priorities are global: if client A sends a request with level=low, and client B sends a request with level=high, then B's request is always preferred. This is a bad idea for obvious reasons. Another strawman would be some explicit grouping mechanism, but connections are the only available grouping mechanism I can think of.

Relatedly, if we are using an "end-to-end" HTTP header field, then I expect that the header field should have some meaning on HTTP/1. But I'm not sure what that meaning is ... it hasn't been specified, and I'm not sure if any meaningful behavior is possible.
Don't agree. Reprioritization is not possible with header fields (barring a hacky repurposing of trailers, as I think you suggested), so a frame-based API is strictly more powerful. HTTP libraries typically have a type that manages information about the request, beyond just the request headers. This type is a perfectly reasonable place to put priority information. JS has RequestInfo and RequestInit, Go has http.Request, Python has urllib.Request, etc. I don't see how frames make APIs "far more difficult"? Perhaps you had a different notion of "API" in mind?
Thanks for kicking this off! Indeed, I do have questions about header fields. After reading @tombergan's reply, I think we're both on the same page.
It feels like any such proxies need to make a conscious decision about how to prioritize such requests. Using a hop-by-hop mechanism makes this required. Frames are obviously hop-by-hop and they seem ideally suited here. (If we felt really passionate that headers are the right approach then perhaps a pseudo-header would be an option. But frames seem simpler in this case).
The evidence we've seen with HTTP/2 is that no one really experimented with priorities and that there were scant APIs that enabled doing so. Putting priority information in request objects does make sense, but we have not seen that in practice. Why not? Is it a by-product of the complicated tree? Would a simpler priority scheme make people more inclined to take this up? In contrast, everything I'm familiar with has methods to set or get headers as key-value pairs. Handling headers is a common activity, so the code is probably battle-tested, more familiar to developers, and easily migrates between implementation environments and languages. What's more, endpoint applications could experiment today without needing their libraries to be updated. The code implementing the priority scheduling does need to understand these signals, but in my experience the internals of this are not exposed to applications, which is probably a good thing.
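To illustrate the point about experimenting through a generic key-value header API: an application could set the draft's `Priority` field today with nothing but its library's ordinary header mechanism. This is a hypothetical sketch; the `u` (urgency, 0-7) and `i` (incremental) parameters follow the draft's structured-field syntax, but the helper function is an illustration, not a real library API.

```python
def priority_field_value(urgency: int = 3, incremental: bool = False) -> str:
    """Serialize priority parameters in the draft's structured-field
    Dictionary shape, e.g. "u=1, i". Toy helper for illustration only."""
    if not 0 <= urgency <= 7:
        raise ValueError("urgency must be in 0..7")
    items = [f"u={urgency}"]
    if incremental:
        items.append("i")  # a boolean true is serialized as a bare key
    return ", ".join(items)

# Any library that can set arbitrary headers can carry the signal:
headers = {"Priority": priority_field_value(urgency=1, incremental=True)}
```

No library update is needed for the client to emit the signal; only the code doing the actual scheduling has to understand it.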
I think this point came up in some design team meetings and it's a good one. Would consuming a header meet the expectations of developers? Are there other examples of that?
For what it's worth, Chrome's URLRequest object has a priority field: https://cs.chromium.org/chromium/src/net/url_request/url_request.h?sq=package:chromium&g=0&l=900 That being said, it's a simple enum. We had many discussions about exposing HTTP/2's prioritization mechanism up the stack, but decided against it. Among other things, the inter-stream dependencies in HTTP/2 made it really challenging to create a higher-level abstraction that made sense here. The proposed priority scheme (enum priority + bool interleave), on the other hand, would be simple to expose.
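The "enum priority + bool interleave" shape described above is small enough to sketch directly. The names and levels here are hypothetical (loosely mirroring the simple enum style of Chrome's URLRequest), not taken from any actual API:

```python
from dataclasses import dataclass
from enum import IntEnum

class RequestPriority(IntEnum):
    # Hypothetical levels; any small fixed set would do.
    IDLE = 0
    LOW = 1
    MEDIUM = 2
    HIGH = 3

@dataclass
class PriorityHint:
    """A per-request hint: how important, and whether the response may
    be interleaved (round-robined) with same-priority peers."""
    priority: RequestPriority = RequestPriority.MEDIUM
    interleave: bool = False

hint = PriorityHint(priority=RequestPriority.HIGH, interleave=True)
```

Because each hint is self-contained, it can be attached to a request without the caller knowing anything about other in-flight requests or the eventual connection.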
Interesting, thanks!
Something to consider is the proposal's extensibility capability. If that is desirable, then a less rigid structure is desirable, and we settled on structured headers for this purpose in the current design (including the PRIORITY_UPDATE frame holding a field that carries the same value). A more binary-friendly format with extensibility is, in HTTP/2, a bit of a tricky problem to solve. So it will be interesting to see what API designers would think is sane.
The extensibility is an interesting issue! On one hand, simply passing around ASCII text which can store information on any prioritization scheme is really convenient and attractive. On the other hand, as something to expose on, for example, Chrome's URLRequest object, it's not a great fit. In Chrome at least, requests are typically initiated at a layer that is entirely ignorant of the underlying connection that will be used to satisfy the request. (Many requests result in a new connection being created, for example.) As such, the caller does not know if it's going to be HTTP/1, HTTP/2, HTTP/3, HTTP/3+priority extension X, etc. This means that the caller is not in a great position to know what underlying prioritization scheme will be used on the wire. As a result, the API that we want to expose on the request needs to be relatively generic. (HTTP/2 stream dependencies are the opposite extreme :>)

Further, along the way from the Request to the connection there are various throttles and internal prioritization decisions which take the Request's priority into consideration. As such, it's important that this code be able to understand priority, and a free-form priority string is problematic in this respect.

All of this being said, I think the issue of API design should be largely orthogonal to the on-the-wire prioritization encoding, and I wonder if we (HTTP and QUIC WGs) would be well served to focus on the protocol and not the API? But I'm not sure about this...
Extensibility is an interesting argument for using headers. I will have to think about that some more.
These are good questions. My attempt at answers: Browsers use 10s to 100s of HTTP requests to render a complex document with many parts and dependencies, often for users over slow networks, making prioritization important. That might be a special case. Outside of browser land, I suspect that most programs fetch no more than one or a few URLs at a time and wouldn't benefit from priorities. (I don't have any evidence for this, it's just a suspicion.) One exception is RPC-over-HTTP, such as gRPC. Developers have requested priority support for gRPC, but the maintainers have pushed back with complexity concerns.

One thing to note about gRPC is that its API does not expose the underlying HTTP headers directly (see here and here), but it does expose Custom-Metadata which is serialized as ordinary HTTP header fields. These fields are defined to be application-specific and are not interpreted by gRPC. Our new Priority header field does not match that pattern.

There has been interest in exposing priorities to JavaScript. I'm not sure what the status of that is -- @yoavweiss might know more.
My understanding is that the frame is just the header-field value (i.e., text) shoved into a frame with some additional fields that allow the frame to reference the subject. That referential information is where the added complexity comes in.
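The shape described above -- the header-field value as text, prefixed by a reference to the subject stream -- can be sketched roughly. This follows my reading of the draft's PRIORITY_UPDATE payload (a variable-length integer identifying the prioritized element, then the ASCII field value); the varint encoding is QUIC's standard one, and the function names are mine:

```python
def encode_varint(v: int) -> bytes:
    """QUIC variable-length integer: a 2-bit length prefix in the
    first byte selects a 1, 2, 4, or 8 byte encoding."""
    if v < 1 << 6:
        return v.to_bytes(1, "big")
    if v < 1 << 14:
        return (v | (1 << 14)).to_bytes(2, "big")
    if v < 1 << 30:
        return (v | (2 << 30)).to_bytes(4, "big")
    return (v | (3 << 62)).to_bytes(8, "big")

def priority_update_payload(stream_id: int, field_value: str) -> bytes:
    """Prioritized Element ID followed by the ASCII header-field value.
    The referential stream ID is exactly the extra complexity the frame
    carries over a plain header field."""
    return encode_varint(stream_id) + field_value.encode("ascii")
```

The payload for stream 4 with value `u=1` would be the single varint byte `0x04` followed by the three ASCII bytes of the field value.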
This is, I think, a good place to take this discussion. I agree that we can only realistically use priority information if the signals share a common reference point. Priority signals from different sources are unlikely to share a reference frame (which is another reason that merging is probably a bad idea - servers might not share the frame the client uses). However, we have a number of ways in which we might determine that two requests share a common reference frame. Connections are probably the obvious way to tie requests together, but they fall short in a few ways. The coalescing intermediary described in the draft is one. And then we have HTTP/1.1, as you say.

The use of IP- and transport-layer fields to correlate requests is similarly fraught, but I'll bet that it happens already, so I wouldn't rule it out completely. Weaknesses due to competition of incompatible endpoints behind a NAT are probably rare enough (outside large-scale NAT) that you could deploy something that used source IP with moderate confidence that it wouldn't do significant harm. The [X-]Forwarded-For fields are still useful at origin servers that are fronted by an intermediary. But the obvious grouping construct here is cookies. As much as we despise them, they are almost perfect for this purpose.

It is conceivable that HTTP/1.1 requests could use priority via the above, though I suspect that the implementation challenges are such that we won't see it happen. If you had a server with multiple HTTP/1.1 connections to it from the same source, it could implement shared prioritization and congestion control for that set of connections. You might say that HTTP/2 is probably a good idea at that point and you'd be right, but I've seen stranger things.
This is why I like header fields. It avoids creating any strong linkage between the API and the eventual connection. We know that they get through, even if they might not be acted upon in precisely the same way. Have you considered continuing to respect |
You are right. I must have been remembering an older draft that used a 32-bit field. This seems to give frames identical extensibility to headers.
A tree-based API is difficult to expose easily in a comprehensible way. You could go object-oriented, and pass a request object to another request's method to establish a relationship. Or you could pass IDs around for the same effect in a more declarative style. But regardless, it requires that the caller understand the universe of other requests that are open on the connection -- and therefore requires knowing the relationship between requests and connections, which might be better to abstract away. However, this style of priority is an independent declaration about each request -- the caller doesn't need to know any relationships; it just describes the resource's importance. That's a much easier API to expose and an easier API to understand how to use.
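The contrast above can be made concrete with two hypothetical API sketches (all names and values here are invented for illustration):

```python
# Style 1: tree-based. The caller must relate requests to each other,
# and therefore must know which requests share a connection.
class TreeRequest:
    def __init__(self, url: str):
        self.url = url
        self.parent = None
        self.weight = 16

    def depend_on(self, other: "TreeRequest", weight: int = 16) -> None:
        self.parent = other
        self.weight = weight

# Style 2: independent declaration. Each request describes only its
# own importance; no knowledge of other requests is needed.
class FlatRequest:
    def __init__(self, url: str, urgency: int = 3, incremental: bool = False):
        self.url = url
        self.urgency = urgency        # 0 (highest) .. 7 (lowest)
        self.incremental = incremental

img = TreeRequest("/hero.jpg")
css = TreeRequest("/site.css")
img.depend_on(css, weight=8)          # caller must already know css exists

hero = FlatRequest("/hero.jpg", urgency=1)  # stands entirely alone
```

The second style composes cleanly with layers that are ignorant of the eventual connection, which is exactly the situation described for Chrome's request stack earlier in this thread.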
FWIW, if starting from a frame axiom, one might wish to create an extensible frame format. HTTP/2 SETTINGS provides fixed-size key-value parameters, but that might be too restrictive for the needs of extensions. It might be possible to define some nested binary structure that would allow for lists, dictionaries, etc. But that might lead to divergence between HTTP/2 and HTTP/3. As mentioned somewhere (Singapore maybe?), it might be prudent to stick with Structured Headers as a canonical definition, and maybe later we can lean on the proposed Binary Structured Headers for a less bespoke, more efficient(?) serialization.
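To show how little machinery the structured-headers route demands of a consumer, here is a toy parser for the Dictionary subset the priority field uses. This is deliberately not an RFC-grade Structured Fields parser; it only handles integer parameters and bare boolean keys:

```python
def parse_priority(value: str) -> dict:
    """Parse a subset of the Structured Fields Dictionary syntax,
    e.g. "u=1, i" -> {"u": 1, "i": True}. Bare keys mean boolean true.
    Unknown keys pass through untouched, which is the extensibility win."""
    result = {}
    for member in value.split(","):
        member = member.strip()
        if not member:
            continue
        if "=" in member:
            key, _, raw = member.partition("=")
            result[key.strip()] = int(raw) if raw.lstrip("-").isdigit() else raw
        else:
            result[member] = True
    return result
```

An extension parameter simply shows up as one more dictionary key; no frame-format negotiation is needed, which is the crux of the argument for text over a bespoke binary structure.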
Trying to return this to something more concrete: I see two ways this can go. One is that we drop the header and use a frame only. I prefer this for a number of reasons already mentioned. The other is that we keep the header. I am open to this (despite preferring the frame) if we can resolve the following questions:
Martin had a good comment earlier regarding the first two questions. What I got from that comment was: cookies could work, but many people dislike cookies; IP could work, except for large-scale NATs; you could use cookies or IPs to do something on H1, but it seems unlikely anyone will do so due to the complexity. Therefore, in practice, we'll likely end up in the same place that we would with frames: priority will apply within individual H2/H3 connections.
My previous comment is a blatant misread of MT's. Apologies. Martin was asking for a signal that reprioritization is a mechanism that is useful, and that there is evidence that people would implement it. The linked PR does not answer the question about reprioritization. What it allows is the use of a frame for the initial priority signal. Sending a signal on a different stream from the request means that there will be ordering edge cases; a server that can accommodate those doesn't need to do too much work to support reprioritization while a response is in flight.
@martinthomson I don't think it's possible to conclude that the old prioritization scheme wasn't widely used because it was a frame. The main reason I've heard cited for why it wasn't widely used was excess complexity for implementers and for developers who used it. If the scheme had been simpler, I believe APIs would be widespread by now. Above, you talk about using priorities across multiple HTTP/1.1 connections and how it might work. For every load balancer setup I've ever heard of, that would be somewhere between impractical and impossible. Either way, I can't imagine anyone doing this. Cookies are a particularly bad example, because they are visible long after the connection has been load-balanced. Also, the frame solves a problem besides reprioritization, which is how to convey a hop-by-hop signal when a header indicates an E2E signal, such as on a multiplexed connection behind a load balancer. Some points that came out of the design team that are still important:
My personal opinion is that the header is primarily useful for server infrastructure to fix poor or non-existent client prioritization, which is the use case Cloudflare was successful with. I've seen no evidence the header has any advantages over the frame in the browser use case, and it does introduce complexity, since now my multi-tier load balancer infrastructure needs to do special-case handling for a new header based on whether it's directly user-facing. Realistically, stripping it may be the safest option for load balancers.
I don't think I ever suggested a causal relationship between the frame and the failure of the old scheme. There are plenty of aspects in the overall design that you might point to, and I'm happy to concede that being a frame is probably the least problematic. I didn't mean to suggest that cross-connection prioritization was something you might expect to have implemented - at least at the server end. But browsers might be in a better position to do something. Addressing your points in order.
Are you suggesting that you can't or won't access a header field?
I think @icing made an interesting point in https://lists.w3.org/Archives/Public/ietf-http-wg/2020AprJun/0268.html, by using as an example an H2/H3 reverse proxy in front of a legacy H1 server. For a deployment that can only have limited concurrency between the H2/H3 reverse proxy and the backend, having a guarantee that the priority signal is delivered alongside the request itself is important. That is because the processing order of requests cannot be changed once the request is being forwarded to the backend server.

This sounded to me like an argument against using a frame sent on a stream other than the request stream for indicating initial priority, as we do not have an ordering guarantee across multiple streams in QUIC. I am not sure how strong that argument is, though; at least we (H2O) provide a knob to change the backend concurrency independently from the frontend concurrency, to prevent attackers from issuing a huge number of requests in parallel while providing enough concurrency to fill the BDP between the end-client and the reverse proxy.
Coming back to this after leave.
I certainly can access a header, I'm just saying it adds complexity and risk to a very complex and risk-averse environment on the server side, and I can't see the value outside of a multiplexed connection. I'm not saying we shouldn't include the header in the draft, but I don't see it as being that useful past the first hop.
From the chairs' email on August 26 2020 https://lists.w3.org/Archives/Public/ietf-http-wg/2020JulSep/0118.html
Twelve days on, there have been no objections. We should start thinking about closing this issue soon.
Closing this issue. If folks want to open new specific issues on the draft text, editor's copy, or when we cut draft -02, that's fine.
There has been some debate about the relative utility of frames as a means of prioritizing and reprioritizing requests.
Frames are not guaranteed to be end-to-end and don't enjoy the benefits listed in Section 8.1. An (unlisted) reason is that the API for header fields is better understood and more likely to be available than a new and unspecified API. HTTP/2 defined the use of frames for reprioritization, but APIs are not widely available, nor are they consistent.
Frames provide two obvious benefits:
They are scoped to the current connection and so only propagate if the receiving endpoint explicitly propagates them.
They can be sent at any time. This allows for reprioritization, but the requirement to have them sent after requests are complete means that they can't be sent "on" a stream, they have to instead refer to a stream, which complicates the design somewhat. The proposal to allow trailers at any time helps a little here, but only to the extent that it remains possible to send. Stream closure is used to terminate request bodies in some cases, making any trailer-based solution limited.
However, frames have shortcomings too:
They are complex. The encoding of a header field into a frame is suboptimal, but necessary for extensibility and consistency reasons. And the manner in which they refer to requests (or pushes) makes them more complex to manage than a header field.
They are only scoped to a connection, which means that the extent of propagation can be non-deterministic. In particular, they depend on the connection supporting the signaling. This is the flip side of the benefit, and - in my opinion - outweighs it.
The API is far more difficult compared to header fields.
We know little enough about the efficacy of prioritization that we should defer the definition of frames absent strong motivation for their efficacy and desirability. The current document makes a good case for header fields, but the motivation for frames is weaker. It is true that cases like escalation of a speculative fetch to blocking status are theoretically appealing, but those cases are rare and unlikely to benefit from reprioritization. In particular, the mismatch between end-to-end and hop-by-hop (excuse the use of deprecated terminology) characteristics here makes it very hard to reason about the outcome.
I know that others (@RyanatGoogle in particular) have questions about the header field from a different angle and would prefer frames, exclusively. So this probably comes down to having more discussion about which approach to take: headers, frames, or both.
In either case, I would suggest that we have a need for both clear signals that the mechanism is broadly useful and evidence that it is likely to be widely implemented.