proposal: x/net/http2: provide hook for custom frame policers #63518

Open
elindsey opened this issue Oct 12, 2023 · 14 comments

@elindsey

This is a followup to CVE-2023-44487/CVE-2023-39325.

The fix that shipped is a very welcome change and nicely caps the number of handlers in the system. However, it is still insufficient in certain cases, for example Kube and some of our own services. In practice it is difficult, if not impossible, to lower MAX_CONCURRENT_STREAMS below 100 without hitting client compatibility issues, and even at 100 the rapid reset attack can incur significant resource use, especially where the Go server is a gateway to other heavyweight services.

A number of mitigations have been deployed in other stacks; a non-exhaustive summary:

  • a token bucket rate limiter across all HTTP/2 frames, then terminating the connection (this has proven effective even against new attacks; some projects put these in place after the 2019 HTTP/2 vulnerabilities and were not affected by this latest issue; a rough sketch follows this list)
  • a token bucket rate limiter strictly for RST_STREAM frames, then terminating the connection
  • heuristics for tracking total "overhead" frames, then terminating the connection
  • delays in processing new frames on connections that are exhibiting unusual behavior
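
To make the first variant concrete, a per-connection token bucket could look roughly like the sketch below (names such as frameBucket are hypothetical and not part of any existing API):

package framepolice

import "time"

// frameBucket is a per-connection token bucket: every incoming frame consumes
// one token, and tokens refill at a fixed rate up to a burst cap.
type frameBucket struct {
	tokens   float64
	burst    float64
	rate     float64 // tokens added per second
	lastSeen time.Time
}

func newFrameBucket(rate, burst float64) *frameBucket {
	return &frameBucket{tokens: burst, burst: burst, rate: rate, lastSeen: time.Now()}
}

// allow reports whether another frame may be processed. A false return is the
// point at which a server would send GOAWAY/ENHANCE_YOUR_CALM and close the connection.
func (b *frameBucket) allow() bool {
	now := time.Now()
	b.tokens += now.Sub(b.lastSeen).Seconds() * b.rate
	if b.tokens > b.burst {
		b.tokens = b.burst
	}
	b.lastSeen = now
	if b.tokens < 1 {
		return false
	}
	b.tokens--
	return true
}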

The thing all these mitigations have in common is visibility into the types of frames flowing over a connection. Additionally, this level of visibility is necessary to produce metrics and get insights into what attacks are even being seen, something that is not possible with the current APIs.

The proposal is to add a single callback hook that receives basic frame information scoped to the individual connection: frame type, frame length, and stream ID. The callback's return value may trigger termination of the connection. With this hook, all necessary metrics can be gathered and all of the mentioned variants of frame policers can be built. The default is unset, which leaves out-of-the-box behavior unchanged.
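
As an illustration of the shape being proposed (hypothetical names; the attached patch may differ in details), such a hook and a simple policy built on it could look roughly like:

package framepolice

// FramePolicer is a hypothetical sketch of the callback described above: it is
// invoked once per received frame with basic metadata, and returning true asks
// the server to terminate the connection.
type FramePolicer func(frameType uint8, frameLength uint32, streamID uint32) bool

// rstLimitPolicer is an illustrative policy: sever a connection that sends
// more than maxRSTs RST_STREAM frames (frame type 0x3) over its lifetime.
func rstLimitPolicer(maxRSTs int) FramePolicer {
	count := 0
	return func(frameType uint8, frameLength uint32, streamID uint32) bool {
		if frameType == 0x3 { // RST_STREAM
			count++
		}
		return count > maxRSTs
	}
}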

Attached is a patch that we deployed to build our mitigation for the CVE -
0001-introducing-ConnectionCalmer.patch

@gopherbot gopherbot added this to the Proposal milestone Oct 12, 2023
@enj
Contributor

enj commented Oct 12, 2023

While I am not against this proposal, I do not consider it sufficient, because it requires per-app changes to mitigate potential HTTP/2 CVEs. The default behavior of the system should prevent abuse. For example, the fix for CVE-2023-44487/CVE-2023-39325 should have 'fixed' the problem for Kube, especially since Kube lets an end user configure MAX_CONCURRENT_STREAMS.

@elindsey
Author

My take on this CVE is that, without changes to the standards, there will always be some set of apps requiring specific changes, because the mitigation is necessarily somewhat coarse and will have cases where it is either too aggressive or not aggressive enough. That means that no matter what, some new API surface area will need to be exposed, and ideally that surface would be flexible enough not to need changes the next time an HTTP/2 vulnerability rolls around (i.e. it would be ideal not to expose just a 'maxRsts' config or similar).

Whether the proposal is this API with a default policer or this API with no default policer is something I don't have a strong opinion on.

@ianlancetaylor ianlancetaylor moved this to Incoming in Proposals Oct 12, 2023
@ianlancetaylor
Member

CC @neild @bradfitz

@dprotaso

dprotaso commented Oct 23, 2023

For example, the fix for GHSA-4374-p667-p6c8 should have 'fixed' the problem for Kube, especially since Kube lets an end user configure MAX_CONCURRENT_STREAMS.

What are the next steps here? Does the Go team agree with this assessment, and is further follow-up work planned?

@dprotaso

Thanks @dims, I saw that PR. I was curious whether there are going to be further changes to the Go stdlib.

@skonto

skonto commented Oct 23, 2023

Hi all, I am wondering how the current Go fix differs from the design behind haproxy's solution.
It is claimed that haproxy was never affected by the attack (more here). There is an older fix on the haproxy side which imposes a strict limit. So I am wondering whether such a solution could be used to solve this once and for all, without any further mitigation. 🤔

@elindsey
Author

@skonto There are roughly three common fixes that have been deployed for this issue.

From least to most restrictive:

  1. Tying the number of open streams in the upper application layer to the concurrent streams limit. Not doing this was essentially a bug, and the stacks that weren't doing it tended to show higher impact sooner. This is what haproxy was doing, and also what the recent Go patch does.
  2. Doing 1, but also limiting the number of streams sent to application-level processing to some level below the max concurrent streams and allowing it to grow. The max concurrent streams default is 100 and tends not to work well with clients when set below that, but you can effectively limit it by, e.g., only sending a portion of streams to processing and letting more through once the connection proves that it is well behaved (a rough sketch of this follows the list). This is what Apache httpd does.
  3. Adding various limits on frames seen - either hard limits on specific frame types, token buckets, or a running computation of what frame composition a "good" connection should exhibit and severing when passed. Netty, proxygen, tomcat, etc. have mitigations similar to this.
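
A rough sketch of the "earned concurrency" idea in 2 (purely illustrative, not Apache httpd's actual algorithm; all names hypothetical):

package admit

import "sync"

// admitLimiter starts by admitting only a few streams to application
// processing and grows the allowance as the connection completes requests
// normally.
type admitLimiter struct {
	mu       sync.Mutex
	inFlight int
	limit    int // current admission limit, grows with good behavior
	maxLimit int // e.g. the advertised MAX_CONCURRENT_STREAMS
}

func newAdmitLimiter(initial, max int) *admitLimiter {
	return &admitLimiter{limit: initial, maxLimit: max}
}

// tryAdmit reports whether a new stream may be handed to the application layer;
// streams that are not admitted would be held or queued rather than processed.
func (a *admitLimiter) tryAdmit() bool {
	a.mu.Lock()
	defer a.mu.Unlock()
	if a.inFlight >= a.limit {
		return false
	}
	a.inFlight++
	return true
}

// done is called when a stream finishes normally; well-behaved connections
// gradually earn a higher admission limit, up to maxLimit.
func (a *admitLimiter) done() {
	a.mu.Lock()
	defer a.mu.Unlock()
	a.inFlight--
	if a.limit < a.maxLimit {
		a.limit++
	}
}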

Number 3 isn't a replacement for doing 1 or 2, but it's a valuable addition for a few reasons:

  • Doing 1 without 2 or 3 can still allow too much load to reach backends, due to the high concurrency limit and the inability to effectively lower it for many clients; this is what Kube saw.
  • 1 and 2 prevent kicking off backend processing that would significantly increase resource use, but the HEADERS/RST loop prior to starting backend processing can still be resource-intensive and peg the CPU and/or starve the event loop; this is what we're seeing.
  • Many normal DDoS strategies key off connection attempts (JA3 blocking, SYN cookies, etc.), so having a way to turn misbehaving connections into reconnects allows a wider set of mitigations to be used.

@skonto

skonto commented Oct 26, 2023

Hi @elindsey, thanks for the insights. Looking at the haproxy comments, it seems that the per-connection maximum is fixed to the maximum number of streams that can be processed:

Several experiments were made using varying thresholds to see if
overbooking would provide any benefit here but it turned out not to be
the case, so the conn_stream limit remains set to the exact streams
limit. Interestingly various performance measurements showed that the
code tends to be slightly faster now than without the limit, probably
due to the smoother memory usage.

So this is 1 in your description, applied per connection. If I understand correctly, the problem of multiple clients each using the maximum number of streams is not discussed in that fix, but from the later discussion the effect does not seem considerable when a connection uses the max of 100.
In that thread folks also argue that there is no reason to lower the max concurrent streams per connection, since a client could simply open more connections with fewer streams each, for example to avoid the bad-behavior detection (Apache httpd), as you also mention in your comment above.
I guess global limits (max connections per proxy instance) also help to put an upper bound on overall resource consumption, but the hard and important part is identifying bad traffic (e.g. blocking IPs) and preventing it from consuming any part of your resource budget (e.g. even with connection attempts alone). So the hook requested here would allow some early bad-behavior detection, although I suspect people have other systems on the traffic path to mitigate the issue in general, right?

@neild
Contributor

neild commented Apr 26, 2024

The proposal is to add a hook for HTTP/2 server connections:

package http2

// ConnectionCalmer is called on each new frame received. Metadata about the
// frame is passed to the function. If `true` is returned, the connection will
// be sent a GOAWAY, ENHANCE_YOUR_CALM and terminated.
type ConnectionCalmer func(f FrameType) bool

type Server struct {
	// NewCalmer constructs a calmer for a connection.
	// If nil, no calmer is used.
	NewCalmer func() ConnectionCalmer
}

The use case is to permit users to provide their own fine-grained rate limiting policy on HTTP/2 connections.
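
For example, assuming the API above were adopted, a user-supplied RST_STREAM limiter might be wired up roughly like this (illustrative only; NewCalmer and ConnectionCalmer do not exist in x/net today, so this will not compile against the current package):

package example

import "golang.org/x/net/http2"

// newServer wires a simple reset limiter into the proposed NewCalmer field.
func newServer() *http2.Server {
	return &http2.Server{
		NewCalmer: func() http2.ConnectionCalmer {
			rsts := 0
			return func(f http2.FrameType) bool {
				if f == http2.FrameRSTStream {
					rsts++
				}
				// Sever the connection after an implausible number of resets.
				return rsts > 200
			}
		},
	}
}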

I understand the desire here. However, the proposed API is a very low-level hook into the internals of the HTTP/2 server. This is the sort of feature we have generally avoided providing, and have often come to regret when we do provide it. For example, I think it's clear at this point that http2.Server.NewWriteScheduler was a mistake, as was http2.Transport.ConnPool.

A point in this proposal's favor is that, unlike WriteScheduler and ClientConnPool, the hook is quite simple: It is called once per frame, and it is called with just the frame type.

On the other hand, it seems to me inevitable that someone is going to want the contents of each frame, in which case this hook no longer suffices.

I personally would much prefer to provide reasonable default behavior and higher-level APIs than this one.

Adding various limits on frames seen - either hard limits on specific frame types, token buckets, or a running computation of what frame composition a "good" connection should exhibit and severing when passed. Netty, proxygen, tomcat, etc. have mitigations similar to this.

In particular, if there are good mitigations of this nature that can be applied, I'd be much happier about implementing them directly in the HTTP/2 server so that all users can take advantage of them.

That said, I do see the argument that providing the tools to build mitigations permits more experimentation in user code when there isn't an obvious right approach.

@enj
Contributor

enj commented Apr 26, 2024

@neild what would you recommend that folks building Go servers that run directly on the public internet do to harden against http2 issues? I was able to do kubernetes/kubernetes#121120 for a subset of clients, but kubernetes/kubernetes#121197 remains open with no good solution.

@neild
Contributor

neild commented Apr 26, 2024

Ideally, if there's additional hardening we can apply, we'd do so in net/http and golang.org/x/net/http2 so all users can benefit.

kubernetes/kubernetes#121197 doesn't have much information about what specific measures you want to take. Can you go into more detail?

@elindsey
Author

In particular, if there are good mitigations of this nature that can be applied, I'd be much happier about implementing them directly in the HTTP/2 server so that all users can take advantage of them.

Unfortunately I haven't yet seen a scheme that doesn't require both exposing a configuration knob and exposing frame-level metrics in order to configure correctly, so you pretty quickly end up with a lowish-level hook.

Tomcat is perhaps the most interesting to dig into. They tried to make a more "hands-off" policer that tracks the number of good frames and the number of frames it deems 'overhead' (essentially things that aren't requests). Theoretically this would be less brittle than a simpler rate limiter and require less tuning, but anecdotally it has been difficult to use in practice. Tuning it requires understanding both the client traffic patterns and Tomcat's internal algorithm so that the rates can be translated into Tomcat config. It hasn't been entirely robust to new attack types, requiring the successive addition of specific configs (e.g. overHeadResetFactor, overheadContinuationThreshold), and has grown to five separate tunables. And perhaps most relevant, tuning it at all still requires frame-level metrics, so it needs some form of lower-level API if you want to avoid requiring users to rely on decrypted pcaps.
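
For context, the general shape of that kind of heuristic is roughly the following (a hypothetical sketch, not Tomcat's actual algorithm or configuration):

package overhead

// overheadTracker counts "overhead" frames: frames that represent productive
// work pay down a counter, frames that are pure protocol overhead (PING,
// SETTINGS, RST_STREAM, empty DATA, ...) increase it, and the connection is
// severed once a threshold is crossed.
type overheadTracker struct {
	overhead  int
	threshold int
}

// onUsefulFrame is called for frames that make request progress.
func (t *overheadTracker) onUsefulFrame() {
	if t.overhead > 0 {
		t.overhead--
	}
}

// onOverheadFrame is called for frames that cost work but make no progress;
// it reports whether the connection should be terminated.
func (t *overheadTracker) onOverheadFrame(cost int) bool {
	t.overhead += cost
	return t.overhead > t.threshold
}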

Netty, proxygen, and similar proxy toolkits had an easier time because it was more natural to expose lower-level APIs, or they already had APIs in place that let users implement mitigations. My experience with a similar Netty rate limiter, and seeing that it was robust in the face of newer attacks (RST, CONTINUATION, etc.), is what originally led to the ConnectionCalmer() patch. It's definitely a less natural fit for Go's net APIs, but I haven't seen a good or simpler alternative yet (very happy to be wrong).

@enj
Contributor

enj commented May 1, 2024

kubernetes/kubernetes#121197 doesn't have much information about what specific measures you want to take. Can you go into more detail?

I want to make it so that the cost (CPU/RAM/etc.) for a single authenticated client attempting to DoS the Kube API server is roughly 1:1 between the client and the server. Today, a single authenticated client using many TCP connections with many HTTP/2 streams can cause the Kube API server to perform a disproportionately large amount of work while using only a limited amount of resources on the client side.

I am not looking to address concerns related to DDoS.
