Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Endpoint stats interface #1900

Merged
merged 14 commits into from
Jul 28, 2024
Merged

Endpoint stats interface #1900

merged 14 commits into from
Jul 28, 2024

Conversation

ryleung-solana
Copy link
Contributor

In Anza, we need to get more insight into the resources being used by quinn in servicing quic handshakes. This PR seeks to add an initial interface to the Endpoint struct to allow for getting more detailed stats than just the open connections, and some initial data relevant to the handshakes. On a side note to the Quinn devs, if there is a good way to support getting the current number of open handshakes, we would be interested in know it. Currently, it seems that the Endpoint is not signaled when a Connecting future finishes, and to add such an event would require some refactoring, not to mention handling an extra event, locking, etc.

Copy link
Collaborator

@Ralith Ralith left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if there is a good way to support getting the current number of open handshakes, we would be interested in know it

This isn't readily available, because endpoint behavior isn't sensitive to connection state once the application has decided to accept it. What is readily available is the number of incoming connection attempts which haven't yet been accepted/rejected/ignored. You could always maintain your own atomic counter that you update by whatever logic you like, of course.

quinn/src/endpoint.rs Outdated Show resolved Hide resolved
quinn/src/endpoint.rs Outdated Show resolved Hide resolved
quinn/src/endpoint.rs Outdated Show resolved Hide resolved
quinn/src/endpoint.rs Outdated Show resolved Hide resolved
quinn/src/endpoint.rs Outdated Show resolved Hide resolved
@ryleung-solana
Copy link
Contributor Author

@Ralith @djc one thing that we're particularly interested in is related to the resource usage from crypto, especially in accept() because we don't currently reject any incoming connection requests. We merely accept all incoming connections and then immediately drop some Connecting future based on our own logic.

@djc
Copy link
Member

djc commented Jun 20, 2024

@Ralith @djc one thing that we're particularly interested in is related to the resource usage from crypto, especially in accept() because we don't currently reject any incoming connection requests.

Why not? I think we had some Solana people look at that as an option for this exact reason.

@ryleung-solana
Copy link
Contributor Author

ryleung-solana commented Jun 20, 2024

@Ralith @djc one thing that we're particularly interested in is related to the resource usage from crypto, especially in accept() because we don't currently reject any incoming connection requests.

Why not? I think we had some Solana people look at that as an option for this exact reason.

Mostly because we currently only do some basic rate-limiting and nothing more. It's probably a good idea to look at using reject when over the rate limit instead of accepting and immediately dropping 😓 but the thing is that we currently don't really have a great idea of what the limit should be, since we don't have insight into the resource usage from handshakes, especially crypto (but we think it's significant).

@ryleung-solana
Copy link
Contributor Author

Would you guys have any ideas of how we might be able to get some better numbers on e.g. the mem usage from crypto in accept() before the future is returned?

@Ralith
Copy link
Collaborator

Ralith commented Jun 21, 2024

Can you elaborate on "mem usage from crypto"? I don't expect cryptographic operations to consume significant memory, or really anything but O(input). Usually the concern with handshakes is CPU usage, which could in principle be measured with some sort of high resolution timer around accept and, most importantly, the poll calls on the internal connection driver future.

quinn/src/endpoint.rs Show resolved Hide resolved
quinn/src/endpoint.rs Outdated Show resolved Hide resolved
quinn/src/endpoint.rs Outdated Show resolved Hide resolved
@ryleung-solana
Copy link
Contributor Author

Can you elaborate on "mem usage from crypto"? I don't expect cryptographic operations to consume significant memory, or really anything but O(input). Usually the concern with handshakes is CPU usage, which could in principle be measured with some sort of high resolution timer around accept and, most importantly, the poll calls on the internal connection driver future.

I think that's (CPU usage) also something to look at as well. We're looking specifically at x509 verification, as it happens before we get the future returned to us when we accept() making it a prime target for rate limiting given some data. It would also allow us to determine once and for all, whether crypto does indeed take a significant amount of memory under attack (e.g. getting spammed with lots of handshake attempts, or some variation of this), which we suspect from a recent security advisory, but don't know for sure.

@djc
Copy link
Member

djc commented Jun 26, 2024

In general I think a quinn_proto::Connection takes a decent amount of memory, but only a small part of that is due to the crypto involved. To the extent that you can, it's probably better to reject connections based on the Incoming data.

@Ralith
Copy link
Collaborator

Ralith commented Jun 27, 2024

We're looking specifically at x509 verification, as it happens before we get the future returned to us when we accept() making it a prime target for rate limiting given some data.

This is no longer the case as of 0.11. No CPU-intensive work is performed until you accept a connection.

@ryleung-solana
Copy link
Contributor Author

ryleung-solana commented Jul 10, 2024

CC @t-nelson for input. From the above, it's a little difficult to directly measure memory usage from x509 verification and apparently there are decent ways to rate-limit this in newer versions anyway. I also understand that it was requested to track various state transitions in the handshake, but this doesn't look super feasible. In any case, perhaps we should just get the interface in and look at adding more fields later as the need arises?

@ryleung-solana ryleung-solana marked this pull request as ready for review July 12, 2024 16:35
quinn/src/endpoint.rs Outdated Show resolved Hide resolved
quinn/src/endpoint.rs Outdated Show resolved Hide resolved
quinn/src/endpoint.rs Outdated Show resolved Hide resolved
Copy link
Member

@djc djc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks.

@ryleung-solana
Copy link
Contributor Author

Hmm, I'm unable to merge. Perhaps it's because "@Ralith requested changes"?

@djc
Copy link
Member

djc commented Jul 18, 2024

You won't be able to merge either way, only we do. But, I'll wait for @Ralith's approval.

Copy link
Collaborator

@Ralith Ralith left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@Ralith Ralith merged commit a385630 into quinn-rs:main Jul 28, 2024
8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants