Enhance socket read to handle multiple packets #1530
Conversation
Codecov Report (Attention):

Additional details and impacted files:

    @@            Coverage Diff             @@
    ##             main    #1530      +/-   ##
    ==========================================
    - Coverage   87.61%   86.57%   -1.04%
    ==========================================
      Files         117      117
      Lines       37848    38120     +272
    ==========================================
    - Hits        33160    33002     -158
    - Misses       4688     5118     +430
neqo-client/src/main.rs
Outdated
    let d = Datagram::new(remote, *local_addr, &buf[..sz]);
    client.process_input(d, Instant::now());
    handler.maybe_key_update(client)?;
This still passes the packets one-by-one to neqo, so effectively it's doing exactly what we did before. I think what we want (also for GRO/GSO) is to be able to pass a Vec<Datagram> into neqo.
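A minimal sketch of the batching read loop this suggests on the client side, assuming a nonblocking std::net::UdpSocket (the real neqo-client code uses its own socket handling; read_batch is a hypothetical helper):

```rust
use std::io::ErrorKind;
use std::net::UdpSocket;

// Drain every packet currently queued on the socket into one batch,
// instead of handing each one to the stack individually.
fn read_batch(socket: &UdpSocket, buf: &mut [u8]) -> std::io::Result<Vec<Vec<u8>>> {
    let mut batch = Vec::new();
    loop {
        match socket.recv_from(buf) {
            Ok((sz, _remote)) => batch.push(buf[..sz].to_vec()),
            // WouldBlock means the socket is drained: stop reading.
            Err(e) if e.kind() == ErrorKind::WouldBlock => break,
            Err(e) => return Err(e),
        }
    }
    Ok(batch)
}

fn main() -> std::io::Result<()> {
    let rx = UdpSocket::bind("127.0.0.1:0")?;
    rx.set_nonblocking(true)?;
    let tx = UdpSocket::bind("127.0.0.1:0")?;
    let dst = rx.local_addr()?;
    tx.send_to(b"one", dst)?;
    tx.send_to(b"two", dst)?;
    std::thread::sleep(std::time::Duration::from_millis(50));

    let mut buf = [0u8; 2048];
    let batch = read_batch(&rx, &mut buf)?;
    assert_eq!(batch.len(), 2);
    println!("read {} packets in one pass", batch.len());
    Ok(())
}
```

The whole batch would then go to the stack in a single call rather than one process_input per packet.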
I've tried to expose a new interface, pub fn process_multiple_input(&mut self, dgrams: Vec<Datagram>, now: Instant), to allow passing a Vec<Datagram> to neqo, but I am not sure if this is enough.
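Roughly what that interface does, sketched with stand-in Connection and Datagram types (the real neqo Datagram also carries source and destination addresses):

```rust
use std::time::Instant;

// Stand-in for neqo's Datagram payload type.
#[derive(Clone)]
struct Datagram(Vec<u8>);

struct Connection {
    packets_seen: usize,
}

impl Connection {
    // Stand-in for the existing single-packet entry point.
    fn process_input(&mut self, _d: Datagram, _now: Instant) {
        self.packets_seen += 1;
    }

    // Batched entry point: feed every datagram read from the socket at once.
    fn process_multiple_input(&mut self, dgrams: Vec<Datagram>, now: Instant) {
        for d in dgrams {
            self.process_input(d, now);
        }
    }
}

fn main() {
    let mut conn = Connection { packets_seen: 0 };
    let batch = vec![Datagram(vec![0; 1200]), Datagram(vec![0; 1200])];
    conn.process_multiple_input(batch, Instant::now());
    assert_eq!(conn.packets_seen, 2);
}
```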
Actually, I think we want an array here and not a Vec, because IIRC there is some way to run the AEAD over multiple consecutive packets in a memory buffer rather than one-by-one. We'd need to pick a sane maximum array length, probably based on whatever the GSO/GRO limit is at the moment.
We'll also need to make sure that neqo consumes/processes all the inbound frames in an array of datagrams before deciding when and what to transmit.
Force-pushed from e7d76a1 to 19334d0 (compare)
    @@ -921,6 +921,13 @@ impl Connection {
             self.streams.cleanup_closed_streams();
         }

    +    /// Process new input datagrams on the connection.
    +    pub fn process_multiple_input(&mut self, dgrams: Vec<Datagram>, now: Instant) {
I'm not sure what @larseggert specifically meant when he said "array" here, but Rust convention would have this interface as:
    -    pub fn process_multiple_input(&mut self, dgrams: Vec<Datagram>, now: Instant) {
    +    pub fn process_multiple_input(&mut self, dgrams: &[Datagram], now: Instant) {
I would also inline the loop in multiple_input. It's a trivial function that isn't used elsewhere.
Finally, please consider what you will do when dgrams.is_empty(). This function could do nothing in that case.
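The review points combined in one hedged sketch, using stand-in types; the clone in the loop is only needed as long as the single-packet path still takes ownership:

```rust
use std::time::Instant;

#[derive(Clone)]
struct Datagram(Vec<u8>);

struct Connection {
    packets_seen: usize,
}

impl Connection {
    fn process_input(&mut self, _d: Datagram, _now: Instant) {
        self.packets_seen += 1;
    }

    // Slice-based signature per the review: borrow the batch, return early
    // when it is empty, and inline the loop rather than delegating.
    fn process_multiple_input(&mut self, dgrams: &[Datagram], now: Instant) {
        if dgrams.is_empty() {
            return;
        }
        for d in dgrams {
            // This clone is the cost of borrowing the batch while
            // process_input still takes an owned Datagram.
            self.process_input(d.clone(), now);
        }
    }
}

fn main() {
    let mut conn = Connection { packets_seen: 0 };
    conn.process_multiple_input(&[], Instant::now()); // no-op on empty input
    conn.process_multiple_input(&[Datagram(vec![0; 1200])], Instant::now());
    assert_eq!(conn.packets_seen, 1);
}
```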
If I change it to &[Datagram], how can I avoid cloning each datagram?
Good point. It's a pretty big change, but I think that you could change all the input/process functions to take a datagram reference instead. Maybe impl AsRef<Datagram> would make the whole thing less of a challenge.
For now, this is fine.
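One way the impl AsRef<Datagram> idea could look, sketched on a stand-in type; both owned and borrowed datagrams then satisfy the same bound, so callers holding a slice need not clone:

```rust
struct Datagram(Vec<u8>);

// AsRef is not implemented reflexively for arbitrary types, so a manual
// impl is needed; &Datagram then gets it via the blanket impl for references.
impl AsRef<Datagram> for Datagram {
    fn as_ref(&self) -> &Datagram {
        self
    }
}

// Hypothetical input function: accepts Datagram or &Datagram alike.
fn input(d: impl AsRef<Datagram>) -> usize {
    d.as_ref().0.len()
}

fn main() {
    let d = Datagram(vec![0; 1200]);
    assert_eq!(input(&d), 1200); // borrowed: no clone required
    assert_eq!(input(d), 1200); // passing ownership still works
}
```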
I'd like to avoid a big change for now. Please let me know if you think it's better to change all input/process functions to take a reference; if so, I'll create another issue.
Thanks.
If we want to support GSO/GRO and avoid memory copies, we need to pass a consecutive region of memory in and out of the packet I/O functions: https://blog.cloudflare.com/accelerating-udp-packet-transmission-for-quic
Yeah, that is possible here. The GSO calls happen outside this code and the memory can be cut up and passed in to the stack. That is easier if we move to accepting slices of bytes, but that is more disruptive and we can probably manage with Vec::split_off at first.
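A sketch of the Vec::split_off approach mentioned here, assuming a fixed segment size; carve is a hypothetical helper that turns one owned receive buffer into owned per-packet buffers:

```rust
// Carve an owned, contiguous receive buffer into per-packet Vec<u8>s.
// Note split_off copies the tail into a new allocation, which is why
// accepting slices of bytes would be better; this just works without
// changing the existing owned-Datagram API.
fn carve(mut buf: Vec<u8>, seg_size: usize) -> Vec<Vec<u8>> {
    let mut out = Vec::new();
    while buf.len() > seg_size {
        let rest = buf.split_off(seg_size); // buf keeps the first segment
        out.push(std::mem::replace(&mut buf, rest));
    }
    out.push(buf); // final (possibly short) segment
    out
}

fn main() {
    let packets = carve(vec![0u8; 3000], 1280);
    let lens: Vec<usize> = packets.iter().map(Vec::len).collect();
    assert_eq!(lens, vec![1280, 1280, 440]);
}
```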
I'm not sure I follow. On GSO (i.e., TX), the region passed down into the kernel needs to be consecutive. We can obviously copy memory to make it so, but it would be more efficient if that was not needed (or done as a side effect of where we need to copy memory anyway, i.e., when doing TLS encrypt).
On GRO (i.e., RX) we get a consecutive memory region from the kernel, which we could exploit to get some TLS decrypt efficiency but don't need to.
This change was about reading. If we get a contiguous range of memory, Rust might view that as a single x: &[u8]. It is trivial then for us to split that up into multiple slices x[..1280], x[1280..2560], ..., each of which is wrapped in a fresh Datagram instance. We probably couldn't use Vec for that, but we could if it was passed as owned memory, like Vec<u8>.
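The slicing described above can be sketched with chunks, assuming the kernel reports one segment size (1280 here is illustrative) for every packet but possibly the last:

```rust
// Split a contiguous GRO receive buffer into borrowed per-packet slices,
// each of which could then be wrapped in a fresh Datagram without copying.
fn split_gro(buf: &[u8], seg_size: usize) -> Vec<&[u8]> {
    buf.chunks(seg_size).collect()
}

fn main() {
    let buf = vec![0u8; 1280 * 2 + 600]; // two full segments plus a tail
    let packets = split_gro(&buf, 1280);
    assert_eq!(packets.len(), 3);
    assert_eq!(packets[2].len(), 600);
}
```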
I agree that sending is going to be harder. My original design had the memory provided by the caller, but we decided to keep things this way very early on. That is still something we can probably manage with a few tweaks to our API, probably just by doing the encryption in place, but we could also do in-place packet building and save an allocation with only a tiny bit more effort. We can keep the single-shot nature of the API in that case, with the caller passing in mutable slices as needed.
Force-pushed from 19334d0 to 14ab5df (compare)
Previously, we only read one packet from the socket at a time. This patch allows reading more than one packet from the socket.