Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(dcutr): Fix end to end tests and add legacy behavior flag (default=true) #3044

Merged
merged 33 commits into from
Feb 25, 2025

Conversation

MarcoPolo
Copy link
Collaborator

@MarcoPolo MarcoPolo commented Nov 15, 2024

Previously this PR sought to migrate away from the legacy behavior. Now it just tests the hole punching behavior using a simulated network. It also adds a flag to control the legacy behavior, but this is set to default to the current behavior.

Most of the code introduced here is related to the new simconn package which lets us create simulated UDP connections for use with the QUIC transport. This lets us set up arbitrary networks to test holepunching. The End to End test now places hosts A and B behind a firewall that requires holepunching for a direct connection.

The plan is to merge this PR to actually test holepunching, then #3171 by coming up with a migration strategy and use these tests to verify the strategy.

@sukunrt
Copy link
Member

sukunrt commented Nov 18, 2024

On the side that in the new code will be the Server, can we add a random wait time after which the server will switch the role to Client and initiate the security negotiation?

@MarcoPolo MarcoPolo changed the title [DO NOT MERGE] fix: dcutr: fix roles in tcp simultaneous connection [DO NOT MERGE] fix(dcutr): fix roles in tcp simultaneous connection Nov 18, 2024
@MarcoPolo
Copy link
Collaborator Author

Does it need to be random?

Could the server wait 3 RTT (it has the estimate) for the client and switch after?

@sukunrt
Copy link
Member

sukunrt commented Nov 19, 2024

Unfortunately simple timeouts wont work. This is trickier.

We need to always start as a Server. Then switch roles after a random wait. This is because when old go nodes and new go nodes interact, they may both assume the Client role, in which case, conn establishment will fail as both the sides assume the Client role.

In the new code, if we always assume the role of Server, there will be 3 cases:

old go node vs new go node

  1. Client <-> Server:
    For client client to work, we need to assume the role of Server in new code. The peer, old go node, will initiate the handshake.
  2. Server <-> Server
    For server server to work, we need to assume the role of Server in new code, switch roles and initiate the handshake.

new go node vs new go node

  1. Server <-> Server:
    After a random backoff one side becomes the client and initiates the handshake.

@MarcoPolo MarcoPolo force-pushed the marco/fix-dcutr-roles branch 2 times, most recently from cc7e359 to 6459ddb Compare February 4, 2025 01:15
@MarcoPolo MarcoPolo changed the title [DO NOT MERGE] fix(dcutr): fix roles in tcp simultaneous connection fix(dcutr): Fix end to end tests and add legacy behavior flag (default=true) Feb 5, 2025
@MarcoPolo MarcoPolo requested a review from sukunrt February 5, 2025 19:20
@MarcoPolo MarcoPolo marked this pull request as ready for review February 5, 2025 19:26
Copy link
Member

@sukunrt sukunrt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some minor comments. It is fine to merge for me. This is a significant improvement over the current testing of the holepunch package! 🚀

Comment on lines 40 to 41
overrideListenUDP listenUDP
overrideSourceIPSelectorFn func() (SourceIPSelector, error)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we make the ConnManager an interface so that we have a simple implementation to use simulated conns.
Admittedly, it'll not test the full holepunching stack. But the current system feels a bit too complex.

I don't have a strong opinion on fixing this right now.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think I follow, can you explain a bit more?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I think I understand. You're suggesting having quicreuse implement some new interface that we could then implement some simpler mock thing that returns the simulated packet conns.

If that's it, I think I'd prefer to have these options in quic resuse for the following reasons:

  1. This doesn't change any of the real logic of the quicreuse package.
  2. It keeps the simulated environment closer to the real environment.
  3. We could test how QUIC and WebTransport interact in quicreuse with holepunching.

@MarcoPolo MarcoPolo force-pushed the marco/fix-dcutr-roles branch from 0803336 to e0a57d7 Compare February 25, 2025 04:13
@MarcoPolo MarcoPolo merged commit 93014e1 into master Feb 25, 2025
9 checks passed
@MarcoPolo MarcoPolo mentioned this pull request Feb 25, 2025
19 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants