Deployment plan for AutoNAT v2 #10091

Open
3 tasks done
sukunrt opened this issue Aug 21, 2023 · 7 comments
Labels
effort/days Estimated to take multiple days, but less than a week kind/feature A new feature need/analysis Needs further analysis before proceeding need/maintainer-input Needs input from the current maintainer(s) P2 Medium: Good to have, but can wait until someone steps up

Comments

@sukunrt
Contributor

sukunrt commented Aug 21, 2023

Checklist

  • My issue is specific & actionable.
  • I am not suggesting a protocol enhancement.
  • I have searched on the issue tracker for my issue.

Description

We plan to introduce AutoNAT v2 in the next go-libp2p release (v0.31.0) (PR). AutoNAT v2 will allow nodes to test individual addresses for reachability. This fixes issues with address handling within libp2p and allows the node to advertise an accurate set of reachable addresses to its peers. It would also allow nodes to determine IPv6 vs IPv4 connectivity.

On initial deployment, a node won't be able to find any AutoNATv2 servers because there will be so few of them, so it cannot use AutoNAT v2 to verify its addresses. Initially, then, we need to enable the server and not use the client.

There are two considerations for a subsequent deployment that uses the v2 client to verify its addresses.

  1. Finding a Peer

As adoption of the version with the AutoNATv2 server increases, nodes will be able to find AutoNATv2 servers. The adoption fraction at which a node can expect to find at least 3 servers is some small multiple of 3 * 1/(DHT routing table size). For the IPFS DHT the routing table size is roughly 250 nodes, which makes this number 1.2%. So once 5% of nodes have adopted the version with the AutoNATv2 server, most nodes should be able to find enough AutoNAT v2 servers to verify their addresses.
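A rough way to sanity-check these numbers is to treat each of the ~250 routing table peers as an independent draw that is a server with probability equal to the adoption fraction. That independence assumption, and the helper names `expectedServers` and `probAtLeast`, are illustrative and not part of go-libp2p. A minimal sketch:

```go
package main

import (
	"fmt"
	"math"
)

// expectedServers returns the mean number of AutoNATv2 servers among
// tableSize routing-table peers when a fraction `adoption` of nodes runs one.
func expectedServers(tableSize int, adoption float64) float64 {
	return float64(tableSize) * adoption
}

// probAtLeast returns P(X >= k) for X ~ Binomial(n, p), i.e. the chance that
// at least k of n independently sampled peers are servers (an assumption).
func probAtLeast(n, k int, p float64) float64 {
	if k <= 0 {
		return 1
	}
	if p <= 0 {
		return 0
	}
	if p >= 1 {
		if k <= n {
			return 1
		}
		return 0
	}
	below := 0.0
	term := math.Pow(1-p, float64(n)) // P(X = 0)
	for i := 0; i < k && i <= n; i++ {
		below += term
		// P(X = i+1) = P(X = i) * (n-i)/(i+1) * p/(1-p)
		term *= float64(n-i) / float64(i+1) * p / (1 - p)
	}
	return 1 - below
}

func main() {
	const tableSize = 250 // rough IPFS DHT routing table size from the text
	for _, adoption := range []float64{0.012, 0.05} {
		fmt.Printf("adoption %.1f%%: expected servers %.1f, P(at least 3) = %.3f\n",
			adoption*100,
			expectedServers(tableSize, adoption),
			probAtLeast(tableSize, 3, adoption))
	}
}
```

Under this model, 1.2% adoption gives an expected 3 servers but leaves a sizeable chance of finding fewer, which is why the threshold is stated as a small multiple of that fraction.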

  2. Server Rate limiting

If the ratio of clients to servers is too high, AutoNATv2 will not be useful as clients won't be able to verify their addresses because of server rate limiting.

In the steady state, when a large chunk (10%?) of the network runs AutoNATv2 servers, this is not a problem. The ratio of DHT nodes in client mode to server mode is roughly 10:1, so we should have a ratio of 10 AutoNATv2 clients to 1 server. If all the clients have 10 addresses each and they verify these 10 addresses every 30 minutes (a very naive implementation), the server needs to verify 100 addresses in 30 minutes, which is roughly 3.3 addresses a minute (call it 4).

Until we reach this state, though, there is a possibility that clients update more quickly than servers. This seems likely to me, as about 50% of the IPFS network is still on v0.18. I'm not sure how to factor this in.
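The steady-state load estimate, and the inverse question of how many clients a server can carry, can be sketched as plain arithmetic. The function names and the example limit of 4 checks per minute are hypothetical, chosen only to illustrate the calculation; they are not actual AutoNATv2 server settings.

```go
package main

import "fmt"

// checksPerMinute returns the average number of address checks per minute
// that one server handles if every client re-verifies all of its addresses
// once per interval (the naive schedule described above).
func checksPerMinute(clientsPerServer, addrsPerClient int, intervalMinutes float64) float64 {
	return float64(clientsPerServer*addrsPerClient) / intervalMinutes
}

// maxClientsPerServer inverts the formula: given a per-server rate limit
// (hypothetical value), how many clients can one server sustain?
func maxClientsPerServer(limitPerMinute float64, addrsPerClient int, intervalMinutes float64) float64 {
	return limitPerMinute * intervalMinutes / float64(addrsPerClient)
}

func main() {
	// Steady state from the text: 10 clients per server, 10 addresses each,
	// re-verified every 30 minutes.
	fmt.Printf("steady state: %.2f checks/minute\n", checksPerMinute(10, 10, 30))

	// Early rollout, assuming (for illustration) 100 clients per server.
	fmt.Printf("early rollout: %.2f checks/minute\n", checksPerMinute(100, 10, 30))

	// With an assumed limit of 4 checks per minute per server.
	fmt.Printf("sustainable clients per server: %.0f\n", maxClientsPerServer(4, 10, 30))
}
```

Comparing the observed client-to-server ratio on the network against the second function's result gives one way to decide when enabling the client is safe.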

Is there any prior art to such rollouts? I'm especially interested in understanding how to bound the client to server ratio.

Note: I'm ignoring the alternative of gracefully handling the case where AutoNAT v2 peers are not available as the development time for that will be very high. It seems better to just switch over to AutoNATv2 after we have enough servers in the network.

cc @marten-seemann

@sukunrt sukunrt added the kind/feature A new feature label Aug 21, 2023
@Jorropo
Contributor

Jorropo commented Aug 21, 2023

This sounds incrementally testable.
We turn on the server, then use the network statistics to decide when the ratio and the total deployment both look OK.

@lidel lidel moved this to 🥞 Todo in IPFS Shipyard Team Oct 30, 2023
@lidel lidel added need/analysis Needs further analysis before proceeding P2 Medium: Good to have, but can wait until someone steps up effort/days Estimated to take multiple days, but less than a week need/maintainer-input Needs input from the current maintainer(s) labels Oct 30, 2023
@lidel lidel mentioned this issue Aug 5, 2024
32 tasks
lidel added a commit that referenced this issue Aug 5, 2024
Part of #10091
We include a flag that allows shutting down V2 in case there are issues
with it.
lidel added a commit that referenced this issue Aug 6, 2024
* feat: libp2p.EnableAutoNATv2

Part of #10091
We include a flag that allows shutting down V2 in case there are issues
with it.

* docs: EnableAutoNATv2
@lidel
Member

lidel commented Aug 8, 2024

For anyone following this rollout, #10468 is scheduled to ship with 0.30.0-rc1 (#10436). By default, a publicly dialable node will expose both v1 and v2:

/libp2p/autonat/1.0.0
/libp2p/autonat/2/dial-back
/libp2p/autonat/2/dial-request

My understanding of the plan:

  • For now, Kubo will still use v1 as a client until we have enough v2s.
  • Once we have enough v2s, we will switch to v2 and remove v1.
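One way to judge "enough v2s" from a node's point of view is to count connected peers that advertise the v2 protocol IDs listed above. The sketch below works on plain strings and does not use the go-libp2p API. The helper names, the threshold of 3 (taken from the issue description), and the assumption that a v2 server is identified by the `dial-request` protocol are all illustrative.

```go
package main

import "fmt"

const (
	autoNATv1            = "/libp2p/autonat/1.0.0"
	autoNATv2DialBack    = "/libp2p/autonat/2/dial-back"
	autoNATv2DialRequest = "/libp2p/autonat/2/dial-request"
)

// speaks reports whether the protocol list contains the given protocol ID.
func speaks(protocols []string, id string) bool {
	for _, p := range protocols {
		if p == id {
			return true
		}
	}
	return false
}

// countV2Servers counts peers that advertise the dial-request protocol,
// which this sketch assumes marks a peer as an AutoNAT v2 server.
func countV2Servers(peers map[string][]string) int {
	n := 0
	for _, protocols := range peers {
		if speaks(protocols, autoNATv2DialRequest) {
			n++
		}
	}
	return n
}

// useV2Client returns true once at least minServers v2 servers are known.
func useV2Client(peers map[string][]string, minServers int) bool {
	return countV2Servers(peers) >= minServers
}

func main() {
	// Hypothetical peer IDs and protocol lists.
	peers := map[string][]string{
		"peerA": {autoNATv1, autoNATv2DialBack, autoNATv2DialRequest},
		"peerB": {autoNATv1},
		"peerC": {autoNATv2DialBack},
	}
	fmt.Println("v2 servers:", countV2Servers(peers))
	fmt.Println("use v2 client:", useV2Client(peers, 3))
}
```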

@RubenKelevra
Contributor

RubenKelevra commented Aug 8, 2024

I haven't looked at the code yet, just a quick question: is there a blacklist yet for addresses that won't be tested?

I was thinking about RFC1918 as well as RFC4193 addresses. Otherwise, offering an AutoNAT server may cause a lot of connection attempts to a private network that the server also has access to.

The related bug tracking this would be libp2p/go-libp2p#436.

I just see an issue deploying a new version of AutoNAT without fixing a potential DoS vector of a server into a private network architecture not accessible to the internet.

Sure, the potential for harm is slim, but we have had issues with plastic routers at home hanging up because of this in the past.

@sukunrt
Contributor Author

sukunrt commented Aug 9, 2024

The client (provided in a future release) that uses AutoNAT v2 to verify reachability will only test public IP / DNS addresses. The client will not query for private IPs, and the server will reply with an error for private IPs.
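As an illustration of that kind of check (not the actual go-libp2p implementation), a public-address filter can be sketched with the Go standard library. `netip.Addr.IsPrivate` covers the RFC 1918 and RFC 4193 ranges; other non-public ranges such as carrier-grade NAT space are not handled here, and the function name is hypothetical.

```go
package main

import (
	"fmt"
	"net/netip"
)

// isPubliclyTestable reports whether s parses as an IP address that is a
// global unicast address and not in RFC 1918 / RFC 4193 private space.
// Loopback, link-local, multicast and unspecified addresses are rejected
// because IsGlobalUnicast returns false for them.
func isPubliclyTestable(s string) bool {
	addr, err := netip.ParseAddr(s)
	if err != nil {
		return false
	}
	addr = addr.Unmap() // treat IPv4-mapped IPv6 addresses as IPv4
	return addr.IsGlobalUnicast() && !addr.IsPrivate()
}

func main() {
	for _, s := range []string{"8.8.8.8", "10.1.2.3", "192.168.178.20", "fd00::1", "2001:4860:4860::8888"} {
		fmt.Printf("%-22s testable=%v\n", s, isPubliclyTestable(s))
	}
}
```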

libp2p/go-libp2p#436 is about the node sharing private unroutable IPs with the rest of the network. That's a separate issue unrelated to the service AutoNAT provides. If you're interested in fixing that, a comment explaining your use case will be very helpful!

I just see an issue deploying a new version of AutoNAT without fixing a potential DoS vector of a server into a private network architecture not accessible to the internet.

Can you elaborate on the attack vector here?

we have had issues with plastic routers at home hanging up because of this in the past.

Can you share more info about this?

@RubenKelevra
Contributor

The client(provided in a future release) that uses autonat v2 for verifying reachability, will only test Public IP / DNS Addresses. The client will not query for private IPs and the server will reply with an error for private IPs.
[...]
Can you elaborate on the attack vector here?

There is none - thanks for your explanation!

we have had issues with plastic routers at home hanging up because of this in the past.

Can you share more info about this?

Yeah sure.

So a client would announce all their IPs, including say a 10.x.x.x IP-Address to the DHT.

Another client would now try to find this client and try to connect to those addresses.

This usually is not a problem, as there's simply no route to those networks in your NAT router and thus it can respond with an ICMP that there's no route.

But this changes as soon as we talk about carrier grade NAT.

In this case your router has a route but seems to get no ICMP back from the ISP infrastructure due to ICMP firewalling.

This leads to a lot of half-open NAT connections that never get a response, until the plastic router at home randomly starts to drop actually useful connections as the routing table overflows.

There have been a couple of issues around this, like #3320 and #9998. The only solution is to manually create a connection filter for all RFC1918 subnets in the config, except your own, and to move your own subnet to something less commonly used. So avoid 192.168.0.0/24, 192.168.1.0/24 and 192.168.178.0/24, because the connection filter will also block local connections to other peers in your own network, since there's only one scope for addresses: libp2p/go-libp2p#793

@lidel
Member

lidel commented Sep 30, 2024

@RubenKelevra some code landed since last year, and Kubo (and other clients running a DualDHT setup) should now only announce public IPs on the public DHT, and private ones on the LAN DHT:

@sukunrt fysa Kubo 0.30 should be >5% of the network:

Any preference on when we want to enable the V2 client and remove the V1 one?

@RubenKelevra
Contributor

@lidel which would explain why my router hasn't crashed since I've installed IPFS a week ago 🥲

Good job, very much appreciated :)
