
fix Issue232: Use dynamically adjusted request size to fetch blob data from peers through p2p instead of using the static p2p.max.request.size value #287

Merged — 15 commits, Jun 5, 2024

Conversation

ping-ke
Copy link
Collaborator

@ping-ke ping-ke commented May 18, 2024

Issue: #232

Currently, we use p2p.max.request.size to control the size of blob requests sent to peers. However, peers are distributed across different regions, so network conditions between the local node and each peer differ. We should therefore use a different request size for each peer.

  1. Rename p2p.max.request.size to p2p.request.size, which is used to initialize the request size for a new peer.
  2. Add a tracker for each peer that adjusts the request size according to the network conditions between that peer and the local node:
     capacity = 0.9 * (t.capacity) + 0.1 * (returned blob count / return time in seconds)
  3. When selecting an idle peer to send a new request, order the idle peers by capacity and select the peer with the highest capacity.

This design refers to the rates design in github.com/ethereum/go-ethereum/eth/protocols/snap/sync.go.

A similar design also exists in Prysm: it sorts peers by both capacity and a score (processedBatches * 0.1), and adds some randomness when deciding whether to select the top peer.

How To Test
Change the log level to debug, or change the following log call in tracker.go to Info. Then run the es-node from the beginning.
log.Debug("Update tracker", "peer id", t.peerID, "elapsed", elapsed, "items", items, "old capacity", oldcap, "capacity", t.capacity)

Then check for log lines like the following:
t=2024-05-25T11:46:19+0000 lvl=info msg="Update tracker" "peer id"=16Uiu2HAmGAyykt2njnJYTSU9KsiFutrQKZA1w8LhS45ERpqxfwFV elapsed=333.787809ms items=8,388,608 "old capacity"=39724207.481 capacity=38264942.629
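As a sanity check, this log line is consistent with the EWMA formula above: the measured rate is 8,388,608 items / 0.333787809 s ≈ 25,131,560 items/s, and 0.9 × 39,724,207.481 + 0.1 × 25,131,560 ≈ 38,264,943, which agrees with the logged new capacity of 38,264,942.629.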

The following is the test result for this feature between a local node and a peer on AX101. The initial request size is 8 MB; after about 3 minutes, the request size stabilizes between 4.5 MB and 5.2 MB.

@qzhodl
Copy link
Collaborator

qzhodl commented May 22, 2024


It would be great to provide the reference code link for Geth so that reviewers can compare it. Additionally, as mentioned in the ACD meeting, it would be helpful to list how Prysm implements it as well.

Review threads on ethstorage/flags/p2p_flags.go and ethstorage/p2p/protocol/peer.go (resolved)
@ping-ke ping-ke requested a review from qzhodl May 23, 2024 06:11
@qzhodl
Copy link
Collaborator

qzhodl commented May 24, 2024

What about Prysm?

@qzhodl
Copy link
Collaborator

qzhodl commented May 24, 2024

I feel the PR comment needs to describe how we tested it.

@qzhodl
Copy link
Collaborator

qzhodl commented May 28, 2024


I think a valid test result would be: if the two nodes have a poor internet connection (e.g., China and ax101), the tracker’s capacity would quickly adapt to a very small value, and vice versa.

Review threads on ethstorage/p2p/protocol/syncserver.go and ethstorage/p2p/protocol/utils.go (resolved)
@ping-ke ping-ke requested review from qzhodl and syntrust May 30, 2024 10:03
@ping-ke ping-ke requested a review from qzhodl June 2, 2024 02:59
@ping-ke ping-ke merged commit 4ef3547 into main Jun 5, 2024
2 checks passed
@ping-ke ping-ke deleted the issue232-2 branch October 30, 2024 10:20