Lighthouse unable to reliably serve DataColumnsByRange #6108

Closed
jimmygchen opened this issue Jul 16, 2024 · 1 comment
Labels
bug (Something isn't working), das (Data Availability Sampling), Networking

Comments

@jimmygchen
Member

Description

When testing sync locally, a syncing Lighthouse node isn't able to reliably download data columns from its Lighthouse peer. The network seems to be functioning fine, with 100% participation and all peers in sync. However, the peer returns 0 columns most of the time. See logs below.

To reproduce:

  1. Start a local testnet with the network_params_das_local.yaml config
  2. Stop one Lighthouse node, and wait for 2-3 epochs to make sure it triggers range sync
  3. Start the Lighthouse node again and observe that sync gets stuck pretty quickly and peers don't return columns

Version

das branch

Present Behaviour

Logs from syncing node:

Jul 15 07:27:09.430 DEBG Sending DataColumnsByRange requests, peer: 16Uiu2HAm4HH1Bioy2Qp75Rymx2X6TZwTnWeGwmjLKMHAZ5evVmcc, columns: [16, 24, 50, 64, 80, 88], epoch: 0, count: 32, method: DataColumnsByRange, service: sync, module: network::sync::network_context:398
Jul 15 07:27:09.430 DEBG Sending DataColumnsByRange requests, peer: 16Uiu2HAmKxsGh6atb4WQDVz3gb37XaJbi3MZNW8ZRsSwkBu7URYT, columns: [0, 114], epoch: 0, count: 32, method: DataColumnsByRange, service: sync, module: network::sync::network_context:398
Jul 15 07:27:09.430 DEBG Sending DataColumnsByRange requests, peer: 16Uiu2HAm4HH1Bioy2Qp75Rymx2X6TZwTnWeGwmjLKMHAZ5evVmcc, columns: [0, 16, 24, 50, 64, 80, 88, 114], epoch: 1, count: 32, method: DataColumnsByRange, service: sync, module: network::sync::network_context:398
Jul 15 07:27:09.431 DEBG Sending DataColumnsByRange requests, peer: 16Uiu2HAm4HH1Bioy2Qp75Rymx2X6TZwTnWeGwmjLKMHAZ5evVmcc, columns: [0, 16, 24, 80, 88], epoch: 2, count: 32, method: DataColumnsByRange, service: sync, module: network::sync::network_context:398
Jul 15 07:27:09.431 DEBG Sending DataColumnsByRange requests, peer: 16Uiu2HAmKxsGh6atb4WQDVz3gb37XaJbi3MZNW8ZRsSwkBu7URYT, columns: [50, 64, 114], epoch: 2, count: 32, method: DataColumnsByRange, service: sync, module: network::sync::network_context:398
Jul 15 07:27:09.431 DEBG Sending DataColumnsByRange requests, peer: 16Uiu2HAmKxsGh6atb4WQDVz3gb37XaJbi3MZNW8ZRsSwkBu7URYT, columns: [0, 24, 50, 64, 88, 114], epoch: 3, count: 32, method: DataColumnsByRange, service: sync, module: network::sync::network_context:398
Jul 15 07:27:09.431 DEBG Sending DataColumnsByRange requests, peer: 16Uiu2HAm4HH1Bioy2Qp75Rymx2X6TZwTnWeGwmjLKMHAZ5evVmcc, columns: [16, 80], epoch: 3, count: 32, method: DataColumnsByRange, service: sync, module: network::sync::network_context:398
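
To line these requests up against the peer's logs, it can help to group them by peer and convert each epoch to its start slot. The snippet below is only a rough sketch of that bookkeeping, not Lighthouse code: `parse_requests` and `REQUEST_RE` are hypothetical names, and `SLOTS_PER_EPOCH = 32` is an assumption taken from the `count: 32` field in the log lines above.

```python
import re
from collections import defaultdict

# Assumption: one epoch spans 32 slots, matching `count: 32` in the logs.
SLOTS_PER_EPOCH = 32

# Hypothetical pattern for the "Sending DataColumnsByRange requests" lines.
REQUEST_RE = re.compile(
    r"Sending DataColumnsByRange requests, peer: (?P<peer>\w+), "
    r"columns: \[(?P<columns>[\d, ]*)\], epoch: (?P<epoch>\d+), count: (?P<count>\d+)"
)


def parse_requests(lines):
    """Return {peer_id: {start_slot: [column indices]}} from syncing-node log lines."""
    requests = defaultdict(dict)
    for line in lines:
        match = REQUEST_RE.search(line)
        if match is None:
            continue  # not a request line
        columns = [int(c) for c in match.group("columns").split(",") if c.strip()]
        start_slot = int(match.group("epoch")) * SLOTS_PER_EPOCH
        requests[match.group("peer")][start_slot] = columns
    return requests


# Two of the request lines from the logs above.
sample = [
    "Jul 15 07:27:09.430 DEBG Sending DataColumnsByRange requests, peer: 16Uiu2HAm4HH1Bioy2Qp75Rymx2X6TZwTnWeGwmjLKMHAZ5evVmcc, columns: [16, 24, 50, 64, 80, 88], epoch: 0, count: 32, method: DataColumnsByRange, service: sync, module: network::sync::network_context:398",
    "Jul 15 07:27:09.431 DEBG Sending DataColumnsByRange requests, peer: 16Uiu2HAm4HH1Bioy2Qp75Rymx2X6TZwTnWeGwmjLKMHAZ5evVmcc, columns: [16, 80], epoch: 3, count: 32, method: DataColumnsByRange, service: sync, module: network::sync::network_context:398",
]

requests = parse_requests(sample)
supernode = "16Uiu2HAm4HH1Bioy2Qp75Rymx2X6TZwTnWeGwmjLKMHAZ5evVmcc"
for start_slot, columns in sorted(requests[supernode].items()):
    print(f"start_slot={start_slot} columns={columns}")
```

Under that assumption, the epoch 0 request to the supernode covers 6 columns starting at slot 0, and the epoch 3 request covers 2 columns starting at slot 96.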

Logs from supernode peer (16Uiu2HAm4HH1Bioy2Qp75Rymx2X6TZwTnWeGwmjLKMHAZ5evVmcc):
Note that only the request starting at slot 0 (start_slot: 0, count: 32) returned any data: 114 data columns, which corresponds to about 19 blocks given the 6 columns requested.

Jul 15 07:27:09.437 DEBG Received DataColumnsByRange Request, start_slot: 32, count: 32, peer_id: 16Uiu2HAmTBJCHSjTZDGRKAU5vaq3wgomDW7FZ8SpJbK5xWe4N6CY, module: network::network_beacon_processor::rpc_methods:960
Jul 15 07:27:09.438 DEBG DataColumnsByRange Response processed, returned: 0, requested: 32, current_slot: 133, start_slot: 32, peer: 16Uiu2HAmTBJCHSjTZDGRKAU5vaq3wgomDW7FZ8SpJbK5xWe4N6CY, module: network::network_beacon_processor::rpc_methods:1119
Jul 15 07:27:09.439 DEBG Received DataColumnsByRange Request, start_slot: 0, count: 32, peer_id: 16Uiu2HAmTBJCHSjTZDGRKAU5vaq3wgomDW7FZ8SpJbK5xWe4N6CY, module: network::network_beacon_processor::rpc_methods:960
Jul 15 07:27:09.449 DEBG DataColumnsByRange Response processed, returned: 114, requested: 32, current_slot: 133, start_slot: 0, peer: 16Uiu2HAmTBJCHSjTZDGRKAU5vaq3wgomDW7FZ8SpJbK5xWe4N6CY, module: network::network_beacon_processor::rpc_methods:1119
Jul 15 07:27:09.450 DEBG Received DataColumnsByRange Request, start_slot: 96, count: 32, peer_id: 16Uiu2HAmTBJCHSjTZDGRKAU5vaq3wgomDW7FZ8SpJbK5xWe4N6CY, module: network::network_beacon_processor::rpc_methods:960
Jul 15 07:27:09.450 DEBG DataColumnsByRange Response processed, returned: 0, requested: 32, current_slot: 133, start_slot: 96, peer: 16Uiu2HAmTBJCHSjTZDGRKAU5vaq3wgomDW7FZ8SpJbK5xWe4N6CY, module: network::network_beacon_processor::rpc_methods:1119
Jul 15 07:27:09.450 DEBG Received DataColumnsByRange Request, start_slot: 64, count: 32, peer_id: 16Uiu2HAmTBJCHSjTZDGRKAU5vaq3wgomDW7FZ8SpJbK5xWe4N6CY, module: network::network_beacon_processor::rpc_methods:960
Jul 15 07:27:09.451 DEBG DataColumnsByRange Response processed, returned: 0, requested: 32, current_slot: 133, start_slot: 64, peer: 16Uiu2HAmTBJCHSjTZDGRKAU5vaq3wgomDW7FZ8SpJbK5xWe4N6CY, module: network::network_beacon_processor::rpc_methods:1119
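
For a quick tally of what the peer actually returned, the response lines can be summarized per start slot. This is a minimal sketch that assumes the log format shown above; `summarize_responses` and `blocks_covered` are hypothetical helpers, and the block estimate simply divides the returned column count by the number of columns requested.

```python
import re

# Hypothetical pattern for the "DataColumnsByRange Response processed" lines.
RESPONSE_RE = re.compile(
    r"DataColumnsByRange Response processed, returned: (?P<returned>\d+), "
    r"requested: (?P<requested>\d+), current_slot: (?P<current_slot>\d+), "
    r"start_slot: (?P<start_slot>\d+)"
)


def summarize_responses(lines):
    """Return {start_slot: number of data columns returned} from peer log lines."""
    summary = {}
    for line in lines:
        match = RESPONSE_RE.search(line)
        if match:
            summary[int(match.group("start_slot"))] = int(match.group("returned"))
    return summary


def blocks_covered(returned_columns, columns_requested):
    """Estimate how many blocks a response covers, assuming every block
    contributes one sidecar per requested column."""
    return returned_columns / columns_requested


# The four response lines from the logs above.
sample = [
    "Jul 15 07:27:09.438 DEBG DataColumnsByRange Response processed, returned: 0, requested: 32, current_slot: 133, start_slot: 32, peer: 16Uiu2HAmTBJCHSjTZDGRKAU5vaq3wgomDW7FZ8SpJbK5xWe4N6CY, module: network::network_beacon_processor::rpc_methods:1119",
    "Jul 15 07:27:09.449 DEBG DataColumnsByRange Response processed, returned: 114, requested: 32, current_slot: 133, start_slot: 0, peer: 16Uiu2HAmTBJCHSjTZDGRKAU5vaq3wgomDW7FZ8SpJbK5xWe4N6CY, module: network::network_beacon_processor::rpc_methods:1119",
    "Jul 15 07:27:09.450 DEBG DataColumnsByRange Response processed, returned: 0, requested: 32, current_slot: 133, start_slot: 96, peer: 16Uiu2HAmTBJCHSjTZDGRKAU5vaq3wgomDW7FZ8SpJbK5xWe4N6CY, module: network::network_beacon_processor::rpc_methods:1119",
    "Jul 15 07:27:09.451 DEBG DataColumnsByRange Response processed, returned: 0, requested: 32, current_slot: 133, start_slot: 64, peer: 16Uiu2HAmTBJCHSjTZDGRKAU5vaq3wgomDW7FZ8SpJbK5xWe4N6CY, module: network::network_beacon_processor::rpc_methods:1119",
]

summary = summarize_responses(sample)
empty_ranges = sorted(slot for slot, returned in summary.items() if returned == 0)
estimated_blocks = blocks_covered(summary[0], 6)  # 6 columns were requested for epoch 0
print(summary, empty_ranges, estimated_blocks)
```

Three of the four ranges come back empty even though the peer's `current_slot` is 133, well past all of the requested ranges.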
@jimmygchen
Member Author

There are many reasons this can happen, and we've recently fixed a few issues with data column persistence, so I'll close this one now.
