Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ForkStats: compute and wire up fork_weight #34623

Merged
merged 8 commits into from
Jan 18, 2024

Conversation

HaoranYi
Copy link
Contributor

@HaoranYi HaoranYi commented Jan 2, 2024

Problem

When debugging forking in the cluster, we notice that log for voting's fork_weight is always zero.

[2024-01-01T01:10:04.276812282Z INFO  solana_core::replay_stage] voting: 239174891 0

It turns out that weight and fork_weight in ForkStat are never wired up. The values are always zero for these two fields.

Furthermore, we no longer use sum(vote_stake * lock_out) to choose forks. Instead, we just use sum(vote_stake).

Summary of Changes

  • remove old fork_weight computation, i.e. sum(vote_stake * lock_out)
  • update fork_weight to sum(vote_stake)/total_stake.
  • wire fork_weight to ForkStat

Fixes #

@HaoranYi HaoranYi changed the title remove unused fork stats weight and fork_weight wire up fork stats fork_weight field Jan 2, 2024
@HaoranYi HaoranYi force-pushed the 2024_jan_2 branch 2 times, most recently from fc6ee59 to 6bffc45 Compare January 2, 2024 19:12
@HaoranYi HaoranYi changed the title wire up fork stats fork_weight field ForkStats: wire up fork_weight field Jan 2, 2024
@HaoranYi
Copy link
Contributor Author

HaoranYi commented Jan 2, 2024

A node running with this PR against mainnet, now shows voting fork_weight.

[2024-01-02T20:19:24.523594818Z INFO  solana_core::replay_stage] voting: 239515880 2982457551305752085543582888
[2024-01-02T20:19:24.863351784Z INFO  solana_core::replay_stage] voting: 239515881 2929292058758796232534635596
[2024-01-02T20:19:25.155045035Z INFO  solana_core::replay_stage] voting: 239515882 2916869857236949405254942564
[2024-01-02T20:19:25.463345509Z INFO  solana_core::replay_stage] voting: 239515883 2785677130113856978579335768
[2024-01-02T20:19:25.766373239Z INFO  solana_core::replay_stage] voting: 239515884 3160879827438218954306718996
[2024-01-02T20:19:26.072663600Z INFO  solana_core::replay_stage] voting: 239515885 2952946062170266327316319312
[2024-01-02T20:19:26.401458337Z INFO  solana_core::replay_stage] voting: 239515886 2894309954886908677806332696

Copy link

codecov bot commented Jan 2, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (258c920) 81.7% compared to head (899d9ad) 81.8%.
Report is 6 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff           @@
##           master   #34623   +/-   ##
=======================================
  Coverage    81.7%    81.8%           
=======================================
  Files         825      825           
  Lines      223124   223126    +2     
=======================================
+ Hits       182505   182523   +18     
+ Misses      40619    40603   -16     

AshwinSekar
AshwinSekar previously approved these changes Jan 3, 2024
Copy link
Contributor

@AshwinSekar AshwinSekar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM just a minor nit

my_vote_pubkey,
bank_slot,
stats.weight,
stats.fork_weight,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor nit, it would be nice to display this as a % of total_stake to make debugging easier

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[2024-01-03T18:26:25.000512548Z INFO  solana_core::replay_stage] LoQVTpDZE4XbXsCMAxQ4GTGsapUnMRXwENgeKE9aDbU slot_weight: 239690881 791724356429 239690880%
[2024-01-03T18:26:25.354383305Z INFO  solana_core::replay_stage] LoQVTpDZE4XbXsCMAxQ4GTGsapUnMRXwENgeKE9aDbU slot_weight: 239690882 827376443805 239690881%
[2024-01-03T18:26:25.748420681Z INFO  solana_core::replay_stage] LoQVTpDZE4XbXsCMAxQ4GTGsapUnMRXwENgeKE9aDbU slot_weight: 239690883 807612437952 239690882%
[2024-01-03T18:26:26.807998806Z INFO  solana_core::replay_stage] LoQVTpDZE4XbXsCMAxQ4GTGsapUnMRXwENgeKE9aDbU slot_weight: 239690884 814807560101 239690883%
[2024-01-03T18:26:27.204953613Z INFO  solana_core::replay_stage] LoQVTpDZE4XbXsCMAxQ4GTGsapUnMRXwENgeKE9aDbU slot_weight: 239690885 830796739067 239690884%

fork_weight/total_stake > 100%?
$forkweight = \sum votestake_i * lockout_i$

Copy link
Contributor Author

@HaoranYi HaoranYi Jan 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

>>> import math; math.log2(239690880/100)
21.192743586241622

Maybe report the average lockout from the fork_weight, i.e. average voting tower depth?

Copy link
Contributor Author

@HaoranYi HaoranYi Jan 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I pushed a commit to compute bank_stake, which is the sum of all voted_stakes in the tower.
I think this can give us the percentage of stake voted in current tower against the total_stake.
WDYT? @AshwinSekar

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[2024-01-03T22:20:07.488366622Z INFO  solana_core::replay_stage] 9NTUUFtzNnGw3rJM8vsbTwM3TvC5DiA7ptvhaCHgEMGG slot_weight: 239721382 30% 825601785482% 32.94 239721381
[2024-01-03T22:20:07.894267160Z INFO  solana_core::replay_stage] 9NTUUFtzNnGw3rJM8vsbTwM3TvC5DiA7ptvhaCHgEMGG slot_weight: 239721383 26% 766049453440% 32.83 239721382
[2024-01-03T22:20:08.261075034Z INFO  solana_core::replay_stage] 9NTUUFtzNnGw3rJM8vsbTwM3TvC5DiA7ptvhaCHgEMGG slot_weight: 239721384 31% 783833172294% 32.87 239721383
[2024-01-03T22:20:08.666354848Z INFO  solana_core::replay_stage] 9NTUUFtzNnGw3rJM8vsbTwM3TvC5DiA7ptvhaCHgEMGG slot_weight: 239721385 34% 830971562227% 32.95 239721384
[2024-01-03T22:20:09.001766489Z INFO  solana_core::replay_stage] 9NTUUFtzNnGw3rJM8vsbTwM3TvC5DiA7ptvhaCHgEMGG slot_weight: 239721386 37% 832918403880% 32.96 239721385
[2024-01-03T22:20:09.216591220Z INFO  solana_core::replay_stage] 9NTUUFtzNnGw3rJM8vsbTwM3TvC5DiA7ptvhaCHgEMGG slot_weight: 239721387 35% 807584906983% 32.91 239721386
[2024-01-03T22:20:09.573224938Z INFO  solana_core::replay_stage] 9NTUUFtzNnGw3rJM8vsbTwM3TvC5DiA7ptvhaCHgEMGG slot_weight: 239721388 33% 779995340382% 32.86 239721387
[2024-01-03T22:20:09.911590661Z INFO  solana_core::replay_stage] 9NTUUFtzNnGw3rJM8vsbTwM3TvC5DiA7ptvhaCHgEMGG slot_weight: 239721389 36% 836237571100% 32.96 239721388
[2024-01-03T22:20:10.232700468Z INFO  solana_core::replay_stage] 9NTUUFtzNnGw3rJM8vsbTwM3TvC5DiA7ptvhaCHgEMGG slot_weight: 239721390 37% 816006910920% 32.93 239721389
[2024-01-03T22:20:10.494837632Z INFO  solana_core::replay_stage] 9NTUUFtzNnGw3rJM8vsbTwM3TvC5DiA7ptvhaCHgEMGG slot_weight: 239721391 26% 752236531000% 32.81 239721390
[2024-01-03T22:20:10.953962504Z INFO  solana_core::replay_stage] 9NTUUFtzNnGw3rJM8vsbTwM3TvC5DiA7ptvhaCHgEMGG slot_weight: 239721392 8% 730434757884% 32.77 239721391
[2024-01-03T22:20:11.371950705Z INFO  solana_core::replay_stage] 9NTUUFtzNnGw3rJM8vsbTwM3TvC5DiA7ptvhaCHgEMGG slot_weight: 239721393 17% 795522748142% 32.89 239721392
[2024-01-03T22:20:11.784694568Z INFO  solana_core::replay_stage] 9NTUUFtzNnGw3rJM8vsbTwM3TvC5DiA7ptvhaCHgEMGG slot_weight: 239721394 27% 797520633266% 32.89 239721393
[2024-01-03T22:20:12.255983780Z INFO  solana_core::replay_stage] 9NTUUFtzNnGw3rJM8vsbTwM3TvC5DiA7ptvhaCHgEMGG slot_weight: 239721395 21% 762066047176% 32.83 239721394
[2024-01-03T22:20:12.569378268Z INFO  solana_core::replay_stage] 9NTUUFtzNnGw3rJM8vsbTwM3TvC5DiA7ptvhaCHgEMGG slot_weight: 239721396 13% 749670704587% 32.80 239721395
[2024-01-03T22:20:13.018834819Z INFO  solana_core::replay_stage] 9NTUUFtzNnGw3rJM8vsbTwM3TvC5DiA7ptvhaCHgEMGG slot_weight: 239721397 26% 807580761627% 32.91 239721396
[2024-01-03T22:20:13.388260123Z INFO  solana_core::replay_stage] 9NTUUFtzNnGw3rJM8vsbTwM3TvC5DiA7ptvhaCHgEMGG slot_weight: 239721398 29% 803500353760% 32.90 239721397
[2024-01-03T22:20:13.767272281Z INFO  solana_core::replay_stage] 9NTUUFtzNnGw3rJM8vsbTwM3TvC5DiA7ptvhaCHgEMGG slot_weight: 239721399 25% 778122003476% 32.86 239721398
[2024-01-03T22:20:14.114656777Z INFO  solana_core::replay_stage] 9NTUUFtzNnGw3rJM8vsbTwM3TvC5DiA7ptvhaCHgEMGG slot_weight: 239721400 23% 774933806678% 32.85 239721399
[2024-01-03T22:20:14.422135809Z INFO  solana_core::replay_stage] 9NTUUFtzNnGw3rJM8vsbTwM3TvC5DiA7ptvhaCHgEMGG slot_weight: 239721401 24% 804264188008% 32.91 239721400
[2024-01-03T22:20:14.769350952Z INFO  solana_core::replay_stage] 9NTUUFtzNnGw3rJM8vsbTwM3TvC5DiA7ptvhaCHgEMGG slot_weight: 239721402 25% 771793170545% 32.85 239721401

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the original vote stake * lockout was an initial version of fork choice and that metric can be removed. We can just have 1 bank_weight or bank_stake field that keeps track of the stake present for that bank and divide that by total_stake.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed.

core/src/replay_stage.rs Outdated Show resolved Hide resolved
my_vote_pubkey,
bank_slot,
stats.weight,
stats.fork_weight,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the original vote stake * lockout was an initial version of fork choice and that metric can be removed. We can just have 1 bank_weight or bank_stake field that keeps track of the stake present for that bank and divide that by total_stake.

@@ -2286,10 +2283,6 @@ pub mod test {
let mut new_votes = latest_validator_votes_for_frozen_banks.take_votes_dirty_set(0);
new_votes.sort();
assert_eq!(new_votes, account_latest_votes);

Copy link
Contributor Author

@HaoranYi HaoranYi Jan 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

voted_stake*lock_out is no longer used for choosing the forks.
We no longer computes this. Removed the assertion in the tests.

@HaoranYi HaoranYi changed the title ForkStats: wire up fork_weight field ForkStats: compute and wire up bank_weight Jan 16, 2024
@HaoranYi HaoranYi changed the title ForkStats: compute and wire up bank_weight ForkStats: compute and wire up fork_weight Jan 16, 2024
@HaoranYi
Copy link
Contributor Author

slot_weight log from test_duplicate_shreds_broadcast_leader.

H19v236LYdWSHVcoMTLrGRFHJzKg6VYkuuSsPSWC8VWS slot_weight: 99 100.0% 98
2mKh37BQqZGvTJG7RGECYxU6sPpeDrhtNBpsTXgDGi4G slot_weight: 100 50.0% 99
7hTGRTYsEZ6BXD6DGQALEsEBYn8VC9MKtmNXbzLqsq1E slot_weight: 101 50.0% 100

@HaoranYi
Copy link
Contributor Author

slot_weight log from mainnet

[2024-01-17T17:22:47.800362139Z INFO  solana_core::replay_stage] 38NhbNVqJ9EbRBu3xQkVGZDPmuev8Dr2j5fquuMVT6KS slot_weight: 242400622 28.9% 242400621
[2024-01-17T17:22:48.183038726Z INFO  solana_core::replay_stage] 38NhbNVqJ9EbRBu3xQkVGZDPmuev8Dr2j5fquuMVT6KS slot_weight: 242400623 3.5% 242400622
[2024-01-17T17:22:48.666295930Z INFO  solana_core::replay_stage] 38NhbNVqJ9EbRBu3xQkVGZDPmuev8Dr2j5fquuMVT6KS slot_weight: 242400624 35.3% 242400623
[2024-01-17T17:22:49.007474686Z INFO  solana_core::replay_stage] 38NhbNVqJ9EbRBu3xQkVGZDPmuev8Dr2j5fquuMVT6KS slot_weight: 242400625 61.4% 242400624
[2024-01-17T17:22:49.417041979Z INFO  solana_core::replay_stage] 38NhbNVqJ9EbRBu3xQkVGZDPmuev8Dr2j5fquuMVT6KS slot_weight: 242400626 41.5% 242400625
[2024-01-17T17:22:49.740997536Z INFO  solana_core::replay_stage] 38NhbNVqJ9EbRBu3xQkVGZDPmuev8Dr2j5fquuMVT6KS slot_weight: 242400627 27.2% 242400626
[2024-01-17T17:22:50.407736878Z INFO  solana_core::replay_stage] 38NhbNVqJ9EbRBu3xQkVGZDPmuev8Dr2j5fquuMVT6KS slot_weight: 242400628 41.9% 242400627
[2024-01-17T17:22:50.776029597Z INFO  solana_core::replay_stage] 38NhbNVqJ9EbRBu3xQkVGZDPmuev8Dr2j5fquuMVT6KS slot_weight: 242400629 31.4% 242400628
[2024-01-17T17:22:51.172086077Z INFO  solana_core::replay_stage] 38NhbNVqJ9EbRBu3xQkVGZDPmuev8Dr2j5fquuMVT6KS slot_weight: 242400630 28.1% 242400629

Copy link
Contributor

@brooksprumo brooksprumo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, but I'm not particularly familiar with this code. I defer to the others that are the experts in this area 😸

Copy link
Contributor

@AshwinSekar AshwinSekar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. thanks for the change!

@HaoranYi HaoranYi merged commit 6a9f729 into solana-labs:master Jan 18, 2024
35 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants