-
-
Notifications
You must be signed in to change notification settings - Fork 330
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: collect and add proposal stats in grafana metrics #5448
Conversation
Performance Report✔️ no performance regression detected Full benchmark results
|
@@ -58,6 +58,66 @@ export function createBeaconMetrics(register: RegistryMetricCreator) { | |||
|
|||
// Non-spec'ed | |||
|
|||
// Finalized block and proposal stats | |||
allValidators: { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just reading the metric name all_validators_expected_count
or allValidators.expected
does not make it obvious that this is about blocks
Should this PR also add dashboard panels in grafana to visualize the metrics? The title suggests that but right now it only added the prometheus metrics |
updated |
this.logger.info("All validators finalized proposal stats", { | ||
...allValidators, | ||
finalizedCanonicalCheckpointsCount, | ||
finalizedFoundCheckpointsInStateCache, | ||
}); | ||
this.logger.info("Attached validators finalized proposal stats", { | ||
...attachedValidators, | ||
finalizedAttachedValidatorsCount, | ||
}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would like more eyes here @philknows @dapplion since these are new info logs and need to be extremely polished and well thought out.
My first thought is that I don't think we want the "all validators" log, and only want the "attached validators" log.
Also I think we should remove finalizedAttachedValidatorsCount
(or at the very least rename it to be less wordy).
My justification for ^ is that info logs should probably not be added for extensive general chain metrics (rather use prom metrics for that), but rather be reserved for individual node status and individual validator statuses.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could just hide behind a feature flag, similar to validator monitor logs (to be added)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we can push all validators log (and finalizedAttachedValidatorsCount) to debug and retain attached validator logs for info
reasoning: not all deployments have metrics set, but all deployments by default write in debug file which has been helpful in past for debugging
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@g11tech After seeing this log (Attached validators finalized proposal stats expected=0, finalized=0, orphaned=0, missed=0, finalizedAttachedValidatorsCount=0
) on different beacon node instances I find it not very useful as a info log which should be printed out by default.
I think should reconsider and only enable it via flag as I suggested above. Right now, it seems really random as we do report stats on block proposals but not attestations.
It also just prints out the proposal stats during which the beacon node was running, on restart, it will reset all values to 0. Unless someone is running hundreds of validators this log will not change for months.
Focusing the BN logs on just sync and peer info by default feels a lot cleaner to be as there might not even be validators attached.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do agree here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We had a discussion about this in Discord here: https://discord.com/channels/593655374469660673/1105347927548907600/1110563872152236133
The compromise made is that this will not show on info
if node is still syncing, there are no validators attached or if all stats are 0. There will also be a flag to disable this.
packages/beacon-node/test/unit/chain/archive/collectFinalizedProposalStats.test.ts
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While the functionality is important I think this PR breaks a bunch of separation of concerns in our code that makes me very reluctant to approve
- Now archiver class is mixed with the role of the validator monitor
- Dependency between archiver / BeaconProposerCache is nasty
- Having to bloat the fork-choice data structure for a purely metric feature is also not pretty
- Archiver doing a validator status info log at non-regular intervals is nasty, which should be the responsability of the validator monitor, see Expose validator monitor via logs #5336
Also that random const checkpointState = checkpointStateCache.get(checkpointHex);
in the middle of the archiver is not good, as we want to eventually move to a more black-box style regen and there's no guarantee the state is available at all
I don't have good suggestions for all the points brought up but the current implementation seems expensive to maintain long term
hmmm, let me suggest ways to address these concerns :
If you like this approach I can refac this |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did a second pass and I think the approach is good. Attempted to do alternative solutions and they have their own trade-offs, specifically:
- Register in validator monitor seen blocks by attached validators
- Extend the EpochContext to include the prev epoch proposers
- Track orphan data at the end of each epoch cross referencing those data points
That approach does not require to change the fork-choice but requires another cache in the validator monitor. It think we can explore that latter, but for now this PR is good. Also there's a bug that prevents that approach from working always
🎉 This PR is included in v1.9.0 🎉 |
Compute and add proposal stats in grafana metrics on finalization
TODO:
Closes #4636