Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Nomination: do not vote for new values after confirming a candidate #3544

Conversation

marta-lokhova
Copy link
Contributor

@marta-lokhova marta-lokhova commented Sep 22, 2022

Resolves #3523

  • Cancel the nomination timer as soon as we have a candidate value. Turn nominate into a no-op when a node confirmed a candidate (just in case it ever gets called outside of the timer).
  • add a metric to count how many values are referenced per consensus round. This is done at the slot purge time by counting referenced tx sets in the cache, so this way we know how many values we had to pull. Note that this metric includes value from self as well, even though sometimes the locally generated value might not actually be propagated to the network. If this is an issue, I can add an extra condition to count the local value only if the node actually nominated. Or we can move the cache population call (addTxSet) to when the node actually nominates its local value (but this might cause the node to pull the tx set it generated locally anyway)

Copy link
Contributor

@hidenori-shinohara hidenori-shinohara left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change looks good to me! I had one small question about logging.

@nano-o
Copy link

nano-o commented Sep 28, 2022

This makes sense to me at a high level. As I understand it, this applies to voting only, and we should keep accepting and confirming values even if we have a candidate already.

@marta-lokhova marta-lokhova force-pushed the nomination_do_not_introduce_local_value branch from 2ec528f to 1790a94 Compare September 28, 2022 22:28
@hidenori-shinohara
Copy link
Contributor

LGTM!

This makes sense to me at a high level. As I understand it, this applies to voting only, and we should keep accepting and confirming values even if we have a candidate already.

Yeah, that's how I understood it too.

@@ -674,10 +677,20 @@ PendingEnvelopes::eraseBelow(uint64 slotIndex)

// 0 is special mark for data that we do not know the slot index
// it is used for state loaded from database
mTxSetCache.erase_if([&](TxSetFramCacheItem const& i) {
auto erased = mTxSetCache.erase_if([&](TxSetFramCacheItem const& i) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is a clean data source:

  • txset cache is a cache -> there is no guarantee that you'll find all entries
  • as we're counting when slots are evicted, there is a higher chance that eviction took place
  • more noise may be added if multiple slots are purged at once

There is a function logQuorumInformation that seems to be a better place for this kind of tracking: it's only called when externalizing the latest slot, and with a ~fixed window to allow for messages to propagate.
I think that you could combine something like processCurrentState and getTxSetHashes to build a set of unique values referenced by that slot.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think the initial idea was to get an estimate on how many values the node is processing per ledger, which included values we pulled but didn't record (if an envelope got rejected, for example). I agree that looking at the cache can be noisy, though in this case eviction is extremely unlikely (the cache is sufficiently big, as I'm assuming we want to be able to answer queries about recent ledgers). That being said, I think a metric for values propagated network-wide makes more sense here. I updated the code based on your suggestion.

@marta-lokhova marta-lokhova force-pushed the nomination_do_not_introduce_local_value branch from 2f238ca to 74a672e Compare October 4, 2022 20:53
@MonsieurNicolas
Copy link
Contributor

r+ 74a672e

@latobarita latobarita merged commit d167117 into stellar:master Oct 5, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Nomination: do not introduce a new value when we have a candidate
5 participants