Limit network actions per batch #3415
base: main
Allow it to be read from other crates.
Prepare to limit the number of network actions created in a single batch.
Add a builder method to set the configuration option.
Ensure that the server has a network actions batch limit set.
Truncate the number of network actions added to a batch.
Avoid retrying to send network actions to the same recipients endlessly.
            targets.truncate(limit);
        }
    }

    if let Some(tracked_chains) = self.tracked_chains.as_ref() {
        let publishers = self
            .chain
Personally, I was only worried about returning too many NetworkActions, so I'm a little bit on the fence regarding the randomization idea. Let's see what others think...
    if let Some(limit) = self.config.network_actions_batch_limit {
        if limit > targets.len() / 2 {
            let elements_to_discard = targets.len().saturating_sub(limit);
            targets.partial_shuffle(&mut thread_rng(), elements_to_discard);
Why not just use choose_multiple? That should be faster, not require the case distinction, and the order of the selected elements doesn't really matter that much. I guess the downside is that it clones the entries, but they are small, I think?
Motivation
There's a risk of O(n^2) behavior in the chain worker, because when network actions are created (for cross-chain messages) all outboxes are inspected in order to figure out which actions should be created. However, the outboxes are only emptied after a confirmation of the receipt from the receiver. This means that until that confirmation is received, the same network actions might be created again and again, stressing the cross-chain channels.
Proposal
Limit the number of outboxes inspected to create the network actions. The limit is configured on server shards to be equal to the cross-shard communication channel queue size. If the limit is reached, random outboxes are skipped when creating a new batch of network actions.
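As a rough illustration of the idea (not the code in this PR), the sketch below keeps at most `limit` randomly chosen targets using a partial Fisher-Yates shuffle followed by a truncation. The names `limit_targets` and `XorShift64` are hypothetical, and the small generator is only a stand-in so the example needs no dependencies; the diff itself uses `thread_rng()` from the `rand` crate and distinguishes two cases depending on whether the limit exceeds half of the targets, which this sketch omits.

```rust
/// Tiny xorshift64 generator, used only to keep this sketch dependency-free.
/// (Assumption: real code would use the `rand` crate instead.)
struct XorShift64(u64);

impl XorShift64 {
    fn new(seed: u64) -> Self {
        // A xorshift state must never be zero, so substitute a fixed constant.
        XorShift64(if seed == 0 { 0x9E37_79B9_7F4A_7C15 } else { seed })
    }

    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }

    /// Returns an index in `0..bound` (slightly biased, acceptable for a sketch).
    /// `bound` must be non-zero.
    fn below(&mut self, bound: usize) -> usize {
        (self.next_u64() % bound as u64) as usize
    }
}

/// Keeps at most `limit` randomly chosen targets; `None` means "no limit".
fn limit_targets<T>(targets: &mut Vec<T>, limit: Option<usize>, rng: &mut XorShift64) {
    let Some(limit) = limit else { return };
    if targets.len() <= limit {
        return;
    }
    // Partial Fisher-Yates shuffle: for each of the first `limit` slots,
    // swap in an element chosen uniformly from the not-yet-selected tail.
    for i in 0..limit {
        let j = i + rng.below(targets.len() - i);
        targets.swap(i, j);
    }
    // Everything past the first `limit` slots is skipped for this batch.
    targets.truncate(limit);
}
```

Calling `limit_targets(&mut targets, Some(4), &mut rng)` on ten targets leaves four distinct ones, and a different subset is likely to be picked on the next batch, so no recipient is starved indefinitely.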
Test Plan
CI should catch any regressions.
TODO: should we add a stress test for this scenario?
Release Plan
devnet branch, then
testnet branch, then
Because this could be related to a performance issue the testnet is running into.
Links