Scripts for estimating validation bit rate #1240
This PR is draft.
Esgen also has some scripts. I suggest we use this PR to decide on one presentation and only merge that.
@amesgen My scripts divide time into 10-second chunks. If I recall correctly, your Python script instead slides a window of X blocks, for a few values of X. I think using proper time is closer to the information we want, but it introduces the wrinkle of deciding how to account for blocks whose validation spans a boundary of the sliding window.

In my script, a chunk includes exactly those blocks that began validating within its 10 seconds, even if the last such block took 100 seconds to validate. In fact, if some block were to take more than 2*10 seconds to validate, then there'd necessarily be at least one 10-second chunk containing no blocks whatsoever. 🤔

My script uses the total validation time of those blocks as the denominator; it doesn't unsoundly assume it took exactly 10 seconds to validate them. This means the "windows" in my analysis are actually of varying size, as determined by the moments between the blocks' validation intervals, which is quite similar to yours, actually!

The following approach might be excessively accurate, but: I think it would be most appropriate (at least for our current purposes) to partition a block's size across multiple (sliding) windows in proportion to how much of the block's total validation duration overlaps with each window. That is a relatively straightforward calculation, despite making the sliding-window logic unusually dynamic/awkward.

Edit: Having boldly written "the most appropriate", I couldn't help but immediately start considering alternatives. I'll share one in my next comment.
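To make the two accounting rules in this comment concrete, here is a minimal Python sketch. It is not the script from this PR: the `Block` record, its field names, and both function names are hypothetical, and it assumes each block's validation start time, end time, and size in bytes are already known. `chunk_rates` follows the rule described above (a block counts toward the chunk in which it began validating, and the denominator is the summed validation time), while `proportional_bytes` illustrates the proposed alternative of splitting a block's size by overlap.

```python
from collections import defaultdict
from typing import NamedTuple


class Block(NamedTuple):
    # Hypothetical record; field names are assumptions for this sketch.
    start: float  # time (seconds) at which validation began
    end: float    # time (seconds) at which validation finished
    size: int     # block size in bytes


def chunk_rates(blocks, chunk=10.0):
    """Bit rate per fixed-size chunk of time.

    A block is assigned to the chunk in which it *began* validating.
    The denominator is the total validation time of those blocks,
    not the nominal chunk length.
    """
    sizes = defaultdict(int)
    durations = defaultdict(float)
    for b in blocks:
        k = int(b.start // chunk)  # index of the chunk containing b.start
        sizes[k] += b.size
        durations[k] += b.end - b.start
    # Skip chunks whose blocks validated in zero measured time.
    return {k: 8 * sizes[k] / durations[k] for k in sizes if durations[k] > 0}


def proportional_bytes(block, win_start, win_end):
    """Bytes of `block` attributed to the window [win_start, win_end),
    in proportion to how much of its validation interval overlaps it."""
    total = block.end - block.start
    if total <= 0:
        # Degenerate interval: attribute the whole block to the window
        # that contains its start time.
        return float(block.size) if win_start <= block.start < win_end else 0.0
    overlap = min(block.end, win_end) - max(block.start, win_start)
    return block.size * max(0.0, overlap) / total
```

For example, a 3000-byte block that validates from t=8 to t=14 would contribute 1000 bytes to the window [0, 10) and 2000 bytes to [10, 20) under the proportional rule, whereas the chunk rule assigns all 3000 bytes to the first chunk.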
This is similar to my first idea, except I divided time into 10-second chunks 100 times, each division offset from the previous by 0.1 seconds. Then, for each set of 100 "aligned" windows, I kept only the one with the maximum bit rate.

Something is still slightly off: the ratio I'm calculating is as if I need to download the same blocks I'm validating as I'm validating them. That's not quite right: I need to download the "next" blocks, not the ones I am validating. I'll give this a bit more thought tomorrow.
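One possible reading of the offset-and-maximize step is sketched below in Python. This is an interpretation, not the code from this PR: the function name, the `(start, end, size_bytes)` tuple layout, and the choice to group windows by their index `k` are all assumptions. It reuses the earlier rule that a block belongs to the window in which it began validating and that the denominator is the summed validation time. It does not address the caveat about downloading the "next" blocks.

```python
def max_rate_over_offsets(blocks, chunk=10.0, step=0.1, n_offsets=100):
    """Maximum bit rate per window index over shifted partitions of time.

    blocks: iterable of (start, end, size_bytes) tuples (assumed layout).
    For offset j*step, window k covers [k*chunk + j*step, (k+1)*chunk + j*step).
    For each k, only the largest rate among the n_offsets shifted windows
    is kept. Blocks that start before the offset land in window k = -1.
    """
    best = {}
    for j in range(n_offsets):
        offset = j * step
        sizes, durations = {}, {}
        for start, end, size in blocks:
            k = int((start - offset) // chunk)
            sizes[k] = sizes.get(k, 0) + size
            durations[k] = durations.get(k, 0.0) + (end - start)
        for k, total in sizes.items():
            if durations[k] > 0:  # skip windows with no measured validation time
                rate = 8 * total / durations[k]
                best[k] = max(best.get(k, 0.0), rate)
    return best
```

Calling it with `n_offsets=1` reduces to the single unshifted partition, which makes it easy to compare against the plain 10-second chunking.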
654ea98 to d356b5a
d356b5a to 45bcf0b
After my morning call with Esgen, we developed the plot generated by my most recent commit. Here are the two files that come out. You'll want to view them in a dedicated browser tab so that you can zoom to 100% height and then scroll along the x-axis. See plot-1.png (2.2 megabytes).

It's not trivial to relate this buffer size to the current code. The code has up to 10 blocks explicitly buffered between the BlockFetch client and the ChainSel logic, but it also has an additional "buffer" of the bytes in flight with the BlockFetch peer. For a strong connection, that might be enough bytes to fit several max-size blocks, eg.