Skip updating certain secondary indexes during replay #683
Note: I am building requirements, not claiming this issue. Please comment on this post and I'll update it with your changes.

According to libraries/chain/db_init.cpp, a number of secondary indexes are registered. The grouped_orders plugin also adds a secondary index. These indexes are kept updated during the replay process, but that is not necessary: they are only used once the current chain state is up to date. Therefore, deferring updates to these indexes should improve replay performance.

It has yet to be proven that all 4 of these indexes are unused during the replay process. Each index should be examined to verify that it is not used while a replay is in progress; only if it is unused should updates to it be skipped during replay. The majority of the replay process appears to be contained in libraries/chain/db_management.cpp. The process should be modified to:
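The proposal above amounts to gating secondary-index maintenance behind a replay check. As a minimal sketch of the idea (all names here are illustrative, not the actual graphene API):

```cpp
#include <cassert>
#include <vector>

// Hypothetical sketch: a secondary index that ignores object
// notifications while the database is replaying, so its maintenance
// cost is deferred until the chain state is current.
struct secondary_index_sketch {
   bool replaying = false;          // would be set by the replay driver
   std::vector<int> tracked_ids;    // stand-in for real index contents

   void on_add(int id) {
      if (replaying) return;        // skip maintenance during replay
      tracked_ids.push_back(id);
   }
};
```

Any index skipped this way would still need to be populated (or rebuilt) once replay finishes, which is the part that has to be verified per index.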
Edit: Added
I didn't check how many secondary indexes there are in the code. However, in the grouped_orders plugin I did add one more. We do need to make sure they're not used during replay.
If possible, make a test run without secondary indexes first, to see how big the savings would be.
@pmconrad I will attempt to.
@abitmore Do you have suggestions for what should happen if they are used? I'm thinking of the scenario where a replay is running and a client connects and makes an API call that requires a secondary index. Can that happen (I think clients can connect during replay, but I'm unsure)? If so, should the call block until the replay completes? Should it return an error?
Clients can't connect during replay. To do a simple test, we can remove the related code from db_init.cpp, then try a replay and compare the elapsed time against a run with the old code. If the indexes are needed, the replay should fail.
Awesome. Here are my numbers:

After 2 runs with secondary indexes: 3154084 blocks, avg 241.1075 secs, with a spread of less than 0.5 secs.
After 2 runs without secondary indexes: 3154084 blocks, avg 234.9055 secs, with a spread of less than 1.19 secs.

Between with and without, that is a difference of about 2.6%. So replaying 3154084 blocks from the genesis until the first of February, 2016 costs an extra ~6 seconds. Note: as indexes grow, insertion times can lengthen (though usually not linearly, and heavily dependent on the implementation), so interpolating from the current total block count may not be accurate. I therefore tested again with a larger number of blocks: with secondary indexes and 25380257 blocks, the difference was 160.8179 seconds, which is 2.38%.
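The percentages quoted above can be sanity-checked with a couple of lines of arithmetic (values copied from the comment; this only verifies the math, not the benchmark itself):

```cpp
#include <cassert>
#include <cmath>

// Percent of replay time saved by dropping secondary-index maintenance.
double percent_saved(double secs_with, double secs_without) {
   return (secs_with - secs_without) / secs_with * 100.0;
}
// Small run: (241.1075 - 234.9055) / 241.1075 * 100 ≈ 2.57%,
// i.e. roughly 6.2 seconds, matching the ~2.6% / ~6 s figures above.
```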
Just found that I have some statistics about replay here: bitshares/bitshares-fc#20 (comment). Replay time with and without the grouped_orders plugin (which has a secondary index) is
My interpretation of the tests above:
With respect to the results above, I look forward to your comments, questions, and advice on how to proceed.
IMO a 2-3% performance gain does not justify the risks associated with getting it wrong.
Fixed by #1918. |
Some (if not all) secondary indexes can be generated from current chain state only, no need to be continuously updated during replay.
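If an index really is derivable from the final chain state alone, a one-pass rebuild after replay can replace per-block maintenance entirely. A hedged sketch, with made-up types standing in for the real grouped-orders machinery:

```cpp
#include <cassert>
#include <map>
#include <vector>

// Illustrative stand-in for a chain-state object.
struct order { int id; int price; };

// One-pass rebuild: group order ids by price, as a grouped-orders-style
// index might, scanning only the objects present in the final state.
std::map<int, std::vector<int>> rebuild_grouped_index(const std::vector<order>& state) {
   std::map<int, std::vector<int>> by_price;
   for (const auto& o : state)
      by_price[o.price].push_back(o.id);
   return by_price;
}
```

The cost of such a rebuild is one scan of the current objects, independent of how many blocks were replayed, which is why skipping the index during replay can be a net win.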