-
Notifications
You must be signed in to change notification settings - Fork 2.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Small speedup in incremental accumulator updates #5576
Conversation
Instead of updating at most two accumulators, update all accumluators during incremental updates. Tests have shown that this change yields a small speedup of at least 0.5%, and up to 1% with shorter TC. Passed STC: LLR: 2.93 (-2.94,2.94) <0.00,2.00> Total: 54368 W: 14179 L: 13842 D: 26347 Ptnml(0-2): 173, 6122, 14262, 6449, 178 https://tests.stockfishchess.org/tests/view/66db038a9de3e7f9b33d1ad9 Passed 5+0.05: LLR: 2.98 (-2.94,2.94) <0.00,2.00> Total: 55040 W: 14682 L: 14322 D: 26036 Ptnml(0-2): 303, 6364, 13856, 6664, 333 https://tests.stockfishchess.org/tests/view/66dbc325dc53972b68218ba7 Passed non-regression LTC: LLR: 2.95 (-2.94,2.94) <-1.75,0.25> Total: 57390 W: 14555 L: 14376 D: 28459 Ptnml(0-2): 37, 5876, 16683, 6069, 30 https://tests.stockfishchess.org/tests/view/66dbc30adc53972b68218ba5 No functional change
Wow, congrats on the speedup! |
the questions are general, I'm good with any reply about this patch. |
They make perfect sense to me |
I was about to sum up the logics behind but I had to sleep first, thanks peregrine for the questions. :)
This is the current update strategy: Let's assume there are not computed accumulators between two accumulators (that we are trying to update), and only
As you see, for a few iterations, SF has to write to two different accumulators that are not close to each other, which is possibly not cache-friendly. Now, this patch updates all accumluators up to Note that this explanation is not what I had in my mind first; I'd just been trying different approaches to find something works.
It depends on the shape of search tree and update/refresh cost model. I think experimenting with update/refresh cost model further is a good direction to improve this patch. |
I've been making some attempts to improve update behaviour for some time, and I think there are some optimisations we can make on this PR. I will try and launch some tests in the coming days. Good job on this PR! PS. in my head the main saving is the fact that we save multiple adds and subs, i.e. if 1 is computed and current accumulator is 5, at the moment we will update 2 and 5. However, what is to note is that 3 and 4 are computed in the intermediate steps to update 5, ie. this PR saves some extra recomputation. |
Eventually 3 and 4 are computed in later evaluations so I thought the cost itself is equal. Also if the search tree is not wide (i.e. multiple |
if you think about it, it's more like: if you look at it, "apply 3" and "apply 4" are called twice vs patch: although, this is assuming the worst case so it makes sense why the extra savings don't gain as much as one might think in the theoretical sense |
Yes that's correct as well, although I'm not sure if populating removed/added lists is slow to that extent. Might be worth to check where the gains are from to optimize this even further. |
Instead of updating at most two accumulators, update all accumluators during incremental updates. Tests have shown that this change yields a small speedup of at least 0.5%, and up to 1% with shorter TC.
Passed STC:
LLR: 2.93 (-2.94,2.94) <0.00,2.00>
Total: 54368 W: 14179 L: 13842 D: 26347
Ptnml(0-2): 173, 6122, 14262, 6449, 178
https://tests.stockfishchess.org/tests/view/66db038a9de3e7f9b33d1ad9
Passed 5+0.05:
LLR: 2.98 (-2.94,2.94) <0.00,2.00>
Total: 55040 W: 14682 L: 14322 D: 26036
Ptnml(0-2): 303, 6364, 13856, 6664, 333
https://tests.stockfishchess.org/tests/view/66dbc325dc53972b68218ba7
Passed non-regression LTC:
LLR: 2.95 (-2.94,2.94) <-1.75,0.25>
Total: 57390 W: 14555 L: 14376 D: 28459
Ptnml(0-2): 37, 5876, 16683, 6069, 30
https://tests.stockfishchess.org/tests/view/66dbc30adc53972b68218ba5
No functional change