process: slightly simplify next tick execution #16888
Note: this will need adjustment to |
Ok, so it seems like the CI benchmark turned out worse than my local testing:
Doing a second run just to be sure: |
Ok, I'm pretty sure that wasn't necessarily reflective. I think there is likely a slight performance dip but it's not consistent and very minor.
Third run: https://ci.nodejs.org/view/Node.js%20benchmark/job/benchmark-node-micro-benchmarks/26/ |
Yep, the results are worse after all. Unless we're ok with 2-4% decline on the depth benchmark I'll probably have to close this.
|
Seems like this should probably not be backported, or only to 8.x? |
@Fishrock123 Definitely shouldn't. I don't even know if this can go on 9.x to be honest given the slight decline in performance. |
Ok, I think there's no performance regression after bumping up the iterations to remove any noise.
|
@apapirovski the benchmarks don't show any significance (huge p-value, except for |
@refack yeah, I was just worried about
but it seems like it was noise after bumping up iterations and running it more times. |
So we got:
I'm trying to think of a way to improve the benchmarks to see if there is a significant pattern after all. |
Sorry, I don't quite understand this. Why can't we do that? It seems that the update:
|
158071c
to
39ec9e6
Compare
[refack: I pushed a change, then pushed it out, because I wanted to use the benchmark machine] |
A couple more runs:
Just looking at the raw data, I can see that both the old code and the new code seem to end up in situations where they're either deoptimized or just not optimized by V8. |
I'll probably spend some more time looking at the nextTick code. The fact that there are such huge peaks and valleys indicates that there's something off in both versions. |
Here are the results from @refack's experiment with many runs (and low n value):
|
a2f43ef
to
b749fa1
Compare
I think the real issue here is basically that those benchmarks aren't really testing these particular changes, or at least not in a very representative way. Most of the difference seems to have to do with GC kicking in or not. On my local system I get none of the extreme dips and it's all pretty even. I created two benchmarks that should do a better job of testing the actual execution performance within Benchmark CI: https://ci.nodejs.org/view/Node.js%20benchmark/job/benchmark-node-micro-benchmarks/39/ |
b749fa1
to
9b12434
Compare
Ok, I think that got us more reliable info... 1st run:
2nd run:
|
I think at least some of the confusion here has to do with how you use statistics. In simple words: This means that the only conclusion (and this conclusion will likely be wrong) that you can arrive at by running the benchmarks multiple times is that there is a difference. But you only conclude this because of the randomness, not because the difference is actually there. If you think there is a garbage collection issue that interferes with the results (and there may very well be), then you need to analyse this issue without using the benchmark suite. You can use the Statisticians (especially frequentist statisticians, which is the basis for our benchmark suite) don't like to admit that there are beliefs in statistics, but the unfortunate truth is that there are :(
Since you are trying to prove that there is no difference, it will always have too high a variance. I discussed a bit in #8139 what consequences a high variance has. It is especially important that you understand that a high variance is fine if there are enough runs. Here are some outputs of that:
Let's be super specific. Let's say we are looking for at least a 1% difference and we have 100 runs then we should have a coefficient of variation (
As you can see the variance for An important assumption in these calculations, is that they assume the variance is the same. But this does appear to be somewhat so. You can look at the F-test (
Don't expect it to be normal. Time is typically gamma distributed. Because of the central limit theorem it will approach a normal distribution as you increase the number of iterations, but it will never become exactly normally distributed. This is theoretically unfortunate, as the t-test assumes a normal distribution; however, in practice it is not a huge deal. |
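To make the point about repeated runs concrete, here is a minimal sketch of two quantities discussed in this comment. It is not part of the benchmark suite. The function names and the 0.05 significance level are assumptions for illustration, and the first formula assumes the runs are independent and that there is no real difference between the two builds.

```javascript
'use strict';

// Probability that at least one of `runs` independent comparisons is flagged
// as significant at level `alpha` when there is no real difference.
// (Illustrative helper; the name is hypothetical.)
function familywiseFalsePositiveRate(alpha, runs) {
  return 1 - Math.pow(1 - alpha, runs);
}

// Coefficient of variation: sample standard deviation divided by the mean.
// (Illustrative helper; the name is hypothetical.)
function coefficientOfVariation(samples) {
  const n = samples.length;
  if (n < 2) throw new RangeError('need at least two samples');
  const mean = samples.reduce((sum, x) => sum + x, 0) / n;
  const variance =
    samples.reduce((sum, x) => sum + (x - mean) ** 2, 0) / (n - 1);
  return Math.sqrt(variance) / mean;
}

// The more often the whole comparison is repeated, the more likely it is that
// some run reports a "difference" purely by chance.
for (const runs of [1, 3, 5, 10]) {
  console.log(runs, familywiseFalsePositiveRate(0.05, runs).toFixed(3));
}
```

Under these assumptions the chance of seeing at least one spurious "significant" result grows with every extra CI run, which is why rerunning until the numbers look different (or look the same) does not settle the question.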
LGTM but it needs a rebase
e710e21
to
a40a73c
Compare
I've resolved all performance regressions in the latest version.
PTAL @addaleax, @refack, @cjihrig and others. This should be ready to land after the weekend. |
Get rid of separate function to call callback from _tickCallback as it no longer yields worthwhile performance improvement. Move some code from nextTick & internalNextTick into TickObject constructor to minimize duplication.
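The shape of the change described in this commit message can be sketched roughly as follows. This is a simplified illustration, not the actual Node.js source: the array-backed queue, the enqueue helper, callViaHelper, and the two drain functions are all hypothetical names, and which setup code moved into the TickObject constructor is an assumption here (only field assignment is shown).

```javascript
'use strict';

// Hypothetical, simplified TickObject: shared setup lives in the constructor
// so that callers do not have to duplicate it.
class TickObject {
  constructor(callback, args) {
    this.callback = callback;
    this.args = args; // undefined when the callback takes no extra arguments
  }
}

// Hypothetical stand-in for the internal tick queue.
const tickQueue = [];

function enqueue(callback, args) {
  tickQueue.push(new TickObject(callback, args));
}

// "Before" shape: a separate helper invokes the callback, which adds an
// extra internal frame to any stack trace captured inside the callback.
function callViaHelper(callback, args) {
  if (args === undefined) callback();
  else Reflect.apply(callback, undefined, args);
}

function drainWithHelper() {
  let tock;
  while ((tock = tickQueue.shift()) !== undefined)
    callViaHelper(tock.callback, tock.args);
}

// "After" shape: the callback is invoked inline in the drain loop.
function drainInline() {
  let tock;
  while ((tock = tickQueue.shift()) !== undefined) {
    const callback = tock.callback;
    if (tock.args === undefined) callback();
    else Reflect.apply(callback, undefined, tock.args);
  }
}
```

In this sketch the inline variant removes one internal frame from stack traces, which matches the motivation given in the PR description; whether it performs differently is exactly what the benchmarks in this thread were trying to establish.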
a40a73c
to
c9513e1
Compare
Get rid of separate function to call callback from _tickCallback as it no longer yields worthwhile performance improvement. Move some code from nextTick & internalNextTick into TickObject constructor to minimize duplication.

PR-URL: #16888
Reviewed-By: Anna Henningsen
Reviewed-By: Colin Ihrig
Reviewed-By: Refael Ackermann
Reviewed-By: Luigi Pinca
Reviewed-By: James M Snell
Reviewed-By: Ruben Bridgewater
Reviewed-By: Timothy Gu
Landed in cbaf59c. Thanks everyone for the reviews & patience in getting this code tightened up. |
This PR gets rid of a separate function to execute the callback from _tickCallback as it no longer yields a performance benefit.

Unlike with emit this doesn't yield an improvement on any of our benchmarks but since it also doesn't make them worse, I feel this is probably a worthwhile change (simpler code and less unnecessary internals in stack traces).

Unfortunately we still can't switch to using spread operator in nextTick instead of the manual arguments copying. We can follow up on that in the next few V8 versions.

Checklist
make -j4 test (UNIX), or vcbuild test (Windows) passes

Affected core subsystem(s)
process
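
For readers unfamiliar with the manual arguments copying mentioned in the description, the sketch below contrasts it with the rest/spread alternative. It is an illustration only, not the code in this PR: queueWithManualCopy, queueWithRest, drain, and the array-backed queue are hypothetical names, and the exact arities special-cased in the switch are an assumption.

```javascript
'use strict';

// Hypothetical stand-in for the tick queue; not Node's internal structure.
const queue = [];

// Manual copying: switch on arguments.length so that common arities build
// a small array literal, and only longer argument lists fall back to a loop.
function queueWithManualCopy(callback) {
  let args;
  switch (arguments.length) {
    case 1: break;
    case 2: args = [arguments[1]]; break;
    case 3: args = [arguments[1], arguments[2]]; break;
    case 4: args = [arguments[1], arguments[2], arguments[3]]; break;
    default:
      args = new Array(arguments.length - 1);
      for (let i = 1; i < arguments.length; i++)
        args[i - 1] = arguments[i];
  }
  queue.push({ callback, args });
}

// Rest-parameter version: shorter, but it relies on the engine handling
// rest/spread efficiently.
function queueWithRest(callback, ...args) {
  queue.push({ callback, args: args.length > 0 ? args : undefined });
}

// Run everything that was queued, in insertion order.
function drain() {
  let tock;
  while ((tock = queue.shift()) !== undefined) {
    if (tock.args === undefined) tock.callback();
    else tock.callback(...tock.args);
  }
}
```

Both variants queue the same data, so they are interchangeable from the caller's point of view; the description above only says that the switch was not yet possible at the time, with a follow-up planned for later V8 versions.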