-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve Array.Sort(T[]) performance #35297
Conversation
c9c9ccd
to
5482eac
Compare
src/libraries/System.Private.CoreLib/src/System/Collections/Generic/ArraySortHelper.cs
Show resolved
Hide resolved
src/libraries/System.Private.CoreLib/src/System/Collections/Generic/ArraySortHelper.cs
Show resolved
Hide resolved
src/libraries/System.Private.CoreLib/src/System/Collections/Generic/ArraySortHelper.cs
Show resolved
Hide resolved
5482eac
to
dc739b0
Compare
src/libraries/System.Private.CoreLib/src/System/Collections/Generic/ArraySortHelper.cs
Outdated
Show resolved
Hide resolved
src/libraries/System.Private.CoreLib/src/System/Collections/Generic/ArraySortHelper.cs
Outdated
Show resolved
Hide resolved
src/libraries/System.Private.CoreLib/src/System/Collections/Generic/ArraySortHelper.cs
Outdated
Show resolved
Hide resolved
src/libraries/System.Private.CoreLib/src/System/Collections/Generic/ArraySortHelper.cs
Show resolved
Hide resolved
src/libraries/System.Private.CoreLib/src/System/Collections/Generic/ArraySortHelper.cs
Outdated
Show resolved
Hide resolved
dc739b0
to
78c521b
Compare
src/libraries/System.Private.CoreLib/src/System/Collections/Generic/ArraySortHelper.cs
Show resolved
Hide resolved
78c521b
to
13c854c
Compare
A variety of tweaks to improve `Array.Sort<T>(T[])` performance and address a regression left over from moving the array sorting implementation from native to managed. The two most impactful are using `Unsafe.*` in `PickPivotAndPartition` to avoid bounds checks and aggressive inlining on `SwapIfGreater`. A few other small improvements to codegen round it out. I only made the unsafe changes in the `Sort<T>(T[])` implementation, and not in the more complicated implementations, such as for `Sort<T>(T[], Comparer<T>)` and `Sort<TKey, TValue>(TKey[], TValue[])`, but I did make some of the smaller changes for consistency across the file.
13c854c
to
2eef7dd
Compare
FWIW, I've double checked my side and it is exactly 10% (before these changes), on both my Kaby Lake and AMD Ryzen machine. I will mention that my Intel machine is clean of the Intel JCC microcode update, so that may be a factor in this, depending's on @stephentoub's machine configuration. |
Thanks. Are you able to test with this PR? |
Issue seems resolved.
|
Great. Thanks for confirming! |
while (pivot.CompareTo(keys[++left]) > 0) ; | ||
while (pivot.CompareTo(keys[--right]) < 0) ; | ||
while (Unsafe.IsAddressLessThan(ref leftRef, ref nextToLastRef) && pivot.CompareTo(leftRef = ref Unsafe.Add(ref leftRef, 1)) > 0) ; | ||
while (Unsafe.IsAddressGreaterThan(ref rightRef, ref zeroRef) && pivot.CompareTo(rightRef = ref Unsafe.Add(ref rightRef, -1)) < 0) ; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@stephentoub this will no longer result in an index-out-of-range exception thrown when bogus comparable, insteead, it silently swallows that case now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. Incorrect comparables wouldn't have always done so, only for some forms of incorrect.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure I understand, does that mean you think this is not a problem?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. You disagree?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would have thought that it would be a breaking change, you will no longer get an exception. Behavior is different from earlier versions. Is it not?
I have no hard feelings about this, clearly my bar for acceptable changes, perf regressions etc. was set too high in some regards 😅 When I worked on this I was aiming for 100% fidelity with existing output.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My thinking when I chose to make this change was that the comparer is busted, and there are a multitude of possible behaviors that could result from a busted comparer... it could throw arbitrary exceptions, it could sporadically yield an incorrect comparison in a way no one would know, it could crash the process, etc. We could even be changing behavior in that regard just by changing how many times we invoked the comparer, or by changing the order in which we invoked it on the input data, or any number of other things. So I'm not concerned with taking a failure we may have only sometimes been able to detect (and for which a goal was never true detection) and making it into something which sometimes fails differently, and does so by not throwing instead of throwing.
@jkotas, any concerns?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not have concerns about the subtle behavior change with broken comparers. My primary concerns around the unsafe code are buffer overruns. It is easy to see that the bounds are checked in this case.
aim for 100% fidelity with existing output.
That was a good place to start and the initial PR that got merged maintained this fidelity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alright, good to know. Perf/code size will be better without the check 👍 I have tests that will fail due to this change, since I test against built-in Sort
, but that is my concern then.
@stephentoub @jkotas guess this means dotnet/corefx#26859 could have been merged anyhow :) |
There's a ton more unsafe code in that PR than in this one. The "unsafety" in this PR is scoped to just two functions, with extra scrutiny as called out in dotnet/corefx#26859 (comment). |
True, but mainly due to code having to be repeated so many times, code is the same across versions, if one is safe, is the other not? :) Guess source generators might "solve" this. |
#35175 was opened about a 10% regression in sorting throughput (specifically looking just at
Int32[]
) from .NET Core 3.1 to .NET 5.On my machine:
our dotnet/performance tests don't show a regression anywhere near that, e.g.
However, the benchmarks shared in that PR do show a regression in some cases for Int32 on my machine, albeit not quite what was cited:
Even so, a simple console app does sufficiently demonstrate a meaningful throughput regression:
On .NET Core 3.1 I get results like:
and with master I get results like:
which is closer to a 20% regression.
Since even though our actual benchmarks aren’t showing anything close to that (and in some cases .NET 5 being meaningfully faster, especially with strings), this PR addresses the gap. It includes a variety of tweaks to improve
Array.Sort<T>(T[])
performance; the two most impactful are usingUnsafe.*
inPickPivotAndPartition
to avoid bounds checks and aggressive inlining onSwapIfGreater
. A few other small improvements to codegen round it out. I only made the unsafe changes in theSort<T>(T[])
implementation, and not in the more complicated implementations, such as forSort<T>(T[], Comparer<T>)
andSort<TKey, TValue>(TKey[], TValue[])
, but I did make some of the smaller changes for consistency across the file.Fixes #35175
@jkotas, any visceral reaction to the
Unsafe.*
code here? 😄cc: @damageboy, @adamsitnik
In terms of impact, here are my results from my running the benchmarks shared in #35175:
For the simple command-line app, we now get results like:
I also tweaked the above to remove the app that copies the unsorted data over each array, such that we're then sorting an already sorted array each time. With that, on .NET Core 3.1 I get:
and on master I get:
and with this PR I get:
Finally, I checked GC pause times similar to what @jkotas did in dotnet/coreclr#27642 (comment). With .NET Core 3.1, we get average GC pause times around 500ms, and with .NET 5 (both master and this PR), we get average GC pause times around 15ms. This highlights one of the benefits of moving the sorting into managed code, separate from all the other benefits.