[WIP] Add Sort(...) extension methods for Span<T> #26859
Initial set of benchmark runs from https://github.com/DotNetCross/Sorting/blob/c168e6ff6a10f2cb2bde911f719feacffcfaffa7/tests/DotNetCross.Sorting.Benchmarks/Program.cs can be found at https://github.com/DotNetCross/SortingBenchmarkResults/tree/master/180205_c168e6ff_i5-3475S_nietras
Note that this was run
BenchmarkDotNet=v0.10.12, OS=Windows 10 Redstone 3 [1709, Fall Creators Update] (10.0.16299.192)
Intel Core i5-3475S CPU 2.90GHz (Ivy Bridge), 1 CPU, 4 logical cores and 4 physical cores
Frequency=2840377 Hz, Resolution=352.0659 ns, Timer=TSC
.NET Core SDK=2.1.4
[Host] : .NET Core 2.0.5 (Framework 4.6.26020.03), 64bit RyuJIT
Platform=X64 Runtime=Core Toolchain=InProcessToolchain
LaunchCount=1 RunStrategy=Monitoring TargetCount=11
UnrollFactor=1 WarmupCount=3
As I mentioned, value type performance is on par or better, but reference type performance is a lot slower. |
I've added lots of new benchmarks and disassembly, courtesy of fixes made by the BenchmarkDotNet guys, in https://github.com/DotNetCross/Sorting/tree/04add921f49e19a33ccb00d81145fd765f1ecb91; the results can be found at https://github.com/DotNetCross/SortingBenchmarkResults/tree/master/180209_04add921_i5-3475S_nietras This adds benchmarks of overloads and more types. Overall, the results are the same. For value types and value type comparers perf is good, even great (see SingleSortBench-report-github.md); observe how ClassComparableComparer is much worse than for Array, though. As soon as reference types are involved, things start getting bad. Just to be clear, I am waiting for feedback on this: why this is, and whether or not I really have to add variants for each of the specific cases. What needs adding, it seems, is variants based on TComparer, IComparer and Comparison, if these can't be handled with decoration as I had hoped. |
There might still be some duplication of code with LINQ. |
@stephentoub @KrzysztofCwalina as far as I am concerned this was done, until the "security" issue popped up (long after starting this):
As then you might as well simply refactor the existing array code (pretty much search/replace and then change a few indexing details) to use span and use that for all "consumers" (e.g. Array, List, Span) with the span version being the common ground. That is, if the version I made here isn't usable after all. As I noted, the existing From my point of view this code could be used for all consumers, but I have to say that I am not sure I have much time to revisit this again, though. |
Ok, thank you for your efforts on this. Given the above, I'm going to close this PR, but hopefully whoever picks this up next will be able to build on your efforts / reuse your commits as is appropriate. |
Over the last few days I experimented with other code and confirmed that bounds checks slow such code down unacceptably. Bounds checks should be done only once, at the top level, for security. Now I believe that @nietras's code is the right direction. |
@iSazonov that's my experience too (I did some experiments with sorting spans here). Might it be an idea to stick something like this into a separate package with the intent to include it in core in the future? Because I think it's a shame it's not available until everything is "perfect", and simultaneously, I think it's worrisome that once it's in, it's very hard to change, and that implies that choices made here may stick around even when they're not ideal in the long run. In particular, the idea that this version sorts the same way as array.sort in the face of buggy comparers or indeterminate orderings sounds unnecessarily restrictive to me, and it makes the code somewhat larger and slower. |
@EamonNerbonne it was my intent to release https://github.com/DotNetCross/Sorting as a nuget package so it was at least available in some form. I just never got around to it, since I wanted to allow the sorting in this package to diverge a little bit (still correct sorting) from corefx to make some use cases a lot faster. The PR to corefx was targeting sorting that was exactly the same as before. The corefx version is inconsistent across different types when keys are identical. |
@stephentoub if any of the comments have changed your view on this PR let me know. I still believe the Span/ref based sort code can be used as the one and only sorting code in coreclr/corefx. |
Which view are you referring to? |
@stephentoub that the sorting impl must use bounds-checked indexing instead of ref-based access without bounds checking, even though the old C++ impl does the same. |
I didn't comment on bounds checking, did I? I've not actually reviewed this PR before to my knowledge. |
@stephentoub sorry given you closed it I thought you were "in the know" of the reasons why this was dismissed. Can't remember who said it, but as far as I recall that was the reason for this PR not being accepted. |
If bounds checking is not an issue and it's a simple matter of reducing code duplication (e.g. use the span version everywhere) then I could try getting this done again with help/guidance around that. |
I closed it because it was still a WIP and you noted that you weren't sure when you'd be able to spend more time pushing it forward. My ideal would be that this not copy the array's sorting code but rather actually share the same implementation, living in corelib and changing the existing implementation to work on either arrays or spans rather than just arrays. |
This was based on me doing a complete rewrite/start from scratch due to ask for bounds checking. Because then you might as well just have copied and search replaced Given how much time I spent on doing the ref version, I didn't see the point to be frank 😆 |
I kept the WIP because no one said this was acceptable, I also wanted to get the span version in, before removing the old code, to do one thing at a time. As I said I believe this PR can be used to consolidate all code to a single managed version, with perf being on par or in most cases faster. Give or take. Question is what implementation is acceptable? What principles apply? Bounds checking everywhere? Consolidate all code in same PR? |
I believe @jkotas commented originally on bounds checking, so he should comment, too, but from my perspective, the min bar would be:
|
This in my view requires the |
As I noted, I've not reviewed the PR, but just on the surface, this PR is in corefx rather than coreclr, it's duplicating code rather than using the same exact implementation (I can't easily tell how much it's diverged from what array is currently using), and because of both of those, I can't easily determine the impact it actually has on the existing array sorting safety or performance. My suggestion would be to start a new PR in coreclr and go from there. |
There was a separate PR for
Yes, of course, one step at a time I would think 🙂
That's a hard question, current implementation uses native not bounds checked code for specialized sorts. Versus not in other code paths. Does this make the native version less safe? I understand the importance of this but it does not answer whether bounds checking is required.
I did my best, as this PR reflects, to show this was not the case. |
The current managed array sorting code in CoreLib is safe with bounds checks. It is immune to buffer overruns by design. The bounds checks are skipped in the special-cased unmanaged code for primitive types only. That unmanaged code was a factory for security bugs that turned into MSRC (https://blogs.technet.microsoft.com/msrc/) events. I have said that thousands of lines of new unsafe sorting code without bounds checks will need extra security scrutiny given the history. |
Great, that's where the discussion should be, then. The only changes that should be needed in corefx are the additions to the reference assemblies and tests. |
I understand, but the ref rewrite required a lot of changes, and I thought a single "version" for different "compares" was possible without drawbacks; it was not. The coreclr code is not, in my view, clear on the many subtle differences between the different versions, so having to adapt back to being like coreclr again would indeed be a lot of work. I rearranged the code here to clearly show the different versions. And if the bounds checking is still required, well, then I still think my efforts were wasted, I'd have to say. |
coreclr PR: Note though I wanted code to be acceptable here first because testing is much easier here than in coreclr. Perhaps things have changed. Given @jkotas the bounds checking is still a question that needs a clear answer and path forward, how should the code look? I understand the importance of security here, but you will have to decide which version of the code is acceptable. |
Another possibility is to pick a few simpler sub-problems and avoid bounds checking only on those. E.g. the insertion-sort finisher is a pretty good candidate for that. |
Implements https://github.com/dotnet/corefx/issues/15329
WIP: This is still very much a work-in-progress. Warts and all! :)
Goals
coreclr Array.Sort code without major changes to the algorithm. That is, it is still Introspective Sort using median-of-three quick sort and heap sort.
Array.Sort.
Non-Goals
for this PR to hopefully be accepted without too much controversy :)
Array.Sort
Array.Sort is implemented as both managed code and native code (for some primitives) in:
TrySZSort)
Base and Variant Differences
This PR is based mainly on the native implementation and the generic implementation.
In retrospect, I had believed the different variants of Array.Sort would sort identically, but they do not. The fact that I started out with the native implementation and focused on primitives and the specialization of these appears to have been a mistake.
The fact is Array.Sort can yield different sorting results depending on the variant used when also sorting items (also called values, since the items used for sorting are then keys). Note that the sort is still correct; it is just that equal keys can have different results for where the items are. I.e. here is an example that comes from a special test:
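As a stand-in illustration only (this is not the special test referred to above), the sketch below sorts the same key/item pairs in two ways: with `Array.Sort(keys, items)`, which is not guaranteed to be stable, and with LINQ `OrderBy`, which is stable. Both results have correctly sorted keys, but items that share a key are allowed to land in a different order. The class and method names are hypothetical, and on a given runtime the two outputs may happen to coincide.

```csharp
using System;
using System.Linq;

public static class EqualKeysDemo
{
    // Sorts copies of keys/items with Array.Sort(keys, items).
    // The order of items that share a key is not guaranteed.
    public static (int[] Keys, string[] Items) SortWithArraySort(int[] keys, string[] items)
    {
        var k = (int[])keys.Clone();
        var v = (string[])items.Clone();
        Array.Sort(k, v);
        return (k, v);
    }

    // Sorts the same pairs with LINQ OrderBy, which keeps the
    // original relative order of items that share a key.
    public static (int[] Keys, string[] Items) SortStable(int[] keys, string[] items)
    {
        var pairs = keys.Zip(items, (key, item) => (Key: key, Item: item))
                        .OrderBy(p => p.Key)
                        .ToArray();
        return (pairs.Select(p => p.Key).ToArray(), pairs.Select(p => p.Item).ToArray());
    }

    public static void Main()
    {
        int[] keys = { 2, 1, 2, 1 };
        string[] items = { "a", "b", "c", "d" };

        var viaArraySort = SortWithArraySort(keys, items);
        var viaOrderBy = SortStable(keys, items);

        // Keys are identical in both results; items for equal keys may or may not be.
        Console.WriteLine(string.Join(",", viaArraySort.Keys) + " -> " + string.Join(",", viaArraySort.Items));
        Console.WriteLine(string.Join(",", viaOrderBy.Keys) + " -> " + string.Join(",", viaOrderBy.Items));
    }
}
```

A test that demands byte-for-byte identical items across variants is therefore stricter than a test that only demands a correct sort.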
Minor Bug Fix
Minor bug fix for introspective depth limit, see dotnet/coreclr#16002
Code Structure
Code is currently, probably temporarily, structured into multiple files
to allow for easier comparison of the different variants.
Consolidated (most likely will be removed):
SpanSortHelpers.KeysAndOrValues.cs
ref struct and value type generic arguments to ensure this could be done with negligible performance impact, but a few things were in the way:
ref structs can't contain refs
but probably a bit too C++ template like.
System.Ben to see if it helped ;)
Current:
SpanSortHelpers.Common.cs
Swap.
SpanSortHelpers.Keys.cs
SpanSortHelpers.Keys.IComparable.cs
SpanSortHelpers.Keys.Specialized.cs
SpanSortHelpers.Keys.TComparer.cs
SpanSortHelpers.KeysValues.cs
SpanSortHelpers.KeysValues.IComparable.cs
SpanSortHelpers.KeysValues.Specialized.cs
SpanSortHelpers.KeysValues.TComparer.cs
Primary development was done in my DotNetCross.Sorting repo:
https://github.com/DotNetCross/Sorting/
This was to get a better feedback loop and to use BenchmarkDotNet for benchmark testing
and disassembly.
NOTE: I understand that in corefx we might want to consolidate this into a single file, but for now it is easier when comparing variants. Note also that coreclr has more variants than currently in this PR; there is, for example, a variant for Comparison<T> in coreclr.
Changes
Many small changes have been made due to using refs and Unsafe, but a couple of notable changes are:
Sort3 add a specific implementation for sorting three, used both for finding pivot and when sorting exactly 3 elements.
Sort2 three times like coreclr, as otherwise there would be differences for some same-key cases that I haven't had time to debug.
Sort3? For expensive comparisons this can be a big improvement if keys are almost already sorted, since only 2 compares are needed instead of 3.
Sort2 instead of SwapIfGreater.
IDirectComparer allowing for the kind of specialization for basic types that coreclr does in native code. This started out as just a LessThan method, but the problem is then whether bogus comparers should yield the same result as Array.Sort, so I changed this to have more methods.
Benchmarks
Since Sort is an in-place operation it is "destructive", and benchmarking is done a bit differently than normal.
The basic code for benchmarks is shown below; this uses https://github.com/dotnet/BenchmarkDotNet and was done in the https://github.com/DotNetCross/Sorting/ git repo. Porting these to the normal corefx performance tests is on the TODO list.
The benchmarks use a Filler to pre-fill a _filled array with a pattern in slices of Length. Depending on the filler, the full array will then be filled with either repeating patterns of Length or the entire array will be filled with a given pattern.
This allows testing sorts of different lengths, but it measures the total time to do a number of sorts, so it includes the overhead of slicing and the loop. It also allows testing different patterns/sequences.
The main point here is that this is used to compare relative to Array.Sort.
Results for this for specific commits will come in comments later.
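The sketch below illustrates the measurement pattern described above; it is not the benchmark code from the repo. It assumes a pre-filled source array that is copied into a work array before each measured pass, then sorted slice by slice with the existing `Array.Sort(array, index, length)` baseline. The names `SliceSortBench`, `FillDecrementingSlices` and `SortSlices` are hypothetical, and a plain `Stopwatch` stands in for BenchmarkDotNet. Unlike a real setup step, the copy here is inside the timed region.

```csharp
using System;
using System.Diagnostics;

public static class SliceSortBench
{
    // Pre-fills an array with a decrementing pattern repeated per slice of `length`.
    public static int[] FillDecrementingSlices(int total, int length)
    {
        var filled = new int[total];
        for (int i = 0; i < total; i++)
            filled[i] = length - 1 - (i % length);
        return filled;
    }

    // Restores the unsorted data, then sorts every full slice of `length`.
    // Sorting is destructive, so the copy is what makes repeated runs comparable.
    public static void SortSlices(int[] filled, int[] work, int length)
    {
        Array.Copy(filled, work, filled.Length);
        for (int i = 0; i + length <= work.Length; i += length)
            Array.Sort(work, i, length);
    }

    public static void Main()
    {
        const int total = 1_000_000;
        const int length = 100;
        int[] filled = FillDecrementingSlices(total, length);
        int[] work = new int[total];

        SortSlices(filled, work, length);          // warm-up pass
        var sw = Stopwatch.StartNew();
        SortSlices(filled, work, length);          // measured pass: copy + slicing + sorts
        sw.Stop();
        Console.WriteLine("Sorted " + (total / length) + " slices in " + sw.Elapsed.TotalMilliseconds + " ms");
    }
}
```

Because the loop and slicing overhead are part of the measured time, the numbers are mainly useful relative to the same loop run against the baseline sort.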
Fillers
As noted there are different fillers. Some fill the entire array not caring about the slice length.
Others fill based on slice length. This is particularly important for MedianOfThreeKiller.
length - 1 to 0.
0 to length - 1. Random pairs are then swapped, as a ratio of length. In this case 10%, i.e. 10% of pairs have been swapped randomly. Seeded so each run is the same.
For each different type, the int value is converted to the given type, e.g. using ToString("D9") for string.
These fillers are also used for test case generation, but are combined with other sort case generators.
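A minimal sketch of what such fillers could look like, assuming the behavior described above: a decrementing fill, an incrementing fill with a seeded 10% of random pair swaps, and conversion of the `int` values to `string` via `ToString("D9")`. The class and method names are hypothetical and are not taken from the repo.

```csharp
using System;
using System.Linq;

public static class FillerSketch
{
    // length - 1 down to 0.
    public static int[] Decrementing(int length) =>
        Enumerable.Range(0, length).Select(i => length - 1 - i).ToArray();

    // 0 to length - 1, then (length * ratio) random pairs are swapped.
    // A fixed seed keeps the pattern the same for every run on a given runtime.
    public static int[] PartialRandomShuffle(int length, double ratio, int seed)
    {
        int[] values = Enumerable.Range(0, length).ToArray();
        var random = new Random(seed);
        int swaps = (int)(length * ratio);
        for (int s = 0; s < swaps; s++)
        {
            int i = random.Next(length);
            int j = random.Next(length);
            int t = values[i]; values[i] = values[j]; values[j] = t;
        }
        return values;
    }

    // Zero-padded to nine digits, so ordinal string order matches the numeric order
    // for non-negative values.
    public static string[] ToStrings(int[] values) =>
        values.Select(v => v.ToString("D9")).ToArray();

    public static void Main()
    {
        Console.WriteLine(string.Join(",", Decrementing(5)));
        Console.WriteLine(string.Join(",", PartialRandomShuffle(20, 0.1, 42)));
        Console.WriteLine(string.Join(",", ToStrings(new[] { 0, 42 })));
    }
}
```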
Difference to BinarySearch
BinarySearch only has an overload without comparer for when T : IComparable<T>. That is, there is no overload where the value searched for is not generically constrained.
This is unlike Sort, where there is no generic constraint on the key type. If we had dotnet/csharplang#905 this might not be that big an issue, but we might expect issues for some uses of BinarySearch.
Tests
The fundamental principle of the tests is that they use Array.Sort for generating the expected output. The idea being that span Sort should give exactly the same result.
Notes and TODOs
Biggest problem currently is that there are differences for some specific test cases:
coreclr only does that when items/values are of the same type. This means coreclr does not do a NaNPrepass for floating point types when items are not of the same type.
Array.Sort will throw on these for some lengths.
There are a lot of other remaining TODOs e.g.:
corefx standard.
Performance is currently on par or significantly better than coreclr for value types. As soon as reference types are used performance is... well, miserable. This probably reflects the fact that my main focus was on optimizing for ints. I believe the issue here must be around how I have factored the generic code.
JIT_GenericHandleMethod or JIT_GenericHandleClass, which shows up during profiling? This is the reason for the IComparable variant of the code... but it is still slow.
CC: @ahsonkhan @jkotas