Doc: The default sorting alg. is stable from 1.9 #47579

Merged: 16 commits, Nov 21, 2022
67 changes: 35 additions & 32 deletions doc/src/base/sort.md
@@ -141,53 +141,56 @@ There are currently four sorting algorithms available in base Julia:
* [`PartialQuickSort(k)`](@ref)
* [`MergeSort`](@ref)

`InsertionSort` is an O(n^2) stable sorting algorithm. It is efficient for very small `n`, and
is used internally by `QuickSort`.
`InsertionSort` is an O(n²) stable sorting algorithm. It is efficient for very small `n`,
and is used internally by `QuickSort`.

`QuickSort` is an O(n log n) sorting algorithm which is in-place, very fast, but not stable –
i.e. elements which are considered equal will not remain in the same order in which they originally
appeared in the array to be sorted. `QuickSort` is the default algorithm for numeric values, including
integers and floats.
`QuickSort` is an in-place and very fast sorting algorithm with an average-case time
complexity of O(n log n). Since Julia v1.9, `QuickSort` is stable, i.e., elements considered
equal will remain in the same order. Note that the worst-case complexity is O(n²), but this
case becomes vanishingly unlikely because the pivot selection is randomized in Julia v1.9.
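
To make the notion of stability concrete, here is a minimal sketch. The `records` data and the variable names are made up for illustration. It only checks the default algorithm and `MergeSort`, both of which are described as stable. According to the text above, passing `alg=QuickSort` should give the same result from Julia 1.9 on, but that is not asserted here.

```julia
# Hypothetical records with duplicate keys; the second field marks the input order.
records = [(2, 'a'), (1, 'b'), (2, 'c'), (1, 'd'), (2, 'e')]

# Sort by the first field only, so records with the same key compare as equal.
by_default = sort(records; by=first)               # default algorithm
by_merge = sort(records; alg=MergeSort, by=first)  # explicitly stable algorithm

# A stable sort keeps ties in input order: 'b' before 'd', and 'a', 'c', 'e' in order.
println(by_default)
println(by_merge == by_default)
```

With a non-stable algorithm, the records sharing a key could legally come out in any relative order.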

`PartialQuickSort(k)` is similar to `QuickSort`, but the output array is only sorted up to index
`k` if `k` is an integer, or in the range of `k` if `k` is an `OrdinalRange`. For example:
`PartialQuickSort(k::OrdinalRange)` is similar to `QuickSort`, but the output array is only
sorted in the range of `k`. For example:

!!! compat "Julia 1.9"
    `QuickSort` and `PartialQuickSort` are stable since Julia 1.9.

```julia
x = rand(1:500, 100)
k = 50
k2 = 50:100
s = sort(x; alg=QuickSort)
ps = sort(x; alg=PartialQuickSort(k))
qs = sort(x; alg=PartialQuickSort(k2))
map(issorted, (s, ps, qs)) # => (true, false, false)
map(x->issorted(x[1:k]), (s, ps, qs)) # => (true, true, false)
map(x->issorted(x[k2]), (s, ps, qs)) # => (true, false, true)
s[1:k] == ps[1:k] # => true
s[k2] == qs[k2] # => true
x = rand(1:500, 100);
k = 50:100;
s1 = sort(x; alg=QuickSort);
s2 = sort(x; alg=PartialQuickSort(k));
map(issorted, (s1, s2)) # => (true, false)
map(x->issorted(x[k]), (s1, s2)) # => (true, true)
s1[k] == s2[k] # => true
```

`MergeSort` is an O(n log n) stable sorting algorithm but is not in-place – it requires a temporary
array of half the size of the input array – and is typically not quite as fast as `QuickSort`.
It is the default algorithm for non-numeric data.

The default sorting algorithms are chosen on the basis that they are fast and stable, or *appear*
to be so. For numeric types indeed, `QuickSort` is selected as it is faster and indistinguishable
in this case from a stable sort (unless the array records its mutations in some way). The stability
property comes at a non-negligible cost, so if you don't need it, you may want to explicitly specify
your preferred algorithm, e.g. `sort!(v, alg=QuickSort)`.
The default sorting algorithm is chosen on the basis that it is stable and fast, or *appears*
to be fast. Usually, `QuickSort` is selected, but `InsertionSort` is preferred for small
data. You can also explicitly specify your preferred algorithm, e.g.
`sort!(v, alg=PartialQuickSort(10:20))`.
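
As a rough illustration of passing an explicit algorithm to the in-place `sort!`, the sketch below checks that only the requested range is guaranteed to hold the values a full sort would place there. The array contents and the names `v` and `expected` are arbitrary choices for this example.

```julia
v = rand(1:500, 100)
expected = sort(v)                    # fully sorted copy for comparison

# Partially sort `v` in place: only positions 10:20 are guaranteed to be final.
sort!(v, alg=PartialQuickSort(10:20))

println(v[10:20] == expected[10:20])  # the requested range matches a full sort
println(sort(v) == expected)          # no elements were lost or duplicated
```

Positions outside `10:20` may or may not be in order, so code relying on them should request a full sort instead.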

The mechanism by which Julia picks default sorting algorithms is implemented via the
`Base.Sort.defalg` function. It allows a particular algorithm to be registered as the
default in all sorting functions for specific arrays. For example, here is the default
method from [`sort.jl`](https://github.com/JuliaLang/julia/blob/master/base/sort.jl):

The mechanism by which Julia picks default sorting algorithms is implemented via the `Base.Sort.defalg`
function. It allows a particular algorithm to be registered as the default in all sorting functions
for specific arrays. For example, here are the two default methods from [`sort.jl`](https://github.com/JuliaLang/julia/blob/master/base/sort.jl):
```julia
defalg(v::AbstractArray) = DEFAULT_STABLE
```

You may change the default behavior for a specific element type with:
```julia
defalg(v::AbstractArray) = MergeSort
defalg(v::AbstractArray{<:Number}) = QuickSort
defalg(v::AbstractArray{<:Number}) = MergeSort
```
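
The same mechanism can be sketched for a user-defined element type, which avoids changing the default for all numeric arrays. In this sketch, `Version3` and its fields are hypothetical names introduced only for illustration, and the choice of `MergeSort` is arbitrary.

```julia
# Hypothetical element type, used only for this example.
struct Version3
    major::Int
    minor::Int
end

# Ordering for the new type: compare (major, minor) lexicographically.
Base.isless(a::Version3, b::Version3) = (a.major, a.minor) < (b.major, b.minor)

# Register MergeSort as the default algorithm for arrays of this element type.
Base.Sort.defalg(::AbstractArray{<:Version3}) = MergeSort

v = [Version3(1, 2), Version3(0, 9), Version3(1, 0)]
sorted = sort(v)                        # no `alg` keyword: `defalg(v)` picks the algorithm
println(Base.Sort.defalg(v) === MergeSort)
println(issorted(sorted))
```

Defining the method on a type you own keeps the change local to your code, whereas a method on `AbstractArray{<:Number}` would affect every package that sorts numeric arrays.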

As for numeric arrays, choosing a non-stable default algorithm for array types for which the notion
of a stable sort is meaningless (i.e. when two values comparing equal can not be distinguished)
may make sense.
!!! compat "Julia 1.9"
    The default sorting algorithm (returned by the `Base.Sort.defalg` function) is
    stable since Julia 1.9.

## Alternate orderings
