From de7d3f0d1975da90675409ae128645674ac9a2cd Mon Sep 17 00:00:00 2001 From: Petr Vana Date: Tue, 15 Nov 2022 15:33:53 +0100 Subject: [PATCH 01/14] Update docs: The default sorting alg. is stable --- doc/src/base/sort.md | 61 ++++++++++++++++++++++---------------------- 1 file changed, 30 insertions(+), 31 deletions(-) diff --git a/doc/src/base/sort.md b/doc/src/base/sort.md index 9f00381ab892c..be7cbd7286e6c 100644 --- a/doc/src/base/sort.md +++ b/doc/src/base/sort.md @@ -141,53 +141,52 @@ There are currently four sorting algorithms available in base Julia: * [`PartialQuickSort(k)`](@ref) * [`MergeSort`](@ref) -`InsertionSort` is an O(n^2) stable sorting algorithm. It is efficient for very small `n`, and -is used internally by `QuickSort`. +`InsertionSort` is an O(n²) stable sorting algorithm. It is efficient for very small `n`, +and is used internally by `QuickSort`. -`QuickSort` is an O(n log n) sorting algorithm which is in-place, very fast, but not stable – -i.e. elements which are considered equal will not remain in the same order in which they originally -appeared in the array to be sorted. `QuickSort` is the default algorithm for numeric values, including -integers and floats. +`QuickSort` is an in-place and very fast sorting algorithm with an average-case time +complexity of O(n log n). Since Julia v1.9, `QuickSort` is stable, i.e., elements considered +equal will remain in the same order. Notice that O(n²) is worst-case complexity, but it gets +vanishingly unlikely as the pivot selection is randomized in Julia v1.9. -`PartialQuickSort(k)` is similar to `QuickSort`, but the output array is only sorted up to index -`k` if `k` is an integer, or in the range of `k` if `k` is an `OrdinalRange`. For example: +`PartialQuickSort(k::OrdinalRange)` is similar to `QuickSort`, but the output array is only +sorted in the range of `k`. 
For example: ```julia x = rand(1:500, 100) -k = 50 -k2 = 50:100 -s = sort(x; alg=QuickSort) -ps = sort(x; alg=PartialQuickSort(k)) -qs = sort(x; alg=PartialQuickSort(k2)) -map(issorted, (s, ps, qs)) # => (true, false, false) -map(x->issorted(x[1:k]), (s, ps, qs)) # => (true, true, false) -map(x->issorted(x[k2]), (s, ps, qs)) # => (true, false, true) -s[1:k] == ps[1:k] # => true -s[k2] == qs[k2] # => true +k = 50:100 +s1 = sort(x; alg=QuickSort) +s2 = sort(x; alg=PartialQuickSort(k)) +map(issorted, (s1, s2)) # => (true, false) +map(x->issorted(x[k]), (s1, s2)) # => (true, true) +s1[k] == s2[k] # => true ``` `MergeSort` is an O(n log n) stable sorting algorithm but is not in-place – it requires a temporary array of half the size of the input array – and is typically not quite as fast as `QuickSort`. It is the default algorithm for non-numeric data. -The default sorting algorithms are chosen on the basis that they are fast and stable, or *appear* -to be so. For numeric types indeed, `QuickSort` is selected as it is faster and indistinguishable -in this case from a stable sort (unless the array records its mutations in some way). The stability -property comes at a non-negligible cost, so if you don't need it, you may want to explicitly specify -your preferred algorithm, e.g. `sort!(v, alg=QuickSort)`. +The default sorting algorithm is chosen on the basis that it is stable and fast, or *appear* +to be fast. Usually, `QuickSort` is selected, but `InsertionSort` is preferred for small +data. You can also explicitly specify your preferred algorithm, e.g. +`sort!(v, alg=PartialQuickSort(10:20))`. -The mechanism by which Julia picks default sorting algorithms is implemented via the `Base.Sort.defalg` -function. It allows a particular algorithm to be registered as the default in all sorting functions -for specific arrays. 
For example, here are the two default methods from [`sort.jl`](https://github.com/JuliaLang/julia/blob/master/base/sort.jl): +The mechanism by which Julia picks default sorting algorithms is implemented via the +`Base.Sort.defalg` function. It allows a particular algorithm to be registered as the +default in all sorting functions for specific arrays. For example, here is the default +method from [`sort.jl`](https://github.com/JuliaLang/julia/blob/master/base/sort.jl): ```julia -defalg(v::AbstractArray) = MergeSort -defalg(v::AbstractArray{<:Number}) = QuickSort +defalg(v::AbstractArray) = DEFAULT_STABLE ``` -As for numeric arrays, choosing a non-stable default algorithm for array types for which the notion -of a stable sort is meaningless (i.e. when two values comparing equal can not be distinguished) -may make sense. +You may change the default behavior for specific element type by: +```julia +defalg(v::AbstractArray{<:Number}) = MergeSort +``` + +!!! compat "Julia 1.9" + The default sorting algorithm is stable from Julia 1.9. ## Alternate orderings From 788224c126c9082ac1723ed2777dfcef84dcefbe Mon Sep 17 00:00:00 2001 From: Petr Vana Date: Tue, 15 Nov 2022 15:37:44 +0100 Subject: [PATCH 02/14] Compat 1.9 for QuickSort to be stable --- doc/src/base/sort.md | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/doc/src/base/sort.md b/doc/src/base/sort.md index be7cbd7286e6c..900a6723abb0a 100644 --- a/doc/src/base/sort.md +++ b/doc/src/base/sort.md @@ -152,6 +152,9 @@ vanishingly unlikely as the pivot selection is randomized in Julia v1.9. `PartialQuickSort(k::OrdinalRange)` is similar to `QuickSort`, but the output array is only sorted in the range of `k`. For example: +!!! compat "Julia 1.9" + The `QuickSort` and `PartialQuickSort` are stable since Julia 1.9. + ```julia x = rand(1:500, 100) k = 50:100 @@ -186,7 +189,7 @@ defalg(v::AbstractArray{<:Number}) = MergeSort ``` !!! 
compat "Julia 1.9" - The default sorting algorithm is stable from Julia 1.9. + The default sorting algorithm is stable since Julia 1.9. ## Alternate orderings From ec70c8867c803cd309e435419e2755a6ee2b3243 Mon Sep 17 00:00:00 2001 From: Petr Vana Date: Tue, 15 Nov 2022 15:41:02 +0100 Subject: [PATCH 03/14] Do not show the data --- doc/src/base/sort.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/doc/src/base/sort.md b/doc/src/base/sort.md index 900a6723abb0a..78c80a22e4777 100644 --- a/doc/src/base/sort.md +++ b/doc/src/base/sort.md @@ -156,10 +156,10 @@ sorted in the range of `k`. For example: The `QuickSort` and `PartialQuickSort` are stable since Julia 1.9. ```julia -x = rand(1:500, 100) -k = 50:100 -s1 = sort(x; alg=QuickSort) -s2 = sort(x; alg=PartialQuickSort(k)) +x = rand(1:500, 100); +k = 50:100; +s1 = sort(x; alg=QuickSort); +s2 = sort(x; alg=PartialQuickSort(k)); map(issorted, (s1, s2)) # => (true, false) map(x->issorted(x[k]), (s1, s2)) # => (true, true) s1[k] == s2[k] # => true From 8b277aaf7fa32cbadf2a7970c766bd2ae01f51c4 Mon Sep 17 00:00:00 2001 From: Petr Vana Date: Tue, 15 Nov 2022 15:48:27 +0100 Subject: [PATCH 04/14] Specify the default algorithm --- doc/src/base/sort.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/doc/src/base/sort.md b/doc/src/base/sort.md index 78c80a22e4777..5f3902496d571 100644 --- a/doc/src/base/sort.md +++ b/doc/src/base/sort.md @@ -189,7 +189,8 @@ defalg(v::AbstractArray{<:Number}) = MergeSort ``` !!! compat "Julia 1.9" - The default sorting algorithm is stable since Julia 1.9. + The default sorting algorithm (returned by `Base.Sort.defalg` function) is + stable since Julia 1.9. 
## Alternate orderings From c740f9249597e5d9f9067a7036f18a4a29189545 Mon Sep 17 00:00:00 2001 From: Petr Vana Date: Fri, 18 Nov 2022 14:44:27 +0100 Subject: [PATCH 05/14] Update doc/src/base/sort.md Co-authored-by: Lilith Orion Hafner --- doc/src/base/sort.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/doc/src/base/sort.md b/doc/src/base/sort.md index 5f3902496d571..87b8be38bccf2 100644 --- a/doc/src/base/sort.md +++ b/doc/src/base/sort.md @@ -145,7 +145,7 @@ There are currently four sorting algorithms available in base Julia: and is used internally by `QuickSort`. `QuickSort` is an in-place and very fast sorting algorithm with an average-case time -complexity of O(n log n). Since Julia v1.9, `QuickSort` is stable, i.e., elements considered +complexity of O(n log n). Since Julia 1.9, `QuickSort` is stable, i.e., elements considered equal will remain in the same order. Notice that O(n²) is worst-case complexity, but it gets vanishingly unlikely as the pivot selection is randomized in Julia v1.9. From 52656f684f4455c1de73b668d41bcd9adfff8241 Mon Sep 17 00:00:00 2001 From: Petr Vana Date: Fri, 18 Nov 2022 14:53:18 +0100 Subject: [PATCH 06/14] Update according to the comments --- doc/src/base/sort.md | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/doc/src/base/sort.md b/doc/src/base/sort.md index 87b8be38bccf2..bab2e5342bb51 100644 --- a/doc/src/base/sort.md +++ b/doc/src/base/sort.md @@ -144,9 +144,9 @@ There are currently four sorting algorithms available in base Julia: `InsertionSort` is an O(n²) stable sorting algorithm. It is efficient for very small `n`, and is used internally by `QuickSort`. -`QuickSort` is an in-place and very fast sorting algorithm with an average-case time -complexity of O(n log n). Since Julia 1.9, `QuickSort` is stable, i.e., elements considered -equal will remain in the same order. 
Notice that O(n²) is worst-case complexity, but it gets +`QuickSort` is a very fast sorting algorithm with an average-case time complexity of +O(n log n). Since Julia 1.9, `QuickSort` is stable, i.e., elements considered equal will +remain in the same order. Notice that O(n²) is worst-case complexity, but it gets vanishingly unlikely as the pivot selection is randomized in Julia v1.9. `PartialQuickSort(k::OrdinalRange)` is similar to `QuickSort`, but the output array is only @@ -169,9 +169,9 @@ s1[k] == s2[k] # => true array of half the size of the input array – and is typically not quite as fast as `QuickSort`. It is the default algorithm for non-numeric data. -The default sorting algorithm is chosen on the basis that it is stable and fast, or *appear* -to be fast. Usually, `QuickSort` is selected, but `InsertionSort` is preferred for small -data. You can also explicitly specify your preferred algorithm, e.g. +The default sorting algorithms are chosen on the basis that they are stable and fast, or +*appear* to be fast. Usually, `QuickSort` is selected, but `InsertionSort` is preferred for +small data. You can also explicitly specify your preferred algorithm, e.g. `sort!(v, alg=PartialQuickSort(10:20))`. The mechanism by which Julia picks default sorting algorithms is implemented via the @@ -190,7 +190,7 @@ defalg(v::AbstractArray{<:Number}) = MergeSort !!! compat "Julia 1.9" The default sorting algorithm (returned by `Base.Sort.defalg` function) is - stable since Julia 1.9. + is guaranteed to be stable since Julia 1.9. 
## Alternate orderings From 9461f527ef152fe68e9990b24fb8ad6bccf060ae Mon Sep 17 00:00:00 2001 From: Petr Vana Date: Fri, 18 Nov 2022 15:05:24 +0100 Subject: [PATCH 07/14] Use example from InlineStrings.jl --- doc/src/base/sort.md | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/doc/src/base/sort.md b/doc/src/base/sort.md index bab2e5342bb51..24347765ec4e9 100644 --- a/doc/src/base/sort.md +++ b/doc/src/base/sort.md @@ -183,9 +183,10 @@ method from [`sort.jl`](https://github.com/JuliaLang/julia/blob/master/base/sort defalg(v::AbstractArray) = DEFAULT_STABLE ``` -You may change the default behavior for specific element type by: +You may change the default behavior for specific element type by, e.g., as in +[InlineStrings.jl](https://github.com/JuliaStrings/InlineStrings.jl/blob/v1.3.2/src/InlineStrings.jl#L903): ```julia -defalg(v::AbstractArray{<:Number}) = MergeSort +Base.Sort.defalg(::AbstractArray{<:Union{SmallInlineStrings, Missing}}) = InlineStringSort ``` !!! compat "Julia 1.9" From 6e16282ad6868d7159f5fb66cde548f77d96b16c Mon Sep 17 00:00:00 2001 From: Petr Vana Date: Fri, 18 Nov 2022 17:48:11 +0100 Subject: [PATCH 08/14] Change example to julia-repl --- doc/src/base/sort.md | 29 +++++++++++++++++++---------- 1 file changed, 19 insertions(+), 10 deletions(-) diff --git a/doc/src/base/sort.md b/doc/src/base/sort.md index 24347765ec4e9..d62873d725c13 100644 --- a/doc/src/base/sort.md +++ b/doc/src/base/sort.md @@ -155,14 +155,23 @@ sorted in the range of `k`. For example: !!! compat "Julia 1.9" The `QuickSort` and `PartialQuickSort` are stable since Julia 1.9. 
-```julia -x = rand(1:500, 100); -k = 50:100; -s1 = sort(x; alg=QuickSort); -s2 = sort(x; alg=PartialQuickSort(k)); -map(issorted, (s1, s2)) # => (true, false) -map(x->issorted(x[k]), (s1, s2)) # => (true, true) -s1[k] == s2[k] # => true +```julia-repl +julia> x = rand(1:500, 100); + +julia> k = 50:100; + +julia> s1 = sort(x; alg=QuickSort); + +julia> s2 = sort(x; alg=PartialQuickSort(k)); + +julia> map(issorted, (s1, s2)) +(true, false) + +julia> map(x->issorted(x[k]), (s1, s2)) +(true, true) + +julia> s1[k] == s2[k] +true ``` `MergeSort` is an O(n log n) stable sorting algorithm but is not in-place – it requires a temporary @@ -190,8 +199,8 @@ Base.Sort.defalg(::AbstractArray{<:Union{SmallInlineStrings, Missing}}) = Inline ``` !!! compat "Julia 1.9" - The default sorting algorithm (returned by `Base.Sort.defalg` function) is - is guaranteed to be stable since Julia 1.9. + The default sorting algorithm (returned by `Base.Sort.defalg`) is guaranteed + to be stable since Julia 1.9. ## Alternate orderings From 586ec9d2b698c658a99a973fb9086f43fc3a8c96 Mon Sep 17 00:00:00 2001 From: Petr Vana Date: Fri, 18 Nov 2022 18:28:58 +0100 Subject: [PATCH 09/14] Switch to jldoctest --- doc/src/base/sort.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/doc/src/base/sort.md b/doc/src/base/sort.md index d62873d725c13..871cfc45da6a9 100644 --- a/doc/src/base/sort.md +++ b/doc/src/base/sort.md @@ -155,7 +155,7 @@ sorted in the range of `k`. For example: !!! compat "Julia 1.9" The `QuickSort` and `PartialQuickSort` are stable since Julia 1.9. -```julia-repl +```jldoctest julia> x = rand(1:500, 100); julia> k = 50:100; From 7a66f7658d217608a57e27861a4dccc3090d4ea7 Mon Sep 17 00:00:00 2001 From: Petr Vana Date: Fri, 18 Nov 2022 18:32:56 +0100 Subject: [PATCH 10/14] Remove v1.9 from QuickSort description, as already mentioned in compat. 
--- doc/src/base/sort.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/doc/src/base/sort.md b/doc/src/base/sort.md index 871cfc45da6a9..70cb2cc887455 100644 --- a/doc/src/base/sort.md +++ b/doc/src/base/sort.md @@ -145,9 +145,9 @@ There are currently four sorting algorithms available in base Julia: and is used internally by `QuickSort`. `QuickSort` is a very fast sorting algorithm with an average-case time complexity of -O(n log n). Since Julia 1.9, `QuickSort` is stable, i.e., elements considered equal will -remain in the same order. Notice that O(n²) is worst-case complexity, but it gets -vanishingly unlikely as the pivot selection is randomized in Julia v1.9. +O(n log n). `QuickSort` is stable, i.e., elements considered equal will remain in the same +order. Notice that O(n²) is worst-case complexity, but it gets vanishingly unlikely as the +pivot selection is randomized. `PartialQuickSort(k::OrdinalRange)` is similar to `QuickSort`, but the output array is only sorted in the range of `k`. For example: From 29f08aa7b32b52ccf9232ca4eeabc3ec824b715a Mon Sep 17 00:00:00 2001 From: Petr Vana Date: Fri, 18 Nov 2022 18:39:01 +0100 Subject: [PATCH 11/14] Move compat below the example --- doc/src/base/sort.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/doc/src/base/sort.md b/doc/src/base/sort.md index 70cb2cc887455..653c321efd827 100644 --- a/doc/src/base/sort.md +++ b/doc/src/base/sort.md @@ -152,9 +152,6 @@ pivot selection is randomized. `PartialQuickSort(k::OrdinalRange)` is similar to `QuickSort`, but the output array is only sorted in the range of `k`. For example: -!!! compat "Julia 1.9" - The `QuickSort` and `PartialQuickSort` are stable since Julia 1.9. - ```jldoctest julia> x = rand(1:500, 100); julia> k = 50:100; julia> s1 = sort(x; alg=QuickSort); julia> s2 = sort(x; alg=PartialQuickSort(k)); julia> map(issorted, (s1, s2)) (true, false) julia> map(x->issorted(x[k]), (s1, s2)) (true, true) julia> s1[k] == s2[k] true ``` +!!! compat "Julia 1.9" + The `QuickSort` and `PartialQuickSort` are stable since Julia 1.9. 
+ `MergeSort` is an O(n log n) stable sorting algorithm but is not in-place – it requires a temporary array of half the size of the input array – and is typically not quite as fast as `QuickSort`. It is the default algorithm for non-numeric data. From a3d1675b09545ac97acb15f64ef728b526004ebf Mon Sep 17 00:00:00 2001 From: Lilith Orion Hafner Date: Mon, 21 Nov 2022 13:20:21 +0600 Subject: [PATCH 12/14] Fix typos --- doc/src/base/sort.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/doc/src/base/sort.md b/doc/src/base/sort.md index 653c321efd827..15abefaa43c52 100644 --- a/doc/src/base/sort.md +++ b/doc/src/base/sort.md @@ -172,14 +172,14 @@ true ``` !!! compat "Julia 1.9" - The `QuickSort` and `PartialQuickSort` are stable since Julia 1.9. + The `QuickSort` and `PartialQuickSort` algorithms are stable since Julia 1.9. `MergeSort` is an O(n log n) stable sorting algorithm but is not in-place – it requires a temporary array of half the size of the input array – and is typically not quite as fast as `QuickSort`. It is the default algorithm for non-numeric data. -The default sorting algorithms are chosen on the basis that they are stable and fast, or -*appear* to be fast. Usually, `QuickSort` is selected, but `InsertionSort` is preferred for +The default sorting algorithms are chosen on the basis that they are fast and stable, or +*appear* to be stable. Usually, `QuickSort` is selected, but `InsertionSort` is preferred for small data. You can also explicitly specify your preferred algorithm, e.g. `sort!(v, alg=PartialQuickSort(10:20))`. 
From 8b7ed9661ab22070d85353e5eea742a9399cba14 Mon Sep 17 00:00:00 2001 From: Lilith Orion Hafner Date: Mon, 21 Nov 2022 13:20:37 +0600 Subject: [PATCH 13/14] Adjust wording --- doc/src/base/sort.md | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/doc/src/base/sort.md b/doc/src/base/sort.md index 15abefaa43c52..89af50ca0d265 100644 --- a/doc/src/base/sort.md +++ b/doc/src/base/sort.md @@ -192,15 +192,16 @@ method from [`sort.jl`](https://github.com/JuliaLang/julia/blob/master/base/sort defalg(v::AbstractArray) = DEFAULT_STABLE ``` -You may change the default behavior for specific element type by, e.g., as in -[InlineStrings.jl](https://github.com/JuliaStrings/InlineStrings.jl/blob/v1.3.2/src/InlineStrings.jl#L903): +You may change the default behavior for specific types by defining new methods for `defalg`. +For example, [InlineStrings.jl](https://github.com/JuliaStrings/InlineStrings.jl/blob/v1.3.2/src/InlineStrings.jl#L903) +defines the following method: ```julia Base.Sort.defalg(::AbstractArray{<:Union{SmallInlineStrings, Missing}}) = InlineStringSort ``` !!! compat "Julia 1.9" The default sorting algorithm (returned by `Base.Sort.defalg`) is guaranteed - to be stable since Julia 1.9. + to be stable since Julia 1.9. Previous versions had unstable edge cases when sorting numeric arrays. ## Alternate orderings From bbcd8c17a9df3e74a0777f0a8fb6413379494d37 Mon Sep 17 00:00:00 2001 From: Petr Vana Date: Mon, 21 Nov 2022 10:23:29 +0100 Subject: [PATCH 14/14] Remove "*appear* to be stable." as slightly misleading. --- doc/src/base/sort.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/doc/src/base/sort.md b/doc/src/base/sort.md index 89af50ca0d265..e93d9716b1487 100644 --- a/doc/src/base/sort.md +++ b/doc/src/base/sort.md @@ -178,9 +178,9 @@ true array of half the size of the input array – and is typically not quite as fast as `QuickSort`. It is the default algorithm for non-numeric data. 
-The default sorting algorithms are chosen on the basis that they are fast and stable, or -*appear* to be stable. Usually, `QuickSort` is selected, but `InsertionSort` is preferred for -small data. You can also explicitly specify your preferred algorithm, e.g. +The default sorting algorithms are chosen on the basis that they are fast and stable. +Usually, `QuickSort` is selected, but `InsertionSort` is preferred for small data. +You can also explicitly specify your preferred algorithm, e.g. `sort!(v, alg=PartialQuickSort(10:20))`. The mechanism by which Julia picks default sorting algorithms is implemented via the