From 0031f748d229a8926ee8bda39acfcdecf5ec9e8e Mon Sep 17 00:00:00 2001
From: William Clodius <w.clodius@icloud.com>
Date: Sat, 10 Apr 2021 21:23:43 -0600
Subject: [PATCH 01/45] Start the addition of the module stdlib_sorting

Added the draft markdown documentation file, stdlib_sorting.md.

[ticket: X]
---
 doc/specs/stdlib_sorting.md | 526 ++++++++++++++++++++++++++++++++++++
 1 file changed, 526 insertions(+)
 create mode 100644 doc/specs/stdlib_sorting.md
diff --git a/doc/specs/stdlib_sorting.md b/doc/specs/stdlib_sorting.md
new file mode 100644
index 000000000..b843b89be
--- /dev/null
+++ b/doc/specs/stdlib_sorting.md
@@ -0,0 +1,526 @@
+---
+title: Sorting Procedures
+---
+
+# The `stdlib_sorting` module
+
+(TOC)
+
+## Overview of sorting
+
+The sorting of collections of data is useful in the analysis of those
+collections.
+With its absence of generics and limited polymorphism, it is
+impractical, in current Fortran, to provide sorting routines for
+arbitrary collections of arbitrary types of data.
+However Fortran's arrays are by far its most widely used collection,
+and arrays of arbitrary types of data can often be sorted in terms of
+a single component of intrinsic type.
+The Fortran Standard Library therefore provides a module,
+`stdlib_sorting`, with procedures to sort arrays of single intrinsic
+numeric types, i.e. the different kinds of integers and reals.
+
+## Overview of the module
+
+The module `stdlib_sorting` defines several public entities one
+default integer parameter, `int_size`, and three overloaded
+subroutines: `ORD_SORT`, `UNORD_SORT`, and `ORD_SORTING`. The
+overloaded subroutines also each have seven specific names for
+versions corresponding to differend types of array arguments.
+
+### The `int_size` parameter
+
+The `int_size` parameter is used to specify the kind of integer used
+in indexing the various arrays. Currently the module sets `int_size`
+to the value of `int64` from the intrinsic `ISO_FORTRAN_ENV` module.
+For many applications a value of `INT32` would be sufficient for
+addressing and would save some stack space for the subroutines,
+
+### The module subroutines
+
+The `stdlib_sorting` module provides three different overloaded
+subroutines intended to sort three different kinds of arrays of
+data:
+* `ORD_SORT` is intended to sort simple arrays of intrinsic data
+  that have significant sections that were partially ordered before
+  the sort; and
+* `ORD_SORTING` is intended to provide indices for sorting arrays of
+  derived type data, based on the ordering of an intrinsic component
+  of the derived type.
+* `UNORD_SORT` is intended to sort simple arrays of intrinsic data 
+  that are effectively unordered before the sort; 
+
+#### The `ORD_SORT` subroutine
+
+`ORD_SORT` is a translation of the [`rust sort` sorting algorithm]
+(https://github.com/rust-lang/rust/blob/90eb44a5897c39e3dff9c7e48e3973671dcd9496/src/liballoc/slice.rs)
+which in turn is inspired by the [`timsort` algorithm of Tim Peters]
+(http://svn.python.org/projects/python/trunk/Objects/listsort.txt).
+`ORD_SORT` is a hybrid stable comparison algorithm combining `merge sort`,
+and `insertion sort`. It has always at worst O(N Ln(N)) runtime
+performance in sorting random data, having a performance about 15-25%
+slower than `UNORD_SORT` on such data. However it has much better
+performance than `UNORD_SORT` on partially sorted data, having O(N)
+performance on uniformly increasing or decreasing data.
+
+The [`rust sort` implementation]
+(https://github.com/rust-lang/rust/blob/90eb44a5897c39e3dff9c7e48e3973671dcd9496/src/liballoc/slice.rs)
+is distributed with the header:
+
+    Copyright 2012-2015 The Rust Project Developers. See the COPYRIGHT
+    file at the top-level directory of this distribution and at
+    http://rust-lang.org/COPYRIGHT.
+
+    Licensed under the Apache License, Version 2.0 <LICENSE-APACHE or
+    http://www.apache.org/licenses/LICENSE-2.0> or the MIT license
+    <LICENSE-MIT or http://opensource.org/licenses/MIT>, at your
+    option. This file may not be copied, modified, or distributed
+    except according to those terms.
+
+so the license for the original code is compatible with the use of
+modified versions of the code in the Fortran Standard Library under
+the MIT license.
+
+As with `timsort`, `ORD_SORT` is a stable hybrid algorithm.
+It begins by traversing the array starting in its tail attempting to
+identify `runs` in the array, where a run is either a uniformly
+decreasing sequence, `ARRAY(i-1) > ARRAY(i)`, or non-decreasing,
+`ARRAY(i-1) <= ARRAY(i)`, sequence. Decreasing sequences are reversed.
+Then, if the sequence has less than `MIN_RUN` elements, previous
+elements in the array are added to the run using `insertion sort`
+until the run contains `MIN_RUN` elements or the array is completely
+processed. As each run is identified the start and length of the run
+is then pushed onto a stack and the stack is then processed using
+`merge` until it obeys the stack invariants:
+
+1. len(i-2) > len(i-1) + len(i)
+2. len(i-1) > len(i)
+
+ensuring that processing the stack is, at worst, of order `O(N
+Ln(N))`. However, because of the identification of decreasing and
+non-decreasing runs, processing of structured data can be much faster,
+with processing of uniformly decreasing or non-decreasing arrays being
+of order O(N). The result in our tests is that `ORD_SORT` is about
+15-25% slower than `UNORD_SORT` on purely random data, depending on
+the compiler, but can be more than an order of magnitude faster than
+`UNORD_SORT` on highly structured data. As a modified `merge sort`,
+`ORD_SORT` requires the use of a "scratch" array, that may be provided
+as an optional `work` argument or allocated internally on the stack.
+
+#### The `ORD_SORTING` subroutine
+
+The `UNORD_SORT` and `ORD_SORT` subroutines can sort isolated arrays
+of intrinsic types, but do nothing for the sorting of arrays of
+derived types. For arrays of derived types what is useful is an array
+of indices that maps the original array to an array sorted based on the
+value of a component of the derived type. For such a sort, a stable
+sort is useful, therefore the module provides a subroutine,
+`ORD_SORTING`, that generates such an array of indices based on
+the `ORD_SORT` algorithm.
+
+As `ORD_SORT` is also based on the `rust sort` algorithm the `rust
+sort` license must be acknowledged:
+
+    Copyright 2012-2015 The Rust Project Developers. See the COPYRIGHT
+    file at the top-level directory of this distribution and at
+    http://rust-lang.org/COPYRIGHT.
+
+    Licensed under the Apache License, Version 2.0 <LICENSE-APACHE or
+    http://www.apache.org/licenses/LICENSE-2.0> or the MIT license
+    <LICENSE-MIT or http://opensource.org/licenses/MIT>, at your
+    option. This file may not be copied, modified, or distributed
+    except according to those terms.
+
+noting that the Fortran Standard Library is released under the MIT
+license so that incorporation of the `rust sort` algorithm is
+compatible with its license.
+
+The logic of `ORD_SORTING` parallels that of `ORD_SORT`, with
+additional housekeeping to keep the array of indices consistent with
+the sorted positions of the input array. Because of this additional
+housekeeping it has slower runtime performance than `ORD_SORT`.
+`ORD_SORTING` requires the use of two "scratch" arrays, that may be
+provided as optional `work` and `iwork` arguments or allocated
+internally on the stack.
+
+#### The `UNORD_SORT` subroutines 
+
+`UNORD_SORT` uses the [`introsort` sorting algorithm of David Musser]
+(http://www.cs.rpi.edu/~musser/gp/introsort.ps). `introsort` is a hybrid 
+unstable comparison algorithm combining `quicksort`, `insertion sort`,
+and `heap sort`. While this algorithm's runtime performance is always
+O(N Ln(N)), it is relatively fast on randomly ordered data, but
+inconsistent in performance on partly sorted data. David Musser has
+given permission to include a variant of `introsort` in the Fortran
+Standard Library under the MIT license provided we cite: 
+
+    Musser, D.R., “Introspective Sorting and Selection Algorithms,”
+    Software—Practice and Experience, Vol. 27(8), 983–993 (August 1997). 
+
+as the official source of the algorithm. 
+
+As with `introsort`, `UNORD_SORT` is an unstable hybrid algorithm.
+First it examines the array and estimates the depth of recursion a
+quick sort would require for ideal (random) data, `D =
+Ceiling(Ln(N)/Ln(2))`. It then defines a limit to the number of
+quicksort recursions to be allowed in processing,
+`D_limit = factor * D`, where factor is currently 2, and
+calls `introsort` proper. `introsort` proper then:
+
+1. Examines the number of elements remaining to be sorted, and, if
+   they are less than 16, sorts them using insertion sort and returns;
+2. If they are not less than 16, checks whether the current depth of
+   recursion exceeds `D_limit` and, if it does, processes the remaining
+   elements with heap sort and returns;
+3. If the current depth of recursion does not exceed `D_limit`, then
+   in effect does a quicksort step:
+
+    * Partitions the remaining array using a median of three,
+    * Calls introsort proper on the leftmost partition,
+    * Calls introsort proper on the rightmost partition, and then
+      returns.
+
+The resulting algorithm is of order O(N Ln(N)) run time
+performance for all inputs. Because it relies on `quicksort`, the
+coefficient of the O(N Ln(N)) behavior is typically small compared to
+other sorting algorithms on random data. On partially sorted data it
+can show either slower `heap sort` performance, or enhanced
+performance by up to a factor of six. Still, even when it shows
+enhanced performance, its performance on partially sorted data is
+typically an order of magnitude slower then `ORD_SORT`.
+
+### Tentative specifications of the `stdlib_sorting` procedures
+
+#### `ord_sort` - sorts an input array
+
+##### Status
+
+Experimental
+
+##### Description
+
+Returns an input `array` with the elements sorted in order of
+increasing value.
+
+##### Syntax
+
+`call [[stdlib_sorting(module):ord_sort(subroutine)]]ord_sort ( array[, work ] )`
+
+##### Class
+
+Generic subroutine.
+
+##### Arguments
+
+`array` : shall be a rank one array of any of the types:
+`integer(int8)`, `integer(int16)`, `integer(int32)`, `integer(int64)`,
+`real(real32)`, `real(real64)`, or `real(real128)`. It is an
+`intent(inout)` argument. On input it is the array to be sorted. If
+both the type of `array` is real and at least one of the elements is a
+`NaN`, then the ordering of the result is undefined. Otherwise on
+return its elements will be sorted in order of non-decreasing value.
+
+`work` (optional): shall be a rank one array of any of the same type as
+array, and shall have at least `size(array)/2` elements. It is an
+`intent(inout)` argument. It is intended to be used as "scratch"
+memory for internal record keeping. If associated with an array in
+static storage, its use can significantly reduce the stack memory
+requirements for the code. Its contents on return are undefined.
+
+##### Notes
+
+`ORD_SORT` implements a hybrid sorting algorithm combining
+`merge sort`, and `insertion sort`. For most purposes it behaves like
+a `merge sort`, providing worst case `O(N Ln(N))` run time performance
+for most random arrays, that is typically slower than `UNORD_SORT`.
+However, if the array has significant runs of decreasing or
+non-decreasing values, performance can be much better than
+`UNORD_SORT`, with `O(N)` behavior on uniformly decreasing, or
+non-decreasing arrays. The optional `work` array replaces "scratch"
+memory that would otherwise be allocated on the stack. If `array` is of
+any type `REAL` the order of its elements on return undefined if any
+element of `array` is a `NaN`.
+
+
+##### Example
+
+```Fortran
+	...
+	! Read arrays from sorted files
+	call read_sorted_file( 'dummy_file1', array1 )
+	call read_sorted_file( 'dummy_file2', array2 )
+	! Concatenate the arrays
+	allocate( array( size(array1) + size(array2) ) )
+	array( 1:size(array1) ) = array1(:)
+	array( size(array1)+1:size(array1)+size(array2) ) = array2(:)
+	! Sort the resulting array
+	call ord_sort( array, work )
+	! Process the sorted array
+	call array_search( array, values )
+	...
+```
+
+#### `ord_sorting` - creates an arry of sorting indices for an input array.
+
+##### Status
+
+Experimental
+
+##### Description
+
+Returns an integer array whose elements would sort the input array in
+the specified direction retaining order stability.
+
+##### Syntax
+
+`call [[stdlib_sorting(module):ord_sorting(subroutine)]]ord_sorting ( array, index[, work, iwork, reverse, zero_based ] )`
+
+##### Class
+
+Generic subroutine.
+
+##### Arguments
+
+`array`: shall be a rank one array of any of the types:
+`integer(int8)`, `integer(int16)`, `integer(int32)`, `integer(int64)`,
+`real(real32)`, `real(real64)`, or `real(real128)`. It is an
+`intent(inout)` argument. On input it will be an array whose sorting
+indices are to be determined. On return it will be the sorted
+array.
+
+`index`: shall be a rank one integer array of kind `int_size` and of
+the size of `array`. It is an `intent(out)` argument. On return it
+shall have values that are the indices needed to sort the original
+array in the desired direction.
+
+`work` (optional): shall be a rank one array of any of the same type as
+array, and shall have at least `size(array)/2` elements. It is an
+`intent(inout)` argument. It is intended to be used as "scratch"
+memory for internal record keeping. If associated with an array in
+static storage, its use can significantly reduce the stack memory
+requirements for the code. Its contents on return are undefined.
+
+`iwork` (optional): shall be a rank one integer array of kind
+`int_size`, and shall have at least `size(array)/2` elements. It
+is an `intent(inout)` argument. Its contents on return are undefined.
+
+`reverse` (optional): shall be a scalar of type default logical. It
+is an `intent(in)` argument. If present with a value of `.true.` then
+`index` will sort `array` in order of non-increasing values in stable
+order. Otherwise index will sort `array` in order of non-decreasing
+values in stable order.
+
+`zero_based` (optional): shall be a scalar of type default logical.
+It is an `intent(in)` argument. If present with the value `.true.`
+the values of `index` will be for zero based array indexing,
+otherwise the values of index will be for one's based array
+indexing.
+
+##### Notes
+
+`ORD_SORTING` implements the hybrid sorting algorithm of `ORD_SORT`,
+keeping the values of `index` consistent with the elements of `array`
+as it is sorted. As a `merge sort` based algorithm, it is a stable
+sorting comparison algorithm. The optional `work` and `iwork` arrays
+replace "scratch" memory that would otherwise be allocated on the
+stack. If `array` is of any kind of `REAL` the order of the elements in
+`index` and `array` on return are undefined if any element of `array`
+is a `NaN`. It should be emphasized that the order of `array` will
+typically be different on return.
+
+
+##### Example
+
+```Fortran
+	subroutine sort_a_data( a_data, a, work, index, iwork )
+	    ! Sort `a_data` in terms or its component `a`
+	    type(a_type), intent(inout)      :: a_data(:)
+		integer(int32), intent(inout)    :: a(:)
+		integer(int32), intent(inout)    :: work(:)
+		integer(int_size), intent(inout) :: index(:)
+		integer(int_size), intent(inout) :: iwork(:)
+		! Extract a component of `a_data`
+		a(1:size(a_data)) = a_data(:) % a
+		! Find the indices to sort the component
+		call ord_sorting(a(1:size(a_data)), index(1:size(a_data)),&
+		    work(1:size(a_data)/2), iwork(1:size(a_data)/2))
+		! Sort a_data based on the sorting of that component
+		a_data(:) = a_data( index(1:size(a_data)) )
+	end subroutine sort_a_data
+```
+
+#### `unord_sort` - sorts an input array
+
+##### Status
+
+Experimental
+
+##### Description
+
+Returns an input array with the elements sorted in order of increasing
+value.
+
+##### Syntax
+
+`call [[stdlib_sorting(module):unord_sort(subroutine)]]unord_sort ( array )`
+
+##### Class
+
+Pure generic subroutine.
+
+##### Arguments
+
+`array` : shall be a rank one array of any of the types:
+`integer(int8)`, `integer(int16)`, `integer(int32)`, `integer(int64)`,
+`real(real32)`, `real(real64)`, or `real(real128)`. It
+is an `intent(inout)` argument. On return its input elements will be
+sorted in order of non-decreasing value.
+
+##### Notes
+
+`UNORD_SORT` implements a hybrid sorting algorithm combining
+`quicksort`, `merge sort`, and `insertion sort`. For most purposes it
+behaves like a `quicksort` with a median of three partition, providing
+good, `O(N Ln(N))`, run time performance for most random arrays, but
+defaulting to `merge sort` if the structure of the array results in
+the `quicksort` not converging as rapidly as expected. If `array` is of
+any type `REAL`, the behavior of the sorting is undefined if any
+element of `array` is a `NaN`.
+
+##### Example
+
+```Fortran
+	...
+	! Read random data from a file
+	call read_file( 'dummy_file', array )
+	! Sort the random data
+	call unord_sort( array )
+	! Process the sorted data
+	call array_search( array, values )
+	...
+```
+
+#### Specific procedures
+
+Usually the name of a generic procedure is the most convenient way of
+invoking it. However sometimes it is necessary to pass a procedure as
+an argument to another procedure. In that case it is usually necessary
+to know the name of the specific procedure desired. The following
+table lists the specific subroutines and the corresponding types of
+their `array` arguments.
+
+| Generic Subroutine | Specific Subroutine | Array type      |
+|--------------------|---------------------|-----------------|
+| `ORD_SORT`         | `INT8_ORD_SORT`     | `INTEGER(INT8)` |
+|                    | `INT16_ORD_SORT`    | `INTEGER(INT16)` |
+|                    | `INT32_ORD_SORT`    | `INTEGER(INT32)` |
+|                    | `INT64_ORD_SORT`    | `INTEGER(INT64)` |
+|                    | `SP_ORD_SORT`    | `REAL(REAL32)` |
+|                    | `DP_ORD_SORT`    | `REAL(REAL64)` |
+|                    | `QP_ORD_SORT`    | `REAL(REAL128)` |
+| `ORD_SORTING`      | `INT8_ORD_SORTING`     | `INTEGER(INT8)` |
+|                    | `INT16_ORD_SORTING`    | `INTEGER(INT16)` |
+|                    | `INT32_ORD_SORTING`    | `INTEGER(INT32)` |
+|                    | `INT64_ORD_SORTING`    | `INTEGER(INT64)` |
+|                    | `SP_ORD_SORTING`    | `REAL(REAL32)` |
+|                    | `DP_ORD_SORTING`    | `REAL(REAL64)` |
+|                    | `QP_ORD_SORTING`    | `REAL(REAL128)` |
+| `UNORD_SORT`         | `INT8_UNORD_SORT`     | `INTEGER(INT8)` |
+|                    | `INT16_UNORD_SORT`    | `INTEGER(INT16)` |
+|                    | `INT32_UNORD_SORT`    | `INTEGER(INT32)` |
+|                    | `INT64_UNORD_SORT`    | `INTEGER(INT64)` |
+|                    | `SP_UNORD_SORT`    | `REAL(REAL32)` |
+|                    | `DP_UNORD_SORT`    | `REAL(REAL64)` |
+|                    | `QP_UNORD_SORT`    | `REAL(REAL128)` |
+
+
+### Performance benchmarks
+
+We have performed benchmarks of the procedures on nine different
+arrays each of size `2**20`:
+
+* Blocks - the array is divided into siz blocks, each of distinct
+  uniformly increasing integers. 
+* Decreasing - values decrease uniformly from `2**20-1` to `0`. 
+* Identical - all integers have the same value of 10.
+* Increasing - values increase uniformly from `0` to `2**20-1`.
+* Random dense - the integers are generated randomly from a set of 
+  values from `0` to `2**18-1` so duplicates are dense. 
+* Random order - a set of integers from `0` to `2**20 - 1` in random 
+  order. 
+* Random sparse - the integers are generated randomly from a set of
+  values from `0` to `2**22-1` so duplicates are sparse.
+* Random-3 - the increasing array has 3 random exchanges of individual
+  elements.
+* Random-10 - the final ten elements of the increasing array are
+  replaced by random values.
+
+These benchmarks have been performed on two different processors, both
+on a MacBook Pro, featuring a 2.3 GHz Quad-Core Intel Core i5, with 8
+GB 2133 MHz LPDDR3 memory. The first processor was GNU Fortran
+(GCC) 10.2.0, with the following results:
+
+|   Type  | Elements |   Order       |    Method     | Time (s)  |
+|---------|----------|---------------|---------------|-----------|
+| Integer | 1048576  |        Blocks |      Ord_Sort |   0.00692 |
+| Integer | 1048576  |    Decreasing |      Ord_Sort |   0.00377 |
+| Integer | 1048576  |     Identical |      Ord_Sort |   0.00236 |
+| Integer | 1048576  |    Increasing |      Ord_Sort |   0.00205 |
+| Integer | 1048576  |  Random dense |      Ord_Sort |   0.16675 |
+| Integer | 1048576  |  Random order |      Ord_Sort |   0.16780 |
+| Integer | 1048576  | Random sparse |      Ord_Sort |   0.16715 |
+| Integer | 1048576  |      Random 3 |      Ord_Sort |   0.00907 |
+| Integer | 1048576  |     Random 10 |      Ord_Sort |   0.00435 |
+| Integer | 1048576  |        Blocks |   Ord_Sorting |   0.01177 |
+| Integer | 1048576  |    Decreasing |   Ord_Sorting |   0.00695 |
+| Integer | 1048576  |     Identical |   Ord_Sorting |   0.00438 |
+| Integer | 1048576  |    Increasing |   Ord_Sorting |   0.00429 |
+| Integer | 1048576  |  Random dense |   Ord_Sorting |   0.19079 |
+| Integer | 1048576  |  Random order |   Ord_Sorting |   0.19242 |
+| Integer | 1048576  | Random sparse |   Ord_Sorting |   0.19212 |
+| Integer | 1048576  |      Random 3 |   Ord_Sorting |   0.01504 |
+| Integer | 1048576  |     Random 10 |   Ord_Sorting |   0.00794 |
+| Integer | 1048576  |        Blocks |    Unord_Sort |   0.20889 |
+| Integer | 1048576  |    Decreasing |    Unord_Sort |   0.12826 |
+| Integer | 1048576  |     Identical |    Unord_Sort |   0.15646 |
+| Integer | 1048576  |    Increasing |    Unord_Sort |   0.05012 |
+| Integer | 1048576  |  Random dense |    Unord_Sort |   0.14359 |
+| Integer | 1048576  |  Random order |    Unord_Sort |   0.14613 |
+| Integer | 1048576  | Random sparse |    Unord_Sort |   0.15159 |
+| Integer | 1048576  |      Random 3 |    Unord_Sort |   0.13826 |
+| Integer | 1048576  |     Random 10 |    Unord_Sort |   0.35356 |
+
+The second processor was ifort (IFORT) 18.0.3 20180410, with the
+following results:
+
+|   Type  | Elements |   Order       |    Method     | Time (s)  |
+|---------|----------|---------------|---------------|-----------|
+| Integer | 1048576  |        Blocks |      Ord_Sort |   0.00231 |
+| Integer | 1048576  |    Decreasing |      Ord_Sort |   0.00123 |
+| Integer | 1048576  |     Identical |      Ord_Sort |   0.00088 |
+| Integer | 1048576  |    Increasing |      Ord_Sort |   0.00088 |
+| Integer | 1048576  |  Random dense |      Ord_Sort |   0.08914 |
+| Integer | 1048576  |  Random order |      Ord_Sort |   0.08779 |
+| Integer | 1048576  | Random sparse |      Ord_Sort |   0.08357 |
+| Integer | 1048576  |      Random 3 |      Ord_Sort |   0.00335 |
+| Integer | 1048576  |     Random 10 |      Ord_Sort |   0.00196 |
+| Integer | 1048576  |        Blocks |   Ord_Sorting |   0.00519 |
+| Integer | 1048576  |    Decreasing |   Ord_Sorting |   0.00240 |
+| Integer | 1048576  |     Identical |   Ord_Sorting |   0.00110 |
+| Integer | 1048576  |    Increasing |   Ord_Sorting |   0.00113 |
+| Integer | 1048576  |  Random dense |   Ord_Sorting |   0.10185 |
+| Integer | 1048576  |  Random order |   Ord_Sorting |   0.10222 |
+| Integer | 1048576  | Random sparse |   Ord_Sorting |   0.10376 |
+| Integer | 1048576  |      Random 3 |   Ord_Sorting |   0.00628 |
+| Integer | 1048576  |     Random 10 |   Ord_Sorting |   0.00264 |
+| Integer | 1048576  |        Blocks |    Unord_Sort |   0.07272 |
+| Integer | 1048576  |    Decreasing |    Unord_Sort |   0.03873 |
+| Integer | 1048576  |     Identical |    Unord_Sort |   0.03677 |
+| Integer | 1048576  |    Increasing |    Unord_Sort |   0.01153 |
+| Integer | 1048576  |  Random dense |    Unord_Sort |   0.06706 |
+| Integer | 1048576  |  Random order |    Unord_Sort |   0.06808 |
+| Integer | 1048576  | Random sparse |    Unord_Sort |   0.06798 |
+| Integer | 1048576  |      Random 3 |    Unord_Sort |   0.07230 |
+| Integer | 1048576  |     Random 10 |    Unord_Sort |   0.13640 |
+
+

From 56fd86f8ea832839ee9f89fcb6a5cb2efebbf7b9 Mon Sep 17 00:00:00 2001
From: William Clodius <w.clodius@icloud.com>
Date: Sun, 11 Apr 2021 11:27:01 -0600
Subject: [PATCH 02/45] Removed trailing whitespace

Removed trailing whitespace from the markdown document, stdlib_sorting.md.

[ticket: X]
---
 doc/specs/stdlib_sorting.md | 26 +++++++++++++-------------
 1 file changed, 13 insertions(+), 13 deletions(-)

diff --git a/doc/specs/stdlib_sorting.md b/doc/specs/stdlib_sorting.md
index b843b89be..dcd6058e8 100644
--- a/doc/specs/stdlib_sorting.md
+++ b/doc/specs/stdlib_sorting.md
@@ -47,8 +47,8 @@ data:
 * `ORD_SORTING` is intended to provide indices for sorting arrays of
   derived type data, based on the ordering of an intrinsic component
   of the derived type.
-* `UNORD_SORT` is intended to sort simple arrays of intrinsic data 
-  that are effectively unordered before the sort; 
+* `UNORD_SORT` is intended to sort simple arrays of intrinsic data
+  that are effectively unordered before the sort;
 
 #### The `ORD_SORT` subroutine
 
@@ -143,21 +143,21 @@ housekeeping it has slower runtime performance than `ORD_SORT`.
 provided as optional `work` and `iwork` arguments or allocated
 internally on the stack.
 
-#### The `UNORD_SORT` subroutines 
+#### The `UNORD_SORT` subroutines
 
 `UNORD_SORT` uses the [`introsort` sorting algorithm of David Musser]
-(http://www.cs.rpi.edu/~musser/gp/introsort.ps). `introsort` is a hybrid 
+(http://www.cs.rpi.edu/~musser/gp/introsort.ps). `introsort` is a hybrid
 unstable comparison algorithm combining `quicksort`, `insertion sort`,
 and `heap sort`. While this algorithm's runtime performance is always
 O(N Ln(N)), it is relatively fast on randomly ordered data, but
 inconsistent in performance on partly sorted data. David Musser has
 given permission to include a variant of `introsort` in the Fortran
-Standard Library under the MIT license provided we cite: 
+Standard Library under the MIT license provided we cite:
 
     Musser, D.R., “Introspective Sorting and Selection Algorithms,”
-    Software—Practice and Experience, Vol. 27(8), 983–993 (August 1997). 
+    Software—Practice and Experience, Vol. 27(8), 983–993 (August 1997).
 
-as the official source of the algorithm. 
+as the official source of the algorithm.
 
 As with `introsort`, `UNORD_SORT` is an unstable hybrid algorithm.
 First it examines the array and estimates the depth of recursion a
@@ -440,14 +440,14 @@ We have performed benchmarks of the procedures on nine different
 arrays each of size `2**20`:
 
 * Blocks - the array is divided into siz blocks, each of distinct
-  uniformly increasing integers. 
-* Decreasing - values decrease uniformly from `2**20-1` to `0`. 
+  uniformly increasing integers.
+* Decreasing - values decrease uniformly from `2**20-1` to `0`.
 * Identical - all integers have the same value of 10.
 * Increasing - values increase uniformly from `0` to `2**20-1`.
-* Random dense - the integers are generated randomly from a set of 
-  values from `0` to `2**18-1` so duplicates are dense. 
-* Random order - a set of integers from `0` to `2**20 - 1` in random 
-  order. 
+* Random dense - the integers are generated randomly from a set of
+  values from `0` to `2**18-1` so duplicates are dense.
+* Random order - a set of integers from `0` to `2**20 - 1` in random
+  order.
 * Random sparse - the integers are generated randomly from a set of
   values from `0` to `2**22-1` so duplicates are sparse.
 * Random-3 - the increasing array has 3 random exchanges of individual

From 9b28d8fb44c134c21ab242ec909d6498a472d442 Mon Sep 17 00:00:00 2001
From: William Clodius <w.clodius@icloud.com>
Date: Sun, 11 Apr 2021 11:30:18 -0600
Subject: [PATCH 03/45] Added stdlib_sorting to the index

Added a reference to stdlib_sorting.md to index.md

[ticket: X]
---
 doc/specs/index.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/doc/specs/index.md b/doc/specs/index.md
index 098f2ce21..4238dc35e 100644
--- a/doc/specs/index.md
+++ b/doc/specs/index.md
@@ -19,6 +19,7 @@ This is and index/directory of the specifications (specs) for each new module/fe
  - [logger](./stdlib_logger.html) - Runtime logging system
  - [optval](./stdlib_optval.html) - Fallback value for optional arguments
  - [quadrature](./stdlib_quadrature.html) - Numerical integration
+ - [sorting](.stdlib_sorting.html) - Sorting of numerical arrays
  - [stats](./stdlib_stats.html) - Descriptive Statistics
  - [stats_distribution_PRNG](./stdlib_stats_distribution_PRNG.html) - Probability Distributions random number generator
  - [string\_type](./stdlib_string_type.html) - Basic string support

From f2944224db1bad9de14d09fcc70f77223ded0da0 Mon Sep 17 00:00:00 2001
From: William Clodius <w.clodius@icloud.com>
Date: Mon, 12 Apr 2021 20:36:52 -0600
Subject: [PATCH 04/45] Incorproated suggestions by Jeremie Vandenplas

Added a licnsing section. Changed the description of `ORD_SORT`, `ORD_SORTING`,
and `UNORD_SORT`. Added more examples for `ORD_SORTING`. Simplified the
discussion of `INT_SIZE`. Used `stdlib_kinds` instead of `iso_fortran_env`.
Changed `processor` to `compiler`. Fixed spelling of array.

[ticket: X]
---
 doc/specs/stdlib_sorting.md | 200 ++++++++++++++++++++++--------------
 1 file changed, 122 insertions(+), 78 deletions(-)

diff --git a/doc/specs/stdlib_sorting.md b/doc/specs/stdlib_sorting.md
index dcd6058e8..5ea6da868 100644
--- a/doc/specs/stdlib_sorting.md
+++ b/doc/specs/stdlib_sorting.md
@@ -32,9 +32,7 @@ versions corresponding to differend types of array arguments.
 
 The `int_size` parameter is used to specify the kind of integer used
 in indexing the various arrays. Currently the module sets `int_size`
-to the value of `int64` from the intrinsic `ISO_FORTRAN_ENV` module.
-For many applications a value of `INT32` would be sufficient for
-addressing and would save some stack space for the subroutines,
+to the value of `int64` from the `stdlib_kinds` module.
 
 ### The module subroutines
 
@@ -43,29 +41,26 @@ subroutines intended to sort three different kinds of arrays of
 data:
 * `ORD_SORT` is intended to sort simple arrays of intrinsic data
   that have significant sections that were partially ordered before
-  the sort; and
+  the sort;
 * `ORD_SORTING` is intended to provide indices for sorting arrays of
   derived type data, based on the ordering of an intrinsic component
-  of the derived type.
+  of the derived type; and
 * `UNORD_SORT` is intended to sort simple arrays of intrinsic data
-  that are effectively unordered before the sort;
+  that are effectively unordered before the sort.
 
-#### The `ORD_SORT` subroutine
+#### Licensing
 
-`ORD_SORT` is a translation of the [`rust sort` sorting algorithm]
-(https://github.com/rust-lang/rust/blob/90eb44a5897c39e3dff9c7e48e3973671dcd9496/src/liballoc/slice.rs)
-which in turn is inspired by the [`timsort` algorithm of Tim Peters]
-(http://svn.python.org/projects/python/trunk/Objects/listsort.txt).
-`ORD_SORT` is a hybrid stable comparison algorithm combining `merge sort`,
-and `insertion sort`. It has always at worst O(N Ln(N)) runtime
-performance in sorting random data, having a performance about 15-25%
-slower than `UNORD_SORT` on such data. However it has much better
-performance than `UNORD_SORT` on partially sorted data, having O(N)
-performance on uniformly increasing or decreasing data.
+The Fortran Standard Library is distributed under the MIT
+License. However components of the library may be based on code with
+additional licensing restriction. In particular `ORD_SORT`,
+`ORD_SORTING`, and `UNORD_SORT` are translations of codes with their
+own distribution restrictions.
 
-The [`rust sort` implementation]
-(https://github.com/rust-lang/rust/blob/90eb44a5897c39e3dff9c7e48e3973671dcd9496/src/liballoc/slice.rs)
-is distributed with the header:
+The `ORD_SORT` and `ORD_SORTING` subroutines are essentially
+translations to Fortran 2008 of the `"rust" sort` of the Rust Lsnguage
+distributed as part of
+[`slice.rs`](https://github.com/rust-lang/rust/blob/90eb44a5897c39e3dff9c7e48e3973671dcd9496/src/liballoc/slice.rs).
+The header of the `slice.rs` file has as its licensing requirements:
 
     Copyright 2012-2015 The Rust Project Developers. See the COPYRIGHT
     file at the top-level directory of this distribution and at
@@ -77,19 +72,48 @@ is distributed with the header:
     option. This file may not be copied, modified, or distributed
     except according to those terms.
 
-so the license for the original code is compatible with the use of
+so the license for the `slice.rs` code is compatible with the use of
 modified versions of the code in the Fortran Standard Library under
 the MIT license.
 
-As with `timsort`, `ORD_SORT` is a stable hybrid algorithm.
-It begins by traversing the array starting in its tail attempting to
-identify `runs` in the array, where a run is either a uniformly
-decreasing sequence, `ARRAY(i-1) > ARRAY(i)`, or non-decreasing,
-`ARRAY(i-1) <= ARRAY(i)`, sequence. Decreasing sequences are reversed.
-Then, if the sequence has less than `MIN_RUN` elements, previous
-elements in the array are added to the run using `insertion sort`
-until the run contains `MIN_RUN` elements or the array is completely
-processed. As each run is identified the start and length of the run
+The `UNORD_SORT` subroutine is essentially a translation to Fortran
+2008 of the
+[`introsort`]((http://www.cs.rpi.edu/~musser/gp/introsort.ps) of David
+Musser.  David Musser has given permission to include a variant of
+`introsort` in the Fortran Standard Library under the MIT license
+provided we cite:
+
+    Musser, D.R., “Introspective Sorting and Selection Algorithms,”
+    Software—Practice and Experience, Vol. 27(8), 983–993 (August 1997).
+
+as the official source of the algorithm.
+
+
+#### The `ORD_SORT` subroutine
+
+`ORD_SORT` is a translation of the `"Rust" sort` sorting algorithm
+contained in [`slice.rs`]
+(https://github.com/rust-lang/rust/blob/90eb44a5897c39e3dff9c7e48e3973671dcd9496/src/liballoc/slice.rs).
+`"Rust" sort`, in turn, is inspired by the [`timsort` algorithm]
+(http://svn.python.org/projects/python/trunk/Objects/listsort.txt)
+that Tim Peters created for the Python Language.
+`ORD_SORT` is a hybrid stable comparison algorithm combining `merge sort`,
+and `insertion sort`. It has always at worst O(N Ln(N)) runtime
+performance in sorting random data, having a performance about 15-25%
+slower than `UNORD_SORT` on such data. However it has much better
+performance than `UNORD_SORT` on partially sorted data, having O(N)
+performance on uniformly increasing or decreasing data.
+
+
+`ORD_SORt` begins by traversing the array starting in its tail
+attempting to identify `runs` in the array, where a run is either a
+uniformly decreasing sequence, `ARRAY(i-1) > ARRAY(i)`, or a
+non-decreasing, `ARRAY(i-1) <= ARRAY(i)`, sequence. Once deliminated
+decreasing sequences are reversed in their order. Then, if the
+sequence has less than `MIN_RUN` elements, previous elements in the
+array are added to the run using `insertion sort` until the run
+contains `MIN_RUN` elements or the array is completely processed. As
+each run is identified the start and length of the run 
 is then pushed onto a stack and the stack is then processed using
 `merge` until it obeys the stack invariants:
 
@@ -101,39 +125,22 @@ Ln(N))`. However, because of the identification of decreasing and
 non-decreasing runs, processing of structured data can be much faster,
 with processing of uniformly decreasing or non-decreasing arrays being
 of order O(N). The result in our tests is that `ORD_SORT` is about
-15-25% slower than `UNORD_SORT` on purely random data, depending on
-the compiler, but can be more than an order of magnitude faster than
-`UNORD_SORT` on highly structured data. As a modified `merge sort`,
-`ORD_SORT` requires the use of a "scratch" array, that may be provided
-as an optional `work` argument or allocated internally on the stack.
+25% slower than `UNORD_SORT` on purely random data, depending on
+the compiler, but can be `Ln(N)` faster than `UNORD_SORT` on highly
+structured data. As a modified `merge sort`, `ORD_SORT` requires the
+use of a "scratch" array, that may be provided as an optional `work`
+argument or allocated internally on the stack.
 
 #### The `ORD_SORTING` subroutine
 
-The `UNORD_SORT` and `ORD_SORT` subroutines can sort isolated arrays
-of intrinsic types, but do nothing for the sorting of arrays of
-derived types. For arrays of derived types what is useful is an array
-of indices that maps the original array to an array sorted based on the
-value of a component of the derived type. For such a sort, a stable
-sort is useful, therefore the module provides a subroutine,
-`ORD_SORTING`, that generates such an array of indices based on
-the `ORD_SORT` algorithm.
-
-As `ORD_SORT` is also based on the `rust sort` algorithm the `rust
-sort` license must be acknowledged:
-
-    Copyright 2012-2015 The Rust Project Developers. See the COPYRIGHT
-    file at the top-level directory of this distribution and at
-    http://rust-lang.org/COPYRIGHT.
-
-    Licensed under the Apache License, Version 2.0 <LICENSE-APACHE or
-    http://www.apache.org/licenses/LICENSE-2.0> or the MIT license
-    <LICENSE-MIT or http://opensource.org/licenses/MIT>, at your
-    option. This file may not be copied, modified, or distributed
-    except according to those terms.
-
-noting that the Fortran Standard Library is released under the MIT
-license so that incorporation of the `rust sort` algorithm is
-compatible with its license.
+The `UNORD_SORT` and `ORD_SORT` subroutines can sort rank 1 isolated
+arrays of intrinsic types, but do nothing for the coordinated sorting
+of related data, e.g., multiple related rank 1 arrays, higher rank
+arrays, or arrays of derived types. For such related data, what is
+useful is an array of indices that maps a rank 1 array to its sorted
+form. For such a sort, a stable sort is useful, therefore the module
+provides a subroutine, `ORD_SORTING`, that generates such an array of
+indices based on the `ORD_SORT` algorithm.
 
 The logic of `ORD_SORTING` parallels that of `ORD_SORT`, with
 additional housekeeping to keep the array of indices consistent with
@@ -145,19 +152,12 @@ internally on the stack.
 
 #### The `UNORD_SORT` subroutines
 
-`UNORD_SORT` uses the [`introsort` sorting algorithm of David Musser]
-(http://www.cs.rpi.edu/~musser/gp/introsort.ps). `introsort` is a hybrid
-unstable comparison algorithm combining `quicksort`, `insertion sort`,
-and `heap sort`. While this algorithm's runtime performance is always
-O(N Ln(N)), it is relatively fast on randomly ordered data, but
-inconsistent in performance on partly sorted data. David Musser has
-given permission to include a variant of `introsort` in the Fortran
-Standard Library under the MIT license provided we cite:
-
-    Musser, D.R., “Introspective Sorting and Selection Algorithms,”
-    Software—Practice and Experience, Vol. 27(8), 983–993 (August 1997).
-
-as the official source of the algorithm.
+`UNORD_SORT` uses the `introsort` sorting algorithm of David Musser.
+`introsort` is a hybrid unstable comparison algorithm combining
+`quicksort`, `insertion sort`, and `heap sort`. While this algorithm's
+runtime performance is always O(N Ln(N)), it is relatively fast on
+randomly ordered data, but inconsistent in performance on partly
+sorted data.as the official source of the algorithm.
 
 As with `introsort`, `UNORD_SORT` is an unstable hybrid algorithm.
 First it examines the array and estimates the depth of recursion a
@@ -260,7 +260,7 @@ element of `array` is a `NaN`.
 	...
 ```
 
-#### `ord_sorting` - creates an arry of sorting indices for an input array.
+#### `ord_sorting` - creates an array of sorting indices for an input array.
 
 ##### Status
 
@@ -329,8 +329,52 @@ is a `NaN`. It should be emphasized that the order of `array` will
 typically be different on return.
 
 
-##### Example
+##### Examples
+
+Sorting a related rank one array:
+
+```Fortran
+	subroutine sort_related_data( a, b, work, index, iwork )
+	    ! Sort `b` in terms or its related array `a`
+	    integer, intent(inout)           :: a(:)
+		integer(int32), intent(inout)    :: b(:) ! The same size as a
+		integer(int32), intent(inout)    :: work(:)
+		integer(int_size), intent(inout) :: index(:)
+		integer(int_size), intent(inout) :: iwork(:)
+		! Find the indices to sort a
+		call ord_sorting(a, index(1:size(a)),&
+		    work(1:size(a)/2), iwork(1:size(a)/2))
+		! Sort b based on the sorting of a
+		b(:) = b( index(1:size(a)) )
+	end subroutine sort_related_data
+```
+
+Sorting a rank 2 array based on the data in a column
+
+```Fortran
+	subroutine sort_related_data( array, column, work, index, iwork )
+	    ! Sort `a_data` in terms or its component `a`
+	    integer, intent(inout)           :: a(:,:)
+		integer(int32), intent(in)       :: column
+		integer(int32), intent(inout)    :: work(:)
+		integer(int_size), intent(inout) :: index(:)
+		integer(int_size), intent(inout) :: iwork(:)
+		integer, allocatable             :: dummy(:)
+		integer :: i
+		allocate(dummy(size(a, dim=1)))
+		! Extract a component of `a_data`
+		dummy(:) = a(:, column)
+		! Find the indices to sort the column
+		call ord_sorting(dummy, index(1:size(dummy)),&
+		    work(1:size(dummy)/2), iwork(1:size(dummy)/2))
+		! Sort a based on the sorting of its column
+		do i=1, size(a, dim=2)
+		    a(:, i) = a(index(1:size(a, dim=1)), i)
+		end do
+	end subroutine sort_related_data
+```
 
+Sorting an array of a derived type based on the dsta in one component
 ```Fortran
 	subroutine sort_a_data( a_data, a, work, index, iwork )
 	    ! Sort `a_data` in terms or its component `a`
@@ -455,9 +499,9 @@ arrays each of size `2**20`:
 * Random-10 - the final ten elements of the increasing array are
   replaced by random values.
 
-These benchmarks have been performed on two different processors, both
+These benchmarks have been performed on two different compilers, both
 on a MacBook Pro, featuring a 2.3 GHz Quad-Core Intel Core i5, with 8
-GB 2133 MHz LPDDR3 memory. The first processor was GNU Fortran
+GB 2133 MHz LPDDR3 memory. The first compiler was GNU Fortran
 (GCC) 10.2.0, with the following results:
 
 |   Type  | Elements |   Order       |    Method     | Time (s)  |
@@ -490,7 +534,7 @@ GB 2133 MHz LPDDR3 memory. The first processor was GNU Fortran
 | Integer | 1048576  |      Random 3 |    Unord_Sort |   0.13826 |
 | Integer | 1048576  |     Random 10 |    Unord_Sort |   0.35356 |
 
-The second processor was ifort (IFORT) 18.0.3 20180410, with the
+The second compiler was ifort (IFORT) 18.0.3 20180410, with the
 following results:
 
 |   Type  | Elements |   Order       |    Method     | Time (s)  |

From 28afb184e7752cea2aeb8064d17c7d9db7812b91 Mon Sep 17 00:00:00 2001
From: William Clodius <w.clodius@icloud.com>
Date: Tue, 20 Apr 2021 22:51:21 -0600
Subject: [PATCH 05/45] Added sorting for character(*) and string_type

Added sorting for `character(*)` and `string_type` rank 1 arrays. Improved the
overall documentation. Added newer benchmarks for the addiitons and the Intel
One API compiler system.

[ticket: X]
---
 doc/specs/stdlib_sorting.md | 490 ++++++++++++++++++++----------------
 1 file changed, 280 insertions(+), 210 deletions(-)

diff --git a/doc/specs/stdlib_sorting.md b/doc/specs/stdlib_sorting.md
index 5ea6da868..c573c5876 100644
--- a/doc/specs/stdlib_sorting.md
+++ b/doc/specs/stdlib_sorting.md
@@ -17,16 +17,18 @@ However Fortran's arrays are by far its most widely used collection,
 and arrays of arbitrary types of data can often be sorted in terms of
 a single component of intrinsic type.
 The Fortran Standard Library therefore provides a module,
-`stdlib_sorting`, with procedures to sort arrays of single intrinsic
-numeric types, i.e. the different kinds of integers and reals.
+`stdlib_sorting`, with procedures to sort arrays of simple intrinsic
+numeric types, i.e. the different kinds of integers and reals, the
+default assumed length character, and the `stdlib_string_type`
+module's `string_type` type.
 
 ## Overview of the module
 
 The module `stdlib_sorting` defines several public entities one
 default integer parameter, `int_size`, and three overloaded
-subroutines: `ORD_SORT`, `UNORD_SORT`, and `ORD_SORTING`. The
+subroutines: `ORD_SORT`, `SORT`, and `SORT_INDEX`. The
 overloaded subroutines also each have seven specific names for
-versions corresponding to differend types of array arguments.
+versions corresponding to different types of array arguments.
 
 ### The `int_size` parameter
 
@@ -42,10 +44,10 @@ data:
 * `ORD_SORT` is intended to sort simple arrays of intrinsic data
   that have significant sections that were partially ordered before
   the sort;
-* `ORD_SORTING` is intended to provide indices for sorting arrays of
+* `SORT_INDEX` is intended to provide indices for sorting arrays of
   derived type data, based on the ordering of an intrinsic component
   of the derived type; and
-* `UNORD_SORT` is intended to sort simple arrays of intrinsic data
+* `SORT` is intended to sort simple arrays of intrinsic data
   that are effectively unordered before the sort.
 
 #### Licensing
@@ -53,11 +55,11 @@ data:
 The Fortran Standard Library is distributed under the MIT
 License. However components of the library may be based on code with
 additional licensing restriction. In particular `ORD_SORT`,
-`ORD_SORTING`, and `UNORD_SORT` are translations of codes with their
+`SORT_INDEX`, and `SORT` are translations of codes with their
 own distribution restrictions.
 
-The `ORD_SORT` and `ORD_SORTING` subroutines are essentially
-translations to Fortran 2008 of the `"rust" sort` of the Rust Lsnguage
+The `ORD_SORT` and `SORT_INDEX` subroutines are essentially
+translations to Fortran 2008 of the `"rust" sort` of the Rust Language
 distributed as part of
 [`slice.rs`](https://github.com/rust-lang/rust/blob/90eb44a5897c39e3dff9c7e48e3973671dcd9496/src/liballoc/slice.rs).
 The header of the `slice.rs` file has as its licensing requirements:
@@ -76,7 +78,7 @@ so the license for the `slice.rs` code is compatible with the use of
 modified versions of the code in the Fortran Standard Library under
 the MIT license.
 
-The `UNORD_SORT` subroutine is essentially a translation to Fortran
+The `SORT` subroutine is essentially a translation to Fortran
 2008 of the
 [`introsort`]((http://www.cs.rpi.edu/~musser/gp/introsort.ps) of David
 Musser.  David Musser has given permission to include a variant of
@@ -100,15 +102,15 @@ that Tim Peters created for the Python Language.
 `ORD_SORT` is a hybrid stable comparison algorithm combining `merge sort`,
 and `insertion sort`. It has always at worst O(N Ln(N)) runtime
 performance in sorting random data, having a performance about 15-25%
-slower than `UNORD_SORT` on such data. However it has much better
-performance than `UNORD_SORT` on partially sorted data, having O(N)
+slower than `SORT` on such data. However it has much better
+performance than `SORT` on partially sorted data, having O(N)
 performance on uniformly increasing or decreasing data.
 
 
-`ORD_SORt` begins by traversing the array starting in its tail
+`ORD_SORT` begins by traversing the array starting in its tail
 attempting to identify `runs` in the array, where a run is either a
 uniformly decreasing sequence, `ARRAY(i-1) > ARRAY(i)`, or a
-non-decreasing, `ARRAY(i-1) <= ARRAY(i)`, sequence. Once deliminated
+non-decreasing, `ARRAY(i-1) <= ARRAY(i)`, sequence. Once delimitated
 decreasing sequences are reversed in their order. Then, if the
 sequence has less than `MIN_RUN` elements, previous elements in the
 array are added to the run using `insertion sort` until the run
@@ -125,45 +127,45 @@ Ln(N))`. However, because of the identification of decreasing and
 non-decreasing runs, processing of structured data can be much faster,
 with processing of uniformly decreasing or non-decreasing arrays being
 of order O(N). The result in our tests is that `ORD_SORT` is about
-25% slower than `UNORD_SORT` on purely random data, depending on
-the compiler, but can be `Ln(N)` faster than `UNORD_SORT` on highly
+25% slower than `SORT` on purely random data, depending on
+the compiler, but can be `Ln(N)` faster than `SORT` on highly
 structured data. As a modified `merge sort`, `ORD_SORT` requires the
 use of a "scratch" array, that may be provided as an optional `work`
 argument or allocated internally on the stack.
 
-#### The `ORD_SORTING` subroutine
+#### The `SORT_INDEX` subroutine
 
-The `UNORD_SORT` and `ORD_SORT` subroutines can sort rank 1 isolated
+The `SORT` and `ORD_SORT` subroutines can sort rank 1 isolated
 arrays of intrinsic types, but do nothing for the coordinated sorting
 of related data, e.g., multiple related rank 1 arrays, higher rank
 arrays, or arrays of derived types. For such related data, what is
 useful is an array of indices that maps a rank 1 array to its sorted
 form. For such a sort, a stable sort is useful, therefore the module
-provides a subroutine, `ORD_SORTING`, that generates such an array of
+provides a subroutine, `SORT_INDEX`, that generates such an array of
 indices based on the `ORD_SORT` algorithm.
 
-The logic of `ORD_SORTING` parallels that of `ORD_SORT`, with
+The logic of `SORT_INDEX` parallels that of `ORD_SORT`, with
 additional housekeeping to keep the array of indices consistent with
 the sorted positions of the input array. Because of this additional
 housekeeping it has slower runtime performance than `ORD_SORT`.
-`ORD_SORTING` requires the use of two "scratch" arrays, that may be
+`SORT_INDEX` requires the use of two "scratch" arrays, that may be
 provided as optional `work` and `iwork` arguments or allocated
 internally on the stack.
 
-#### The `UNORD_SORT` subroutines
+#### The `SORT` subroutines
 
-`UNORD_SORT` uses the `introsort` sorting algorithm of David Musser.
+`SORT` uses the `introsort` sorting algorithm of David Musser.
 `introsort` is a hybrid unstable comparison algorithm combining
 `quicksort`, `insertion sort`, and `heap sort`. While this algorithm's
 runtime performance is always O(N Ln(N)), it is relatively fast on
 randomly ordered data, but inconsistent in performance on partly
 sorted data.as the official source of the algorithm.
 
-As with `introsort`, `UNORD_SORT` is an unstable hybrid algorithm.
+As with `introsort`, `SORT` is an unstable hybrid algorithm.
 First it examines the array and estimates the depth of recursion a
 quick sort would require for ideal (random) data, `D =
 Ceiling(Ln(N)/Ln(2))`. It then defines a limit to the number of
-quicksort recursions to be allowed in processing,
+`quicksort` recursions to be allowed in processing,
 `D_limit = factor * D`, where factor is currently 2, and
 calls `introsort` proper. `introsort` proper then:
 
@@ -173,11 +175,11 @@ calls `introsort` proper. `introsort` proper then:
    recursion exceeds `D_limit` and, if it does, processes the remaining
    elements with heap sort and returns;
 3. If the current depth of recursion does not exceed `D_limit`, then
-   in effect does a quicksort step:
+   in effect does a `quicksort` step:
 
     * Partitions the remaining array using a median of three,
-    * Calls introsort proper on the leftmost partition,
-    * Calls introsort proper on the rightmost partition, and then
+    * Calls `introsort` proper on the leftmost partition,
+    * Calls `introsort` proper on the rightmost partition, and then
       returns.
 
 The resulting algorithm is of order O(N Ln(N)) run time
@@ -187,7 +189,7 @@ other sorting algorithms on random data. On partially sorted data it
 can show either slower `heap sort` performance, or enhanced
 performance by up to a factor of six. Still, even when it shows
 enhanced performance, its performance on partially sorted data is
-typically an order of magnitude slower then `ORD_SORT`.
+typically an order of magnitude slower than `ORD_SORT`.
 
 ### Tentative specifications of the `stdlib_sorting` procedures
 
@@ -214,13 +216,14 @@ Generic subroutine.
 
 `array` : shall be a rank one array of any of the types:
 `integer(int8)`, `integer(int16)`, `integer(int32)`, `integer(int64)`,
-`real(real32)`, `real(real64)`, or `real(real128)`. It is an
-`intent(inout)` argument. On input it is the array to be sorted. If
-both the type of `array` is real and at least one of the elements is a
-`NaN`, then the ordering of the result is undefined. Otherwise on
-return its elements will be sorted in order of non-decreasing value.
-
-`work` (optional): shall be a rank one array of any of the same type as
+`real(sp)`, `real(dp)`, `real(qp)`, `character(*)`, or
+`type(string_type)`. It is an `intent(inout)` argument. On input it is
+the array to be sorted. If both the type of `array` is real and at
+least one of the elements is a `NaN`, then the ordering of the result
+is undefined. Otherwise on return its elements will be sorted in order
+of non-decreasing value.
+
+`work` (optional): shall be a rank one array of the same type as
 array, and shall have at least `size(array)/2` elements. It is an
 `intent(inout)` argument. It is intended to be used as "scratch"
 memory for internal record keeping. If associated with an array in
@@ -232,35 +235,90 @@ requirements for the code. Its contents on return are undefined.
 `ORD_SORT` implements a hybrid sorting algorithm combining
 `merge sort`, and `insertion sort`. For most purposes it behaves like
 a `merge sort`, providing worst case `O(N Ln(N))` run time performance
-for most random arrays, that is typically slower than `UNORD_SORT`.
+for most random arrays, that is typically slower than `SORT`.
 However, if the array has significant runs of decreasing or
 non-decreasing values, performance can be much better than
-`UNORD_SORT`, with `O(N)` behavior on uniformly decreasing, or
+`SORT`, with `O(N)` behavior on uniformly decreasing, or
 non-decreasing arrays. The optional `work` array replaces "scratch"
 memory that would otherwise be allocated on the stack. If `array` is of
 any type `REAL` the order of its elements on return undefined if any
-element of `array` is a `NaN`.
+element of `array` is a `NaN`.  Sorting of `CHARACTER(*)` and
+`STRING_TYPE` arrays are based on the operator `>`, and not on the
+function `LGT`.
 
 
 ##### Example
 
-```Fortran
-	...
-	! Read arrays from sorted files
-	call read_sorted_file( 'dummy_file1', array1 )
-	call read_sorted_file( 'dummy_file2', array2 )
-	! Concatenate the arrays
-	allocate( array( size(array1) + size(array2) ) )
-	array( 1:size(array1) ) = array1(:)
-	array( size(array1)+1:size(array1)+size(array2) ) = array2(:)
-	! Sort the resulting array
-	call ord_sort( array, work )
-	! Process the sorted array
-	call array_search( array, values )
-	...
+```fortran
+    ...
+    ! Read arrays from sorted files
+    call read_sorted_file( 'dummy_file1', array1 )
+    call read_sorted_file( 'dummy_file2', array2 )
+    ! Concatenate the arrays
+    allocate( array( size(array1) + size(array2) ) )
+    array( 1:size(array1) ) = array1(:)
+    array( size(array1)+1:size(array1)+size(array2) ) = array2(:)
+    ! Sort the resulting array
+    call ord_sort( array, work )
+    ! Process the sorted array
+    call array_search( array, values )
+    ...
 ```
 
-#### `ord_sorting` - creates an array of sorting indices for an input array.
+#### `sort` - sorts an input array
+
+##### Status
+
+Experimental
+
+##### Description
+
+Returns an input array with the elements sorted in order of increasing
+value.
+
+##### Syntax
+
+`call [[stdlib_sorting(module):sort(subroutine)]]sort ( array )`
+
+##### Class
+
+Pure generic subroutine.
+
+##### Arguments
+
+`array` : shall be a rank one array of any of the types:
+`integer(int8)`, `integer(int16)`, `integer(int32)`, `integer(int64)`,
+`real(sp)`, `real(dp)`, `real(qp)`. `character(*)`, or
+`type(string_type)`. It is an `intent(inout)` argument. On return its
+input elements will be sorted in order of non-decreasing value.
+
+##### Notes
+
+`SORT` implements a hybrid sorting algorithm combining
+`quicksort`, `merge sort`, and `insertion sort`. For most purposes it
+behaves like a `quicksort` with a median of three partition, providing
+good, `O(N Ln(N))`, run time performance for most random arrays, but
+defaulting to `merge sort` if the structure of the array results in
+the `quicksort` not converging as rapidly as expected. If `array` is of
+any type `REAL`, the behavior of the sorting is undefined if any
+element of `array` is a `NaN`.  Sorting of `CHARACTER(*)` and
+`STRING_TYPE` arrays are based on the operators `<`, `<=`, `>`, and
+`>=`, and not on the functions `LLT`, `LLE`, `LGT`, or `LGE`.
+
+##### Example
+
+```fortran
+    ...
+    ! Read random data from a file
+    call read_file( 'dummy_file', array )
+    ! Sort the random data
+    call sort( array )
+    ! Process the sorted data
+    call array_search( array, values )
+    ...
+```
+
+#### `sort_index` - creates an array of sorting indices for an input array.
 
 ##### Status
 
@@ -273,7 +331,7 @@ the specified direction retaining order stability.
 
 ##### Syntax
 
-`call [[stdlib_sorting(module):ord_sorting(subroutine)]]ord_sorting ( array, index[, work, iwork, reverse, zero_based ] )`
+`call [[stdlib_sorting(module):sort_index(subroutine)]]sort_index ( array, index[, work, iwork, reverse ] )`
 
 ##### Class
 
@@ -283,10 +341,10 @@ Generic subroutine.
 
 `array`: shall be a rank one array of any of the types:
 `integer(int8)`, `integer(int16)`, `integer(int32)`, `integer(int64)`,
-`real(real32)`, `real(real64)`, or `real(real128)`. It is an
-`intent(inout)` argument. On input it will be an array whose sorting
-indices are to be determined. On return it will be the sorted
-array.
+`real(sp)`, `real(dp)`, `real(qp)`, `character(*)`, or
+`type(string_type)`. It is an `intent(inout)` argument. On input it
+will be an array whose sorting indices are to be determined. On return
+it will be the sorted array.
 
 `index`: shall be a rank one integer array of kind `int_size` and of
 the size of `array`. It is an `intent(out)` argument. On return it
@@ -294,7 +352,7 @@ shall have values that are the indices needed to sort the original
 array in the desired direction.
 
 `work` (optional): shall be a rank one array of any of the same type as
-array, and shall have at least `size(array)/2` elements. It is an
+`array`, and shall have at least `size(array)/2` elements. It is an
 `intent(inout)` argument. It is intended to be used as "scratch"
 memory for internal record keeping. If associated with an array in
 static storage, its use can significantly reduce the stack memory
@@ -302,7 +360,10 @@ requirements for the code. Its contents on return are undefined.
 
 `iwork` (optional): shall be a rank one integer array of kind
 `int_size`, and shall have at least `size(array)/2` elements. It
-is an `intent(inout)` argument. Its contents on return are undefined.
+is an `intent(inout)` argument.  It is intended to be used as "scratch"
+memory for internal record keeping. If associated with an array in
+static storage, its use can significantly reduce the stack memory
+requirements for the code. Its contents on return are undefined.
 
 `reverse` (optional): shall be a scalar of type default logical. It
 is an `intent(in)` argument. If present with a value of `.true.` then
@@ -310,23 +371,19 @@ is an `intent(in)` argument. If present with a value of `.true.` then
 order. Otherwise index will sort `array` in order of non-decreasing
 values in stable order.
 
-`zero_based` (optional): shall be a scalar of type default logical.
-It is an `intent(in)` argument. If present with the value `.true.`
-the values of `index` will be for zero based array indexing,
-otherwise the values of index will be for one's based array
-indexing.
-
 ##### Notes
 
-`ORD_SORTING` implements the hybrid sorting algorithm of `ORD_SORT`,
+`SORT_INDEX` implements the hybrid sorting algorithm of `ORD_SORT`,
 keeping the values of `index` consistent with the elements of `array`
 as it is sorted. As a `merge sort` based algorithm, it is a stable
 sorting comparison algorithm. The optional `work` and `iwork` arrays
 replace "scratch" memory that would otherwise be allocated on the
 stack. If `array` is of any kind of `REAL` the order of the elements in
 `index` and `array` on return are undefined if any element of `array`
-is a `NaN`. It should be emphasized that the order of `array` will
-typically be different on return.
+is a `NaN`. Sorting of `CHARACTER(*)` and `STRING_TYPE` arrays are
+based on the operator `>`, and not on the function `LGT`. It should be
+emphasized that the order of `array` will typically be different on
+return.
 
 
 ##### Examples
@@ -334,15 +391,15 @@ typically be different on return.
 Sorting a related rank one array:
 
 ```Fortran
-	subroutine sort_related_data( a, b, work, index, iwork )
-	    ! Sort `b` in terms or its related array `a`
-	    integer, intent(inout)           :: a(:)
-		integer(int32), intent(inout)    :: b(:) ! The same size as a
+    subroutine sort_related_data( a, b, work, index, iwork )
+        ! Sort `b` in terms or its related array `a`
+        integer, intent(inout)           :: a(:)
+        integer(int32), intent(inout)    :: b(:) ! The same size as a
 		integer(int32), intent(inout)    :: work(:)
 		integer(int_size), intent(inout) :: index(:)
 		integer(int_size), intent(inout) :: iwork(:)
 		! Find the indices to sort a
-		call ord_sorting(a, index(1:size(a)),&
+		call sort_index(a, index(1:size(a)),&
 		    work(1:size(a)/2), iwork(1:size(a)/2))
 		! Sort b based on the sorting of a
 		b(:) = b( index(1:size(a)) )
@@ -365,7 +422,7 @@ Sorting a rank 2 array based on the data in a column
 		! Extract a component of `a_data`
 		dummy(:) = a(:, column)
 		! Find the indices to sort the column
-		call ord_sorting(dummy, index(1:size(dummy)),&
+		call sort_index(dummy, index(1:size(dummy)),&
 		    work(1:size(dummy)/2), iwork(1:size(dummy)/2))
 		! Sort a based on the sorting of its column
 		do i=1, size(a, dim=2)
@@ -375,7 +432,8 @@ Sorting a rank 2 array based on the data in a column
 ```
 
 Sorting an array of a derived type based on the dsta in one component
-```Fortran
+
+```fortran
 	subroutine sort_a_data( a_data, a, work, index, iwork )
 	    ! Sort `a_data` in terms or its component `a`
 	    type(a_type), intent(inout)      :: a_data(:)
@@ -386,64 +444,13 @@ Sorting an array of a derived type based on the dsta in one component
 		! Extract a component of `a_data`
 		a(1:size(a_data)) = a_data(:) % a
 		! Find the indices to sort the component
-		call ord_sorting(a(1:size(a_data)), index(1:size(a_data)),&
+		call sort_index(a(1:size(a_data)), index(1:size(a_data)),&
 		    work(1:size(a_data)/2), iwork(1:size(a_data)/2))
 		! Sort a_data based on the sorting of that component
 		a_data(:) = a_data( index(1:size(a_data)) )
 	end subroutine sort_a_data
 ```
 
-#### `unord_sort` - sorts an input array
-
-##### Status
-
-Experimental
-
-##### Description
-
-Returns an input array with the elements sorted in order of increasing
-value.
-
-##### Syntax
-
-`call [[stdlib_sorting(module):unord_sort(subroutine)]]unord_sort ( array )`
-
-##### Class
-
-Pure generic subroutine.
-
-##### Arguments
-
-`array` : shall be a rank one array of any of the types:
-`integer(int8)`, `integer(int16)`, `integer(int32)`, `integer(int64)`,
-`real(real32)`, `real(real64)`, or `real(real128)`. It
-is an `intent(inout)` argument. On return its input elements will be
-sorted in order of non-decreasing value.
-
-##### Notes
-
-`UNORD_SORT` implements a hybrid sorting algorithm combining
-`quicksort`, `merge sort`, and `insertion sort`. For most purposes it
-behaves like a `quicksort` with a median of three partition, providing
-good, `O(N Ln(N))`, run time performance for most random arrays, but
-defaulting to `merge sort` if the structure of the array results in
-the `quicksort` not converging as rapidly as expected. If `array` is of
-any type `REAL`, the behavior of the sorting is undefined if any
-element of `array` is a `NaN`.
-
-##### Example
-
-```Fortran
-	...
-	! Read random data from a file
-	call read_file( 'dummy_file', array )
-	! Sort the random data
-	call unord_sort( array )
-	! Process the sorted data
-	call array_search( array, values )
-	...
-```
-
 #### Specific procedures
 
 Usually the name of a generic procedure is the most convenient way of
@@ -459,31 +466,37 @@ their `array` arguments.
 |                    | `INT16_ORD_SORT`    | `INTEGER(INT16)` |
 |                    | `INT32_ORD_SORT`    | `INTEGER(INT32)` |
 |                    | `INT64_ORD_SORT`    | `INTEGER(INT64)` |
-|                    | `SP_ORD_SORT`    | `REAL(REAL32)` |
-|                    | `DP_ORD_SORT`    | `REAL(REAL64)` |
-|                    | `QP_ORD_SORT`    | `REAL(REAL128)` |
-| `ORD_SORTING`      | `INT8_ORD_SORTING`     | `INTEGER(INT8)` |
-|                    | `INT16_ORD_SORTING`    | `INTEGER(INT16)` |
-|                    | `INT32_ORD_SORTING`    | `INTEGER(INT32)` |
-|                    | `INT64_ORD_SORTING`    | `INTEGER(INT64)` |
-|                    | `SP_ORD_SORTING`    | `REAL(REAL32)` |
-|                    | `DP_ORD_SORTING`    | `REAL(REAL64)` |
-|                    | `QP_ORD_SORTING`    | `REAL(REAL128)` |
-| `UNORD_SORT`         | `INT8_UNORD_SORT`     | `INTEGER(INT8)` |
-|                    | `INT16_UNORD_SORT`    | `INTEGER(INT16)` |
-|                    | `INT32_UNORD_SORT`    | `INTEGER(INT32)` |
-|                    | `INT64_UNORD_SORT`    | `INTEGER(INT64)` |
-|                    | `SP_UNORD_SORT`    | `REAL(REAL32)` |
-|                    | `DP_UNORD_SORT`    | `REAL(REAL64)` |
-|                    | `QP_UNORD_SORT`    | `REAL(REAL128)` |
+|                    | `SP_ORD_SORT`    | `REAL(SP)` |
+|                    | `DP_ORD_SORT`    | `REAL(DP)` |
+|                    | `QP_ORD_SORT`    | `REAL(QP)` |
+|                    | `CHAR_ORD_SORT`  | `CHARACTER(*)`  |
+|                    | `STRING_ORD_SORT`  | `STRING_TYPE`  |
+| `SORT`         | `INT8_SORT`     | `INTEGER(INT8)` |
+|                    | `INT16_SORT`    | `INTEGER(INT16)` |
+|                    | `INT32_SORT`    | `INTEGER(INT32)` |
+|                    | `INT64_SORT`    | `INTEGER(INT64)` |
+|                    | `SP_SORT`    | `REAL(SP)` |
+|                    | `DP_SORT`    | `REAL(DP)` |
+|                    | `QP_SORT`    | `REAL(QP)` |
+|                    | `CHAR_SORT`  | `CHARACTER(*)` |
+|                    | `STRING_SORT`  | `STRING_TYPE` |
+| `SORT_INDEX`      | `INT8_SORT_INDEX`     | `INTEGER(INT8)` |
+|                    | `INT16_SORT_INDEX`    | `INTEGER(INT16)` |
+|                    | `INT32_SORT_INDEX`    | `INTEGER(INT32)` |
+|                    | `INT64_SORT_INDEX`    | `INTEGER(INT64)` |
+|                    | `SP_SORT_INDEX`    | `REAL(SP)` |
+|                    | `DP_SORT_INDEX`    | `REAL(DP)` |
+|                    | `QP_SORT_INDEX`    | `REAL(QP)` |
+|                    | `CHAR_SORT_INDEX`    | `CHARACTER(*)` |
+|                    | `STRING_SORT_INDEX`    | `STRING_TYPE` |
 
 
 ### Performance benchmarks
 
 We have performed benchmarks of the procedures on nine different
-arrays each of size `2**20`:
+integer arrays each of size `2**20`:
 
-* Blocks - the array is divided into siz blocks, each of distinct
+* Blocks - the array is divided into six blocks, each of distinct
   uniformly increasing integers.
 * Decreasing - values decrease uniformly from `2**20-1` to `0`.
 * Identical - all integers have the same value of 10.
@@ -499,72 +512,129 @@ arrays each of size `2**20`:
 * Random-10 - the final ten elements of the increasing array are
   replaced by random values.
 
+On three different default character arrays, each of length 4 and of
+size `26**4`:
+
+* Char. Decreasing - values decrease uniformly from `"zzzz"` to
+  `"aaaa"`.
+* Char. Increasing - values decrease uniformly from `"aaaa"` to
+  `"zzzz"`.
+* Char. Random - the set of strings from `"aaaa"` to `"zzzz"` in
+  random order.
+
+On three different `string_type` arrays, each of length 4 elements and
+of size `26**4`:
+
+* String Decreasing - values decrease uniformly from `"zzzz"` to
+  `"aaaa"`.
+* String Increasing - values decrease uniformly from `"aaaa"` to
+  `"zzzz"`.
+* String Random - the set of strings from `"aaaa"` to `"zzzz"` in
+  random order.
+
 These benchmarks have been performed on two different compilers, both
 on a MacBook Pro, featuring a 2.3 GHz Quad-Core Intel Core i5, with 8
 GB 2133 MHz LPDDR3 memory. The first compiler was GNU Fortran
 (GCC) 10.2.0, with the following results:
 
-|   Type  | Elements |   Order       |    Method     | Time (s)  |
-|---------|----------|---------------|---------------|-----------|
-| Integer | 1048576  |        Blocks |      Ord_Sort |   0.00692 |
-| Integer | 1048576  |    Decreasing |      Ord_Sort |   0.00377 |
-| Integer | 1048576  |     Identical |      Ord_Sort |   0.00236 |
-| Integer | 1048576  |    Increasing |      Ord_Sort |   0.00205 |
-| Integer | 1048576  |  Random dense |      Ord_Sort |   0.16675 |
-| Integer | 1048576  |  Random order |      Ord_Sort |   0.16780 |
-| Integer | 1048576  | Random sparse |      Ord_Sort |   0.16715 |
-| Integer | 1048576  |      Random 3 |      Ord_Sort |   0.00907 |
-| Integer | 1048576  |     Random 10 |      Ord_Sort |   0.00435 |
-| Integer | 1048576  |        Blocks |   Ord_Sorting |   0.01177 |
-| Integer | 1048576  |    Decreasing |   Ord_Sorting |   0.00695 |
-| Integer | 1048576  |     Identical |   Ord_Sorting |   0.00438 |
-| Integer | 1048576  |    Increasing |   Ord_Sorting |   0.00429 |
-| Integer | 1048576  |  Random dense |   Ord_Sorting |   0.19079 |
-| Integer | 1048576  |  Random order |   Ord_Sorting |   0.19242 |
-| Integer | 1048576  | Random sparse |   Ord_Sorting |   0.19212 |
-| Integer | 1048576  |      Random 3 |   Ord_Sorting |   0.01504 |
-| Integer | 1048576  |     Random 10 |   Ord_Sorting |   0.00794 |
-| Integer | 1048576  |        Blocks |    Unord_Sort |   0.20889 |
-| Integer | 1048576  |    Decreasing |    Unord_Sort |   0.12826 |
-| Integer | 1048576  |     Identical |    Unord_Sort |   0.15646 |
-| Integer | 1048576  |    Increasing |    Unord_Sort |   0.05012 |
-| Integer | 1048576  |  Random dense |    Unord_Sort |   0.14359 |
-| Integer | 1048576  |  Random order |    Unord_Sort |   0.14613 |
-| Integer | 1048576  | Random sparse |    Unord_Sort |   0.15159 |
-| Integer | 1048576  |      Random 3 |    Unord_Sort |   0.13826 |
-| Integer | 1048576  |     Random 10 |    Unord_Sort |   0.35356 |
-
-The second compiler was ifort (IFORT) 18.0.3 20180410, with the
-following results:
-
-|   Type  | Elements |   Order       |    Method     | Time (s)  |
-|---------|----------|---------------|---------------|-----------|
-| Integer | 1048576  |        Blocks |      Ord_Sort |   0.00231 |
-| Integer | 1048576  |    Decreasing |      Ord_Sort |   0.00123 |
-| Integer | 1048576  |     Identical |      Ord_Sort |   0.00088 |
-| Integer | 1048576  |    Increasing |      Ord_Sort |   0.00088 |
-| Integer | 1048576  |  Random dense |      Ord_Sort |   0.08914 |
-| Integer | 1048576  |  Random order |      Ord_Sort |   0.08779 |
-| Integer | 1048576  | Random sparse |      Ord_Sort |   0.08357 |
-| Integer | 1048576  |      Random 3 |      Ord_Sort |   0.00335 |
-| Integer | 1048576  |     Random 10 |      Ord_Sort |   0.00196 |
-| Integer | 1048576  |        Blocks |   Ord_Sorting |   0.00519 |
-| Integer | 1048576  |    Decreasing |   Ord_Sorting |   0.00240 |
-| Integer | 1048576  |     Identical |   Ord_Sorting |   0.00110 |
-| Integer | 1048576  |    Increasing |   Ord_Sorting |   0.00113 |
-| Integer | 1048576  |  Random dense |   Ord_Sorting |   0.10185 |
-| Integer | 1048576  |  Random order |   Ord_Sorting |   0.10222 |
-| Integer | 1048576  | Random sparse |   Ord_Sorting |   0.10376 |
-| Integer | 1048576  |      Random 3 |   Ord_Sorting |   0.00628 |
-| Integer | 1048576  |     Random 10 |   Ord_Sorting |   0.00264 |
-| Integer | 1048576  |        Blocks |    Unord_Sort |   0.07272 |
-| Integer | 1048576  |    Decreasing |    Unord_Sort |   0.03873 |
-| Integer | 1048576  |     Identical |    Unord_Sort |   0.03677 |
-| Integer | 1048576  |    Increasing |    Unord_Sort |   0.01153 |
-| Integer | 1048576  |  Random dense |    Unord_Sort |   0.06706 |
-| Integer | 1048576  |  Random order |    Unord_Sort |   0.06808 |
-| Integer | 1048576  | Random sparse |    Unord_Sort |   0.06798 |
-| Integer | 1048576  |      Random 3 |    Unord_Sort |   0.07230 |
-| Integer | 1048576  |     Random 10 |    Unord_Sort |   0.13640 |
+|    Type     | Elements |      Order      |    Method   | Time (s)  |
+|-------------|----------|-----------------|-------------|-----------|
+|     Integer | 1048576  |          Blocks |    Ord_Sort |   0.00694 |
+|     Integer | 1048576  |      Decreasing |    Ord_Sort |   0.00363 |
+|     Integer | 1048576  |       Identical |    Ord_Sort |   0.00206 |
+|     Integer | 1048576  |      Increasing |    Ord_Sort |   0.00206 |
+|     Integer | 1048576  |    Random dense |    Ord_Sort |   0.17705 |
+|     Integer | 1048576  |    Random order |    Ord_Sort |   0.17749 |
+|     Integer | 1048576  |   Random sparse |    Ord_Sort |   0.17641 |
+|     Integer | 1048576  |        Random 3 |    Ord_Sort |   0.00943 |
+|     Integer | 1048576  |       Random 10 |    Ord_Sort |   0.00378 |
+|   Character |  456976  |  Char. Decrease |    Ord_Sort |   0.00766 |
+|   Character |  456976  |  Char. Increase |    Ord_Sort |   0.00416 |
+|   Character |  456976  |    Char. Random |    Ord_Sort |   0.23689 |
+| String_type |  456976  | String Decrease |    Ord_Sort |   0.15467 |
+| String_type |  456976  | String Increase |    Ord_Sort |   0.09072 |
+| String_type |  456976  |   String Random |    Ord_Sort |   4.04153 |
+|     Integer | 1048576  |          Blocks |        Sort |   0.11175 |
+|     Integer | 1048576  |      Decreasing |        Sort |   0.14222 |
+|     Integer | 1048576  |       Identical |        Sort |   0.16076 |
+|     Integer | 1048576  |      Increasing |        Sort |   0.05380 |
+|     Integer | 1048576  |    Random dense |        Sort |   0.15519 |
+|     Integer | 1048576  |    Random order |        Sort |   0.15907 |
+|     Integer | 1048576  |   Random sparse |        Sort |   0.15792 |
+|     Integer | 1048576  |        Random 3 |        Sort |   0.23041 |
+|     Integer | 1048576  |       Random 10 |        Sort |   0.24668 |
+|   Character |  456976  |  Char. Decrease |        Sort |   0.31256 |
+|   Character |  456976  |  Char. Increase |        Sort |   0.11330 |
+|   Character |  456976  |    Char. Random |        Sort |   0.20166 |
+| String_type |  456976  | String Decrease |        Sort |   6.41878 |
+| String_type |  456976  | String Increase |        Sort |   2.26954 |
+| String_type |  456976  |   String Random |        Sort |   3.58182 |
+|     Integer | 1048576  |          Blocks |  Sort_Index |   0.01175 |
+|     Integer | 1048576  |      Decreasing |  Sort_Index |   0.00732 |
+|     Integer | 1048576  |       Identical |  Sort_Index |   0.00453 |
+|     Integer | 1048576  |      Increasing |  Sort_Index |   0.00495 |
+|     Integer | 1048576  |    Random dense |  Sort_Index |   0.20723 |
+|     Integer | 1048576  |    Random order |  Sort_Index |   0.20818 |
+|     Integer | 1048576  |   Random sparse |  Sort_Index |   0.20441 |
+|     Integer | 1048576  |        Random 3 |  Sort_Index |   0.01468 |
+|     Integer | 1048576  |       Random 10 |  Sort_Index |   0.00669 |
+|   Character |  456976  |  Char. Decrease |  Sort_Index |   0.00915 |
+|   Character |  456976  |  Char. Increase |  Sort_Index |   0.00522 |
+|   Character |  456976  |    Char. Random |  Sort_Index |   0.26615 |
+| String_type |  456976  | String Decrease |  Sort_Index |   0.19026 |
+| String_type |  456976  | String Increase |  Sort_Index |   0.09407 |
+| String_type |  456976  |   String Random |  Sort_Index |   4.12616 |
+
+The second compiler was Intel(R) Fortran Intel(R) 64 Compiler Classic
+for applications running on Intel(R) 64, Version 2021.2.0 Build
+20210228_000000, with the following results:
+
+|    Type     | Elements |      Order      |    Method   | Time (s)  |
+|-------------|----------|-----------------|-------------|-----------|
+|     Integer | 1048576  |          Blocks |    Ord_Sort |   0.00287 |
+|     Integer | 1048576  |      Decreasing |    Ord_Sort |   0.00123 |
+|     Integer | 1048576  |       Identical |    Ord_Sort |   0.00094 |
+|     Integer | 1048576  |      Increasing |    Ord_Sort |   0.00095 |
+|     Integer | 1048576  |    Random dense |    Ord_Sort |   0.09865 |
+|     Integer | 1048576  |    Random order |    Ord_Sort |   0.10138 |
+|     Integer | 1048576  |   Random sparse |    Ord_Sort |   0.10223 |
+|     Integer | 1048576  |        Random 3 |    Ord_Sort |   0.00420 |
+|     Integer | 1048576  |       Random 10 |    Ord_Sort |   0.00174 |
+|   Character |  456976  |  Char. Decrease |    Ord_Sort |   0.00721 |
+|   Character |  456976  |  Char. Increase |    Ord_Sort |   0.00354 |
+|   Character |  456976  |    Char. Random |    Ord_Sort |   0.21287 |
+| String_type |  456976  | String Decrease |    Ord_Sort |   0.50840 |
+| String_type |  456976  | String Increase |    Ord_Sort |   0.14912 |
+| String_type |  456976  |   String Random |    Ord_Sort |  15.28657 |
+|     Integer | 1048576  |          Blocks |        Sort |   0.03632 |
+|     Integer | 1048576  |      Decreasing |        Sort |   0.04135 |
+|     Integer | 1048576  |       Identical |        Sort |   0.03672 |
+|     Integer | 1048576  |      Increasing |        Sort |   0.01346 |
+|     Integer | 1048576  |    Random dense |        Sort |   0.07099 |
+|     Integer | 1048576  |    Random order |        Sort |   0.07172 |
+|     Integer | 1048576  |   Random sparse |        Sort |   0.07196 |
+|     Integer | 1048576  |        Random 3 |        Sort |   0.07729 |
+|     Integer | 1048576  |       Random 10 |        Sort |   0.14964 |
+|   Character |  456976  |  Char. Decrease |        Sort |   0.31387 |
+|   Character |  456976  |  Char. Increase |        Sort |   0.12050 |
+|   Character |  456976  |    Char. Random |        Sort |   0.20228 |
+| String_type |  456976  | String Decrease |        Sort |  23.87099 |
+| String_type |  456976  | String Increase |        Sort |   8.71762 |
+| String_type |  456976  |   String Random |        Sort |  11.50930 |
+|     Integer | 1048576  |          Blocks |  Sort_Index |   0.00563 |
+|     Integer | 1048576  |      Decreasing |  Sort_Index |   0.00230 |
+|     Integer | 1048576  |       Identical |  Sort_Index |   0.00142 |
+|     Integer | 1048576  |      Increasing |  Sort_Index |   0.00142 |
+|     Integer | 1048576  |    Random dense |  Sort_Index |   0.11723 |
+|     Integer | 1048576  |    Random order |  Sort_Index |   0.12539 |
+|     Integer | 1048576  |   Random sparse |  Sort_Index |   0.12706 |
+|     Integer | 1048576  |        Random 3 |  Sort_Index |   0.00718 |
+|     Integer | 1048576  |       Random 10 |  Sort_Index |   0.00316 |
+|   Character |  456976  |  Char. Decrease |  Sort_Index |   0.00790 |
+|   Character |  456976  |  Char. Increase |  Sort_Index |   0.00405 |
+|   Character |  456976  |    Char. Random |  Sort_Index |   0.22963 |
+| String_type |  456976  | String Decrease |  Sort_Index |   0.51546 |
+| String_type |  456976  | String Increase |  Sort_Index |   0.14960 |
+| String_type |  456976  |   String Random |  Sort_Index |  15.59673 |
 
 

From 57273986b812ab5179586c277b39f6cfdc5a3283 Mon Sep 17 00:00:00 2001
From: William Clodius <w.clodius@icloud.com>
Date: Fri, 23 Apr 2021 17:37:11 -0600
Subject: [PATCH 06/45] Commited main sorting files

Added and commited the main sorting source codes: stdlib_sorting.fypp,
stdlib_sorting_sort.fypp, stdlib_sorting_ord_sort.f90, and
stdlib_sorting_sort_index.fypp. Also commited the revised CMakeLists.txt and
Makefile.manual files needed co compile the library including the new codes.

[ticket: X]
---
 src/CMakeLists.txt                 |    4 +
 src/Makefile.manual                |   13 +
 src/stdlib_sorting.fypp            |  577 ++++++++++
 src/stdlib_sorting_ord_sort.fypp   | 1419 +++++++++++++++++++++++
 src/stdlib_sorting_sort.fypp       |  736 ++++++++++++
 src/stdlib_sorting_sort_index.fypp | 1684 ++++++++++++++++++++++++++++
 6 files changed, 4433 insertions(+)
 create mode 100644 src/stdlib_sorting.fypp
 create mode 100644 src/stdlib_sorting_ord_sort.fypp
 create mode 100644 src/stdlib_sorting_sort.fypp
 create mode 100644 src/stdlib_sorting_sort_index.fypp

diff --git a/src/CMakeLists.txt b/src/CMakeLists.txt
index 96a6ebcce..92f06fc04 100644
--- a/src/CMakeLists.txt
+++ b/src/CMakeLists.txt
@@ -9,6 +9,10 @@ set(fppFiles
     stdlib_linalg.fypp
     stdlib_linalg_diag.fypp
     stdlib_optval.fypp
+    stdlib_sorting.fypp
+    stdlib_sorting_ord_sort.fypp
+    stdlib_sorting_sort.fypp
+    stdlib_sorting_sort_index.fypp 
     stdlib_stats.fypp
     stdlib_stats_corr.fypp
     stdlib_stats_cov.fypp
diff --git a/src/Makefile.manual b/src/Makefile.manual
index 85541e038..fb176a235 100644
--- a/src/Makefile.manual
+++ b/src/Makefile.manual
@@ -9,6 +9,10 @@ SRCFYPP =\
         stdlib_quadrature.fypp \
         stdlib_quadrature_trapz.fypp \
         stdlib_quadrature_simps.fypp \
+        stdlib_sorting.fypp \
+        stdlib_sorting_ord_sort.fypp \
+        stdlib_sorting_sort.fypp \
+        stdlib_sorting_sort_index.fypp \
         stdlib_stats.fypp \
         stdlib_stats_corr.fypp \
         stdlib_stats_cov.fypp \
@@ -79,6 +83,15 @@ stdlib_quadrature_trapz.o: \
 	stdlib_quadrature.o \
 	stdlib_error.o \
 	stdlib_kinds.o
+stdlib_sorting.o: \
+	stdlib_kinds.o \
+	stdlib_string_type.o
+stdlib_sorting_ord_sort.o: \
+	stdlib_sorting.o
+stdlib_sorting_sort.o: \
+	stdlib_sorting.o
+stdlib_sorting_sort_index.o: \
+	stdlib_sorting.o
 stdlib_stats.o: \
 	stdlib_kinds.o
 stdlib_stats_corr.o: \
diff --git a/src/stdlib_sorting.fypp b/src/stdlib_sorting.fypp
new file mode 100644
index 000000000..e7fe92589
--- /dev/null
+++ b/src/stdlib_sorting.fypp
@@ -0,0 +1,577 @@
+#! Integer kinds to be considered during templating
+#:set INT_KINDS = ["int8", "int16", "int32", "int64"]
+
+#! Real kinds to be considered during templating
+#:set REAL_KINDS = ["sp", "dp", "qp"]
+
+!! Licensing:
+!!
+!! This file is subjec† both to the Fortran Standard Library license, and
+!! to additional licensing requirements as it contains translations of
+!! other software.
+!!
+!! The Fortran Standard Library, including this file, is distributed under
+!! the MIT license that should be included with the library's distribution.
+!!
+!!   Copyright (c) 2021 Fortran stdlib developers
+!!
+!!   Permission is hereby granted, free of charge, to any person obtaining a
+!!   copy of this software and associated documentation files (the
+!!   "Software"),  to deal in the Software without restriction, including
+!!   without limitation the rights to use, copy, modify, merge, publish,
+!!   distribute, sublicense, and/or sellcopies of the Software, and to permit
+!!   persons to whom the Software is furnished to do so, subject to the
+!!   following conditions:
+!!
+!!   The above copyright notice and this permission notice shall be included
+!!   in all copies or substantial portions of the Software.
+!!
+!!   THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
+!!   OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+!!   MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
+!!   IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
+!!   CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
+!!   TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
+!!   SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
+!!
+!! Two of the generic subroutines, `ORD_SORT` and `SORT_INDEX`, are
+!! substantially translations to Fortran 2008 of the `"Rust" sort` sorting
+!! routines in
+!! [`slice.rs`](https://github.com/rust-lang/rust/blob/90eb44a5897c39e3dff9c7e48e3973671dcd9496/src/liballoc/slice.rs)
+!! The `rust sort` implementation is distributed with the header:
+!!
+!!   Copyright 2012-2015 The Rust Project Developers. See the COPYRIGHT
+!!   file at the top-level directory of this distribution and at
+!!   http://rust-lang.org/COPYRIGHT.
+!!
+!!   Licensed under the Apache License, Version 2.0 <LICENSE-APACHE or
+!!   http://www.apache.org/licenses/LICENSE-2.0> or the MIT license
+!!   <LICENSE-MIT or http://opensource.org/licenses/MIT>, at your
+!!   option. This file may not be copied, modified, or distributed
+!!   except according to those terms.
+!!
+!! so the licence for the original`slice.rs` code is compatible with the use
+!! of modified versions of the code in the Fortran Standard Library under
+!! the MIT license.
+!!
+!! One of the generic subroutines, `SORT`, is substantially a
+!! translation to Fortran 2008, of the `introsort` of David Musser.
+!! David Musser has given permission to include a variant of `introsort`
+!! in the Fortran Standard Library under the MIT license provided
+!! we cite:
+!!
+!!   Musser, D.R., “Introspective Sorting and Selection Algorithms,”
+!!   Software—Practice and Experience, Vol. 27(8), 983–993 (August 1997).
+!!
+!! as the official source of the algorithm.
+
+module stdlib_sorting
+!! This module implements overloaded sorting subroutines named `ORD_SORT`,
+!! `SORT_INDEX`, and `SORT`, that each can be used to sort four kinds
+!! of `INTEGER` arrays and three kinds of `REAL` arrays. By default, sorting
+!! is in order of increasing value, though `SORT_INDEX` has the option of
+!! sorting in order of decresasing value. All the subroutines have worst
+!! case run time performance of `O(N Ln(N))`, but on largely sorted data
+!! `ORD_SORT` and `SORT_INDEX` can have a run time performance of `O(N)`.
+!!
+!! `ORD_SORT` is a translation of the `"Rust" sort` sorting algorithm in
+!! `slice.rs`:
+!! https://github.com/rust-lang/rust/blob/90eb44a5897c39e3dff9c7e48e3973671dcd9496/src/liballoc/slice.rs
+!! which in turn is inspired by the `timsort` algorithm of Tim Peters,
+!! http://svn.python.org/projects/python/trunk/Objects/listsort.txt.
+!! `ORD_SORT` is a hybrid stable comparison algorithm combining `merge sort`,
+!! and `insertion sort`. It is always at worst O(N Ln(N)) in sorting random
+!! data, having a performance about 25% slower than `SORT` on such
+!! data, but has much better performance than `SORT` on partially
+!! sorted data, having O(N) performance on uniformly non-increasing or
+!! non-decreasing data.
+!!
+!! `SORT_INDEX` is a modification of `ORD_SORT` so that in addition to
+!! sorting the input array, it returns the indices that map to a
+!! stable sort of the original array. These indices are intended to be
+!! intended to be used to sort data that is correlated with the input
+!! array, e.g., different arrays in a database, different columns of a
+!! rank 2 array, different elements of a derived type. It is less
+!! efficient than `ORD_SORT` at sorting a simple array.
+!!
+!! `SORT` uses the `INTROSORT` sorting algorithm of David Musser,
+!! http://www.cs.rpi.edu/~musser/gp/introsort.ps. `introsort` is a hybrid
+!! unstable comparison algorithm combining `quicksort`, `insertion sort`, and
+!! `heap sort`. While this algorithm is always O(N Ln(N)) it is relatively
+!! fast on randomly ordered data, but inconsistent in performance on partly
+!! sorted data, sometimes having `merge sort` performance, sometimes having
+!! better than `quicksort` performance. `UNORD_SOORT` is about 25%
+!! more efficient than `ORD_SORT` at sorting purely random data, but af an
+!! order of `Ln(N)` less efficient at sorting partially sorted data.
+
+    use stdlib_kinds, only: &
+        int8,               &
+        int16,              &
+        int32,              &
+        int64,              &
+        sp,                 &
+        dp,                 &
+        qp
+
+    use stdlib_string_type, only: string_type, assignment(=), operator(>), &
+        operator(>=), operator(<), operator(<=)
+
+    implicit none
+    private
+
+    public :: int_size
+
+    integer, parameter :: int_size = int64 !! Integer kind for indexing
+
+! Constants for use by tim_sort
+    integer, parameter :: &
+! The maximum number of entries in a run stack, good for an array of
+! 2**64 elements see
+! https://svn.python.org/projects/python/trunk/Objects/listsort.txt
+        max_merge_stack = int( ceiling( log( 2._dp**64 ) / &
+                               log(1.6180339887_dp) ) )
+
+    type run_type
+! Version: experimental
+!
+! Used to pass state around in a stack among helper functions for the
+! `ORD_SORT` and `SORT_INDEX` algorithms
+        integer(int_size) :: base = 0
+        integer(int_size) :: len = 0
+    end type run_type
+
+    public ord_sort
+!! Version: experimental
+!!
+!! The generic subroutine implementing the `ORD_SORT` algorithm to return
+!! an input array with its elements sorted in order of non-decreasing
+!! value. Its use has the syntax:
+!!
+!!     call ord_sort( array[, work] )
+!!
+!! with the arguments:
+!!
+!! * array: the rank 1 array to be sorted. It is an `intent(inout)`
+!!   argument of any of the types `integer(int8)`, `integer(int16)`,
+!!   `integer(int32)`, `integer(int64)`, `real(real32)`, `real(real64)`,
+!!   `real(real128)`, or `character(*)`. If both the type of `array` is
+!!   real and at least one of the elements is a `NaN`, then the ordering
+!!   of the result is undefined. Otherwise it is defined to be the
+!!   original elements in non-decreasing order.
+!!
+!! * work (optional): shall be a rank 1 array of the same type as
+!!   `array`, and shall have at least `size(array)/2` elements. It is an
+!!   `intent(inout)` argument to be used as "scratch" memory
+!!   for internal record keeping. If associated with an array in static
+!!   storage, its use can significantly reduce the stack memory requirements
+!!   for the code. Its value on return is undefined.
+!!
+!!#### Example
+!!
+!!```fortran
+!!    ...
+!!    ! Read arrays from sorted files
+!!    call read_sorted_file( 'dummy_file1', array1 )
+!!    call read_sorted_file( 'dummy_file2', array2 )
+!!    ! Concatenate the arrays
+!!    allocate( array( size(array1) + size(array2) ) )
+!!    array( 1:size(array1) ) = array1(:)
+!!    array( size(array1)+1:size(array1)+size(array2) ) = array2(:)
+!!    ! Sort the resulting array
+!!    call ord_sort( array, work )
+!!    ! Process the sorted array
+!!    call array_search( array, values )
+!!    ...
+!!```
+
+    public sort
+!! Version: experimental
+!!
+!! The generic subroutine implementing the `SORT` algorithm to return
+!! an input array with its elements sorted in order of non-decreasing
+!! value. Its use has the syntax:
+!!
+!!     call sort( array )
+!!
+!! with the arguments:
+!!
+!! * array: the rank 1 array to be sorted. It is an `intent(inout)`
+!!   argument of any of the types `integer(int8)`, `integer(int16)`,
+!!   `integer(int32)`, `integer(int64)`, `real(real32)`, `real(real64)`,
+!!   `real(real128)`, or `character(*)`. If both the type of `array` is
+!!   real and at least one of the elements is a `NaN`, then the ordering
+!!   of the result is undefined. Otherwise it is defined to be the
+!!   original elements in non-decreasing order.
+!!
+!!#### Example
+!!
+!!```fortran
+!!    ...
+!!    ! Read random data from a file
+!!    call read_file( 'dummy_file', array )
+!!    ! Sort the random data
+!!    call sort( array )
+!!    ! Process the sorted data
+!!    call array_search( array, values )
+!!    ...
+!!```
+
+    public sort_index
+!! Version: experimental
+!!
+!! The generic subroutine implementing the `SORT_INDEX` algorithm to
+!! return an index array whose elements would sort the input array in the
+!! desired direction. It is primarilly intended to be used to sort a
+!! derived type array based on the values of a component of the array.
+!! Its use has the syntax:
+!!
+!!     call sort_index( array, index[, work, iwork, reverse ] )
+!!
+!! with the arguments:
+!!
+!! * array: the rank 1 array to be sorted. It is an `intent(inout)`
+!!   argument of any of the types `integer(int8)`, `integer(int16)`,
+!!   `integer(int32)`, `integer(int64)`, `real(real32)`, `real(real64)`,
+!!   `real(real128)`, or `character(*)`. If both the type of `array` is
+!!   real and at least one of the elements is a `NaN`, then the ordering
+!!   of the `array` and `index` results is undefined. Otherwise it is
+!!   defined to be as specified by reverse.
+!!
+!! * index: a rank 1 array of sorting indices. It is an `intent(inout)`
+!!   argument of the type `integer(int_size)`. Its size shall be the
+!!   same as `array`. On return, if defined, its elements would
+!!   sort the input `array` in the direction specified by `reverse`.
+!!
+!! * work (optional): shall be a rank 1 array of the same type as
+!!   `array`, and shall have at least `size(array)/2` elements. It is an
+!!   `intent(inout)` argument to be used as "scratch" memory
+!!   for internal record keeping. If associated with an array in static
+!!   storage, its use can significantly reduce the stack memory requirements
+!!   for the code. Its value on return is undefined.
+!!
+!! * iwork (optional): shall be a rank 1 integer array of kind `int_size`,
+!!   and shall have at least `size(array)/2` elements. It is an
+!!   `intent(inout)` argument to be used as "scratch" memory
+!!   for internal record keeping. If associated with an array in static
+!!   storage, its use can significantly reduce the stack memory requirements
+!!   for the code. Its value on return is undefined.
+!!
+!! * `reverse` (optional): shall be a scalar of type default logical. It
+!!   is an `intent(in)` argument. If present with a value of `.true.` then
+!!   `index` will sort `array` in order of non-increasing values in stable
+!!   order. Otherwise index will sort `array` in order of non-decreasing
+!!   values in stable order.
+!!
+!!#### Examples
+!!
+!! Sorting a related rank one array:
+!!
+!!```Fortran
+!!    subroutine sort_related_data( a, b, work, index, iwork )
+!!        ! Sort `b` in terms or its related array `a`
+!!        integer, intent(inout)           :: a(:)
+!!        integer(int32), intent(inout)    :: b(:) ! The same size as a
+!!        integer(int32), intent(inout)    :: work(:)
+!!        integer(int_size), intent(inout) :: index(:)
+!!        integer(int_size), intent(inout) :: iwork(:)
+!!    ! Find the indices to sort a
+!!        call sort_index(a, index(1:size(a)),&
+!!            work(1:size(a)/2), iwork(1:size(a)/2))
+!!    ! Sort b based on the sorting of a
+!!        b(:) = b( index(1:size(a)) )
+!!    end subroutine sort_related_data
+!!```
+!!
+!! Sorting a rank 2 array based on the data in a column
+!!
+!!```Fortran
+!!    subroutine sort_related_data( array, column, work, index, iwork )
+!!    ! Sort `a_data` in terms or its component `a`
+!!        integer, intent(inout)           :: a(:,:)
+!!        integer(int32), intent(in)       :: column
+!!        integer(int32), intent(inout)    :: work(:)
+!!        integer(int_size), intent(inout) :: index(:)
+!!        integer(int_size), intent(inout) :: iwork(:)
+!!        integer, allocatable             :: dummy(:)
+!!        integer :: i
+!!        allocate(dummy(size(a, dim=1)))
+!!    ! Extract a component of `a_data`
+!!        dummy(:) = a(:, column)
+!!    ! Find the indices to sort the column
+!!        call sort_index(dummy, index(1:size(dummy)),&
+!!                        work(1:size(dummy)/2), iwork(1:size(dummy)/2))
+!!    ! Sort a based on the sorting of its column
+!!        do i=1, size(a, dim=2)
+!!            a(:, i) = a(index(1:size(a, dim=1)), i)
+!!        end do
+!!    end subroutine sort_related_data
+!!```
+!!
+!! Sorting an array of a derived type based on the dsta in one component
+!!```fortran
+!!    subroutine sort_a_data( a_data, a, work, index, iwork )
+!!    ! Sort `a_data` in terms or its component `a`
+!!        type(a_type), intent(inout)      :: a_data(:)
+!!        integer(int32), intent(inout)    :: a(:)
+!!        integer(int32), intent(inout)    :: work(:)
+!!        integer(int_size), intent(inout) :: index(:)
+!!        integer(int_size), intent(inout) :: iwork(:)
+!!    ! Extract a component of `a_data`
+!!        a(1:size(a_data)) = a_data(:) % a
+!!    ! Find the indices to sort the component
+!!        call sort_index(a(1:size(a_data)), index(1:size(a_data)),&
+!!                        work(1:size(a_data)/2), iwork(1:size(a_data)/2))
+!!    ! Sort a_data based on the sorting of that component
+!!        a_data(:) = a_data( index(1:size(a_data)) )
+!!    end subroutine sort_a_data
+!!```
+
+#:for k1 in INT_KINDS
+    public ${k1}$_ord_sort
+!! Version: experimental
+!!
+!! `${k1}$_ord_sort` is the specific `ORD_SORT` subroutine for an `array`
+!! argument of type `integer(${k1}$)`.
+
+    public ${k1}$_sort
+!! Version: experimental
+!!
+!! `${k1}$_sort` is the specific `SORT` subroutine for an `array`
+!! argument of type `integer(${k1}$)`.
+
+    public ${k1}$_sort_index
+!! Version: experimental
+!!
+!! `${k1}$_sort_index` is the specific `SORT_INDEX` subroutine for an `array`
+!! argument of type `integer(${k1}$)`.
+
+#:endfor
+
+#:for k1 in REAL_KINDS
+    public ${k1}$_ord_sort
+!! Version: experimental
+!!
+!! `${k1}$_ord_sort` is the specific `ORD_SORT` subroutine for an `array`
+!! argument of type `real(${k1}$)`.
+
+    public ${k1}$_sort
+!! Version: experimental
+!!
+!! `${k1}$_sort` is the specific `SORT` subroutine for an `array`
+!! argument of type `real(${k1}$)`.
+
+    public ${k1}$_sort_index
+!! Version: experimental
+!!
+!! `${k1}$_sort_index` is the specific `SORT_INDEX` subroutine for an `array`
+!! argument of type `real(${k1}$)`.
+
+#:endfor
+
+    public char_ord_sort
+
+    public char_sort
+
+    public char_sort_index
+
+    public string_ord_sort
+
+    public string_sort
+
+    public string_sort_index
+
+    interface ord_sort
+!! Version: experimental
+!!
+!! The generic subroutine interface implementing the `ORD_SORT` algorithm,
+!! a translation to Fortran 2008, of the `"Rust" sort` algorithm found in
+!! `slice.rs`
+!! https://github.com/rust-lang/rust/blob/90eb44a5897c39e3dff9c7e48e3973671dcd9496/src/liballoc/slice.rs#L2159
+!! `ORD_SORT` is a hybrid stable comparison algorithm combining `merge sort`,
+!! and `insertion sort`. It is always at worst O(N Ln(N)) in sorting random
+!! data, having a performance about 25% slower than `SORT` on such
+!! data, but has much better performance than `SORT` on partially
+!! sorted data, having O(N) performance on uniformly non-increasing or
+!! non-decreasing data.
+
+#:for k1 in INT_KINDS
+        module subroutine ${k1}$_ord_sort( array, work )
+!! Version: experimental
+!!
+!! `${k1}$_ord_sort( array )` sorts the input `ARRAY` of type `INTEGER(${k1}$)`
+!! using a hybrid sort based on the `'Rust" sort` algorithm found in `slice.rs`
+            integer(${k1}$), intent(inout)           :: array(0:)
+            integer(${k1}$), intent(inout), optional :: work(0:)
+        end subroutine ${k1}$_ord_sort
+
+#:endfor
+
+#:for k1 in REAL_KINDS
+        module subroutine ${k1}$_ord_sort( array, work )
+ !! Version: experimental
+!!
+!! `${k1}$_ord_sort( array )` sorts the input `ARRAY` of type `REAL(${k1}$)`
+!! using a hybrid sort based on the `'Rust" sort` algorithm found in `slice.rs`
+            real(${k1}$), intent(inout)           :: array(0:)
+            real(${k1}$), intent(inout), optional :: work(0:)
+        end subroutine ${k1}$_ord_sort
+
+#:endfor
+        module subroutine char_ord_sort( array, work )
+!! Version: experimental
+!!
+!! `char_ord_sort( array )` sorts the input `ARRAY` of type `CHARACTER(*)`
+!! using a hybrid sort based on the `'Rust" sort` algorithm found in `slice.rs`
+            character(len=*), intent(inout)                    :: array(0:)
+            character(len=len(array)), intent(inout), optional :: work(0:)
+        end subroutine char_ord_sort
+
+        module subroutine string_ord_sort( array, work )
+!! Version: experimental
+!!
+!! `char_ord_sort( array )` sorts the input `ARRAY` of type `CHARACTER(*)`
+!! using a hybrid sort based on the `'Rust" sort` algorithm found in `slice.rs`
+            type(string_type), intent(inout)           :: array(0:)
+            type(string_type), intent(inout), optional :: work(0:)
+        end subroutine string_ord_sort
+
+    end interface ord_sort
+
+    interface sort
+!! Version: experimental
+!!
+!! The generic subroutine interface implementing the `SORT` algorithm, based
+!! on the `introsort` of David Musser.
+
+#:for k1 in INT_KINDS
+        pure module subroutine ${k1}$_sort( array )
+!! Version: experimental
+!!
+!! `${k1}$_sort( array )` sorts the input `ARRAY` of type `INTEGER(${k1}$)`
+!! using a hybrid sort based on the `introsort` of David Musser.
+!! The algorithm is of order O(N Ln(N)) for all inputs.
+!! Because it relies on `quicksort`, the coefficient of the O(N Ln(N))
+!! behavior is small for random data compared to other sorting algorithms.
+            integer(${k1}$), intent(inout) :: array(0:)
+        end subroutine ${k1}$_sort
+
+#:endfor
+
+#:for k1 in REAL_KINDS
+        pure module subroutine ${k1}$_sort( array )
+!! Version: experimental
+!!
+!! `${k1}$_sort( array )` sorts the input `ARRAY` of type `REAL(${k1}$)`
+!! using a hybrid sort based on the `introsort` of David Musser.
+!! The algorithm is of order O(N Ln(N)) for all inputs.
+!! Because it relies on `quicksort`, the coefficient of the O(N Ln(N))
+!! behavior is small for random data compared to other sorting algorithms.
+            real(${k1}$), intent(inout) :: array(0:)
+        end subroutine ${k1}$_sort
+
+#:endfor
+
+        pure module subroutine char_sort( array )
+!! Version: experimental
+!!
+!! `char_sort( array )` sorts the input `ARRAY` of type `CHARACTER(*)`
+!! using a hybrid sort based on the `introsort` of David Musser.
+!! The algorithm is of order O(N Ln(N)) for all inputs.
+!! Because it relies on `quicksort`, the coefficient of the O(N Ln(N))
+!! behavior is small for random data compared to other sorting algorithms.
+            character(len=*), intent(inout) :: array(0:)
+        end subroutine char_sort
+
+        pure module subroutine string_sort( array )
+!! Version: experimental
+!!
+!! `string_sort( array )` sorts the input `ARRAY` of type `STRING_TYPE`
+!! using a hybrid sort based on the `introsort` of David Musser.
+!! The algorithm is of order O(N Ln(N)) for all inputs.
+!! Because it relies on `quicksort`, the coefficient of the O(N Ln(N))
+!! behavior is small for random data compared to other sorting algorithms.
+            type(string_type), intent(inout) :: array(0:)
+        end subroutine string_sort
+
+    end interface sort
+
+    interface sort_index
+!! Version: experimental
+!!
+!! The generic subroutine interface implementing the `SORT_INDEX` algorithm,
+!! based on the `"Rust" sort` algorithm found in `slice.rs`
+!! https://github.com/rust-lang/rust/blob/90eb44a5897c39e3dff9c7e48e3973671dcd9496/src/liballoc/slice.rs#L2159
+!! but modified to return an array of indices that would provide a stable
+!! sort of the rank one `ARRAY` input. The indices by default correspond to a
+!! non-decreasing sort, but if the optional argument `REVERSE` is present
+!! with a value of `.TRUE.` the indices correspond to a non-increasing sort.
+
+#:for k1 in INT_KINDS
+        module subroutine ${k1}$_sort_index( array, index, work, iwork, &
+            reverse )
+!! Version: experimental
+!!
+!! `${k1}$_sort_index( array )` sorts an input `ARRAY` of type `INTEGER(${k1}$)`
+!! using a hybrid sort based on the `'Rust" sort` algorithm found in `slice.rs`
+!! and returns the sorted `ARRAY` and an array `INDEX of indices in the
+!! order that would sort the input `ARRAY` in the desired direction.
+            integer(${k1}$), intent(inout)             :: array(0:)
+            integer(int_size), intent(inout)           :: index(0:)
+            integer(${k1}$), intent(inout), optional   :: work(0:)
+            integer(int_size), intent(inout), optional :: iwork(0:)
+            logical, intent(in), optional              :: reverse
+        end subroutine ${k1}$_sort_index
+
+#:endfor
+
+#:for k1 in REAL_KINDS
+        module subroutine ${k1}$_sort_index( array, index, work, iwork, &
+            reverse )
+!! Version: experimental
+!!
+!! `${k1}$_sort_index( array )` sorts an input `ARRAY` of type `REAL(${k1}$)`
+!! using a hybrid sort based on the `'Rust" sort` algorithm found in `slice.rs`
+!! and returns the sorted `ARRAY` and an array `INDEX of indices in the
+!! order that would sort the input `ARRAY` in the desired direction.
+            real(${k1}$), intent(inout)                :: array(0:)
+            integer(int_size), intent(inout)           :: index(0:)
+            real(${k1}$), intent(inout), optional      :: work(0:)
+            integer(int_size), intent(inout), optional :: iwork(0:)
+            logical, intent(in), optional              :: reverse
+        end subroutine ${k1}$_sort_index
+
+#:endfor
+        module subroutine char_sort_index( array, index, work, iwork, &
+            reverse )
+!! Version: experimental
+!!
+!! `char_sort_index( array )` sorts an input `ARRAY` of type `CHARACTER(*)`
+!! using a hybrid sort based on the `'Rust" sort` algorithm found in `slice.rs`
+!! and returns the sorted `ARRAY` and an array `INDEX of indices in the
+!! order that would sort the input `ARRAY` in the desired direction.
+            character(len=*), intent(inout)                    :: array(0:)
+            integer(int_size), intent(inout)                   :: index(0:)
+            character(len=len(array)), intent(inout), optional :: work(0:)
+            integer(int_size), intent(inout), optional         :: iwork(0:)
+            logical, intent(in), optional                      :: reverse
+        end subroutine char_sort_index
+
+        module subroutine string_sort_index( array, index, work, iwork, &
+            reverse )
+!! Version: experimental
+!!
+!! `string_sort_index( array )` sorts an input `ARRAY` of type `STRING_TYPE`
+!! using a hybrid sort based on the `'Rust" sort` algorithm found in `slice.rs`
+!! and returns the sorted `ARRAY` and an array `INDEX of indices in the
+!! order that would sort the input `ARRAY` in the desired direction.
+            type(string_type), intent(inout)           :: array(0:)
+            integer(int_size), intent(inout)           :: index(0:)
+            type(string_type), intent(inout), optional :: work(0:)
+            integer(int_size), intent(inout), optional :: iwork(0:)
+            logical, intent(in), optional              :: reverse
+        end subroutine string_sort_index
+
+    end interface sort_index
+
+
+end module stdlib_sorting
diff --git a/src/stdlib_sorting_ord_sort.fypp b/src/stdlib_sorting_ord_sort.fypp
new file mode 100644
index 000000000..0e8c59a47
--- /dev/null
+++ b/src/stdlib_sorting_ord_sort.fypp
@@ -0,0 +1,1419 @@
+#! Integer kinds to be considered during templating
+#:set INT_KINDS = ["int8", "int16", "int32", "int64"]
+
+#! Real kinds to be considered during templating
+#:set REAL_KINDS = ["sp", "dp", "qp"]
+
+!! Licensing:
+!!
+!! This file is subjec† both to the Fortran Standard Library license, and
+!! to additional licensing requirements as it contains translations of
+!! other software.
+!!
+!! The Fortran Standard Library, including this file, is distributed under
+!! the MIT license that should be included with the library's distribution.
+!!
+!!   Copyright (c) 2021 Fortran stdlib developers
+!!
+!!   Permission is hereby granted, free of charge, to any person obtaining a
+!!   copy of this software and associated documentation files (the
+!!   "Software"),  to deal in the Software without restriction, including
+!!   without limitation the rights to use, copy, modify, merge, publish,
+!!   distribute, sublicense, and/or sellcopies of the Software, and to permit
+!!   persons to whom the Software is furnished to do so, subject to the
+!!   following conditions:
+!!
+!!   The above copyright notice and this permission notice shall be included
+!!   in all copies or substantial portions of the Software.
+!!
+!!   THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
+!!   OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+!!   MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
+!!   IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
+!!   CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
+!!   TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
+!!   SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
+!!
+!! The generic subroutine, `ORD_SORT`, is substantially a translation to
+!! Fortran 2008 of the `"Rust" sort` sorting routines in
+!! [`slice.rs`](https://github.com/rust-lang/rust/blob/90eb44a5897c39e3dff9c7e48e3973671dcd9496/src/liballoc/slice.rs)
+!! The `rust sort` implementation is distributed with the header:
+!!
+!!   Copyright 2012-2015 The Rust Project Developers. See the COPYRIGHT
+!!   file at the top-level directory of this distribution and at
+!!   http://rust-lang.org/COPYRIGHT.
+!!
+!!   Licensed under the Apache License, Version 2.0 <LICENSE-APACHE or
+!!   http://www.apache.org/licenses/LICENSE-2.0> or the MIT license
+!!   <LICENSE-MIT or http://opensource.org/licenses/MIT>, at your
+!!   option. This file may not be copied, modified, or distributed
+!!   except according to those terms.
+!!
+!! so the licence for the original`slice.rs` code is compatible with the use
+!! of modified versions of the code in the Fortran Standard Library under
+!! the MIT license.
+
+submodule(stdlib_sorting) stdlib_sorting_ord_sort
+
+    implicit none
+
+contains
+
+#:for k1 in INT_KINDS
+
+    module subroutine ${k1}$_ord_sort( array, work )
+! A translation to Fortran 2008, of the `"Rust" sort` algorithm found in
+! `slice.rs`
+! https://github.com/rust-lang/rust/blob/90eb44a5897c39e3dff9c7e48e3973671dcd9496/src/liballoc/slice.rs#L2159
+! The Rust version in turn is a simplification of the Timsort algorithm
+! described in
+! https://svn.python.org/projects/python/trunk/Objects/listsort.txt, as
+! it drops both the use of 'galloping' to identify bounds of regions to be
+! sorted and the estimation of the optimal `run size`. However it remains
+! a hybrid sorting algorithm combining an iterative Merge sort controlled
+! by a stack of `RUNS` identified by regions of uniformly decreasing or
+! non-decreasing sequences that may be expanded to a minimum run size and
+! initialy processed by an insertion sort.
+!
+! Note the Fortran implementation simplifies the logic as it only has to
+! deal with Fortran arrays of intrinsic types and not the full generality
+! of Rust's arrays and lists for arbitrary types. It also adds the
+! estimation of the optimal `run size` as suggested in Tim Peters'
+! original `listsort.txt`, and an optional `work` array to be used as
+! scratch memory.
+        integer(${k1}$), intent(inout)           :: array(0:)
+        integer(${k1}$), intent(inout), optional :: work(0:)
+
+        integer(${k1}$), allocatable :: buf(:)
+        integer(int_size) :: array_size
+        integer :: stat
+
+        if ( present(work) ) then
+! Use the work array as scratch memory
+            call merge_sort( array, work )
+        else
+! Allocate a buffer to use as scratch memory.
+            array_size = size( array, kind=int_size )
+            allocate( buf(0:array_size/2-1), stat=stat )
+            if ( stat /= 0 ) error stop "Allocation of buffer failed."
+            call merge_sort( array, buf )
+        end if
+
+    contains
+
+        pure function calc_min_run( n ) result(min_run)
+!! Returns the minimum length of a run from 32-63 so that N/MIN_RUN is
+!! less than or equal to a power of two. See
+!! https://svn.python.org/projects/python/trunk/Objects/listsort.txt
+            integer(int_size)             :: min_run
+            integer(int_size), intent(in) :: n
+
+            integer(int_size) :: num, r
+
+            num = n
+            r = 0_int_size
+
+            do while( num >= 64 )
+                r = ior( r, iand(num, 1_int_size) )
+                num = ishft(num, -1_int_size)
+            end do
+            min_run = num + r
+
+        end function calc_min_run
+
+
+        pure subroutine insertion_sort( array )
+! Sorts `ARRAY` using an insertion sort.
+            integer(${k1}$), intent(inout) :: array(0:)
+
+            integer(int_size) :: i, j
+            integer(${k1}$) :: key
+
+            do j=1, size(array, kind=int_size)-1
+                key = array(j)
+                i = j - 1
+                do while( i >= 0 .and. array(i) > key )
+                    array(i+1) = array(i)
+                    i = i - 1
+                end do
+                array(i+1) = key
+            end do
+
+        end subroutine insertion_sort
+
+
+        pure function collapse( runs ) result ( r )
+! Examine the stack of runs waiting to be merged, identifying adjacent runs
+! to be merged until the stack invariants are restablished:
+!
+! 1. len(-3) > len(-2) + len(-1)
+! 2. len(-2) > len(-1)
+            integer(int_size) :: r
+            type(run_type), intent(in), target :: runs(0:)
+
+            integer(int_size) :: n
+            logical :: test
+
+            n = size(runs, kind=int_size)
+            test = .false.
+            if (n >= 2) then
+                if ( runs( n-1 ) % base == 0 .or. &
+                     runs( n-2 ) % len <= runs(n-1) % len ) then
+                    test = .true.
+                else if ( n >= 3 ) then ! X exists
+                    if ( runs(n-3) % len <= &
+                         runs(n-2) % len + runs(n-1) % len ) then
+                        test = .true.
+!               |X| <= |Y| + |Z| => will need to merge due to rho1 or rho2
+                    else if( n >= 4 ) then
+                        if ( runs(n-4) % len <= &
+                             runs(n-3) % len + runs(n-2) % len ) then
+                            test = .true.
+!               |W| <= |X| + |Y| => will need to merge due to rho1 or rho3
+                        end if
+                    end if
+                end if
+            end if
+            if ( test ) then
+! By default merge Y & Z, rho2 or rho3
+                if ( n >= 3 ) then
+                    if ( runs(n-3) % len < runs(n-1) % len ) then
+                        r = n - 3
+! |X| < |Z| => merge X & Y, rho1
+                        return
+                    end if
+                end if
+                r = n - 2
+! |Y| <= |Z| => merge Y & Z, rho4
+                return
+            else
+                r = -1
+            end if
+
+        end function collapse
+
+
+        pure subroutine insert_head( array )
+! Inserts `array(0)` into the pre-sorted sequence `array(1:)` so that the
+! whole `array(0:)` becomes sorted, copying the first element into
+! a temporary variable, iterating until the right place for it is found.
+! copying every traversed element into the slot preceding it, and finally,
+! copying data from the temporary variable into the resulting hole.
+
+            integer(${k1}$), intent(inout) :: array(0:)
+
+            integer(${k1}$) :: tmp
+            integer(int_size) :: i
+
+            tmp = array(0)
+            find_hole: do i=1, size(array, kind=int_size)-1
+                if ( array(i) >= tmp ) exit find_hole
+                array(i-1) = array(i)
+            end do find_hole
+            array(i-1) = tmp
+
+        end subroutine insert_head
+
+
+        subroutine merge_sort( array, buf )
+! The Rust merge sort borrows some (but not all) of the ideas from TimSort,
+! which is described in detail at
+! (http://svn.python.org/projects/python/trunk/Objects/listsort.txt).
+!
+! The algorithm identifies strictly descending and non-descending
+! subsequences, which are called natural runs. Where these runs are less
+! than a minimum run size they are padded by adding additional samples
+! using an insertion sort. The merge process is driven by a stack of
+! pending unmerged runs. Each newly found run is pushed onto the stack,
+! and then pairs of adjacentd runs are merged until these two invariants
+! are satisfied:
+!
+! 1. for every `i` in `1..size(runs)-1`: `runs(i - 1)%len > runs(i)%len`
+! 2. for every `i` in `2..size(runs)-1`: `runs(i - 2)%len >
+!    runs(i - 1)%len + runs(i)%len`
+!
+! The invariants ensure that the total running time is `O(n log n)`
+! worst-case.
+
+            integer(${k1}$), intent(inout) :: array(0:)
+            integer(${k1}$), intent(inout) :: buf(0:)
+
+            integer(int_size) :: array_size, finish, min_run, r, r_count, &
+                start
+            type(run_type) :: runs(0:max_merge_stack-1), left, right
+
+            array_size = size(array, kind=int_size)
+
+! Very short runs are extended using insertion sort to span at least
+! min_run elements. Slices of up to this length are sorted using insertion
+! sort.
+            min_run = calc_min_run( array_size )
+
+            if ( array_size <= min_run ) then
+                if ( array_size >= 2 ) call insertion_sort( array )
+                return
+            end if
+
+! Following Rust sort, natural runs in `array` are identified by traversing
+! it backwards. By traversing it backward, merges more often go in the
+! opposite direction (forwards). According to developers of Rust sort,
+! merging forwards is slightly faster than merging backwards. Therefore
+! identifying runs by traversing backwards should improve performance.
+            r_count = 0
+            finish = array_size - 1
+            do while ( finish >= 0 )
+! Find the next natural run, and reverse it if it's strictly descending.
+                start = finish
+                if ( start > 0 ) then
+                    start = start - 1
+                    if ( array(start+1) < array(start) ) then
+                        Descending: do while ( start > 0 )
+                            if ( array(start) >= array(start-1) ) &
+                                exit Descending
+                            start = start - 1
+                        end do Descending
+                        call reverse_segment( array(start:finish) )
+                    else
+                        Ascending: do while( start > 0 )
+                            if ( array(start) < array(start-1) ) exit Ascending
+                            start = start - 1
+                        end do Ascending
+                    end if
+                end if
+
+! If the run is too short insert some more elements using an insertion sort.
+                Insert: do while ( start > 0 )
+                    if ( finish - start >= min_run - 1 ) exit Insert
+                    start = start - 1
+                    call insert_head( array(start:finish) )
+                end do Insert
+                if ( start == 0 .and. finish == array_size - 1 ) return
+
+                runs(r_count) = run_type( base = start, &
+                                          len = finish - start + 1 )
+                finish = start-1
+                r_count = r_count + 1
+
+! Determine whether pairs of adjacent runs need to be merged to satisfy
+! the invariants, and, if so, merge them.
+                Merge_loop: do
+                    r = collapse( runs(0:r_count - 1) )
+                    if ( r < 0 .or. r_count <= 1 ) exit Merge_loop
+                    left = runs( r + 1 )
+                    right = runs( r )
+                    call merge( array( left % base: &
+                                       right % base + right % len - 1 ), &
+                                left % len, buf )
+
+                    runs(r) = run_type( base = left % base, &
+                                        len = left % len + right % len )
+                    if ( r == r_count - 3 ) runs(r+1) = runs(r+2)
+                    r_count = r_count - 1
+
+                end do Merge_loop
+            end do
+            if ( r_count /= 1 ) &
+                error stop "MERGE_SORT completed without RUN COUNT == 1."
+
+        end subroutine merge_sort
+
+
+        pure subroutine merge( array, mid, buf )
+! Merges the two non-decreasing runs `ARRAY(0:MID-1)` and `ARRAY(MID:)`
+! using `BUF` as temporary storage, and stores the merged runs into
+! `ARRAY(0:)`. `MID` must be > 0, and < `SIZE(ARRAY)-1`. Buffer `BUF`
+! must be long enough to hold the shorter of the two runs.
+            integer(${k1}$), intent(inout) :: array(0:)
+            integer(int_size), intent(in)  :: mid
+            integer(${k1}$), intent(inout) :: buf(0:)
+
+            integer(int_size) :: array_len, i, j, k
+
+            array_len = size(array, kind=int_size)
+
+! Merge first copies the shorter run into `buf`. Then, depending on which
+! run was shorter, it traces the copied run and the longer run forwards
+! (or backwards), comparing their next unprocessed elements and then
+! copying the lesser (or greater) one into `array`.
+
+            if ( mid <= array_len - mid ) then ! The left run is shorter.
+                buf(0:mid-1) = array(0:mid-1)
+                i = 0
+                j = mid
+                merge_lower: do k = 0, array_len-1
+                    if ( buf(i) <= array(j) ) then
+                        array(k) = buf(i)
+                        i = i + 1
+                        if ( i >= mid ) exit merge_lower
+                    else
+                        array(k) = array(j)
+                        j = j + 1
+                        if ( j >= array_len ) then
+                            array(k+1:) = buf(i:mid-1)
+                            exit merge_lower
+                        end if
+                    end if
+                end do merge_lower
+            else ! The right run is shorter ! check that it is stable
+                buf(0:array_len-mid) = array(mid:array_len-1)
+                i = mid - 1
+                j = array_len - mid -1
+                merge_upper: do k = array_len-1, 0, -1
+                    if ( buf(j) >= array(i) ) then
+                        array(k) = buf(j)
+                        j = j - 1
+                        if ( j < 0 ) exit merge_upper
+                    else
+                        array(k) = array(i)
+                        i = i - 1
+                        if ( i < 0 ) then
+                            array(0:k-1) = buf(0:j)
+                            exit merge_upper
+                        end if
+                    end if
+                end do merge_upper
+            end if
+        end subroutine merge
+
+
+        pure subroutine reverse_segment( array )
+! Reverse a segment of an array in place
+            integer(${k1}$), intent(inout) :: array(0:)
+
+            integer(int_size) :: lo, hi
+            integer(${k1}$) :: temp
+
+            lo = 0
+            hi = size( array, kind=int_size ) - 1
+            do while( lo < hi )
+                temp = array(lo)
+                array(lo) = array(hi)
+                array(hi) = temp
+                lo = lo + 1
+                hi = hi - 1
+            end do
+
+        end subroutine reverse_segment
+
+    end subroutine ${k1}$_ord_sort
+
+#:endfor
+
+
+#:for k1 in REAL_KINDS
+
+    module subroutine ${k1}$_ord_sort( array, work )
+! A translation to Fortran 2008, of the `"Rust" sort` algorithm found in
+! `slice.rs`
+! https://github.com/rust-lang/rust/blob/90eb44a5897c39e3dff9c7e48e3973671dcd9496/src/liballoc/slice.rs#L2159
+! The Rust version in turn is a simplification of the Timsort algorithm
+! described in
+! https://svn.python.org/projects/python/trunk/Objects/listsort.txt, as
+! it drops both the use of 'galloping' to identify bounds of regions to be
+! sorted and the estimation of the optimal `run size`. However it remains
+! a hybrid sorting algorithm combining an iterative Merge sort controlled
+! by a stack of `RUNS` identified by regions of uniformly decreasing or
+! non-decreasing sequences that may be expanded to a minimum run size and
+! initialy processed by an insertion sort.
+!
+! Note the Fortran implementation simplifies the logic as it only has to
+! deal with Fortran arrays of intrinsic types and not the full generality
+! of Rust's arrays and lists for arbitrary types. It also adds the
+! estimation of the optimal `run size` as suggested in Tim Peters'
+! original `listsort.txt`, and an optional `work` array to be used as
+! scratch memory.
+        real(${k1}$), intent(inout)           :: array(0:)
+        real(${k1}$), intent(inout), optional :: work(0:)
+
+        real(${k1}$), allocatable :: buf(:)
+        integer(int_size) :: array_size
+        integer :: stat
+
+        if ( present(work) ) then
+! Use the work array as scratch memory
+            call merge_sort( array, work )
+        else
+! Allocate a buffer to use as scratch memory.
+            array_size = size( array, kind=int_size )
+            allocate( buf(0:array_size/2-1), stat=stat )
+            if ( stat /= 0 ) error stop "Allocation of buffer failed."
+            call merge_sort( array, buf )
+        end if
+
+    contains
+
+        pure function calc_min_run( n ) result(min_run)
+!! Returns the minimum length of a run from 32-63 so that N/MIN_RUN is
+!! less than or equal to a power of two. See
+!! https://svn.python.org/projects/python/trunk/Objects/listsort.txt
+            integer(int_size)             :: min_run
+            integer(int_size), intent(in) :: n
+
+            integer(int_size) :: num, r
+
+            num = n
+            r = 0_int_size
+
+            do while( num >= 64 )
+                r = ior( r, iand(num, 1_int_size) )
+                num = ishft(num, -1_int_size)
+            end do
+            min_run = num + r
+
+        end function calc_min_run
+
+
+        pure subroutine insertion_sort( array )
+! Sorts `ARRAY` using an insertion sort.
+            real(${k1}$), intent(inout) :: array(0:)
+
+            integer(int_size) :: i, j
+            real(${k1}$) :: key
+
+            do j=1, size(array, kind=int_size)-1
+                key = array(j)
+                i = j - 1
+                do while( i >= 0 .and. array(i) > key )
+                    array(i+1) = array(i)
+                    i = i - 1
+                end do
+                array(i+1) = key
+            end do
+
+        end subroutine insertion_sort
+
+
+        pure function collapse( runs ) result ( r )
+! Examine the stack of runs waiting to be merged, identifying adjacent runs
+! to be merged until the stack invariants are restablished:
+!
+! 1. len(-3) > len(-2) + len(-1)
+! 2. len(-2) > len(-1)
+            integer(int_size) :: r
+            type(run_type), intent(in), target :: runs(0:)
+
+            integer(int_size) :: n
+            logical :: test
+
+            n = size(runs, kind=int_size)
+            test = .false.
+            if (n >= 2) then
+                if ( runs( n-1 ) % base == 0 .or. &
+                     runs( n-2 ) % len <= runs(n-1) % len ) then
+                    test = .true.
+                else if ( n >= 3 ) then ! X exists
+                    if ( runs(n-3) % len <= &
+                         runs(n-2) % len + runs(n-1) % len ) then
+                        test = .true.
+!               |X| <= |Y| + |Z| => will need to merge due to rho1 or rho2
+                    else if( n >= 4 ) then
+                        if ( runs(n-4) % len <= &
+                             runs(n-3) % len + runs(n-2) % len ) then
+                            test = .true.
+!               |W| <= |X| + |Y| => will need to merge due to rho1 or rho3
+                        end if
+                    end if
+                end if
+            end if
+            if ( test ) then
+! By default merge Y & Z, rho2 or rho3
+                if ( n >= 3 ) then
+                    if ( runs(n-3) % len < runs(n-1) % len ) then
+                        r = n - 3
+! |X| < |Z| => merge X & Y, rho1
+                        return
+                    end if
+                end if
+                r = n - 2
+! |Y| <= |Z| => merge Y & Z, rho4
+                return
+            else
+                r = -1
+            end if
+
+        end function collapse
+
+
+        pure subroutine insert_head( array )
+! Inserts `array(0)` into the pre-sorted sequence `array(1:)` so that the
+! whole `array(0:)` becomes sorted, copying the first element into
+! a temporary variable, iterating until the right place for it is found.
+! copying every traversed element into the slot preceding it, and finally,
+! copying data from the temporary variable into the resulting hole.
+
+            real(${k1}$), intent(inout) :: array(0:)
+
+            real(${k1}$) :: tmp
+            integer(int_size) :: i
+
+            tmp = array(0)
+            find_hole: do i=1, size(array, kind=int_size)-1
+                if ( array(i) >= tmp ) exit find_hole
+                array(i-1) = array(i)
+            end do find_hole
+            array(i-1) = tmp
+
+        end subroutine insert_head
+
+
+        subroutine merge_sort( array, buf )
+! The Rust merge sort borrows some (but not all) of the ideas from TimSort,
+! which is described in detail at
+! (http://svn.python.org/projects/python/trunk/Objects/listsort.txt).
+!
+! The algorithm identifies strictly descending and non-descending
+! subsequences, which are called natural runs. Where these runs are less
+! than a minimum run size they are padded by adding additional samples
+! using an insertion sort. The merge process is driven by a stack of
+! pending unmerged runs. Each newly found run is pushed onto the stack,
+! and then pairs of adjacentd runs are merged until these two invariants
+! are satisfied:
+!
+! 1. for every `i` in `1..size(runs)-1`: `runs(i - 1)%len > runs(i)%len`
+! 2. for every `i` in `2..size(runs)-1`: `runs(i - 2)%len >
+!    runs(i - 1)%len + runs(i)%len`
+!
+! The invariants ensure that the total running time is `O(n log n)`
+! worst-case.
+
+            real(${k1}$), intent(inout) :: array(0:)
+            real(${k1}$), intent(inout) :: buf(0:)
+
+            integer(int_size) :: array_size, finish, min_run, r, r_count, &
+                start
+            type(run_type) :: runs(0:max_merge_stack-1), left, right
+
+            array_size = size(array, kind=int_size)
+
+! Very short runs are extended using insertion sort to span at least
+! min_run elements. Slices of up to this length are sorted using insertion
+! sort.
+            min_run = calc_min_run( array_size )
+
+            if ( array_size <= min_run ) then
+                if ( array_size >= 2 ) call insertion_sort( array )
+                return
+            end if
+
+! Following Rust sort, natural runs in `array` are identified by traversing
+! it backwards. By traversing it backward, merges more often go in the
+! opposite direction (forwards). According to developers of Rust sort,
+! merging forwards is slightly faster than merging backwards. Therefore
+! identifying runs by traversing backwards should improve performance.
+            r_count = 0
+            finish = array_size - 1
+            do while ( finish >= 0 )
+! Find the next natural run, and reverse it if it's strictly descending.
+                start = finish
+                if ( start > 0 ) then
+                    start = start - 1
+                    if ( array(start+1) < array(start) ) then
+                        Descending: do while ( start > 0 )
+                            if ( array(start) >= array(start-1) ) &
+                                exit Descending
+                            start = start - 1
+                        end do Descending
+                        call reverse_segment( array(start:finish) )
+                    else
+                        Ascending: do while( start > 0 )
+                            if ( array(start) < array(start-1) ) exit Ascending
+                            start = start - 1
+                        end do Ascending
+                    end if
+                end if
+
+! If the run is too short insert some more elements using an insertion sort.
+                Insert: do while ( start > 0 )
+                    if ( finish - start >= min_run - 1 ) exit Insert
+                    start = start - 1
+                    call insert_head( array(start:finish) )
+                end do Insert
+                if ( start == 0 .and. finish == array_size - 1 ) return
+
+                runs(r_count) = run_type( base = start, &
+                                          len = finish - start + 1 )
+                finish = start-1
+                r_count = r_count + 1
+
+! Determine whether pairs of adjacent runs need to be merged to satisfy
+! the invariants, and, if so, merge them.
+                Merge_loop: do
+                    r = collapse( runs(0:r_count - 1) )
+                    if ( r < 0 .or. r_count <= 1 ) exit Merge_loop
+                    left = runs( r + 1 )
+                    right = runs( r )
+                    call merge( array( left % base: &
+                                       right % base + right % len - 1 ), &
+                                left % len, buf )
+
+                    runs(r) = run_type( base = left % base, &
+                                        len = left % len + right % len )
+                    if ( r == r_count - 3 ) runs(r+1) = runs(r+2)
+                    r_count = r_count - 1
+
+                end do Merge_loop
+            end do
+            if ( r_count /= 1 ) &
+                error stop "MERGE_SORT completed without RUN COUNT == 1."
+
+        end subroutine merge_sort
+
+
+        pure subroutine merge( array, mid, buf )
+! Merges the two non-decreasing runs `ARRAY(0:MID-1)` and `ARRAY(MID:)`
+! using `BUF` as temporary storage, and stores the merged runs into
+! `ARRAY(0:)`. `MID` must be > 0, and < `SIZE(ARRAY)-1`. Buffer `BUF`
+! must be long enough to hold the shorter of the two runs.
+            real(${k1}$), intent(inout) :: array(0:)
+            integer(int_size), intent(in)  :: mid
+            real(${k1}$), intent(inout) :: buf(0:)
+
+            integer(int_size) :: array_len, i, j, k
+
+            array_len = size(array, kind=int_size)
+
+! Merge first copies the shorter run into `buf`. Then, depending on which
+! run was shorter, it traces the copied run and the longer run forwards
+! (or backwards), comparing their next unprocessed elements and then
+! copying the lesser (or greater) one into `array`.
+
+            if ( mid <= array_len - mid ) then ! The left run is shorter.
+                buf(0:mid-1) = array(0:mid-1)
+                i = 0
+                j = mid
+                merge_lower: do k = 0, array_len-1
+                    if ( buf(i) <= array(j) ) then
+                        array(k) = buf(i)
+                        i = i + 1
+                        if ( i >= mid ) exit merge_lower
+                    else
+                        array(k) = array(j)
+                        j = j + 1
+                        if ( j >= array_len ) then
+                            array(k+1:) = buf(i:mid-1)
+                            exit merge_lower
+                        end if
+                    end if
+                end do merge_lower
+            else ! The right run is shorter ! check that it is stable
+                buf(0:array_len-mid) = array(mid:array_len-1)
+                i = mid - 1
+                j = array_len - mid -1
+                merge_upper: do k = array_len-1, 0, -1
+                    if ( buf(j) >= array(i) ) then
+                        array(k) = buf(j)
+                        j = j - 1
+                        if ( j < 0 ) exit merge_upper
+                    else
+                        array(k) = array(i)
+                        i = i - 1
+                        if ( i < 0 ) then
+                            array(0:k-1) = buf(0:j)
+                            exit merge_upper
+                        end if
+                    end if
+                end do merge_upper
+            end if
+        end subroutine merge
+
+
+        pure subroutine reverse_segment( array )
+! Reverse a segment of an array in place
+            real(${k1}$), intent(inout) :: array(0:)
+
+            integer(int_size) :: lo, hi
+            real(${k1}$) :: temp
+
+            lo = 0
+            hi = size( array, kind=int_size ) - 1
+            do while( lo < hi )
+                temp = array(lo)
+                array(lo) = array(hi)
+                array(hi) = temp
+                lo = lo + 1
+                hi = hi - 1
+            end do
+
+        end subroutine reverse_segment
+
+    end subroutine ${k1}$_ord_sort
+
+#:endfor
+
+
+    module subroutine char_ord_sort( array, work )
+! A translation to Fortran 2008, of the `"Rust" sort` algorithm found in
+! `slice.rs`
+! https://github.com/rust-lang/rust/blob/90eb44a5897c39e3dff9c7e48e3973671dcd9496/src/liballoc/slice.rs#L2159
+! The Rust version in turn is a simplification of the Timsort algorithm
+! described in
+! https://svn.python.org/projects/python/trunk/Objects/listsort.txt, as
+! it drops both the use of 'galloping' to identify bounds of regions to be
+! sorted and the estimation of the optimal `run size`. However it remains
+! a hybrid sorting algorithm combining an iterative Merge sort controlled
+! by a stack of `RUNS` identified by regions of uniformly decreasing or
+! non-decreasing sequences that may be expanded to a minimum run size and
+! initialy processed by an insertion sort.
+!
+! Note the Fortran implementation simplifies the logic as it only has to
+! deal with Fortran arrays of intrinsic types and not the full generality
+! of Rust's arrays and lists for arbitrary types. It also adds the
+! estimation of the optimal `run size` as suggested in Tim Peters'
+! original `listsort.txt`, and an optional `work` array to be used as
+! scratch memory.
+        character(len=*), intent(inout)                    :: array(0:)
+        character(len=len(array)), intent(inout), optional :: work(0:)
+
+        character(len=:), allocatable :: buf(:)
+        integer(int_size) :: array_size
+        integer :: stat
+
+        if ( present(work) ) then
+! Use the work array as scratch memory
+            call merge_sort( array, work )
+        else
+! Allocate a buffer to use as scratch memory.
+            array_size = size( array, kind=int_size )
+            allocate( character(len=len(array)) :: buf(0:array_size/2-1), &
+                      stat=stat )
+            if ( stat /= 0 ) error stop "Allocation of buffer failed."
+            call merge_sort( array, buf )
+        end if
+
+    contains
+
+        pure function calc_min_run( n ) result(min_run)
+!! Returns the minimum length of a run from 32-63 so that N/MIN_RUN is
+!! less than or equal to a power of two. See
+!! https://svn.python.org/projects/python/trunk/Objects/listsort.txt
+            integer(int_size)             :: min_run
+            integer(int_size), intent(in) :: n
+
+            integer(int_size) :: num, r
+
+            num = n
+            r = 0_int_size
+
+            do while( num >= 64 )
+                r = ior( r, iand(num, 1_int_size) )
+                num = ishft(num, -1_int_size)
+            end do
+            min_run = num + r
+
+        end function calc_min_run
+
+
+        pure subroutine insertion_sort( array )
+! Sorts `ARRAY` using an insertion sort.
+            character(len=*), intent(inout) :: array(0:)
+
+            integer(int_size)         :: i, j
+            character(len=len(array)) :: key
+
+            do j=1, size(array, kind=int_size)-1
+                key = array(j)
+                i = j - 1
+                do while( i >= 0 .and. array(i) > key )
+                    array(i+1) = array(i)
+                    i = i - 1
+                end do
+                array(i+1) = key
+            end do
+
+        end subroutine insertion_sort
+
+
+        pure function collapse( runs ) result ( r )
+! Examine the stack of runs waiting to be merged, identifying adjacent runs
+! to be merged until the stack invariants are restablished:
+!
+! 1. len(-3) > len(-2) + len(-1)
+! 2. len(-2) > len(-1)
+            integer(int_size) :: r
+            type(run_type), intent(in), target :: runs(0:)
+
+            integer(int_size) :: n
+            logical :: test
+
+            n = size(runs, kind=int_size)
+            test = .false.
+            if (n >= 2) then
+                if ( runs( n-1 ) % base == 0 .or. &
+                     runs( n-2 ) % len <= runs(n-1) % len ) then
+                    test = .true.
+                else if ( n >= 3 ) then ! X exists
+                    if ( runs(n-3) % len <= &
+                         runs(n-2) % len + runs(n-1) % len ) then
+                        test = .true.
+!               |X| <= |Y| + |Z| => will need to merge due to rho1 or rho2
+                    else if( n >= 4 ) then
+                        if ( runs(n-4) % len <= &
+                             runs(n-3) % len + runs(n-2) % len ) then
+                            test = .true.
+!               |W| <= |X| + |Y| => will need to merge due to rho1 or rho3
+                        end if
+                    end if
+                end if
+            end if
+            if ( test ) then
+! By default merge Y & Z, rho2 or rho3
+                if ( n >= 3 ) then
+                    if ( runs(n-3) % len < runs(n-1) % len ) then
+                        r = n - 3
+! |X| < |Z| => merge X & Y, rho1
+                        return
+                    end if
+                end if
+                r = n - 2
+! |Y| <= |Z| => merge Y & Z, rho4
+                return
+            else
+                r = -1
+            end if
+
+        end function collapse
+
+
+        pure subroutine insert_head( array )
+! Inserts `array(0)` into the pre-sorted sequence `array(1:)` so that the
+! whole `array(0:)` becomes sorted, copying the first element into
+! a temporary variable, iterating until the right place for it is found.
+! copying every traversed element into the slot preceding it, and finally,
+! copying data from the temporary variable into the resulting hole.
+
+            character(len=*), intent(inout) :: array(0:)
+
+            character(len=len(array)) :: tmp
+            integer(int_size)         :: i
+
+            tmp = array(0)
+            find_hole: do i=1, size(array, kind=int_size)-1
+                if ( array(i) >= tmp ) exit find_hole
+                array(i-1) = array(i)
+            end do find_hole
+            array(i-1) = tmp
+
+        end subroutine insert_head
+
+
+        subroutine merge_sort( array, buf )
+! The Rust merge sort borrows some (but not all) of the ideas from TimSort,
+! which is described in detail at
+! (http://svn.python.org/projects/python/trunk/Objects/listsort.txt).
+!
+! The algorithm identifies strictly descending and non-descending
+! subsequences, which are called natural runs. Where these runs are less
+! than a minimum run size they are padded by adding additional samples
+! using an insertion sort. The merge process is driven by a stack of
+! pending unmerged runs. Each newly found run is pushed onto the stack,
+! and then pairs of adjacentd runs are merged until these two invariants
+! are satisfied:
+!
+! 1. for every `i` in `1..size(runs)-1`: `runs(i - 1)%len > runs(i)%len`
+! 2. for every `i` in `2..size(runs)-1`: `runs(i - 2)%len >
+!    runs(i - 1)%len + runs(i)%len`
+!
+! The invariants ensure that the total running time is `O(n log n)`
+! worst-case.
+
+            character(len=*), intent(inout)          :: array(0:)
+            character(len=len(array)), intent(inout) :: buf(0:)
+
+            integer(int_size) :: array_size, finish, min_run, r, r_count, &
+                start
+            type(run_type) :: runs(0:max_merge_stack-1), left, right
+
+            array_size = size(array, kind=int_size)
+
+! Very short runs are extended using insertion sort to span at least
+! min_run elements. Slices of up to this length are sorted using insertion
+! sort.
+            min_run = calc_min_run( array_size )
+
+            if ( array_size <= min_run ) then
+                if ( array_size >= 2 ) call insertion_sort( array )
+                return
+            end if
+
+! Following Rust sort, natural runs in `array` are identified by traversing
+! it backwards. By traversing it backward, merges more often go in the
+! opposite direction (forwards). According to developers of Rust sort,
+! merging forwards is slightly faster than merging backwards. Therefore
+! identifying runs by traversing backwards should improve performance.
+            r_count = 0
+            finish = array_size - 1
+            do while ( finish >= 0 )
+! Find the next natural run, and reverse it if it's strictly descending.
+                start = finish
+                if ( start > 0 ) then
+                    start = start - 1
+                    if ( array(start+1) < array(start) ) then
+                        Descending: do while ( start > 0 )
+                            if ( array(start) >= array(start-1) ) &
+                                exit Descending
+                            start = start - 1
+                        end do Descending
+                        call reverse_segment( array(start:finish) )
+                    else
+                        Ascending: do while( start > 0 )
+                            if ( array(start) < array(start-1) ) exit Ascending
+                            start = start - 1
+                        end do Ascending
+                    end if
+                end if
+
+! If the run is too short insert some more elements using an insertion sort.
+                Insert: do while ( start > 0 )
+                    if ( finish - start >= min_run - 1 ) exit Insert
+                    start = start - 1
+                    call insert_head( array(start:finish) )
+                end do Insert
+                if ( start == 0 .and. finish == array_size - 1 ) return
+
+                runs(r_count) = run_type( base = start, &
+                                          len = finish - start + 1 )
+                finish = start-1
+                r_count = r_count + 1
+
+! Determine whether pairs of adjacent runs need to be merged to satisfy
+! the invariants, and, if so, merge them.
+                Merge_loop: do
+                    r = collapse( runs(0:r_count - 1) )
+                    if ( r < 0 .or. r_count <= 1 ) exit Merge_loop
+                    left = runs( r + 1 )
+                    right = runs( r )
+                    call merge( array( left % base: &
+                                       right % base + right % len - 1 ), &
+                                left % len, buf )
+
+                    runs(r) = run_type( base = left % base, &
+                                        len = left % len + right % len )
+                    if ( r == r_count - 3 ) runs(r+1) = runs(r+2)
+                    r_count = r_count - 1
+
+                end do Merge_loop
+            end do
+            if ( r_count /= 1 ) &
+                error stop "MERGE_SORT completed without RUN COUNT == 1."
+
+        end subroutine merge_sort
+
+
+        pure subroutine merge( array, mid, buf )
+! Merges the two non-decreasing runs `ARRAY(0:MID-1)` and `ARRAY(MID:)`
+! using `BUF` as temporary storage, and stores the merged runs into
+! `ARRAY(0:)`. `MID` must be > 0, and < `SIZE(ARRAY)-1`. Buffer `BUF`
+! must be long enough to hold the shorter of the two runs.
+            character(len=*), intent(inout)          :: array(0:)
+            integer(int_size), intent(in)            :: mid
+            character(len=len(array)), intent(inout) :: buf(0:)
+
+            integer(int_size) :: array_len, i, j, k
+
+            array_len = size(array, kind=int_size)
+
+! Merge first copies the shorter run into `buf`. Then, depending on which
+! run was shorter, it traces the copied run and the longer run forwards
+! (or backwards), comparing their next unprocessed elements and then
+! copying the lesser (or greater) one into `array`.
+
+            if ( mid <= array_len - mid ) then ! The left run is shorter.
+                buf(0:mid-1) = array(0:mid-1)
+                i = 0
+                j = mid
+                merge_lower: do k = 0, array_len-1
+                    if ( buf(i) <= array(j) ) then
+                        array(k) = buf(i)
+                        i = i + 1
+                        if ( i >= mid ) exit merge_lower
+                    else
+                        array(k) = array(j)
+                        j = j + 1
+                        if ( j >= array_len ) then
+                            array(k+1:) = buf(i:mid-1)
+                            exit merge_lower
+                        end if
+                    end if
+                end do merge_lower
+            else ! The right run is shorter ! check that it is stable
+                buf(0:array_len-mid) = array(mid:array_len-1)
+                i = mid - 1
+                j = array_len - mid -1
+                merge_upper: do k = array_len-1, 0, -1
+                    if ( buf(j) >= array(i) ) then
+                        array(k) = buf(j)
+                        j = j - 1
+                        if ( j < 0 ) exit merge_upper
+                    else
+                        array(k) = array(i)
+                        i = i - 1
+                        if ( i < 0 ) then
+                            array(0:k-1) = buf(0:j)
+                            exit merge_upper
+                        end if
+                    end if
+                end do merge_upper
+            end if
+        end subroutine merge
+
+
+        pure subroutine reverse_segment( array )
+! Reverse a segment of an array in place
+            character(len=*), intent(inout) :: array(0:)
+
+            integer(int_size)         :: lo, hi
+            character(len=len(array)) :: temp
+
+            lo = 0
+            hi = size( array, kind=int_size ) - 1
+            do while( lo < hi )
+                temp = array(lo)
+                array(lo) = array(hi)
+                array(hi) = temp
+                lo = lo + 1
+                hi = hi - 1
+            end do
+
+        end subroutine reverse_segment
+
+    end subroutine char_ord_sort
+
+
+    module subroutine string_ord_sort( array, work )
+! A translation to Fortran 2008, of the `"Rust" sort` algorithm found in
+! `slice.rs`
+! https://github.com/rust-lang/rust/blob/90eb44a5897c39e3dff9c7e48e3973671dcd9496/src/liballoc/slice.rs#L2159
+! The Rust version in turn is a simplification of the Timsort algorithm
+! described in
+! https://svn.python.org/projects/python/trunk/Objects/listsort.txt, as
+! it drops both the use of 'galloping' to identify bounds of regions to be
+! sorted and the estimation of the optimal `run size`. However it remains
+! a hybrid sorting algorithm combining an iterative Merge sort controlled
+! by a stack of `RUNS` identified by regions of uniformly decreasing or
+! non-decreasing sequences that may be expanded to a minimum run size and
+! initialy processed by an insertion sort.
+!
+! Note the Fortran implementation simplifies the logic as it only has to
+! deal with Fortran arrays of intrinsic types and not the full generality
+! of Rust's arrays and lists for arbitrary types. It also adds the
+! estimation of the optimal `run size` as suggested in Tim Peters'
+! original `listsort.txt`, and an optional `work` array to be used as
+! scratch memory.
+        type(string_type), intent(inout)           :: array(0:)
+        type(string_type), intent(inout), optional :: work(0:)
+
+        type(string_type), allocatable :: buf(:)
+        integer(int_size)              :: array_size
+        integer                        :: stat
+
+        if ( present(work) ) then
+! Use the work array as scratch memory
+            call merge_sort( array, work )
+        else
+! Allocate a buffer to use as scratch memory.
+            array_size = size( array, kind=int_size )
+            allocate( buf(0:array_size/2-1), stat=stat )
+            if ( stat /= 0 ) error stop "Allocation of buffer failed."
+            call merge_sort( array, buf )
+        end if
+
+    contains
+
+        pure function calc_min_run( n ) result(min_run)
+!! Returns the minimum length of a run from 32-63 so that N/MIN_RUN is
+!! less than or equal to a power of two. See
+!! https://svn.python.org/projects/python/trunk/Objects/listsort.txt
+            integer(int_size)             :: min_run
+            integer(int_size), intent(in) :: n
+
+            integer(int_size) :: num, r
+
+            num = n
+            r = 0_int_size
+
+            do while( num >= 64 )
+                r = ior( r, iand(num, 1_int_size) )
+                num = ishft(num, -1_int_size)
+            end do
+            min_run = num + r
+
+        end function calc_min_run
+
+
+        pure subroutine insertion_sort( array )
+! Sorts `ARRAY` using an insertion sort.
+            type(string_type), intent(inout) :: array(0:)
+
+            integer(int_size) :: i, j
+            type(string_type) :: key
+
+            do j=1, size(array, kind=int_size)-1
+                key = array(j)
+                i = j - 1
+                do while( i >= 0 .and. array(i) > key )
+                    array(i+1) = array(i)
+                    i = i - 1
+                end do
+                array(i+1) = key
+            end do
+
+        end subroutine insertion_sort
+
+
+        pure function collapse( runs ) result ( r )
+! Examine the stack of runs waiting to be merged, identifying adjacent runs
+! to be merged until the stack invariants are restablished:
+!
+! 1. len(-3) > len(-2) + len(-1)
+! 2. len(-2) > len(-1)
+            integer(int_size) :: r
+            type(run_type), intent(in), target :: runs(0:)
+
+            integer(int_size) :: n
+            logical           :: test
+
+            n = size(runs, kind=int_size)
+            test = .false.
+            if (n >= 2) then
+                if ( runs( n-1 ) % base == 0 .or. &
+                     runs( n-2 ) % len <= runs(n-1) % len ) then
+                    test = .true.
+                else if ( n >= 3 ) then ! X exists
+                    if ( runs(n-3) % len <= &
+                         runs(n-2) % len + runs(n-1) % len ) then
+                        test = .true.
+!               |X| <= |Y| + |Z| => will need to merge due to rho1 or rho2
+                    else if( n >= 4 ) then
+                        if ( runs(n-4) % len <= &
+                             runs(n-3) % len + runs(n-2) % len ) then
+                            test = .true.
+!               |W| <= |X| + |Y| => will need to merge due to rho1 or rho3
+                        end if
+                    end if
+                end if
+            end if
+            if ( test ) then
+! By default merge Y & Z, rho2 or rho3
+                if ( n >= 3 ) then
+                    if ( runs(n-3) % len < runs(n-1) % len ) then
+                        r = n - 3
+! |X| < |Z| => merge X & Y, rho1
+                        return
+                    end if
+                end if
+                r = n - 2
+! |Y| <= |Z| => merge Y & Z, rho4
+                return
+            else
+                r = -1
+            end if
+
+        end function collapse
+
+
+        pure subroutine insert_head( array )
+! Inserts `array(0)` into the pre-sorted sequence `array(1:)` so that the
+! whole `array(0:)` becomes sorted, copying the first element into
+! a temporary variable, iterating until the right place for it is found.
+! copying every traversed element into the slot preceding it, and finally,
+! copying data from the temporary variable into the resulting hole.
+
+            type(string_type), intent(inout) :: array(0:)
+
+            type(string_type) :: tmp
+            integer(int_size) :: i
+
+            tmp = array(0)
+            find_hole: do i=1, size(array, kind=int_size)-1
+                if ( array(i) >= tmp ) exit find_hole
+                array(i-1) = array(i)
+            end do find_hole
+            array(i-1) = tmp
+
+        end subroutine insert_head
+
+
+        subroutine merge_sort( array, buf )
+! The Rust merge sort borrows some (but not all) of the ideas from TimSort,
+! which is described in detail at
+! (http://svn.python.org/projects/python/trunk/Objects/listsort.txt).
+!
+! The algorithm identifies strictly descending and non-descending
+! subsequences, which are called natural runs. Where these runs are less
+! than a minimum run size they are padded by adding additional samples
+! using an insertion sort. The merge process is driven by a stack of
+! pending unmerged runs. Each newly found run is pushed onto the stack,
+! and then pairs of adjacentd runs are merged until these two invariants
+! are satisfied:
+!
+! 1. for every `i` in `1..size(runs)-1`: `runs(i - 1)%len > runs(i)%len`
+! 2. for every `i` in `2..size(runs)-1`: `runs(i - 2)%len >
+!    runs(i - 1)%len + runs(i)%len`
+!
+! The invariants ensure that the total running time is `O(n log n)`
+! worst-case.
+
+            type(string_type), intent(inout) :: array(0:)
+            type(string_type), intent(inout) :: buf(0:)
+
+            integer(int_size) :: array_size, finish, min_run, r, r_count, &
+                start
+            type(run_type)    :: runs(0:max_merge_stack-1), left, right
+
+            array_size = size(array, kind=int_size)
+
+! Very short runs are extended using insertion sort to span at least
+! min_run elements. Slices of up to this length are sorted using insertion
+! sort.
+            min_run = calc_min_run( array_size )
+
+            if ( array_size <= min_run ) then
+                if ( array_size >= 2 ) call insertion_sort( array )
+                return
+            end if
+
+! Following Rust sort, natural runs in `array` are identified by traversing
+! it backwards. By traversing it backward, merges more often go in the
+! opposite direction (forwards). According to developers of Rust sort,
+! merging forwards is slightly faster than merging backwards. Therefore
+! identifying runs by traversing backwards should improve performance.
+            r_count = 0
+            finish = array_size - 1
+            do while ( finish >= 0 )
+! Find the next natural run, and reverse it if it's strictly descending.
+                start = finish
+                if ( start > 0 ) then
+                    start = start - 1
+                    if ( array(start+1) < array(start) ) then
+                        Descending: do while ( start > 0 )
+                            if ( array(start) >= array(start-1) ) &
+                                exit Descending
+                            start = start - 1
+                        end do Descending
+                        call reverse_segment( array(start:finish) )
+                    else
+                        Ascending: do while( start > 0 )
+                            if ( array(start) < array(start-1) ) exit Ascending
+                            start = start - 1
+                        end do Ascending
+                    end if
+                end if
+
+! If the run is too short insert some more elements using an insertion sort.
+                Insert: do while ( start > 0 )
+                    if ( finish - start >= min_run - 1 ) exit Insert
+                    start = start - 1
+                    call insert_head( array(start:finish) )
+                end do Insert
+                if ( start == 0 .and. finish == array_size - 1 ) return
+
+                runs(r_count) = run_type( base = start, &
+                                          len = finish - start + 1 )
+                finish = start-1
+                r_count = r_count + 1
+
+! Determine whether pairs of adjacent runs need to be merged to satisfy
+! the invariants, and, if so, merge them.
+                Merge_loop: do
+                    r = collapse( runs(0:r_count - 1) )
+                    if ( r < 0 .or. r_count <= 1 ) exit Merge_loop
+                    left = runs( r + 1 )
+                    right = runs( r )
+                    call merge( array( left % base: &
+                                       right % base + right % len - 1 ), &
+                                left % len, buf )
+
+                    runs(r) = run_type( base = left % base, &
+                                        len = left % len + right % len )
+                    if ( r == r_count - 3 ) runs(r+1) = runs(r+2)
+                    r_count = r_count - 1
+
+                end do Merge_loop
+            end do
+            if ( r_count /= 1 ) &
+                error stop "MERGE_SORT completed without RUN COUNT == 1."
+
+        end subroutine merge_sort
+
+
+        pure subroutine merge( array, mid, buf )
+! Merges the two non-decreasing runs `ARRAY(0:MID-1)` and `ARRAY(MID:)`
+! using `BUF` as temporary storage, and stores the merged runs into
+! `ARRAY(0:)`. `MID` must be > 0, and < `SIZE(ARRAY)-1`. Buffer `BUF`
+! must be long enough to hold the shorter of the two runs.
+            type(string_type), intent(inout) :: array(0:)
+            integer(int_size), intent(in)    :: mid
+            type(string_type), intent(inout) :: buf(0:)
+
+            integer(int_size) :: array_len, i, j, k
+
+            array_len = size(array, kind=int_size)
+
+! Merge first copies the shorter run into `buf`. Then, depending on which
+! run was shorter, it traces the copied run and the longer run forwards
+! (or backwards), comparing their next unprocessed elements and then
+! copying the lesser (or greater) one into `array`.
+
+            if ( mid <= array_len - mid ) then ! The left run is shorter.
+                buf(0:mid-1) = array(0:mid-1)
+                i = 0
+                j = mid
+                merge_lower: do k = 0, array_len-1
+                    if ( buf(i) <= array(j) ) then
+                        array(k) = buf(i)
+                        i = i + 1
+                        if ( i >= mid ) exit merge_lower
+                    else
+                        array(k) = array(j)
+                        j = j + 1
+                        if ( j >= array_len ) then
+                            array(k+1:) = buf(i:mid-1)
+                            exit merge_lower
+                        end if
+                    end if
+                end do merge_lower
+            else ! The right run is shorter ! check that it is stable
+                buf(0:array_len-mid) = array(mid:array_len-1)
+                i = mid - 1
+                j = array_len - mid -1
+                merge_upper: do k = array_len-1, 0, -1
+                    if ( buf(j) >= array(i) ) then
+                        array(k) = buf(j)
+                        j = j - 1
+                        if ( j < 0 ) exit merge_upper
+                    else
+                        array(k) = array(i)
+                        i = i - 1
+                        if ( i < 0 ) then
+                            array(0:k-1) = buf(0:j)
+                            exit merge_upper
+                        end if
+                    end if
+                end do merge_upper
+            end if
+        end subroutine merge
+
+
+        pure subroutine reverse_segment( array )
+! Reverse a segment of an array in place
+            type(string_type), intent(inout) :: array(0:)
+
+            integer(int_size) :: lo, hi
+            type(string_type) :: temp
+
+            lo = 0
+            hi = size( array, kind=int_size ) - 1
+            do while( lo < hi )
+                temp = array(lo)
+                array(lo) = array(hi)
+                array(hi) = temp
+                lo = lo + 1
+                hi = hi - 1
+            end do
+
+        end subroutine reverse_segment
+
+    end subroutine string_ord_sort
+
+end submodule stdlib_sorting_ord_sort
+
diff --git a/src/stdlib_sorting_sort.fypp b/src/stdlib_sorting_sort.fypp
new file mode 100644
index 000000000..fab24aa1c
--- /dev/null
+++ b/src/stdlib_sorting_sort.fypp
@@ -0,0 +1,736 @@
+#! Integer kinds to be considered during templating
+#:set INT_KINDS = ["int8", "int16", "int32", "int64"]
+
+#! Real kinds to be considered during templating
+#:set REAL_KINDS = ["sp", "dp", "qp"]
+
+!! Licensing:
+!!
+!! This file is subjec† both to the Fortran Standard Library license, and
+!! to additional licensing requirements as it contains translations of
+!! other software.
+!!
+!! The Fortran Standard Library, including this file, is distributed under
+!! the MIT license that should be included with the library's distribution.
+!!
+!!   Copyright (c) 2021 Fortran stdlib developers
+!!
+!!   Permission is hereby granted, free of charge, to any person obtaining a
+!!   copy of this software and associated documentation files (the
+!!   "Software"),  to deal in the Software without restriction, including
+!!   without limitation the rights to use, copy, modify, merge, publish,
+!!   distribute, sublicense, and/or sellcopies of the Software, and to permit
+!!   persons to whom the Software is furnished to do so, subject to the
+!!   following conditions:
+!!
+!!   The above copyright notice and this permission notice shall be included
+!!   in all copies or substantial portions of the Software.
+!!
+!!   THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
+!!   OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+!!   MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
+!!   IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
+!!   CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
+!!   TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
+!!   SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
+!!
+!! The generic subroutine, `SORT`, is substantially a
+!! translation to Fortran 2008, of the `introsort` of David Musser.
+!! David Musser has given permission to include a variant of `introsort`
+!! in the Fortran Standard Library under the MIT license provided
+!! we cite:
+!!
+!!   Musser, D.R., “Introspective Sorting and Selection Algorithms,”
+!!   Software—Practice and Experience, Vol. 27(8), 983–993 (August 1997).
+!!
+!! as the official source of the algorithm.
+
+submodule(stdlib_sorting) stdlib_sorting_sort
+!! This submodule implements the overloaded sorting subroutine `SORT`
+!! that can be used to sort four kinds of `INTEGER` arrays and three kinds
+!! of `REAL` arrays. Sorting is in order of increasing value, with the worst
+!! case run time performance of `O(N Ln(N))`.
+!!
+!! `SORT` uses the `INTROSORT` sorting algorithm of David Musser,
+!! http://www.cs.rpi.edu/~musser/gp/introsort.ps. `introsort` is a hybrid
+!! unstable comparison algorithm combining `quicksort`, `insertion sort`, and
+!! `heap sort`. While this algorithm is always O(N Ln(N)) it is relatively
+!! fast on randomly ordered data, but inconsistent in performance on partly
+!! sorted data, sometimes having `merge sort` performance, sometimes having
+!! better than `quicksort` performance.
+
+    implicit none
+
+contains
+
+
+#:for k1 in INT_KINDS
+
+    pure module subroutine ${k1}$_sort( array )
+! `${k1}$_sort( array )` sorts the input `ARRAY` of type `INTEGER(${k1}$)`
+! using a hybrid sort based on the `introsort` of David Musser. As with
+! `introsort`, `${k1}$_sort( array )` is an unstable hybrid comparison
+! algorithm using `quicksort` for the main body of the sort tree,
+! supplemented by `insertion sort` for the outer brances, but if
+! `quicksort` is converging too slowly the algorithm resorts
+! to `heapsort`. The algorithm is of order O(N Ln(N)) for all inputs.
+! Because it relies on `quicksort`, the coefficient of the O(N Ln(N))
+! behavior is typically small compared to other sorting algorithms.
+
+        integer(${k1}$), intent(inout) :: array(0:)
+
+        integer(int32) :: depth_limit
+
+        depth_limit = 2 * int( floor( log( real( size( array, kind=int64 ),  &
+                                                 kind=dp) ) / log(2.0_dp) ), &
+                               kind=int32 )
+        call introsort(array, depth_limit)
+
+    contains
+
+        pure recursive subroutine introsort( array, depth_limit )
+! It devolves to `insertionsort` if the remaining number of elements
+! is fewer than or equal to `INSERT_SIZE`, `heapsort` if the completion
+! of the `quicksort` is too slow as estimated from `DEPTH_LIMIT`,
+! otherwise sorting is done by a `quicksort`.
+            integer(${k1}$), intent(inout) :: array(0:)
+            integer(int32), intent(in)     :: depth_limit
+
+            integer(int_size), parameter :: insert_size = 16_int_size
+            integer(int_size)            :: index
+
+            if ( size(array, kind=int_size) <= insert_size ) then
+                ! May be best at the end of SORT processing the whole array
+                ! See Musser, D.R., “Introspective Sorting and Selection
+                ! Algorithms,” Software—Practice and Experience, Vol. 27(8),
+                ! 983–993 (August 1997).
+
+                call insertion_sort( array )
+            else if ( depth_limit == 0 ) then
+                call heap_sort( array )
+            else
+                call partition( array, index )
+                call introsort( array(0:index-1), depth_limit-1 )
+                call introsort( array(index+1:), depth_limit-1 )
+            end if
+
+        end subroutine introsort
+
+
+        pure subroutine partition( array, index )
+! quicksort partition using median of three.
+            integer(${k1}$), intent(inout) :: array(0:)
+            integer(int_size), intent(out) :: index
+
+            integer(${k1}$) :: u, v, w, x, y
+            integer(int_size) :: i, j
+
+! Determine median of three and exchange it with the end.
+            u = array( 0 )
+            v = array( size(array, kind=int_size)/2-1 )
+            w = array( size(array, kind=int_size)-1 )
+            if ( (u > v) .neqv. (u > w) ) then
+                x = u
+                y = array(0)
+                array(0) = array( size( array, kind=int_size ) - 1 )
+                array( size( array, kind=int_size ) - 1 ) = y
+            else if ( (v < u) .neqv. (v < w) ) then
+                x = v
+                y = array(size( array, kind=int_size )/2-1)
+                array( size( array, kind=int_size )/2-1 ) = &
+                    array( size( array, kind=int_size )-1 )
+                array( size( array, kind=int_size )-1 ) = y
+            else
+                x = w
+            end if
+! Partition the array.
+            i = -1_int_size
+            do j = 0_int_size, size(array, kind=int_size)-2
+                if ( array(j) <= x ) then
+                    i = i + 1
+                    y = array(i)
+                    array(i) = array(j)
+                    array(j) = y
+                end if
+            end do
+            y = array(i+1)
+            array(i+1) = array(size(array, kind=int_size)-1)
+            array(size(array, kind=int_size)-1) = y
+            index = i + 1
+
+        end subroutine partition
+
+        pure subroutine insertion_sort( array )
+! Bog standard insertion sort.
+            integer(${k1}$), intent(inout) :: array(0:)
+
+            integer(int_size) :: i, j
+            integer(${k1}$)   :: key
+
+            do j=1_int_size, size(array, kind=int_size)-1
+                key = array(j)
+                i = j - 1
+                do while( i >= 0 .and. array(i) > key )
+                    array(i+1) = array(i)
+                    i = i - 1
+                end do
+                array(i+1) = key
+            end do
+
+        end subroutine insertion_sort
+
+        pure subroutine heap_sort( array )
+! A bog standard heap sort
+            integer(${k1}$), intent(inout) :: array(0:)
+
+            integer(int_size) :: i, heap_size
+            integer(${k1}$)   :: y
+
+            heap_size = size( array, kind=int_size )
+! Build the max heap
+            do i = (heap_size-2)/2_int_size, 0_int_size, -1_int_size
+                call max_heapify( array, i, heap_size )
+            end do
+            do i = heap_size-1, 1_int_size, -1_int_size
+! Swap the first element with the current final element
+                y = array(0)
+                array(0) = array(i)
+                array(i) = y
+! Sift down using max_heapify
+                call max_heapify( array, 0_int_size, i )
+            end do
+
+        end subroutine heap_sort
+
+        pure recursive subroutine max_heapify( array, i, heap_size )
+! Transform the array into a max heap
+            integer(${k1}$), intent(inout) :: array(0:)
+            integer(int_size), intent(in)  :: i, heap_size
+
+            integer(int_size) :: l, r, largest
+            integer(${k1}$)   :: y
+
+            largest = i
+            l = 2_int_size * i + 1_int_size
+            r = l + 1_int_size
+            if ( l < heap_size ) then
+                if ( array(l) > array(largest) ) largest = l
+            end if
+            if ( r < heap_size ) then
+                if ( array(r) > array(largest) ) largest = r
+            end if
+            if ( largest /= i ) then
+                y = array(i)
+                array(i) = array(largest)
+                array(largest) = y
+                call max_heapify( array, largest, heap_size )
+            end if
+
+        end subroutine max_heapify
+
+    end subroutine ${k1}$_sort
+
+#:endfor
+
+
+#:for k1 in REAL_KINDS
+
+    pure module subroutine ${k1}$_sort( array )
+
+! `${k1}$_sort( array )` sorts the input `ARRAY` of type `REAL(${k1}$)`
+! using a hybrid sort based on the `introsort` of David Musser. As with
+! `introsort`, `${k1}$_sort( array )` is an unstable hybrid comparison
+! algorithm using `quicksort` for the main body of the sort tree,
+! supplemented by `insertion sort` for the outer brances, but if
+! `quicksort` is converging too slowly the algorithm resorts
+! to `heapsort`. The algorithm is of order O(N Ln(N)) for all inputs.
+! Because it relies on `quicksort`, the coefficient of the O(N Ln(N))
+! behavior is typically small compared to other sorting algorithms.
+
+        real(${k1}$), intent(inout) :: array(0:)
+        integer(int32)              :: depth_limit
+
+        depth_limit = &
+            2 * int( floor( log( real( size( array, kind=int_size ),   &
+                                             kind=dp) ) / log(2.0_dp) ), &
+                               kind=int32 )
+        call introsort(array, depth_limit)
+
+    contains
+
+        pure recursive subroutine introsort( array, depth_limit )
+! It devolves to `insertionsort` if the remaining number of elements
+! is fewer than or equal to `INSERT_SIZE`, `heapsort` if the completion of
+! the quicksort is too slow as estimated from `DEPTH_LIMIT`, otherwise
+! sorting is done by a `quicksort`.
+            real(${k1}$), intent(inout) :: array(0:)
+            integer(int32), intent(in)  :: depth_limit
+
+            integer(int_size), parameter :: insert_size = 16_int_size
+            integer(int_size)            :: index
+
+            if ( size(array, kind=int_size) <= insert_size ) then
+                ! May be best at the end of SORT processing the whole array
+                ! See Musser, D.R., “Introspective Sorting and Selection
+                ! Algorithms,” Software—Practice and Experience, Vol. 27(8),
+                ! 983–993 (August 1997).
+
+                call insertion_sort( array )
+
+            else if ( depth_limit == 0 ) then
+                call heap_sort( array )
+
+            else
+                call partition( array, index )
+                call introsort( array(0:index-1), depth_limit-1 )
+                call introsort( array(index+1:), depth_limit-1 )
+            end if
+
+        end subroutine introsort
+
+
+        pure subroutine partition( array, index )
+! quicksort partition using median of three.
+            real(${k1}$), intent(inout)    :: array(0:)
+            integer(int_size), intent(out) :: index
+
+            real(${k1}$)      :: u, v, w, x, y
+            integer(int_size) :: i, j
+
+! Determine median of three and exchange it with the end.
+            u = array( 0 )
+            v = array( size(array, kind=int_size)/2-1 )
+            w = array( size(array, kind=int_size)-1 )
+            if ( (u > v) .neqv. (u > w) ) then
+                x = u
+                y = array(0)
+                array(0) = array( size( array, kind=int_size ) - 1 )
+                array( size( array, kind=int_size ) - 1 ) = y
+            else if ( (v < u) .neqv. (v < w) ) then
+                x = v
+                y = array(size( array, kind=int_size )/2-1)
+                array(size( array, kind=int_size )/2-1) = &
+                    array(size( array, kind=int_size )-1)
+                array( size( array, kind=int_size )-1 ) = y
+            else
+                x = w
+            end if
+! Partition the array
+            i = -1_int_size
+            do j = 0_int_size, size(array, kind=int_size)-2
+                if ( array(j) <= x ) then
+                    i = i + 1
+                    y = array(i)
+                    array(i) = array(j)
+                    array(j) = y
+                end if
+            end do
+            y = array(i+1)
+            array(i+1) = array(size(array, kind=int_size)-1)
+            array(size(array, kind=int_size)-1) = y
+            index = i + 1
+
+        end subroutine partition
+
+        pure subroutine insertion_sort( array )
+! Bog standard insertion sort.
+            real(${k1}$), intent(inout) :: array(0:)
+
+            integer(int_size) :: i, j
+            real(${k1}$)      :: key
+
+            do j=1_int_size, size(array, kind=int_size)-1
+                key = array(j)
+                i = j - 1
+                do while( i >= 0 .and. array(i) > key )
+                    array(i+1) = array(i)
+                    i = i - 1
+                end do
+                array(i+1) = key
+            end do
+
+        end subroutine insertion_sort
+
+        pure subroutine heap_sort( array )
+! A bog standard heap sort
+            real(${k1}$), intent(inout) :: array(0:)
+
+            integer(int_size) :: i, heap_size
+            real(${k1}$)      :: y
+
+            heap_size = size( array, kind=int_size )
+! Build the max heap
+            do i = (heap_size-2)/2_int_size, 0_int_size, -1_int_size
+                call max_heapify( array, i, heap_size )
+            end do
+            do i = heap_size-1, 1_int_size, -1_int_size
+! Swap the first element with the current final element
+                y = array(0)
+                array(0) = array(i)
+                array(i) = y
+! Sift down using max_heapify
+                call max_heapify( array, 0_int_size, i )
+            end do
+
+        end subroutine heap_sort
+
+        pure recursive subroutine max_heapify( array, i, heap_size )
+! Transform the array into a max heap
+            real(${k1}$), intent(inout)   :: array(0:)
+            integer(int_size), intent(in) :: i, heap_size
+
+            integer(int_size) :: l, r, largest
+            real(${k1}$)      :: y
+
+            largest = i
+            l = 2_int_size * i + 1_int_size
+            r = l + 1_int_size
+            if ( l < heap_size ) then
+                if ( array(l) > array(largest) ) largest = l
+            end if
+            if ( r < heap_size ) then
+                if ( array(r) > array(largest) ) largest = r
+            end if
+            if ( largest /= i ) then
+                y = array(i)
+                array(i) = array(largest)
+                array(largest) = y
+                call max_heapify( array, largest, heap_size )
+            end if
+
+        end subroutine max_heapify
+
+    end subroutine ${k1}$_sort
+
+#:endfor
+
+
+    pure module subroutine char_sort( array )
+! `char_sort( array )` sorts the input `ARRAY` of type `CHARACTER(*)`
+! using a hybrid sort based on the `introsort` of David Musser. As with
+! `introsort`, `char_sort( array )` is an unstable hybrid comparison
+! algorithm using `quicksort` for the main body of the sort tree,
+! supplemented by `insertion sort` for the outer brances, but if
+! `quicksort` is converging too slowly the algorithm resorts
+! to `heapsort`. The algorithm is of order O(N Ln(N)) for all inputs.
+! Because it relies on `quicksort`, the coefficient of the O(N Ln(N))
+! behavior is typically small compared to other sorting algorithms.
+
+        character(len=*), intent(inout) :: array(0:)
+
+        integer(int32) :: depth_limit
+
+        depth_limit = 2 * int( floor( log( real( size( array, kind=int64 ),  &
+                                                 kind=dp) ) / log(2.0_dp) ), &
+                               kind=int32 )
+        call introsort(array, depth_limit)
+
+    contains
+
+        pure recursive subroutine introsort( array, depth_limit )
+! It devolves to `insertionsort` if the remaining number of elements
+! is fewer than or equal to `INSERT_SIZE`, `heapsort` if the completion
+! of the `quicksort` is too slow as estimated from `DEPTH_LIMIT`,
+! otherwise sorting is done by a `quicksort`.
+            character(len=*), intent(inout) :: array(0:)
+            integer(int32), intent(in)      :: depth_limit
+
+            integer(int_size), parameter :: insert_size = 16_int_size
+            integer(int_size)            :: index
+
+            if ( size(array, kind=int_size) <= insert_size ) then
+                ! May be best at the end of SORT processing the whole array
+                ! See Musser, D.R., “Introspective Sorting and Selection
+                ! Algorithms,” Software—Practice and Experience, Vol. 27(8),
+                ! 983–993 (August 1997).
+
+                call insertion_sort( array )
+            else if ( depth_limit == 0 ) then
+                call heap_sort( array )
+            else
+                call partition( array, index )
+                call introsort( array(0:index-1), depth_limit-1 )
+                call introsort( array(index+1:), depth_limit-1 )
+            end if
+
+        end subroutine introsort
+
+
+        pure subroutine partition( array, index )
+! quicksort partition using median of three.
+            character(len=*), intent(inout) :: array(0:)
+            integer(int_size), intent(out)  :: index
+
+            integer(int_size)         :: i, j
+            character(len=len(array)) :: u, v, w, x, y
+
+! Determine median of three and exchange it with the end.
+            u = array( 0 )
+            v = array( size(array, kind=int_size)/2-1 )
+            w = array( size(array, kind=int_size)-1 )
+            if ( (u > v) .neqv. (u > w) ) then
+                x = u
+                y = array(0)
+                array(0) = array( size( array, kind=int_size ) - 1 )
+                array( size( array, kind=int_size ) - 1 ) = y
+            else if ( (v < u) .neqv. (v < w) ) then
+                x = v
+                y = array(size( array, kind=int_size )/2-1)
+                array( size( array, kind=int_size )/2-1 ) = &
+                    array( size( array, kind=int_size )-1 )
+                array( size( array, kind=int_size )-1 ) = y
+            else
+                x = w
+            end if
+! Partition the array.
+            i = -1_int_size
+            do j = 0_int_size, size(array, kind=int_size)-2
+                if ( array(j) <= x ) then
+                    i = i + 1
+                    y = array(i)
+                    array(i) = array(j)
+                    array(j) = y
+                end if
+            end do
+            y = array(i+1)
+            array(i+1) = array(size(array, kind=int_size)-1)
+            array(size(array, kind=int_size)-1) = y
+            index = i + 1
+
+        end subroutine partition
+
+        pure subroutine insertion_sort( array )
+! Bog standard insertion sort.
+            character(len=*), intent(inout) :: array(0:)
+
+            integer(int_size)         :: i, j
+            character(len=len(array)) :: key
+
+            do j=1_int_size, size(array, kind=int_size)-1
+                key = array(j)
+                i = j - 1
+                do while( i >= 0 .and. array(i) > key )
+                    array(i+1) = array(i)
+                    i = i - 1
+                end do
+                array(i+1) = key
+            end do
+
+        end subroutine insertion_sort
+
+        pure subroutine heap_sort( array )
+! A bog standard heap sort
+            character(len=*), intent(inout) :: array(0:)
+
+            integer(int_size)         :: i, heap_size
+            character(len=len(array)) :: y
+
+            heap_size = size( array, kind=int_size )
+! Build the max heap
+            do i = (heap_size-2)/2_int_size, 0_int_size, -1_int_size
+                call max_heapify( array, i, heap_size )
+            end do
+            do i = heap_size-1, 1_int_size, -1_int_size
+! Swap the first element with the current final element
+                y = array(0)
+                array(0) = array(i)
+                array(i) = y
+! Sift down using max_heapify
+                call max_heapify( array, 0_int_size, i )
+            end do
+
+        end subroutine heap_sort
+
+        pure recursive subroutine max_heapify( array, i, heap_size )
+! Transform the array into a max heap
+            character(len=*), intent(inout) :: array(0:)
+            integer(int_size), intent(in)   :: i, heap_size
+
+            integer(int_size)         :: l, r, largest
+            character(len=len(array)) :: y
+
+            largest = i
+            l = 2_int_size * i + 1_int_size
+            r = l + 1_int_size
+            if ( l < heap_size ) then
+                if ( array(l) > array(largest) ) largest = l
+            end if
+            if ( r < heap_size ) then
+                if ( array(r) > array(largest) ) largest = r
+            end if
+            if ( largest /= i ) then
+                y = array(i)
+                array(i) = array(largest)
+                array(largest) = y
+                call max_heapify( array, largest, heap_size )
+            end if
+
+        end subroutine max_heapify
+
+    end subroutine char_sort
+
+    pure module subroutine string_sort( array )
+! `string_sort( array )` sorts the input `ARRAY` of type `STRING_TyPE`
+! using a hybrid sort based on the `introsort` of David Musser. As with
+! `introsort`, `string_sort( array )` is an unstable hybrid comparison
+! algorithm using `quicksort` for the main body of the sort tree,
+! supplemented by `insertion sort` for the outer brances, but if
+! `quicksort` is converging too slowly the algorithm resorts
+! to `heapsort`. The algorithm is of order O(N Ln(N)) for all inputs.
+! Because it relies on `quicksort`, the coefficient of the O(N Ln(N))
+! behavior is typically small compared to other sorting algorithms.
+
+        type(string_type), intent(inout) :: array(0:)
+
+        integer(int32) :: depth_limit
+
+        depth_limit = 2 * int( floor( log( real( size( array, kind=int64 ),  &
+                                                 kind=dp) ) / log(2.0_dp) ), &
+                               kind=int32 )
+        call introsort(array, depth_limit)
+
+    contains
+
+        pure recursive subroutine introsort( array, depth_limit )
+! It devolves to `insertionsort` if the remaining number of elements
+! is fewer than or equal to `INSERT_SIZE`, `heapsort` if the completion
+! of the `quicksort` is too slow as estimated from `DEPTH_LIMIT`,
+! otherwise sorting is done by a `quicksort`.
+            type(string_type), intent(inout) :: array(0:)
+            integer(int32), intent(in)      :: depth_limit
+
+            integer(int_size), parameter :: insert_size = 16_int_size
+            integer(int_size)            :: index
+
+            if ( size(array, kind=int_size) <= insert_size ) then
+                ! May be best at the end of SORT processing the whole array
+                ! See Musser, D.R., “Introspective Sorting and Selection
+                ! Algorithms,” Software—Practice and Experience, Vol. 27(8),
+                ! 983–993 (August 1997).
+
+                call insertion_sort( array )
+            else if ( depth_limit == 0 ) then
+                call heap_sort( array )
+            else
+                call partition( array, index )
+                call introsort( array(0:index-1), depth_limit-1 )
+                call introsort( array(index+1:), depth_limit-1 )
+            end if
+
+        end subroutine introsort
+
+
+        pure subroutine partition( array, index )
+! quicksort partition using median of three.
+            type(string_type), intent(inout) :: array(0:)
+            integer(int_size), intent(out)  :: index
+
+            integer(int_size) :: i, j
+            type(string_type) :: u, v, w, x, y
+
+! Determine median of three and exchange it with the end.
+            u = array( 0 )
+            v = array( size(array, kind=int_size)/2-1 )
+            w = array( size(array, kind=int_size)-1 )
+            if ( (u > v) .neqv. (u > w) ) then
+                x = u
+                y = array(0)
+                array(0) = array( size( array, kind=int_size ) - 1 )
+                array( size( array, kind=int_size ) - 1 ) = y
+            else if ( (v < u) .neqv. (v < w) ) then
+                x = v
+                y = array(size( array, kind=int_size )/2-1)
+                array( size( array, kind=int_size )/2-1 ) = &
+                    array( size( array, kind=int_size )-1 )
+                array( size( array, kind=int_size )-1 ) = y
+            else
+                x = w
+            end if
+! Partition the array.
+            i = -1_int_size
+            do j = 0_int_size, size(array, kind=int_size)-2
+                if ( array(j) <= x ) then
+                    i = i + 1
+                    y = array(i)
+                    array(i) = array(j)
+                    array(j) = y
+                end if
+            end do
+            y = array(i+1)
+            array(i+1) = array(size(array, kind=int_size)-1)
+            array(size(array, kind=int_size)-1) = y
+            index = i + 1
+
+        end subroutine partition
+
+        pure subroutine insertion_sort( array )
+! Bog standard insertion sort.
+            type(string_type), intent(inout) :: array(0:)
+
+            integer(int_size) :: i, j
+            type(string_type) :: key
+
+            do j=1_int_size, size(array, kind=int_size)-1
+                key = array(j)
+                i = j - 1
+                do while( i >= 0 .and. array(i) > key )
+                    array(i+1) = array(i)
+                    i = i - 1
+                end do
+                array(i+1) = key
+            end do
+
+        end subroutine insertion_sort
+
+        pure subroutine heap_sort( array )
+! A bog standard heap sort
+            type(string_type), intent(inout) :: array(0:)
+
+            integer(int_size) :: i, heap_size
+            type(string_type) :: y
+
+            heap_size = size( array, kind=int_size )
+! Build the max heap
+            do i = (heap_size-2)/2_int_size, 0_int_size, -1_int_size
+                call max_heapify( array, i, heap_size )
+            end do
+            do i = heap_size-1, 1_int_size, -1_int_size
+! Swap the first element with the current final element
+                y = array(0)
+                array(0) = array(i)
+                array(i) = y
+! Sift down using max_heapify
+                call max_heapify( array, 0_int_size, i )
+            end do
+
+        end subroutine heap_sort
+
+        pure recursive subroutine max_heapify( array, i, heap_size )
+! Transform the array into a max heap
+            type(string_type), intent(inout) :: array(0:)
+            integer(int_size), intent(in)    :: i, heap_size
+
+            integer(int_size) :: l, r, largest
+            type(string_type) :: y
+
+            largest = i
+            l = 2_int_size * i + 1_int_size
+            r = l + 1_int_size
+            if ( l < heap_size ) then
+                if ( array(l) > array(largest) ) largest = l
+            end if
+            if ( r < heap_size ) then
+                if ( array(r) > array(largest) ) largest = r
+            end if
+            if ( largest /= i ) then
+                y = array(i)
+                array(i) = array(largest)
+                array(largest) = y
+                call max_heapify( array, largest, heap_size )
+            end if
+
+        end subroutine max_heapify
+
+    end subroutine string_sort
+
+end submodule stdlib_sorting_sort
diff --git a/src/stdlib_sorting_sort_index.fypp b/src/stdlib_sorting_sort_index.fypp
new file mode 100644
index 000000000..4146e77f1
--- /dev/null
+++ b/src/stdlib_sorting_sort_index.fypp
@@ -0,0 +1,1684 @@
+#! Integer kinds to be considered during templating
+#:set INT_KINDS = ["int8", "int16", "int32", "int64"]
+
+#! Real kinds to be considered during templating
+#:set REAL_KINDS = ["sp", "dp", "qp"]
+
+!! Licensing:
+!!
+!! This file is subjec† both to the Fortran Standard Library license, and
+!! to additional licensing requirements as it contains translations of
+!! other software.
+!!
+!! The Fortran Standard Library, including this file, is distributed under
+!! the MIT license that should be included with the library's distribution.
+!!
+!!   Copyright (c) 2021 Fortran stdlib developers
+!!
+!!   Permission is hereby granted, free of charge, to any person obtaining a
+!!   copy of this software and associated documentation files (the
+!!   "Software"),  to deal in the Software without restriction, including
+!!   without limitation the rights to use, copy, modify, merge, publish,
+!!   distribute, sublicense, and/or sellcopies of the Software, and to permit
+!!   persons to whom the Software is furnished to do so, subject to the
+!!   following conditions:
+!!
+!!   The above copyright notice and this permission notice shall be included
+!!   in all copies or substantial portions of the Software.
+!!
+!!   THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
+!!   OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+!!   MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
+!!   IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
+!!   CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
+!!   TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
+!!   SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
+!!
+!! The generic subroutine, `SORT_INDEX`, is substantially a translation to
+!! Fortran 2008 of the `"Rust" sort` sorting routines in
+!! [`slice.rs`](https://github.com/rust-lang/rust/blob/90eb44a5897c39e3dff9c7e48e3973671dcd9496/src/liballoc/slice.rs)
+!! The `rust sort` implementation is distributed with the header:
+!!
+!!   Copyright 2012-2015 The Rust Project Developers. See the COPYRIGHT
+!!   file at the top-level directory of this distribution and at
+!!   http://rust-lang.org/COPYRIGHT.
+!!
+!!   Licensed under the Apache License, Version 2.0 <LICENSE-APACHE or
+!!   http://www.apache.org/licenses/LICENSE-2.0> or the MIT license
+!!   <LICENSE-MIT or http://opensource.org/licenses/MIT>, at your
+!!   option. This file may not be copied, modified, or distributed
+!!   except according to those terms.
+!!
+!! so the licence for the original`slice.rs` code is compatible with the use
+!! of modified versions of the code in the Fortran Standard Library under
+!! the MIT license.
+
+submodule(stdlib_sorting) stdlib_sorting_sort_index
+
+    implicit none
+
+contains
+
+#:for k1 in INT_KINDS
+
+    module subroutine ${k1}$_sort_index( array, index, work, iwork, reverse )
+! A modification of `${k1}$_ord_sort` to return an array of indices that
+! would provide a stable sort of the `ARRAY` input. The indices by default
+! correspond to a non-decreasing sort, but if the optional argument
+! `REVERSE` is present with a value of `.TRUE.` the indices correspond to
+! a non-increasing sort. The logic of the determination of indexing largely
+! follows the `"Rust" sort` found in `slice.rs`:
+! https://github.com/rust-lang/rust/blob/90eb44a5897c39e3dff9c7e48e3973671dcd9496/src/liballoc/slice.rs#L2159
+! The Rust version is a simplification of the Timsort algorithm described
+! in https://svn.python.org/projects/python/trunk/Objects/listsort.txt, as
+! it drops both the use of 'galloping' to identify bounds of regions to be
+! sorted and the estimation of the optimal `run size`. However it remains
+! a hybrid sorting algorithm combining an iterative Merge sort controlled
+! by a stack of `RUNS` identified by regions of uniformly decreasing or
+! non-decreasing sequences that may be expanded to a minimum run size, with
+! an insertion sort.
+!
+! Note the Fortran implementation simplifies the logic as it only has to
+! deal with Fortran arrays of intrinsic types and not the full generality
+! of Rust's arrays and lists for arbitrary types. It also adds the
+! estimation of the optimal `run size` as suggested in Tim Peters'
+! original listsort.txt, and the optional `work` and `iwork` arraya to be
+! used as scratch memory.
+
+        integer(${k1}$), intent(inout)             :: array(0:)
+        integer(int_size), intent(inout)           :: index(0:)
+        integer(${k1}$), intent(inout), optional   :: work(0:)
+        integer(int_size), intent(inout), optional :: iwork(0:)
+        logical, intent(in), optional              :: reverse
+
+        integer(int_size) :: array_size, i, stat
+        integer(${k1}$), allocatable :: buf(:)
+        integer(int_size), allocatable :: ibuf(:)
+
+        array_size = size(array, kind=int_size)
+
+        do i = 0, array_size-1
+            index(i) = i+1
+        end do
+
+        if ( present(reverse) ) then
+            if ( reverse ) then
+                call reverse_segment( array, index )
+            end if
+        end if
+
+! If necessary allocate buffers to serve as scratch memory.
+        if ( present(work) ) then
+            if ( present(iwork) ) then
+                call merge_sort( array, index, work, iwork )
+            else
+                allocate( ibuf(0:array_size/2-1), stat=stat )
+                if ( stat /= 0 ) error stop "Allocation of index buffer failed."
+                call merge_sort( array, index, work, ibuf )
+            end if
+        else
+            allocate( buf(0:array_size/2-1), stat=stat )
+            if ( stat /= 0 ) error stop "Allocation of array buffer failed."
+            if ( present(iwork) ) then
+                call merge_sort( array, index, buf, iwork )
+            else
+                allocate( ibuf(0:array_size/2-1), stat=stat )
+                if ( stat /= 0 ) error stop "Allocation of index buffer failed."
+                call merge_sort( array, index, buf, ibuf )
+            end if
+        end if
+
+        if ( present(reverse) ) then
+            if ( reverse ) then
+                call reverse_segment( array, index )
+            end if
+        end if
+
+    contains
+
+        pure function calc_min_run( n ) result(min_run)
+!! Returns the minimum length of a run from 32-63 so that N/MIN_RUN is
+!! less than or equal to a power of two. See
+!! https://svn.python.org/projects/python/trunk/Objects/listsort.txt
+            integer(int_size)             :: min_run
+            integer(int_size), intent(in) :: n
+
+            integer(int_size) :: num, r
+
+            num = n
+            r = 0_int_size
+
+            do while( num >= 64 )
+                r = ior( r, iand(num, 1_int_size) )
+                num = ishft(num, -1_int_size)
+            end do
+            min_run = num + r
+
+        end function calc_min_run
+
+
+        pure subroutine insertion_sort( array, index )
+! Sorts `ARRAY` using an insertion sort, while maintaining consistency in
+! location of the indices in `INDEX` to the elements of `ARRAY`.
+            integer(${k1}$), intent(inout) :: array(0:)
+            integer(int_size), intent(inout) :: index(0:)
+
+            integer(int_size) :: i, j, key_index
+            integer(${k1}$) :: key
+
+            do j=1, size(array, kind=int_size)-1
+                key = array(j)
+                key_index = index(j)
+                i = j - 1
+                do while( i >= 0 .and. array(i) > key )
+                    array(i+1) = array(i)
+                    index(i+1) = index(i)
+                    i = i - 1
+                end do
+                array(i+1) = key
+                index(i+1) = key_index
+            end do
+
+        end subroutine insertion_sort
+
+
+        pure function collapse( runs ) result ( r )
+! Examine the stack of runs waiting to be merged, identifying adjacent runs
+! to be merged until the stack invariants are restablished:
+!
+! 1. len(-3) > len(-2) + len(-1)
+! 2. len(-2) > len(-1)
+
+            integer(int_size) :: r
+            type(run_type), intent(in), target :: runs(0:)
+
+            integer(int_size) :: n
+            logical :: test
+
+            n = size(runs, kind=int_size)
+            test = .false.
+            if (n >= 2) then
+                if ( runs( n-1 ) % base == 0 .or. &
+                     runs( n-2 ) % len <= runs(n-1) % len ) then
+                    test = .true.
+                else if ( n >= 3 ) then ! X exists
+                    if ( runs(n-3) % len <= &
+                         runs(n-2) % len + runs(n-1) % len ) then
+                        test = .true.
+!               |X| <= |Y| + |Z| => will need to merge due to rho1 or rho2
+                    else if( n >= 4 ) then
+                        if ( runs(n-4) % len <= &
+                             runs(n-3) % len + runs(n-2) % len ) then
+                            test = .true.
+!               |W| <= |X| + |Y| => will need to merge due to rho1 or rho3
+                        end if
+                    end if
+                end if
+            end if
+            if ( test ) then
+! By default merge Y & Z, rho2 or rho3
+                if ( n >= 3 ) then
+                    if ( runs(n-3) % len < runs(n-1) % len ) then
+                        r = n - 3
+! |X| < |Z| => merge X & Y, rho1
+                        return
+                    end if
+                end if
+                r = n - 2
+! |Y| <= |Z| => merge Y & Z, rho4
+                return
+            else
+                r = -1
+            end if
+
+        end function collapse
+
+
+        pure subroutine insert_head( array, index )
+! Inserts `array(0)` into the pre-sorted sequence `array(1:)` so that the
+! whole `array(0:)` becomes sorted, copying the first element into
+! a temporary variable, iterating until the right place for it is found.
+! copying every traversed element into the slot preceding it, and finally,
+! copying data from the temporary variable into the resulting hole.
+! Consistency of the indices in `index` with the elements of `array`
+! are maintained.
+
+            integer(${k1}$), intent(inout)   :: array(0:)
+            integer(int_size), intent(inout) :: index(0:)
+
+            integer(${k1}$) :: tmp
+            integer(int_size) :: i, tmp_index
+
+            tmp = array(0)
+            tmp_index = index(0)
+            find_hole: do i=1, size(array, kind=int_size)-1
+                if ( array(i) >= tmp ) exit find_hole
+                array(i-1) = array(i)
+                index(i-1) = index(i)
+            end do find_hole
+            array(i-1) = tmp
+            index(i-1) = tmp_index
+
+        end subroutine insert_head
+
+
+        subroutine merge_sort( array, index, buf, ibuf )
+! The Rust merge sort borrows some (but not all) of the ideas from TimSort,
+! which is described in detail at
+! (http://svn.python.org/projects/python/trunk/Objects/listsort.txt).
+!
+! The algorithm identifies strictly descending and non-descending
+! subsequences, which are called natural runs. Where these runs are less
+! than a minimum run size they are padded by adding additional samples
+! using an insertion sort. The merge process is driven by a stack of
+! pending unmerged runs. Each newly found run is pushed onto the stack,
+! and then pairs of adjacentd runs are merged until these two invariants
+! are satisfied:
+!
+! 1. for every `i` in `1..size(runs)-1`: `runs(i - 1)%len > runs(i)%len`
+! 2. for every `i` in `2..size(runs)-1`: `runs(i - 2)%len >
+!    runs(i - 1)%len + runs(i)%len`
+!
+! The invariants ensure that the total running time is `O(n log n)`
+! worst-case. Consistency of the indices in `index` with the elements of
+! `array` are maintained.
+
+            integer(${k1}$), intent(inout)   :: array(0:)
+            integer(int_size), intent(inout) :: index(0:)
+            integer(${k1}$), intent(inout)   :: buf(0:)
+            integer(int_size), intent(inout) :: ibuf(0:)
+
+            integer(int_size) :: array_size, finish, min_run, r, r_count, &
+                start
+            type(run_type) :: runs(0:max_merge_stack-1), left, right
+
+            array_size = size(array, kind=int_size)
+
+! Very short runs are extended using insertion sort to span at least this
+! many elements. Slices of up to this length are sorted using insertion sort.
+            min_run = calc_min_run( array_size )
+
+            if ( array_size <= min_run ) then
+                if ( array_size >= 2 ) call insertion_sort( array, index )
+                return
+            end if
+
+
+! Following Rust sort, natural runs in `array` are identified by traversing
+! it backwards. By traversing it backward, merges more often go in the
+! opposite direction (forwards). According to developers of Rust sort,
+! merging forwards is slightly faster than merging backwards. Therefore
+! identifying runs by traversing backwards should improve performance.
+            r_count = 0
+            finish = array_size - 1
+            do while ( finish >= 0 )
+! Find the next natural run, and reverse it if it's strictly descending.
+                start = finish
+                if ( start > 0 ) then
+                    start = start - 1
+                    if ( array(start+1) < array(start) ) then
+                        Descending: do while ( start > 0 )
+                            if ( array(start) >= array(start-1) ) &
+                                exit Descending
+                            start = start - 1
+                        end do Descending
+                        call reverse_segment( array(start:finish), &
+                                            index(start:finish) )
+                    else
+                        Ascending: do while( start > 0 )
+                            if ( array(start) < array(start-1) ) exit Ascending
+                            start = start - 1
+                        end do Ascending
+                    end if
+                end if
+
+! If the run is too short insert some more elements using an insertion sort.
+                Insert: do while ( start > 0 )
+                    if ( finish - start >= min_run - 1 ) exit Insert
+                    start = start - 1
+                    call insert_head( array(start:finish), index(start:finish) )
+                end do Insert
+                if ( start == 0 .and. finish == array_size - 1 ) return
+
+                runs(r_count) = run_type( base = start, &
+                                          len = finish - start + 1 )
+                finish = start-1
+                r_count = r_count + 1
+
+! Determine whether pairs of adjacent runs need to be merged to satisfy
+! the invariants, and, if so, merge them.
+                Merge_loop: do
+                    r = collapse( runs(0:r_count - 1) )
+                    if ( r < 0 .or. r_count <= 1 ) exit Merge_loop
+                    left = runs( r + 1 )
+                    right = runs( r )
+                    call merge( array( left % base: &
+                                     right % base + right % len - 1 ), &
+                                left % len, buf, &
+                                index( left % base: &
+                                     right % base + right % len - 1 ), ibuf )
+
+                    runs(r) = run_type( base = left % base, &
+                                        len = left % len + right % len )
+                    if ( r == r_count - 3 ) runs(r+1) = runs(r+2)
+                    r_count = r_count - 1
+
+                end do Merge_loop
+            end do
+            if ( r_count /= 1 ) &
+                error stop "MERGE_SORT completed without RUN COUNT == 1."
+
+        end subroutine merge_sort
+
+
+        pure subroutine merge( array, mid, buf, index, ibuf )
+! Merges the two non-decreasing runs `ARRAY(0:MID-1)` and `ARRAY(MID:)`
+! using `BUF` as temporary storage, and stores the merged runs into
+! `ARRAY(0:)`. `MID` must be > 0, and < `SIZE(ARRAY)-1`. Buffer `BUF`
+! must be long enough to hold the shorter of the two runs.
+            integer(${k1}$), intent(inout)   :: array(0:)
+            integer(int_size), intent(in)    :: mid
+            integer(${k1}$), intent(inout)   :: buf(0:)
+            integer(int_size), intent(inout) :: index(0:)
+            integer(int_size), intent(inout) :: ibuf(0:)
+
+            integer(int_size) :: array_len, i, j, k
+
+            array_len = size(array, kind=int_size)
+
+! Merge first copies the shorter run into `buf`. Then, depending on which
+! run was shorter, it traces the copied run and the longer run forwards
+! (or backwards), comparing their next unprocessed elements and then
+! copying the lesser (or greater) one into `array`.
+
+            if ( mid <= array_len - mid ) then ! The left run is shorter.
+                buf(0:mid-1) = array(0:mid-1)
+                ibuf(0:mid-1) = index(0:mid-1)
+                i = 0
+                j = mid
+                merge_lower: do k = 0, array_len-1
+                    if ( buf(i) <= array(j) ) then
+                        array(k) = buf(i)
+                        index(k) = ibuf(i)
+                        i = i + 1
+                        if ( i >= mid ) exit merge_lower
+                    else
+                        array(k) = array(j)
+                        index(k) = index(j)
+                        j = j + 1
+                        if ( j >= array_len ) then
+                            array(k+1:) = buf(i:mid-1)
+                            index(k+1:) = ibuf(i:mid-1)
+                            exit merge_lower
+                        end if
+                    end if
+                end do merge_lower
+            else ! The right run is shorter
+                buf(0:array_len-mid) = array(mid:array_len-1)
+                ibuf(0:array_len-mid) = index(mid:array_len-1)
+                i = mid - 1
+                j = array_len - mid -1
+                merge_upper: do k = array_len-1, 0, -1
+                    if ( buf(j) >= array(i) ) then
+                        array(k) = buf(j)
+                        index(k) = ibuf(j)
+                        j = j - 1
+                        if ( j < 0 ) exit merge_upper
+                    else
+                        array(k) = array(i)
+                        index(k) = index(i)
+                        i = i - 1
+                        if ( i < 0 ) then
+                            array(0:k-1) = buf(0:j)
+                            index(0:k-1) = ibuf(0:j)
+                            exit merge_upper
+                        end if
+                    end if
+                end do merge_upper
+            end if
+        end subroutine merge
+
+
+        pure subroutine reverse_segment( array, index )
+! Reverse a segment of an array in place
+            integer(${k1}$), intent(inout) :: array(0:)
+            integer(int_size), intent(inout) :: index(0:)
+
+            integer(int_size) :: itemp, lo, hi
+            integer(${k1}$) :: temp
+
+            lo = 0
+            hi = size( array, kind=int_size ) - 1
+            do while( lo < hi )
+                temp = array(lo)
+                array(lo) = array(hi)
+                array(hi) = temp
+                itemp = index(lo)
+                index(lo) = index(hi)
+                index(hi) = itemp
+                lo = lo + 1
+                hi = hi - 1
+            end do
+
+        end subroutine reverse_segment
+
+    end subroutine ${k1}$_sort_index
+
+#:endfor
+
+
+#:for k1 in REAL_KINDS
+
+    module subroutine ${k1}$_sort_index( array, index, work, iwork, reverse )
+! A modification of `${k1}$_ord_sort` to return an array of indices that
+! would provide a stable sort of the `ARRAY` input. The indices by default
+! correspond to a non-decreasing sort, but if the optional argument
+! `REVERSE` is present with a value of `.TRUE.` the indices correspond to
+! a non-increasing sort. The logic of the determination of indexing largely
+! follows the `"Rust" sort` found in `slice.rs`:
+! https://github.com/rust-lang/rust/blob/90eb44a5897c39e3dff9c7e48e3973671dcd9496/src/liballoc/slice.rs#L2159
+! The Rust version is a simplification of the Timsort algorithm described
+! in https://svn.python.org/projects/python/trunk/Objects/listsort.txt, as
+! it drops both the use of 'galloping' to identify bounds of regions to be
+! sorted and the estimation of the optimal `run size`. However it remains
+! a hybrid sorting algorithm combining an iterative Merge sort controlled
+! by a stack of `RUNS` identified by regions of uniformly decreasing or
+! non-decreasing sequences that may be expanded to a minimum run size, with
+! an insertion sort.
+!
+! Note the Fortran implementation simplifies the logic as it only has to
+! deal with Fortran arrays of intrinsic types and not the full generality
+! of Rust's arrays and lists for arbitrary types. It also adds the
+! estimation of the optimal `run size` as suggested in Tim Peters'
+! original listsort.txt, and the optional `work` and `iwork` arraya to be
+! used as scratch memory.
+
+        real(${k1}$), intent(inout)             :: array(0:)
+        integer(int_size), intent(inout)           :: index(0:)
+        real(${k1}$), intent(inout), optional   :: work(0:)
+        integer(int_size), intent(inout), optional :: iwork(0:)
+        logical, intent(in), optional              :: reverse
+
+        integer(int_size) :: array_size, i, stat
+        real(${k1}$), allocatable :: buf(:)
+        integer(int_size), allocatable :: ibuf(:)
+
+        array_size = size(array, kind=int_size)
+
+        do i = 0, array_size-1
+            index(i) = i+1
+        end do
+
+        if ( present(reverse) ) then
+            if ( reverse ) then
+                call reverse_segment( array, index )
+            end if
+        end if
+
+! If necessary allocate buffers to serve as scratch memory.
+        if ( present(work) ) then
+            if ( present(iwork) ) then
+                call merge_sort( array, index, work, iwork )
+            else
+                allocate( ibuf(0:array_size/2-1), stat=stat )
+                if ( stat /= 0 ) error stop "Allocation of index buffer failed."
+                call merge_sort( array, index, work, ibuf )
+            end if
+        else
+            allocate( buf(0:array_size/2-1), stat=stat )
+            if ( stat /= 0 ) error stop "Allocation of array buffer failed."
+            if ( present(iwork) ) then
+                call merge_sort( array, index, buf, iwork )
+            else
+                allocate( ibuf(0:array_size/2-1), stat=stat )
+                if ( stat /= 0 ) error stop "Allocation of index buffer failed."
+                call merge_sort( array, index, buf, ibuf )
+            end if
+        end if
+
+        if ( present(reverse) ) then
+            if ( reverse ) then
+                call reverse_segment( array, index )
+            end if
+        end if
+
+    contains
+
+        pure function calc_min_run( n ) result(min_run)
+!! Returns the minimum length of a run from 32-63 so that N/MIN_RUN is
+!! less than or equal to a power of two. See
+!! https://svn.python.org/projects/python/trunk/Objects/listsort.txt
+            integer(int_size)             :: min_run
+            integer(int_size), intent(in) :: n
+
+            integer(int_size) :: num, r
+
+            num = n
+            r = 0_int_size
+
+            do while( num >= 64 )
+                r = ior( r, iand(num, 1_int_size) )
+                num = ishft(num, -1_int_size)
+            end do
+            min_run = num + r
+
+        end function calc_min_run
+
+
+        pure subroutine insertion_sort( array, index )
+! Sorts `ARRAY` using an insertion sort, while maintaining consistency in
+! location of the indices in `INDEX` to the elements of `ARRAY`.
+            real(${k1}$), intent(inout) :: array(0:)
+            integer(int_size), intent(inout) :: index(0:)
+
+            integer(int_size) :: i, j, key_index
+            real(${k1}$) :: key
+
+            do j=1, size(array, kind=int_size)-1
+                key = array(j)
+                key_index = index(j)
+                i = j - 1
+                do while( i >= 0 .and. array(i) > key )
+                    array(i+1) = array(i)
+                    index(i+1) = index(i)
+                    i = i - 1
+                end do
+                array(i+1) = key
+                index(i+1) = key_index
+            end do
+
+        end subroutine insertion_sort
+
+
+        pure function collapse( runs ) result ( r )
+! Examine the stack of runs waiting to be merged, identifying adjacent runs
+! to be merged until the stack invariants are restablished:
+!
+! 1. len(-3) > len(-2) + len(-1)
+! 2. len(-2) > len(-1)
+
+            integer(int_size) :: r
+            type(run_type), intent(in), target :: runs(0:)
+
+            integer(int_size) :: n
+            logical :: test
+
+            n = size(runs, kind=int_size)
+            test = .false.
+            if (n >= 2) then
+                if ( runs( n-1 ) % base == 0 .or. &
+                     runs( n-2 ) % len <= runs(n-1) % len ) then
+                    test = .true.
+                else if ( n >= 3 ) then ! X exists
+                    if ( runs(n-3) % len <= &
+                         runs(n-2) % len + runs(n-1) % len ) then
+                        test = .true.
+!               |X| <= |Y| + |Z| => will need to merge due to rho1 or rho2
+                    else if( n >= 4 ) then
+                        if ( runs(n-4) % len <= &
+                             runs(n-3) % len + runs(n-2) % len ) then
+                            test = .true.
+!               |W| <= |X| + |Y| => will need to merge due to rho1 or rho3
+                        end if
+                    end if
+                end if
+            end if
+            if ( test ) then
+! By default merge Y & Z, rho2 or rho3
+                if ( n >= 3 ) then
+                    if ( runs(n-3) % len < runs(n-1) % len ) then
+                        r = n - 3
+! |X| < |Z| => merge X & Y, rho1
+                        return
+                    end if
+                end if
+                r = n - 2
+! |Y| <= |Z| => merge Y & Z, rho4
+                return
+            else
+                r = -1
+            end if
+
+        end function collapse
+
+
+        pure subroutine insert_head( array, index )
+! Inserts `array(0)` into the pre-sorted sequence `array(1:)` so that the
+! whole `array(0:)` becomes sorted, copying the first element into
+! a temporary variable, iterating until the right place for it is found.
+! copying every traversed element into the slot preceding it, and finally,
+! copying data from the temporary variable into the resulting hole.
+! Consistency of the indices in `index` with the elements of `array`
+! are maintained.
+
+            real(${k1}$), intent(inout)   :: array(0:)
+            integer(int_size), intent(inout) :: index(0:)
+
+            real(${k1}$) :: tmp
+            integer(int_size) :: i, tmp_index
+
+            tmp = array(0)
+            tmp_index = index(0)
+            find_hole: do i=1, size(array, kind=int_size)-1
+                if ( array(i) >= tmp ) exit find_hole
+                array(i-1) = array(i)
+                index(i-1) = index(i)
+            end do find_hole
+            array(i-1) = tmp
+            index(i-1) = tmp_index
+
+        end subroutine insert_head
+
+
+        subroutine merge_sort( array, index, buf, ibuf )
+! The Rust merge sort borrows some (but not all) of the ideas from TimSort,
+! which is described in detail at
+! (http://svn.python.org/projects/python/trunk/Objects/listsort.txt).
+!
+! The algorithm identifies strictly descending and non-descending
+! subsequences, which are called natural runs. Where these runs are less
+! than a minimum run size they are padded by adding additional samples
+! using an insertion sort. The merge process is driven by a stack of
+! pending unmerged runs. Each newly found run is pushed onto the stack,
+! and then pairs of adjacentd runs are merged until these two invariants
+! are satisfied:
+!
+! 1. for every `i` in `1..size(runs)-1`: `runs(i - 1)%len > runs(i)%len`
+! 2. for every `i` in `2..size(runs)-1`: `runs(i - 2)%len >
+!    runs(i - 1)%len + runs(i)%len`
+!
+! The invariants ensure that the total running time is `O(n log n)`
+! worst-case. Consistency of the indices in `index` with the elements of
+! `array` are maintained.
+
+            real(${k1}$), intent(inout)   :: array(0:)
+            integer(int_size), intent(inout) :: index(0:)
+            real(${k1}$), intent(inout)   :: buf(0:)
+            integer(int_size), intent(inout) :: ibuf(0:)
+
+            integer(int_size) :: array_size, finish, min_run, r, r_count, &
+                start
+            type(run_type) :: runs(0:max_merge_stack-1), left, right
+
+            array_size = size(array, kind=int_size)
+
+! Very short runs are extended using insertion sort to span at least this
+! many elements. Slices of up to this length are sorted using insertion sort.
+            min_run = calc_min_run( array_size )
+
+            if ( array_size <= min_run ) then
+                if ( array_size >= 2 ) call insertion_sort( array, index )
+                return
+            end if
+
+
+! Following Rust sort, natural runs in `array` are identified by traversing
+! it backwards. By traversing it backward, merges more often go in the
+! opposite direction (forwards). According to developers of Rust sort,
+! merging forwards is slightly faster than merging backwards. Therefore
+! identifying runs by traversing backwards should improve performance.
+            r_count = 0
+            finish = array_size - 1
+            do while ( finish >= 0 )
+! Find the next natural run, and reverse it if it's strictly descending.
+                start = finish
+                if ( start > 0 ) then
+                    start = start - 1
+                    if ( array(start+1) < array(start) ) then
+                        Descending: do while ( start > 0 )
+                            if ( array(start) >= array(start-1) ) &
+                                exit Descending
+                            start = start - 1
+                        end do Descending
+                        call reverse_segment( array(start:finish), &
+                                            index(start:finish) )
+                    else
+                        Ascending: do while( start > 0 )
+                            if ( array(start) < array(start-1) ) exit Ascending
+                            start = start - 1
+                        end do Ascending
+                    end if
+                end if
+
+! If the run is too short insert some more elements using an insertion sort.
+                Insert: do while ( start > 0 )
+                    if ( finish - start >= min_run - 1 ) exit Insert
+                    start = start - 1
+                    call insert_head( array(start:finish), index(start:finish) )
+                end do Insert
+                if ( start == 0 .and. finish == array_size - 1 ) return
+
+                runs(r_count) = run_type( base = start, &
+                                          len = finish - start + 1 )
+                finish = start-1
+                r_count = r_count + 1
+
+! Determine whether pairs of adjacent runs need to be merged to satisfy
+! the invariants, and, if so, merge them.
+                Merge_loop: do
+                    r = collapse( runs(0:r_count - 1) )
+                    if ( r < 0 .or. r_count <= 1 ) exit Merge_loop
+                    left = runs( r + 1 )
+                    right = runs( r )
+                    call merge( array( left % base: &
+                                     right % base + right % len - 1 ), &
+                                left % len, buf, &
+                                index( left % base: &
+                                     right % base + right % len - 1 ), ibuf )
+
+                    runs(r) = run_type( base = left % base, &
+                                        len = left % len + right % len )
+                    if ( r == r_count - 3 ) runs(r+1) = runs(r+2)
+                    r_count = r_count - 1
+
+                end do Merge_loop
+            end do
+            if ( r_count /= 1 ) &
+                error stop "MERGE_SORT completed without RUN COUNT == 1."
+
+        end subroutine merge_sort
+
+
+        pure subroutine merge( array, mid, buf, index, ibuf )
+! Merges the two non-decreasing runs `ARRAY(0:MID-1)` and `ARRAY(MID:)`
+! using `BUF` as temporary storage, and stores the merged runs into
+! `ARRAY(0:)`. `MID` must be > 0, and < `SIZE(ARRAY)-1`. Buffer `BUF`
+! must be long enough to hold the shorter of the two runs.
+            real(${k1}$), intent(inout)   :: array(0:)
+            integer(int_size), intent(in)    :: mid
+            real(${k1}$), intent(inout)   :: buf(0:)
+            integer(int_size), intent(inout) :: index(0:)
+            integer(int_size), intent(inout) :: ibuf(0:)
+
+            integer(int_size) :: array_len, i, j, k
+
+            array_len = size(array, kind=int_size)
+
+! Merge first copies the shorter run into `buf`. Then, depending on which
+! run was shorter, it traces the copied run and the longer run forwards
+! (or backwards), comparing their next unprocessed elements and then
+! copying the lesser (or greater) one into `array`.
+
+            if ( mid <= array_len - mid ) then ! The left run is shorter.
+                buf(0:mid-1) = array(0:mid-1)
+                ibuf(0:mid-1) = index(0:mid-1)
+                i = 0
+                j = mid
+                merge_lower: do k = 0, array_len-1
+                    if ( buf(i) <= array(j) ) then
+                        array(k) = buf(i)
+                        index(k) = ibuf(i)
+                        i = i + 1
+                        if ( i >= mid ) exit merge_lower
+                    else
+                        array(k) = array(j)
+                        index(k) = index(j)
+                        j = j + 1
+                        if ( j >= array_len ) then
+                            array(k+1:) = buf(i:mid-1)
+                            index(k+1:) = ibuf(i:mid-1)
+                            exit merge_lower
+                        end if
+                    end if
+                end do merge_lower
+            else ! The right run is shorter
+                buf(0:array_len-mid) = array(mid:array_len-1)
+                ibuf(0:array_len-mid) = index(mid:array_len-1)
+                i = mid - 1
+                j = array_len - mid -1
+                merge_upper: do k = array_len-1, 0, -1
+                    if ( buf(j) >= array(i) ) then
+                        array(k) = buf(j)
+                        index(k) = ibuf(j)
+                        j = j - 1
+                        if ( j < 0 ) exit merge_upper
+                    else
+                        array(k) = array(i)
+                        index(k) = index(i)
+                        i = i - 1
+                        if ( i < 0 ) then
+                            array(0:k-1) = buf(0:j)
+                            index(0:k-1) = ibuf(0:j)
+                            exit merge_upper
+                        end if
+                    end if
+                end do merge_upper
+            end if
+        end subroutine merge
+
+
+        pure subroutine reverse_segment( array, index )
+! Reverse a segment of an array in place
+            real(${k1}$), intent(inout) :: array(0:)
+            integer(int_size), intent(inout) :: index(0:)
+
+            integer(int_size) :: itemp, lo, hi
+            real(${k1}$) :: temp
+
+            lo = 0
+            hi = size( array, kind=int_size ) - 1
+            do while( lo < hi )
+                temp = array(lo)
+                array(lo) = array(hi)
+                array(hi) = temp
+                itemp = index(lo)
+                index(lo) = index(hi)
+                index(hi) = itemp
+                lo = lo + 1
+                hi = hi - 1
+            end do
+
+        end subroutine reverse_segment
+
+    end subroutine ${k1}$_sort_index
+
+#:endfor
+
+    module subroutine char_sort_index( array, index, work, iwork, reverse )
+! A modification of `char_ord_sort` to return an array of indices that
+! would provide a stable sort of the `ARRAY` input. The indices by default
+! correspond to a non-decreasing sort, but if the optional argument
+! `REVERSE` is present with a value of `.TRUE.` the indices correspond to
+! a non-increasing sort. The logic of the determination of indexing largely
+! follows the `"Rust" sort` found in `slice.rs`:
+! https://github.com/rust-lang/rust/blob/90eb44a5897c39e3dff9c7e48e3973671dcd9496/src/liballoc/slice.rs#L2159
+! The Rust version is a simplification of the Timsort algorithm described
+! in https://svn.python.org/projects/python/trunk/Objects/listsort.txt, as
+! it drops both the use of 'galloping' to identify bounds of regions to be
+! sorted and the estimation of the optimal `run size`. However it remains
+! a hybrid sorting algorithm combining an iterative Merge sort controlled
+! by a stack of `RUNS` identified by regions of uniformly decreasing or
+! non-decreasing sequences that may be expanded to a minimum run size, with
+! an insertion sort.
+!
+! Note the Fortran implementation simplifies the logic as it only has to
+! deal with Fortran arrays of intrinsic types and not the full generality
+! of Rust's arrays and lists for arbitrary types. It also adds the
+! estimation of the optimal `run size` as suggested in Tim Peters'
+! original listsort.txt, and the optional `work` and `iwork` arraya to be
+! used as scratch memory.
+
+        character(len=*), intent(inout)                      :: array(0:)
+        integer(int_size), intent(inout)                     :: index(0:)
+        character(len=len(array)), intent(inout), optional   :: work(0:)
+        integer(int_size), intent(inout), optional           :: iwork(0:)
+        logical, intent(in), optional                        :: reverse
+
+        integer(int_size) :: array_size, i, stat
+        character(len=:), allocatable :: buf(:)
+        integer(int_size), allocatable :: ibuf(:)
+
+        array_size = size(array, kind=int_size)
+
+        do i = 0, array_size-1
+            index(i) = i+1
+        end do
+
+        if ( present(reverse) ) then
+            if ( reverse ) then
+                call reverse_segment( array, index )
+            end if
+        end if
+
+! If necessary allocate buffers to serve as scratch memory.
+        if ( present(work) ) then
+            if ( present(iwork) ) then
+                call merge_sort( array, index, work, iwork )
+            else
+                allocate( ibuf(0:array_size/2-1), stat=stat )
+                if ( stat /= 0 ) error stop "Allocation of index buffer failed."
+                call merge_sort( array, index, work, ibuf )
+            end if
+        else
+            allocate( character(len=len(array)) :: buf(0:array_size/2-1), &
+                      stat=stat )
+            if ( stat /= 0 ) error stop "Allocation of array buffer failed."
+            if ( present(iwork) ) then
+                call merge_sort( array, index, buf, iwork )
+            else
+                allocate( ibuf(0:array_size/2-1), stat=stat )
+                if ( stat /= 0 ) error stop "Allocation of index buffer failed."
+                call merge_sort( array, index, buf, ibuf )
+            end if
+        end if
+
+        if ( present(reverse) ) then
+            if ( reverse ) then
+                call reverse_segment( array, index )
+            end if
+        end if
+
+    contains
+
+        pure function calc_min_run( n ) result(min_run)
+!! Returns the minimum length of a run from 32-63 so that N/MIN_RUN is
+!! less than or equal to a power of two. See
+!! https://svn.python.org/projects/python/trunk/Objects/listsort.txt
+            integer(int_size)             :: min_run
+            integer(int_size), intent(in) :: n
+
+            integer(int_size) :: num, r
+
+            num = n
+            r = 0_int_size
+
+            do while( num >= 64 )
+                r = ior( r, iand(num, 1_int_size) )
+                num = ishft(num, -1_int_size)
+            end do
+            min_run = num + r
+
+        end function calc_min_run
+
+
+        pure subroutine insertion_sort( array, index )
+! Sorts `ARRAY` using an insertion sort, while maintaining consistency in
+! location of the indices in `INDEX` to the elements of `ARRAY`.
+            character(len=*), intent(inout)  :: array(0:)
+            integer(int_size), intent(inout) :: index(0:)
+
+            integer(int_size)         :: i, j, key_index
+            character(len=len(array)) :: key
+
+            do j=1, size(array, kind=int_size)-1
+                key = array(j)
+                key_index = index(j)
+                i = j - 1
+                do while( i >= 0 .and. array(i) > key )
+                    array(i+1) = array(i)
+                    index(i+1) = index(i)
+                    i = i - 1
+                end do
+                array(i+1) = key
+                index(i+1) = key_index
+            end do
+
+        end subroutine insertion_sort
+
+
+        pure function collapse( runs ) result ( r )
+! Examine the stack of runs waiting to be merged, identifying adjacent runs
+! to be merged until the stack invariants are restablished:
+!
+! 1. len(-3) > len(-2) + len(-1)
+! 2. len(-2) > len(-1)
+
+            integer(int_size) :: r
+            type(run_type), intent(in), target :: runs(0:)
+
+            integer(int_size) :: n
+            logical :: test
+
+            n = size(runs, kind=int_size)
+            test = .false.
+            if (n >= 2) then
+                if ( runs( n-1 ) % base == 0 .or. &
+                     runs( n-2 ) % len <= runs(n-1) % len ) then
+                    test = .true.
+                else if ( n >= 3 ) then ! X exists
+                    if ( runs(n-3) % len <= &
+                         runs(n-2) % len + runs(n-1) % len ) then
+                        test = .true.
+!               |X| <= |Y| + |Z| => will need to merge due to rho1 or rho2
+                    else if( n >= 4 ) then
+                        if ( runs(n-4) % len <= &
+                             runs(n-3) % len + runs(n-2) % len ) then
+                            test = .true.
+!               |W| <= |X| + |Y| => will need to merge due to rho1 or rho3
+                        end if
+                    end if
+                end if
+            end if
+            if ( test ) then
+! By default merge Y & Z, rho2 or rho3
+                if ( n >= 3 ) then
+                    if ( runs(n-3) % len < runs(n-1) % len ) then
+                        r = n - 3
+! |X| < |Z| => merge X & Y, rho1
+                        return
+                    end if
+                end if
+                r = n - 2
+! |Y| <= |Z| => merge Y & Z, rho4
+                return
+            else
+                r = -1
+            end if
+
+        end function collapse
+
+
+        pure subroutine insert_head( array, index )
+! Inserts `array(0)` into the pre-sorted sequence `array(1:)` so that the
+! whole `array(0:)` becomes sorted, copying the first element into
+! a temporary variable, iterating until the right place for it is found.
+! copying every traversed element into the slot preceding it, and finally,
+! copying data from the temporary variable into the resulting hole.
+! Consistency of the indices in `index` with the elements of `array`
+! are maintained.
+
+            character(len=*), intent(inout)  :: array(0:)
+            integer(int_size), intent(inout) :: index(0:)
+
+            character(len=len(array)) :: tmp
+            integer(int_size) :: i, tmp_index
+
+            tmp = array(0)
+            tmp_index = index(0)
+            find_hole: do i=1, size(array, kind=int_size)-1
+                if ( array(i) >= tmp ) exit find_hole
+                array(i-1) = array(i)
+                index(i-1) = index(i)
+            end do find_hole
+            array(i-1) = tmp
+            index(i-1) = tmp_index
+
+        end subroutine insert_head
+
+
+        subroutine merge_sort( array, index, buf, ibuf )
+! The Rust merge sort borrows some (but not all) of the ideas from TimSort,
+! which is described in detail at
+! (http://svn.python.org/projects/python/trunk/Objects/listsort.txt).
+!
+! The algorithm identifies strictly descending and non-descending
+! subsequences, which are called natural runs. Where these runs are less
+! than a minimum run size they are padded by adding additional samples
+! using an insertion sort. The merge process is driven by a stack of
+! pending unmerged runs. Each newly found run is pushed onto the stack,
+! and then pairs of adjacentd runs are merged until these two invariants
+! are satisfied:
+!
+! 1. for every `i` in `1..size(runs)-1`: `runs(i - 1)%len > runs(i)%len`
+! 2. for every `i` in `2..size(runs)-1`: `runs(i - 2)%len >
+!    runs(i - 1)%len + runs(i)%len`
+!
+! The invariants ensure that the total running time is `O(n log n)`
+! worst-case. Consistency of the indices in `index` with the elements of
+! `array` are maintained.
+
+            character(len=*), intent(inout)          :: array(0:)
+            integer(int_size), intent(inout)         :: index(0:)
+            character(len=len(array)), intent(inout) :: buf(0:)
+            integer(int_size), intent(inout)         :: ibuf(0:)
+
+            integer(int_size) :: array_size, finish, min_run, r, r_count, &
+                start
+            type(run_type) :: runs(0:max_merge_stack-1), left, right
+
+            array_size = size(array, kind=int_size)
+
+! Very short runs are extended using insertion sort to span at least this
+! many elements. Slices of up to this length are sorted using insertion sort.
+            min_run = calc_min_run( array_size )
+
+            if ( array_size <= min_run ) then
+                if ( array_size >= 2 ) call insertion_sort( array, index )
+                return
+            end if
+
+
+! Following Rust sort, natural runs in `array` are identified by traversing
+! it backwards. By traversing it backward, merges more often go in the
+! opposite direction (forwards). According to developers of Rust sort,
+! merging forwards is slightly faster than merging backwards. Therefore
+! identifying runs by traversing backwards should improve performance.
+            r_count = 0
+            finish = array_size - 1
+            do while ( finish >= 0 )
+! Find the next natural run, and reverse it if it's strictly descending.
+                start = finish
+                if ( start > 0 ) then
+                    start = start - 1
+                    if ( array(start+1) < array(start) ) then
+                        Descending: do while ( start > 0 )
+                            if ( array(start) >= array(start-1) ) &
+                                exit Descending
+                            start = start - 1
+                        end do Descending
+                        call reverse_segment( array(start:finish), &
+                                            index(start:finish) )
+                    else
+                        Ascending: do while( start > 0 )
+                            if ( array(start) < array(start-1) ) exit Ascending
+                            start = start - 1
+                        end do Ascending
+                    end if
+                end if
+
+! If the run is too short insert some more elements using an insertion sort.
+                Insert: do while ( start > 0 )
+                    if ( finish - start >= min_run - 1 ) exit Insert
+                    start = start - 1
+                    call insert_head( array(start:finish), index(start:finish) )
+                end do Insert
+                if ( start == 0 .and. finish == array_size - 1 ) return
+
+                runs(r_count) = run_type( base = start, &
+                                          len = finish - start + 1 )
+                finish = start-1
+                r_count = r_count + 1
+
+! Determine whether pairs of adjacent runs need to be merged to satisfy
+! the invariants, and, if so, merge them.
+                Merge_loop: do
+                    r = collapse( runs(0:r_count - 1) )
+                    if ( r < 0 .or. r_count <= 1 ) exit Merge_loop
+                    left = runs( r + 1 )
+                    right = runs( r )
+                    call merge( array( left % base: &
+                                     right % base + right % len - 1 ), &
+                                left % len, buf, &
+                                index( left % base: &
+                                     right % base + right % len - 1 ), ibuf )
+
+                    runs(r) = run_type( base = left % base, &
+                                        len = left % len + right % len )
+                    if ( r == r_count - 3 ) runs(r+1) = runs(r+2)
+                    r_count = r_count - 1
+
+                end do Merge_loop
+            end do
+            if ( r_count /= 1 ) &
+                error stop "MERGE_SORT completed without RUN COUNT == 1."
+
+        end subroutine merge_sort
+
+
+        pure subroutine merge( array, mid, buf, index, ibuf )
+! Merges the two non-decreasing runs `ARRAY(0:MID-1)` and `ARRAY(MID:)`
+! using `BUF` as temporary storage, and stores the merged runs into
+! `ARRAY(0:)`. `MID` must be > 0, and < `SIZE(ARRAY)-1`. Buffer `BUF`
+! must be long enough to hold the shorter of the two runs.
+            character(len=*), intent(inout)          :: array(0:)
+            integer(int_size), intent(in)            :: mid
+            character(len=len(array)), intent(inout) :: buf(0:)
+            integer(int_size), intent(inout)         :: index(0:)
+            integer(int_size), intent(inout)         :: ibuf(0:)
+
+            integer(int_size) :: array_len, i, j, k
+
+            array_len = size(array, kind=int_size)
+
+! Merge first copies the shorter run into `buf`. Then, depending on which
+! run was shorter, it traces the copied run and the longer run forwards
+! (or backwards), comparing their next unprocessed elements and then
+! copying the lesser (or greater) one into `array`.
+
+            if ( mid <= array_len - mid ) then ! The left run is shorter.
+                buf(0:mid-1) = array(0:mid-1)
+                ibuf(0:mid-1) = index(0:mid-1)
+                i = 0
+                j = mid
+                merge_lower: do k = 0, array_len-1
+                    if ( buf(i) <= array(j) ) then
+                        array(k) = buf(i)
+                        index(k) = ibuf(i)
+                        i = i + 1
+                        if ( i >= mid ) exit merge_lower
+                    else
+                        array(k) = array(j)
+                        index(k) = index(j)
+                        j = j + 1
+                        if ( j >= array_len ) then
+                            array(k+1:) = buf(i:mid-1)
+                            index(k+1:) = ibuf(i:mid-1)
+                            exit merge_lower
+                        end if
+                    end if
+                end do merge_lower
+            else ! The right run is shorter
+                buf(0:array_len-mid) = array(mid:array_len-1)
+                ibuf(0:array_len-mid) = index(mid:array_len-1)
+                i = mid - 1
+                j = array_len - mid -1
+                merge_upper: do k = array_len-1, 0, -1
+                    if ( buf(j) >= array(i) ) then
+                        array(k) = buf(j)
+                        index(k) = ibuf(j)
+                        j = j - 1
+                        if ( j < 0 ) exit merge_upper
+                    else
+                        array(k) = array(i)
+                        index(k) = index(i)
+                        i = i - 1
+                        if ( i < 0 ) then
+                            array(0:k-1) = buf(0:j)
+                            index(0:k-1) = ibuf(0:j)
+                            exit merge_upper
+                        end if
+                    end if
+                end do merge_upper
+            end if
+        end subroutine merge
+
+
+        pure subroutine reverse_segment( array, index )
+! Reverse a segment of an array in place
+            character(len=*), intent(inout)  :: array(0:)
+            integer(int_size), intent(inout) :: index(0:)
+
+            integer(int_size)         :: itemp, lo, hi
+            character(len=len(array)) :: temp
+
+            lo = 0
+            hi = size( array, kind=int_size ) - 1
+            do while( lo < hi )
+                temp = array(lo)
+                array(lo) = array(hi)
+                array(hi) = temp
+                itemp = index(lo)
+                index(lo) = index(hi)
+                index(hi) = itemp
+                lo = lo + 1
+                hi = hi - 1
+            end do
+
+        end subroutine reverse_segment
+
+    end subroutine char_sort_index
+
+    module subroutine string_sort_index( array, index, work, iwork, reverse )
+! A modification of `string_ord_sort` to return an array of indices that
+! would provide a stable sort of the `ARRAY` input. The indices by default
+! correspond to a non-decreasing sort, but if the optional argument
+! `REVERSE` is present with a value of `.TRUE.` the indices correspond to
+! a non-increasing sort. The logic of the determination of indexing largely
+! follows the `"Rust" sort` found in `slice.rs`:
+! https://github.com/rust-lang/rust/blob/90eb44a5897c39e3dff9c7e48e3973671dcd9496/src/liballoc/slice.rs#L2159
+! The Rust version is a simplification of the Timsort algorithm described
+! in https://svn.python.org/projects/python/trunk/Objects/listsort.txt, as
+! it drops both the use of 'galloping' to identify bounds of regions to be
+! sorted and the estimation of the optimal `run size`. However it remains
+! a hybrid sorting algorithm combining an iterative Merge sort controlled
+! by a stack of `RUNS` identified by regions of uniformly decreasing or
+! non-decreasing sequences that may be expanded to a minimum run size, with
+! an insertion sort.
+!
+! Note the Fortran implementation simplifies the logic as it only has to
+! deal with Fortran arrays of intrinsic types and not the full generality
+! of Rust's arrays and lists for arbitrary types. It also adds the
+! estimation of the optimal `run size` as suggested in Tim Peters'
+! original listsort.txt, and the optional `work` and `iwork` arraya to be
+! used as scratch memory.
+
+        type(string_type), intent(inout)           :: array(0:)
+        integer(int_size), intent(inout)           :: index(0:)
+        type(string_type), intent(inout), optional :: work(0:)
+        integer(int_size), intent(inout), optional :: iwork(0:)
+        logical, intent(in), optional              :: reverse
+
+        integer(int_size)              :: array_size, i, stat
+        type(string_type), allocatable :: buf(:)
+        integer(int_size), allocatable :: ibuf(:)
+
+        array_size = size(array, kind=int_size)
+
+        do i = 0, array_size-1
+            index(i) = i+1
+        end do
+
+        if ( present(reverse) ) then
+            if ( reverse ) then
+                call reverse_segment( array, index )
+            end if
+        end if
+
+! If necessary allocate buffers to serve as scratch memory.
+        if ( present(work) ) then
+            if ( present(iwork) ) then
+                call merge_sort( array, index, work, iwork )
+            else
+                allocate( ibuf(0:array_size/2-1), stat=stat )
+                if ( stat /= 0 ) error stop "Allocation of index buffer failed."
+                call merge_sort( array, index, work, ibuf )
+            end if
+        else
+            allocate( buf(0:array_size/2-1), stat=stat )
+            if ( stat /= 0 ) error stop "Allocation of array buffer failed."
+            if ( present(iwork) ) then
+                call merge_sort( array, index, buf, iwork )
+            else
+                allocate( ibuf(0:array_size/2-1), stat=stat )
+                if ( stat /= 0 ) error stop "Allocation of index buffer failed."
+                call merge_sort( array, index, buf, ibuf )
+            end if
+        end if
+
+        if ( present(reverse) ) then
+            if ( reverse ) then
+                call reverse_segment( array, index )
+            end if
+        end if
+
+    contains
+
+        pure function calc_min_run( n ) result(min_run)
+!! Returns the minimum length of a run from 32-63 so that N/MIN_RUN is
+!! less than or equal to a power of two. See
+!! https://svn.python.org/projects/python/trunk/Objects/listsort.txt
+            integer(int_size)             :: min_run
+            integer(int_size), intent(in) :: n
+
+            integer(int_size) :: num, r
+
+            num = n
+            r = 0_int_size
+
+            do while( num >= 64 )
+                r = ior( r, iand(num, 1_int_size) )
+                num = ishft(num, -1_int_size)
+            end do
+            min_run = num + r
+
+        end function calc_min_run
+
+
+        pure subroutine insertion_sort( array, index )
+! Sorts `ARRAY` using an insertion sort, while maintaining consistency in
+! location of the indices in `INDEX` to the elements of `ARRAY`.
+            type(string_type), intent(inout) :: array(0:)
+            integer(int_size), intent(inout) :: index(0:)
+
+            integer(int_size) :: i, j, key_index
+            type(string_type) :: key
+
+            do j=1, size(array, kind=int_size)-1
+                key = array(j)
+                key_index = index(j)
+                i = j - 1
+                do while( i >= 0 .and. array(i) > key )
+                    array(i+1) = array(i)
+                    index(i+1) = index(i)
+                    i = i - 1
+                end do
+                array(i+1) = key
+                index(i+1) = key_index
+            end do
+
+        end subroutine insertion_sort
+
+
+        pure function collapse( runs ) result ( r )
+! Examine the stack of runs waiting to be merged, identifying adjacent runs
+! to be merged until the stack invariants are restablished:
+!
+! 1. len(-3) > len(-2) + len(-1)
+! 2. len(-2) > len(-1)
+
+            integer(int_size) :: r
+            type(run_type), intent(in), target :: runs(0:)
+
+            integer(int_size) :: n
+            logical :: test
+
+            n = size(runs, kind=int_size)
+            test = .false.
+            if (n >= 2) then
+                if ( runs( n-1 ) % base == 0 .or. &
+                     runs( n-2 ) % len <= runs(n-1) % len ) then
+                    test = .true.
+                else if ( n >= 3 ) then ! X exists
+                    if ( runs(n-3) % len <= &
+                         runs(n-2) % len + runs(n-1) % len ) then
+                        test = .true.
+!               |X| <= |Y| + |Z| => will need to merge due to rho1 or rho2
+                    else if( n >= 4 ) then
+                        if ( runs(n-4) % len <= &
+                             runs(n-3) % len + runs(n-2) % len ) then
+                            test = .true.
+!               |W| <= |X| + |Y| => will need to merge due to rho1 or rho3
+                        end if
+                    end if
+                end if
+            end if
+            if ( test ) then
+! By default merge Y & Z, rho2 or rho3
+                if ( n >= 3 ) then
+                    if ( runs(n-3) % len < runs(n-1) % len ) then
+                        r = n - 3
+! |X| < |Z| => merge X & Y, rho1
+                        return
+                    end if
+                end if
+                r = n - 2
+! |Y| <= |Z| => merge Y & Z, rho4
+                return
+            else
+                r = -1
+            end if
+
+        end function collapse
+
+
+        pure subroutine insert_head( array, index )
+! Inserts `array(0)` into the pre-sorted sequence `array(1:)` so that the
+! whole `array(0:)` becomes sorted, copying the first element into
+! a temporary variable, iterating until the right place for it is found.
+! copying every traversed element into the slot preceding it, and finally,
+! copying data from the temporary variable into the resulting hole.
+! Consistency of the indices in `index` with the elements of `array`
+! are maintained.
+
+            type(string_type), intent(inout) :: array(0:)
+            integer(int_size), intent(inout) :: index(0:)
+
+            type(string_type) :: tmp
+            integer(int_size) :: i, tmp_index
+
+            tmp = array(0)
+            tmp_index = index(0)
+            find_hole: do i=1, size(array, kind=int_size)-1
+                if ( array(i) >= tmp ) exit find_hole
+                array(i-1) = array(i)
+                index(i-1) = index(i)
+            end do find_hole
+            array(i-1) = tmp
+            index(i-1) = tmp_index
+
+        end subroutine insert_head
+
+
+        subroutine merge_sort( array, index, buf, ibuf )
+! The Rust merge sort borrows some (but not all) of the ideas from TimSort,
+! which is described in detail at
+! (http://svn.python.org/projects/python/trunk/Objects/listsort.txt).
+!
+! The algorithm identifies strictly descending and non-descending
+! subsequences, which are called natural runs. Where these runs are less
+! than a minimum run size they are padded by adding additional samples
+! using an insertion sort. The merge process is driven by a stack of
+! pending unmerged runs. Each newly found run is pushed onto the stack,
+! and then pairs of adjacentd runs are merged until these two invariants
+! are satisfied:
+!
+! 1. for every `i` in `1..size(runs)-1`: `runs(i - 1)%len > runs(i)%len`
+! 2. for every `i` in `2..size(runs)-1`: `runs(i - 2)%len >
+!    runs(i - 1)%len + runs(i)%len`
+!
+! The invariants ensure that the total running time is `O(n log n)`
+! worst-case. Consistency of the indices in `index` with the elements of
+! `array` are maintained.
+
+            type(string_type), intent(inout) :: array(0:)
+            integer(int_size), intent(inout) :: index(0:)
+            type(string_type), intent(inout) :: buf(0:)
+            integer(int_size), intent(inout) :: ibuf(0:)
+
+            integer(int_size) :: array_size, finish, min_run, r, r_count, &
+                start
+            type(run_type)    :: runs(0:max_merge_stack-1), left, right
+
+            array_size = size(array, kind=int_size)
+
+! Very short runs are extended using insertion sort to span at least this
+! many elements. Slices of up to this length are sorted using insertion sort.
+            min_run = calc_min_run( array_size )
+
+            if ( array_size <= min_run ) then
+                if ( array_size >= 2 ) call insertion_sort( array, index )
+                return
+            end if
+
+
+! Following Rust sort, natural runs in `array` are identified by traversing
+! it backwards. By traversing it backward, merges more often go in the
+! opposite direction (forwards). According to developers of Rust sort,
+! merging forwards is slightly faster than merging backwards. Therefore
+! identifying runs by traversing backwards should improve performance.
+            r_count = 0
+            finish = array_size - 1
+            do while ( finish >= 0 )
+! Find the next natural run, and reverse it if it's strictly descending.
+                start = finish
+                if ( start > 0 ) then
+                    start = start - 1
+                    if ( array(start+1) < array(start) ) then
+                        Descending: do while ( start > 0 )
+                            if ( array(start) >= array(start-1) ) &
+                                exit Descending
+                            start = start - 1
+                        end do Descending
+                        call reverse_segment( array(start:finish), &
+                                            index(start:finish) )
+                    else
+                        Ascending: do while( start > 0 )
+                            if ( array(start) < array(start-1) ) exit Ascending
+                            start = start - 1
+                        end do Ascending
+                    end if
+                end if
+
+! If the run is too short insert some more elements using an insertion sort.
+                Insert: do while ( start > 0 )
+                    if ( finish - start >= min_run - 1 ) exit Insert
+                    start = start - 1
+                    call insert_head( array(start:finish), index(start:finish) )
+                end do Insert
+                if ( start == 0 .and. finish == array_size - 1 ) return
+
+                runs(r_count) = run_type( base = start, &
+                                          len = finish - start + 1 )
+                finish = start-1
+                r_count = r_count + 1
+
+! Determine whether pairs of adjacent runs need to be merged to satisfy
+! the invariants, and, if so, merge them.
+                Merge_loop: do
+                    r = collapse( runs(0:r_count - 1) )
+                    if ( r < 0 .or. r_count <= 1 ) exit Merge_loop
+                    left = runs( r + 1 )
+                    right = runs( r )
+                    call merge( array( left % base: &
+                                     right % base + right % len - 1 ), &
+                                left % len, buf, &
+                                index( left % base: &
+                                     right % base + right % len - 1 ), ibuf )
+
+                    runs(r) = run_type( base = left % base, &
+                                        len = left % len + right % len )
+                    if ( r == r_count - 3 ) runs(r+1) = runs(r+2)
+                    r_count = r_count - 1
+
+                end do Merge_loop
+            end do
+            if ( r_count /= 1 ) &
+                error stop "MERGE_SORT completed without RUN COUNT == 1."
+
+        end subroutine merge_sort
+
+
+        pure subroutine merge( array, mid, buf, index, ibuf )
+! Merges the two non-decreasing runs `ARRAY(0:MID-1)` and `ARRAY(MID:)`
+! using `BUF` as temporary storage, and stores the merged runs into
+! `ARRAY(0:)`. `MID` must be > 0, and < `SIZE(ARRAY)-1`. Buffer `BUF`
+! must be long enough to hold the shorter of the two runs.
+            type(string_type), intent(inout) :: array(0:)
+            integer(int_size), intent(in)    :: mid
+            type(string_type), intent(inout) :: buf(0:)
+            integer(int_size), intent(inout) :: index(0:)
+            integer(int_size), intent(inout) :: ibuf(0:)
+
+            integer(int_size) :: array_len, i, j, k
+
+            array_len = size(array, kind=int_size)
+
+! Merge first copies the shorter run into `buf`. Then, depending on which
+! run was shorter, it traces the copied run and the longer run forwards
+! (or backwards), comparing their next unprocessed elements and then
+! copying the lesser (or greater) one into `array`.
+
+            if ( mid <= array_len - mid ) then ! The left run is shorter.
+                buf(0:mid-1) = array(0:mid-1)
+                ibuf(0:mid-1) = index(0:mid-1)
+                i = 0
+                j = mid
+                merge_lower: do k = 0, array_len-1
+                    if ( buf(i) <= array(j) ) then
+                        array(k) = buf(i)
+                        index(k) = ibuf(i)
+                        i = i + 1
+                        if ( i >= mid ) exit merge_lower
+                    else
+                        array(k) = array(j)
+                        index(k) = index(j)
+                        j = j + 1
+                        if ( j >= array_len ) then
+                            array(k+1:) = buf(i:mid-1)
+                            index(k+1:) = ibuf(i:mid-1)
+                            exit merge_lower
+                        end if
+                    end if
+                end do merge_lower
+            else ! The right run is shorter
+                buf(0:array_len-mid) = array(mid:array_len-1)
+                ibuf(0:array_len-mid) = index(mid:array_len-1)
+                i = mid - 1
+                j = array_len - mid -1
+                merge_upper: do k = array_len-1, 0, -1
+                    if ( buf(j) >= array(i) ) then
+                        array(k) = buf(j)
+                        index(k) = ibuf(j)
+                        j = j - 1
+                        if ( j < 0 ) exit merge_upper
+                    else
+                        array(k) = array(i)
+                        index(k) = index(i)
+                        i = i - 1
+                        if ( i < 0 ) then
+                            array(0:k-1) = buf(0:j)
+                            index(0:k-1) = ibuf(0:j)
+                            exit merge_upper
+                        end if
+                    end if
+                end do merge_upper
+            end if
+        end subroutine merge
+
+
+        pure subroutine reverse_segment( array, index )
+! Reverse a segment of an array in place
+            type(string_type), intent(inout)  :: array(0:)
+            integer(int_size), intent(inout) :: index(0:)
+
+            integer(int_size) :: itemp, lo, hi
+            type(string_type) :: temp
+
+            lo = 0
+            hi = size( array, kind=int_size ) - 1
+            do while( lo < hi )
+                temp = array(lo)
+                array(lo) = array(hi)
+                array(hi) = temp
+                itemp = index(lo)
+                index(lo) = index(hi)
+                index(hi) = itemp
+                lo = lo + 1
+                hi = hi - 1
+            end do
+
+        end subroutine reverse_segment
+
+    end subroutine string_sort_index
+
+end submodule stdlib_sorting_sort_index

From bc3d6524459d4e2655b1762c83038cf1c218e392 Mon Sep 17 00:00:00 2001
From: William Clodius <w.clodius@icloud.com>
Date: Fri, 23 Apr 2021 17:48:45 -0600
Subject: [PATCH 07/45] Commited files related to testing the sorting
 procedures.

Commited the stdlib_sorting test code tests/sorting/test_sorting.f90, and related
CMakeLists.txt, and Makefile.manual files needed to compile the test code:
tests/CMakeLists.txt, tests/Makefile.manual, tests/sorting/CMakeLists.txt, and
tests/sorting/Makefile.manual.

[ticket: X]
---
 src/tests/CMakeLists.txt           |   1 +
 src/tests/Makefile.manual          |   1 +
 src/tests/sorting/CMakeLists.txt   |   2 +
 src/tests/sorting/Makefile.manual  |   3 +
 src/tests/sorting/test_sorting.f90 | 625 +++++++++++++++++++++++++++++
 5 files changed, 632 insertions(+)
 create mode 100644 src/tests/sorting/CMakeLists.txt
 create mode 100644 src/tests/sorting/Makefile.manual
 create mode 100644 src/tests/sorting/test_sorting.f90

diff --git a/src/tests/CMakeLists.txt b/src/tests/CMakeLists.txt
index 288445de9..381226222 100644
--- a/src/tests/CMakeLists.txt
+++ b/src/tests/CMakeLists.txt
@@ -12,6 +12,7 @@ add_subdirectory(io)
 add_subdirectory(linalg)
 add_subdirectory(logger)
 add_subdirectory(optval)
+add_subdirectory(sorting)
 add_subdirectory(stats)
 add_subdirectory(string)
 add_subdirectory(system)
diff --git a/src/tests/Makefile.manual b/src/tests/Makefile.manual
index 773066b1c..57f663bef 100644
--- a/src/tests/Makefile.manual
+++ b/src/tests/Makefile.manual
@@ -6,6 +6,7 @@ all test clean:
 	$(MAKE) -f Makefile.manual --directory=io $@
 	$(MAKE) -f Makefile.manual --directory=logger $@
 	$(MAKE) -f Makefile.manual --directory=optval $@
+	$(MAKE) -f Makefile.manual --directory=sorting $@
 	$(MAKE) -f Makefile.manual --directory=quadrature $@
 	$(MAKE) -f Makefile.manual --directory=stats $@
 	$(MAKE) -f Makefile.manual --directory=string $@
diff --git a/src/tests/sorting/CMakeLists.txt b/src/tests/sorting/CMakeLists.txt
new file mode 100644
index 000000000..bb4b27896
--- /dev/null
+++ b/src/tests/sorting/CMakeLists.txt
@@ -0,0 +1,2 @@
+ADDTEST(sorting)
+
diff --git a/src/tests/sorting/Makefile.manual b/src/tests/sorting/Makefile.manual
new file mode 100644
index 000000000..39657c8f3
--- /dev/null
+++ b/src/tests/sorting/Makefile.manual
@@ -0,0 +1,3 @@
+PROGS_SRC = test_sorting.f90
+
+include ../Makefile.manual.test.mk
diff --git a/src/tests/sorting/test_sorting.f90 b/src/tests/sorting/test_sorting.f90
new file mode 100644
index 000000000..4025aef42
--- /dev/null
+++ b/src/tests/sorting/test_sorting.f90
@@ -0,0 +1,625 @@
+program test_sorting
+
+    use, intrinsic :: iso_fortran_env, only: &
+        compiler_version
+    use stdlib_kinds, only: int32, int64, dp, sp
+    use stdlib_sorting
+    use stdlib_string_type, only: string_type, assignment(=), operator(>), &
+        write(formatted)
+
+    implicit none
+
+    integer(int32), parameter :: test_size = 2_int32**20
+    integer(int32), parameter :: char_size = 26**4
+    integer(int32), parameter :: string_size = 26**3
+    integer(int32), parameter :: block_size = test_size/6
+    integer, parameter        :: repeat = 8
+
+    integer(int32) ::             &
+        blocks(0:test_size-1),    &
+        decrease(0:test_size-1),  &
+        identical(0:test_size-1), &
+        increase(0:test_size-1),  &
+        rand0(0:test_size-1),     &
+        rand1(0:test_size-1),     &
+        rand2(0:test_size-1),     &
+        rand3(0:test_size-1),     &
+        rand10(0:test_size-1)
+    character(len=4) ::               &
+        char_decrease(0:char_size-1), &
+        char_increase(0:char_size-1), &
+        char_rand(0:char_size-1)
+    type(string_type) ::                  &
+        string_decrease(0:string_size-1), &
+        string_increase(0:string_size-1), &
+        string_rand(0:string_size-1)
+
+    integer(int32)          :: dummy(0:test_size-1)
+    character(len=4)        :: char_dummy(0:char_size-1)
+    type(string_type)       :: string_dummy(0:string_size-1)
+    integer(int_size)       :: index(0:test_size-1)
+    integer(int32)          :: work(0:test_size/2-1)
+    character(len=4)        :: char_work(0:char_size/2-1)
+    type(string_type)       :: string_work(0:string_size/2-1)
+    integer(int_size)       :: iwork(0:test_size/2-1)
+    integer                 :: count, i, index1, index2, j, k, l, temp
+    real(sp)                :: arand, brand
+    character(*), parameter :: filename = 'test_sorting.txt'
+    integer                 :: lun
+    character(len=4)        :: char_temp
+    type(string_type)       :: string_temp
+
+! Create the test arrays
+    identical(:) = 10
+    do i=0, test_size-1
+        increase(i) = i
+        decrease(i) = test_size - 1 - i
+        call random_number( arand )
+        rand0(i) = int( floor( 4 * arand * test_size ), kind=int32 )
+        rand1(i) = int( floor( arand * test_size / 4 ), kind=int32 )
+    end do
+    blocks(:) = increase(:)
+    blocks(0:block_size-1) = increase(4*block_size:5*block_size-1)
+    blocks(block_size:2*block_size-1) = increase(0:block_size-1)
+    blocks(2*block_size:3*block_size-1) = increase(2*block_size:3*block_size-1)
+    blocks(3*block_size:4*block_size-1) = increase(block_size:2*block_size-1)
+    blocks(4*block_size:5*block_size-1) = increase(3*block_size:4*block_size-1)
+    rand2(:) = increase(:)
+    do i=0, test_size-1
+        call random_number( arand )
+        index1 = int( floor( arand * test_size ), kind=int32 )
+        temp = rand2(i)
+        rand2(i) = rand2(index1)
+        rand2(index1) = temp
+    end do
+    rand3(:) = increase(:)
+    do i=0, 2
+        call random_number( arand )
+        call random_number( brand )
+        index1 = int( floor( arand * test_size ), kind=int32 )
+        index2 = int( floor( brand * test_size ), kind=int32 )
+        temp = rand3(index1)
+        rand3(index1) = rand3(index2)
+        rand3(index2) = temp
+    end do
+    rand10(:) = increase(:)
+    do i=test_size-10, test_size-1
+        call random_number( arand )
+        rand10(i) = int( floor( arand * test_size ), kind=int32 )
+    end do
+
+    count = 0
+    do i=0, 25
+        do j=0, 25
+            do k=0, 25
+                do l=0, 25
+                    char_increase(count) = achar(97+i) // achar(97+j) // &
+                        achar(97+k) // achar(97+l)
+                    count = count + 1
+                end do
+            end do
+        end do
+    end do
+
+    do i=0, char_size-1
+        char_decrease(char_size-1-i) = char_increase(i)
+    end do
+
+    char_rand(:) = char_increase(:)
+    do i=0, char_size-1
+        call random_number( arand )
+        index1 = int( floor( arand * char_size ), kind=int32 )
+        char_temp = char_rand(i)
+        char_rand(i) = char_rand(index1)
+        char_rand(index1) = char_temp
+    end do
+
+    count = 0
+    do i=0, 25
+        do j=0, 25
+            do k=0, 25
+                string_increase(count) = achar(97+i) // achar(97+j) // &
+                    achar(97+k)
+                count = count + 1
+            end do
+        end do
+    end do
+
+    do i=0, string_size-1
+        string_decrease(string_size - 1 - i) = char_increase(i)
+    end do
+
+    string_rand(:) = string_increase(:)
+    do i=0, string_size-1
+        call random_number( arand )
+        index1 = int( floor( arand * string_size ), kind=int32 )
+        string_temp = string_rand(i)
+        string_rand(i) = string_rand(index1)
+        string_rand(index1) = string_temp
+    end do
+
+! Create and intialize file to report the results of the sortings
+    open( newunit=lun, file=filename, access='sequential', action='write', &
+        form='formatted', status='replace' )
+    write( lun, '(a)' ) trim(compiler_version())
+    write( lun, * )
+    write( lun, '("|    Type     | Elements |    Array Name   |    Method ' // &
+        '  |  Time (s) |")' )
+    write( lun, '("|-------------|----------|-----------------|-----------' // &
+        '--|-----------|")' )
+
+! test the sorting routines on the test arrays
+    call test_int_ord_sorts( )
+
+    call test_char_ord_sorts( )
+
+    call test_string_ord_sorts( )
+
+    call test_int_sorts( )
+
+    call test_char_sorts( )
+
+    call test_string_sorts( )
+
+    call test_int_sort_indexes( )
+
+    call test_char_sort_indexes( )
+
+    call test_string_sort_indexes( )
+
+contains
+
+    subroutine test_int_ord_sorts( )
+
+        call test_int_ord_sort( blocks, "Blocks" )
+        call test_int_ord_sort( decrease, "Decreasing" )
+        call test_int_ord_sort( identical, "Identical" )
+        call test_int_ord_sort( increase, "Increasing" )
+        call test_int_ord_sort( rand1, "Random dense" )
+        call test_int_ord_sort( rand2, "Random order" )
+        call test_int_ord_sort( rand0, "Random sparse" )
+        call test_int_ord_sort( rand3, "Random 3" )
+        call test_int_ord_sort( rand10, "Random 10" )
+
+    end subroutine test_int_ord_sorts
+
+
+    subroutine test_int_ord_sort( a, a_name )
+        integer(int32), intent(in) :: a(:)
+        character(*), intent(in)   :: a_name 
+
+        integer(int64) :: t0, t1, tdiff
+        real(dp)       :: rate
+        integer(int64) :: i
+        logical :: valid
+
+        tdiff = 0
+        do i = 1, repeat
+            dummy = a
+            call system_clock( t0, rate )
+            call ord_sort( dummy, work )
+            call system_clock( t1, rate )
+            tdiff = tdiff + t1 - t0
+        end do
+        tdiff = tdiff/repeat
+
+        call verify_sort( dummy, valid, i )
+        if ( .not. valid ) then
+            write( *, * ) "ORD_SORT did not sort " // a_name // "."
+            write(*,*) 'i = ', i
+            write(*,'(a12, 2i7)') 'dummy(i-1:i) = ', dummy(i-1:i)
+        end if
+        write( lun, '("|     Integer |", 1x, i7, 2x, "|", 1x, a15, " |", ' // &
+            'a12, " |",  F10.5, " |" )' ) &
+            test_size, a_name, "Ord_Sort", tdiff/rate
+
+    end subroutine test_int_ord_sort
+
+    subroutine test_char_ord_sorts( )
+
+        call test_char_ord_sort( char_decrease, "Char. Decrease" )
+        call test_char_ord_sort( char_increase, "Char. Increase" )
+        call test_char_ord_sort( char_rand, "Char. Random" )
+
+    end subroutine test_char_ord_sorts
+
+    subroutine test_char_ord_sort( a, a_name )
+        character(len=4), intent(in) :: a(0:)
+        character(*), intent(in) :: a_name
+
+        integer(int64) :: t0, t1, tdiff
+        real(dp)       :: rate
+        integer(int64) :: i
+        logical        :: valid
+
+        tdiff = 0
+        do i = 1, repeat
+            char_dummy = a
+            call system_clock( t0, rate )
+            call ord_sort( char_dummy, char_work )
+            call system_clock( t1, rate )
+            tdiff = tdiff + t1 - t0
+        end do
+        tdiff = tdiff/repeat
+
+        call verify_char_sort( char_dummy, valid, i )
+        if ( .not. valid ) then
+            write( *, * ) "ORD_SORT did not sort " // a_name // "."
+            write(*,*) 'i = ', i
+            write(*,'(a17, 2(1x,a4))') 'char_dummy(i-1:i) = ', char_dummy(i-1:i)
+        end if
+        write( lun, '("|   Character |", 1x, i7, 2x, "|", 1x, a15, " |", ' // &
+            'a12, " |",  F10.5, " |" )' ) &
+            char_size, a_name, "Ord_Sort", tdiff/rate
+
+    end subroutine test_char_ord_sort
+
+    subroutine test_string_ord_sorts( )
+
+        call test_string_ord_sort( string_decrease, "String Decrease" )
+        call test_string_ord_sort( string_increase, "String Increase" )
+        call test_string_ord_sort( string_rand, "String Random" )
+
+    end subroutine test_string_ord_sorts
+
+    subroutine test_string_ord_sort( a, a_name )
+        type(string_type), intent(in) :: a(0:)
+        character(*), intent(in)      :: a_name
+
+        integer(int64) :: t0, t1, tdiff
+        real(dp)       :: rate
+        integer(int64) :: i
+        logical        :: valid
+
+        tdiff = 0
+        do i = 1, repeat
+            string_dummy = a
+            call system_clock( t0, rate )
+            call ord_sort( string_dummy, string_work )
+            call system_clock( t1, rate )
+            tdiff = tdiff + t1 - t0
+        end do
+        tdiff = tdiff/repeat
+
+        call verify_string_sort( string_dummy, valid, i )
+        if ( .not. valid ) then
+            write( *, * ) "ORD_SORT did not sort " // a_name // "."
+            write(*,*) 'i = ', i
+            write(*,'(a17, 2(1x,a4))') 'string_dummy(i-1:i) = ', &
+                string_dummy(i-1:i)
+        end if
+        write( lun, '("| String_type |", 1x, i7, 2x, "|", 1x, a15, " |", ' // &
+            'a12, " |",  F10.5, " |" )' ) &
+            string_size, a_name, "Ord_Sort", tdiff/rate
+
+    end subroutine test_string_ord_sort
+
+
+    subroutine test_int_sorts( )
+
+        call test_int_sort( blocks, "Blocks" )
+        call test_int_sort( decrease, "Decreasing" )
+        call test_int_sort( identical, "Identical" )
+        call test_int_sort( increase, "Increasing" )
+        call test_int_sort( rand1, "Random dense" )
+        call test_int_sort( rand2, "Random order" )
+        call test_int_sort( rand0, "Random sparse" )
+        call test_int_sort( rand3, "Random 3" )
+        call test_int_sort( rand10, "Random 10" )
+
+    end subroutine test_int_sorts
+
+    subroutine test_int_sort( a, a_name )
+        integer(int32), intent(in) :: a(:)
+        character(*), intent(in)   :: a_name
+
+        integer(int64) :: t0, t1, tdiff
+        real(dp)       :: rate
+        integer(int64) :: i
+        logical        :: valid
+
+        tdiff = 0
+        do i = 1, repeat
+            dummy = a
+            call system_clock( t0, rate )
+            call sort( dummy )
+            call system_clock( t1, rate )
+            tdiff = tdiff + t1 - t0
+        end do
+        tdiff = tdiff/repeat
+
+        call verify_sort( dummy, valid, i )
+        if ( .not. valid ) then
+            write( *, * ) "SORT did not sort " // a_name // "."
+            write(*,*) 'i = ', i
+            write(*,'(a12, 2i7)') 'dummy(i-1:i) = ', dummy(i-1:i)
+        end if
+        write( lun, '("|     Integer |", 1x, i7, 2x, "|", 1x, a15, " |", ' // &
+            'a12, " |",  F10.5, " |" )' ) &
+            test_size, a_name, "Sort", tdiff/rate
+
+    end subroutine test_int_sort
+
+    subroutine test_char_sorts( )
+
+        call test_char_sort( char_decrease, "Char. Decrease" )
+        call test_char_sort( char_increase, "Char. Increase" )
+        call test_char_sort( char_rand, "Char. Random" )
+
+    end subroutine test_char_sorts
+
+    subroutine test_char_sort( a, a_name )
+        character(len=4), intent(in) :: a(0:)
+        character(*), intent(in) :: a_name
+
+        integer(int64) :: t0, t1, tdiff
+        real(dp)       :: rate
+        integer(int64) :: i
+        logical        :: valid
+
+        tdiff = 0
+        do i = 1, repeat
+            char_dummy = a
+            call system_clock( t0, rate )
+            call sort( char_dummy )
+            call system_clock( t1, rate )
+            tdiff = tdiff + t1 - t0
+        end do
+        tdiff = tdiff/repeat
+
+        call verify_char_sort( char_dummy, valid, i )
+        if ( .not. valid ) then
+            write( *, * ) "SORT did not sort " // a_name // "."
+            write(*,*) 'i = ', i
+            write(*,'(a17, 2(1x,a4))') 'char_dummy(i-1:i) = ', char_dummy(i-1:i)
+        end if
+        write( lun, '("|   Character |", 1x, i7, 2x, "|", 1x, a15, " |", ' // &
+            'a12, " |",  F10.5, " |" )' ) &
+            char_size, a_name, "Sort", tdiff/rate
+
+    end subroutine test_char_sort
+
+    subroutine test_string_sorts( )
+
+        call test_string_sort( string_decrease, "String Decrease" )
+        call test_string_sort( string_increase, "String Increase" )
+        call test_string_sort( string_rand, "String Random" )
+
+    end subroutine test_string_sorts
+
+    subroutine test_string_sort( a, a_name )
+        type(string_type), intent(in) :: a(0:)
+        character(*), intent(in) :: a_name
+
+        integer(int64) :: t0, t1, tdiff
+        real(dp)       :: rate
+        integer(int64) :: i
+        logical        :: valid
+
+        tdiff = 0
+        do i = 1, repeat
+            string_dummy = a
+            call system_clock( t0, rate )
+            call sort( string_dummy )
+            call system_clock( t1, rate )
+            tdiff = tdiff + t1 - t0
+        end do
+        tdiff = tdiff/repeat
+
+        call verify_string_sort( string_dummy, valid, i )
+        if ( .not. valid ) then
+            write( *, * ) "SORT did not sort " // a_name // "."
+            write(*,*) 'i = ', i
+            write(*,'(a17, 2(1x,a4))') 'string_dummy(i-1:i) = ', &
+                string_dummy(i-1:i)
+        end if
+        write( lun, '("| String_type |", 1x, i7, 2x, "|", 1x, a15, " |", ' // &
+            'a12, " |",  F10.5, " |" )' ) &
+            string_size, a_name, "Sort", tdiff/rate
+
+    end subroutine test_string_sort
+
+    subroutine test_int_sort_indexes( )
+
+        call test_int_sort_index( blocks, "Blocks" )
+        call test_int_sort_index( decrease, "Decreasing" )
+        call test_int_sort_index( identical, "Identical" )
+        call test_int_sort_index( increase, "Increasing" )
+        call test_int_sort_index( rand1, "Random dense" )
+        call test_int_sort_index( rand2, "Random order" )
+        call test_int_sort_index( rand0, "Random sparse" )
+        call test_int_sort_index( rand3, "Random 3" )
+        call test_int_sort_index( rand10, "Random 10" )
+
+    end subroutine test_int_sort_indexes
+
+    subroutine test_int_sort_index( a, a_name )
+        integer(int32), intent(inout) :: a(:)
+        character(*), intent(in)      :: a_name
+
+        integer(int64)                 :: t0, t1, tdiff
+        real(dp)                       :: rate
+        integer(int64)                 :: i
+        logical                        :: valid
+
+        tdiff = 0
+        do i = 1, repeat
+            dummy = a
+            call system_clock( t0, rate )
+            call sort_index( dummy, index, work, iwork )
+            call system_clock( t1, rate )
+            tdiff = tdiff + t1 - t0
+        end do
+        tdiff = tdiff/repeat
+
+        dummy = a(index)
+        call verify_sort( dummy, valid, i )
+        if ( .not. valid ) then
+            write( *, * ) "SORT_INDEX did not sort " // a_name // "."
+            write(*,*) 'i = ', i
+            write(*,'(a18, 2i7)') 'a(index(i-1:i)) = ', a(index(i-1:i))
+        end if
+        write( lun, '("|     Integer |", 1x, i7, 2x, "|", 1x, a15, " |", ' // &
+            'a12, " |",  F10.5, " |" )' ) &
+            test_size, a_name, "Sort_Index", tdiff/rate
+
+        dummy = a
+        call sort_index( dummy, index, work, iwork, reverse=.true. )
+        dummy = a(index)
+        call verify_reverse_sort( dummy, valid, i )
+        if ( .not. valid ) then
+            write( *, * ) "SORT_INDEX did not reverse sort " // &
+                a_name // "."
+            write(*,*) 'i = ', i
+            write(*,'(a18, 2i7)') 'a(index(i-1:i)) = ', a(index(i-1:i))
+        end if
+
+    end subroutine test_int_sort_index
+
+    subroutine test_char_sort_indexes( )
+
+        call test_char_sort_index( char_decrease, "Char. Decrease" )
+        call test_char_sort_index( char_increase, "Char. Increase" )
+        call test_char_sort_index( char_rand, "Char. Random" )
+
+    end subroutine test_char_sort_indexes
+
+    subroutine test_char_sort_index( a, a_name )
+        character(len=4), intent(in) :: a(0:)
+        character(*), intent(in) :: a_name
+
+        integer(int64) :: t0, t1, tdiff
+        real(dp)       :: rate
+        integer(int64) :: i
+        logical        :: valid
+
+        tdiff = 0
+        do i = 1, repeat
+            char_dummy = a
+            call system_clock( t0, rate )
+            call sort_index( char_dummy, index, char_work, iwork )
+            call system_clock( t1, rate )
+            tdiff = tdiff + t1 - t0
+        end do
+        tdiff = tdiff/repeat
+
+        call verify_char_sort( char_dummy, valid, i )
+        if ( .not. valid ) then
+            write( *, * ) "SORT_INDEX did not sort " // a_name // "."
+            write(*,*) 'i = ', i
+            write(*,'(a17, 2(1x,a4))') 'char_dummy(i-1:i) = ', char_dummy(i-1:i)
+        end if
+        write( lun, '("|   Character |", 1x, i7, 2x, "|", 1x, a15, " |", ' // &
+            'a12, " |",  F10.5, " |" )' ) &
+            char_size, a_name, "Sort_Index", tdiff/rate
+
+    end subroutine test_char_sort_index
+
+    subroutine test_string_sort_indexes( )
+
+        call test_string_sort_index( string_decrease, "String Decrease" )
+        call test_string_sort_index( string_increase, "String Increase" )
+        call test_string_sort_index( string_rand, "String Random" )
+
+    end subroutine test_string_sort_indexes
+
+    subroutine test_string_sort_index( a, a_name )
+        type(string_type), intent(in) :: a(0:)
+        character(*), intent(in) :: a_name
+
+        integer(int64) :: t0, t1, tdiff
+        real(dp)       :: rate
+        integer(int64) :: i
+        logical        :: valid
+
+        tdiff = 0
+        do i = 1, repeat
+            string_dummy = a
+            call system_clock( t0, rate )
+            call sort_index( string_dummy, index, string_work, iwork )
+            call system_clock( t1, rate )
+            tdiff = tdiff + t1 - t0
+        end do
+        tdiff = tdiff/repeat
+
+        call verify_string_sort( string_dummy, valid, i )
+        if ( .not. valid ) then
+            write( *, * ) "SORT_INDEX did not sort " // a_name // "."
+            write(*,*) 'i = ', i
+            write(*,'(a17, 2(1x,a4))') 'string_dummy(i-1:i) = ', &
+                string_dummy(i-1:i)
+        end if
+        write( lun, '("| String_type |", 1x, i7, 2x, "|", 1x, a15, " |", ' // &
+            'a12, " |",  F10.5, " |" )' ) &
+            string_size, a_name, "Sort_Index", tdiff/rate
+
+    end subroutine test_string_sort_index
+
+
+    subroutine verify_sort( a, valid, i )
+        integer(int32), intent(in) :: a(0:)
+        logical, intent(out) :: valid
+        integer(int64), intent(out) :: i
+
+        integer(int64) :: n
+
+        n = size( a, kind=int64 )
+        valid = .false.
+        do i=1, n-1
+            if ( a(i-1) > a(i) ) return
+        end do
+        valid = .true.
+
+    end subroutine verify_sort
+
+
+    subroutine verify_string_sort( a, valid, i )
+        type(string_type), intent(in) :: a(0:)
+        logical, intent(out) :: valid
+        integer(int64), intent(out) :: i
+
+        integer(int64) :: n
+
+        n = size( a, kind=int64 )
+        valid = .false.
+        do i=1, n-1
+            if ( a(i-1) > a(i) ) return
+        end do
+        valid = .true.
+
+    end subroutine verify_string_sort
+
+    subroutine verify_char_sort( a, valid, i )
+        character(len=4), intent(in) :: a(0:)
+        logical, intent(out) :: valid
+        integer(int64), intent(out) :: i
+
+        integer(int64) :: n
+
+        n = size( a, kind=int64 )
+        valid = .false.
+        do i=1, n-1
+            if ( a(i-1) > a(i) ) return
+        end do
+        valid = .true.
+
+    end subroutine verify_char_sort
+
+
+    subroutine verify_reverse_sort( a, valid, i )
+        integer(int32), intent(in) :: a(0:)
+        logical, intent(out) :: valid
+        integer(int64), intent(out) :: i
+
+        integer(int64) :: n
+
+        n = size( a, kind=int64 )
+        valid = .false.
+        do i=1, n-1
+            if ( a(i-1) < a(i) ) return
+        end do
+        valid = .true.
+
+    end subroutine verify_reverse_sort
+
+end program test_sorting

From cb2f0d85e72364e7bf3220b179362bed2a29f333 Mon Sep 17 00:00:00 2001
From: William Clodius <w.clodius@icloud.com>
Date: Fri, 23 Apr 2021 17:56:30 -0600
Subject: [PATCH 08/45] Commited the revised stdlib/doc/specs/stdlib_sorting.md

Commited the revised stdlib_sorting markdown document stdlib_sorting.md. It was
revised largely to reflect changes in test_sorting.f90 intended to shorten the
processor time devoted to string_type sorting.

[ticket: X]
---
 doc/specs/stdlib_sorting.md | 196 ++++++++++++++++++------------------
 1 file changed, 98 insertions(+), 98 deletions(-)

diff --git a/doc/specs/stdlib_sorting.md b/doc/specs/stdlib_sorting.md
index c573c5876..346a95448 100644
--- a/doc/specs/stdlib_sorting.md
+++ b/doc/specs/stdlib_sorting.md
@@ -523,13 +523,13 @@ size `26**4`:
   random order.
 
 On three different `string_type` arrays, each of length 4 elements and
-of size `26**4`:
+of size `26**3`:
 
-* String Decreasing - values decrease uniformly from `"zzzz"` to
-  `"aaaa"`.
-* String Increasing - values decrease uniformly from `"aaaa"` to
-  `"zzzz"`.
-* String Random - the set of strings from `"aaaa"` to `"zzzz"` in
+* String Decreasing - values decrease uniformly from `"zzz"` to
+  `"aaa"`.
+* String Increasing - values decrease uniformly from `"aaa"` to
+  `"zzz"`.
+* String Random - the set of strings from `"aaa"` to `"zzz"` in
   random order.
 
 These benchmarks have been performed on two different compilers, both
@@ -537,104 +537,104 @@ on a MacBook Pro, featuring a 2.3 GHz Quad-Core Intel Core i5, with 8
 GB 2133 MHz LPDDR3 memory. The first compiler was GNU Fortran
 (GCC) 10.2.0, with the following results:
 
-|    Type     | Elements |      Order      |    Method   | Time (s)  |
+|    Type     | Elements |    Array Name   |    Method   |  Time (s) |
 |-------------|----------|-----------------|-------------|-----------|
-|     Integer | 1048576  |          Blocks |    Ord_Sort |   0.00694 |
-|     Integer | 1048576  |      Decreasing |    Ord_Sort |   0.00363 |
-|     Integer | 1048576  |       Identical |    Ord_Sort |   0.00206 |
-|     Integer | 1048576  |      Increasing |    Ord_Sort |   0.00206 |
-|     Integer | 1048576  |    Random dense |    Ord_Sort |   0.17705 |
-|     Integer | 1048576  |    Random order |    Ord_Sort |   0.17749 |
-|     Integer | 1048576  |   Random sparse |    Ord_Sort |   0.17641 |
-|     Integer | 1048576  |        Random 3 |    Ord_Sort |   0.00943 |
-|     Integer | 1048576  |       Random 10 |    Ord_Sort |   0.00378 |
-|   Character |  456976  |  Char. Decrease |    Ord_Sort |   0.00766 |
-|   Character |  456976  |  Char. Increase |    Ord_Sort |   0.00416 |
-|   Character |  456976  |    Char. Random |    Ord_Sort |   0.23689 |
-| String_type |  456976  | String Decrease |    Ord_Sort |   0.15467 |
-| String_type |  456976  | String Increase |    Ord_Sort |   0.09072 |
-| String_type |  456976  |   String Random |    Ord_Sort |   4.04153 |
-|     Integer | 1048576  |          Blocks |        Sort |   0.11175 |
-|     Integer | 1048576  |      Decreasing |        Sort |   0.14222 |
-|     Integer | 1048576  |       Identical |        Sort |   0.16076 |
-|     Integer | 1048576  |      Increasing |        Sort |   0.05380 |
-|     Integer | 1048576  |    Random dense |        Sort |   0.15519 |
-|     Integer | 1048576  |    Random order |        Sort |   0.15907 |
-|     Integer | 1048576  |   Random sparse |        Sort |   0.15792 |
-|     Integer | 1048576  |        Random 3 |        Sort |   0.23041 |
-|     Integer | 1048576  |       Random 10 |        Sort |   0.24668 |
-|   Character |  456976  |  Char. Decrease |        Sort |   0.31256 |
-|   Character |  456976  |  Char. Increase |        Sort |   0.11330 |
-|   Character |  456976  |    Char. Random |        Sort |   0.20166 |
-| String_type |  456976  | String Decrease |        Sort |   6.41878 |
-| String_type |  456976  | String Increase |        Sort |   2.26954 |
-| String_type |  456976  |   String Random |        Sort |   3.58182 |
-|     Integer | 1048576  |          Blocks |  Sort_Index |   0.01175 |
-|     Integer | 1048576  |      Decreasing |  Sort_Index |   0.00732 |
-|     Integer | 1048576  |       Identical |  Sort_Index |   0.00453 |
-|     Integer | 1048576  |      Increasing |  Sort_Index |   0.00495 |
-|     Integer | 1048576  |    Random dense |  Sort_Index |   0.20723 |
-|     Integer | 1048576  |    Random order |  Sort_Index |   0.20818 |
-|     Integer | 1048576  |   Random sparse |  Sort_Index |   0.20441 |
-|     Integer | 1048576  |        Random 3 |  Sort_Index |   0.01468 |
-|     Integer | 1048576  |       Random 10 |  Sort_Index |   0.00669 |
-|   Character |  456976  |  Char. Decrease |  Sort_Index |   0.00915 |
-|   Character |  456976  |  Char. Increase |  Sort_Index |   0.00522 |
-|   Character |  456976  |    Char. Random |  Sort_Index |   0.26615 |
-| String_type |  456976  | String Decrease |  Sort_Index |   0.19026 |
-| String_type |  456976  | String Increase |  Sort_Index |   0.09407 |
-| String_type |  456976  |   String Random |  Sort_Index |   4.12616 |
+|     Integer | 1048576  |          Blocks |    Ord_Sort |   0.00738 |
+|     Integer | 1048576  |      Decreasing |    Ord_Sort |   0.00380 |
+|     Integer | 1048576  |       Identical |    Ord_Sort |   0.00220 |
+|     Integer | 1048576  |      Increasing |    Ord_Sort |   0.00209 |
+|     Integer | 1048576  |    Random dense |    Ord_Sort |   0.17972 |
+|     Integer | 1048576  |    Random order |    Ord_Sort |   0.17503 |
+|     Integer | 1048576  |   Random sparse |    Ord_Sort |   0.17340 |
+|     Integer | 1048576  |        Random 3 |    Ord_Sort |   0.00847 |
+|     Integer | 1048576  |       Random 10 |    Ord_Sort |   0.00484 |
+|   Character |  456976  |  Char. Decrease |    Ord_Sort |   0.00763 |
+|   Character |  456976  |  Char. Increase |    Ord_Sort |   0.00414 |
+|   Character |  456976  |    Char. Random |    Ord_Sort |   0.23746 |
+| String_type |   17576  | String Decrease |    Ord_Sort |   0.00543 |
+| String_type |   17576  | String Increase |    Ord_Sort |   0.00347 |
+| String_type |   17576  |   String Random |    Ord_Sort |   0.09461 |
+|     Integer | 1048576  |          Blocks |        Sort |   0.10556 |
+|     Integer | 1048576  |      Decreasing |        Sort |   0.13348 |
+|     Integer | 1048576  |       Identical |        Sort |   0.15719 |
+|     Integer | 1048576  |      Increasing |        Sort |   0.05316 |
+|     Integer | 1048576  |    Random dense |        Sort |   0.15047 |
+|     Integer | 1048576  |    Random order |        Sort |   0.15176 |
+|     Integer | 1048576  |   Random sparse |        Sort |   0.15767 |
+|     Integer | 1048576  |        Random 3 |        Sort |   0.19907 |
+|     Integer | 1048576  |       Random 10 |        Sort |   0.34244 |
+|   Character |  456976  |  Char. Decrease |        Sort |   0.30723 |
+|   Character |  456976  |  Char. Increase |        Sort |   0.10984 |
+|   Character |  456976  |    Char. Random |        Sort |   0.20642 |
+| String_type |   17576  | String Decrease |        Sort |   0.15101 |
+| String_type |   17576  | String Increase |        Sort |   0.05569 |
+| String_type |   17576  |   String Random |        Sort |   0.08499 |
+|     Integer | 1048576  |          Blocks |  Sort_Index |   0.01163 |
+|     Integer | 1048576  |      Decreasing |  Sort_Index |   0.00720 |
+|     Integer | 1048576  |       Identical |  Sort_Index |   0.00451 |
+|     Integer | 1048576  |      Increasing |  Sort_Index |   0.00452 |
+|     Integer | 1048576  |    Random dense |  Sort_Index |   0.20295 |
+|     Integer | 1048576  |    Random order |  Sort_Index |   0.20190 |
+|     Integer | 1048576  |   Random sparse |  Sort_Index |   0.20221 |
+|     Integer | 1048576  |        Random 3 |  Sort_Index |   0.01406 |
+|     Integer | 1048576  |       Random 10 |  Sort_Index |   0.00765 |
+|   Character |  456976  |  Char. Decrease |  Sort_Index |   0.00912 |
+|   Character |  456976  |  Char. Increase |  Sort_Index |   0.00515 |
+|   Character |  456976  |    Char. Random |  Sort_Index |   0.24693 |
+| String_type |   17576  | String Decrease |  Sort_Index |   0.00528 |
+| String_type |   17576  | String Increase |  Sort_Index |   0.00341 |
+| String_type |   17576  |   String Random |  Sort_Index |   0.09554 |
 
 The second compiler was Intel(R) Fortran Intel(R) 64 Compiler Classic
 for applications running on Intel(R) 64, Version 2021.2.0 Build
 20210228_000000, with the following results:
 
-|    Type     | Elements |      Order      |    Method   | Time (s)  |
+|    Type     | Elements |   Array Name    |    Method   |  Time (s) |
 |-------------|----------|-----------------|-------------|-----------|
-|     Integer | 1048576  |          Blocks |    Ord_Sort |   0.00287 |
-|     Integer | 1048576  |      Decreasing |    Ord_Sort |   0.00123 |
-|     Integer | 1048576  |       Identical |    Ord_Sort |   0.00094 |
-|     Integer | 1048576  |      Increasing |    Ord_Sort |   0.00095 |
-|     Integer | 1048576  |    Random dense |    Ord_Sort |   0.09865 |
-|     Integer | 1048576  |    Random order |    Ord_Sort |   0.10138 |
-|     Integer | 1048576  |   Random sparse |    Ord_Sort |   0.10223 |
-|     Integer | 1048576  |        Random 3 |    Ord_Sort |   0.00420 |
-|     Integer | 1048576  |       Random 10 |    Ord_Sort |   0.00174 |
-|   Character |  456976  |  Char. Decrease |    Ord_Sort |   0.00721 |
-|   Character |  456976  |  Char. Increase |    Ord_Sort |   0.00354 |
-|   Character |  456976  |    Char. Random |    Ord_Sort |   0.21287 |
-| String_type |  456976  | String Decrease |    Ord_Sort |   0.50840 |
-| String_type |  456976  | String Increase |    Ord_Sort |   0.14912 |
-| String_type |  456976  |   String Random |    Ord_Sort |  15.28657 |
-|     Integer | 1048576  |          Blocks |        Sort |   0.03632 |
-|     Integer | 1048576  |      Decreasing |        Sort |   0.04135 |
-|     Integer | 1048576  |       Identical |        Sort |   0.03672 |
-|     Integer | 1048576  |      Increasing |        Sort |   0.01346 |
-|     Integer | 1048576  |    Random dense |        Sort |   0.07099 |
-|     Integer | 1048576  |    Random order |        Sort |   0.07172 |
-|     Integer | 1048576  |   Random sparse |        Sort |   0.07196 |
-|     Integer | 1048576  |        Random 3 |        Sort |   0.07729 |
-|     Integer | 1048576  |       Random 10 |        Sort |   0.14964 |
-|   Character |  456976  |  Char. Decrease |        Sort |   0.31387 |
-|   Character |  456976  |  Char. Increase |        Sort |   0.12050 |
-|   Character |  456976  |    Char. Random |        Sort |   0.20228 |
-| String_type |  456976  | String Decrease |        Sort |  23.87099 |
-| String_type |  456976  | String Increase |        Sort |   8.71762 |
-| String_type |  456976  |   String Random |        Sort |  11.50930 |
-|     Integer | 1048576  |          Blocks |  Sort_Index |   0.00563 |
-|     Integer | 1048576  |      Decreasing |  Sort_Index |   0.00230 |
-|     Integer | 1048576  |       Identical |  Sort_Index |   0.00142 |
-|     Integer | 1048576  |      Increasing |  Sort_Index |   0.00142 |
-|     Integer | 1048576  |    Random dense |  Sort_Index |   0.11723 |
-|     Integer | 1048576  |    Random order |  Sort_Index |   0.12539 |
-|     Integer | 1048576  |   Random sparse |  Sort_Index |   0.12706 |
-|     Integer | 1048576  |        Random 3 |  Sort_Index |   0.00718 |
-|     Integer | 1048576  |       Random 10 |  Sort_Index |   0.00316 |
-|   Character |  456976  |  Char. Decrease |  Sort_Index |   0.00790 |
-|   Character |  456976  |  Char. Increase |  Sort_Index |   0.00405 |
-|   Character |  456976  |    Char. Random |  Sort_Index |   0.22963 |
-| String_type |  456976  | String Decrease |  Sort_Index |   0.51546 |
-| String_type |  456976  | String Increase |  Sort_Index |   0.14960 |
-| String_type |  456976  |   String Random |  Sort_Index |  15.59673 |
+|     Integer | 1048576  |          Blocks |    Ord_Sort |   0.00320 |
+|     Integer | 1048576  |      Decreasing |    Ord_Sort |   0.00142 |
+|     Integer | 1048576  |       Identical |    Ord_Sort |   0.00102 |
+|     Integer | 1048576  |      Increasing |    Ord_Sort |   0.00158 |
+|     Integer | 1048576  |    Random dense |    Ord_Sort |   0.09859 |
+|     Integer | 1048576  |    Random order |    Ord_Sort |   0.09704 |
+|     Integer | 1048576  |   Random sparse |    Ord_Sort |   0.09599 |
+|     Integer | 1048576  |        Random 3 |    Ord_Sort |   0.00396 |
+|     Integer | 1048576  |       Random 10 |    Ord_Sort |   0.00183 |
+|   Character |  456976  |  Char. Decrease |    Ord_Sort |   0.00763 |
+|   Character |  456976  |  Char. Increase |    Ord_Sort |   0.00341 |
+|   Character |  456976  |    Char. Random |    Ord_Sort |   0.21991 |
+| String_type |   17576  | String Decrease |    Ord_Sort |   0.01957 |
+| String_type |   17576  | String Increase |    Ord_Sort |   0.00573 |
+| String_type |   17576  |   String Random |    Ord_Sort |   0.37850 |
+|     Integer | 1048576  |          Blocks |        Sort |   0.03668 |
+|     Integer | 1048576  |      Decreasing |        Sort |   0.04073 |
+|     Integer | 1048576  |       Identical |        Sort |   0.03884 |
+|     Integer | 1048576  |      Increasing |        Sort |   0.01279 |
+|     Integer | 1048576  |    Random dense |        Sort |   0.06945 |
+|     Integer | 1048576  |    Random order |        Sort |   0.07151 |
+|     Integer | 1048576  |   Random sparse |        Sort |   0.07224 |
+|     Integer | 1048576  |        Random 3 |        Sort |   0.07954 |
+|     Integer | 1048576  |       Random 10 |        Sort |   0.14395 |
+|   Character |  456976  |  Char. Decrease |        Sort |   0.30367 |
+|   Character |  456976  |  Char. Increase |        Sort |   0.11316 |
+|   Character |  456976  |    Char. Random |        Sort |   0.20233 |
+| String_type |   17576  | String Decrease |        Sort |   0.64479 |
+| String_type |   17576  | String Increase |        Sort |   0.23737 |
+| String_type |   17576  |   String Random |        Sort |   0.31361 |
+|     Integer | 1048576  |          Blocks |  Sort_Index |   0.00643 |
+|     Integer | 1048576  |      Decreasing |  Sort_Index |   0.00219 |
+|     Integer | 1048576  |       Identical |  Sort_Index |   0.00126 |
+|     Integer | 1048576  |      Increasing |  Sort_Index |   0.00130 |
+|     Integer | 1048576  |    Random dense |  Sort_Index |   0.12911 |
+|     Integer | 1048576  |    Random order |  Sort_Index |   0.13024 |
+|     Integer | 1048576  |   Random sparse |  Sort_Index |   0.12956 |
+|     Integer | 1048576  |        Random 3 |  Sort_Index |   0.00781 |
+|     Integer | 1048576  |       Random 10 |  Sort_Index |   0.00281 |
+|   Character |  456976  |  Char. Decrease |  Sort_Index |   0.00779 |
+|   Character |  456976  |  Char. Increase |  Sort_Index |   0.00393 |
+|   Character |  456976  |    Char. Random |  Sort_Index |   0.22561 |
+| String_type |   17576  | String Decrease |  Sort_Index |   0.01878 |
+| String_type |   17576  | String Increase |  Sort_Index |   0.00543 |
+| String_type |   17576  |   String Random |  Sort_Index |   0.37748 |
 
 

From d84da28bbcf1d9d5671008154395b770d1367239 Mon Sep 17 00:00:00 2001
From: William Clodius <w.clodius@icloud.com>
Date: Fri, 23 Apr 2021 21:32:37 -0600
Subject: [PATCH 09/45] Fixed an off by one error in an array assignment

In both stdlib_sorting_ord_sort.fypp and stdlib_sorting_sort_index.fypp there
was an off by one array assignment, that was repeated several times throough
copy/paste/edit.
                buf(0:array_len-mid) = array(mid:array_len-1)
should have been
                buf(0:array_len-mid-1) = array(mid:array_len-1)
Fixed in both files.

[ticket: X]
---
 src/stdlib_sorting_ord_sort.fypp   |  8 ++++----
 src/stdlib_sorting_sort_index.fypp | 16 ++++++++--------
 2 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/src/stdlib_sorting_ord_sort.fypp b/src/stdlib_sorting_ord_sort.fypp
index 0e8c59a47..39f8866a8 100644
--- a/src/stdlib_sorting_ord_sort.fypp
+++ b/src/stdlib_sorting_ord_sort.fypp
@@ -355,7 +355,7 @@ contains
                     end if
                 end do merge_lower
             else ! The right run is shorter ! check that it is stable
-                buf(0:array_len-mid) = array(mid:array_len-1)
+                buf(0:array_len-mid-1) = array(mid:array_len-1)
                 i = mid - 1
                 j = array_len - mid -1
                 merge_upper: do k = array_len-1, 0, -1
@@ -696,7 +696,7 @@ contains
                     end if
                 end do merge_lower
             else ! The right run is shorter ! check that it is stable
-                buf(0:array_len-mid) = array(mid:array_len-1)
+                buf(0:array_len-mid-1) = array(mid:array_len-1)
                 i = mid - 1
                 j = array_len - mid -1
                 merge_upper: do k = array_len-1, 0, -1
@@ -1036,7 +1036,7 @@ contains
                     end if
                 end do merge_lower
             else ! The right run is shorter ! check that it is stable
-                buf(0:array_len-mid) = array(mid:array_len-1)
+                buf(0:array_len-mid-1) = array(mid:array_len-1)
                 i = mid - 1
                 j = array_len - mid -1
                 merge_upper: do k = array_len-1, 0, -1
@@ -1373,7 +1373,7 @@ contains
                     end if
                 end do merge_lower
             else ! The right run is shorter ! check that it is stable
-                buf(0:array_len-mid) = array(mid:array_len-1)
+                buf(0:array_len-mid-1) = array(mid:array_len-1)
                 i = mid - 1
                 j = array_len - mid -1
                 merge_upper: do k = array_len-1, 0, -1
diff --git a/src/stdlib_sorting_sort_index.fypp b/src/stdlib_sorting_sort_index.fypp
index 4146e77f1..158baa1f3 100644
--- a/src/stdlib_sorting_sort_index.fypp
+++ b/src/stdlib_sorting_sort_index.fypp
@@ -414,8 +414,8 @@ contains
                     end if
                 end do merge_lower
             else ! The right run is shorter
-                buf(0:array_len-mid) = array(mid:array_len-1)
-                ibuf(0:array_len-mid) = index(mid:array_len-1)
+                buf(0:array_len-mid-1) = array(mid:array_len-1)
+                ibuf(0:array_len-mid-1) = index(mid:array_len-1)
                 i = mid - 1
                 j = array_len - mid -1
                 merge_upper: do k = array_len-1, 0, -1
@@ -822,8 +822,8 @@ contains
                     end if
                 end do merge_lower
             else ! The right run is shorter
-                buf(0:array_len-mid) = array(mid:array_len-1)
-                ibuf(0:array_len-mid) = index(mid:array_len-1)
+                buf(0:array_len-mid-1) = array(mid:array_len-1)
+                ibuf(0:array_len-mid-1) = index(mid:array_len-1)
                 i = mid - 1
                 j = array_len - mid -1
                 merge_upper: do k = array_len-1, 0, -1
@@ -1228,8 +1228,8 @@ contains
                     end if
                 end do merge_lower
             else ! The right run is shorter
-                buf(0:array_len-mid) = array(mid:array_len-1)
-                ibuf(0:array_len-mid) = index(mid:array_len-1)
+                buf(0:array_len-mid-1) = array(mid:array_len-1)
+                ibuf(0:array_len-mid-1) = index(mid:array_len-1)
                 i = mid - 1
                 j = array_len - mid -1
                 merge_upper: do k = array_len-1, 0, -1
@@ -1631,8 +1631,8 @@ contains
                     end if
                 end do merge_lower
             else ! The right run is shorter
-                buf(0:array_len-mid) = array(mid:array_len-1)
-                ibuf(0:array_len-mid) = index(mid:array_len-1)
+                buf(0:array_len-mid-1) = array(mid:array_len-1)
+                ibuf(0:array_len-mid-1) = index(mid:array_len-1)
                 i = mid - 1
                 j = array_len - mid -1
                 merge_upper: do k = array_len-1, 0, -1

From b2a2149e470ddc6770e0e77ec4c22a46524ed462 Mon Sep 17 00:00:00 2001
From: William Clodius <w.clodius@icloud.com>
Date: Fri, 23 Apr 2021 21:54:33 -0600
Subject: [PATCH 10/45] Fixed improperly assumed short circuiting.

The code stdlib_sorting_sort.f90 was failing on some systems becasue it
improperlly assumed short circuiting
                do while( i >= 0 .and. array(i) > key )
should have been
                do while( i >= 0 )
		    if (array(i) <= key ) exit

[ticket: X]
---
 src/stdlib_sorting_sort.fypp | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/src/stdlib_sorting_sort.fypp b/src/stdlib_sorting_sort.fypp
index fab24aa1c..d0b803bf9 100644
--- a/src/stdlib_sorting_sort.fypp
+++ b/src/stdlib_sorting_sort.fypp
@@ -170,7 +170,8 @@ contains
             do j=1_int_size, size(array, kind=int_size)-1
                 key = array(j)
                 i = j - 1
-                do while( i >= 0 .and. array(i) > key )
+                do while( i >= 0 )
+                    if ( array(i) <= key ) exit
                     array(i+1) = array(i)
                     i = i - 1
                 end do
@@ -342,7 +343,8 @@ contains
             do j=1_int_size, size(array, kind=int_size)-1
                 key = array(j)
                 i = j - 1
-                do while( i >= 0 .and. array(i) > key )
+                do while( i >= 0 )
+                    if ( array(i) <= key ) exit
                     array(i+1) = array(i)
                     i = i - 1
                 end do
@@ -509,7 +511,8 @@ contains
             do j=1_int_size, size(array, kind=int_size)-1
                 key = array(j)
                 i = j - 1
-                do while( i >= 0 .and. array(i) > key )
+                do while( i >= 0 )
+                    if ( array(i) <= key ) exit
                     array(i+1) = array(i)
                     i = i - 1
                 end do
@@ -673,7 +676,8 @@ contains
             do j=1_int_size, size(array, kind=int_size)-1
                 key = array(j)
                 i = j - 1
-                do while( i >= 0 .and. array(i) > key )
+                do while( i >= 0 )
+                    if ( array(i) <= key ) exit
                     array(i+1) = array(i)
                     i = i - 1
                 end do

From 4280f3ee7d2963c2cd834a2ee6a475372f0942f9 Mon Sep 17 00:00:00 2001
From: William Clodius <w.clodius@icloud.com>
Date: Fri, 30 Apr 2021 12:55:16 -0600
Subject: [PATCH 11/45] Fixed documentation problems noted by gareth_nx

Changed the markdown document docs/specs/stdlib_sorting.md to correct problems
noted by gareth_nx. Changed comments in the FYPP file
src/stlib_sorting_sort_index.fypp to note that the input array is sorted on
output.

[ticket: X]
---
 doc/specs/stdlib_sorting.md        | 62 +++++++++++++++---------------
 src/stdlib_sorting_sort_index.fypp | 12 ++++--
 2 files changed, 40 insertions(+), 34 deletions(-)

diff --git a/doc/specs/stdlib_sorting.md b/doc/specs/stdlib_sorting.md
index 346a95448..9b190b05c 100644
--- a/doc/specs/stdlib_sorting.md
+++ b/doc/specs/stdlib_sorting.md
@@ -152,14 +152,14 @@ housekeeping it has slower runtime performance than `ORD_SORT`.
 provided as optional `work` and `iwork` arguments or allocated
 internally on the stack.
 
-#### The `SORT` subroutines
+#### The `SORT` subroutine
 
 `SORT` uses the `introsort` sorting algorithm of David Musser.
 `introsort` is a hybrid unstable comparison algorithm combining
 `quicksort`, `insertion sort`, and `heap sort`. While this algorithm's
 runtime performance is always O(N Ln(N)), it is relatively fast on
-randomly ordered data, but inconsistent in performance on partly
-sorted data.as the official source of the algorithm.
+randomly ordered data, but does not show the improvement in
+performance on partly sorted data found for `ORD_SORT`.
 
 As with `introsort`, `SORT` is an unstable hybrid algorithm.
 First it examines the array and estimates the depth of recursion a
@@ -182,14 +182,16 @@ calls `introsort` proper. `introsort` proper then:
     * Calls `introsort` proper on the rightmost partition, and then
       returns.
 
-The resulting algorithm is of order O(N Ln(N)) run time
-performance for all inputs. Because it relies on `quicksort`, the
-coefficient of the O(N Ln(N)) behavior is typically small compared to
-other sorting algorithms on random data. On partially sorted data it
-can show either slower `heap sort` performance, or enhanced
-performance by up to a factor of six. Still, even when it shows
-enhanced performance, its performance on partially sorted data is
-typically an order of magnitude slower than `ORD_SORT`.
+The resulting algorithm is of order O(N Ln(N)) run time performance
+for all inputs. Because it relies on `quicksort`, the coefficient of
+the O(N Ln(N)) behavior is typically small compared to other sorting
+algorithms on random data. On partially sorted data it can show either
+slower `heap sort` performance, or enhanced performance by up to a
+factor of six. Still, even when it shows enhanced performance, its
+performance on partially sorted data is typically an order of
+magnitude slower than `ORD_SORT`. Its memory requirements are also
+low, being of order O(Ln(N)), while the memory requirements of
+`ORD_SORT` and `SORT_INDEX` are of order O(N).
 
 ### Tentative specifications of the `stdlib_sorting` procedures
 
@@ -255,9 +257,7 @@ function `LGT`.
     call read_sorted_file( 'dummy_file1', array1 )
     call read_sorted_file( 'dummy_file2', array2 )
     ! Concatenate the arrays
-    allocate( array( size(array1) + size(array2) ) )
-    array( 1:size(array1) ) = array1(:)
-    array( size(array1)+1:size(array1)+size(array2) ) = array2(:)
+	array = [ array1, array2 ]
     ! Sort the resulting array
     call ord_sort( array, work )
     ! Process the sorted array
@@ -318,7 +318,7 @@ element of `array` is a `NaN`.  Sorting of `CHARACTER(*)` and
     ...
 ```
 
-#### `sort_index` - creates an array of sorting indices for an input array.
+#### `sort_index` - creates an array of sorting indices for an input array, while also sorting the array.
 
 ##### Status
 
@@ -326,8 +326,9 @@ Experimental
 
 ##### Description
 
-Returns an integer array whose elements would sort the input array in
-the specified direction retaining order stability.
+Returns the input `array` sorted in the direction requested while
+retaining order stability, and an integer array whose elements would
+sort the input `array` to produce the output `array`.
 
 ##### Syntax
 
@@ -381,9 +382,10 @@ replace "scratch" memory that would otherwise be allocated on the
 stack. If `array` is of any kind of `REAL` the order of the elements in
 `index` and `array` on return are undefined if any element of `array`
 is a `NaN`. Sorting of `CHARACTER(*)` and `STRING_TYPE` arrays are
-based on the operator `>`, and not on the function `LGT`. It should be
-emphasized that the order of `array` will typically be different on
-return.
+based on the operator `>`, and not on the function `LGT`.
+
+It should be emphasized that the order of `array` will typically be
+different on return
 
 
 ##### Examples
@@ -392,15 +394,15 @@ Sorting a related rank one array:
 
 ```Fortran
     subroutine sort_related_data( a, b, work, index, iwork )
-        ! Sort `b` in terms or its related array `a`
+        ! Sort `a`, and  also  sort `b` to be reorderd the same way as `a`
         integer, intent(inout)           :: a(:)
         integer(int32), intent(inout)    :: b(:) ! The same size as a
 		integer(int32), intent(inout)    :: work(:)
 		integer(int_size), intent(inout) :: index(:)
 		integer(int_size), intent(inout) :: iwork(:)
 		! Find the indices to sort a
-		call sort_index(a, index(1:size(a)),&
-		    work(1:size(a)/2), iwork(1:size(a)/2))
+        call sort_index(a, index(1:size(a)),&
+            work(1:size(a)/2), iwork(1:size(a)/2))
 		! Sort b based on the sorting of a
 		b(:) = b( index(1:size(a)) )
 	end subroutine sort_related_data
@@ -410,23 +412,23 @@ Sorting a rank 2 array based on the data in a column
 
 ```Fortran
 	subroutine sort_related_data( array, column, work, index, iwork )
-	    ! Sort `a_data` in terms or its component `a`
-	    integer, intent(inout)           :: a(:,:)
+	    ! Reorder rows of `array` such that `array(:, column)` is  sorted
+	    integer, intent(inout)           :: array(:,:)
 		integer(int32), intent(in)       :: column
 		integer(int32), intent(inout)    :: work(:)
 		integer(int_size), intent(inout) :: index(:)
 		integer(int_size), intent(inout) :: iwork(:)
 		integer, allocatable             :: dummy(:)
 		integer :: i
-		allocate(dummy(size(a, dim=1)))
-		! Extract a component of `a_data`
-		dummy(:) = a(:, column)
+		allocate(dummy(size(array, dim=1)))
+		! Extract a column of `array`
+		dummy(:) = array(:, column)
 		! Find the indices to sort the column
 		call sort_index(dummy, index(1:size(dummy)),&
 		    work(1:size(dummy)/2), iwork(1:size(dummy)/2))
 		! Sort a based on the sorting of its column
-		do i=1, size(a, dim=2)
-		    a(:, i) = a(index(1:size(a, dim=1)), i)
+		do i=1, size(array, dim=2)
+		    array(:, i) = array(index(1:size(array, dim=1)), i)
 		end do
 	end subroutine sort_related_data
 ```
diff --git a/src/stdlib_sorting_sort_index.fypp b/src/stdlib_sorting_sort_index.fypp
index 158baa1f3..6aa078b2b 100644
--- a/src/stdlib_sorting_sort_index.fypp
+++ b/src/stdlib_sorting_sort_index.fypp
@@ -63,7 +63,8 @@ contains
 
     module subroutine ${k1}$_sort_index( array, index, work, iwork, reverse )
 ! A modification of `${k1}$_ord_sort` to return an array of indices that
-! would provide a stable sort of the `ARRAY` input. The indices by default
+! would perform a stable sort of the `ARRAY` as input, and also sort `ARRAY`
+! as desired. The indices by default
 ! correspond to a non-decreasing sort, but if the optional argument
 ! `REVERSE` is present with a value of `.TRUE.` the indices correspond to
 ! a non-increasing sort. The logic of the determination of indexing largely
@@ -471,7 +472,8 @@ contains
 
     module subroutine ${k1}$_sort_index( array, index, work, iwork, reverse )
 ! A modification of `${k1}$_ord_sort` to return an array of indices that
-! would provide a stable sort of the `ARRAY` input. The indices by default
+! would perform a stable sort of the `ARRAY` as input, and also sort `ARRAY`
+! as desired. The indices by default
 ! correspond to a non-decreasing sort, but if the optional argument
 ! `REVERSE` is present with a value of `.TRUE.` the indices correspond to
 ! a non-increasing sort. The logic of the determination of indexing largely
@@ -876,7 +878,8 @@ contains
 
     module subroutine char_sort_index( array, index, work, iwork, reverse )
 ! A modification of `char_ord_sort` to return an array of indices that
-! would provide a stable sort of the `ARRAY` input. The indices by default
+! would perform a stable sort of the `ARRAY` as input, and also sort `ARRAY`
+! as desired. The indices by default
 ! correspond to a non-decreasing sort, but if the optional argument
 ! `REVERSE` is present with a value of `.TRUE.` the indices correspond to
 ! a non-increasing sort. The logic of the determination of indexing largely
@@ -1280,7 +1283,8 @@ contains
 
     module subroutine string_sort_index( array, index, work, iwork, reverse )
 ! A modification of `string_ord_sort` to return an array of indices that
-! would provide a stable sort of the `ARRAY` input. The indices by default
+! would perform a stable sort of the `ARRAY` as input, and also sort `ARRAY`
+! as desired. The indices by default
 ! correspond to a non-decreasing sort, but if the optional argument
 ! `REVERSE` is present with a value of `.TRUE.` the indices correspond to
 ! a non-increasing sort. The logic of the determination of indexing largely

From 62c678148c84104d40afa957d28c15df15cfbcac Mon Sep 17 00:00:00 2001
From: William Clodius <w.clodius@icloud.com>
Date: Fri, 30 Apr 2021 13:04:23 -0600
Subject: [PATCH 12/45] Commited final edit

Forgot to save an edit prior to the previous commit of
docs/specs/stdlib_sorting.md. Saved and commited that edit.

[ticket: X]
---
 doc/specs/stdlib_sorting.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/doc/specs/stdlib_sorting.md b/doc/specs/stdlib_sorting.md
index 9b190b05c..85823052f 100644
--- a/doc/specs/stdlib_sorting.md
+++ b/doc/specs/stdlib_sorting.md
@@ -433,7 +433,7 @@ Sorting a rank 2 array based on the data in a column
 	end subroutine sort_related_data
 ```
 
-Sorting an array of a derived type based on the dsta in one component
+Sorting an array of a derived type based on the data in one component
 
 ```fortran
 	subroutine sort_a_data( a_data, a, work, index, iwork )

From 0a73bbc994dfc99a70777fa41be007d14d5c8819 Mon Sep 17 00:00:00 2001
From: William Clodius <w.clodius@icloud.com>
Date: Fri, 30 Apr 2021 13:21:41 -0600
Subject: [PATCH 13/45] Fixed misspellings

Review-dog caught several misspellings in the comments of the source codes.

[ticket: X]
---
 src/stdlib_sorting.fypp            |  4 ++--
 src/stdlib_sorting_ord_sort.fypp   | 10 +++++-----
 src/stdlib_sorting_sort_index.fypp |  2 +-
 3 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/src/stdlib_sorting.fypp b/src/stdlib_sorting.fypp
index e7fe92589..00d153d43 100644
--- a/src/stdlib_sorting.fypp
+++ b/src/stdlib_sorting.fypp
@@ -50,7 +50,7 @@
 !!   option. This file may not be copied, modified, or distributed
 !!   except according to those terms.
 !!
-!! so the licence for the original`slice.rs` code is compatible with the use
+!! so the license for the original`slice.rs` code is compatible with the use
 !! of modified versions of the code in the Fortran Standard Library under
 !! the MIT license.
 !!
@@ -221,7 +221,7 @@ module stdlib_sorting
 !!
 !! The generic subroutine implementing the `SORT_INDEX` algorithm to
 !! return an index array whose elements would sort the input array in the
-!! desired direction. It is primarilly intended to be used to sort a
+!! desired direction. It is primarily intended to be used to sort a
 !! derived type array based on the values of a component of the array.
 !! Its use has the syntax:
 !!
diff --git a/src/stdlib_sorting_ord_sort.fypp b/src/stdlib_sorting_ord_sort.fypp
index 39f8866a8..eb0e58480 100644
--- a/src/stdlib_sorting_ord_sort.fypp
+++ b/src/stdlib_sorting_ord_sort.fypp
@@ -49,7 +49,7 @@
 !!   option. This file may not be copied, modified, or distributed
 !!   except according to those terms.
 !!
-!! so the licence for the original`slice.rs` code is compatible with the use
+!! so the license for the original`slice.rs` code is compatible with the use
 !! of modified versions of the code in the Fortran Standard Library under
 !! the MIT license.
 
@@ -73,7 +73,7 @@ contains
 ! a hybrid sorting algorithm combining an iterative Merge sort controlled
 ! by a stack of `RUNS` identified by regions of uniformly decreasing or
 ! non-decreasing sequences that may be expanded to a minimum run size and
-! initialy processed by an insertion sort.
+! initially processed by an insertion sort.
 !
 ! Note the Fortran implementation simplifies the logic as it only has to
 ! deal with Fortran arrays of intrinsic types and not the full generality
@@ -414,7 +414,7 @@ contains
 ! a hybrid sorting algorithm combining an iterative Merge sort controlled
 ! by a stack of `RUNS` identified by regions of uniformly decreasing or
 ! non-decreasing sequences that may be expanded to a minimum run size and
-! initialy processed by an insertion sort.
+! initially processed by an insertion sort.
 !
 ! Note the Fortran implementation simplifies the logic as it only has to
 ! deal with Fortran arrays of intrinsic types and not the full generality
@@ -753,7 +753,7 @@ contains
 ! a hybrid sorting algorithm combining an iterative Merge sort controlled
 ! by a stack of `RUNS` identified by regions of uniformly decreasing or
 ! non-decreasing sequences that may be expanded to a minimum run size and
-! initialy processed by an insertion sort.
+! initially processed by an insertion sort.
 !
 ! Note the Fortran implementation simplifies the logic as it only has to
 ! deal with Fortran arrays of intrinsic types and not the full generality
@@ -1091,7 +1091,7 @@ contains
 ! a hybrid sorting algorithm combining an iterative Merge sort controlled
 ! by a stack of `RUNS` identified by regions of uniformly decreasing or
 ! non-decreasing sequences that may be expanded to a minimum run size and
-! initialy processed by an insertion sort.
+! initially processed by an insertion sort.
 !
 ! Note the Fortran implementation simplifies the logic as it only has to
 ! deal with Fortran arrays of intrinsic types and not the full generality
diff --git a/src/stdlib_sorting_sort_index.fypp b/src/stdlib_sorting_sort_index.fypp
index 6aa078b2b..222c2cd97 100644
--- a/src/stdlib_sorting_sort_index.fypp
+++ b/src/stdlib_sorting_sort_index.fypp
@@ -49,7 +49,7 @@
 !!   option. This file may not be copied, modified, or distributed
 !!   except according to those terms.
 !!
-!! so the licence for the original`slice.rs` code is compatible with the use
+!! so the license for the original`slice.rs` code is compatible with the use
 !! of modified versions of the code in the Fortran Standard Library under
 !! the MIT license.
 

From a0a5d0c6e78502da1d500c9d1a2d35cd669ef8e5 Mon Sep 17 00:00:00 2001
From: William Clodius <w.clodius@icloud.com>
Date: Sun, 2 May 2021 07:56:12 -0600
Subject: [PATCH 14/45] Commit modified index.md

Modified doc/specs/index.md so it mentioned stdlib_sorting.md.

[ticket: X]
---
 doc/specs/index.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/doc/specs/index.md b/doc/specs/index.md
index 4238dc35e..0883ea07c 100644
--- a/doc/specs/index.md
+++ b/doc/specs/index.md
@@ -19,7 +19,7 @@ This is and index/directory of the specifications (specs) for each new module/fe
  - [logger](./stdlib_logger.html) - Runtime logging system
  - [optval](./stdlib_optval.html) - Fallback value for optional arguments
  - [quadrature](./stdlib_quadrature.html) - Numerical integration
- - [sorting](.stdlib_sorting.html) - Sorting of numerical arrays
+ - [sorting](.stdlib_sorting.html) - Sorting of rank one arrays
  - [stats](./stdlib_stats.html) - Descriptive Statistics
  - [stats_distribution_PRNG](./stdlib_stats_distribution_PRNG.html) - Probability Distributions random number generator
  - [string\_type](./stdlib_string_type.html) - Basic string support

From b45441b193a5491731593b8d623fa9db66464480 Mon Sep 17 00:00:00 2001
From: William Clodius <w.clodius@icloud.com>
Date: Sun, 2 May 2021 08:12:22 -0600
Subject: [PATCH 15/45] Added changes suggested by gareth-nx.

Added changes in the description of `SORT_INDEX`.

[ticket: X]
---
 doc/specs/stdlib_sorting.md | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/doc/specs/stdlib_sorting.md b/doc/specs/stdlib_sorting.md
index 85823052f..83429166d 100644
--- a/doc/specs/stdlib_sorting.md
+++ b/doc/specs/stdlib_sorting.md
@@ -44,9 +44,10 @@ data:
 * `ORD_SORT` is intended to sort simple arrays of intrinsic data
   that have significant sections that were partially ordered before
   the sort;
-* `SORT_INDEX` is intended to provide indices for sorting arrays of
-  derived type data, based on the ordering of an intrinsic component
-  of the derived type; and
+* `SORT_INDEX` is based on ORD_SORT, but in addition to sorting the
+  input array, it returns indices that map the original array to its
+  sorted version. This enables related arrays to be re-ordered in the
+  same way; and
 * `SORT` is intended to sort simple arrays of intrinsic data
   that are effectively unordered before the sort.
 
@@ -142,7 +143,8 @@ arrays, or arrays of derived types. For such related data, what is
 useful is an array of indices that maps a rank 1 array to its sorted
 form. For such a sort, a stable sort is useful, therefore the module
 provides a subroutine, `SORT_INDEX`, that generates such an array of
-indices based on the `ORD_SORT` algorithm.
+indices based on the `ORD_SORT` algorithm, in addition to sorting
+the input array.
 
 The logic of `SORT_INDEX` parallels that of `ORD_SORT`, with
 additional housekeeping to keep the array of indices consistent with

From 7087255513f4a9248e941d87b09aa45070756309 Mon Sep 17 00:00:00 2001
From: "William B. Clodius" <65470906+wclodius2@users.noreply.github.com>
Date: Sat, 8 May 2021 19:59:12 -0600
Subject: [PATCH 16/45] Update src/stdlib_sorting.fypp

Co-authored-by: Jeremie Vandenplas <jeremie.vandenplas@gmail.com>
---
 src/stdlib_sorting.fypp | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/src/stdlib_sorting.fypp b/src/stdlib_sorting.fypp
index 00d153d43..fe96afa43 100644
--- a/src/stdlib_sorting.fypp
+++ b/src/stdlib_sorting.fypp
@@ -1,8 +1,4 @@
-#! Integer kinds to be considered during templating
-#:set INT_KINDS = ["int8", "int16", "int32", "int64"]
-
-#! Real kinds to be considered during templating
-#:set REAL_KINDS = ["sp", "dp", "qp"]
+#:include "common.fypp"
 
 !! Licensing:
 !!

From 6f0f189344469f8f4d9685945e4ad14afed6bc1b Mon Sep 17 00:00:00 2001
From: "William B. Clodius" <65470906+wclodius2@users.noreply.github.com>
Date: Sat, 8 May 2021 20:10:30 -0600
Subject: [PATCH 17/45] Update src/stdlib_sorting.fypp

Co-authored-by: Jeremie Vandenplas <jeremie.vandenplas@gmail.com>
---
 src/stdlib_sorting.fypp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/stdlib_sorting.fypp b/src/stdlib_sorting.fypp
index fe96afa43..8a1020d81 100644
--- a/src/stdlib_sorting.fypp
+++ b/src/stdlib_sorting.fypp
@@ -84,7 +84,7 @@ module stdlib_sorting
 !!
 !! `SORT_INDEX` is a modification of `ORD_SORT` so that in addition to
 !! sorting the input array, it returns the indices that map to a
-!! stable sort of the original array. These indices are intended to be
+!! stable sort of the original array. These indices are
 !! intended to be used to sort data that is correlated with the input
 !! array, e.g., different arrays in a database, different columns of a
 !! rank 2 array, different elements of a derived type. It is less

From 7c8ed79ce3fe25445ba8f187c552b2bd35d86d69 Mon Sep 17 00:00:00 2001
From: "William B. Clodius" <65470906+wclodius2@users.noreply.github.com>
Date: Sat, 8 May 2021 20:12:01 -0600
Subject: [PATCH 18/45] Update src/stdlib_sorting.fypp

Co-authored-by: Jeremie Vandenplas <jeremie.vandenplas@gmail.com>
---
 src/stdlib_sorting.fypp | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/stdlib_sorting.fypp b/src/stdlib_sorting.fypp
index 8a1020d81..8908ccfb7 100644
--- a/src/stdlib_sorting.fypp
+++ b/src/stdlib_sorting.fypp
@@ -128,10 +128,10 @@ module stdlib_sorting
                                log(1.6180339887_dp) ) )
 
     type run_type
-! Version: experimental
-!
-! Used to pass state around in a stack among helper functions for the
-! `ORD_SORT` and `SORT_INDEX` algorithms
+!! Version: experimental
+!!
+!! Used to pass state around in a stack among helper functions for the
+!! `ORD_SORT` and `SORT_INDEX` algorithms
         integer(int_size) :: base = 0
         integer(int_size) :: len = 0
     end type run_type

From 8b9be8118ca63db329e1f0e67e964c1aafaafebd Mon Sep 17 00:00:00 2001
From: William Clodius <w.clodius@icloud.com>
Date: Sat, 8 May 2021 20:14:21 -0600
Subject: [PATCH 19/45] Commited suggestions by jvdp1

Made various corrections in the document stdlib_sorting.md and the code
stdlib_sorting.fypp.

[ticket: X]
---
 doc/specs/stdlib_sorting.md | 38 ++++++++++++++++++++++++++++++-------
 1 file changed, 31 insertions(+), 7 deletions(-)

diff --git a/doc/specs/stdlib_sorting.md b/doc/specs/stdlib_sorting.md
index 83429166d..22447a222 100644
--- a/doc/specs/stdlib_sorting.md
+++ b/doc/specs/stdlib_sorting.md
@@ -4,7 +4,7 @@ title: Sorting Procedures
 
 # The `stdlib_sorting` module
 
-(TOC)
+[TOC]
 
 ## Overview of sorting
 
@@ -24,7 +24,7 @@ module's `string_type` type.
 
 ## Overview of the module
 
-The module `stdlib_sorting` defines several public entities one
+The module `stdlib_sorting` defines several public entities, one
 default integer parameter, `int_size`, and three overloaded
 subroutines: `ORD_SORT`, `SORT`, and `SORT_INDEX`. The
 overloaded subroutines also each have seven specific names for
@@ -44,7 +44,7 @@ data:
 * `ORD_SORT` is intended to sort simple arrays of intrinsic data
   that have significant sections that were partially ordered before
   the sort;
-* `SORT_INDEX` is based on ORD_SORT, but in addition to sorting the
+* `SORT_INDEX` is based on `ORD_SORT`, but in addition to sorting the
   input array, it returns indices that map the original array to its
   sorted version. This enables related arrays to be re-ordered in the
   same way; and
@@ -111,13 +111,13 @@ performance on uniformly increasing or decreasing data.
 `ORD_SORT` begins by traversing the array starting in its tail
 attempting to identify `runs` in the array, where a run is either a
 uniformly decreasing sequence, `ARRAY(i-1) > ARRAY(i)`, or a
-non-decreasing, `ARRAY(i-1) <= ARRAY(i)`, sequence. Once delimitated
+non-decreasing, `ARRAY(i-1) <= ARRAY(i)`, sequence. First delimitated
 decreasing sequences are reversed in their order. Then, if the
 sequence has less than `MIN_RUN` elements, previous elements in the
 array are added to the run using `insertion sort` until the run
 contains `MIN_RUN` elements or the array is completely processed. As
 each run is identified the start and length of the run 
-is then pushed onto a stack and the stack is then processed using
+are then pushed onto a stack and the stack is then processed using
 `merge` until it obeys the stack invariants:
 
 1. len(i-2) > len(i-1) + len(i)
@@ -163,7 +163,6 @@ runtime performance is always O(N Ln(N)), it is relatively fast on
 randomly ordered data, but does not show the improvement in
 performance on partly sorted data found for `ORD_SORT`.
 
-As with `introsort`, `SORT` is an unstable hybrid algorithm.
 First it examines the array and estimates the depth of recursion a
 quick sort would require for ideal (random) data, `D =
 Ceiling(Ln(N)/Ln(2))`. It then defines a limit to the number of
@@ -195,7 +194,7 @@ magnitude slower than `ORD_SORT`. Its memory requirements are also
 low, being of order O(Ln(N)), while the memory requirements of
 `ORD_SORT` and `SORT_INDEX` are of order O(N).
 
-### Tentative specifications of the `stdlib_sorting` procedures
+### Specifications of the `stdlib_sorting` procedures
 
 #### `ord_sort` - sorts an input array
 
@@ -267,6 +266,19 @@ function `LGT`.
     ...
 ```
 
+```fortran
+    program demo_ord_sort
+      use stdlib_sorting, only: ord_sort
+      implicit none
+      integer, allocatable :: array1(:), work(:)
+    
+      array1 = [ 5, 4, 3, 1, 10, 4, 9]
+      allocate(work, mold = array1)
+      call ord_sort(array1, work)
+      print*, array1   !print [1, 3, 4, 4, 5, 9, 10]
+    end program demo_ord_sort
+```
+
 #### `sort` - sorts an input array
 
 ##### Status
@@ -320,6 +332,18 @@ element of `array` is a `NaN`.  Sorting of `CHARACTER(*)` and
     ...
 ```
 
+```fortran
+    program demo_sort
+      use stdlib_sorting, only: sort
+      implicit none
+      integer, allocatable :: array(:)
+    
+      array = [ 5, 4, 3, 1, 10, 4, 9]
+      call sort(array)
+      print*, array   !print [1, 3, 4, 4, 5, 9, 10]
+    end program demo_sort
+```
+
 #### `sort_index` - creates an array of sorting indices for an input array, while also sorting the array.
 
 ##### Status

From abf10c66d581d3e61a00118cdf3b54232bb96511 Mon Sep 17 00:00:00 2001
From: William Clodius <w.clodius@icloud.com>
Date: Wed, 19 May 2021 20:40:17 -0600
Subject: [PATCH 20/45] Implemented changes suggested by Jeremie

Made various changes:
* in stdlib_sorting.fypp changed
  "public :: int_size ... integer, parameter :: in_size..." to
  "integer, parameter, public :: int_size ..."

* Eliminated the specific procedures public in stdlib_sorting.fypp and their
  discussion in stdlib_sorting.md.

* replaced specific reference to INT_KINDS and REAL_KINDS in
  stdlib_sorting_sort.fypp, stdlib_sorting_ord_sort.fypp, and
  stdlib_sorting_sort_index.fypp with reference to common.fypp.

* in stdlib_sorting_ord_sort.fypp made ERROR STOP return the procedure name
  in which the allocation error occured.

[ticket: X]
---
 doc/specs/stdlib_sorting.md        | 39 -------------------------
 src/stdlib_sorting.fypp            | 46 +-----------------------------
 src/stdlib_sorting_ord_sort.fypp   | 14 ++++-----
 src/stdlib_sorting_sort.fypp       |  6 +---
 src/stdlib_sorting_sort_index.fypp |  6 +---
 5 files changed, 8 insertions(+), 103 deletions(-)

diff --git a/doc/specs/stdlib_sorting.md b/doc/specs/stdlib_sorting.md
index 22447a222..12ca35dc1 100644
--- a/doc/specs/stdlib_sorting.md
+++ b/doc/specs/stdlib_sorting.md
@@ -479,45 +479,6 @@ Sorting an array of a derived type based on the data in one component
 	end subroutine sort_a_data
 ```
 
-#### Specific procedures
-
-Usually the name of a generic procedure is the most convenient way of
-invoking it. However sometimes it is necessary to pass a procedure as
-an argument to another procedure. In that case it is usually necessary
-to know the name of the specific procedure desired. The following
-table lists the specific subroutines and the corresponding types of
-their `array` arguments.
-
-| Generic Subroutine | Specific Subroutine | Array type      |
-|--------------------|---------------------|-----------------|
-| `ORD_SORT`         | `INT8_ORD_SORT`     | `INTEGER(INT8)` |
-|                    | `INT16_ORD_SORT`    | `INTEGER(INT16)` |
-|                    | `INT32_ORD_SORT`    | `INTEGER(INT32)` |
-|                    | `INT64_ORD_SORT`    | `INTEGER(INT64)` |
-|                    | `SP_ORD_SORT`    | `REAL(SP)` |
-|                    | `DP_ORD_SORT`    | `REAL(DP)` |
-|                    | `QP_ORD_SORT`    | `REAL(QP)` |
-|                    | `CHAR_ORD_SORT`  | `CHARACTER(*)`  |
-|                    | `STRING_ORD_SORT`  | `STRING_TYPE`  |
-| `SORT`         | `INT8_SORT`     | `INTEGER(INT8)` |
-|                    | `INT16_SORT`    | `INTEGER(INT16)` |
-|                    | `INT32_SORT`    | `INTEGER(INT32)` |
-|                    | `INT64_SORT`    | `INTEGER(INT64)` |
-|                    | `SP_SORT`    | `REAL(SP)` |
-|                    | `DP_SORT`    | `REAL(DP)` |
-|                    | `QP_SORT`    | `REAL(QP)` |
-|                    | `CHAR_SORT`  | `CHARACTER(*)` |
-|                    | `STRING_SORT`  | `STRING_TYPE` |
-| `SORT_INDEX`      | `INT8_SORT_INDEX`     | `INTEGER(INT8)` |
-|                    | `INT16_SORT_INDEX`    | `INTEGER(INT16)` |
-|                    | `INT32_SORT_INDEX`    | `INTEGER(INT32)` |
-|                    | `INT64_SORT_INDEX`    | `INTEGER(INT64)` |
-|                    | `SP_SORT_INDEX`    | `REAL(SP)` |
-|                    | `DP_SORT_INDEX`    | `REAL(DP)` |
-|                    | `QP_SORT_INDEX`    | `REAL(QP)` |
-|                    | `CHAR_SORT_INDEX`    | `CHARACTER(*)` |
-|                    | `STRING_SORT_INDEX`    | `STRING_TYPE` |
-
 
 ### Performance benchmarks
 
diff --git a/src/stdlib_sorting.fypp b/src/stdlib_sorting.fypp
index 8908ccfb7..ba197a4bf 100644
--- a/src/stdlib_sorting.fypp
+++ b/src/stdlib_sorting.fypp
@@ -115,9 +115,7 @@ module stdlib_sorting
     implicit none
     private
 
-    public :: int_size
-
-    integer, parameter :: int_size = int64 !! Integer kind for indexing
+    integer, parameter, public :: int_size = int64 !! Integer kind for indexing
 
 ! Constants for use by tim_sort
     integer, parameter :: &
@@ -322,48 +320,6 @@ module stdlib_sorting
 !!    end subroutine sort_a_data
 !!```
 
-#:for k1 in INT_KINDS
-    public ${k1}$_ord_sort
-!! Version: experimental
-!!
-!! `${k1}$_ord_sort` is the specific `ORD_SORT` subroutine for an `array`
-!! argument of type `integer(${k1}$)`.
-
-    public ${k1}$_sort
-!! Version: experimental
-!!
-!! `${k1}$_sort` is the specific `SORT` subroutine for an `array`
-!! argument of type `integer(${k1}$)`.
-
-    public ${k1}$_sort_index
-!! Version: experimental
-!!
-!! `${k1}$_sort_index` is the specific `SORT_INDEX` subroutine for an `array`
-!! argument of type `integer(${k1}$)`.
-
-#:endfor
-
-#:for k1 in REAL_KINDS
-    public ${k1}$_ord_sort
-!! Version: experimental
-!!
-!! `${k1}$_ord_sort` is the specific `ORD_SORT` subroutine for an `array`
-!! argument of type `real(${k1}$)`.
-
-    public ${k1}$_sort
-!! Version: experimental
-!!
-!! `${k1}$_sort` is the specific `SORT` subroutine for an `array`
-!! argument of type `real(${k1}$)`.
-
-    public ${k1}$_sort_index
-!! Version: experimental
-!!
-!! `${k1}$_sort_index` is the specific `SORT_INDEX` subroutine for an `array`
-!! argument of type `real(${k1}$)`.
-
-#:endfor
-
     public char_ord_sort
 
     public char_sort
diff --git a/src/stdlib_sorting_ord_sort.fypp b/src/stdlib_sorting_ord_sort.fypp
index eb0e58480..8977267c3 100644
--- a/src/stdlib_sorting_ord_sort.fypp
+++ b/src/stdlib_sorting_ord_sort.fypp
@@ -1,8 +1,4 @@
-#! Integer kinds to be considered during templating
-#:set INT_KINDS = ["int8", "int16", "int32", "int64"]
-
-#! Real kinds to be considered during templating
-#:set REAL_KINDS = ["sp", "dp", "qp"]
+#:include "common.fypp"
 
 !! Licensing:
 !!
@@ -95,7 +91,7 @@ contains
 ! Allocate a buffer to use as scratch memory.
             array_size = size( array, kind=int_size )
             allocate( buf(0:array_size/2-1), stat=stat )
-            if ( stat /= 0 ) error stop "Allocation of buffer failed."
+            if ( stat /= 0 ) error stop "${k1}$_ord_sort: Allocation of buffer failed."
             call merge_sort( array, buf )
         end if
 
@@ -436,7 +432,7 @@ contains
 ! Allocate a buffer to use as scratch memory.
             array_size = size( array, kind=int_size )
             allocate( buf(0:array_size/2-1), stat=stat )
-            if ( stat /= 0 ) error stop "Allocation of buffer failed."
+            if ( stat /= 0 ) error stop "${k1}$_ord_sort: Allocation of buffer failed."
             call merge_sort( array, buf )
         end if
 
@@ -776,7 +772,7 @@ contains
             array_size = size( array, kind=int_size )
             allocate( character(len=len(array)) :: buf(0:array_size/2-1), &
                       stat=stat )
-            if ( stat /= 0 ) error stop "Allocation of buffer failed."
+            if ( stat /= 0 ) error stop "${k1}$_ord_sort: Allocation of buffer failed."
             call merge_sort( array, buf )
         end if
 
@@ -1113,7 +1109,7 @@ contains
 ! Allocate a buffer to use as scratch memory.
             array_size = size( array, kind=int_size )
             allocate( buf(0:array_size/2-1), stat=stat )
-            if ( stat /= 0 ) error stop "Allocation of buffer failed."
+            if ( stat /= 0 ) error stop "${k1}$_ord_sort: Allocation of buffer failed."
             call merge_sort( array, buf )
         end if
 
diff --git a/src/stdlib_sorting_sort.fypp b/src/stdlib_sorting_sort.fypp
index d0b803bf9..312a38cfa 100644
--- a/src/stdlib_sorting_sort.fypp
+++ b/src/stdlib_sorting_sort.fypp
@@ -1,8 +1,4 @@
-#! Integer kinds to be considered during templating
-#:set INT_KINDS = ["int8", "int16", "int32", "int64"]
-
-#! Real kinds to be considered during templating
-#:set REAL_KINDS = ["sp", "dp", "qp"]
+#:include "common.fypp"
 
 !! Licensing:
 !!
diff --git a/src/stdlib_sorting_sort_index.fypp b/src/stdlib_sorting_sort_index.fypp
index 222c2cd97..ea488768f 100644
--- a/src/stdlib_sorting_sort_index.fypp
+++ b/src/stdlib_sorting_sort_index.fypp
@@ -1,8 +1,4 @@
-#! Integer kinds to be considered during templating
-#:set INT_KINDS = ["int8", "int16", "int32", "int64"]
-
-#! Real kinds to be considered during templating
-#:set REAL_KINDS = ["sp", "dp", "qp"]
+#:include "common.fypp"
 
 !! Licensing:
 !!

From e8d80de39a24bf52b5b5ff84b2daafc1138e8518 Mon Sep 17 00:00:00 2001
From: Jeremie Vandenplas <jeremie.vandenplas@gmail.com>
Date: Thu, 20 May 2021 20:41:53 +0200
Subject: [PATCH 21/45] simplifies code by adding fypp variables

---
 src/stdlib_sorting_ord_sort.fypp | 370 ++-----------------------------
 1 file changed, 15 insertions(+), 355 deletions(-)

diff --git a/src/stdlib_sorting_ord_sort.fypp b/src/stdlib_sorting_ord_sort.fypp
index 8977267c3..437a4fb7c 100644
--- a/src/stdlib_sorting_ord_sort.fypp
+++ b/src/stdlib_sorting_ord_sort.fypp
@@ -1,4 +1,5 @@
 #:include "common.fypp"
+#:set IR_KINDS_TYPES = INT_KINDS_TYPES + REAL_KINDS_TYPES
 
 !! Licensing:
 !!
@@ -55,7 +56,7 @@ submodule(stdlib_sorting) stdlib_sorting_ord_sort
 
 contains
 
-#:for k1 in INT_KINDS
+#:for k1, t1 in IR_KINDS_TYPES
 
     module subroutine ${k1}$_ord_sort( array, work )
 ! A translation to Fortran 2008, of the `"Rust" sort` algorithm found in
@@ -77,10 +78,10 @@ contains
 ! estimation of the optimal `run size` as suggested in Tim Peters'
 ! original `listsort.txt`, and an optional `work` array to be used as
 ! scratch memory.
-        integer(${k1}$), intent(inout)           :: array(0:)
-        integer(${k1}$), intent(inout), optional :: work(0:)
+        ${t1}$, intent(inout)           :: array(0:)
+        ${t1}$, intent(inout), optional :: work(0:)
 
-        integer(${k1}$), allocatable :: buf(:)
+        ${t1}$, allocatable :: buf(:)
         integer(int_size) :: array_size
         integer :: stat
 
@@ -120,10 +121,10 @@ contains
 
         pure subroutine insertion_sort( array )
 ! Sorts `ARRAY` using an insertion sort.
-            integer(${k1}$), intent(inout) :: array(0:)
+            ${t1}$, intent(inout) :: array(0:)
 
             integer(int_size) :: i, j
-            integer(${k1}$) :: key
+            ${t1}$ :: key
 
             do j=1, size(array, kind=int_size)-1
                 key = array(j)
@@ -196,9 +197,9 @@ contains
 ! copying every traversed element into the slot preceding it, and finally,
 ! copying data from the temporary variable into the resulting hole.
 
-            integer(${k1}$), intent(inout) :: array(0:)
+            ${t1}$, intent(inout) :: array(0:)
 
-            integer(${k1}$) :: tmp
+            ${t1}$ :: tmp
             integer(int_size) :: i
 
             tmp = array(0)
@@ -231,8 +232,8 @@ contains
 ! The invariants ensure that the total running time is `O(n log n)`
 ! worst-case.
 
-            integer(${k1}$), intent(inout) :: array(0:)
-            integer(${k1}$), intent(inout) :: buf(0:)
+            ${t1}$, intent(inout) :: array(0:)
+            ${t1}$, intent(inout) :: buf(0:)
 
             integer(int_size) :: array_size, finish, min_run, r, r_count, &
                 start
@@ -319,9 +320,9 @@ contains
 ! using `BUF` as temporary storage, and stores the merged runs into
 ! `ARRAY(0:)`. `MID` must be > 0, and < `SIZE(ARRAY)-1`. Buffer `BUF`
 ! must be long enough to hold the shorter of the two runs.
-            integer(${k1}$), intent(inout) :: array(0:)
+            ${t1}$, intent(inout) :: array(0:)
             integer(int_size), intent(in)  :: mid
-            integer(${k1}$), intent(inout) :: buf(0:)
+            ${t1}$, intent(inout) :: buf(0:)
 
             integer(int_size) :: array_len, i, j, k
 
@@ -374,351 +375,10 @@ contains
 
         pure subroutine reverse_segment( array )
 ! Reverse a segment of an array in place
-            integer(${k1}$), intent(inout) :: array(0:)
+            ${t1}$, intent(inout) :: array(0:)
 
             integer(int_size) :: lo, hi
-            integer(${k1}$) :: temp
-
-            lo = 0
-            hi = size( array, kind=int_size ) - 1
-            do while( lo < hi )
-                temp = array(lo)
-                array(lo) = array(hi)
-                array(hi) = temp
-                lo = lo + 1
-                hi = hi - 1
-            end do
-
-        end subroutine reverse_segment
-
-    end subroutine ${k1}$_ord_sort
-
-#:endfor
-
-
-#:for k1 in REAL_KINDS
-
-    module subroutine ${k1}$_ord_sort( array, work )
-! A translation to Fortran 2008, of the `"Rust" sort` algorithm found in
-! `slice.rs`
-! https://github.com/rust-lang/rust/blob/90eb44a5897c39e3dff9c7e48e3973671dcd9496/src/liballoc/slice.rs#L2159
-! The Rust version in turn is a simplification of the Timsort algorithm
-! described in
-! https://svn.python.org/projects/python/trunk/Objects/listsort.txt, as
-! it drops both the use of 'galloping' to identify bounds of regions to be
-! sorted and the estimation of the optimal `run size`. However it remains
-! a hybrid sorting algorithm combining an iterative Merge sort controlled
-! by a stack of `RUNS` identified by regions of uniformly decreasing or
-! non-decreasing sequences that may be expanded to a minimum run size and
-! initially processed by an insertion sort.
-!
-! Note the Fortran implementation simplifies the logic as it only has to
-! deal with Fortran arrays of intrinsic types and not the full generality
-! of Rust's arrays and lists for arbitrary types. It also adds the
-! estimation of the optimal `run size` as suggested in Tim Peters'
-! original `listsort.txt`, and an optional `work` array to be used as
-! scratch memory.
-        real(${k1}$), intent(inout)           :: array(0:)
-        real(${k1}$), intent(inout), optional :: work(0:)
-
-        real(${k1}$), allocatable :: buf(:)
-        integer(int_size) :: array_size
-        integer :: stat
-
-        if ( present(work) ) then
-! Use the work array as scratch memory
-            call merge_sort( array, work )
-        else
-! Allocate a buffer to use as scratch memory.
-            array_size = size( array, kind=int_size )
-            allocate( buf(0:array_size/2-1), stat=stat )
-            if ( stat /= 0 ) error stop "${k1}$_ord_sort: Allocation of buffer failed."
-            call merge_sort( array, buf )
-        end if
-
-    contains
-
-        pure function calc_min_run( n ) result(min_run)
-!! Returns the minimum length of a run from 32-63 so that N/MIN_RUN is
-!! less than or equal to a power of two. See
-!! https://svn.python.org/projects/python/trunk/Objects/listsort.txt
-            integer(int_size)             :: min_run
-            integer(int_size), intent(in) :: n
-
-            integer(int_size) :: num, r
-
-            num = n
-            r = 0_int_size
-
-            do while( num >= 64 )
-                r = ior( r, iand(num, 1_int_size) )
-                num = ishft(num, -1_int_size)
-            end do
-            min_run = num + r
-
-        end function calc_min_run
-
-
-        pure subroutine insertion_sort( array )
-! Sorts `ARRAY` using an insertion sort.
-            real(${k1}$), intent(inout) :: array(0:)
-
-            integer(int_size) :: i, j
-            real(${k1}$) :: key
-
-            do j=1, size(array, kind=int_size)-1
-                key = array(j)
-                i = j - 1
-                do while( i >= 0 .and. array(i) > key )
-                    array(i+1) = array(i)
-                    i = i - 1
-                end do
-                array(i+1) = key
-            end do
-
-        end subroutine insertion_sort
-
-
-        pure function collapse( runs ) result ( r )
-! Examine the stack of runs waiting to be merged, identifying adjacent runs
-! to be merged until the stack invariants are restablished:
-!
-! 1. len(-3) > len(-2) + len(-1)
-! 2. len(-2) > len(-1)
-            integer(int_size) :: r
-            type(run_type), intent(in), target :: runs(0:)
-
-            integer(int_size) :: n
-            logical :: test
-
-            n = size(runs, kind=int_size)
-            test = .false.
-            if (n >= 2) then
-                if ( runs( n-1 ) % base == 0 .or. &
-                     runs( n-2 ) % len <= runs(n-1) % len ) then
-                    test = .true.
-                else if ( n >= 3 ) then ! X exists
-                    if ( runs(n-3) % len <= &
-                         runs(n-2) % len + runs(n-1) % len ) then
-                        test = .true.
-!               |X| <= |Y| + |Z| => will need to merge due to rho1 or rho2
-                    else if( n >= 4 ) then
-                        if ( runs(n-4) % len <= &
-                             runs(n-3) % len + runs(n-2) % len ) then
-                            test = .true.
-!               |W| <= |X| + |Y| => will need to merge due to rho1 or rho3
-                        end if
-                    end if
-                end if
-            end if
-            if ( test ) then
-! By default merge Y & Z, rho2 or rho3
-                if ( n >= 3 ) then
-                    if ( runs(n-3) % len < runs(n-1) % len ) then
-                        r = n - 3
-! |X| < |Z| => merge X & Y, rho1
-                        return
-                    end if
-                end if
-                r = n - 2
-! |Y| <= |Z| => merge Y & Z, rho4
-                return
-            else
-                r = -1
-            end if
-
-        end function collapse
-
-
-        pure subroutine insert_head( array )
-! Inserts `array(0)` into the pre-sorted sequence `array(1:)` so that the
-! whole `array(0:)` becomes sorted, copying the first element into
-! a temporary variable, iterating until the right place for it is found.
-! copying every traversed element into the slot preceding it, and finally,
-! copying data from the temporary variable into the resulting hole.
-
-            real(${k1}$), intent(inout) :: array(0:)
-
-            real(${k1}$) :: tmp
-            integer(int_size) :: i
-
-            tmp = array(0)
-            find_hole: do i=1, size(array, kind=int_size)-1
-                if ( array(i) >= tmp ) exit find_hole
-                array(i-1) = array(i)
-            end do find_hole
-            array(i-1) = tmp
-
-        end subroutine insert_head
-
-
-        subroutine merge_sort( array, buf )
-! The Rust merge sort borrows some (but not all) of the ideas from TimSort,
-! which is described in detail at
-! (http://svn.python.org/projects/python/trunk/Objects/listsort.txt).
-!
-! The algorithm identifies strictly descending and non-descending
-! subsequences, which are called natural runs. Where these runs are less
-! than a minimum run size they are padded by adding additional samples
-! using an insertion sort. The merge process is driven by a stack of
-! pending unmerged runs. Each newly found run is pushed onto the stack,
-! and then pairs of adjacentd runs are merged until these two invariants
-! are satisfied:
-!
-! 1. for every `i` in `1..size(runs)-1`: `runs(i - 1)%len > runs(i)%len`
-! 2. for every `i` in `2..size(runs)-1`: `runs(i - 2)%len >
-!    runs(i - 1)%len + runs(i)%len`
-!
-! The invariants ensure that the total running time is `O(n log n)`
-! worst-case.
-
-            real(${k1}$), intent(inout) :: array(0:)
-            real(${k1}$), intent(inout) :: buf(0:)
-
-            integer(int_size) :: array_size, finish, min_run, r, r_count, &
-                start
-            type(run_type) :: runs(0:max_merge_stack-1), left, right
-
-            array_size = size(array, kind=int_size)
-
-! Very short runs are extended using insertion sort to span at least
-! min_run elements. Slices of up to this length are sorted using insertion
-! sort.
-            min_run = calc_min_run( array_size )
-
-            if ( array_size <= min_run ) then
-                if ( array_size >= 2 ) call insertion_sort( array )
-                return
-            end if
-
-! Following Rust sort, natural runs in `array` are identified by traversing
-! it backwards. By traversing it backward, merges more often go in the
-! opposite direction (forwards). According to developers of Rust sort,
-! merging forwards is slightly faster than merging backwards. Therefore
-! identifying runs by traversing backwards should improve performance.
-            r_count = 0
-            finish = array_size - 1
-            do while ( finish >= 0 )
-! Find the next natural run, and reverse it if it's strictly descending.
-                start = finish
-                if ( start > 0 ) then
-                    start = start - 1
-                    if ( array(start+1) < array(start) ) then
-                        Descending: do while ( start > 0 )
-                            if ( array(start) >= array(start-1) ) &
-                                exit Descending
-                            start = start - 1
-                        end do Descending
-                        call reverse_segment( array(start:finish) )
-                    else
-                        Ascending: do while( start > 0 )
-                            if ( array(start) < array(start-1) ) exit Ascending
-                            start = start - 1
-                        end do Ascending
-                    end if
-                end if
-
-! If the run is too short insert some more elements using an insertion sort.
-                Insert: do while ( start > 0 )
-                    if ( finish - start >= min_run - 1 ) exit Insert
-                    start = start - 1
-                    call insert_head( array(start:finish) )
-                end do Insert
-                if ( start == 0 .and. finish == array_size - 1 ) return
-
-                runs(r_count) = run_type( base = start, &
-                                          len = finish - start + 1 )
-                finish = start-1
-                r_count = r_count + 1
-
-! Determine whether pairs of adjacent runs need to be merged to satisfy
-! the invariants, and, if so, merge them.
-                Merge_loop: do
-                    r = collapse( runs(0:r_count - 1) )
-                    if ( r < 0 .or. r_count <= 1 ) exit Merge_loop
-                    left = runs( r + 1 )
-                    right = runs( r )
-                    call merge( array( left % base: &
-                                       right % base + right % len - 1 ), &
-                                left % len, buf )
-
-                    runs(r) = run_type( base = left % base, &
-                                        len = left % len + right % len )
-                    if ( r == r_count - 3 ) runs(r+1) = runs(r+2)
-                    r_count = r_count - 1
-
-                end do Merge_loop
-            end do
-            if ( r_count /= 1 ) &
-                error stop "MERGE_SORT completed without RUN COUNT == 1."
-
-        end subroutine merge_sort
-
-
-        pure subroutine merge( array, mid, buf )
-! Merges the two non-decreasing runs `ARRAY(0:MID-1)` and `ARRAY(MID:)`
-! using `BUF` as temporary storage, and stores the merged runs into
-! `ARRAY(0:)`. `MID` must be > 0, and < `SIZE(ARRAY)-1`. Buffer `BUF`
-! must be long enough to hold the shorter of the two runs.
-            real(${k1}$), intent(inout) :: array(0:)
-            integer(int_size), intent(in)  :: mid
-            real(${k1}$), intent(inout) :: buf(0:)
-
-            integer(int_size) :: array_len, i, j, k
-
-            array_len = size(array, kind=int_size)
-
-! Merge first copies the shorter run into `buf`. Then, depending on which
-! run was shorter, it traces the copied run and the longer run forwards
-! (or backwards), comparing their next unprocessed elements and then
-! copying the lesser (or greater) one into `array`.
-
-            if ( mid <= array_len - mid ) then ! The left run is shorter.
-                buf(0:mid-1) = array(0:mid-1)
-                i = 0
-                j = mid
-                merge_lower: do k = 0, array_len-1
-                    if ( buf(i) <= array(j) ) then
-                        array(k) = buf(i)
-                        i = i + 1
-                        if ( i >= mid ) exit merge_lower
-                    else
-                        array(k) = array(j)
-                        j = j + 1
-                        if ( j >= array_len ) then
-                            array(k+1:) = buf(i:mid-1)
-                            exit merge_lower
-                        end if
-                    end if
-                end do merge_lower
-            else ! The right run is shorter ! check that it is stable
-                buf(0:array_len-mid-1) = array(mid:array_len-1)
-                i = mid - 1
-                j = array_len - mid -1
-                merge_upper: do k = array_len-1, 0, -1
-                    if ( buf(j) >= array(i) ) then
-                        array(k) = buf(j)
-                        j = j - 1
-                        if ( j < 0 ) exit merge_upper
-                    else
-                        array(k) = array(i)
-                        i = i - 1
-                        if ( i < 0 ) then
-                            array(0:k-1) = buf(0:j)
-                            exit merge_upper
-                        end if
-                    end if
-                end do merge_upper
-            end if
-        end subroutine merge
-
-
-        pure subroutine reverse_segment( array )
-! Reverse a segment of an array in place
-            real(${k1}$), intent(inout) :: array(0:)
-
-            integer(int_size) :: lo, hi
-            real(${k1}$) :: temp
+            ${t1}$ :: temp
 
             lo = 0
             hi = size( array, kind=int_size ) - 1

From 5a9ccff8e6645c3dd3ac7197a5ac15d42b0965ef Mon Sep 17 00:00:00 2001
From: Jeremie Vandenplas <jeremie.vandenplas@gmail.com>
Date: Thu, 20 May 2021 20:46:49 +0200
Subject: [PATCH 22/45] simplifies code by adding fypp variables

---
 src/stdlib_sorting.fypp | 20 +++++---------------
 1 file changed, 5 insertions(+), 15 deletions(-)

diff --git a/src/stdlib_sorting.fypp b/src/stdlib_sorting.fypp
index ba197a4bf..c9ad0ce11 100644
--- a/src/stdlib_sorting.fypp
+++ b/src/stdlib_sorting.fypp
@@ -1,4 +1,5 @@
 #:include "common.fypp"
+#:set IR_KINDS_TYPES = INT_KINDS_TYPES + REAL_KINDS_TYPES
 
 !! Licensing:
 !!
@@ -346,29 +347,18 @@ module stdlib_sorting
 !! sorted data, having O(N) performance on uniformly non-increasing or
 !! non-decreasing data.
 
-#:for k1 in INT_KINDS
+#:for k1, t1 in IR_KINDS_TYPES
         module subroutine ${k1}$_ord_sort( array, work )
 !! Version: experimental
 !!
-!! `${k1}$_ord_sort( array )` sorts the input `ARRAY` of type `INTEGER(${k1}$)`
+!! `${k1}$_ord_sort( array )` sorts the input `ARRAY` of type `${t1}$`
 !! using a hybrid sort based on the `'Rust" sort` algorithm found in `slice.rs`
-            integer(${k1}$), intent(inout)           :: array(0:)
-            integer(${k1}$), intent(inout), optional :: work(0:)
+            ${t1}$, intent(inout)           :: array(0:)
+            ${t1}$, intent(inout), optional :: work(0:)
         end subroutine ${k1}$_ord_sort
 
 #:endfor
 
-#:for k1 in REAL_KINDS
-        module subroutine ${k1}$_ord_sort( array, work )
- !! Version: experimental
-!!
-!! `${k1}$_ord_sort( array )` sorts the input `ARRAY` of type `REAL(${k1}$)`
-!! using a hybrid sort based on the `'Rust" sort` algorithm found in `slice.rs`
-            real(${k1}$), intent(inout)           :: array(0:)
-            real(${k1}$), intent(inout), optional :: work(0:)
-        end subroutine ${k1}$_ord_sort
-
-#:endfor
         module subroutine char_ord_sort( array, work )
 !! Version: experimental
 !!

From c3128ac76f19cfe2759a1126337229fce1a9b019 Mon Sep 17 00:00:00 2001
From: Jeremie Vandenplas <jeremie.vandenplas@gmail.com>
Date: Thu, 20 May 2021 21:08:02 +0200
Subject: [PATCH 23/45] sorting_sort: add fypp variable for simplifying code

---
 src/stdlib_sorting.fypp      |  20 +---
 src/stdlib_sorting_sort.fypp | 198 +++--------------------------------
 2 files changed, 17 insertions(+), 201 deletions(-)

diff --git a/src/stdlib_sorting.fypp b/src/stdlib_sorting.fypp
index c9ad0ce11..feb138be0 100644
--- a/src/stdlib_sorting.fypp
+++ b/src/stdlib_sorting.fypp
@@ -385,30 +385,16 @@ module stdlib_sorting
 !! The generic subroutine interface implementing the `SORT` algorithm, based
 !! on the `introsort` of David Musser.
 
-#:for k1 in INT_KINDS
-        pure module subroutine ${k1}$_sort( array )
-!! Version: experimental
-!!
-!! `${k1}$_sort( array )` sorts the input `ARRAY` of type `INTEGER(${k1}$)`
-!! using a hybrid sort based on the `introsort` of David Musser.
-!! The algorithm is of order O(N Ln(N)) for all inputs.
-!! Because it relies on `quicksort`, the coefficient of the O(N Ln(N))
-!! behavior is small for random data compared to other sorting algorithms.
-            integer(${k1}$), intent(inout) :: array(0:)
-        end subroutine ${k1}$_sort
-
-#:endfor
-
-#:for k1 in REAL_KINDS
+#:for k1, t1 in IR_KINDS_TYPES
         pure module subroutine ${k1}$_sort( array )
 !! Version: experimental
 !!
-!! `${k1}$_sort( array )` sorts the input `ARRAY` of type `REAL(${k1}$)`
+!! `${k1}$_sort( array )` sorts the input `ARRAY` of type `${t1}$`
 !! using a hybrid sort based on the `introsort` of David Musser.
 !! The algorithm is of order O(N Ln(N)) for all inputs.
 !! Because it relies on `quicksort`, the coefficient of the O(N Ln(N))
 !! behavior is small for random data compared to other sorting algorithms.
-            real(${k1}$), intent(inout) :: array(0:)
+            ${t1}$, intent(inout) :: array(0:)
         end subroutine ${k1}$_sort
 
 #:endfor
diff --git a/src/stdlib_sorting_sort.fypp b/src/stdlib_sorting_sort.fypp
index 312a38cfa..89200ddb1 100644
--- a/src/stdlib_sorting_sort.fypp
+++ b/src/stdlib_sorting_sort.fypp
@@ -1,4 +1,5 @@
 #:include "common.fypp"
+#:set IR_KINDS_TYPES = INT_KINDS_TYPES + REAL_KINDS_TYPES
 
 !! Licensing:
 !!
@@ -60,10 +61,10 @@ submodule(stdlib_sorting) stdlib_sorting_sort
 contains
 
 
-#:for k1 in INT_KINDS
+#:for k1, t1 in IR_KINDS_TYPES
 
     pure module subroutine ${k1}$_sort( array )
-! `${k1}$_sort( array )` sorts the input `ARRAY` of type `INTEGER(${k1}$)`
+! `${k1}$_sort( array )` sorts the input `ARRAY` of type `${t1}$`
 ! using a hybrid sort based on the `introsort` of David Musser. As with
 ! `introsort`, `${k1}$_sort( array )` is an unstable hybrid comparison
 ! algorithm using `quicksort` for the main body of the sort tree,
@@ -73,11 +74,11 @@ contains
 ! Because it relies on `quicksort`, the coefficient of the O(N Ln(N))
 ! behavior is typically small compared to other sorting algorithms.
 
-        integer(${k1}$), intent(inout) :: array(0:)
+        ${t1}$, intent(inout) :: array(0:)
 
         integer(int32) :: depth_limit
 
-        depth_limit = 2 * int( floor( log( real( size( array, kind=int64 ),  &
+        depth_limit = 2 * int( floor( log( real( size( array, kind=int_size),  &
                                                  kind=dp) ) / log(2.0_dp) ), &
                                kind=int32 )
         call introsort(array, depth_limit)
@@ -89,7 +90,7 @@ contains
 ! is fewer than or equal to `INSERT_SIZE`, `heapsort` if the completion
 ! of the `quicksort` is too slow as estimated from `DEPTH_LIMIT`,
 ! otherwise sorting is done by a `quicksort`.
-            integer(${k1}$), intent(inout) :: array(0:)
+            ${t1}$, intent(inout) :: array(0:)
             integer(int32), intent(in)     :: depth_limit
 
             integer(int_size), parameter :: insert_size = 16_int_size
@@ -115,10 +116,10 @@ contains
 
         pure subroutine partition( array, index )
 ! quicksort partition using median of three.
-            integer(${k1}$), intent(inout) :: array(0:)
+            ${t1}$, intent(inout) :: array(0:)
             integer(int_size), intent(out) :: index
 
-            integer(${k1}$) :: u, v, w, x, y
+            ${t1}$ :: u, v, w, x, y
             integer(int_size) :: i, j
 
 ! Determine median of three and exchange it with the end.
@@ -158,10 +159,10 @@ contains
 
         pure subroutine insertion_sort( array )
 ! Bog standard insertion sort.
-            integer(${k1}$), intent(inout) :: array(0:)
+            ${t1}$, intent(inout) :: array(0:)
 
             integer(int_size) :: i, j
-            integer(${k1}$)   :: key
+            ${t1}$ :: key
 
             do j=1_int_size, size(array, kind=int_size)-1
                 key = array(j)
@@ -178,10 +179,10 @@ contains
 
         pure subroutine heap_sort( array )
 ! A bog standard heap sort
-            integer(${k1}$), intent(inout) :: array(0:)
+            ${t1}$, intent(inout) :: array(0:)
 
             integer(int_size) :: i, heap_size
-            integer(${k1}$)   :: y
+            ${t1}$   :: y
 
             heap_size = size( array, kind=int_size )
 ! Build the max heap
@@ -201,11 +202,11 @@ contains
 
         pure recursive subroutine max_heapify( array, i, heap_size )
 ! Transform the array into a max heap
-            integer(${k1}$), intent(inout) :: array(0:)
+            ${t1}$, intent(inout) :: array(0:)
             integer(int_size), intent(in)  :: i, heap_size
 
             integer(int_size) :: l, r, largest
-            integer(${k1}$)   :: y
+            ${t1}$   :: y
 
             largest = i
             l = 2_int_size * i + 1_int_size
@@ -230,177 +231,6 @@ contains
 #:endfor
 
 
-#:for k1 in REAL_KINDS
-
-    pure module subroutine ${k1}$_sort( array )
-
-! `${k1}$_sort( array )` sorts the input `ARRAY` of type `REAL(${k1}$)`
-! using a hybrid sort based on the `introsort` of David Musser. As with
-! `introsort`, `${k1}$_sort( array )` is an unstable hybrid comparison
-! algorithm using `quicksort` for the main body of the sort tree,
-! supplemented by `insertion sort` for the outer brances, but if
-! `quicksort` is converging too slowly the algorithm resorts
-! to `heapsort`. The algorithm is of order O(N Ln(N)) for all inputs.
-! Because it relies on `quicksort`, the coefficient of the O(N Ln(N))
-! behavior is typically small compared to other sorting algorithms.
-
-        real(${k1}$), intent(inout) :: array(0:)
-        integer(int32)              :: depth_limit
-
-        depth_limit = &
-            2 * int( floor( log( real( size( array, kind=int_size ),   &
-                                             kind=dp) ) / log(2.0_dp) ), &
-                               kind=int32 )
-        call introsort(array, depth_limit)
-
-    contains
-
-        pure recursive subroutine introsort( array, depth_limit )
-! It devolves to `insertionsort` if the remaining number of elements
-! is fewer than or equal to `INSERT_SIZE`, `heapsort` if the completion of
-! the quicksort is too slow as estimated from `DEPTH_LIMIT`, otherwise
-! sorting is done by a `quicksort`.
-            real(${k1}$), intent(inout) :: array(0:)
-            integer(int32), intent(in)  :: depth_limit
-
-            integer(int_size), parameter :: insert_size = 16_int_size
-            integer(int_size)            :: index
-
-            if ( size(array, kind=int_size) <= insert_size ) then
-                ! May be best at the end of SORT processing the whole array
-                ! See Musser, D.R., “Introspective Sorting and Selection
-                ! Algorithms,” Software—Practice and Experience, Vol. 27(8),
-                ! 983–993 (August 1997).
-
-                call insertion_sort( array )
-
-            else if ( depth_limit == 0 ) then
-                call heap_sort( array )
-
-            else
-                call partition( array, index )
-                call introsort( array(0:index-1), depth_limit-1 )
-                call introsort( array(index+1:), depth_limit-1 )
-            end if
-
-        end subroutine introsort
-
-
-        pure subroutine partition( array, index )
-! quicksort partition using median of three.
-            real(${k1}$), intent(inout)    :: array(0:)
-            integer(int_size), intent(out) :: index
-
-            real(${k1}$)      :: u, v, w, x, y
-            integer(int_size) :: i, j
-
-! Determine median of three and exchange it with the end.
-            u = array( 0 )
-            v = array( size(array, kind=int_size)/2-1 )
-            w = array( size(array, kind=int_size)-1 )
-            if ( (u > v) .neqv. (u > w) ) then
-                x = u
-                y = array(0)
-                array(0) = array( size( array, kind=int_size ) - 1 )
-                array( size( array, kind=int_size ) - 1 ) = y
-            else if ( (v < u) .neqv. (v < w) ) then
-                x = v
-                y = array(size( array, kind=int_size )/2-1)
-                array(size( array, kind=int_size )/2-1) = &
-                    array(size( array, kind=int_size )-1)
-                array( size( array, kind=int_size )-1 ) = y
-            else
-                x = w
-            end if
-! Partition the array
-            i = -1_int_size
-            do j = 0_int_size, size(array, kind=int_size)-2
-                if ( array(j) <= x ) then
-                    i = i + 1
-                    y = array(i)
-                    array(i) = array(j)
-                    array(j) = y
-                end if
-            end do
-            y = array(i+1)
-            array(i+1) = array(size(array, kind=int_size)-1)
-            array(size(array, kind=int_size)-1) = y
-            index = i + 1
-
-        end subroutine partition
-
-        pure subroutine insertion_sort( array )
-! Bog standard insertion sort.
-            real(${k1}$), intent(inout) :: array(0:)
-
-            integer(int_size) :: i, j
-            real(${k1}$)      :: key
-
-            do j=1_int_size, size(array, kind=int_size)-1
-                key = array(j)
-                i = j - 1
-                do while( i >= 0 )
-                    if ( array(i) <= key ) exit
-                    array(i+1) = array(i)
-                    i = i - 1
-                end do
-                array(i+1) = key
-            end do
-
-        end subroutine insertion_sort
-
-        pure subroutine heap_sort( array )
-! A bog standard heap sort
-            real(${k1}$), intent(inout) :: array(0:)
-
-            integer(int_size) :: i, heap_size
-            real(${k1}$)      :: y
-
-            heap_size = size( array, kind=int_size )
-! Build the max heap
-            do i = (heap_size-2)/2_int_size, 0_int_size, -1_int_size
-                call max_heapify( array, i, heap_size )
-            end do
-            do i = heap_size-1, 1_int_size, -1_int_size
-! Swap the first element with the current final element
-                y = array(0)
-                array(0) = array(i)
-                array(i) = y
-! Sift down using max_heapify
-                call max_heapify( array, 0_int_size, i )
-            end do
-
-        end subroutine heap_sort
-
-        pure recursive subroutine max_heapify( array, i, heap_size )
-! Transform the array into a max heap
-            real(${k1}$), intent(inout)   :: array(0:)
-            integer(int_size), intent(in) :: i, heap_size
-
-            integer(int_size) :: l, r, largest
-            real(${k1}$)      :: y
-
-            largest = i
-            l = 2_int_size * i + 1_int_size
-            r = l + 1_int_size
-            if ( l < heap_size ) then
-                if ( array(l) > array(largest) ) largest = l
-            end if
-            if ( r < heap_size ) then
-                if ( array(r) > array(largest) ) largest = r
-            end if
-            if ( largest /= i ) then
-                y = array(i)
-                array(i) = array(largest)
-                array(largest) = y
-                call max_heapify( array, largest, heap_size )
-            end if
-
-        end subroutine max_heapify
-
-    end subroutine ${k1}$_sort
-
-#:endfor
 
 
     pure module subroutine char_sort( array )

From 450247ee01fa49b1361b53c0fe35a63d5c7409e0 Mon Sep 17 00:00:00 2001
From: Jeremie Vandenplas <jeremie.vandenplas@gmail.com>
Date: Thu, 20 May 2021 21:17:17 +0200
Subject: [PATCH 24/45] sort_index: add fypp variable to reduce code size

---
 src/stdlib_sorting.fypp            |  24 +-
 src/stdlib_sorting_sort_index.fypp | 437 +----------------------------
 2 files changed, 19 insertions(+), 442 deletions(-)

diff --git a/src/stdlib_sorting.fypp b/src/stdlib_sorting.fypp
index feb138be0..07eb5a72a 100644
--- a/src/stdlib_sorting.fypp
+++ b/src/stdlib_sorting.fypp
@@ -435,40 +435,24 @@ module stdlib_sorting
 !! with a value of `.TRUE.` the indices correspond to a non-increasing sort.
 
 #:for k1 in INT_KINDS
+#:for k1, t1 in IR_KINDS_TYPES
         module subroutine ${k1}$_sort_index( array, index, work, iwork, &
             reverse )
 !! Version: experimental
 !!
-!! `${k1}$_sort_index( array )` sorts an input `ARRAY` of type `INTEGER(${k1}$)`
+!! `${k1}$_sort_index( array )` sorts an input `ARRAY` of type `${t1}$`
 !! using a hybrid sort based on the `'Rust" sort` algorithm found in `slice.rs`
 !! and returns the sorted `ARRAY` and an array `INDEX of indices in the
 !! order that would sort the input `ARRAY` in the desired direction.
-            integer(${k1}$), intent(inout)             :: array(0:)
+            ${t1}$, intent(inout)                      :: array(0:)
             integer(int_size), intent(inout)           :: index(0:)
-            integer(${k1}$), intent(inout), optional   :: work(0:)
+            ${t1}$, intent(inout), optional            :: work(0:)
             integer(int_size), intent(inout), optional :: iwork(0:)
             logical, intent(in), optional              :: reverse
         end subroutine ${k1}$_sort_index
 
 #:endfor
 
-#:for k1 in REAL_KINDS
-        module subroutine ${k1}$_sort_index( array, index, work, iwork, &
-            reverse )
-!! Version: experimental
-!!
-!! `${k1}$_sort_index( array )` sorts an input `ARRAY` of type `REAL(${k1}$)`
-!! using a hybrid sort based on the `'Rust" sort` algorithm found in `slice.rs`
-!! and returns the sorted `ARRAY` and an array `INDEX of indices in the
-!! order that would sort the input `ARRAY` in the desired direction.
-            real(${k1}$), intent(inout)                :: array(0:)
-            integer(int_size), intent(inout)           :: index(0:)
-            real(${k1}$), intent(inout), optional      :: work(0:)
-            integer(int_size), intent(inout), optional :: iwork(0:)
-            logical, intent(in), optional              :: reverse
-        end subroutine ${k1}$_sort_index
-
-#:endfor
         module subroutine char_sort_index( array, index, work, iwork, &
             reverse )
 !! Version: experimental
diff --git a/src/stdlib_sorting_sort_index.fypp b/src/stdlib_sorting_sort_index.fypp
index ea488768f..f592ccf9d 100644
--- a/src/stdlib_sorting_sort_index.fypp
+++ b/src/stdlib_sorting_sort_index.fypp
@@ -1,4 +1,5 @@
 #:include "common.fypp"
+#:set IR_KINDS_TYPES = INT_KINDS_TYPES + REAL_KINDS_TYPES
 
 !! Licensing:
 !!
@@ -55,7 +56,7 @@ submodule(stdlib_sorting) stdlib_sorting_sort_index
 
 contains
 
-#:for k1 in INT_KINDS
+#:for k1, t1 in IR_KINDS_TYPES
 
     module subroutine ${k1}$_sort_index( array, index, work, iwork, reverse )
 ! A modification of `${k1}$_ord_sort` to return an array of indices that
@@ -82,14 +83,14 @@ contains
 ! original listsort.txt, and the optional `work` and `iwork` arraya to be
 ! used as scratch memory.
 
-        integer(${k1}$), intent(inout)             :: array(0:)
+        ${t1}$, intent(inout)                      :: array(0:)
         integer(int_size), intent(inout)           :: index(0:)
-        integer(${k1}$), intent(inout), optional   :: work(0:)
+        ${t1}$, intent(inout), optional            :: work(0:)
         integer(int_size), intent(inout), optional :: iwork(0:)
         logical, intent(in), optional              :: reverse
 
         integer(int_size) :: array_size, i, stat
-        integer(${k1}$), allocatable :: buf(:)
+        ${t1}$, allocatable :: buf(:)
         integer(int_size), allocatable :: ibuf(:)
 
         array_size = size(array, kind=int_size)
@@ -157,11 +158,11 @@ contains
         pure subroutine insertion_sort( array, index )
 ! Sorts `ARRAY` using an insertion sort, while maintaining consistency in
 ! location of the indices in `INDEX` to the elements of `ARRAY`.
-            integer(${k1}$), intent(inout) :: array(0:)
+            ${t1}$, intent(inout)            :: array(0:)
             integer(int_size), intent(inout) :: index(0:)
 
             integer(int_size) :: i, j, key_index
-            integer(${k1}$) :: key
+            ${t1}$ :: key
 
             do j=1, size(array, kind=int_size)-1
                 key = array(j)
@@ -240,10 +241,10 @@ contains
 ! Consistency of the indices in `index` with the elements of `array`
 ! are maintained.
 
-            integer(${k1}$), intent(inout)   :: array(0:)
+            ${t1}$, intent(inout)            :: array(0:)
             integer(int_size), intent(inout) :: index(0:)
 
-            integer(${k1}$) :: tmp
+            ${t1}$ :: tmp
             integer(int_size) :: i, tmp_index
 
             tmp = array(0)
@@ -280,9 +281,9 @@ contains
 ! worst-case. Consistency of the indices in `index` with the elements of
 ! `array` are maintained.
 
-            integer(${k1}$), intent(inout)   :: array(0:)
+            ${t1}$, intent(inout)            :: array(0:)
             integer(int_size), intent(inout) :: index(0:)
-            integer(${k1}$), intent(inout)   :: buf(0:)
+            ${t1}$, intent(inout)            :: buf(0:)
             integer(int_size), intent(inout) :: ibuf(0:)
 
             integer(int_size) :: array_size, finish, min_run, r, r_count, &
@@ -373,9 +374,9 @@ contains
 ! using `BUF` as temporary storage, and stores the merged runs into
 ! `ARRAY(0:)`. `MID` must be > 0, and < `SIZE(ARRAY)-1`. Buffer `BUF`
 ! must be long enough to hold the shorter of the two runs.
-            integer(${k1}$), intent(inout)   :: array(0:)
+            ${t1}$, intent(inout)            :: array(0:)
             integer(int_size), intent(in)    :: mid
-            integer(${k1}$), intent(inout)   :: buf(0:)
+            ${t1}$, intent(inout)            :: buf(0:)
             integer(int_size), intent(inout) :: index(0:)
             integer(int_size), intent(inout) :: ibuf(0:)
 
@@ -438,11 +439,11 @@ contains
 
         pure subroutine reverse_segment( array, index )
 ! Reverse a segment of an array in place
-            integer(${k1}$), intent(inout) :: array(0:)
+            ${t1}$, intent(inout) :: array(0:)
             integer(int_size), intent(inout) :: index(0:)
 
             integer(int_size) :: itemp, lo, hi
-            integer(${k1}$) :: temp
+            ${t1}$ :: temp
 
             lo = 0
             hi = size( array, kind=int_size ) - 1
@@ -464,414 +465,6 @@ contains
 #:endfor
 
 
-#:for k1 in REAL_KINDS
-
-    module subroutine ${k1}$_sort_index( array, index, work, iwork, reverse )
-! A modification of `${k1}$_ord_sort` to return an array of indices that
-! would perform a stable sort of the `ARRAY` as input, and also sort `ARRAY`
-! as desired. The indices by default
-! correspond to a non-decreasing sort, but if the optional argument
-! `REVERSE` is present with a value of `.TRUE.` the indices correspond to
-! a non-increasing sort. The logic of the determination of indexing largely
-! follows the `"Rust" sort` found in `slice.rs`:
-! https://github.com/rust-lang/rust/blob/90eb44a5897c39e3dff9c7e48e3973671dcd9496/src/liballoc/slice.rs#L2159
-! The Rust version is a simplification of the Timsort algorithm described
-! in https://svn.python.org/projects/python/trunk/Objects/listsort.txt, as
-! it drops both the use of 'galloping' to identify bounds of regions to be
-! sorted and the estimation of the optimal `run size`. However it remains
-! a hybrid sorting algorithm combining an iterative Merge sort controlled
-! by a stack of `RUNS` identified by regions of uniformly decreasing or
-! non-decreasing sequences that may be expanded to a minimum run size, with
-! an insertion sort.
-!
-! Note the Fortran implementation simplifies the logic as it only has to
-! deal with Fortran arrays of intrinsic types and not the full generality
-! of Rust's arrays and lists for arbitrary types. It also adds the
-! estimation of the optimal `run size` as suggested in Tim Peters'
-! original listsort.txt, and the optional `work` and `iwork` arraya to be
-! used as scratch memory.
-
-        real(${k1}$), intent(inout)             :: array(0:)
-        integer(int_size), intent(inout)           :: index(0:)
-        real(${k1}$), intent(inout), optional   :: work(0:)
-        integer(int_size), intent(inout), optional :: iwork(0:)
-        logical, intent(in), optional              :: reverse
-
-        integer(int_size) :: array_size, i, stat
-        real(${k1}$), allocatable :: buf(:)
-        integer(int_size), allocatable :: ibuf(:)
-
-        array_size = size(array, kind=int_size)
-
-        do i = 0, array_size-1
-            index(i) = i+1
-        end do
-
-        if ( present(reverse) ) then
-            if ( reverse ) then
-                call reverse_segment( array, index )
-            end if
-        end if
-
-! If necessary allocate buffers to serve as scratch memory.
-        if ( present(work) ) then
-            if ( present(iwork) ) then
-                call merge_sort( array, index, work, iwork )
-            else
-                allocate( ibuf(0:array_size/2-1), stat=stat )
-                if ( stat /= 0 ) error stop "Allocation of index buffer failed."
-                call merge_sort( array, index, work, ibuf )
-            end if
-        else
-            allocate( buf(0:array_size/2-1), stat=stat )
-            if ( stat /= 0 ) error stop "Allocation of array buffer failed."
-            if ( present(iwork) ) then
-                call merge_sort( array, index, buf, iwork )
-            else
-                allocate( ibuf(0:array_size/2-1), stat=stat )
-                if ( stat /= 0 ) error stop "Allocation of index buffer failed."
-                call merge_sort( array, index, buf, ibuf )
-            end if
-        end if
-
-        if ( present(reverse) ) then
-            if ( reverse ) then
-                call reverse_segment( array, index )
-            end if
-        end if
-
-    contains
-
-        pure function calc_min_run( n ) result(min_run)
-!! Returns the minimum length of a run from 32-63 so that N/MIN_RUN is
-!! less than or equal to a power of two. See
-!! https://svn.python.org/projects/python/trunk/Objects/listsort.txt
-            integer(int_size)             :: min_run
-            integer(int_size), intent(in) :: n
-
-            integer(int_size) :: num, r
-
-            num = n
-            r = 0_int_size
-
-            do while( num >= 64 )
-                r = ior( r, iand(num, 1_int_size) )
-                num = ishft(num, -1_int_size)
-            end do
-            min_run = num + r
-
-        end function calc_min_run
-
-
-        pure subroutine insertion_sort( array, index )
-! Sorts `ARRAY` using an insertion sort, while maintaining consistency in
-! location of the indices in `INDEX` to the elements of `ARRAY`.
-            real(${k1}$), intent(inout) :: array(0:)
-            integer(int_size), intent(inout) :: index(0:)
-
-            integer(int_size) :: i, j, key_index
-            real(${k1}$) :: key
-
-            do j=1, size(array, kind=int_size)-1
-                key = array(j)
-                key_index = index(j)
-                i = j - 1
-                do while( i >= 0 .and. array(i) > key )
-                    array(i+1) = array(i)
-                    index(i+1) = index(i)
-                    i = i - 1
-                end do
-                array(i+1) = key
-                index(i+1) = key_index
-            end do
-
-        end subroutine insertion_sort
-
-
-        pure function collapse( runs ) result ( r )
-! Examine the stack of runs waiting to be merged, identifying adjacent runs
-! to be merged until the stack invariants are restablished:
-!
-! 1. len(-3) > len(-2) + len(-1)
-! 2. len(-2) > len(-1)
-
-            integer(int_size) :: r
-            type(run_type), intent(in), target :: runs(0:)
-
-            integer(int_size) :: n
-            logical :: test
-
-            n = size(runs, kind=int_size)
-            test = .false.
-            if (n >= 2) then
-                if ( runs( n-1 ) % base == 0 .or. &
-                     runs( n-2 ) % len <= runs(n-1) % len ) then
-                    test = .true.
-                else if ( n >= 3 ) then ! X exists
-                    if ( runs(n-3) % len <= &
-                         runs(n-2) % len + runs(n-1) % len ) then
-                        test = .true.
-!               |X| <= |Y| + |Z| => will need to merge due to rho1 or rho2
-                    else if( n >= 4 ) then
-                        if ( runs(n-4) % len <= &
-                             runs(n-3) % len + runs(n-2) % len ) then
-                            test = .true.
-!               |W| <= |X| + |Y| => will need to merge due to rho1 or rho3
-                        end if
-                    end if
-                end if
-            end if
-            if ( test ) then
-! By default merge Y & Z, rho2 or rho3
-                if ( n >= 3 ) then
-                    if ( runs(n-3) % len < runs(n-1) % len ) then
-                        r = n - 3
-! |X| < |Z| => merge X & Y, rho1
-                        return
-                    end if
-                end if
-                r = n - 2
-! |Y| <= |Z| => merge Y & Z, rho4
-                return
-            else
-                r = -1
-            end if
-
-        end function collapse
-
-
-        pure subroutine insert_head( array, index )
-! Inserts `array(0)` into the pre-sorted sequence `array(1:)` so that the
-! whole `array(0:)` becomes sorted, copying the first element into
-! a temporary variable, iterating until the right place for it is found.
-! copying every traversed element into the slot preceding it, and finally,
-! copying data from the temporary variable into the resulting hole.
-! Consistency of the indices in `index` with the elements of `array`
-! are maintained.
-
-            real(${k1}$), intent(inout)   :: array(0:)
-            integer(int_size), intent(inout) :: index(0:)
-
-            real(${k1}$) :: tmp
-            integer(int_size) :: i, tmp_index
-
-            tmp = array(0)
-            tmp_index = index(0)
-            find_hole: do i=1, size(array, kind=int_size)-1
-                if ( array(i) >= tmp ) exit find_hole
-                array(i-1) = array(i)
-                index(i-1) = index(i)
-            end do find_hole
-            array(i-1) = tmp
-            index(i-1) = tmp_index
-
-        end subroutine insert_head
-
-
-        subroutine merge_sort( array, index, buf, ibuf )
-! The Rust merge sort borrows some (but not all) of the ideas from TimSort,
-! which is described in detail at
-! (http://svn.python.org/projects/python/trunk/Objects/listsort.txt).
-!
-! The algorithm identifies strictly descending and non-descending
-! subsequences, which are called natural runs. Where these runs are less
-! than a minimum run size they are padded by adding additional samples
-! using an insertion sort. The merge process is driven by a stack of
-! pending unmerged runs. Each newly found run is pushed onto the stack,
-! and then pairs of adjacentd runs are merged until these two invariants
-! are satisfied:
-!
-! 1. for every `i` in `1..size(runs)-1`: `runs(i - 1)%len > runs(i)%len`
-! 2. for every `i` in `2..size(runs)-1`: `runs(i - 2)%len >
-!    runs(i - 1)%len + runs(i)%len`
-!
-! The invariants ensure that the total running time is `O(n log n)`
-! worst-case. Consistency of the indices in `index` with the elements of
-! `array` are maintained.
-
-            real(${k1}$), intent(inout)   :: array(0:)
-            integer(int_size), intent(inout) :: index(0:)
-            real(${k1}$), intent(inout)   :: buf(0:)
-            integer(int_size), intent(inout) :: ibuf(0:)
-
-            integer(int_size) :: array_size, finish, min_run, r, r_count, &
-                start
-            type(run_type) :: runs(0:max_merge_stack-1), left, right
-
-            array_size = size(array, kind=int_size)
-
-! Very short runs are extended using insertion sort to span at least this
-! many elements. Slices of up to this length are sorted using insertion sort.
-            min_run = calc_min_run( array_size )
-
-            if ( array_size <= min_run ) then
-                if ( array_size >= 2 ) call insertion_sort( array, index )
-                return
-            end if
-
-
-! Following Rust sort, natural runs in `array` are identified by traversing
-! it backwards. By traversing it backward, merges more often go in the
-! opposite direction (forwards). According to developers of Rust sort,
-! merging forwards is slightly faster than merging backwards. Therefore
-! identifying runs by traversing backwards should improve performance.
-            r_count = 0
-            finish = array_size - 1
-            do while ( finish >= 0 )
-! Find the next natural run, and reverse it if it's strictly descending.
-                start = finish
-                if ( start > 0 ) then
-                    start = start - 1
-                    if ( array(start+1) < array(start) ) then
-                        Descending: do while ( start > 0 )
-                            if ( array(start) >= array(start-1) ) &
-                                exit Descending
-                            start = start - 1
-                        end do Descending
-                        call reverse_segment( array(start:finish), &
-                                            index(start:finish) )
-                    else
-                        Ascending: do while( start > 0 )
-                            if ( array(start) < array(start-1) ) exit Ascending
-                            start = start - 1
-                        end do Ascending
-                    end if
-                end if
-
-! If the run is too short insert some more elements using an insertion sort.
-                Insert: do while ( start > 0 )
-                    if ( finish - start >= min_run - 1 ) exit Insert
-                    start = start - 1
-                    call insert_head( array(start:finish), index(start:finish) )
-                end do Insert
-                if ( start == 0 .and. finish == array_size - 1 ) return
-
-                runs(r_count) = run_type( base = start, &
-                                          len = finish - start + 1 )
-                finish = start-1
-                r_count = r_count + 1
-
-! Determine whether pairs of adjacent runs need to be merged to satisfy
-! the invariants, and, if so, merge them.
-                Merge_loop: do
-                    r = collapse( runs(0:r_count - 1) )
-                    if ( r < 0 .or. r_count <= 1 ) exit Merge_loop
-                    left = runs( r + 1 )
-                    right = runs( r )
-                    call merge( array( left % base: &
-                                     right % base + right % len - 1 ), &
-                                left % len, buf, &
-                                index( left % base: &
-                                     right % base + right % len - 1 ), ibuf )
-
-                    runs(r) = run_type( base = left % base, &
-                                        len = left % len + right % len )
-                    if ( r == r_count - 3 ) runs(r+1) = runs(r+2)
-                    r_count = r_count - 1
-
-                end do Merge_loop
-            end do
-            if ( r_count /= 1 ) &
-                error stop "MERGE_SORT completed without RUN COUNT == 1."
-
-        end subroutine merge_sort
-
-
-        pure subroutine merge( array, mid, buf, index, ibuf )
-! Merges the two non-decreasing runs `ARRAY(0:MID-1)` and `ARRAY(MID:)`
-! using `BUF` as temporary storage, and stores the merged runs into
-! `ARRAY(0:)`. `MID` must be > 0, and < `SIZE(ARRAY)-1`. Buffer `BUF`
-! must be long enough to hold the shorter of the two runs.
-            real(${k1}$), intent(inout)   :: array(0:)
-            integer(int_size), intent(in)    :: mid
-            real(${k1}$), intent(inout)   :: buf(0:)
-            integer(int_size), intent(inout) :: index(0:)
-            integer(int_size), intent(inout) :: ibuf(0:)
-
-            integer(int_size) :: array_len, i, j, k
-
-            array_len = size(array, kind=int_size)
-
-! Merge first copies the shorter run into `buf`. Then, depending on which
-! run was shorter, it traces the copied run and the longer run forwards
-! (or backwards), comparing their next unprocessed elements and then
-! copying the lesser (or greater) one into `array`.
-
-            if ( mid <= array_len - mid ) then ! The left run is shorter.
-                buf(0:mid-1) = array(0:mid-1)
-                ibuf(0:mid-1) = index(0:mid-1)
-                i = 0
-                j = mid
-                merge_lower: do k = 0, array_len-1
-                    if ( buf(i) <= array(j) ) then
-                        array(k) = buf(i)
-                        index(k) = ibuf(i)
-                        i = i + 1
-                        if ( i >= mid ) exit merge_lower
-                    else
-                        array(k) = array(j)
-                        index(k) = index(j)
-                        j = j + 1
-                        if ( j >= array_len ) then
-                            array(k+1:) = buf(i:mid-1)
-                            index(k+1:) = ibuf(i:mid-1)
-                            exit merge_lower
-                        end if
-                    end if
-                end do merge_lower
-            else ! The right run is shorter
-                buf(0:array_len-mid-1) = array(mid:array_len-1)
-                ibuf(0:array_len-mid-1) = index(mid:array_len-1)
-                i = mid - 1
-                j = array_len - mid -1
-                merge_upper: do k = array_len-1, 0, -1
-                    if ( buf(j) >= array(i) ) then
-                        array(k) = buf(j)
-                        index(k) = ibuf(j)
-                        j = j - 1
-                        if ( j < 0 ) exit merge_upper
-                    else
-                        array(k) = array(i)
-                        index(k) = index(i)
-                        i = i - 1
-                        if ( i < 0 ) then
-                            array(0:k-1) = buf(0:j)
-                            index(0:k-1) = ibuf(0:j)
-                            exit merge_upper
-                        end if
-                    end if
-                end do merge_upper
-            end if
-        end subroutine merge
-
-
-        pure subroutine reverse_segment( array, index )
-! Reverse a segment of an array in place
-            real(${k1}$), intent(inout) :: array(0:)
-            integer(int_size), intent(inout) :: index(0:)
-
-            integer(int_size) :: itemp, lo, hi
-            real(${k1}$) :: temp
-
-            lo = 0
-            hi = size( array, kind=int_size ) - 1
-            do while( lo < hi )
-                temp = array(lo)
-                array(lo) = array(hi)
-                array(hi) = temp
-                itemp = index(lo)
-                index(lo) = index(hi)
-                index(hi) = itemp
-                lo = lo + 1
-                hi = hi - 1
-            end do
-
-        end subroutine reverse_segment
-
-    end subroutine ${k1}$_sort_index
-
-#:endfor
-
     module subroutine char_sort_index( array, index, work, iwork, reverse )
 ! A modification of `char_ord_sort` to return an array of indices that
 ! would perform a stable sort of the `ARRAY` as input, and also sort `ARRAY`

From b7d79af678283bdbfeebf850485ef3bda4899d2f Mon Sep 17 00:00:00 2001
From: Jeremie Vandenplas <jeremie.vandenplas@gmail.com>
Date: Thu, 20 May 2021 21:23:52 +0200
Subject: [PATCH 25/45] fix issue

---
 src/stdlib_sorting.fypp | 1 -
 1 file changed, 1 deletion(-)

diff --git a/src/stdlib_sorting.fypp b/src/stdlib_sorting.fypp
index 07eb5a72a..ef391ee70 100644
--- a/src/stdlib_sorting.fypp
+++ b/src/stdlib_sorting.fypp
@@ -434,7 +434,6 @@ module stdlib_sorting
 !! non-decreasing sort, but if the optional argument `REVERSE` is present
 !! with a value of `.TRUE.` the indices correspond to a non-increasing sort.
 
-#:for k1 in INT_KINDS
 #:for k1, t1 in IR_KINDS_TYPES
         module subroutine ${k1}$_sort_index( array, index, work, iwork, &
             reverse )

From 30fda4281f02786e535826c0d82bc47c14604dba Mon Sep 17 00:00:00 2001
From: Jeremie Vandenplas <jeremie.vandenplas@gmail.com>
Date: Thu, 20 May 2021 21:38:38 +0200
Subject: [PATCH 26/45] sort_index: add fypp variable for code simplifications

---
 src/common.fypp                    |  15 +-
 src/stdlib_sorting.fypp            |  19 +-
 src/stdlib_sorting_sort_index.fypp | 408 +----------------------------
 3 files changed, 16 insertions(+), 426 deletions(-)

diff --git a/src/common.fypp b/src/common.fypp
index 0dbd3e7d0..42be93691 100644
--- a/src/common.fypp
+++ b/src/common.fypp
@@ -3,7 +3,7 @@
 #! Real kinds to be considered during templating
 #:set REAL_KINDS = ["sp", "dp", "qp"]
 
-#! Real types to be considere during templating
+#! Real types to be considered during templating
 #:set REAL_TYPES = ["real({})".format(k) for k in REAL_KINDS]
 
 #! Collected (kind, type) tuples for real types
@@ -12,7 +12,7 @@
 #! Complex kinds to be considered during templating
 #:set CMPLX_KINDS = ["sp", "dp", "qp"]
 
-#! Complex types to be considere during templating
+#! Complex types to be considered during templating
 #:set CMPLX_TYPES = ["complex({})".format(k) for k in CMPLX_KINDS]
 
 #! Collected (kind, type) tuples for complex types
@@ -21,12 +21,21 @@
 #! Integer kinds to be considered during templating
 #:set INT_KINDS = ["int8", "int16", "int32", "int64"]
 
-#! Integer types to be considere during templating
+#! Integer types to be considered during templating
 #:set INT_TYPES = ["integer({})".format(k) for k in INT_KINDS]
 
 #! Collected (kind, type) tuples for integer types
 #:set INT_KINDS_TYPES = list(zip(INT_KINDS, INT_TYPES))
 
+#! Derived type string_type
+#:set STRING_KINDS = ["string_type"]
+
+#! String types to be considered during templating
+#:set STRING_TYPES = ["type({})".format(k) for k in STRING_KINDS]
+
+#! Collected (kind, type) tuples for string derived types
+#:set STRING_KINDS_TYPES = list(zip(STRING_KINDS, STRING_TYPES))
+
 
 #! Whether Fortran 90 compatible code should be generated
 #:set VERSION90 = defined('VERSION90')
diff --git a/src/stdlib_sorting.fypp b/src/stdlib_sorting.fypp
index ef391ee70..d87a6fd59 100644
--- a/src/stdlib_sorting.fypp
+++ b/src/stdlib_sorting.fypp
@@ -1,5 +1,5 @@
 #:include "common.fypp"
-#:set IR_KINDS_TYPES = INT_KINDS_TYPES + REAL_KINDS_TYPES
+#:set IR_KINDS_TYPES = INT_KINDS_TYPES + REAL_KINDS_TYPES + STRING_KINDS_TYPES
 
 !! Licensing:
 !!
@@ -434,7 +434,7 @@ module stdlib_sorting
 !! non-decreasing sort, but if the optional argument `REVERSE` is present
 !! with a value of `.TRUE.` the indices correspond to a non-increasing sort.
 
-#:for k1, t1 in IR_KINDS_TYPES
+#:for k1, t1 in IRS_KINDS_TYPES
         module subroutine ${k1}$_sort_index( array, index, work, iwork, &
             reverse )
 !! Version: experimental
@@ -467,21 +467,6 @@ module stdlib_sorting
             logical, intent(in), optional                      :: reverse
         end subroutine char_sort_index
 
-        module subroutine string_sort_index( array, index, work, iwork, &
-            reverse )
-!! Version: experimental
-!!
-!! `string_sort_index( array )` sorts an input `ARRAY` of type `STRING_TYPE`
-!! using a hybrid sort based on the `'Rust" sort` algorithm found in `slice.rs`
-!! and returns the sorted `ARRAY` and an array `INDEX of indices in the
-!! order that would sort the input `ARRAY` in the desired direction.
-            type(string_type), intent(inout)           :: array(0:)
-            integer(int_size), intent(inout)           :: index(0:)
-            type(string_type), intent(inout), optional :: work(0:)
-            integer(int_size), intent(inout), optional :: iwork(0:)
-            logical, intent(in), optional              :: reverse
-        end subroutine string_sort_index
-
     end interface sort_index
 
 
diff --git a/src/stdlib_sorting_sort_index.fypp b/src/stdlib_sorting_sort_index.fypp
index f592ccf9d..5a11a5efe 100644
--- a/src/stdlib_sorting_sort_index.fypp
+++ b/src/stdlib_sorting_sort_index.fypp
@@ -1,5 +1,5 @@
 #:include "common.fypp"
-#:set IR_KINDS_TYPES = INT_KINDS_TYPES + REAL_KINDS_TYPES
+#:set IRS_KINDS_TYPES = INT_KINDS_TYPES + REAL_KINDS_TYPES + STRING_KINDS_TYPES
 
 !! Licensing:
 !!
@@ -56,7 +56,7 @@ submodule(stdlib_sorting) stdlib_sorting_sort_index
 
 contains
 
-#:for k1, t1 in IR_KINDS_TYPES
+#:for k1, t1 in IRS_KINDS_TYPES
 
     module subroutine ${k1}$_sort_index( array, index, work, iwork, reverse )
 ! A modification of `${k1}$_ord_sort` to return an array of indices that
@@ -870,408 +870,4 @@ contains
 
     end subroutine char_sort_index
 
-    module subroutine string_sort_index( array, index, work, iwork, reverse )
-! A modification of `string_ord_sort` to return an array of indices that
-! would perform a stable sort of the `ARRAY` as input, and also sort `ARRAY`
-! as desired. The indices by default
-! correspond to a non-decreasing sort, but if the optional argument
-! `REVERSE` is present with a value of `.TRUE.` the indices correspond to
-! a non-increasing sort. The logic of the determination of indexing largely
-! follows the `"Rust" sort` found in `slice.rs`:
-! https://github.com/rust-lang/rust/blob/90eb44a5897c39e3dff9c7e48e3973671dcd9496/src/liballoc/slice.rs#L2159
-! The Rust version is a simplification of the Timsort algorithm described
-! in https://svn.python.org/projects/python/trunk/Objects/listsort.txt, as
-! it drops both the use of 'galloping' to identify bounds of regions to be
-! sorted and the estimation of the optimal `run size`. However it remains
-! a hybrid sorting algorithm combining an iterative Merge sort controlled
-! by a stack of `RUNS` identified by regions of uniformly decreasing or
-! non-decreasing sequences that may be expanded to a minimum run size, with
-! an insertion sort.
-!
-! Note the Fortran implementation simplifies the logic as it only has to
-! deal with Fortran arrays of intrinsic types and not the full generality
-! of Rust's arrays and lists for arbitrary types. It also adds the
-! estimation of the optimal `run size` as suggested in Tim Peters'
-! original listsort.txt, and the optional `work` and `iwork` arraya to be
-! used as scratch memory.
-
-        type(string_type), intent(inout)           :: array(0:)
-        integer(int_size), intent(inout)           :: index(0:)
-        type(string_type), intent(inout), optional :: work(0:)
-        integer(int_size), intent(inout), optional :: iwork(0:)
-        logical, intent(in), optional              :: reverse
-
-        integer(int_size)              :: array_size, i, stat
-        type(string_type), allocatable :: buf(:)
-        integer(int_size), allocatable :: ibuf(:)
-
-        array_size = size(array, kind=int_size)
-
-        do i = 0, array_size-1
-            index(i) = i+1
-        end do
-
-        if ( present(reverse) ) then
-            if ( reverse ) then
-                call reverse_segment( array, index )
-            end if
-        end if
-
-! If necessary allocate buffers to serve as scratch memory.
-        if ( present(work) ) then
-            if ( present(iwork) ) then
-                call merge_sort( array, index, work, iwork )
-            else
-                allocate( ibuf(0:array_size/2-1), stat=stat )
-                if ( stat /= 0 ) error stop "Allocation of index buffer failed."
-                call merge_sort( array, index, work, ibuf )
-            end if
-        else
-            allocate( buf(0:array_size/2-1), stat=stat )
-            if ( stat /= 0 ) error stop "Allocation of array buffer failed."
-            if ( present(iwork) ) then
-                call merge_sort( array, index, buf, iwork )
-            else
-                allocate( ibuf(0:array_size/2-1), stat=stat )
-                if ( stat /= 0 ) error stop "Allocation of index buffer failed."
-                call merge_sort( array, index, buf, ibuf )
-            end if
-        end if
-
-        if ( present(reverse) ) then
-            if ( reverse ) then
-                call reverse_segment( array, index )
-            end if
-        end if
-
-    contains
-
-        pure function calc_min_run( n ) result(min_run)
-!! Returns the minimum length of a run from 32-63 so that N/MIN_RUN is
-!! less than or equal to a power of two. See
-!! https://svn.python.org/projects/python/trunk/Objects/listsort.txt
-            integer(int_size)             :: min_run
-            integer(int_size), intent(in) :: n
-
-            integer(int_size) :: num, r
-
-            num = n
-            r = 0_int_size
-
-            do while( num >= 64 )
-                r = ior( r, iand(num, 1_int_size) )
-                num = ishft(num, -1_int_size)
-            end do
-            min_run = num + r
-
-        end function calc_min_run
-
-
-        pure subroutine insertion_sort( array, index )
-! Sorts `ARRAY` using an insertion sort, while maintaining consistency in
-! location of the indices in `INDEX` to the elements of `ARRAY`.
-            type(string_type), intent(inout) :: array(0:)
-            integer(int_size), intent(inout) :: index(0:)
-
-            integer(int_size) :: i, j, key_index
-            type(string_type) :: key
-
-            do j=1, size(array, kind=int_size)-1
-                key = array(j)
-                key_index = index(j)
-                i = j - 1
-                do while( i >= 0 .and. array(i) > key )
-                    array(i+1) = array(i)
-                    index(i+1) = index(i)
-                    i = i - 1
-                end do
-                array(i+1) = key
-                index(i+1) = key_index
-            end do
-
-        end subroutine insertion_sort
-
-
-        pure function collapse( runs ) result ( r )
-! Examine the stack of runs waiting to be merged, identifying adjacent runs
-! to be merged until the stack invariants are restablished:
-!
-! 1. len(-3) > len(-2) + len(-1)
-! 2. len(-2) > len(-1)
-
-            integer(int_size) :: r
-            type(run_type), intent(in), target :: runs(0:)
-
-            integer(int_size) :: n
-            logical :: test
-
-            n = size(runs, kind=int_size)
-            test = .false.
-            if (n >= 2) then
-                if ( runs( n-1 ) % base == 0 .or. &
-                     runs( n-2 ) % len <= runs(n-1) % len ) then
-                    test = .true.
-                else if ( n >= 3 ) then ! X exists
-                    if ( runs(n-3) % len <= &
-                         runs(n-2) % len + runs(n-1) % len ) then
-                        test = .true.
-!               |X| <= |Y| + |Z| => will need to merge due to rho1 or rho2
-                    else if( n >= 4 ) then
-                        if ( runs(n-4) % len <= &
-                             runs(n-3) % len + runs(n-2) % len ) then
-                            test = .true.
-!               |W| <= |X| + |Y| => will need to merge due to rho1 or rho3
-                        end if
-                    end if
-                end if
-            end if
-            if ( test ) then
-! By default merge Y & Z, rho2 or rho3
-                if ( n >= 3 ) then
-                    if ( runs(n-3) % len < runs(n-1) % len ) then
-                        r = n - 3
-! |X| < |Z| => merge X & Y, rho1
-                        return
-                    end if
-                end if
-                r = n - 2
-! |Y| <= |Z| => merge Y & Z, rho4
-                return
-            else
-                r = -1
-            end if
-
-        end function collapse
-
-
-        pure subroutine insert_head( array, index )
-! Inserts `array(0)` into the pre-sorted sequence `array(1:)` so that the
-! whole `array(0:)` becomes sorted, copying the first element into
-! a temporary variable, iterating until the right place for it is found.
-! copying every traversed element into the slot preceding it, and finally,
-! copying data from the temporary variable into the resulting hole.
-! Consistency of the indices in `index` with the elements of `array`
-! are maintained.
-
-            type(string_type), intent(inout) :: array(0:)
-            integer(int_size), intent(inout) :: index(0:)
-
-            type(string_type) :: tmp
-            integer(int_size) :: i, tmp_index
-
-            tmp = array(0)
-            tmp_index = index(0)
-            find_hole: do i=1, size(array, kind=int_size)-1
-                if ( array(i) >= tmp ) exit find_hole
-                array(i-1) = array(i)
-                index(i-1) = index(i)
-            end do find_hole
-            array(i-1) = tmp
-            index(i-1) = tmp_index
-
-        end subroutine insert_head
-
-
-        subroutine merge_sort( array, index, buf, ibuf )
-! The Rust merge sort borrows some (but not all) of the ideas from TimSort,
-! which is described in detail at
-! (http://svn.python.org/projects/python/trunk/Objects/listsort.txt).
-!
-! The algorithm identifies strictly descending and non-descending
-! subsequences, which are called natural runs. Where these runs are less
-! than a minimum run size they are padded by adding additional samples
-! using an insertion sort. The merge process is driven by a stack of
-! pending unmerged runs. Each newly found run is pushed onto the stack,
-! and then pairs of adjacentd runs are merged until these two invariants
-! are satisfied:
-!
-! 1. for every `i` in `1..size(runs)-1`: `runs(i - 1)%len > runs(i)%len`
-! 2. for every `i` in `2..size(runs)-1`: `runs(i - 2)%len >
-!    runs(i - 1)%len + runs(i)%len`
-!
-! The invariants ensure that the total running time is `O(n log n)`
-! worst-case. Consistency of the indices in `index` with the elements of
-! `array` are maintained.
-
-            type(string_type), intent(inout) :: array(0:)
-            integer(int_size), intent(inout) :: index(0:)
-            type(string_type), intent(inout) :: buf(0:)
-            integer(int_size), intent(inout) :: ibuf(0:)
-
-            integer(int_size) :: array_size, finish, min_run, r, r_count, &
-                start
-            type(run_type)    :: runs(0:max_merge_stack-1), left, right
-
-            array_size = size(array, kind=int_size)
-
-! Very short runs are extended using insertion sort to span at least this
-! many elements. Slices of up to this length are sorted using insertion sort.
-            min_run = calc_min_run( array_size )
-
-            if ( array_size <= min_run ) then
-                if ( array_size >= 2 ) call insertion_sort( array, index )
-                return
-            end if
-
-
-! Following Rust sort, natural runs in `array` are identified by traversing
-! it backwards. By traversing it backward, merges more often go in the
-! opposite direction (forwards). According to developers of Rust sort,
-! merging forwards is slightly faster than merging backwards. Therefore
-! identifying runs by traversing backwards should improve performance.
-            r_count = 0
-            finish = array_size - 1
-            do while ( finish >= 0 )
-! Find the next natural run, and reverse it if it's strictly descending.
-                start = finish
-                if ( start > 0 ) then
-                    start = start - 1
-                    if ( array(start+1) < array(start) ) then
-                        Descending: do while ( start > 0 )
-                            if ( array(start) >= array(start-1) ) &
-                                exit Descending
-                            start = start - 1
-                        end do Descending
-                        call reverse_segment( array(start:finish), &
-                                            index(start:finish) )
-                    else
-                        Ascending: do while( start > 0 )
-                            if ( array(start) < array(start-1) ) exit Ascending
-                            start = start - 1
-                        end do Ascending
-                    end if
-                end if
-
-! If the run is too short insert some more elements using an insertion sort.
-                Insert: do while ( start > 0 )
-                    if ( finish - start >= min_run - 1 ) exit Insert
-                    start = start - 1
-                    call insert_head( array(start:finish), index(start:finish) )
-                end do Insert
-                if ( start == 0 .and. finish == array_size - 1 ) return
-
-                runs(r_count) = run_type( base = start, &
-                                          len = finish - start + 1 )
-                finish = start-1
-                r_count = r_count + 1
-
-! Determine whether pairs of adjacent runs need to be merged to satisfy
-! the invariants, and, if so, merge them.
-                Merge_loop: do
-                    r = collapse( runs(0:r_count - 1) )
-                    if ( r < 0 .or. r_count <= 1 ) exit Merge_loop
-                    left = runs( r + 1 )
-                    right = runs( r )
-                    call merge( array( left % base: &
-                                     right % base + right % len - 1 ), &
-                                left % len, buf, &
-                                index( left % base: &
-                                     right % base + right % len - 1 ), ibuf )
-
-                    runs(r) = run_type( base = left % base, &
-                                        len = left % len + right % len )
-                    if ( r == r_count - 3 ) runs(r+1) = runs(r+2)
-                    r_count = r_count - 1
-
-                end do Merge_loop
-            end do
-            if ( r_count /= 1 ) &
-                error stop "MERGE_SORT completed without RUN COUNT == 1."
-
-        end subroutine merge_sort
-
-
-        pure subroutine merge( array, mid, buf, index, ibuf )
-! Merges the two non-decreasing runs `ARRAY(0:MID-1)` and `ARRAY(MID:)`
-! using `BUF` as temporary storage, and stores the merged runs into
-! `ARRAY(0:)`. `MID` must be > 0, and < `SIZE(ARRAY)-1`. Buffer `BUF`
-! must be long enough to hold the shorter of the two runs.
-            type(string_type), intent(inout) :: array(0:)
-            integer(int_size), intent(in)    :: mid
-            type(string_type), intent(inout) :: buf(0:)
-            integer(int_size), intent(inout) :: index(0:)
-            integer(int_size), intent(inout) :: ibuf(0:)
-
-            integer(int_size) :: array_len, i, j, k
-
-            array_len = size(array, kind=int_size)
-
-! Merge first copies the shorter run into `buf`. Then, depending on which
-! run was shorter, it traces the copied run and the longer run forwards
-! (or backwards), comparing their next unprocessed elements and then
-! copying the lesser (or greater) one into `array`.
-
-            if ( mid <= array_len - mid ) then ! The left run is shorter.
-                buf(0:mid-1) = array(0:mid-1)
-                ibuf(0:mid-1) = index(0:mid-1)
-                i = 0
-                j = mid
-                merge_lower: do k = 0, array_len-1
-                    if ( buf(i) <= array(j) ) then
-                        array(k) = buf(i)
-                        index(k) = ibuf(i)
-                        i = i + 1
-                        if ( i >= mid ) exit merge_lower
-                    else
-                        array(k) = array(j)
-                        index(k) = index(j)
-                        j = j + 1
-                        if ( j >= array_len ) then
-                            array(k+1:) = buf(i:mid-1)
-                            index(k+1:) = ibuf(i:mid-1)
-                            exit merge_lower
-                        end if
-                    end if
-                end do merge_lower
-            else ! The right run is shorter
-                buf(0:array_len-mid-1) = array(mid:array_len-1)
-                ibuf(0:array_len-mid-1) = index(mid:array_len-1)
-                i = mid - 1
-                j = array_len - mid -1
-                merge_upper: do k = array_len-1, 0, -1
-                    if ( buf(j) >= array(i) ) then
-                        array(k) = buf(j)
-                        index(k) = ibuf(j)
-                        j = j - 1
-                        if ( j < 0 ) exit merge_upper
-                    else
-                        array(k) = array(i)
-                        index(k) = index(i)
-                        i = i - 1
-                        if ( i < 0 ) then
-                            array(0:k-1) = buf(0:j)
-                            index(0:k-1) = ibuf(0:j)
-                            exit merge_upper
-                        end if
-                    end if
-                end do merge_upper
-            end if
-        end subroutine merge
-
-
-        pure subroutine reverse_segment( array, index )
-! Reverse a segment of an array in place
-            type(string_type), intent(inout)  :: array(0:)
-            integer(int_size), intent(inout) :: index(0:)
-
-            integer(int_size) :: itemp, lo, hi
-            type(string_type) :: temp
-
-            lo = 0
-            hi = size( array, kind=int_size ) - 1
-            do while( lo < hi )
-                temp = array(lo)
-                array(lo) = array(hi)
-                array(hi) = temp
-                itemp = index(lo)
-                index(lo) = index(hi)
-                index(hi) = itemp
-                lo = lo + 1
-                hi = hi - 1
-            end do
-
-        end subroutine reverse_segment
-
-    end subroutine string_sort_index
-
 end submodule stdlib_sorting_sort_index

From f26a47bf31caa579ce8882707e1add5e5bb4c7b0 Mon Sep 17 00:00:00 2001
From: Jeremie Vandenplas <jeremie.vandenplas@gmail.com>
Date: Thu, 20 May 2021 21:45:24 +0200
Subject: [PATCH 27/45] sort: add fypp variable for code simplifications + fix
 typo

---
 src/stdlib_sorting.fypp      |  15 +---
 src/stdlib_sorting_sort.fypp | 169 +----------------------------------
 2 files changed, 4 insertions(+), 180 deletions(-)

diff --git a/src/stdlib_sorting.fypp b/src/stdlib_sorting.fypp
index d87a6fd59..a6c45674b 100644
--- a/src/stdlib_sorting.fypp
+++ b/src/stdlib_sorting.fypp
@@ -1,5 +1,5 @@
 #:include "common.fypp"
-#:set IR_KINDS_TYPES = INT_KINDS_TYPES + REAL_KINDS_TYPES + STRING_KINDS_TYPES
+#:set IRS_KINDS_TYPES = INT_KINDS_TYPES + REAL_KINDS_TYPES + STRING_KINDS_TYPES
 
 !! Licensing:
 !!
@@ -385,7 +385,7 @@ module stdlib_sorting
 !! The generic subroutine interface implementing the `SORT` algorithm, based
 !! on the `introsort` of David Musser.
 
-#:for k1, t1 in IR_KINDS_TYPES
+#:for k1, t1 in IRS_KINDS_TYPES
         pure module subroutine ${k1}$_sort( array )
 !! Version: experimental
 !!
@@ -410,17 +410,6 @@ module stdlib_sorting
             character(len=*), intent(inout) :: array(0:)
         end subroutine char_sort
 
-        pure module subroutine string_sort( array )
-!! Version: experimental
-!!
-!! `string_sort( array )` sorts the input `ARRAY` of type `STRING_TYPE`
-!! using a hybrid sort based on the `introsort` of David Musser.
-!! The algorithm is of order O(N Ln(N)) for all inputs.
-!! Because it relies on `quicksort`, the coefficient of the O(N Ln(N))
-!! behavior is small for random data compared to other sorting algorithms.
-            type(string_type), intent(inout) :: array(0:)
-        end subroutine string_sort
-
     end interface sort
 
     interface sort_index
diff --git a/src/stdlib_sorting_sort.fypp b/src/stdlib_sorting_sort.fypp
index 89200ddb1..8f8d1f90a 100644
--- a/src/stdlib_sorting_sort.fypp
+++ b/src/stdlib_sorting_sort.fypp
@@ -1,5 +1,5 @@
 #:include "common.fypp"
-#:set IR_KINDS_TYPES = INT_KINDS_TYPES + REAL_KINDS_TYPES
+#:set IRS_KINDS_TYPES = INT_KINDS_TYPES + REAL_KINDS_TYPES + STRING_KINDS_TYPES
 
 !! Licensing:
 !!
@@ -61,7 +61,7 @@ submodule(stdlib_sorting) stdlib_sorting_sort
 contains
 
 
-#:for k1, t1 in IR_KINDS_TYPES
+#:for k1, t1 in IRS_KINDS_TYPES
 
     pure module subroutine ${k1}$_sort( array )
 ! `${k1}$_sort( array )` sorts the input `ARRAY` of type `${t1}$`
@@ -398,169 +398,4 @@ contains
 
     end subroutine char_sort
 
-    pure module subroutine string_sort( array )
-! `string_sort( array )` sorts the input `ARRAY` of type `STRING_TyPE`
-! using a hybrid sort based on the `introsort` of David Musser. As with
-! `introsort`, `string_sort( array )` is an unstable hybrid comparison
-! algorithm using `quicksort` for the main body of the sort tree,
-! supplemented by `insertion sort` for the outer brances, but if
-! `quicksort` is converging too slowly the algorithm resorts
-! to `heapsort`. The algorithm is of order O(N Ln(N)) for all inputs.
-! Because it relies on `quicksort`, the coefficient of the O(N Ln(N))
-! behavior is typically small compared to other sorting algorithms.
-
-        type(string_type), intent(inout) :: array(0:)
-
-        integer(int32) :: depth_limit
-
-        depth_limit = 2 * int( floor( log( real( size( array, kind=int64 ),  &
-                                                 kind=dp) ) / log(2.0_dp) ), &
-                               kind=int32 )
-        call introsort(array, depth_limit)
-
-    contains
-
-        pure recursive subroutine introsort( array, depth_limit )
-! It devolves to `insertionsort` if the remaining number of elements
-! is fewer than or equal to `INSERT_SIZE`, `heapsort` if the completion
-! of the `quicksort` is too slow as estimated from `DEPTH_LIMIT`,
-! otherwise sorting is done by a `quicksort`.
-            type(string_type), intent(inout) :: array(0:)
-            integer(int32), intent(in)      :: depth_limit
-
-            integer(int_size), parameter :: insert_size = 16_int_size
-            integer(int_size)            :: index
-
-            if ( size(array, kind=int_size) <= insert_size ) then
-                ! May be best at the end of SORT processing the whole array
-                ! See Musser, D.R., “Introspective Sorting and Selection
-                ! Algorithms,” Software—Practice and Experience, Vol. 27(8),
-                ! 983–993 (August 1997).
-
-                call insertion_sort( array )
-            else if ( depth_limit == 0 ) then
-                call heap_sort( array )
-            else
-                call partition( array, index )
-                call introsort( array(0:index-1), depth_limit-1 )
-                call introsort( array(index+1:), depth_limit-1 )
-            end if
-
-        end subroutine introsort
-
-
-        pure subroutine partition( array, index )
-! quicksort partition using median of three.
-            type(string_type), intent(inout) :: array(0:)
-            integer(int_size), intent(out)  :: index
-
-            integer(int_size) :: i, j
-            type(string_type) :: u, v, w, x, y
-
-! Determine median of three and exchange it with the end.
-            u = array( 0 )
-            v = array( size(array, kind=int_size)/2-1 )
-            w = array( size(array, kind=int_size)-1 )
-            if ( (u > v) .neqv. (u > w) ) then
-                x = u
-                y = array(0)
-                array(0) = array( size( array, kind=int_size ) - 1 )
-                array( size( array, kind=int_size ) - 1 ) = y
-            else if ( (v < u) .neqv. (v < w) ) then
-                x = v
-                y = array(size( array, kind=int_size )/2-1)
-                array( size( array, kind=int_size )/2-1 ) = &
-                    array( size( array, kind=int_size )-1 )
-                array( size( array, kind=int_size )-1 ) = y
-            else
-                x = w
-            end if
-! Partition the array.
-            i = -1_int_size
-            do j = 0_int_size, size(array, kind=int_size)-2
-                if ( array(j) <= x ) then
-                    i = i + 1
-                    y = array(i)
-                    array(i) = array(j)
-                    array(j) = y
-                end if
-            end do
-            y = array(i+1)
-            array(i+1) = array(size(array, kind=int_size)-1)
-            array(size(array, kind=int_size)-1) = y
-            index = i + 1
-
-        end subroutine partition
-
-        pure subroutine insertion_sort( array )
-! Bog standard insertion sort.
-            type(string_type), intent(inout) :: array(0:)
-
-            integer(int_size) :: i, j
-            type(string_type) :: key
-
-            do j=1_int_size, size(array, kind=int_size)-1
-                key = array(j)
-                i = j - 1
-                do while( i >= 0 )
-                    if ( array(i) <= key ) exit
-                    array(i+1) = array(i)
-                    i = i - 1
-                end do
-                array(i+1) = key
-            end do
-
-        end subroutine insertion_sort
-
-        pure subroutine heap_sort( array )
-! A bog standard heap sort
-            type(string_type), intent(inout) :: array(0:)
-
-            integer(int_size) :: i, heap_size
-            type(string_type) :: y
-
-            heap_size = size( array, kind=int_size )
-! Build the max heap
-            do i = (heap_size-2)/2_int_size, 0_int_size, -1_int_size
-                call max_heapify( array, i, heap_size )
-            end do
-            do i = heap_size-1, 1_int_size, -1_int_size
-! Swap the first element with the current final element
-                y = array(0)
-                array(0) = array(i)
-                array(i) = y
-! Sift down using max_heapify
-                call max_heapify( array, 0_int_size, i )
-            end do
-
-        end subroutine heap_sort
-
-        pure recursive subroutine max_heapify( array, i, heap_size )
-! Transform the array into a max heap
-            type(string_type), intent(inout) :: array(0:)
-            integer(int_size), intent(in)    :: i, heap_size
-
-            integer(int_size) :: l, r, largest
-            type(string_type) :: y
-
-            largest = i
-            l = 2_int_size * i + 1_int_size
-            r = l + 1_int_size
-            if ( l < heap_size ) then
-                if ( array(l) > array(largest) ) largest = l
-            end if
-            if ( r < heap_size ) then
-                if ( array(r) > array(largest) ) largest = r
-            end if
-            if ( largest /= i ) then
-                y = array(i)
-                array(i) = array(largest)
-                array(largest) = y
-                call max_heapify( array, largest, heap_size )
-            end if
-
-        end subroutine max_heapify
-
-    end subroutine string_sort
-
 end submodule stdlib_sorting_sort

From 3f8bea7777da43b11ca6037b7739381820da0a1d Mon Sep 17 00:00:00 2001
From: Jeremie Vandenplas <jeremie.vandenplas@gmail.com>
Date: Thu, 20 May 2021 21:46:01 +0200
Subject: [PATCH 28/45] fix typo

---
 src/stdlib_sorting.fypp | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/stdlib_sorting.fypp b/src/stdlib_sorting.fypp
index a6c45674b..a95d0fb52 100644
--- a/src/stdlib_sorting.fypp
+++ b/src/stdlib_sorting.fypp
@@ -1,4 +1,5 @@
 #:include "common.fypp"
+#:set IR_KINDS_TYPES = INT_KINDS_TYPES + REAL_KINDS_TYPES
 #:set IRS_KINDS_TYPES = INT_KINDS_TYPES + REAL_KINDS_TYPES + STRING_KINDS_TYPES
 
 !! Licensing:

From 4b16aca4efea4d36a3a42372b648d01a7e4ead2f Mon Sep 17 00:00:00 2001
From: Jeremie Vandenplas <jeremie.vandenplas@gmail.com>
Date: Thu, 20 May 2021 21:53:38 +0200
Subject: [PATCH 29/45] ord_sort: add fypp variable to reduce code

---
 src/stdlib_sorting.fypp          |  18 +-
 src/stdlib_sorting_ord_sort.fypp | 341 +------------------------------
 2 files changed, 6 insertions(+), 353 deletions(-)

diff --git a/src/stdlib_sorting.fypp b/src/stdlib_sorting.fypp
index a95d0fb52..1663efb2c 100644
--- a/src/stdlib_sorting.fypp
+++ b/src/stdlib_sorting.fypp
@@ -1,5 +1,4 @@
 #:include "common.fypp"
-#:set IR_KINDS_TYPES = INT_KINDS_TYPES + REAL_KINDS_TYPES
 #:set IRS_KINDS_TYPES = INT_KINDS_TYPES + REAL_KINDS_TYPES + STRING_KINDS_TYPES
 
 !! Licensing:
@@ -328,11 +327,11 @@ module stdlib_sorting
 
     public char_sort_index
 
-    public string_ord_sort
+    public string_type_ord_sort
 
-    public string_sort
+    public string_type_sort
 
-    public string_sort_index
+    public string_type_sort_index
 
     interface ord_sort
 !! Version: experimental
@@ -348,7 +347,7 @@ module stdlib_sorting
 !! sorted data, having O(N) performance on uniformly non-increasing or
 !! non-decreasing data.
 
-#:for k1, t1 in IR_KINDS_TYPES
+#:for k1, t1 in IRS_KINDS_TYPES
         module subroutine ${k1}$_ord_sort( array, work )
 !! Version: experimental
 !!
@@ -369,15 +368,6 @@ module stdlib_sorting
             character(len=len(array)), intent(inout), optional :: work(0:)
         end subroutine char_ord_sort
 
-        module subroutine string_ord_sort( array, work )
-!! Version: experimental
-!!
-!! `char_ord_sort( array )` sorts the input `ARRAY` of type `CHARACTER(*)`
-!! using a hybrid sort based on the `'Rust" sort` algorithm found in `slice.rs`
-            type(string_type), intent(inout)           :: array(0:)
-            type(string_type), intent(inout), optional :: work(0:)
-        end subroutine string_ord_sort
-
     end interface ord_sort
 
     interface sort
diff --git a/src/stdlib_sorting_ord_sort.fypp b/src/stdlib_sorting_ord_sort.fypp
index 437a4fb7c..d737133bf 100644
--- a/src/stdlib_sorting_ord_sort.fypp
+++ b/src/stdlib_sorting_ord_sort.fypp
@@ -1,5 +1,5 @@
 #:include "common.fypp"
-#:set IR_KINDS_TYPES = INT_KINDS_TYPES + REAL_KINDS_TYPES
+#:set IRS_KINDS_TYPES = INT_KINDS_TYPES + REAL_KINDS_TYPES + STRING_KINDS_TYPES
 
 !! Licensing:
 !!
@@ -56,7 +56,7 @@ submodule(stdlib_sorting) stdlib_sorting_ord_sort
 
 contains
 
-#:for k1, t1 in IR_KINDS_TYPES
+#:for k1, t1 in IRS_KINDS_TYPES
 
     module subroutine ${k1}$_ord_sort( array, work )
 ! A translation to Fortran 2008, of the `"Rust" sort` algorithm found in
@@ -734,342 +734,5 @@ contains
 
     end subroutine char_ord_sort
 
-
-    module subroutine string_ord_sort( array, work )
-! A translation to Fortran 2008, of the `"Rust" sort` algorithm found in
-! `slice.rs`
-! https://github.com/rust-lang/rust/blob/90eb44a5897c39e3dff9c7e48e3973671dcd9496/src/liballoc/slice.rs#L2159
-! The Rust version in turn is a simplification of the Timsort algorithm
-! described in
-! https://svn.python.org/projects/python/trunk/Objects/listsort.txt, as
-! it drops both the use of 'galloping' to identify bounds of regions to be
-! sorted and the estimation of the optimal `run size`. However it remains
-! a hybrid sorting algorithm combining an iterative Merge sort controlled
-! by a stack of `RUNS` identified by regions of uniformly decreasing or
-! non-decreasing sequences that may be expanded to a minimum run size and
-! initially processed by an insertion sort.
-!
-! Note the Fortran implementation simplifies the logic as it only has to
-! deal with Fortran arrays of intrinsic types and not the full generality
-! of Rust's arrays and lists for arbitrary types. It also adds the
-! estimation of the optimal `run size` as suggested in Tim Peters'
-! original `listsort.txt`, and an optional `work` array to be used as
-! scratch memory.
-        type(string_type), intent(inout)           :: array(0:)
-        type(string_type), intent(inout), optional :: work(0:)
-
-        type(string_type), allocatable :: buf(:)
-        integer(int_size)              :: array_size
-        integer                        :: stat
-
-        if ( present(work) ) then
-! Use the work array as scratch memory
-            call merge_sort( array, work )
-        else
-! Allocate a buffer to use as scratch memory.
-            array_size = size( array, kind=int_size )
-            allocate( buf(0:array_size/2-1), stat=stat )
-            if ( stat /= 0 ) error stop "${k1}$_ord_sort: Allocation of buffer failed."
-            call merge_sort( array, buf )
-        end if
-
-    contains
-
-        pure function calc_min_run( n ) result(min_run)
-!! Returns the minimum length of a run from 32-63 so that N/MIN_RUN is
-!! less than or equal to a power of two. See
-!! https://svn.python.org/projects/python/trunk/Objects/listsort.txt
-            integer(int_size)             :: min_run
-            integer(int_size), intent(in) :: n
-
-            integer(int_size) :: num, r
-
-            num = n
-            r = 0_int_size
-
-            do while( num >= 64 )
-                r = ior( r, iand(num, 1_int_size) )
-                num = ishft(num, -1_int_size)
-            end do
-            min_run = num + r
-
-        end function calc_min_run
-
-
-        pure subroutine insertion_sort( array )
-! Sorts `ARRAY` using an insertion sort.
-            type(string_type), intent(inout) :: array(0:)
-
-            integer(int_size) :: i, j
-            type(string_type) :: key
-
-            do j=1, size(array, kind=int_size)-1
-                key = array(j)
-                i = j - 1
-                do while( i >= 0 .and. array(i) > key )
-                    array(i+1) = array(i)
-                    i = i - 1
-                end do
-                array(i+1) = key
-            end do
-
-        end subroutine insertion_sort
-
-
-        pure function collapse( runs ) result ( r )
-! Examine the stack of runs waiting to be merged, identifying adjacent runs
-! to be merged until the stack invariants are restablished:
-!
-! 1. len(-3) > len(-2) + len(-1)
-! 2. len(-2) > len(-1)
-            integer(int_size) :: r
-            type(run_type), intent(in), target :: runs(0:)
-
-            integer(int_size) :: n
-            logical           :: test
-
-            n = size(runs, kind=int_size)
-            test = .false.
-            if (n >= 2) then
-                if ( runs( n-1 ) % base == 0 .or. &
-                     runs( n-2 ) % len <= runs(n-1) % len ) then
-                    test = .true.
-                else if ( n >= 3 ) then ! X exists
-                    if ( runs(n-3) % len <= &
-                         runs(n-2) % len + runs(n-1) % len ) then
-                        test = .true.
-!               |X| <= |Y| + |Z| => will need to merge due to rho1 or rho2
-                    else if( n >= 4 ) then
-                        if ( runs(n-4) % len <= &
-                             runs(n-3) % len + runs(n-2) % len ) then
-                            test = .true.
-!               |W| <= |X| + |Y| => will need to merge due to rho1 or rho3
-                        end if
-                    end if
-                end if
-            end if
-            if ( test ) then
-! By default merge Y & Z, rho2 or rho3
-                if ( n >= 3 ) then
-                    if ( runs(n-3) % len < runs(n-1) % len ) then
-                        r = n - 3
-! |X| < |Z| => merge X & Y, rho1
-                        return
-                    end if
-                end if
-                r = n - 2
-! |Y| <= |Z| => merge Y & Z, rho4
-                return
-            else
-                r = -1
-            end if
-
-        end function collapse
-
-
-        pure subroutine insert_head( array )
-! Inserts `array(0)` into the pre-sorted sequence `array(1:)` so that the
-! whole `array(0:)` becomes sorted, copying the first element into
-! a temporary variable, iterating until the right place for it is found.
-! copying every traversed element into the slot preceding it, and finally,
-! copying data from the temporary variable into the resulting hole.
-
-            type(string_type), intent(inout) :: array(0:)
-
-            type(string_type) :: tmp
-            integer(int_size) :: i
-
-            tmp = array(0)
-            find_hole: do i=1, size(array, kind=int_size)-1
-                if ( array(i) >= tmp ) exit find_hole
-                array(i-1) = array(i)
-            end do find_hole
-            array(i-1) = tmp
-
-        end subroutine insert_head
-
-
-        subroutine merge_sort( array, buf )
-! The Rust merge sort borrows some (but not all) of the ideas from TimSort,
-! which is described in detail at
-! (http://svn.python.org/projects/python/trunk/Objects/listsort.txt).
-!
-! The algorithm identifies strictly descending and non-descending
-! subsequences, which are called natural runs. Where these runs are less
-! than a minimum run size they are padded by adding additional samples
-! using an insertion sort. The merge process is driven by a stack of
-! pending unmerged runs. Each newly found run is pushed onto the stack,
-! and then pairs of adjacentd runs are merged until these two invariants
-! are satisfied:
-!
-! 1. for every `i` in `1..size(runs)-1`: `runs(i - 1)%len > runs(i)%len`
-! 2. for every `i` in `2..size(runs)-1`: `runs(i - 2)%len >
-!    runs(i - 1)%len + runs(i)%len`
-!
-! The invariants ensure that the total running time is `O(n log n)`
-! worst-case.
-
-            type(string_type), intent(inout) :: array(0:)
-            type(string_type), intent(inout) :: buf(0:)
-
-            integer(int_size) :: array_size, finish, min_run, r, r_count, &
-                start
-            type(run_type)    :: runs(0:max_merge_stack-1), left, right
-
-            array_size = size(array, kind=int_size)
-
-! Very short runs are extended using insertion sort to span at least
-! min_run elements. Slices of up to this length are sorted using insertion
-! sort.
-            min_run = calc_min_run( array_size )
-
-            if ( array_size <= min_run ) then
-                if ( array_size >= 2 ) call insertion_sort( array )
-                return
-            end if
-
-! Following Rust sort, natural runs in `array` are identified by traversing
-! it backwards. By traversing it backward, merges more often go in the
-! opposite direction (forwards). According to developers of Rust sort,
-! merging forwards is slightly faster than merging backwards. Therefore
-! identifying runs by traversing backwards should improve performance.
-            r_count = 0
-            finish = array_size - 1
-            do while ( finish >= 0 )
-! Find the next natural run, and reverse it if it's strictly descending.
-                start = finish
-                if ( start > 0 ) then
-                    start = start - 1
-                    if ( array(start+1) < array(start) ) then
-                        Descending: do while ( start > 0 )
-                            if ( array(start) >= array(start-1) ) &
-                                exit Descending
-                            start = start - 1
-                        end do Descending
-                        call reverse_segment( array(start:finish) )
-                    else
-                        Ascending: do while( start > 0 )
-                            if ( array(start) < array(start-1) ) exit Ascending
-                            start = start - 1
-                        end do Ascending
-                    end if
-                end if
-
-! If the run is too short insert some more elements using an insertion sort.
-                Insert: do while ( start > 0 )
-                    if ( finish - start >= min_run - 1 ) exit Insert
-                    start = start - 1
-                    call insert_head( array(start:finish) )
-                end do Insert
-                if ( start == 0 .and. finish == array_size - 1 ) return
-
-                runs(r_count) = run_type( base = start, &
-                                          len = finish - start + 1 )
-                finish = start-1
-                r_count = r_count + 1
-
-! Determine whether pairs of adjacent runs need to be merged to satisfy
-! the invariants, and, if so, merge them.
-                Merge_loop: do
-                    r = collapse( runs(0:r_count - 1) )
-                    if ( r < 0 .or. r_count <= 1 ) exit Merge_loop
-                    left = runs( r + 1 )
-                    right = runs( r )
-                    call merge( array( left % base: &
-                                       right % base + right % len - 1 ), &
-                                left % len, buf )
-
-                    runs(r) = run_type( base = left % base, &
-                                        len = left % len + right % len )
-                    if ( r == r_count - 3 ) runs(r+1) = runs(r+2)
-                    r_count = r_count - 1
-
-                end do Merge_loop
-            end do
-            if ( r_count /= 1 ) &
-                error stop "MERGE_SORT completed without RUN COUNT == 1."
-
-        end subroutine merge_sort
-
-
-        pure subroutine merge( array, mid, buf )
-! Merges the two non-decreasing runs `ARRAY(0:MID-1)` and `ARRAY(MID:)`
-! using `BUF` as temporary storage, and stores the merged runs into
-! `ARRAY(0:)`. `MID` must be > 0, and < `SIZE(ARRAY)-1`. Buffer `BUF`
-! must be long enough to hold the shorter of the two runs.
-            type(string_type), intent(inout) :: array(0:)
-            integer(int_size), intent(in)    :: mid
-            type(string_type), intent(inout) :: buf(0:)
-
-            integer(int_size) :: array_len, i, j, k
-
-            array_len = size(array, kind=int_size)
-
-! Merge first copies the shorter run into `buf`. Then, depending on which
-! run was shorter, it traces the copied run and the longer run forwards
-! (or backwards), comparing their next unprocessed elements and then
-! copying the lesser (or greater) one into `array`.
-
-            if ( mid <= array_len - mid ) then ! The left run is shorter.
-                buf(0:mid-1) = array(0:mid-1)
-                i = 0
-                j = mid
-                merge_lower: do k = 0, array_len-1
-                    if ( buf(i) <= array(j) ) then
-                        array(k) = buf(i)
-                        i = i + 1
-                        if ( i >= mid ) exit merge_lower
-                    else
-                        array(k) = array(j)
-                        j = j + 1
-                        if ( j >= array_len ) then
-                            array(k+1:) = buf(i:mid-1)
-                            exit merge_lower
-                        end if
-                    end if
-                end do merge_lower
-            else ! The right run is shorter ! check that it is stable
-                buf(0:array_len-mid-1) = array(mid:array_len-1)
-                i = mid - 1
-                j = array_len - mid -1
-                merge_upper: do k = array_len-1, 0, -1
-                    if ( buf(j) >= array(i) ) then
-                        array(k) = buf(j)
-                        j = j - 1
-                        if ( j < 0 ) exit merge_upper
-                    else
-                        array(k) = array(i)
-                        i = i - 1
-                        if ( i < 0 ) then
-                            array(0:k-1) = buf(0:j)
-                            exit merge_upper
-                        end if
-                    end if
-                end do merge_upper
-            end if
-        end subroutine merge
-
-
-        pure subroutine reverse_segment( array )
-! Reverse a segment of an array in place
-            type(string_type), intent(inout) :: array(0:)
-
-            integer(int_size) :: lo, hi
-            type(string_type) :: temp
-
-            lo = 0
-            hi = size( array, kind=int_size ) - 1
-            do while( lo < hi )
-                temp = array(lo)
-                array(lo) = array(hi)
-                array(hi) = temp
-                lo = lo + 1
-                hi = hi - 1
-            end do
-
-        end subroutine reverse_segment
-
-    end subroutine string_ord_sort
-
 end submodule stdlib_sorting_ord_sort
 

From 84f1294eb874c3031231e6b571d09b22befd2366 Mon Sep 17 00:00:00 2001
From: "William B. Clodius" <65470906+wclodius2@users.noreply.github.com>
Date: Sun, 23 May 2021 17:20:06 -0600
Subject: [PATCH 30/45] Update src/stdlib_sorting.fypp

Co-authored-by: Jeremie Vandenplas <jeremie.vandenplas@gmail.com>
---
 src/stdlib_sorting.fypp | 12 ------------
 1 file changed, 12 deletions(-)

diff --git a/src/stdlib_sorting.fypp b/src/stdlib_sorting.fypp
index 1663efb2c..e16bfffe6 100644
--- a/src/stdlib_sorting.fypp
+++ b/src/stdlib_sorting.fypp
@@ -321,18 +321,6 @@ module stdlib_sorting
 !!    end subroutine sort_a_data
 !!```
 
-    public char_ord_sort
-
-    public char_sort
-
-    public char_sort_index
-
-    public string_type_ord_sort
-
-    public string_type_sort
-
-    public string_type_sort_index
-
     interface ord_sort
 !! Version: experimental
 !!

From 8d91761cecec7b7e4217a5b3c36295f7270dd0a1 Mon Sep 17 00:00:00 2001
From: "William B. Clodius" <65470906+wclodius2@users.noreply.github.com>
Date: Sun, 23 May 2021 17:20:44 -0600
Subject: [PATCH 31/45] Update doc/specs/stdlib_sorting.md

Co-authored-by: Jeremie Vandenplas <jeremie.vandenplas@gmail.com>
---
 doc/specs/stdlib_sorting.md | 11 -----------
 1 file changed, 11 deletions(-)

diff --git a/doc/specs/stdlib_sorting.md b/doc/specs/stdlib_sorting.md
index 12ca35dc1..b2ad21ad3 100644
--- a/doc/specs/stdlib_sorting.md
+++ b/doc/specs/stdlib_sorting.md
@@ -321,16 +321,6 @@ element of `array` is a `NaN`.  Sorting of `CHARACTER(*)` and
 
 ##### Example
 
-```fortran
-    ...
-    ! Read random data from a file
-    call read_file( 'dummy_file', array )
-    ! Sort the random data
-    call sort( array )
-    ! Process the sorted data
-    call array_search( array, values )
-    ...
-```
 
 ```fortran
     program demo_sort
@@ -626,4 +616,3 @@ for applications running on Intel(R) 64, Version 2021.2.0 Build
 | String_type |   17576  | String Increase |  Sort_Index |   0.00543 |
 | String_type |   17576  |   String Random |  Sort_Index |   0.37748 |
 
-

From 6ccd3664466aaee6bde3551573f894fdca7e10ae Mon Sep 17 00:00:00 2001
From: "William B. Clodius" <65470906+wclodius2@users.noreply.github.com>
Date: Sun, 23 May 2021 17:37:04 -0600
Subject: [PATCH 32/45] Update src/stdlib_sorting_sort.fypp

Co-authored-by: Jeremie Vandenplas <jeremie.vandenplas@gmail.com>
---
 src/stdlib_sorting_sort.fypp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/stdlib_sorting_sort.fypp b/src/stdlib_sorting_sort.fypp
index 8f8d1f90a..244165c69 100644
--- a/src/stdlib_sorting_sort.fypp
+++ b/src/stdlib_sorting_sort.fypp
@@ -248,7 +248,7 @@ contains
 
         integer(int32) :: depth_limit
 
-        depth_limit = 2 * int( floor( log( real( size( array, kind=int64 ),  &
+        depth_limit = 2 * int( floor( log( real( size( array, kind=int_size ),  &
                                                  kind=dp) ) / log(2.0_dp) ), &
                                kind=int32 )
         call introsort(array, depth_limit)

From 8e59d2f30bb21e90a896d01a831c04291cbf0a38 Mon Sep 17 00:00:00 2001
From: William Clodius <w.clodius@icloud.com>
Date: Sun, 23 May 2021 19:31:23 -0600
Subject: [PATCH 33/45] Changed intent of arguments from inout to out

Walter Spector suggested that several arguments to ord_sort and sort_index
from intent(inout) to intent(in). Jeremie Vandenplas agreed and now so do I.
This involved changing the interfaces in src/stdlib_sorting.fypp, the code in
src/stdlib_sorting_ord_sort.fypp and src/stdlib_sorting_index_sort.fypp, and
and the documentation in doc/specs/stdlib_sorting.md.

[ticket: X]
---
 doc/specs/stdlib_sorting.md        | 34 +++++++--------
 src/stdlib_sorting.fypp            | 68 +++++++++++++++---------------
 src/stdlib_sorting_ord_sort.fypp   |  8 ++--
 src/stdlib_sorting_sort_index.fypp | 20 ++++-----
 4 files changed, 65 insertions(+), 65 deletions(-)

diff --git a/doc/specs/stdlib_sorting.md b/doc/specs/stdlib_sorting.md
index b2ad21ad3..520de6ba6 100644
--- a/doc/specs/stdlib_sorting.md
+++ b/doc/specs/stdlib_sorting.md
@@ -228,7 +228,7 @@ of non-decreasing value.
 
 `work` (optional): shall be a rank one array of the same type as
 array, and shall have at least `size(array)/2` elements. It is an
-`intent(inout)` argument. It is intended to be used as "scratch"
+`intent(out)` argument. It is intended to be used as "scratch"
 memory for internal record keeping. If associated with an array in
 static storage, its use can significantly reduce the stack memory
 requirements for the code. Its contents on return are undefined.
@@ -370,14 +370,14 @@ array in the desired direction.
 
 `work` (optional): shall be a rank one array of any of the same type as
 `array`, and shall have at least `size(array)/2` elements. It is an
-`intent(inout)` argument. It is intended to be used as "scratch"
+`intent(out)` argument. It is intended to be used as "scratch"
 memory for internal record keeping. If associated with an array in
 static storage, its use can significantly reduce the stack memory
 requirements for the code. Its contents on return are undefined.
 
 `iwork` (optional): shall be a rank one integer array of kind
 `int_size`, and shall have at least `size(array)/2` elements. It
-is an `intent(inout)` argument.  It is intended to be used as "scratch"
+is an `intent(out)` argument.  It is intended to be used as "scratch"
 memory for internal record keeping. If associated with an array in
 static storage, its use can significantly reduce the stack memory
 requirements for the code. Its contents on return are undefined.
@@ -411,11 +411,11 @@ Sorting a related rank one array:
 ```Fortran
     subroutine sort_related_data( a, b, work, index, iwork )
         ! Sort `a`, and  also  sort `b` to be reorderd the same way as `a`
-        integer, intent(inout)           :: a(:)
-        integer(int32), intent(inout)    :: b(:) ! The same size as a
-		integer(int32), intent(inout)    :: work(:)
-		integer(int_size), intent(inout) :: index(:)
-		integer(int_size), intent(inout) :: iwork(:)
+        integer, intent(inout)         :: a(:)
+        integer(int32), intent(inout)  :: b(:) ! The same size as a
+		integer(int32), intent(out)    :: work(:)
+		integer(int_size), intent(out) :: index(:)
+		integer(int_size), intent(out) :: iwork(:)
 		! Find the indices to sort a
         call sort_index(a, index(1:size(a)),&
             work(1:size(a)/2), iwork(1:size(a)/2))
@@ -429,12 +429,12 @@ Sorting a rank 2 array based on the data in a column
 ```Fortran
 	subroutine sort_related_data( array, column, work, index, iwork )
 	    ! Reorder rows of `array` such that `array(:, column)` is  sorted
-	    integer, intent(inout)           :: array(:,:)
-		integer(int32), intent(in)       :: column
-		integer(int32), intent(inout)    :: work(:)
-		integer(int_size), intent(inout) :: index(:)
-		integer(int_size), intent(inout) :: iwork(:)
-		integer, allocatable             :: dummy(:)
+	    integer, intent(inout)         :: array(:,:)
+		integer(int32), intent(in)     :: column
+		integer(int32), intent(out)    :: work(:)
+		integer(int_size), intent(out) :: index(:)
+		integer(int_size), intent(out) :: iwork(:)
+		integer, allocatable           :: dummy(:)
 		integer :: i
 		allocate(dummy(size(array, dim=1)))
 		! Extract a column of `array`
@@ -456,9 +456,9 @@ Sorting an array of a derived type based on the data in one component
 	    ! Sort `a_data` in terms or its component `a`
 	    type(a_type), intent(inout)      :: a_data(:)
 		integer(int32), intent(inout)    :: a(:)
-		integer(int32), intent(inout)    :: work(:)
-		integer(int_size), intent(inout) :: index(:)
-		integer(int_size), intent(inout) :: iwork(:)
+		integer(int32), intent(out)    :: work(:)
+		integer(int_size), intent(out) :: index(:)
+		integer(int_size), intent(out) :: iwork(:)
 		! Extract a component of `a_data`
 		a(1:size(a_data)) = a_data(:) % a
 		! Find the indices to sort the component
diff --git a/src/stdlib_sorting.fypp b/src/stdlib_sorting.fypp
index e16bfffe6..f1139e9b2 100644
--- a/src/stdlib_sorting.fypp
+++ b/src/stdlib_sorting.fypp
@@ -156,7 +156,7 @@ module stdlib_sorting
 !!
 !! * work (optional): shall be a rank 1 array of the same type as
 !!   `array`, and shall have at least `size(array)/2` elements. It is an
-!!   `intent(inout)` argument to be used as "scratch" memory
+!!   `intent(out)` argument to be used as "scratch" memory
 !!   for internal record keeping. If associated with an array in static
 !!   storage, its use can significantly reduce the stack memory requirements
 !!   for the code. Its value on return is undefined.
@@ -232,21 +232,21 @@ module stdlib_sorting
 !!   of the `array` and `index` results is undefined. Otherwise it is
 !!   defined to be as specified by reverse.
 !!
-!! * index: a rank 1 array of sorting indices. It is an `intent(inout)`
+!! * index: a rank 1 array of sorting indices. It is an `intent(out)`
 !!   argument of the type `integer(int_size)`. Its size shall be the
 !!   same as `array`. On return, if defined, its elements would
 !!   sort the input `array` in the direction specified by `reverse`.
 !!
 !! * work (optional): shall be a rank 1 array of the same type as
 !!   `array`, and shall have at least `size(array)/2` elements. It is an
-!!   `intent(inout)` argument to be used as "scratch" memory
+!!   `intent(out)` argument to be used as "scratch" memory
 !!   for internal record keeping. If associated with an array in static
 !!   storage, its use can significantly reduce the stack memory requirements
 !!   for the code. Its value on return is undefined.
 !!
 !! * iwork (optional): shall be a rank 1 integer array of kind `int_size`,
 !!   and shall have at least `size(array)/2` elements. It is an
-!!   `intent(inout)` argument to be used as "scratch" memory
+!!   `intent(out)` argument to be used as "scratch" memory
 !!   for internal record keeping. If associated with an array in static
 !!   storage, its use can significantly reduce the stack memory requirements
 !!   for the code. Its value on return is undefined.
@@ -264,11 +264,11 @@ module stdlib_sorting
 !!```Fortran
 !!    subroutine sort_related_data( a, b, work, index, iwork )
 !!        ! Sort `b` in terms or its related array `a`
-!!        integer, intent(inout)           :: a(:)
-!!        integer(int32), intent(inout)    :: b(:) ! The same size as a
-!!        integer(int32), intent(inout)    :: work(:)
-!!        integer(int_size), intent(inout) :: index(:)
-!!        integer(int_size), intent(inout) :: iwork(:)
+!!        integer, intent(inout)         :: a(:)
+!!        integer(int32), intent(inout)  :: b(:) ! The same size as a
+!!        integer(int32), intent(out)    :: work(:)
+!!        integer(int_size), intent(out) :: index(:)
+!!        integer(int_size), intent(out) :: iwork(:)
 !!    ! Find the indices to sort a
 !!        call sort_index(a, index(1:size(a)),&
 !!            work(1:size(a)/2), iwork(1:size(a)/2))
@@ -282,12 +282,12 @@ module stdlib_sorting
 !!```Fortran
 !!    subroutine sort_related_data( array, column, work, index, iwork )
 !!    ! Sort `a_data` in terms or its component `a`
-!!        integer, intent(inout)           :: a(:,:)
-!!        integer(int32), intent(in)       :: column
-!!        integer(int32), intent(inout)    :: work(:)
-!!        integer(int_size), intent(inout) :: index(:)
-!!        integer(int_size), intent(inout) :: iwork(:)
-!!        integer, allocatable             :: dummy(:)
+!!        integer, intent(inout)         :: a(:,:)
+!!        integer(int32), intent(in)     :: column
+!!        integer(int32), intent(out)    :: work(:)
+!!        integer(int_size), intent(out) :: index(:)
+!!        integer(int_size), intent(out) :: iwork(:)
+!!        integer, allocatable           :: dummy(:)
 !!        integer :: i
 !!        allocate(dummy(size(a, dim=1)))
 !!    ! Extract a component of `a_data`
@@ -306,11 +306,11 @@ module stdlib_sorting
 !!```fortran
 !!    subroutine sort_a_data( a_data, a, work, index, iwork )
 !!    ! Sort `a_data` in terms or its component `a`
-!!        type(a_type), intent(inout)      :: a_data(:)
-!!        integer(int32), intent(inout)    :: a(:)
-!!        integer(int32), intent(inout)    :: work(:)
-!!        integer(int_size), intent(inout) :: index(:)
-!!        integer(int_size), intent(inout) :: iwork(:)
+!!        type(a_type), intent(inout)    :: a_data(:)
+!!        integer(int32), intent(inout)  :: a(:)
+!!        integer(int32), intent(out)    :: work(:)
+!!        integer(int_size), intent(out) :: index(:)
+!!        integer(int_size), intent(out) :: iwork(:)
 !!    ! Extract a component of `a_data`
 !!        a(1:size(a_data)) = a_data(:) % a
 !!    ! Find the indices to sort the component
@@ -341,8 +341,8 @@ module stdlib_sorting
 !!
 !! `${k1}$_ord_sort( array )` sorts the input `ARRAY` of type `${t1}$`
 !! using a hybrid sort based on the `'Rust" sort` algorithm found in `slice.rs`
-            ${t1}$, intent(inout)           :: array(0:)
-            ${t1}$, intent(inout), optional :: work(0:)
+            ${t1}$, intent(inout)         :: array(0:)
+            ${t1}$, intent(out), optional :: work(0:)
         end subroutine ${k1}$_ord_sort
 
 #:endfor
@@ -352,8 +352,8 @@ module stdlib_sorting
 !!
 !! `char_ord_sort( array )` sorts the input `ARRAY` of type `CHARACTER(*)`
 !! using a hybrid sort based on the `'Rust" sort` algorithm found in `slice.rs`
-            character(len=*), intent(inout)                    :: array(0:)
-            character(len=len(array)), intent(inout), optional :: work(0:)
+            character(len=*), intent(inout)                  :: array(0:)
+            character(len=len(array)), intent(out), optional :: work(0:)
         end subroutine char_ord_sort
 
     end interface ord_sort
@@ -411,11 +411,11 @@ module stdlib_sorting
 !! using a hybrid sort based on the `'Rust" sort` algorithm found in `slice.rs`
 !! and returns the sorted `ARRAY` and an array `INDEX of indices in the
 !! order that would sort the input `ARRAY` in the desired direction.
-            ${t1}$, intent(inout)                      :: array(0:)
-            integer(int_size), intent(inout)           :: index(0:)
-            ${t1}$, intent(inout), optional            :: work(0:)
-            integer(int_size), intent(inout), optional :: iwork(0:)
-            logical, intent(in), optional              :: reverse
+            ${t1}$, intent(inout)                    :: array(0:)
+            integer(int_size), intent(out)           :: index(0:)
+            ${t1}$, intent(out), optional            :: work(0:)
+            integer(int_size), intent(out), optional :: iwork(0:)
+            logical, intent(in), optional            :: reverse
         end subroutine ${k1}$_sort_index
 
 #:endfor
@@ -428,11 +428,11 @@ module stdlib_sorting
 !! using a hybrid sort based on the `'Rust" sort` algorithm found in `slice.rs`
 !! and returns the sorted `ARRAY` and an array `INDEX of indices in the
 !! order that would sort the input `ARRAY` in the desired direction.
-            character(len=*), intent(inout)                    :: array(0:)
-            integer(int_size), intent(inout)                   :: index(0:)
-            character(len=len(array)), intent(inout), optional :: work(0:)
-            integer(int_size), intent(inout), optional         :: iwork(0:)
-            logical, intent(in), optional                      :: reverse
+            character(len=*), intent(inout)                  :: array(0:)
+            integer(int_size), intent(out)                   :: index(0:)
+            character(len=len(array)), intent(out), optional :: work(0:)
+            integer(int_size), intent(out), optional         :: iwork(0:)
+            logical, intent(in), optional                    :: reverse
         end subroutine char_sort_index
 
     end interface sort_index
diff --git a/src/stdlib_sorting_ord_sort.fypp b/src/stdlib_sorting_ord_sort.fypp
index d737133bf..065b2a3c8 100644
--- a/src/stdlib_sorting_ord_sort.fypp
+++ b/src/stdlib_sorting_ord_sort.fypp
@@ -78,8 +78,8 @@ contains
 ! estimation of the optimal `run size` as suggested in Tim Peters'
 ! original `listsort.txt`, and an optional `work` array to be used as
 ! scratch memory.
-        ${t1}$, intent(inout)           :: array(0:)
-        ${t1}$, intent(inout), optional :: work(0:)
+        ${t1}$, intent(inout)         :: array(0:)
+        ${t1}$, intent(out), optional :: work(0:)
 
         ${t1}$, allocatable :: buf(:)
         integer(int_size) :: array_size
@@ -417,8 +417,8 @@ contains
 ! estimation of the optimal `run size` as suggested in Tim Peters'
 ! original `listsort.txt`, and an optional `work` array to be used as
 ! scratch memory.
-        character(len=*), intent(inout)                    :: array(0:)
-        character(len=len(array)), intent(inout), optional :: work(0:)
+        character(len=*), intent(inout)                  :: array(0:)
+        character(len=len(array)), intent(out), optional :: work(0:)
 
         character(len=:), allocatable :: buf(:)
         integer(int_size) :: array_size
diff --git a/src/stdlib_sorting_sort_index.fypp b/src/stdlib_sorting_sort_index.fypp
index 5a11a5efe..e91935972 100644
--- a/src/stdlib_sorting_sort_index.fypp
+++ b/src/stdlib_sorting_sort_index.fypp
@@ -83,11 +83,11 @@ contains
 ! original listsort.txt, and the optional `work` and `iwork` arraya to be
 ! used as scratch memory.
 
-        ${t1}$, intent(inout)                      :: array(0:)
-        integer(int_size), intent(inout)           :: index(0:)
-        ${t1}$, intent(inout), optional            :: work(0:)
-        integer(int_size), intent(inout), optional :: iwork(0:)
-        logical, intent(in), optional              :: reverse
+        ${t1}$, intent(inout)                    :: array(0:)
+        integer(int_size), intent(out)           :: index(0:)
+        ${t1}$, intent(out), optional            :: work(0:)
+        integer(int_size), intent(out), optional :: iwork(0:)
+        logical, intent(in), optional            :: reverse
 
         integer(int_size) :: array_size, i, stat
         ${t1}$, allocatable :: buf(:)
@@ -490,11 +490,11 @@ contains
 ! original listsort.txt, and the optional `work` and `iwork` arraya to be
 ! used as scratch memory.
 
-        character(len=*), intent(inout)                      :: array(0:)
-        integer(int_size), intent(inout)                     :: index(0:)
-        character(len=len(array)), intent(inout), optional   :: work(0:)
-        integer(int_size), intent(inout), optional           :: iwork(0:)
-        logical, intent(in), optional                        :: reverse
+        character(len=*), intent(inout)                    :: array(0:)
+        integer(int_size), intent(out)                     :: index(0:)
+        character(len=len(array)), intent(out), optional   :: work(0:)
+        integer(int_size), intent(out), optional           :: iwork(0:)
+        logical, intent(in), optional                      :: reverse
 
         integer(int_size) :: array_size, i, stat
         character(len=:), allocatable :: buf(:)

From 6e77312cac687b47111d6dd5704a1aa5706e276c Mon Sep 17 00:00:00 2001
From: Jeremie Vandenplas <jeremie.vandenplas@gmail.com>
Date: Fri, 28 May 2021 20:02:09 +0200
Subject: [PATCH 34/45] add reverse to sort

---
 src/stdlib_sorting.fypp            |   8 +-
 src/stdlib_sorting_sort.fypp       |  74 ++++++++++---
 src/tests/sorting/test_sorting.f90 | 171 ++++++++++++++++++++++++-----
 3 files changed, 208 insertions(+), 45 deletions(-)

diff --git a/src/stdlib_sorting.fypp b/src/stdlib_sorting.fypp
index f1139e9b2..3e7533d3a 100644
--- a/src/stdlib_sorting.fypp
+++ b/src/stdlib_sorting.fypp
@@ -365,7 +365,7 @@ module stdlib_sorting
 !! on the `introsort` of David Musser.
 
 #:for k1, t1 in IRS_KINDS_TYPES
-        pure module subroutine ${k1}$_sort( array )
+        pure module subroutine ${k1}$_sort( array, reverse )
 !! Version: experimental
 !!
 !! `${k1}$_sort( array )` sorts the input `ARRAY` of type `${t1}$`
@@ -373,12 +373,13 @@ module stdlib_sorting
 !! The algorithm is of order O(N Ln(N)) for all inputs.
 !! Because it relies on `quicksort`, the coefficient of the O(N Ln(N))
 !! behavior is small for random data compared to other sorting algorithms.
-            ${t1}$, intent(inout) :: array(0:)
+            ${t1}$, intent(inout)         :: array(0:)
+            logical, intent(in), optional :: reverse
         end subroutine ${k1}$_sort
 
 #:endfor
 
-        pure module subroutine char_sort( array )
+        pure module subroutine char_sort( array, reverse )
 !! Version: experimental
 !!
 !! `char_sort( array )` sorts the input `ARRAY` of type `CHARACTER(*)`
@@ -387,6 +388,7 @@ module stdlib_sorting
 !! Because it relies on `quicksort`, the coefficient of the O(N Ln(N))
 !! behavior is small for random data compared to other sorting algorithms.
             character(len=*), intent(inout) :: array(0:)
+            logical, intent(in), optional   :: reverse
         end subroutine char_sort
 
     end interface sort
diff --git a/src/stdlib_sorting_sort.fypp b/src/stdlib_sorting_sort.fypp
index 244165c69..3502eb750 100644
--- a/src/stdlib_sorting_sort.fypp
+++ b/src/stdlib_sorting_sort.fypp
@@ -1,6 +1,11 @@
 #:include "common.fypp"
 #:set IRS_KINDS_TYPES = INT_KINDS_TYPES + REAL_KINDS_TYPES + STRING_KINDS_TYPES
 
+#:set SIGN_NAME = ["increase", "decrease"]
+#:set SIGN_TYPE = [">", "<"]
+#:set SIGN_OPP_TYPE = ["<", ">"]
+#:set SIGN_NAME_TYPE = list(zip(SIGN_NAME, SIGN_TYPE, SIGN_OPP_TYPE))
+
 !! Licensing:
 !!
 !! This file is subjec† both to the Fortran Standard Library license, and
@@ -60,10 +65,28 @@ submodule(stdlib_sorting) stdlib_sorting_sort
 
 contains
 
+#:for k1, t1 in IRS_KINDS_TYPES
+    pure module subroutine ${k1}$_sort( array, reverse )
+            ${t1}$, intent(inout) :: array(0:)
+            logical, intent(in), optional            :: reverse
+
+            logical :: reverse_
+
+            reverse_ = .false.
+            if(present(reverse)) reverse_ = reverse
 
+            if(reverse_)then
+             call ${k1}$_decrease_sort(array)
+            else
+             call ${k1}$_increase_sort(array)
+            endif
+    end subroutine ${k1}$_sort
+#:endfor
+
+#:for sname, signt, signoppt in SIGN_NAME_TYPE
 #:for k1, t1 in IRS_KINDS_TYPES
 
-    pure module subroutine ${k1}$_sort( array )
+    pure module subroutine ${k1}$_${sname}$_sort( array )
 ! `${k1}$_sort( array )` sorts the input `ARRAY` of type `${t1}$`
 ! using a hybrid sort based on the `introsort` of David Musser. As with
 ! `introsort`, `${k1}$_sort( array )` is an unstable hybrid comparison
@@ -126,12 +149,12 @@ contains
             u = array( 0 )
             v = array( size(array, kind=int_size)/2-1 )
             w = array( size(array, kind=int_size)-1 )
-            if ( (u > v) .neqv. (u > w) ) then
+            if ( (u ${signt}$ v) .neqv. (u ${signt}$ w) ) then
                 x = u
                 y = array(0)
                 array(0) = array( size( array, kind=int_size ) - 1 )
                 array( size( array, kind=int_size ) - 1 ) = y
-            else if ( (v < u) .neqv. (v < w) ) then
+            else if ( (v ${signoppt}$ u) .neqv. (v ${signoppt}$ w) ) then
                 x = v
                 y = array(size( array, kind=int_size )/2-1)
                 array( size( array, kind=int_size )/2-1 ) = &
@@ -143,7 +166,7 @@ contains
 ! Partition the array.
             i = -1_int_size
             do j = 0_int_size, size(array, kind=int_size)-2
-                if ( array(j) <= x ) then
+                if ( array(j) ${signoppt}$= x ) then
                     i = i + 1
                     y = array(i)
                     array(i) = array(j)
@@ -168,7 +191,7 @@ contains
                 key = array(j)
                 i = j - 1
                 do while( i >= 0 )
-                    if ( array(i) <= key ) exit
+                    if ( array(i) ${signoppt}$= key ) exit
                     array(i+1) = array(i)
                     i = i - 1
                 end do
@@ -212,10 +235,10 @@ contains
             l = 2_int_size * i + 1_int_size
             r = l + 1_int_size
             if ( l < heap_size ) then
-                if ( array(l) > array(largest) ) largest = l
+                if ( array(l) ${signt}$ array(largest) ) largest = l
             end if
             if ( r < heap_size ) then
-                if ( array(r) > array(largest) ) largest = r
+                if ( array(r) ${signt}$ array(largest) ) largest = r
             end if
             if ( largest /= i ) then
                 y = array(i)
@@ -226,14 +249,32 @@ contains
 
         end subroutine max_heapify
 
-    end subroutine ${k1}$_sort
+    end subroutine ${k1}$_${sname}$_sort
 
 #:endfor
+#:endfor
+
 
+    pure module subroutine char_sort( array, reverse )
+        character(len=*), intent(inout) :: array(0:)
+        logical, intent(in), optional   :: reverse
 
+        logical :: reverse_
 
+        reverse_ = .false.
+        if(present(reverse)) reverse_ = reverse
+
+        if(reverse_)then
+            call char_decrease_sort(array)
+        else
+            call char_increase_sort(array)
+        endif
+    end subroutine char_sort
 
-    pure module subroutine char_sort( array )
+
+
+#:for sname, signt, signoppt in SIGN_NAME_TYPE
+    pure module subroutine char_${sname}$_sort( array )
 ! `char_sort( array )` sorts the input `ARRAY` of type `CHARACTER(*)`
 ! using a hybrid sort based on the `introsort` of David Musser. As with
 ! `introsort`, `char_sort( array )` is an unstable hybrid comparison
@@ -296,12 +337,12 @@ contains
             u = array( 0 )
             v = array( size(array, kind=int_size)/2-1 )
             w = array( size(array, kind=int_size)-1 )
-            if ( (u > v) .neqv. (u > w) ) then
+            if ( (u ${signt}$ v) .neqv. (u ${signt}$ w) ) then
                 x = u
                 y = array(0)
                 array(0) = array( size( array, kind=int_size ) - 1 )
                 array( size( array, kind=int_size ) - 1 ) = y
-            else if ( (v < u) .neqv. (v < w) ) then
+            else if ( (v ${signoppt}$ u) .neqv. (v ${signoppt}$ w) ) then
                 x = v
                 y = array(size( array, kind=int_size )/2-1)
                 array( size( array, kind=int_size )/2-1 ) = &
@@ -313,7 +354,7 @@ contains
 ! Partition the array.
             i = -1_int_size
             do j = 0_int_size, size(array, kind=int_size)-2
-                if ( array(j) <= x ) then
+                if ( array(j) ${signoppt}$= x ) then
                     i = i + 1
                     y = array(i)
                     array(i) = array(j)
@@ -338,7 +379,7 @@ contains
                 key = array(j)
                 i = j - 1
                 do while( i >= 0 )
-                    if ( array(i) <= key ) exit
+                    if ( array(i) ${signoppt}$= key ) exit
                     array(i+1) = array(i)
                     i = i - 1
                 end do
@@ -382,10 +423,10 @@ contains
             l = 2_int_size * i + 1_int_size
             r = l + 1_int_size
             if ( l < heap_size ) then
-                if ( array(l) > array(largest) ) largest = l
+                if ( array(l) ${signt}$ array(largest) ) largest = l
             end if
             if ( r < heap_size ) then
-                if ( array(r) > array(largest) ) largest = r
+                if ( array(r) ${signt}$ array(largest) ) largest = r
             end if
             if ( largest /= i ) then
                 y = array(i)
@@ -396,6 +437,7 @@ contains
 
         end subroutine max_heapify
 
-    end subroutine char_sort
+    end subroutine char_${sname}$_sort
+#:endfor
 
 end submodule stdlib_sorting_sort
diff --git a/src/tests/sorting/test_sorting.f90 b/src/tests/sorting/test_sorting.f90
index 4025aef42..fd6c6aeda 100644
--- a/src/tests/sorting/test_sorting.f90
+++ b/src/tests/sorting/test_sorting.f90
@@ -4,8 +4,9 @@ program test_sorting
         compiler_version
     use stdlib_kinds, only: int32, int64, dp, sp
     use stdlib_sorting
-    use stdlib_string_type, only: string_type, assignment(=), operator(>), &
+    use stdlib_string_type, only: string_type, assignment(=), operator(>), operator(<), &
         write(formatted)
+    use stdlib_error, only: check
 
     implicit none
 
@@ -48,6 +49,7 @@ program test_sorting
     integer                 :: lun
     character(len=4)        :: char_temp
     type(string_type)       :: string_temp
+    logical                 :: ltest, ldummy
 
 ! Create the test arrays
     identical(:) = 10
@@ -149,17 +151,22 @@ program test_sorting
         '--|-----------|")' )
 
 ! test the sorting routines on the test arrays
+    ltest = .true.
+
     call test_int_ord_sorts( )
 
     call test_char_ord_sorts( )
 
     call test_string_ord_sorts( )
 
-    call test_int_sorts( )
+    call test_int_sorts( ldummy )
+    ltest = (ltest .and. ldummy)
 
-    call test_char_sorts( )
+    call test_char_sorts( ldummy )
+    ltest = (ltest .and. ldummy)
 
-    call test_string_sorts( )
+    call test_string_sorts( ldummy )
+    ltest = (ltest .and. ldummy)
 
     call test_int_sort_indexes( )
 
@@ -167,6 +174,9 @@ program test_sorting
 
     call test_string_sort_indexes( )
 
+
+    call check(ltest)
+
 contains
 
     subroutine test_int_ord_sorts( )
@@ -295,29 +305,46 @@ subroutine test_string_ord_sort( a, a_name )
     end subroutine test_string_ord_sort
 
 
-    subroutine test_int_sorts( )
-
-        call test_int_sort( blocks, "Blocks" )
-        call test_int_sort( decrease, "Decreasing" )
-        call test_int_sort( identical, "Identical" )
-        call test_int_sort( increase, "Increasing" )
-        call test_int_sort( rand1, "Random dense" )
-        call test_int_sort( rand2, "Random order" )
-        call test_int_sort( rand0, "Random sparse" )
-        call test_int_sort( rand3, "Random 3" )
-        call test_int_sort( rand10, "Random 10" )
+    subroutine test_int_sorts( ltest )
+        logical, intent(out) :: ltest
+
+        logical :: ldummy
+
+        ltest = .true.
+
+        call test_int_sort( blocks, "Blocks", ldummy )
+        ltest = (ltest .and. ldummy)
+        call test_int_sort( decrease, "Decreasing", ldummy )
+        ltest = (ltest .and. ldummy)
+        call test_int_sort( identical, "Identical", ldummy )
+        ltest = (ltest .and. ldummy)
+        call test_int_sort( increase, "Increasing", ldummy )
+        ltest = (ltest .and. ldummy)
+        call test_int_sort( rand1, "Random dense", ldummy )
+        ltest = (ltest .and. ldummy)
+        call test_int_sort( rand2, "Random order", ldummy )
+        ltest = (ltest .and. ldummy)
+        call test_int_sort( rand0, "Random sparse", ldummy )
+        ltest = (ltest .and. ldummy)
+        call test_int_sort( rand3, "Random 3", ldummy )
+        ltest = (ltest .and. ldummy)
+        call test_int_sort( rand10, "Random 10", ldummy )
+        ltest = (ltest .and. ldummy)
 
     end subroutine test_int_sorts
 
-    subroutine test_int_sort( a, a_name )
+    subroutine test_int_sort( a, a_name, ltest )
         integer(int32), intent(in) :: a(:)
         character(*), intent(in)   :: a_name
+        logical, intent(out) :: ltest
 
         integer(int64) :: t0, t1, tdiff
         real(dp)       :: rate
         integer(int64) :: i
         logical        :: valid
 
+        ltest = .true.
+
         tdiff = 0
         do i = 1, repeat
             dummy = a
@@ -328,7 +355,9 @@ subroutine test_int_sort( a, a_name )
         end do
         tdiff = tdiff/repeat
 
+
         call verify_sort( dummy, valid, i )
+        ltest = (ltest .and. valid)
         if ( .not. valid ) then
             write( *, * ) "SORT did not sort " // a_name // "."
             write(*,*) 'i = ', i
@@ -338,19 +367,40 @@ subroutine test_int_sort( a, a_name )
             'a12, " |",  F10.5, " |" )' ) &
             test_size, a_name, "Sort", tdiff/rate
 
+
+        ! reverse
+        dummy = a
+        call sort( dummy, .true.)
+        call verify_reverse_sort(dummy, valid, i)
+        ltest = (ltest .and. valid)
+        if ( .not. valid ) then
+            write( *, * ) "reverse SORT did not sort " // a_name // "."
+            write(*,*) 'i = ', i
+            write(*,'(a12, 2i7)') 'dummy(i-1:i) = ', dummy(i-1:i)
+        end if
+
     end subroutine test_int_sort
 
-    subroutine test_char_sorts( )
+    subroutine test_char_sorts( ltest )
+        logical, intent(out) :: ltest
+
+        logical :: ldummy
+
+        call test_char_sort( char_decrease, "Char. Decrease", ldummy )
+        ltest = (ltest .and. ldummy)
 
-        call test_char_sort( char_decrease, "Char. Decrease" )
-        call test_char_sort( char_increase, "Char. Increase" )
-        call test_char_sort( char_rand, "Char. Random" )
+        call test_char_sort( char_increase, "Char. Increase", ldummy )
+        ltest = (ltest .and. ldummy)
+
+        call test_char_sort( char_rand, "Char. Random", ldummy )
+        ltest = (ltest .and. ldummy)
 
     end subroutine test_char_sorts
 
-    subroutine test_char_sort( a, a_name )
+    subroutine test_char_sort( a, a_name, ltest )
         character(len=4), intent(in) :: a(0:)
         character(*), intent(in) :: a_name
+        logical, intent(out) :: ltest
 
         integer(int64) :: t0, t1, tdiff
         real(dp)       :: rate
@@ -368,6 +418,7 @@ subroutine test_char_sort( a, a_name )
         tdiff = tdiff/repeat
 
         call verify_char_sort( char_dummy, valid, i )
+        ltest = (ltest .and. valid)
         if ( .not. valid ) then
             write( *, * ) "SORT did not sort " // a_name // "."
             write(*,*) 'i = ', i
@@ -377,25 +428,47 @@ subroutine test_char_sort( a, a_name )
             'a12, " |",  F10.5, " |" )' ) &
             char_size, a_name, "Sort", tdiff/rate
 
+        !reverse
+        char_dummy = a
+        call sort( char_dummy, .true.)
+        call verify_char_reverse_sort( char_dummy, valid, i )
+        ltest = (ltest .and. valid)
+        if ( .not. valid ) then
+            write( *, * ) "reverse SORT did not sort " // a_name // "."
+            write(*,*) 'i = ', i
+            write(*,'(a17, 2(1x,a4))') 'char_dummy(i-1:i) = ', char_dummy(i-1:i)
+        end if
+   
     end subroutine test_char_sort
 
-    subroutine test_string_sorts( )
+    subroutine test_string_sorts( ltest )
+        logical, intent(out) :: ltest
+
+        logical :: ldummy
+
+        ltest = .true.
 
-        call test_string_sort( string_decrease, "String Decrease" )
-        call test_string_sort( string_increase, "String Increase" )
-        call test_string_sort( string_rand, "String Random" )
+        call test_string_sort( string_decrease, "String Decrease", ldummy )
+        ltest = (ltest .and. ldummy)
+        call test_string_sort( string_increase, "String Increase", ldummy )
+        ltest = (ltest .and. ldummy)
+        call test_string_sort( string_rand, "String Random", ldummy )
+        ltest = (ltest .and. ldummy)
 
     end subroutine test_string_sorts
 
-    subroutine test_string_sort( a, a_name )
+    subroutine test_string_sort( a, a_name, ltest )
         type(string_type), intent(in) :: a(0:)
         character(*), intent(in) :: a_name
+        logical, intent(out) :: ltest
 
         integer(int64) :: t0, t1, tdiff
         real(dp)       :: rate
         integer(int64) :: i
         logical        :: valid
 
+        ltest = .true.
+
         tdiff = 0
         do i = 1, repeat
             string_dummy = a
@@ -407,6 +480,7 @@ subroutine test_string_sort( a, a_name )
         tdiff = tdiff/repeat
 
         call verify_string_sort( string_dummy, valid, i )
+        ltest = (ltest .and. valid)
         if ( .not. valid ) then
             write( *, * ) "SORT did not sort " // a_name // "."
             write(*,*) 'i = ', i
@@ -417,6 +491,19 @@ subroutine test_string_sort( a, a_name )
             'a12, " |",  F10.5, " |" )' ) &
             string_size, a_name, "Sort", tdiff/rate
 
+        ! reverse
+        string_dummy = a
+        call sort( string_dummy, .true.)
+        call verify_string_reverse_sort(string_dummy, valid, i)
+        ltest = (ltest .and. valid)
+        if ( .not. valid ) then
+            write( *, * ) "reverse SORT did not sort " // a_name // "."
+            write(*,*) 'i = ', i
+            write(*,'(a17, 2(1x,a4))') 'string_dummy(i-1:i) = ', &
+                string_dummy(i-1:i)
+        end if
+
+
     end subroutine test_string_sort
 
     subroutine test_int_sort_indexes( )
@@ -573,6 +660,7 @@ subroutine verify_sort( a, valid, i )
     end subroutine verify_sort
 
 
+
     subroutine verify_string_sort( a, valid, i )
         type(string_type), intent(in) :: a(0:)
         logical, intent(out) :: valid
@@ -605,6 +693,21 @@ subroutine verify_char_sort( a, valid, i )
 
     end subroutine verify_char_sort
 
+    subroutine verify_char_reverse_sort( a, valid, i )
+        character(len=4), intent(in) :: a(0:)
+        logical, intent(out) :: valid
+        integer(int64), intent(out) :: i
+
+        integer(int64) :: n
+
+        n = size( a, kind=int64 )
+        valid = .false.
+        do i=1, n-1
+            if ( a(i-1) < a(i) ) return
+        end do
+        valid = .true.
+
+    end subroutine verify_char_reverse_sort
 
     subroutine verify_reverse_sort( a, valid, i )
         integer(int32), intent(in) :: a(0:)
@@ -622,4 +725,20 @@ subroutine verify_reverse_sort( a, valid, i )
 
     end subroutine verify_reverse_sort
 
+    subroutine verify_string_reverse_sort( a, valid, i )
+        type(string_type), intent(in) :: a(0:)
+        logical, intent(out) :: valid
+        integer(int64), intent(out) :: i
+
+        integer(int64) :: n
+
+        n = size( a, kind=int64 )
+        valid = .false.
+        do i=1, n-1
+            if ( a(i-1) < a(i) ) return
+        end do
+        valid = .true.
+
+    end subroutine verify_string_reverse_sort
+
 end program test_sorting

From 40e084ea5adc0c387fa9a5ef44e11d2b227f8c3e Mon Sep 17 00:00:00 2001
From: Jeremie Vandenplas <jeremie.vandenplas@gmail.com>
Date: Fri, 28 May 2021 21:21:09 +0200
Subject: [PATCH 35/45] sort_sign: add reverse in ord_sort

---
 src/stdlib_sorting.fypp            |   6 +-
 src/stdlib_sorting_ord_sort.fypp   |  87 +++++++++++----
 src/tests/sorting/test_sorting.f90 | 172 +++++++++++++++++++++++------
 3 files changed, 211 insertions(+), 54 deletions(-)

diff --git a/src/stdlib_sorting.fypp b/src/stdlib_sorting.fypp
index 3e7533d3a..9658021fe 100644
--- a/src/stdlib_sorting.fypp
+++ b/src/stdlib_sorting.fypp
@@ -336,24 +336,26 @@ module stdlib_sorting
 !! non-decreasing data.
 
 #:for k1, t1 in IRS_KINDS_TYPES
-        module subroutine ${k1}$_ord_sort( array, work )
+        module subroutine ${k1}$_ord_sort( array, work, reverse )
 !! Version: experimental
 !!
 !! `${k1}$_ord_sort( array )` sorts the input `ARRAY` of type `${t1}$`
 !! using a hybrid sort based on the `'Rust" sort` algorithm found in `slice.rs`
             ${t1}$, intent(inout)         :: array(0:)
             ${t1}$, intent(out), optional :: work(0:)
+            logical, intent(in), optional :: reverse
         end subroutine ${k1}$_ord_sort
 
 #:endfor
 
-        module subroutine char_ord_sort( array, work )
+        module subroutine char_ord_sort( array, work, reverse )
 !! Version: experimental
 !!
 !! `char_ord_sort( array )` sorts the input `ARRAY` of type `CHARACTER(*)`
 !! using a hybrid sort based on the `'Rust" sort` algorithm found in `slice.rs`
             character(len=*), intent(inout)                  :: array(0:)
             character(len=len(array)), intent(out), optional :: work(0:)
+            logical, intent(in), optional                    :: reverse
         end subroutine char_ord_sort
 
     end interface ord_sort
diff --git a/src/stdlib_sorting_ord_sort.fypp b/src/stdlib_sorting_ord_sort.fypp
index 065b2a3c8..b7fae6d23 100644
--- a/src/stdlib_sorting_ord_sort.fypp
+++ b/src/stdlib_sorting_ord_sort.fypp
@@ -1,6 +1,11 @@
 #:include "common.fypp"
 #:set IRS_KINDS_TYPES = INT_KINDS_TYPES + REAL_KINDS_TYPES + STRING_KINDS_TYPES
 
+#:set SIGN_NAME = ["increase", "decrease"]
+#:set SIGN_TYPE = [">", "<"]
+#:set SIGN_OPP_TYPE = ["<", ">"]
+#:set SIGN_NAME_TYPE = list(zip(SIGN_NAME, SIGN_TYPE, SIGN_OPP_TYPE))
+
 !! Licensing:
 !!
 !! This file is subjec† both to the Fortran Standard Library license, and
@@ -57,8 +62,29 @@ submodule(stdlib_sorting) stdlib_sorting_ord_sort
 contains
 
 #:for k1, t1 in IRS_KINDS_TYPES
+    module subroutine ${k1}$_ord_sort( array, work, reverse )
+        ${t1}$, intent(inout)         :: array(0:)
+        ${t1}$, intent(out), optional :: work(0:)
+        logical, intent(in), optional :: reverse
+
+        logical :: reverse_
+
+        reverse_ = .false.
+        if(present(reverse)) reverse_ = reverse
+
+        if (reverse_) then
+            call ${k1}$_decrease_ord_sort(array, work)
+        else
+            call ${k1}$_increase_ord_sort(array, work)
+        endif
 
-    module subroutine ${k1}$_ord_sort( array, work )
+    end subroutine ${k1}$_ord_sort
+#:endfor
+
+#:for sname, signt, signoppt in SIGN_NAME_TYPE
+#:for k1, t1 in IRS_KINDS_TYPES
+
+    module subroutine ${k1}$_${sname}$_ord_sort( array, work )
 ! A translation to Fortran 2008, of the `"Rust" sort` algorithm found in
 ! `slice.rs`
 ! https://github.com/rust-lang/rust/blob/90eb44a5897c39e3dff9c7e48e3973671dcd9496/src/liballoc/slice.rs#L2159
@@ -92,7 +118,7 @@ contains
 ! Allocate a buffer to use as scratch memory.
             array_size = size( array, kind=int_size )
             allocate( buf(0:array_size/2-1), stat=stat )
-            if ( stat /= 0 ) error stop "${k1}$_ord_sort: Allocation of buffer failed."
+            if ( stat /= 0 ) error stop "${k1}$_${sname}$_ord_sort: Allocation of buffer failed."
             call merge_sort( array, buf )
         end if
 
@@ -129,7 +155,7 @@ contains
             do j=1, size(array, kind=int_size)-1
                 key = array(j)
                 i = j - 1
-                do while( i >= 0 .and. array(i) > key )
+                do while( i >= 0 .and. array(i) ${signt}$ key )
                     array(i+1) = array(i)
                     i = i - 1
                 end do
@@ -204,7 +230,7 @@ contains
 
             tmp = array(0)
             find_hole: do i=1, size(array, kind=int_size)-1
-                if ( array(i) >= tmp ) exit find_hole
+                if ( array(i) ${signt}$= tmp ) exit find_hole
                 array(i-1) = array(i)
             end do find_hole
             array(i-1) = tmp
@@ -263,16 +289,16 @@ contains
                 start = finish
                 if ( start > 0 ) then
                     start = start - 1
-                    if ( array(start+1) < array(start) ) then
+                    if ( array(start+1) ${signoppt}$ array(start) ) then
                         Descending: do while ( start > 0 )
-                            if ( array(start) >= array(start-1) ) &
+                            if ( array(start) ${signt}$= array(start-1) ) &
                                 exit Descending
                             start = start - 1
                         end do Descending
                         call reverse_segment( array(start:finish) )
                     else
                         Ascending: do while( start > 0 )
-                            if ( array(start) < array(start-1) ) exit Ascending
+                            if ( array(start) ${signoppt}$ array(start-1) ) exit Ascending
                             start = start - 1
                         end do Ascending
                     end if
@@ -338,7 +364,7 @@ contains
                 i = 0
                 j = mid
                 merge_lower: do k = 0, array_len-1
-                    if ( buf(i) <= array(j) ) then
+                    if ( buf(i) ${signoppt}$= array(j) ) then
                         array(k) = buf(i)
                         i = i + 1
                         if ( i >= mid ) exit merge_lower
@@ -356,7 +382,7 @@ contains
                 i = mid - 1
                 j = array_len - mid -1
                 merge_upper: do k = array_len-1, 0, -1
-                    if ( buf(j) >= array(i) ) then
+                    if ( buf(j) ${signt}$= array(i) ) then
                         array(k) = buf(j)
                         j = j - 1
                         if ( j < 0 ) exit merge_upper
@@ -392,12 +418,32 @@ contains
 
         end subroutine reverse_segment
 
-    end subroutine ${k1}$_ord_sort
+    end subroutine ${k1}$_${sname}$_ord_sort
 
 #:endfor
+#:endfor
+
+    module subroutine char_ord_sort( array, work, reverse )
+        character(len=*), intent(inout)                  :: array(0:)
+        character(len=len(array)), intent(out), optional :: work(0:)
+        logical, intent(in), optional                    :: reverse
+
+        logical :: reverse_
+
+        reverse_ = .false.
+        if(present(reverse)) reverse_ = reverse
 
+        if (reverse_) then
+            call char_decrease_ord_sort(array, work)
+        else
+            call char_increase_ord_sort(array, work)
+        endif
+
+    end subroutine char_ord_sort
 
-    module subroutine char_ord_sort( array, work )
+
+#:for sname, signt, signoppt in SIGN_NAME_TYPE
+    module subroutine char_${sname}$_ord_sort( array, work )
 ! A translation to Fortran 2008, of the `"Rust" sort` algorithm found in
 ! `slice.rs`
 ! https://github.com/rust-lang/rust/blob/90eb44a5897c39e3dff9c7e48e3973671dcd9496/src/liballoc/slice.rs#L2159
@@ -432,7 +478,7 @@ contains
             array_size = size( array, kind=int_size )
             allocate( character(len=len(array)) :: buf(0:array_size/2-1), &
                       stat=stat )
-            if ( stat /= 0 ) error stop "${k1}$_ord_sort: Allocation of buffer failed."
+            if ( stat /= 0 ) error stop "${k1}$_${sname}$_ord_sort: Allocation of buffer failed."
             call merge_sort( array, buf )
         end if
 
@@ -469,7 +515,7 @@ contains
             do j=1, size(array, kind=int_size)-1
                 key = array(j)
                 i = j - 1
-                do while( i >= 0 .and. array(i) > key )
+                do while( i >= 0 .and. array(i) ${signt}$ key )
                     array(i+1) = array(i)
                     i = i - 1
                 end do
@@ -544,7 +590,7 @@ contains
 
             tmp = array(0)
             find_hole: do i=1, size(array, kind=int_size)-1
-                if ( array(i) >= tmp ) exit find_hole
+                if ( array(i) ${signt}$= tmp ) exit find_hole
                 array(i-1) = array(i)
             end do find_hole
             array(i-1) = tmp
@@ -603,16 +649,16 @@ contains
                 start = finish
                 if ( start > 0 ) then
                     start = start - 1
-                    if ( array(start+1) < array(start) ) then
+                    if ( array(start+1) ${signoppt}$ array(start) ) then
                         Descending: do while ( start > 0 )
-                            if ( array(start) >= array(start-1) ) &
+                            if ( array(start) ${signt}$= array(start-1) ) &
                                 exit Descending
                             start = start - 1
                         end do Descending
                         call reverse_segment( array(start:finish) )
                     else
                         Ascending: do while( start > 0 )
-                            if ( array(start) < array(start-1) ) exit Ascending
+                            if ( array(start) ${signoppt}$ array(start-1) ) exit Ascending
                             start = start - 1
                         end do Ascending
                     end if
@@ -678,7 +724,7 @@ contains
                 i = 0
                 j = mid
                 merge_lower: do k = 0, array_len-1
-                    if ( buf(i) <= array(j) ) then
+                    if ( buf(i) ${signoppt}$= array(j) ) then
                         array(k) = buf(i)
                         i = i + 1
                         if ( i >= mid ) exit merge_lower
@@ -696,7 +742,7 @@ contains
                 i = mid - 1
                 j = array_len - mid -1
                 merge_upper: do k = array_len-1, 0, -1
-                    if ( buf(j) >= array(i) ) then
+                    if ( buf(j) ${signt}$= array(i) ) then
                         array(k) = buf(j)
                         j = j - 1
                         if ( j < 0 ) exit merge_upper
@@ -732,7 +778,8 @@ contains
 
         end subroutine reverse_segment
 
-    end subroutine char_ord_sort
+    end subroutine char_${sname}$_ord_sort
+#:endfor
 
 end submodule stdlib_sorting_ord_sort
 
diff --git a/src/tests/sorting/test_sorting.f90 b/src/tests/sorting/test_sorting.f90
index fd6c6aeda..639abf1e5 100644
--- a/src/tests/sorting/test_sorting.f90
+++ b/src/tests/sorting/test_sorting.f90
@@ -153,20 +153,17 @@ program test_sorting
 ! test the sorting routines on the test arrays
     ltest = .true.
 
-    call test_int_ord_sorts( )
+    call test_int_ord_sorts( ldummy );    ltest = (ltest .and. ldummy)
 
-    call test_char_ord_sorts( )
+    call test_char_ord_sorts(ldummy );    ltest = (ltest .and. ldummy)
 
-    call test_string_ord_sorts( )
+    call test_string_ord_sorts( ldummy ); ltest = (ltest .and. ldummy)
 
-    call test_int_sorts( ldummy )
-    ltest = (ltest .and. ldummy)
+    call test_int_sorts( ldummy );        ltest = (ltest .and. ldummy)
 
-    call test_char_sorts( ldummy )
-    ltest = (ltest .and. ldummy)
+    call test_char_sorts( ldummy );       ltest = (ltest .and. ldummy)
 
-    call test_string_sorts( ldummy )
-    ltest = (ltest .and. ldummy)
+    call test_string_sorts( ldummy );     ltest = (ltest .and. ldummy)
 
     call test_int_sort_indexes( )
 
@@ -179,30 +176,47 @@ program test_sorting
 
 contains
 
-    subroutine test_int_ord_sorts( )
+    subroutine test_int_ord_sorts( ltest )
+        logical, intent(out) :: ltest
+
+        logical :: ldummy
 
-        call test_int_ord_sort( blocks, "Blocks" )
-        call test_int_ord_sort( decrease, "Decreasing" )
-        call test_int_ord_sort( identical, "Identical" )
-        call test_int_ord_sort( increase, "Increasing" )
-        call test_int_ord_sort( rand1, "Random dense" )
-        call test_int_ord_sort( rand2, "Random order" )
-        call test_int_ord_sort( rand0, "Random sparse" )
-        call test_int_ord_sort( rand3, "Random 3" )
-        call test_int_ord_sort( rand10, "Random 10" )
+        ltest = .true.
+
+        call test_int_ord_sort( blocks, "Blocks", ldummy )
+        ltest = (ltest .and. ldummy)
+        call test_int_ord_sort( decrease, "Decreasing", ldummy )
+        ltest = (ltest .and. ldummy)
+        call test_int_ord_sort( identical, "Identical", ldummy )
+        ltest = (ltest .and. ldummy)
+        call test_int_ord_sort( increase, "Increasing", ldummy )
+        ltest = (ltest .and. ldummy)
+        call test_int_ord_sort( rand1, "Random dense", ldummy )
+        ltest = (ltest .and. ldummy)
+        call test_int_ord_sort( rand2, "Random order", ldummy )
+        ltest = (ltest .and. ldummy)
+        call test_int_ord_sort( rand0, "Random sparse", ldummy )
+        ltest = (ltest .and. ldummy)
+        call test_int_ord_sort( rand3, "Random 3", ldummy )
+        ltest = (ltest .and. ldummy)
+        call test_int_ord_sort( rand10, "Random 10", ldummy )
+        ltest = (ltest .and. ldummy)
 
     end subroutine test_int_ord_sorts
 
 
-    subroutine test_int_ord_sort( a, a_name )
+    subroutine test_int_ord_sort( a, a_name, ltest )
         integer(int32), intent(in) :: a(:)
         character(*), intent(in)   :: a_name 
+        logical, intent(out)       :: ltest
 
         integer(int64) :: t0, t1, tdiff
         real(dp)       :: rate
         integer(int64) :: i
         logical :: valid
 
+        ltest = .true.
+
         tdiff = 0
         do i = 1, repeat
             dummy = a
@@ -214,6 +228,7 @@ subroutine test_int_ord_sort( a, a_name )
         tdiff = tdiff/repeat
 
         call verify_sort( dummy, valid, i )
+        ltest = (ltest .and. valid)
         if ( .not. valid ) then
             write( *, * ) "ORD_SORT did not sort " // a_name // "."
             write(*,*) 'i = ', i
@@ -223,19 +238,49 @@ subroutine test_int_ord_sort( a, a_name )
             'a12, " |",  F10.5, " |" )' ) &
             test_size, a_name, "Ord_Sort", tdiff/rate
 
+        !reverse
+        dummy = a
+        call ord_sort( dummy, work, reverse = .true.)
+        call verify_reverse_sort( dummy, valid, i )
+        ltest = (ltest .and. valid)
+        if ( .not. valid ) then
+            write( *, * ) "reverse + work ORD_SORT did not sort " // a_name // "."
+            write(*,*) 'i = ', i
+            write(*,'(a12, 2i7)') 'dummy(i-1:i) = ', dummy(i-1:i)
+        end if
+
+        dummy = a
+        call ord_sort( dummy, reverse = .true.)
+        call verify_reverse_sort( dummy, valid, i )
+        ltest = (ltest .and. valid)
+        if ( .not. valid ) then
+            write( *, * ) "reverse ORD_SORT did not sort " // a_name // "."
+            write(*,*) 'i = ', i
+            write(*,'(a12, 2i7)') 'dummy(i-1:i) = ', dummy(i-1:i)
+        end if
+
     end subroutine test_int_ord_sort
 
-    subroutine test_char_ord_sorts( )
+    subroutine test_char_ord_sorts( ltest )
+        logical, intent(out) :: ltest
 
-        call test_char_ord_sort( char_decrease, "Char. Decrease" )
-        call test_char_ord_sort( char_increase, "Char. Increase" )
-        call test_char_ord_sort( char_rand, "Char. Random" )
+        logical :: ldummy
+
+        ltest = .true.
+
+        call test_char_ord_sort( char_decrease, "Char. Decrease", ldummy )
+        ltest = (ltest .and. ldummy)
+        call test_char_ord_sort( char_increase, "Char. Increase", ldummy )
+        ltest = (ltest .and. ldummy)
+        call test_char_ord_sort( char_rand, "Char. Random", ldummy )
+        ltest = (ltest .and. ldummy)
 
     end subroutine test_char_ord_sorts
 
-    subroutine test_char_ord_sort( a, a_name )
+    subroutine test_char_ord_sort( a, a_name, ltest )
         character(len=4), intent(in) :: a(0:)
         character(*), intent(in) :: a_name
+        logical, intent(out) :: ltest
 
         integer(int64) :: t0, t1, tdiff
         real(dp)       :: rate
@@ -253,34 +298,71 @@ subroutine test_char_ord_sort( a, a_name )
         tdiff = tdiff/repeat
 
         call verify_char_sort( char_dummy, valid, i )
+        ltest = (ltest .and. valid)
         if ( .not. valid ) then
             write( *, * ) "ORD_SORT did not sort " // a_name // "."
             write(*,*) 'i = ', i
-            write(*,'(a17, 2(1x,a4))') 'char_dummy(i-1:i) = ', char_dummy(i-1:i)
+            write(*,'(a, 2(1x,a4))') 'char_dummy(i-1:i) = ', char_dummy(i-1:i)
         end if
         write( lun, '("|   Character |", 1x, i7, 2x, "|", 1x, a15, " |", ' // &
             'a12, " |",  F10.5, " |" )' ) &
             char_size, a_name, "Ord_Sort", tdiff/rate
 
+        !reverse
+        char_dummy = a
+        call ord_sort( char_dummy, char_work, reverse = .true. )
+
+        call verify_char_reverse_sort( char_dummy, valid, i )
+        ltest = (ltest .and. valid)
+        if ( .not. valid ) then
+            write( *, * ) "reverse + work ORD_SORT did not sort " // a_name // "."
+            write(*,*) 'i = ', i
+            write(*,'(a, 2(1x,a4))') 'char_dummy(i-1:i) = ', char_dummy(i-1:i)
+        end if
+
+        char_dummy = a
+        call ord_sort( char_dummy, reverse = .true. )
+
+        call verify_char_reverse_sort( char_dummy, valid, i )
+        ltest = (ltest .and. valid)
+        if ( .not. valid ) then
+            write( *, * ) "reverse + work ORD_SORT did not sort " // a_name // "."
+            write(*,*) 'i = ', i
+            write(*,'(a, 2(1x,a4))') 'char_dummy(i-1:i) = ', char_dummy(i-1:i)
+        end if
+
     end subroutine test_char_ord_sort
 
-    subroutine test_string_ord_sorts( )
+    subroutine test_string_ord_sorts( ltest )
+        logical, intent(out) :: ltest
+
+        logical:: ldummy
 
-        call test_string_ord_sort( string_decrease, "String Decrease" )
-        call test_string_ord_sort( string_increase, "String Increase" )
-        call test_string_ord_sort( string_rand, "String Random" )
+        ltest = .true.
+
+        call test_string_ord_sort( string_decrease, "String Decrease", ldummy )
+        ltest = (ltest .and. ldummy)
+
+        call test_string_ord_sort( string_increase, "String Increase", ldummy )
+        ltest = (ltest .and. ldummy)
+
+        call test_string_ord_sort( string_rand, "String Random" , ldummy)
+        ltest = (ltest .and. ldummy)
 
     end subroutine test_string_ord_sorts
 
-    subroutine test_string_ord_sort( a, a_name )
+    subroutine test_string_ord_sort( a, a_name, ltest )
         type(string_type), intent(in) :: a(0:)
         character(*), intent(in)      :: a_name
+        logical, intent(out)          :: ltest
 
         integer(int64) :: t0, t1, tdiff
         real(dp)       :: rate
         integer(int64) :: i
         logical        :: valid
 
+        ltest = .true.
+
         tdiff = 0
         do i = 1, repeat
             string_dummy = a
@@ -292,16 +374,42 @@ subroutine test_string_ord_sort( a, a_name )
         tdiff = tdiff/repeat
 
         call verify_string_sort( string_dummy, valid, i )
+        ltest = (ltest .and. valid)
         if ( .not. valid ) then
             write( *, * ) "ORD_SORT did not sort " // a_name // "."
             write(*,*) 'i = ', i
-            write(*,'(a17, 2(1x,a4))') 'string_dummy(i-1:i) = ', &
+            write(*,'(a, 2(1x,a))') 'string_dummy(i-1:i) = ', &
                 string_dummy(i-1:i)
         end if
         write( lun, '("| String_type |", 1x, i7, 2x, "|", 1x, a15, " |", ' // &
             'a12, " |",  F10.5, " |" )' ) &
             string_size, a_name, "Ord_Sort", tdiff/rate
 
+        !reverse
+        string_dummy = a
+        call ord_sort( string_dummy, string_work, reverse = .true. )
+
+        call verify_string_reverse_sort( string_dummy, valid, i )
+        ltest = (ltest .and. valid)
+        if ( .not. valid ) then
+            write( *, * ) "reverse + work ORD_SORT did not sort " // a_name // "."
+            write(*,*) 'i = ', i
+            write(*,'(a, 2(1x,a))') 'string_dummy(i-1:i) = ', &
+                string_dummy(i-1:i)
+        end if
+ 
+        string_dummy = a
+        call ord_sort( string_dummy, reverse = .true. )
+
+        call verify_string_reverse_sort( string_dummy, valid, i )
+        ltest = (ltest .and. valid)
+        if ( .not. valid ) then
+            write( *, * ) "reverse ORD_SORT did not sort " // a_name // "."
+            write(*,*) 'i = ', i
+            write(*,'(a, 2(1x,a))') 'string_dummy(i-1:i) = ', &
+                string_dummy(i-1:i)
+        end if
+
     end subroutine test_string_ord_sort
 
 

From 940b6fc6efb7e2363fbd682ce18db729502cf76e Mon Sep 17 00:00:00 2001
From: Jeremie Vandenplas <jeremie.vandenplas@gmail.com>
Date: Fri, 28 May 2021 21:39:54 +0200
Subject: [PATCH 36/45] sort_sign: add checks (error stop) to all tests

---
 src/tests/sorting/test_sorting.f90 | 109 ++++++++++++++++++++---------
 1 file changed, 76 insertions(+), 33 deletions(-)

diff --git a/src/tests/sorting/test_sorting.f90 b/src/tests/sorting/test_sorting.f90
index 639abf1e5..415f846f2 100644
--- a/src/tests/sorting/test_sorting.f90
+++ b/src/tests/sorting/test_sorting.f90
@@ -153,23 +153,23 @@ program test_sorting
 ! test the sorting routines on the test arrays
     ltest = .true.
 
-    call test_int_ord_sorts( ldummy );    ltest = (ltest .and. ldummy)
+    call test_int_ord_sorts( ldummy );       ltest = (ltest .and. ldummy)
 
-    call test_char_ord_sorts(ldummy );    ltest = (ltest .and. ldummy)
+    call test_char_ord_sorts(ldummy );       ltest = (ltest .and. ldummy)
 
-    call test_string_ord_sorts( ldummy ); ltest = (ltest .and. ldummy)
+    call test_string_ord_sorts( ldummy );    ltest = (ltest .and. ldummy)
 
-    call test_int_sorts( ldummy );        ltest = (ltest .and. ldummy)
+    call test_int_sorts( ldummy );           ltest = (ltest .and. ldummy)
 
-    call test_char_sorts( ldummy );       ltest = (ltest .and. ldummy)
+    call test_char_sorts( ldummy );          ltest = (ltest .and. ldummy)
 
-    call test_string_sorts( ldummy );     ltest = (ltest .and. ldummy)
+    call test_string_sorts( ldummy );        ltest = (ltest .and. ldummy)
 
-    call test_int_sort_indexes( )
+    call test_int_sort_indexes( ldummy );    ltest = (ltest .and. ldummy)
 
-    call test_char_sort_indexes( )
+    call test_char_sort_indexes( ldummy );   ltest = (ltest .and. ldummy)
 
-    call test_string_sort_indexes( )
+    call test_string_sort_indexes( ldummy ); ltest = (ltest .and. ldummy)
 
 
     call check(ltest)
@@ -207,7 +207,7 @@ end subroutine test_int_ord_sorts
 
     subroutine test_int_ord_sort( a, a_name, ltest )
         integer(int32), intent(in) :: a(:)
-        character(*), intent(in)   :: a_name 
+        character(*), intent(in)   :: a_name
         logical, intent(out)       :: ltest
 
         integer(int64) :: t0, t1, tdiff
@@ -397,7 +397,7 @@ subroutine test_string_ord_sort( a, a_name, ltest )
             write(*,'(a, 2(1x,a))') 'string_dummy(i-1:i) = ', &
                 string_dummy(i-1:i)
         end if
- 
+
         string_dummy = a
         call ord_sort( string_dummy, reverse = .true. )
 
@@ -546,7 +546,7 @@ subroutine test_char_sort( a, a_name, ltest )
             write(*,*) 'i = ', i
             write(*,'(a17, 2(1x,a4))') 'char_dummy(i-1:i) = ', char_dummy(i-1:i)
         end if
-   
+
     end subroutine test_char_sort
 
     subroutine test_string_sorts( ltest )
@@ -614,29 +614,46 @@ subroutine test_string_sort( a, a_name, ltest )
 
     end subroutine test_string_sort
 
-    subroutine test_int_sort_indexes( )
+    subroutine test_int_sort_indexes( ltest )
+        logical, intent(out) :: ltest
+
+        logical :: ldummy
 
-        call test_int_sort_index( blocks, "Blocks" )
-        call test_int_sort_index( decrease, "Decreasing" )
-        call test_int_sort_index( identical, "Identical" )
-        call test_int_sort_index( increase, "Increasing" )
-        call test_int_sort_index( rand1, "Random dense" )
-        call test_int_sort_index( rand2, "Random order" )
-        call test_int_sort_index( rand0, "Random sparse" )
-        call test_int_sort_index( rand3, "Random 3" )
-        call test_int_sort_index( rand10, "Random 10" )
+        ltest = .true.
+
+        call test_int_sort_index( blocks, "Blocks", ldummy )
+        ltest = (ltest .and. ldummy)
+        call test_int_sort_index( decrease, "Decreasing", ldummy )
+        ltest = (ltest .and. ldummy)
+        call test_int_sort_index( identical, "Identical", ldummy )
+        ltest = (ltest .and. ldummy)
+        call test_int_sort_index( increase, "Increasing", ldummy )
+        ltest = (ltest .and. ldummy)
+        call test_int_sort_index( rand1, "Random dense", ldummy )
+        ltest = (ltest .and. ldummy)
+        call test_int_sort_index( rand2, "Random order", ldummy )
+        ltest = (ltest .and. ldummy)
+        call test_int_sort_index( rand0, "Random sparse", ldummy )
+        ltest = (ltest .and. ldummy)
+        call test_int_sort_index( rand3, "Random 3", ldummy )
+        ltest = (ltest .and. ldummy)
+        call test_int_sort_index( rand10, "Random 10", ldummy )
+        ltest = (ltest .and. ldummy)
 
     end subroutine test_int_sort_indexes
 
-    subroutine test_int_sort_index( a, a_name )
+    subroutine test_int_sort_index( a, a_name, ltest )
         integer(int32), intent(inout) :: a(:)
         character(*), intent(in)      :: a_name
+        logical, intent(out)          :: ltest
 
         integer(int64)                 :: t0, t1, tdiff
         real(dp)                       :: rate
         integer(int64)                 :: i
         logical                        :: valid
 
+        ltest = .true.
+
         tdiff = 0
         do i = 1, repeat
             dummy = a
@@ -649,6 +666,7 @@ subroutine test_int_sort_index( a, a_name )
 
         dummy = a(index)
         call verify_sort( dummy, valid, i )
+        ltest = (ltest .and. valid)
         if ( .not. valid ) then
             write( *, * ) "SORT_INDEX did not sort " // a_name // "."
             write(*,*) 'i = ', i
@@ -662,6 +680,7 @@ subroutine test_int_sort_index( a, a_name )
         call sort_index( dummy, index, work, iwork, reverse=.true. )
         dummy = a(index)
         call verify_reverse_sort( dummy, valid, i )
+        ltest = (ltest .and. valid)
         if ( .not. valid ) then
             write( *, * ) "SORT_INDEX did not reverse sort " // &
                 a_name // "."
@@ -671,23 +690,34 @@ subroutine test_int_sort_index( a, a_name )
 
     end subroutine test_int_sort_index
 
-    subroutine test_char_sort_indexes( )
+    subroutine test_char_sort_indexes( ltest )
+        logical, intent(out) :: ltest
 
-        call test_char_sort_index( char_decrease, "Char. Decrease" )
-        call test_char_sort_index( char_increase, "Char. Increase" )
-        call test_char_sort_index( char_rand, "Char. Random" )
+        logical :: ldummy
+
+        ltest = .true.
+
+        call test_char_sort_index( char_decrease, "Char. Decrease", ldummy )
+        ltest = (ltest .and. ldummy)
+        call test_char_sort_index( char_increase, "Char. Increase", ldummy )
+        ltest = (ltest .and. ldummy)
+        call test_char_sort_index( char_rand, "Char. Random", ldummy )
+        ltest = (ltest .and. ldummy)
 
     end subroutine test_char_sort_indexes
 
-    subroutine test_char_sort_index( a, a_name )
+    subroutine test_char_sort_index( a, a_name, ltest )
         character(len=4), intent(in) :: a(0:)
         character(*), intent(in) :: a_name
+        logical, intent(out)     :: ltest
 
         integer(int64) :: t0, t1, tdiff
         real(dp)       :: rate
         integer(int64) :: i
         logical        :: valid
 
+        ltest = .true.
+
         tdiff = 0
         do i = 1, repeat
             char_dummy = a
@@ -699,6 +729,7 @@ subroutine test_char_sort_index( a, a_name )
         tdiff = tdiff/repeat
 
         call verify_char_sort( char_dummy, valid, i )
+        ltest = (ltest .and. valid)
         if ( .not. valid ) then
             write( *, * ) "SORT_INDEX did not sort " // a_name // "."
             write(*,*) 'i = ', i
@@ -710,23 +741,34 @@ subroutine test_char_sort_index( a, a_name )
 
     end subroutine test_char_sort_index
 
-    subroutine test_string_sort_indexes( )
+    subroutine test_string_sort_indexes( ltest )
+        logical, intent(out) :: ltest
+
+        logical :: ldummy
 
-        call test_string_sort_index( string_decrease, "String Decrease" )
-        call test_string_sort_index( string_increase, "String Increase" )
-        call test_string_sort_index( string_rand, "String Random" )
+        ltest = .true.
+
+        call test_string_sort_index( string_decrease, "String Decrease", ldummy )
+        ltest = (ltest .and. ldummy)
+        call test_string_sort_index( string_increase, "String Increase", ldummy )
+        ltest = (ltest .and. ldummy)
+        call test_string_sort_index( string_rand, "String Random", ldummy )
+        ltest = (ltest .and. ldummy)
 
     end subroutine test_string_sort_indexes
 
-    subroutine test_string_sort_index( a, a_name )
+    subroutine test_string_sort_index( a, a_name, ltest )
         type(string_type), intent(in) :: a(0:)
         character(*), intent(in) :: a_name
+        logical, intent(out) :: ltest
 
         integer(int64) :: t0, t1, tdiff
         real(dp)       :: rate
         integer(int64) :: i
         logical        :: valid
 
+        ltest = .true.
+
         tdiff = 0
         do i = 1, repeat
             string_dummy = a
@@ -738,6 +780,7 @@ subroutine test_string_sort_index( a, a_name )
         tdiff = tdiff/repeat
 
         call verify_string_sort( string_dummy, valid, i )
+        ltest = (ltest .and. valid)
         if ( .not. valid ) then
             write( *, * ) "SORT_INDEX did not sort " // a_name // "."
             write(*,*) 'i = ', i

From df150f2883f807b38bffd84fafa209f9fa22136d Mon Sep 17 00:00:00 2001
From: Jeremie Vandenplas <jeremie.vandenplas@gmail.com>
Date: Fri, 28 May 2021 22:01:08 +0200
Subject: [PATCH 37/45] sort_sign: remove module subroutines when inappropriate

---
 src/stdlib_sorting_ord_sort.fypp | 4 ++--
 src/stdlib_sorting_sort.fypp     | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/stdlib_sorting_ord_sort.fypp b/src/stdlib_sorting_ord_sort.fypp
index b7fae6d23..d43162320 100644
--- a/src/stdlib_sorting_ord_sort.fypp
+++ b/src/stdlib_sorting_ord_sort.fypp
@@ -84,7 +84,7 @@ contains
 #:for sname, signt, signoppt in SIGN_NAME_TYPE
 #:for k1, t1 in IRS_KINDS_TYPES
 
-    module subroutine ${k1}$_${sname}$_ord_sort( array, work )
+    subroutine ${k1}$_${sname}$_ord_sort( array, work )
 ! A translation to Fortran 2008, of the `"Rust" sort` algorithm found in
 ! `slice.rs`
 ! https://github.com/rust-lang/rust/blob/90eb44a5897c39e3dff9c7e48e3973671dcd9496/src/liballoc/slice.rs#L2159
@@ -443,7 +443,7 @@ contains
 
 
 #:for sname, signt, signoppt in SIGN_NAME_TYPE
-    module subroutine char_${sname}$_ord_sort( array, work )
+    subroutine char_${sname}$_ord_sort( array, work )
 ! A translation to Fortran 2008, of the `"Rust" sort` algorithm found in
 ! `slice.rs`
 ! https://github.com/rust-lang/rust/blob/90eb44a5897c39e3dff9c7e48e3973671dcd9496/src/liballoc/slice.rs#L2159
diff --git a/src/stdlib_sorting_sort.fypp b/src/stdlib_sorting_sort.fypp
index 3502eb750..8d042ad18 100644
--- a/src/stdlib_sorting_sort.fypp
+++ b/src/stdlib_sorting_sort.fypp
@@ -86,7 +86,7 @@ contains
 #:for sname, signt, signoppt in SIGN_NAME_TYPE
 #:for k1, t1 in IRS_KINDS_TYPES
 
-    pure module subroutine ${k1}$_${sname}$_sort( array )
+    pure subroutine ${k1}$_${sname}$_sort( array )
 ! `${k1}$_sort( array )` sorts the input `ARRAY` of type `${t1}$`
 ! using a hybrid sort based on the `introsort` of David Musser. As with
 ! `introsort`, `${k1}$_sort( array )` is an unstable hybrid comparison
@@ -274,7 +274,7 @@ contains
 
 
 #:for sname, signt, signoppt in SIGN_NAME_TYPE
-    pure module subroutine char_${sname}$_sort( array )
+    pure subroutine char_${sname}$_sort( array )
 ! `char_sort( array )` sorts the input `ARRAY` of type `CHARACTER(*)`
 ! using a hybrid sort based on the `introsort` of David Musser. As with
 ! `introsort`, `char_sort( array )` is an unstable hybrid comparison

From 41b8b8e21c0418d9cb88ab8bcadc2d9803e4caa4 Mon Sep 17 00:00:00 2001
From: Jeremie Vandenplas <jeremie.vandenplas@gmail.com>
Date: Fri, 28 May 2021 22:24:55 +0200
Subject: [PATCH 38/45] sort_sign: add init ltest

---
 src/tests/sorting/test_sorting.f90 | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/src/tests/sorting/test_sorting.f90 b/src/tests/sorting/test_sorting.f90
index 415f846f2..3ce0aae52 100644
--- a/src/tests/sorting/test_sorting.f90
+++ b/src/tests/sorting/test_sorting.f90
@@ -287,6 +287,8 @@ subroutine test_char_ord_sort( a, a_name, ltest )
         integer(int64) :: i
         logical        :: valid
 
+        ltest = .true.
+
         tdiff = 0
         do i = 1, repeat
             char_dummy = a
@@ -494,6 +496,8 @@ subroutine test_char_sorts( ltest )
 
         logical :: ldummy
 
+        ltest = .true.
+
         call test_char_sort( char_decrease, "Char. Decrease", ldummy )
         ltest = (ltest .and. ldummy)
 
@@ -515,6 +519,8 @@ subroutine test_char_sort( a, a_name, ltest )
         integer(int64) :: i
         logical        :: valid
 
+        ltest = .true.
+
         tdiff = 0
         do i = 1, repeat
             char_dummy = a

From 6c4d3e247074eeab9afad28ee5e9f46d36d2b905 Mon Sep 17 00:00:00 2001
From: Jeremie Vandenplas <jeremie.vandenplas@gmail.com>
Date: Sat, 29 May 2021 20:04:24 +0200
Subject: [PATCH 39/45] update specs

---
 doc/specs/stdlib_sorting.md | 72 +++++++++++++++++++------------------
 1 file changed, 38 insertions(+), 34 deletions(-)

diff --git a/doc/specs/stdlib_sorting.md b/doc/specs/stdlib_sorting.md
index 520de6ba6..265c57a95 100644
--- a/doc/specs/stdlib_sorting.md
+++ b/doc/specs/stdlib_sorting.md
@@ -55,12 +55,12 @@ data:
 
 The Fortran Standard Library is distributed under the MIT
 License. However components of the library may be based on code with
-additional licensing restriction. In particular `ORD_SORT`,
+additional licensing restrictions. In particular `ORD_SORT`,
 `SORT_INDEX`, and `SORT` are translations of codes with their
 own distribution restrictions.
 
 The `ORD_SORT` and `SORT_INDEX` subroutines are essentially
-translations to Fortran 2008 of the `"rust" sort` of the Rust Language
+translations to Fortran 2008 of the `"Rust" sort` of the Rust Language
 distributed as part of
 [`slice.rs`](https://github.com/rust-lang/rust/blob/90eb44a5897c39e3dff9c7e48e3973671dcd9496/src/liballoc/slice.rs).
 The header of the `slice.rs` file has as its licensing requirements:
@@ -75,7 +75,7 @@ The header of the `slice.rs` file has as its licensing requirements:
     option. This file may not be copied, modified, or distributed
     except according to those terms.
 
-so the license for the `slice.rs` code is compatible with the use of
+So the license for the `slice.rs` code is compatible with the use of
 modified versions of the code in the Fortran Standard Library under
 the MIT license.
 
@@ -108,17 +108,16 @@ performance than `SORT` on partially sorted data, having O(N)
 performance on uniformly increasing or decreasing data.
 
 
-`ORD_SORT` begins by traversing the array starting in its tail
-attempting to identify `runs` in the array, where a run is either a
-uniformly decreasing sequence, `ARRAY(i-1) > ARRAY(i)`, or a
-non-decreasing, `ARRAY(i-1) <= ARRAY(i)`, sequence. First delimitated
-decreasing sequences are reversed in their order. Then, if the
-sequence has less than `MIN_RUN` elements, previous elements in the
-array are added to the run using `insertion sort` until the run
-contains `MIN_RUN` elements or the array is completely processed. As
-each run is identified the start and length of the run 
-are then pushed onto a stack and the stack is then processed using
-`merge` until it obeys the stack invariants:
+When sorting in an increasing order, `ORD_SORT` begins by traversing the array
+starting in its tail attempting to identify `runs` in the array, where a run is
+either a uniformly decreasing sequence, `ARRAY(i-1) > ARRAY(i)`, or a
+non-decreasing, `ARRAY(i-1) <= ARRAY(i)`, sequence. First delimited decreasing
+sequences are reversed in their order. Then, if the sequence has less than
+`MIN_RUN` elements, previous elements in the array are added to the run using
+`insertion sort` until the run contains `MIN_RUN` elements or the array is
+completely processed. As each run is identified the start and length of the run
+are then pushed onto a stack and the stack is then processed using `merge` until
+it obeys the stack invariants:
 
 1. len(i-2) > len(i-1) + len(i)
 2. len(i-1) > len(i)
@@ -134,6 +133,12 @@ structured data. As a modified `merge sort`, `ORD_SORT` requires the
 use of a "scratch" array, that may be provided as an optional `work`
 argument or allocated internally on the stack.
 
+Arrays can be also sorted in a decreasing order by providing the argument `reverse
+= .true.`.
+
+**QUESTION: is the `reverse` mode still a stable comparison algorithm?**
+
+
 #### The `SORT_INDEX` subroutine
 
 The `SORT` and `ORD_SORT` subroutines can sort rank 1 isolated
@@ -205,11 +210,11 @@ Experimental
 ##### Description
 
 Returns an input `array` with the elements sorted in order of
-increasing value.
+increasing, or decreasing, value.
 
 ##### Syntax
 
-`call [[stdlib_sorting(module):ord_sort(subroutine)]]ord_sort ( array[, work ] )`
+`call [[stdlib_sorting(module):ord_sort(subroutine)]]ord_sort ( array[, work, reverse ] )`
 
 ##### Class
 
@@ -233,6 +238,12 @@ memory for internal record keeping. If associated with an array in
 static storage, its use can significantly reduce the stack memory
 requirements for the code. Its contents on return are undefined.
 
+`reverse` (optional): shall be a scalar of type default logical. It
+is an `intent(in)` argument. If present with a value of `.true.` then
+`array` will be sorted in order of non-increasing values in stable
+order **(stable order: is it still true?)**. Otherwise index will sort `array` in order of non-decreasing
+values in stable order.
+
 ##### Notes
 
 `ORD_SORT` implements a hybrid sorting algorithm combining
@@ -246,26 +257,12 @@ non-decreasing arrays. The optional `work` array replaces "scratch"
 memory that would otherwise be allocated on the stack. If `array` is of
 any type `REAL` the order of its elements on return undefined if any
 element of `array` is a `NaN`.  Sorting of `CHARACTER(*)` and
-`STRING_TYPE` arrays are based on the operator `>`, and not on the
+`STRING_TYPE` arrays are based on the operators `>` and `<`, and not on the
 function `LGT`.
 
 
 ##### Example
 
-```fortran
-    ...
-    ! Read arrays from sorted files
-    call read_sorted_file( 'dummy_file1', array1 )
-    call read_sorted_file( 'dummy_file2', array2 )
-    ! Concatenate the arrays
-	array = [ array1, array2 ]
-    ! Sort the resulting array
-    call ord_sort( array, work )
-    ! Process the sorted array
-    call array_search( array, values )
-    ...
-```
-
 ```fortran
     program demo_ord_sort
       use stdlib_sorting, only: ord_sort
@@ -287,12 +284,12 @@ Experimental
 
 ##### Description
 
-Returns an input array with the elements sorted in order of increasing
-value.
+Returns an input array with the elements sorted in order of increasing, or
+decreasing, value.
 
 ##### Syntax
 
-`call [[stdlib_sorting(module):sort(subroutine)]]sort ( array )`
+`call [[stdlib_sorting(module):sort(subroutine)]]sort ( array, reverse )`
 
 ##### Class
 
@@ -306,6 +303,13 @@ Pure generic subroutine.
 `type(string_type)`. It is an `intent(inout)` argument. On return its
 input elements will be sorted in order of non-decreasing value.
 
+
+`reverse` (optional): shall be a scalar of type default logical. It
+is an `intent(in)` argument. If present with a value of `.true.` then
+`array` will be sorted in order of non-increasing values in unstable
+order. Otherwise index will sort `array` in order of non-decreasing
+values in unstable order.
+
 ##### Notes
 
 `SORT` implements a hybrid sorting algorithm combining

From 18ad97908e11e71ca32a2efb8e57028d05a587ec Mon Sep 17 00:00:00 2001
From: Jeremie Vandenplas <jeremie.vandenplas@gmail.com>
Date: Sun, 30 May 2021 20:28:23 +0200
Subject: [PATCH 40/45] sort_sign: removed bold sentences

---
 doc/specs/stdlib_sorting.md | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/doc/specs/stdlib_sorting.md b/doc/specs/stdlib_sorting.md
index 265c57a95..158e72486 100644
--- a/doc/specs/stdlib_sorting.md
+++ b/doc/specs/stdlib_sorting.md
@@ -136,9 +136,6 @@ argument or allocated internally on the stack.
 Arrays can be also sorted in a decreasing order by providing the argument `reverse
 = .true.`.
 
-**QUESTION: is the `reverse` mode still a stable comparison algorithm?**
-
-
 #### The `SORT_INDEX` subroutine
 
 The `SORT` and `ORD_SORT` subroutines can sort rank 1 isolated
@@ -241,7 +238,7 @@ requirements for the code. Its contents on return are undefined.
 `reverse` (optional): shall be a scalar of type default logical. It
 is an `intent(in)` argument. If present with a value of `.true.` then
 `array` will be sorted in order of non-increasing values in stable
-order **(stable order: is it still true?)**. Otherwise index will sort `array` in order of non-decreasing
+order. Otherwise index will sort `array` in order of non-decreasing
 values in stable order.
 
 ##### Notes

From 933837aca9cfaaf7121c02ef9c6e0fcc0f5c961a Mon Sep 17 00:00:00 2001
From: William Clodius <w.clodius@icloud.com>
Date: Sun, 30 May 2021 20:04:38 -0600
Subject: [PATCH 41/45] Corrected doc/specs/incex,md

Changed `[sorting](.stdlib_sorting.md)` to `[sorting](./stdlib_sorting.md)`

[ticket: X]
---
 doc/specs/index.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/doc/specs/index.md b/doc/specs/index.md
index 0883ea07c..790450aa5 100644
--- a/doc/specs/index.md
+++ b/doc/specs/index.md
@@ -19,7 +19,7 @@ This is and index/directory of the specifications (specs) for each new module/fe
  - [logger](./stdlib_logger.html) - Runtime logging system
  - [optval](./stdlib_optval.html) - Fallback value for optional arguments
  - [quadrature](./stdlib_quadrature.html) - Numerical integration
- - [sorting](.stdlib_sorting.html) - Sorting of rank one arrays
+ - [sorting](./stdlib_sorting.html) - Sorting of rank one arrays
  - [stats](./stdlib_stats.html) - Descriptive Statistics
  - [stats_distribution_PRNG](./stdlib_stats_distribution_PRNG.html) - Probability Distributions random number generator
  - [string\_type](./stdlib_string_type.html) - Basic string support

From de03c1da57301ecc0f4347027db91fed1f8db40c Mon Sep 17 00:00:00 2001
From: Jeremie Vandenplas <jeremie.vandenplas@gmail.com>
Date: Mon, 31 May 2021 09:12:39 +0200
Subject: [PATCH 42/45] update of the comments in source code

---
 doc/specs/stdlib_sorting.md  |  2 +-
 src/stdlib_sorting.fypp      | 31 ++++++++++++++++++++++---------
 src/stdlib_sorting_sort.fypp | 12 ++++++------
 3 files changed, 29 insertions(+), 16 deletions(-)

diff --git a/doc/specs/stdlib_sorting.md b/doc/specs/stdlib_sorting.md
index 158e72486..912b359fd 100644
--- a/doc/specs/stdlib_sorting.md
+++ b/doc/specs/stdlib_sorting.md
@@ -286,7 +286,7 @@ decreasing, value.
 
 ##### Syntax
 
-`call [[stdlib_sorting(module):sort(subroutine)]]sort ( array, reverse )`
+`call [[stdlib_sorting(module):sort(subroutine)]]sort ( array[, reverse] )`
 
 ##### Class
 
diff --git a/src/stdlib_sorting.fypp b/src/stdlib_sorting.fypp
index 9658021fe..b53008f2d 100644
--- a/src/stdlib_sorting.fypp
+++ b/src/stdlib_sorting.fypp
@@ -139,10 +139,10 @@ module stdlib_sorting
 !! Version: experimental
 !!
 !! The generic subroutine implementing the `ORD_SORT` algorithm to return
-!! an input array with its elements sorted in order of non-decreasing
+!! an input array with its elements sorted in order of (non-)decreasing
 !! value. Its use has the syntax:
 !!
-!!     call ord_sort( array[, work] )
+!!     call ord_sort( array[, work, reverse] )
 !!
 !! with the arguments:
 !!
@@ -161,6 +161,12 @@ module stdlib_sorting
 !!   storage, its use can significantly reduce the stack memory requirements
 !!   for the code. Its value on return is undefined.
 !!
+!! * `reverse` (optional): shall be a scalar of type default logical. It
+!!   is an `intent(in)` argument. If present with a value of `.true.` then
+!!   `array` will be sorted in order of non-increasing values in stable
+!!   order. Otherwise index will sort `array` in order of non-decreasing
+!!   values in stable order.
+!!
 !!#### Example
 !!
 !!```fortran
@@ -183,10 +189,10 @@ module stdlib_sorting
 !! Version: experimental
 !!
 !! The generic subroutine implementing the `SORT` algorithm to return
-!! an input array with its elements sorted in order of non-decreasing
+!! an input array with its elements sorted in order of (non-)decreasing
 !! value. Its use has the syntax:
 !!
-!!     call sort( array )
+!!     call sort( array[, reverse] )
 !!
 !! with the arguments:
 !!
@@ -197,6 +203,11 @@ module stdlib_sorting
 !!   real and at least one of the elements is a `NaN`, then the ordering
 !!   of the result is undefined. Otherwise it is defined to be the
 !!   original elements in non-decreasing order.
+!! * `reverse` (optional): shall be a scalar of type default logical. It
+!!   is an `intent(in)` argument. If present with a value of `.true.` then
+!!   `array` will be sorted in order of non-increasing values in unstable
+!!   order. Otherwise index will sort `array` in order of non-decreasing
+!!   values in unstable order.
 !!
 !!#### Example
 !!
@@ -351,7 +362,7 @@ module stdlib_sorting
         module subroutine char_ord_sort( array, work, reverse )
 !! Version: experimental
 !!
-!! `char_ord_sort( array )` sorts the input `ARRAY` of type `CHARACTER(*)`
+!! `char_ord_sort( array[, work, reverse] )` sorts the input `ARRAY` of type `CHARACTER(*)`
 !! using a hybrid sort based on the `'Rust" sort` algorithm found in `slice.rs`
             character(len=*), intent(inout)                  :: array(0:)
             character(len=len(array)), intent(out), optional :: work(0:)
@@ -370,7 +381,7 @@ module stdlib_sorting
         pure module subroutine ${k1}$_sort( array, reverse )
 !! Version: experimental
 !!
-!! `${k1}$_sort( array )` sorts the input `ARRAY` of type `${t1}$`
+!! `${k1}$_sort( array[, reverse] )` sorts the input `ARRAY` of type `${t1}$`
 !! using a hybrid sort based on the `introsort` of David Musser.
 !! The algorithm is of order O(N Ln(N)) for all inputs.
 !! Because it relies on `quicksort`, the coefficient of the O(N Ln(N))
@@ -384,7 +395,7 @@ module stdlib_sorting
         pure module subroutine char_sort( array, reverse )
 !! Version: experimental
 !!
-!! `char_sort( array )` sorts the input `ARRAY` of type `CHARACTER(*)`
+!! `char_sort( array[, reverse] )` sorts the input `ARRAY` of type `CHARACTER(*)`
 !! using a hybrid sort based on the `introsort` of David Musser.
 !! The algorithm is of order O(N Ln(N)) for all inputs.
 !! Because it relies on `quicksort`, the coefficient of the O(N Ln(N))
@@ -411,7 +422,8 @@ module stdlib_sorting
             reverse )
 !! Version: experimental
 !!
-!! `${k1}$_sort_index( array )` sorts an input `ARRAY` of type `${t1}$`
+!! `${k1}$_sort_index( array, index[, work, iwork, reverse] )` sorts
+!! an input `ARRAY` of type `${t1}$`
 !! using a hybrid sort based on the `'Rust" sort` algorithm found in `slice.rs`
 !! and returns the sorted `ARRAY` and an array `INDEX of indices in the
 !! order that would sort the input `ARRAY` in the desired direction.
@@ -428,7 +440,8 @@ module stdlib_sorting
             reverse )
 !! Version: experimental
 !!
-!! `char_sort_index( array )` sorts an input `ARRAY` of type `CHARACTER(*)`
+!! `char_sort_index( array, index[, work, iwork, reverse] )` sorts
+!! an input `ARRAY` of type `CHARACTER(*)`
 !! using a hybrid sort based on the `'Rust" sort` algorithm found in `slice.rs`
 !! and returns the sorted `ARRAY` and an array `INDEX of indices in the
 !! order that would sort the input `ARRAY` in the desired direction.
diff --git a/src/stdlib_sorting_sort.fypp b/src/stdlib_sorting_sort.fypp
index 8d042ad18..69ca16bea 100644
--- a/src/stdlib_sorting_sort.fypp
+++ b/src/stdlib_sorting_sort.fypp
@@ -87,11 +87,11 @@ contains
 #:for k1, t1 in IRS_KINDS_TYPES
 
     pure subroutine ${k1}$_${sname}$_sort( array )
-! `${k1}$_sort( array )` sorts the input `ARRAY` of type `${t1}$`
+! `${k1}$_${name}$_sort( array )` sorts the input `ARRAY` of type `${t1}$`
 ! using a hybrid sort based on the `introsort` of David Musser. As with
-! `introsort`, `${k1}$_sort( array )` is an unstable hybrid comparison
+! `introsort`, `${k1}$_${name}$_sort( array )` is an unstable hybrid comparison
 ! algorithm using `quicksort` for the main body of the sort tree,
-! supplemented by `insertion sort` for the outer brances, but if
+! supplemented by `insertion sort` for the outer branches, but if
 ! `quicksort` is converging too slowly the algorithm resorts
 ! to `heapsort`. The algorithm is of order O(N Ln(N)) for all inputs.
 ! Because it relies on `quicksort`, the coefficient of the O(N Ln(N))
@@ -275,11 +275,11 @@ contains
 
 #:for sname, signt, signoppt in SIGN_NAME_TYPE
     pure subroutine char_${sname}$_sort( array )
-! `char_sort( array )` sorts the input `ARRAY` of type `CHARACTER(*)`
+! `char_${name}$_sort( array )` sorts the input `ARRAY` of type `CHARACTER(*)`
 ! using a hybrid sort based on the `introsort` of David Musser. As with
-! `introsort`, `char_sort( array )` is an unstable hybrid comparison
+! `introsort`, `char_${name}$_sort( array )` is an unstable hybrid comparison
 ! algorithm using `quicksort` for the main body of the sort tree,
-! supplemented by `insertion sort` for the outer brances, but if
+! supplemented by `insertion sort` for the outer branches, but if
 ! `quicksort` is converging too slowly the algorithm resorts
 ! to `heapsort`. The algorithm is of order O(N Ln(N)) for all inputs.
 ! Because it relies on `quicksort`, the coefficient of the O(N Ln(N))

From 720a0cdcad336aa36b4fdb33b0af2141d22cfaec Mon Sep 17 00:00:00 2001
From: Jeremie Vandenplas <jeremie.vandenplas@gmail.com>
Date: Mon, 31 May 2021 09:16:27 +0200
Subject: [PATCH 43/45] fixed typo

---
 src/stdlib_sorting_sort.fypp | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/stdlib_sorting_sort.fypp b/src/stdlib_sorting_sort.fypp
index 69ca16bea..7e5bd2ee4 100644
--- a/src/stdlib_sorting_sort.fypp
+++ b/src/stdlib_sorting_sort.fypp
@@ -87,9 +87,9 @@ contains
 #:for k1, t1 in IRS_KINDS_TYPES
 
     pure subroutine ${k1}$_${sname}$_sort( array )
-! `${k1}$_${name}$_sort( array )` sorts the input `ARRAY` of type `${t1}$`
+! `${k1}$_${sname}$_sort( array )` sorts the input `ARRAY` of type `${t1}$`
 ! using a hybrid sort based on the `introsort` of David Musser. As with
-! `introsort`, `${k1}$_${name}$_sort( array )` is an unstable hybrid comparison
+! `introsort`, `${k1}$_${sname}$_sort( array )` is an unstable hybrid comparison
 ! algorithm using `quicksort` for the main body of the sort tree,
 ! supplemented by `insertion sort` for the outer branches, but if
 ! `quicksort` is converging too slowly the algorithm resorts
@@ -275,9 +275,9 @@ contains
 
 #:for sname, signt, signoppt in SIGN_NAME_TYPE
     pure subroutine char_${sname}$_sort( array )
-! `char_${name}$_sort( array )` sorts the input `ARRAY` of type `CHARACTER(*)`
+! `char_${sname}$_sort( array )` sorts the input `ARRAY` of type `CHARACTER(*)`
 ! using a hybrid sort based on the `introsort` of David Musser. As with
-! `introsort`, `char_${name}$_sort( array )` is an unstable hybrid comparison
+! `introsort`, `char_${sname}$_sort( array )` is an unstable hybrid comparison
 ! algorithm using `quicksort` for the main body of the sort tree,
 ! supplemented by `insertion sort` for the outer branches, but if
 ! `quicksort` is converging too slowly the algorithm resorts

From 8fc4f67d98d180bb8395ec0555e37b461b8c6290 Mon Sep 17 00:00:00 2001
From: Jeremie Vandenplas <jeremie.vandenplas@gmail.com>
Date: Tue, 1 Jun 2021 07:49:54 +0200
Subject: [PATCH 44/45] add checks

---
 src/stdlib_sorting_ord_sort.fypp   | 5 ++++-
 src/stdlib_sorting_sort_index.fypp | 9 +++++++++
 2 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/src/stdlib_sorting_ord_sort.fypp b/src/stdlib_sorting_ord_sort.fypp
index d43162320..c433a66ec 100644
--- a/src/stdlib_sorting_ord_sort.fypp
+++ b/src/stdlib_sorting_ord_sort.fypp
@@ -111,12 +111,15 @@ contains
         integer(int_size) :: array_size
         integer :: stat
 
+        array_size = size( array, kind=int_size )
         if ( present(work) ) then
+            if ( size( work, kind=int_size) < array_size/2 ) then
+                error stop "${k1}$_${sname}$_ord_sort: work array too small."
+            endif
 ! Use the work array as scratch memory
             call merge_sort( array, work )
         else
 ! Allocate a buffer to use as scratch memory.
-            array_size = size( array, kind=int_size )
             allocate( buf(0:array_size/2-1), stat=stat )
             if ( stat /= 0 ) error stop "${k1}$_${sname}$_ord_sort: Allocation of buffer failed."
             call merge_sort( array, buf )
diff --git a/src/stdlib_sorting_sort_index.fypp b/src/stdlib_sorting_sort_index.fypp
index e91935972..9988b9f3a 100644
--- a/src/stdlib_sorting_sort_index.fypp
+++ b/src/stdlib_sorting_sort_index.fypp
@@ -107,7 +107,13 @@ contains
 
 ! If necessary allocate buffers to serve as scratch memory.
         if ( present(work) ) then
+            if ( size(work, kind=int_size) < array_size/2 ) then
+                error stop "work array too small."
+            end if
             if ( present(iwork) ) then
+                if ( size(iwork, kind=int_size) < array_size/2 ) then
+                    error stop "work array too small."
+                endif
                 call merge_sort( array, index, work, iwork )
             else
                 allocate( ibuf(0:array_size/2-1), stat=stat )
@@ -118,6 +124,9 @@ contains
             allocate( buf(0:array_size/2-1), stat=stat )
             if ( stat /= 0 ) error stop "Allocation of array buffer failed."
             if ( present(iwork) ) then
+                if ( size(iwork, kind=int_size) < array_size/2 ) then
+                    error stop "work array too small."
+                endif
                 call merge_sort( array, index, buf, iwork )
             else
                 allocate( ibuf(0:array_size/2-1), stat=stat )

From 25876b75a2a989ac56ac51c0814a36f007062bbf Mon Sep 17 00:00:00 2001
From: Jeremie Vandenplas <jeremie.vandenplas@gmail.com>
Date: Tue, 1 Jun 2021 21:59:32 +0200
Subject: [PATCH 45/45] changes following wclodius2

---
 src/stdlib_sorting_ord_sort.fypp   | 2 +-
 src/stdlib_sorting_sort_index.fypp | 6 +++---
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/stdlib_sorting_ord_sort.fypp b/src/stdlib_sorting_ord_sort.fypp
index c433a66ec..2de070298 100644
--- a/src/stdlib_sorting_ord_sort.fypp
+++ b/src/stdlib_sorting_ord_sort.fypp
@@ -114,7 +114,7 @@ contains
         array_size = size( array, kind=int_size )
         if ( present(work) ) then
             if ( size( work, kind=int_size) < array_size/2 ) then
-                error stop "${k1}$_${sname}$_ord_sort: work array too small."
+                error stop "${k1}$_${sname}$_ord_sort: work array is too small."
             endif
 ! Use the work array as scratch memory
             call merge_sort( array, work )
diff --git a/src/stdlib_sorting_sort_index.fypp b/src/stdlib_sorting_sort_index.fypp
index 9988b9f3a..96aff4cd2 100644
--- a/src/stdlib_sorting_sort_index.fypp
+++ b/src/stdlib_sorting_sort_index.fypp
@@ -108,11 +108,11 @@ contains
 ! If necessary allocate buffers to serve as scratch memory.
         if ( present(work) ) then
             if ( size(work, kind=int_size) < array_size/2 ) then
-                error stop "work array too small."
+                error stop "work array is too small."
             end if
             if ( present(iwork) ) then
                 if ( size(iwork, kind=int_size) < array_size/2 ) then
-                    error stop "work array too small."
+                    error stop "iwork array is too small."
                 endif
                 call merge_sort( array, index, work, iwork )
             else
@@ -125,7 +125,7 @@ contains
             if ( stat /= 0 ) error stop "Allocation of array buffer failed."
             if ( present(iwork) ) then
                 if ( size(iwork, kind=int_size) < array_size/2 ) then
-                    error stop "work array too small."
+                    error stop "iwork array is too small."
                 endif
                 call merge_sort( array, index, buf, iwork )
             else