Optimization: std.sort should only evaluate keyF once per array element #245
This PR improves the performance of `std.sort` and related functions when `keyF` is used.

The existing implementation evaluates `keyF` multiple times per input element because:

- It calls `keyF` on all elements to check whether all keys are of the same type.
- It calls `keyF` on every pair of compared elements.

In the best case (an already-sorted array), this performs ~3x as many evaluations as needed because each element participates in up to two extra unnecessary comparisons. In the worst case, we have to do additional comparisons during sorting and the unnecessary work will be even higher.
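To make the cost concrete, here is a small counting sketch. It is illustrative Python, not the project's actual code: `count_key_calls_old_style` is a hypothetical helper that mimics the two passes described above (a type-check pass, then a comparator that re-evaluates the key on both operands). Any comparison sort needs at least n - 1 comparisons on an already-sorted input, so this pattern costs at least n + 2(n - 1) = 3n - 2 key evaluations where n would suffice.

```python
from functools import cmp_to_key


def count_key_calls_old_style(items, key_f):
    """Sort `items` the 'old' way and return (sorted_items, key_call_count)."""
    calls = 0

    def counted_key(x):
        nonlocal calls
        calls += 1
        return key_f(x)

    # Pass 1: evaluate the key on every element just to check the key types.
    key_types = {type(counted_key(x)) for x in items}
    if len(key_types) > 1:
        raise TypeError("Cannot sort with key values that are not all the same type")

    # Pass 2: the comparator re-evaluates the key on both sides of every comparison.
    def compare(a, b):
        ka, kb = counted_key(a), counted_key(b)
        return (ka > kb) - (ka < kb)

    return sorted(items, key=cmp_to_key(compare)), calls


n = 1000
already_sorted = list(range(n))
result, calls = count_key_calls_old_style(already_sorted, lambda x: x)
print(f"{calls} key evaluations for {n} elements")
```

On inputs that are not already sorted, the comparator runs more often, so the gap between the counted evaluations and n grows.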
The fix
The fix:
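As a rough illustration of the idea in the title (evaluate the key once per element, then sort on the cached keys), the following sketch assumes a decorate-sort-undecorate approach. It is illustrative Python rather than the PR's own code; `sort_by_key_once` and `counted_len` are hypothetical names, and checking each key against the first key's type is an assumption about how the type check could be done without extra key evaluations.

```python
def sort_by_key_once(items, key_f):
    """Sort `items` by `key_f`, calling `key_f` exactly once per element."""
    # Evaluate every key up front; nothing below calls key_f again.
    keys = [key_f(x) for x in items]

    # Type check on the cached keys: compare each one against the first key's type.
    if keys:
        first_type = type(keys[0])
        if any(type(k) is not first_type for k in keys[1:]):
            raise TypeError("Cannot sort with key values that are not all the same type")

    # Sort element indices by their cached key, then map back to the elements.
    order = sorted(range(len(items)), key=keys.__getitem__)
    return [items[i] for i in order]


# Usage: record every key evaluation to confirm there is one per element.
call_log = []


def counted_len(s):
    call_log.append(s)
    return len(s)


words = ["pear", "fig", "banana", "kiwi"]
sorted_words = sort_by_key_once(words, counted_len)
print(sorted_words, len(call_log))
```

Because the sort operates on cached keys, the number of key evaluations stays at n regardless of how many comparisons the sort performs.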
I also made a few other small improvements:

- Fixed sorting by boolean keys, which failed with a `"Cannot sort with key values that are not all the same type"` error because `Val.True` and `Val.False` are different classes. The existing code, which did class equality checks against `Val.Bool`, would never match because `Val.Bool` is an abstract class.
- Removed the `keyTypes` set: we can simply check that all other elements match the first element's type. This saves some garbage allocations in the common case.
- Moved the `.force` calls earlier so that we don't have to call them in the sort comparator.

Benchmarking results
Consider the following toy benchmark case:
With the `RunProfiler` we can see an enormous difference in the number of `std.toString` invocations via the key function: with the standard 5 benchmark runs, we expect to see only 50,000 hits, but the old code ran it 241,210 times!

I also measured performance on one of our real-world jsonnet bundles, where this PR's optimization cut one expensive target's runtime by 25%.