-
Notifications
You must be signed in to change notification settings - Fork 168
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Greatly improve performance of sorting dictionaries #5168
Conversation
#5780 might make this unnecessary as it significantly changes the performance characteristics of dictionaries. If this is still a meaningful gain then there's probably no downside; the increased scratch memory usage really isn't signficant. |
Still seems worth doing:
|
3f23468
to
df83018
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great! Let's do it.
Dictionary lookup by index is slow, so read all of the keys/values from the dictionary into a vector and sort that instead. Using a dictionary iterator to read the values replaces O(N log N) ClusterTree lookups with O(log N), so this is asymptotically faster. The benchmark is sped up from 3.77 seconds to 21.62 milliseconds, but more reasonably sized Dictionaries will see proportionally smaller benefits. The downside of this is that the array of Mixed requires 24 bytes of scratch space per element in the dictionary. We already require 8 bytes per element to store the results, so this is just a constant factor increase rather than an aysmtotic change in memory use and it's probably not a problem.
df83018
to
364bc5d
Compare
I noticed in #5163 that dictionary sorting is rather slow and took a stab at improving it.
Dictionary lookup by index is slow, so read all of the keys/values from the dictionary into a vector and sort that instead. Using a dictionary iterator to read the values replaces O(N log N) ClusterTree lookups with O(log N), so this is asymptotically faster. The benchmark is sped up from 3.77 seconds to 21.62 milliseconds, but more reasonably sized Dictionaries will see proportionally smaller benefits.
The downside of this is that the array of Mixed requires 24 bytes of scratch space per element in the dictionary. We already require 8 bytes per element to store the results, so this is just a constant factor increase rather than an asymptotic change in memory use and it's probably not a problem.