Cap small strs groupby optimization at 8 bytes #2886

Closed
stress-tess opened this issue Dec 14, 2023 · 0 comments · Fixed by #2887
@stress-tess
Cap strings for the small str optimization at 8 bytes. There was a drop-off in our str groupby benchmarks.

This is because our small str optimization is currently set to kick in for strings of up to 16 bytes (which is how long the strings are in the multi-col str groupby benchmark), but 16 bytes (128 bits) is the maximum number of bits we can have before we hit the totalDigits > 8 case, where we hash everything anyway. So these strings all incur extra processing just to end up doing what we were already doing.
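
The reasoning above can be sketched as a toy model in Python. This is not Arkouda's server code: the names `total_digits` and `choose_path` are hypothetical, and the 16-bit digit width is an assumption inferred from the statement that 16 bytes sits right at the totalDigits > 8 boundary. Only the threshold of 8 digits comes from the issue itself.

```python
# Toy model of the path selection described in this issue.
# ASSUMPTION: 16-bit digits, inferred from "16 bytes" mapping to 8 digits.
BITS_PER_DIGIT = 16
# From the issue: when totalDigits > 8, everything is hashed anyway.
MAX_PACKED_DIGITS = 8


def total_digits(col_byte_widths):
    """Digits needed to pack one fixed-width key per string column."""
    digits = 0
    for width in col_byte_widths:
        bits = width * 8
        digits += -(-bits // BITS_PER_DIGIT)  # ceiling division
    return digits


def choose_path(col_byte_widths, small_str_cap):
    """Return the (hypothetical) path a groupby takes in this model."""
    if any(width > small_str_cap for width in col_byte_widths):
        # Strings exceed the cap: skip the small str optimization entirely.
        return "hash"
    if total_digits(col_byte_widths) > MAX_PACKED_DIGITS:
        # Small str work was done, but the keys are too wide, so we hash anyway.
        return "pack-then-hash"
    return "pack"


if __name__ == "__main__":
    two_cols_16_bytes = [16, 16]
    print(choose_path(two_cols_16_bytes, small_str_cap=16))
    print(choose_path(two_cols_16_bytes, small_str_cap=8))
```

In this model, two 16-byte columns with a 16-byte cap take the wasteful "pack-then-hash" path, while an 8-byte cap sends them straight to hashing, which matches the benchmark regression described above.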

@stress-tess stress-tess self-assigned this Dec 14, 2023
@stress-tess stress-tess changed the title from "cap small_strs at 8 bytes" to "Cap small strs groupby optimization at 8 bytes" Dec 14, 2023
stress-tess pushed a commit to stress-tess/arkouda that referenced this issue Dec 14, 2023
This PR (closes Bears-R-Us#2886) caps strings for the small str optimization at 8 bytes, since there was a drop-off in our str groupby benchmarks.

This is because our small str optimization is currently set to kick in for strings of up to 16 bytes (which is how long the strings are in the multi-col str groupby benchmark), but 16 bytes (128 bits) is the maximum number of bits we can have before we hit the totalDigits > 8 case, where we hash everything anyway. So these strings all incur extra processing just to end up doing what we were already doing.
github-merge-queue bot pushed a commit that referenced this issue Dec 18, 2023
* Closes #2886: Cap small strs groupby optimization

This PR (closes #2886) caps strings for the small str optimization at 8 bytes, since there was a drop-off in our str groupby benchmarks.

This is because our small str optimization is currently set to kick in for strings of up to 16 bytes (which is how long the strings are in the multi-col str groupby benchmark), but 16 bytes (128 bits) is the maximum number of bits we can have before we hit the totalDigits > 8 case, where we hash everything anyway. So these strings all incur extra processing just to end up doing what we were already doing.

* putting back in 1 col shortcut

---------

Co-authored-by: Pierce Hayes