Left join large memory usage regression #18106
Comments
Other things of note
OK. Can confirm that we see this performance regression with the latest dev version of polars.
@corwinjoy the plot for 0.19.19 shows |
@CHDev93 The same script was being run in both cases, but I agree that the time for The mprof command outputs (with sleep removed using 0.19.19):
Yes, we now go into the row-encoding for the group-by. The new algorithm is faster when the data no longer fits in your cache. You must increase the dataset size (depending on the beefiness of your machine) to see the result. Latest Polars is 1.3-1.5x faster for me, but does indeed require more memory (1.5). The old code had to go, though. The memory requirements will improve with the new streaming engine and with fixed-size row-encoding, which we plan to add as well. #19929 will also improve this if we land it. In any case, it isn't a bug but the cost of our new algorithm. We have to be able to remove old code branches if they hurt us, and sometimes this comes with a different memory footprint.
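To make the trade-off in the comment above concrete, here is a minimal pure-Python sketch of the general idea behind row encoding a compound key: the three integer columns are packed into one fixed-width byte string per row, which is then used as the hash key. This is an illustration only and not Polars' actual implementation; the names `encode_key` and `left_join` are hypothetical, and the keys are assumed to fit in int64.

```python
import struct

# Adding 2**63 maps the int64 range onto the unsigned 64-bit range,
# so big-endian bytes compare in the same order as the original integers.
SIGN_FLIP = 1 << 63

def encode_key(a: int, b: int, c: int) -> bytes:
    """Pack three int64 values into one 24-byte, order-preserving key."""
    return struct.pack(">QQQ", a + SIGN_FLIP, b + SIGN_FLIP, c + SIGN_FLIP)

def left_join(left, right):
    """Left-join rows shaped (a, b, c, payload) on the encoded compound key."""
    # Build side: one encoded key per right row (this is the extra memory).
    index = {}
    for a, b, c, payload in right:
        index.setdefault(encode_key(a, b, c), []).append(payload)
    # Probe side: unmatched left rows are kept with a None payload.
    out = []
    for a, b, c, payload in left:
        for match in index.get(encode_key(a, b, c), [None]):
            out.append((a, b, c, payload, match))
    return out

left = [(1, 2, 3, "x"), (-1, 0, 5, "y")]
right = [(1, 2, 3, "r1"), (1, 2, 3, "r2")]
joined = left_join(left, right)
```

In this sketch every row carries an additional 24-byte encoded key on top of its original columns, which is one plausible way a key-encoding step can raise peak memory even while making hashing and comparison cheaper.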
Checks
Reproducible example
I installed memo
Log output
Issue description
I'm doing a join of two tables on a compound key of [int, int, int]. In newer versions of polars (including polars==1.4.1) it uses much more memory than I'd expect. I confirmed this by rolling back to polars==0.19.19, which used significantly less memory.
Expected behavior
I'd expect a left join for this problem to use something like 2x the space of the left table, as it did in 0.19.19. I ran the script with Python's memory-profiler package and used the commands below.
Running:
mprof run --python -o polars_141_small.prof -M --include-children python polars_join_bug_mwe.py
Plotting:
mprof plot polars_141_small.prof --title polars_1_4_1 -o polars_141_small.png -w 0,12
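As a lightweight cross-check on the mprof numbers, peak memory can also be read from inside the script with the standard-library `resource` module (Unix only). The following is a sketch under stated assumptions: `peak_rss_bytes` is a hypothetical helper, the `bytearray` allocation is only a stand-in for the join under test, and unlike `--include-children` this measures the current process only.

```python
import resource
import sys

def peak_rss_bytes() -> int:
    """Peak resident set size of this process so far, in bytes."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    return peak if sys.platform == "darwin" else peak * 1024

before = peak_rss_bytes()
buffer = bytearray(50 * 1024 * 1024)  # stand-in for the join under test
after = peak_rss_bytes()
print(f"peak RSS grew by {(after - before) / 2**20:.1f} MiB")
```

Running the same script under each installed Polars version and comparing the reported growth gives a single number per version, which is easier to quote in an issue than a plot.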
Installed versions