Fix memory growth bug in read_csv #24837

Merged

gfyoung
Member

@gfyoung gfyoung commented Jan 19, 2019

The edge case where we hit powers of 2 every time during allocation can be painful.

Closes #24805.

xref #23527.
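
One way to picture the "powers of 2" edge case is with a doubling buffer. The sketch below is a simplified model, not the pandas tokenizer code: `grow_capacity`, `simulate`, the `slack` parameter, and the starting capacity of 16 are all hypothetical names and values chosen for illustration. It assumes a buffer whose capacity doubles whenever the claimed length plus the requested space reaches the current capacity.

```python
from typing import List


def grow_capacity(length: int, needed: int, capacity: int) -> int:
    """Double capacity until length + needed fits strictly below it."""
    while length + needed >= capacity:
        capacity = capacity * 2 if capacity else 2
    return capacity


def simulate(chunks: int, needed: int, slack: int) -> List[int]:
    """Track capacity across repeated chunk reads.

    Before each chunk, the (hypothetical) bookkeeping claims that
    `capacity - needed - slack` slots are already in use.
    """
    capacity = 16  # arbitrary starting capacity for the illustration
    history = []
    for _ in range(chunks):
        claimed = max(capacity - needed - slack, 0)
        capacity = grow_capacity(claimed, needed, capacity)
        history.append(capacity)
    return history


print(simulate(5, needed=4, slack=0))  # [32, 64, 128, 256, 512]
print(simulate(5, needed=4, slack=1))  # [16, 16, 16, 16, 16]
```

With no slack, each request lands exactly on the capacity boundary, so the capacity doubles on every chunk. Leaving a single slot of slack keeps it flat in this model.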

@gfyoung gfyoung added the Regression (functionality that used to work in a prior pandas version) and IO CSV (read_csv, to_csv) labels Jan 19, 2019
@gfyoung gfyoung added this to the 0.24.0 milestone Jan 19, 2019
@gfyoung
Member Author

gfyoung commented Jan 19, 2019

I don't recall off the top of my head, but given that this regression was introduced during the RC period, do we still need to add a whatsnew entry? The regression was technically not part of an official release.

@codecov

codecov bot commented Jan 19, 2019

Codecov Report

Merging #24837 into master will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##           master   #24837   +/-   ##
=======================================
  Coverage   92.38%   92.38%           
=======================================
  Files         166      166           
  Lines       52382    52382           
=======================================
  Hits        48394    48394           
  Misses       3988     3988

Flag        Coverage Δ
#multiple   90.81% <ø> (ø) ⬆️
#single     42.91% <ø> (ø) ⬆️

Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 7e6ad86...12aaad0.

@codecov

codecov bot commented Jan 19, 2019

Codecov Report

Merging #24837 into master will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##           master   #24837   +/-   ##
=======================================
  Coverage   92.39%   92.39%           
=======================================
  Files         166      166           
  Lines       52378    52378           
=======================================
  Hits        48393    48393           
  Misses       3985     3985

Flag        Coverage Δ
#multiple   90.81% <ø> (ø) ⬆️
#single     42.9% <ø> (ø) ⬆️

Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update f4458c1...0c366a8.

@jreback
Contributor

jreback commented Jan 19, 2019

> I don't recall off the top of my head, but given that this regression was introduced during the RC period, do we still need to add a whatsnew entry? The regression was technically not part of an official release.

No, I think this is OK as is, but can you add a memory ASV benchmark here?

The edge case where we hit powers of 2
every time during allocation can be painful.

Closes pandas-devgh-24805.

xref pandas-devgh-23527.
@gfyoung gfyoung force-pushed the read-csv-memory-growth-chunksize branch from 12aaad0 to 0c366a8 on January 20, 2019 00:05
@gfyoung
Member Author

gfyoung commented Jan 20, 2019

@jreback : Added an ASV as requested, and all is still green. PTAL.
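
For readers unfamiliar with ASV memory benchmarks, a benchmark for chunked parsing could look roughly like the sketch below. It is not the benchmark added in this PR: the class name, the temporary file, the row count, and the chunk size are illustrative assumptions. It relies only on the ASV convention that methods prefixed with `peakmem_` are measured for peak memory, and on `pandas.read_csv` accepting a `chunksize` argument.

```python
import os
import tempfile

import pandas as pd


class ChunkedReadMemory:
    """Hypothetical ASV benchmark: peak memory while reading a CSV in chunks."""

    # Illustrative sizes, not taken from the PR.
    num_rows = 1000
    chunksize = 20

    def setup(self):
        # Write a single-column CSV with one integer per row.
        fd, self.fname = tempfile.mkstemp(suffix=".csv")
        with os.fdopen(fd, "w") as f:
            f.write("value\n")
            for i in range(self.num_rows):
                f.write(f"{i}\n")

    def teardown(self):
        # Remove the temporary file created in setup.
        os.remove(self.fname)

    def peakmem_read_chunks(self):
        # ASV records the peak memory of the process while this method runs.
        # Iterating consumes every chunk, so any per-chunk buffer growth
        # would be reflected in the measurement.
        rows = 0
        for chunk in pd.read_csv(self.fname, chunksize=self.chunksize):
            rows += len(chunk)
        return rows
```

The row count is returned only so the sketch can be sanity-checked outside ASV; the benchmark itself is about the memory measurement.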

@jreback jreback merged commit 03001be into pandas-dev:master Jan 20, 2019
@jreback
Contributor

jreback commented Jan 20, 2019

thanks @gfyoung

@gfyoung gfyoung deleted the read-csv-memory-growth-chunksize branch January 20, 2019 18:41
Pingviinituutti pushed a commit to Pingviinituutti/pandas that referenced this pull request Feb 28, 2019
* Fix memory growth bug in read_csv

The edge case where we hit powers of 2
every time during allocation can be painful.

Closes pandas-devgh-24805.

xref pandas-devgh-23527.

* TST: Add ASV benchmark for issue