Avoids memory issues with big files while extracting archives #2192
Conversation
Codecov Report
@@            Coverage Diff            @@
##           master    #2192    +/-   ##
========================================
+ Coverage    73.0%    76.9%    +3.8%
========================================
  Files         464      249     -215
  Lines       17866    10362    -7504
  Branches     1759     1026     -733
========================================
- Hits        13045     7969    -5076
+ Misses       4365     2071    -2294
+ Partials      456      322     -134
I was expecting tests to be in services/web/server/tests/unit/isolated/test_exporter_archiving.py. Can you please provide some? thx
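For reference, a minimal shape such a test could take (a sketch only; the repo's actual fixtures and the final unarchive_dir signature may differ, and pytest-asyncio is assumed for the async test):

import zipfile
from pathlib import Path

import pytest

from simcore_service_webserver.exporter.archiving import unarchive_dir  # function under review


@pytest.mark.asyncio
async def test_unarchive_dir_roundtrip(tmp_path: Path):
    # build a small archive to extract
    src_file = tmp_path / "a.txt"
    src_file.write_text("hello")
    archive = tmp_path / "archive.zip"
    with zipfile.ZipFile(archive, mode="w") as zf:
        zf.write(src_file, arcname="a.txt")

    destination = tmp_path / "extracted"
    destination.mkdir()
    await unarchive_dir(archive, destination)

    assert (destination / "a.txt").read_text() == "hello"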
3 resolved review threads on services/web/server/src/simcore_service_webserver/exporter/archiving.py (outdated)
…re-forked into fix-zip-memory-issues
- fixes an error with high memory consumption
@pytest.mark.parametrize(
    "compress,store_relative_path",
    [[True, True], [True, False], [False, True], [False, False]],
FYI: check itertools.product to create Cartesian products.
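For what it's worth, a quick sketch of that suggestion (only the parametrize usage above is from the diff; the rest is illustrative):

import itertools

# The Cartesian product of two boolean axes gives the same four
# combinations as the hand-written list:
list(itertools.product([True, False], repeat=2))
# -> [(True, True), (True, False), (False, True), (False, False)]

# pytest.mark.parametrize accepts this iterable of tuples directly in
# place of the nested-list literal.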
async def unarchive_dir(archive_to_extract: Path, destination_folder: Path) -> None:
    with zipfile.ZipFile(archive_to_extract, mode="r") as zip_file_handler:
        with ProcessPoolExecutor() as pool:
aha, now I see where it gets run in a pool, OK
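A minimal sketch of the pattern being discussed (not this PR's exact code; the helper _extract_member is hypothetical): each member is extracted in a worker process, and zipfile streams it to disk instead of loading it into the webserver's memory.

import asyncio
import zipfile
from concurrent.futures import ProcessPoolExecutor
from pathlib import Path


def _extract_member(archive: Path, member: str, destination: Path) -> None:
    # runs in a worker process; ZipFile handles are not picklable,
    # so the archive is reopened here. extract() streams the member
    # to disk in chunks rather than reading it fully into memory.
    with zipfile.ZipFile(archive, mode="r") as zip_file_handler:
        zip_file_handler.extract(member, path=destination)


async def unarchive_dir(archive_to_extract: Path, destination_folder: Path) -> None:
    with zipfile.ZipFile(archive_to_extract, mode="r") as zip_file_handler:
        members = zip_file_handler.namelist()
    loop = asyncio.get_running_loop()
    with ProcessPoolExecutor() as pool:
        # fan the blocking extraction out to worker processes so the
        # event loop stays responsive
        await asyncio.gather(
            *(
                loop.run_in_executor(
                    pool, _extract_member, archive_to_extract, member, destination_folder
                )
                for member in members
            )
        )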
What do these changes do?
Unzipping was taking way too much RAM and could cause the webserver to crash while importing big projects.
This fix also lowers nodeports memory consumption while unarchiving. Nodeports compression has been removed as well, to improve performance.
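To illustrate the "all in memory" problem (a sketch, not this PR's exact code; extract_streamed is a hypothetical name): ZipFile.read loads the whole decompressed member into RAM, while opening the member as a stream and copying in fixed-size chunks keeps memory bounded regardless of file size.

import shutil
import zipfile
from pathlib import Path


def extract_streamed(archive: Path, destination: Path) -> None:
    # path-traversal (zip-slip) checks omitted for brevity
    with zipfile.ZipFile(archive, mode="r") as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue
            target = destination / info.filename
            target.parent.mkdir(parents=True, exist_ok=True)
            # memory-heavy pattern to avoid: data = zf.read(info)
            # chunked pattern: stream the member to disk in 1 MiB blocks
            with zf.open(info) as src, open(target, "wb") as dst:
                shutil.copyfileobj(src, dst, 1024 * 1024)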
Related issue/s
Also related to #2171 regarding the jupyter-labs OOM, caused by nodeports not chunking files while extracting them but doing it all in memory.
How to test
Checklist