Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoids memory issues with big files while extracting archives #2192

Merged
merged 19 commits into from
Mar 4, 2021

Conversation

GitHK
Copy link
Contributor

@GitHK GitHK commented Mar 4, 2021

What do these changes do?

Unzipping was taking way too much RAM, could cause the webserver to crash while importing big projects.

nodeports memory consumption while unarchiving will be lowered with this fix. Nodeports compression has also been removed to enhance performance.

Related issue/s

also related to #2171 regarding the jupyter-labs OOM due to pour nodeports not chunking files while extracting them but doing it all in memory.

How to test

Checklist

@GitHK GitHK added bug buggy, it does not work as expected a:webserver issue related to the webserver service labels Mar 4, 2021
@GitHK GitHK self-assigned this Mar 4, 2021
@GitHK GitHK requested review from sanderegg and pcrespov March 4, 2021 08:17
@codecov
Copy link

codecov bot commented Mar 4, 2021

Codecov Report

Merging #2192 (ff70f95) into master (dbcaf9f) will increase coverage by 3.8%.
The diff coverage is 35.2%.

Impacted file tree graph

@@           Coverage Diff            @@
##           master   #2192     +/-   ##
========================================
+ Coverage    73.0%   76.9%   +3.8%     
========================================
  Files         464     249    -215     
  Lines       17866   10362   -7504     
  Branches     1759    1026    -733     
========================================
- Hits        13045    7969   -5076     
+ Misses       4365    2071   -2294     
+ Partials      456     322    -134     
Flag Coverage Δ
integrationtests 65.3% <35.2%> (-0.2%) ⬇️
unittests 69.4% <36.3%> (+2.9%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
...core-sdk/src/simcore_sdk/node_data/data_manager.py 0.0% <0.0%> (-95.3%) ⬇️
...rc/simcore_service_webserver/exporter/archiving.py 70.5% <54.5%> (-23.9%) ⬇️
.../simcore-sdk/src/simcore_sdk/node_data/__init__.py 0.0% <0.0%> (-100.0%) ⬇️
...erver/src/simcore_service_webserver/rest_models.py 0.0% <0.0%> (-91.2%) ⬇️
...src/simcore_service_sidecar/celery_configurator.py 0.0% <0.0%> (-91.2%) ⬇️
...src/simcore_service_webserver/activity/handlers.py 27.1% <0.0%> (-62.9%) ⬇️
...ver/src/simcore_service_webserver/rest_handlers.py 48.1% <0.0%> (-51.9%) ⬇️
...simcore_service_webserver/activity/module_setup.py 55.5% <0.0%> (-44.5%) ⬇️
.../simcore_service_webserver/security_permissions.py 0.0% <0.0%> (-43.8%) ⬇️
... and 265 more

@GitHK GitHK requested a review from odeimaiz March 4, 2021 08:20
Copy link
Member

@pcrespov pcrespov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was expecting tests to be in services/web/server/tests/unit/isolated/test_exporter_archiving.py. Can you please provide some? thx

@GitHK GitHK requested a review from pcrespov March 4, 2021 14:02

@pytest.mark.parametrize(
"compress,store_relative_path",
[[True, True], [True, False], [False, True], [False, False]],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI: check itertools.product to create cartesian products


async def unarchive_dir(archive_to_extract: Path, destination_folder: Path) -> None:
with zipfile.ZipFile(archive_to_extract, mode="r") as zip_file_handler:
with ProcessPoolExecutor() as pool:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

aha, now i see where that it gets in a pool, OK

@GitHK GitHK merged commit ba3684c into ITISFoundation:master Mar 4, 2021
@sanderegg sanderegg added this to the The Red Panda milestone Mar 24, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
a:webserver issue related to the webserver service bug buggy, it does not work as expected
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants