Avoids memory issues with big files while extracting archives #2192

GitHK · 2021-03-04T08:17:38Z

What do these changes do?

Unzipping used far too much RAM and could crash the webserver while importing big projects.

This fix lowers nodeports memory consumption while unarchiving. Nodeports compression has also been removed to improve performance.

Related issue/s

Also related to #2171: the jupyter-labs OOM caused by nodeports extracting files entirely in memory instead of in chunks.
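For readers unfamiliar with the idea, chunked extraction can be sketched roughly as follows. This is a minimal illustration and not the code merged in this PR: the function name `extract_in_chunks` and the 1 MiB `CHUNK_SIZE` are assumptions made for the example. Each archive member is streamed to disk in fixed-size pieces, so peak memory stays near the chunk size rather than the file size.

```python
import shutil
import zipfile
from pathlib import Path

CHUNK_SIZE = 1024 * 1024  # assumed chunk size (1 MiB), chosen for illustration


def extract_in_chunks(
    archive: Path, destination: Path, chunk_size: int = CHUNK_SIZE
) -> None:
    """Extract ``archive`` into ``destination`` one chunk at a time."""
    root = destination.resolve()
    with zipfile.ZipFile(archive, mode="r") as zip_file:
        for info in zip_file.infolist():
            target = (destination / info.filename).resolve()
            # Refuse members that would be written outside the destination.
            if target != root and root not in target.parents:
                raise ValueError(f"unsafe path in archive: {info.filename}")
            if info.is_dir():
                # Directory entries carry no data: just create them.
                target.mkdir(parents=True, exist_ok=True)
                continue
            target.parent.mkdir(parents=True, exist_ok=True)
            # Stream the member to disk instead of reading it fully in memory.
            with zip_file.open(info) as source, open(target, "wb") as sink:
                shutil.copyfileobj(source, sink, chunk_size)
```

Calling `extract_in_chunks(Path("project.zip"), Path("out"))` would create `out` if needed; the file names here are placeholders.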

How to test

Checklist

codecov · 2021-03-04T08:18:56Z

Codecov Report

Merging #2192 (ff70f95) into master (dbcaf9f) will increase coverage by 3.8%.
The diff coverage is 35.2%.

@@           Coverage Diff            @@
##           master   #2192     +/-   ##
========================================
+ Coverage    73.0%   76.9%   +3.8%     
========================================
  Files         464     249    -215     
  Lines       17866   10362   -7504     
  Branches     1759    1026    -733     
========================================
- Hits        13045    7969   -5076     
+ Misses       4365    2071   -2294     
+ Partials      456     322    -134

Flag	Coverage Δ
integrationtests	`65.3% <35.2%> (-0.2%)`	⬇️
unittests	`69.4% <36.3%> (+2.9%)`	⬆️

Flags with carried forward coverage won't be shown.

Impacted Files	Coverage Δ
...core-sdk/src/simcore_sdk/node_data/data_manager.py	`0.0% <0.0%> (-95.3%)`	⬇️
...rc/simcore_service_webserver/exporter/archiving.py	`70.5% <54.5%> (-23.9%)`	⬇️
.../simcore-sdk/src/simcore_sdk/node_data/__init__.py	`0.0% <0.0%> (-100.0%)`	⬇️
...erver/src/simcore_service_webserver/rest_models.py	`0.0% <0.0%> (-91.2%)`	⬇️
...src/simcore_service_sidecar/celery_configurator.py	`0.0% <0.0%> (-91.2%)`	⬇️
...src/simcore_service_webserver/activity/handlers.py	`27.1% <0.0%> (-62.9%)`	⬇️
...ver/src/simcore_service_webserver/rest_handlers.py	`48.1% <0.0%> (-51.9%)`	⬇️
...simcore_service_webserver/activity/module_setup.py	`55.5% <0.0%> (-44.5%)`	⬇️
.../simcore_service_webserver/security_permissions.py	`0.0% <0.0%> (-43.8%)`	⬇️
... and 265 more

pcrespov

I was expecting tests to be in services/web/server/tests/unit/isolated/test_exporter_archiving.py. Can you please provide some? thx

services/web/server/src/simcore_service_webserver/exporter/archiving.py

pcrespov · 2021-03-04T16:23:50Z

packages/service-library/tests/test_archiving_utils.py

+
+@pytest.mark.parametrize(
+    "compress,store_relative_path",
+    [[True, True], [True, False], [False, True], [False, False]],


FYI: check itertools.product to create cartesian products
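As a small sketch of that suggestion, the four hand-written flag pairs can be generated with `itertools.product`. The constant name `FLAG_COMBINATIONS` is an assumption made for this example and does not come from the PR.

```python
import itertools

# All (compress, store_relative_path) pairs, in the same order as the
# hand-written list in the diff above.
FLAG_COMBINATIONS = list(itertools.product([True, False], repeat=2))

for compress, store_relative_path in FLAG_COMBINATIONS:
    print(compress, store_relative_path)
```

Passing `FLAG_COMBINATIONS` as the second argument of `pytest.mark.parametrize("compress,store_relative_path", ...)` should yield the same four test cases, and adding a third flag only requires changing `repeat`.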

pcrespov · 2021-03-04T16:25:27Z

packages/service-library/src/servicelib/archiving_utils.py

+
+async def unarchive_dir(archive_to_extract: Path, destination_folder: Path) -> None:
+    with zipfile.ZipFile(archive_to_extract, mode="r") as zip_file_handler:
+        with ProcessPoolExecutor() as pool:


aha, now I see where it gets run in a pool, OK
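The pattern being discussed, offloading a blocking extraction from a coroutine to an executor so the event loop stays responsive, could look roughly like the sketch below. It is an illustration under assumptions, not the PR's implementation: `_extract_all` and `unarchive_with_executor` are hypothetical names, and the executor is passed in by the caller rather than created inside the function as in the diff.

```python
import asyncio
import zipfile
from concurrent.futures import Executor
from pathlib import Path


def _extract_all(archive: Path, destination: Path) -> None:
    # Blocking work. Defined at module level so that it can be pickled
    # when the executor is a process pool.
    with zipfile.ZipFile(archive, mode="r") as zip_file:
        zip_file.extractall(destination)


async def unarchive_with_executor(
    archive: Path, destination: Path, executor: Executor
) -> None:
    # Hand the blocking call to the executor and await its completion,
    # leaving the event loop free to serve other requests meanwhile.
    loop = asyncio.get_running_loop()
    await loop.run_in_executor(executor, _extract_all, archive, destination)
```

With a `ProcessPoolExecutor`, as in the diff, the extraction runs in a separate process; a `ThreadPoolExecutor` would also keep the loop unblocked but shares the interpreter with it.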

Andrei Neagu added 2 commits March 4, 2021 09:15

no longer requires to create folder before extracting

aef1c5e

adding chunked extractor to avoid huge memory usage

2406d24

GitHK added the bug (buggy, it does not work as expected) and a:webserver (issue related to the webserver service) labels Mar 4, 2021

GitHK self-assigned this Mar 4, 2021

GitHK requested review from sanderegg and pcrespov March 4, 2021 08:17

GitHK requested a review from odeimaiz March 4, 2021 08:20

odeimaiz approved these changes Mar 4, 2021

Andrei Neagu and others added 3 commits March 4, 2021 11:33

this is still required

51e0b6b

no need to create destination

c7f63d8

Merge branch 'master' into fix-zip-memory-issues

528b158

pcrespov requested changes Mar 4, 2021

services/web/server/src/simcore_service_webserver/exporter/archiving.py (outdated)

services/web/server/src/simcore_service_webserver/exporter/archiving.py (outdated)

pcrespov reviewed Mar 4, 2021

services/web/server/src/simcore_service_webserver/exporter/archiving.py (outdated)

Andrei Neagu added 6 commits March 4, 2021 14:09

moved archiving to servicelib

f5039b6

Merge branch 'fix-zip-memory-issues' of github.com:GitHK/osparc-simco…

6df3346

…re-forked into fix-zip-memory-issues

fixing tests

fa9e611

pylint does not understand

cf65405

replaced archiving and unarchiving

7114420

- fixes an error with high memory consumption

disabled compression

de1d077

GitHK requested a review from pcrespov March 4, 2021 14:02

Andrei Neagu added 7 commits March 4, 2021 15:46

adding extra test information to error

d5c2cd8

adding extra tests

481a0b9

remove extra line

f9e60cd

removing error from destination_folder

1c825f5

properly fixed error

2725c8d

only extract files

85af1b0

handle directories differently, just create them

e523262

pcrespov approved these changes Mar 4, 2021

fixing very badly written test

dd0a88f

GitHK merged commit ba3684c into ITISFoundation:master Mar 4, 2021

GitHK mentioned this pull request Mar 15, 2021

zipping/unzipping takes a looong time #2124

Closed

sanderegg added this to the The Red Panda milestone Mar 24, 2021

This was referenced Mar 24, 2021

platform stability #1426

Closed

maintenance/scaling of the platform ITISFoundation/osparc-issues#428

Closed
