🐛 re: Incident. Webserver: Load and error when exporting / duplicating study #2805

Closed
mrnicegyu11 opened this issue Feb 8, 2022 · 1 comment
Labels: a:webserver (issue related to the webserver service), bug (buggy, it does not work as expected)

Comments

@mrnicegyu11

This is a write-up of the incident investigation. Feel free to modify and/or close this issue if it is a duplicate, etc.

========================

During the night of 07 Feb to 08 Feb on aws-prod, a :duplicate request was received that led to a short outage of osparc. The request was redirected to the webserver, which proceeded to perform many disk I/O operations and reached 100% CPU load for some minutes:
image
(sorry for the mouse-pointer in the screenshot, the CPU spike is hidden behind it...)

Finally, the webserver answered the request after 480 seconds with a 500, as an exception occurred.

The current line of thinking is:

  • the webserver is using a lot of CPU to zip stuff
  • An asyncio.exceptions.CancelledError occurred in _run_http_download, so the download likely timed out.
ERROR: servicelib.aiohttp.monitoring:middleware_handler(216) - Unexpected server error "<class 'aiohttp.web_exceptions.HTTPInternalServerError'>" from access: 10.0.97.40 "POST /v0/projects/0f9b7548-3b1e-11ec-917d-02420a0b3377:duplicate" done in 479.60 secs. Responding with status 500
Traceback (most recent call last):
  File "/home/scu/.venv/lib/python3.8/site-packages/parfive/downloader.py", line 295, in run_download
    done.update(await self._run_http_download(main_pb, timeouts))
  File "/home/scu/.venv/lib/python3.8/site-packages/parfive/downloader.py", line 429, in _run_http_download
    done, _ = await asyncio.wait(futures)
  File "/usr/local/lib/python3.8/asyncio/tasks.py", line 426, in wait
    return await _wait(fs, timeout, return_when, loop)
  File "/usr/local/lib/python3.8/asyncio/tasks.py", line 534, in _wait
    await waiter
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/scu/.venv/lib/python3.8/site-packages/servicelib/aiohttp/rest_middlewares.py", line 75, in _middleware_handler
    response = await handler(request)
  File "/home/scu/.venv/lib/python3.8/site-packages/servicelib/aiohttp/rest_middlewares.py", line 203, in _middleware_handler
    resp: _ResponseOrBodyData = await handler(request)
  File "/home/scu/.venv/lib/python3.8/site-packages/simcore_service_webserver/products.py", line 97, in discover_product_middleware
    response = await handler(request)
  File "/home/scu/.venv/lib/python3.8/site-packages/simcore_service_webserver/login/decorators.py", line 20, in wrapped
    ret = await handler(*args, **kwargs)
  File "/home/scu/.venv/lib/python3.8/site-packages/simcore_service_webserver/security_decorators.py", line 23, in wrapped
    ret = await handler(request)
  File "/home/scu/.venv/lib/python3.8/site-packages/simcore_service_webserver/exporter/request_handlers.py", line 130, in duplicate_project
    exported_project_path = await study_export(
  File "/home/scu/.venv/lib/python3.8/site-packages/simcore_service_webserver/exporter/export_import.py", line 40, in study_export
    await formatter.format_export_directory(
  File "/home/scu/.venv/lib/python3.8/site-packages/simcore_service_webserver/exporter/formatters/formatter_v1.py", line 443, in format_export_directory
    await generate_directory_contents(
  File "/home/scu/.venv/lib/python3.8/site-packages/simcore_service_webserver/exporter/formatters/formatter_v1.py", line 149, in generate_directory_contents
    await download_all_files_from_storage(app=app, download_links=download_links)
  File "/home/scu/.venv/lib/python3.8/site-packages/simcore_service_webserver/exporter/formatters/formatter_v1.py", line 57, in download_all_files_from_storage
    await parallel_downloader.download_files(app)
  File "/home/scu/.venv/lib/python3.8/site-packages/simcore_service_webserver/exporter/file_downloader.py", line 31, in download_files
    results = await self.downloader.run_download(
  File "/home/scu/.venv/lib/python3.8/site-packages/parfive/downloader.py", line 297, in run_download
    done.update(await self._run_ftp_download(main_pb, timeouts))
  File "/usr/local/lib/python3.8/contextlib.py", line 131, in __exit__
    self.gen.throw(type, value, traceback)
AttributeError: 'list_iterator' object has no attribute 'throw'
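
A note on the final AttributeError in the traceback above: the snippet below is a minimal sketch of one way such a message can arise in plain Python, not a confirmed diagnosis of parfive or of the webserver. It assumes that a function wrapped with contextlib.contextmanager hands back a plain list iterator instead of being a generator. The names broken_progress_bar and run_download_sketch are hypothetical and exist only for this illustration.

```python
import asyncio
import contextlib


@contextlib.contextmanager
def broken_progress_bar():
    # Hypothetical stand-in (assumption): the wrapped function returns a
    # plain iterator instead of yielding, so contextlib ends up holding a
    # list_iterator rather than a generator.
    return iter(["progress-bar"])


def run_download_sketch():
    """Raise inside the with-block and return what escapes it."""
    try:
        with broken_progress_bar():
            # Simulates the cancelled download from the first traceback.
            raise asyncio.CancelledError()
    except AttributeError as err:
        # contextlib's __exit__ tried to call .throw() on the iterator,
        # which only generators provide; the original exception is kept
        # as the implicit context of the new one.
        return err, err.__context__


if __name__ == "__main__":
    error, original = run_download_sketch()
    print(type(error).__name__, "-", error)
    print("raised while handling:", type(original).__name__)
```

If something like this happened here, the CancelledError would have been replaced by an ordinary AttributeError on its way out of the with-block, which would fit the middleware reporting an unexpected server error and answering with a 500.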
mrnicegyu11 added the labels bug (buggy, it does not work as expected) and a:webserver (issue related to the webserver service) on Feb 8, 2022
mrnicegyu11 changed the title from "🐛 re: Indident. Webserver: Load and error when exporting / duplicating study" to "🐛 re: Incident. Webserver: Load and error when exporting / duplicating study" on Feb 8, 2022
@sanderegg

exporting is disabled.
copying was fixed.
