🚀 Pre-release master -> staging_Mithril5 #4054

Closed
11 of 13 tasks
matusdrobuliak66 opened this issue Apr 3, 2023 · 5 comments
Assignees
Labels
t:maintenance Some planned maintenance work
Milestone

Comments

@matusdrobuliak66
Contributor

matusdrobuliak66 commented Apr 3, 2023

What kind of pre-release?

master branch

Sprint Name

Mithril

Pre-release version

5

Commit SHA

0b60aee

Did the commit CI succeed?

  • The commit CI succeeded.

Motivation

  • Weekly release to staging

What Changed

Devops check ⚠️ devops

e2e testing check 🧪

#4054 (comment)
No response

Summary 📝

  • make release-staging name=Mithril version=5 git_sha=0b60aee54ae17ef4b31156e833b44f1c7c47aee9
  • Draft pre-release
  • Announce
    {"start": "2023-04-04T12:30:00.000Z", "end": "2023-04-04T13:00:00.000Z", "reason": "Release Mithril5"}
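The announcement payload above can be sanity-checked before it is posted. This is a minimal sketch, assuming the payload is plain JSON with ISO 8601 `start` and `end` timestamps; the helper name `maintenance_window` is hypothetical and not part of osparc-simcore.

```python
import json
from datetime import datetime, timedelta


def maintenance_window(payload: str) -> timedelta:
    """Return the length of the announced maintenance window.

    Hypothetical helper, for illustration only.
    """
    data = json.loads(payload)
    # Swap the trailing "Z" for an explicit UTC offset so that
    # datetime.fromisoformat also accepts it on Python < 3.11.
    start = datetime.fromisoformat(data["start"].replace("Z", "+00:00"))
    end = datetime.fromisoformat(data["end"].replace("Z", "+00:00"))
    if end <= start:
        raise ValueError("announcement must end after it starts")
    return end - start


announcement = (
    '{"start": "2023-04-04T12:30:00.000Z", '
    '"end": "2023-04-04T13:00:00.000Z", '
    '"reason": "Release Mithril5"}'
)
window = maintenance_window(announcement)
print(window)
```

For the payload in this issue the window is 30 minutes.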

Releasing

  • Release (release draft)
  • Check Release CI
  • Check deployed
    • aws deploy
    • dalco deploy
  • Delete announcement
  • Check e2e runs
  • Announce
https://github.com/ITISFoundation/osparc-simcore/releases/tag/staging_Mithril5
@matusdrobuliak66
Contributor Author

matusdrobuliak66 commented Apr 3, 2023

  • unhealthy webserver
    await asyncio.wait_for(
simcore_service_webserver.rest_healthcheck.HealthCheckFailed: Last requests average latency is 27.843608220418293 secs and surpasses 10.0 secs
ERROR: [2023-04-03 15:12:35,021/MainProcess] [servicelib.aiohttp.monitoring:middleware_handler(220)]  -  Unexpected server error "<class 'aiohttp.web_exceptions.HTTPServiceUnavailable'>" from access: 127.0.0.1 "GET /v0/health" done in 0.00 secs. Responding with status 503
  • webserver API call timeout: GET clusters/0/details (@sanderegg, is there already a fix in master for this?)
ERROR: [2023-04-03 15:59:29,759/MainProcess] [servicelib.aiohttp.monitoring:middleware_handler(220)]  -  Unexpected server error "<class 'aiohttp.web_exceptions.HTTPServiceUnavailable'>" from access: 10.87.9.250 "GET /v0/clusters/0/details" done in 41.72 secs. Responding with status 503
The above exception was the direct cause of the following exception:
asyncio.exceptions.TimeoutError
  File "/home/scu/.venv/lib/python3.10/site-packages/tenacity/_asyncio.py", line 69, in __anext__
Traceback (most recent call last):
  File "/home/scu/.venv/lib/python3.10/site-packages/simcore_service_webserver/director_v2_core_base.py", line 76, in request_director_v2
    async for attempt in AsyncRetrying(**DEFAULT_RETRY_POLICY):
    do = self.iter(retry_state=self._retry_state)
  File "/home/scu/.venv/lib/python3.10/site-packages/tenacity/__init__.py", line 360, in iter
    raise retry_exc.reraise()
  File "/home/scu/.venv/lib/python3.10/site-packages/tenacity/__init__.py", line 193, in reraise
    raise self.last_attempt.result()
    raise asyncio.TimeoutError from None
  File "/home/scu/.venv/lib/python3.10/site-packages/aiohttp/helpers.py", line 720, in __exit__
  File "/home/scu/.venv/lib/python3.10/site-packages/aiohttp/client_reqrep.py", line 894, in start
    with self._timer:
    await resp.start(conn)
  File "/home/scu/.venv/lib/python3.10/site-packages/aiohttp/client.py", line 560, in _request
    self._resp = await self._coro
  File "/home/scu/.venv/lib/python3.10/site-packages/aiohttp/client.py", line 1141, in __aenter__
    raise self._exception
  File "/home/scu/.venv/lib/python3.10/site-packages/simcore_service_webserver/director_v2_core_base.py", line 79, in request_director_v2
    async with session.request(
  File "/usr/local/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
    return self.__get_result()
  File "/usr/local/lib/python3.10/concurrent/futures/_base.py", line 451, in result
  • director v2
raise DirectorServiceError(
simcore_service_webserver.director_v2_exceptions.DirectorServiceError: Unexpected error: director-v2 returned 503, reason 'request to director-v2 timed-out: ' after calling URL('http://staging_director-v2:8000/v2/clusters/0/details?user_id=41176')
  • I restarted director_v2
  • Sylvain did some additional cleanup of his tests in AWS staging (?)
  • Afterwards the errors no longer appeared
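The `HealthCheckFailed` message in the log above reports that the average latency of the last requests exceeded a 10-second limit. The following is a minimal sketch of that kind of check, not the actual `simcore_service_webserver.rest_healthcheck` implementation: the class `LatencyProbe`, its method names and the window size are assumptions made for illustration.

```python
from collections import deque
from statistics import fmean


class HealthCheckFailed(RuntimeError):
    """Raised when the service considers itself unhealthy."""


class LatencyProbe:
    """Keeps the latency of the most recent requests (hypothetical helper)."""

    def __init__(self, max_avg_latency_secs=10.0, window=10):
        self.max_avg_latency_secs = max_avg_latency_secs
        # A bounded deque drops the oldest sample once the window is full.
        self._latencies = deque(maxlen=window)

    def record(self, latency_secs):
        self._latencies.append(latency_secs)

    def check(self):
        """Raise HealthCheckFailed if the average latency exceeds the limit."""
        if not self._latencies:
            return  # nothing measured yet, assume healthy
        avg = fmean(self._latencies)
        if avg > self.max_avg_latency_secs:
            raise HealthCheckFailed(
                f"Last requests average latency is {avg} secs "
                f"and surpasses {self.max_avg_latency_secs} secs"
            )


probe = LatencyProbe(max_avg_latency_secs=10.0, window=3)
probe.record(0.5)    # illustrative value, not taken from the logs
probe.check()        # healthy: average is below the limit
probe.record(41.72)  # duration of the slow clusters/0/details call above
try:
    probe.check()
except HealthCheckFailed as err:
    print(f"unhealthy: {err}")
```

With a check of this kind, a few slow calls are enough to push the average over the limit, so the health endpoint keeps failing until those samples leave the window, even when the health request itself completes quickly.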

@mrnicegyu11
Member

Looks good, green from my side! thanks :)

@matusdrobuliak66
Contributor Author

The AWS staging problem with pending services improved a lot after the release:
image

@matusdrobuliak66
Contributor Author

matusdrobuliak66 commented Apr 3, 2023

E2E tests: same problems as before the release (all of these tests ran after the release):
image

@matusdrobuliak66
Contributor Author

matusdrobuliak66 commented Apr 4, 2023

It seems the e2e tests are back to normal:
image

@matusdrobuliak66 matusdrobuliak66 self-assigned this Apr 4, 2023
@matusdrobuliak66 matusdrobuliak66 added this to the Mithril milestone Apr 11, 2023

3 participants