[FIX] http: Unreachable server when db_maxconn reached during registry loading

On `WebRequest` `__exit__`, when an exception occurred (in `self.registry.signal_changes` or `self.registry.reset_changes`), the cursor was left unclosed, as `self._cr.close` was not called in such cases.

Exceptions in the above-mentioned methods do not happen often, but when they do, they leave unclosed and unusable cursors in the connection pool. In the extreme case explained below, the connection pool was left with only unclosed and unusable cursors. The entire server was then unusable, as it no longer had any working cursor.

Case:
- Start a multi-thread server with db_maxconn set to 5
- Ensure you do not send any request to the server, not even through a tab left open on `http://localhost:8069` in your browser
- Send 6 parallel HTTP requests to `/web/login` using an external threaded Python script (see the end of this commit message)

Depending on your registry state (whether or not you have a lot of modules installed) and on the state of the Python garbage collector, you might end up with
- either warnings telling that some unclosed cursors were garbage collected, and therefore closed (by luck, thanks to the Python garbage collection),
- or a completely blocked server that does not accept any other request (you can try for instance `curl http://localhost:8069`, and you end up with a `500 Internal Server Error`)

This issue appears to occur only in 11.0, not in 10.0 or 12.0.
This is because only 11.0 clears the cache during registry loading:
`https://github.com/odoo/odoo/blob/f1706c848d41c47646dabca771996e9b9f788241/odoo/modules/loading.py#L236`
This cache clearing does not happen in 10.0 or 12.0 (in 12.0, thanks to e181f59).

The 6 parallel requests instantly use all 5 available cursors of the connection pool. When each request exits, in `__exit__`, it calls `self.registry.signal_changes()`, which tries to open a new cursor because of
- `self.cache_invalidated`, which is True for all 6 requests, due to the call to `clear_caches` during the registry loading explained above and the fact that all requests have been handled in parallel,
- `with closing(self.cursor()) as cr:`, where `self.cursor()` attempts to use a new cursor (the `closing(...)` has no bearing on this issue, even though it could look like the culprit)

The attempt to use a new cursor fails, as none is available anymore (`db_maxconn` is reached), raising a `PoolError('The Connection Pool Is Full')` exception.

In the request `__exit__` method, because of this exception raised when calling `signal_changes`, `self._cr.close` is never reached. The parallel requests therefore leave only unclosed cursors in the connection pool, leaving the server in a state where it only has unusable cursors and cannot do anything more.

Landing in such a state might look like really bad luck, but we observed multiple actual cases on Odoo.sh. The one referenced in this commit (opw-2008340) was caused by an Outlook client which launched 18 parallel requests to fetch the email images while the server had not been spawned yet, and therefore neither had the registry. The server registry was just being loaded when it received the 18 parallel requests, which triggered this extreme case. The server was left unusable for several minutes, until a forced restart.
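
The mechanism described above can be illustrated with a minimal, self-contained sketch. It does not use any Odoo code: `ToyPool`, `ToyCursor`, `exit_without_guard`, `exit_with_guard` and `simulate` are hypothetical names made up for the illustration, and the parallel requests are simulated sequentially, each one already holding a cursor when it exits. The guarded variant shows one way to make sure the request cursor is always closed (a `try`/`finally`); it is an assumption about the general approach, not a copy of the actual patch.

```python
class PoolError(Exception):
    """Raised by the toy pool when no connection is left."""


class ToyCursor:
    """Hypothetical cursor that gives its slot back to the pool on close."""

    def __init__(self, pool):
        self.pool = pool
        self.closed = False

    def close(self):
        if not self.closed:
            self.closed = True
            self.pool.in_use -= 1


class ToyPool:
    """Hypothetical stand-in for a connection pool capped at `maxconn`."""

    def __init__(self, maxconn):
        self.maxconn = maxconn
        self.in_use = 0

    def borrow(self):
        if self.in_use >= self.maxconn:
            raise PoolError('The Connection Pool Is Full')
        self.in_use += 1
        return ToyCursor(self)


def signal_changes(pool):
    # Mimics the registry needing one extra, short-lived cursor.
    cr = pool.borrow()
    cr.close()


def exit_without_guard(pool, cr):
    # If signal_changes raises, the close below is never reached.
    signal_changes(pool)
    cr.close()


def exit_with_guard(pool, cr):
    # The request cursor is closed even when signal_changes raises.
    try:
        signal_changes(pool)
    finally:
        cr.close()


def simulate(exit_func, maxconn=5):
    """Fill the pool, run `exit_func` for each request, and return the
    number of cursors still checked out afterwards."""
    pool = ToyPool(maxconn)
    cursors = [pool.borrow() for _ in range(maxconn)]
    for cr in cursors:
        try:
            exit_func(pool, cr)
        except PoolError:
            pass  # the request fails, the server keeps running
    return pool.in_use
```

With the default `maxconn=5`, `simulate(exit_without_guard)` returns 5: every toy cursor is still checked out and the pool is stuck. `simulate(exit_with_guard)` returns 0: the first request still hits the `PoolError`, but it releases its cursor, so the following ones can complete.
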
For reference, here is the script that has been used to trigger the 6 parallel requests:
```
import requests
import threading

threads = []
for i in range(6):
    threads.append(threading.Thread(target=lambda: requests.get('http://localhost:8069/web/login')))
for thread in threads:
    thread.start()
```

opw-2008340

closes odoo/odoo#34071

Signed-off-by: Denis Ledoux <[email protected]>