[FIX] http: Unreachable server when db_maxconn reached during registr… · ForgeFlow/OpenUpgrade@242e485

[FIX] http: Unreachable server when db_maxconn reached during registry loading

In `WebRequest.__exit__`, when an exception occurred
(in `self.registry.signal_changes` or `self.registry.reset_changes`),
the cursor was left unclosed because `self._cr.close` was not called
in such cases.

Exceptions in the above-mentioned methods did not happen
often, but when one did, it left unclosed and unusable cursors
in the connection pool. In the extreme case explained below,
it left the connection pool with only unclosed and unusable cursors.
The entire server was then unusable, as it no longer had any working cursor.

Case:
- Start a multi-threaded server with `db_maxconn` set to 5
- Ensure no request is sent to the server,
not even from a tab left open on `http://localhost:8069` in your browser
- Send 6 parallel HTTP requests to `/web/login`
using an external multi-threaded Python script
(see the end of this long commit message)

Depending on your registry state (whether or not you have a lot of modules installed)
and on the state of the Python garbage collector,
you might end up with
- either warnings telling that some unclosed cursors were garbage collected,
and therefore closed (by luck, thanks to Python garbage collection),
- or a completely blocked server that does not accept any other request
(you can try for instance `curl http://localhost:8069`,
which ends up with a `500 Internal Server Error`)

This issue appears to occur only in 11.0, not in 10.0 or 12.0.
This is because only 11.0 clears the cache during registry loading:
`https://github.com/odoo/odoo/blob/f1706c848d41c47646dabca771996e9b9f788241/odoo/modules/loading.py#L236`
This cache clearing doesn't happen in 10.0 or 12.0
(in 12.0, thanks to e181f59)

When the 6 parallel requests are sent,
they instantly use all 5 available cursors of the connection pool,
and when each request exits, `__exit__` calls `self.registry.signal_changes()`,
which tries to open a new cursor because of
- `self.cache_invalidated`, which is True for all 6 requests, due to the call to `clear_caches`
during the registry loading explained above and the fact that all requests were handled in parallel,
- `with closing(self.cursor()) as cr:`, where `self.cursor()` attempts to use a new cursor
(the `closing(...)` has no bearing on this issue, even though it could look guilty)
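
As a minimal illustration of why `closing(...)` is not at fault: the argument
of `closing` is evaluated before the `with` block is entered, so an error
raised while obtaining the cursor escapes before `closing` has anything to
close. The `PoolError` class and `full_pool_cursor` function below are
hypothetical stand-ins written for this sketch, not Odoo code.

```python
from contextlib import closing


class PoolError(Exception):
    """Stand-in for the pool's own exception class (assumption for this sketch)."""


def full_pool_cursor():
    # Hypothetical cursor factory mimicking a pool with no connection left.
    raise PoolError("The Connection Pool Is Full")


entered = False
error = None
try:
    # full_pool_cursor() raises before closing() is even constructed,
    # so the body of the with-block never runs.
    with closing(full_pool_cursor()) as cr:
        entered = True
except PoolError as exc:
    error = str(exc)
```

After running this, `entered` is still `False` and `error` holds the pool
message: the failure comes from acquiring the cursor, not from `closing`.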

The attempt to use a new cursor fails, as none is available anymore (`db_maxconn` is reached),
raising a `PoolError('The Connection Pool Is Full')` exception.

In the request `__exit__` method, because of this exception raised when calling `signal_changes`,
`self._cr.close` is never reached, and the parallel requests therefore leave only unclosed
cursors in the connection pool.
The server ends up in a state where it only has unusable cursors
and can't do anything more.
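
The mechanism can be reproduced with a toy model. This is only a sketch under
simplifying assumptions: `ToyPool`, `signal_changes`, `exit_request` and `run`
are hypothetical names written for the illustration, the pool merely counts
borrowed cursors, and the requests exit one after another rather than
concurrently.

```python
class PoolError(Exception):
    """Stand-in for the pool's own exception class (assumption for this sketch)."""


class ToyPool:
    """Hypothetical fixed-size pool that only counts borrowed cursors."""

    def __init__(self, maxconn):
        self.maxconn = maxconn
        self.in_use = 0

    def borrow(self):
        if self.in_use >= self.maxconn:
            raise PoolError("The Connection Pool Is Full")
        self.in_use += 1

    def give_back(self):
        self.in_use -= 1


def signal_changes(pool):
    # Like the registry call described above: it needs one extra cursor.
    pool.borrow()
    pool.give_back()


def exit_request(pool, guarded):
    """Model of a request exit; the request already holds one cursor."""
    if guarded:
        try:
            signal_changes(pool)
        finally:
            pool.give_back()  # always runs, even if signal_changes raises
    else:
        signal_changes(pool)
        pool.give_back()      # skipped when signal_changes raises


def run(guarded, maxconn=5):
    pool = ToyPool(maxconn)
    for _ in range(maxconn):
        pool.borrow()         # the parallel requests hold every cursor
    for _ in range(maxconn):
        try:
            exit_request(pool, guarded)
        except PoolError:
            pass              # that request fails, the process keeps running
    return pool.in_use


leaked_without_guard = run(guarded=False)
leaked_with_guard = run(guarded=True)
```

Without the guard, all 5 cursors stay borrowed after every request has exited.
With the guard, the first exit still fails but releases its cursor, which lets
the following exits succeed and brings the pool back to 0 borrowed cursors.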

Landing in such a state might look like really bad luck,
but we observed multiple actual cases on Odoo.sh.
The one referenced in this commit (opw-2008340) was caused by an Outlook client
which launched 18 parallel requests to fetch the email images
while the server wasn't spawned, and therefore neither was the registry.
The server registry had thus just been loaded when it received the 18 parallel requests,
which triggered this extreme case.
The server was left unusable for several minutes, until a forced restart.

For reference, here is the script that was used to trigger the 6 parallel requests:
```
import requests
import threading

# Build six threads that each fetch the login page, then start them all.
threads = []
for i in range(6):
    threads.append(threading.Thread(target=lambda: requests.get('http://localhost:8069/web/login')))
for thread in threads:
    thread.start()
```
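
A possible variant of this script, using only the standard library, also
collects the HTTP status of each response, which makes it easier to tell
whether the server still answers afterwards. `fetch_status` and
`fire_parallel` are hypothetical helper names, and the 30 second timeout is an
arbitrary choice for this sketch.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.error import HTTPError
from urllib.request import urlopen


def fetch_status(url):
    """Return the HTTP status code for url, including error statuses."""
    try:
        with urlopen(url, timeout=30) as response:
            return response.status
    except HTTPError as exc:
        # 4xx/5xx responses are raised as HTTPError; report their code.
        return exc.code


def fire_parallel(url, count, fetch=fetch_status):
    """Call fetch(url) from count worker threads and return the results."""
    with ThreadPoolExecutor(max_workers=count) as executor:
        return list(executor.map(fetch, [url] * count))
```

For the case described above, this would be called as
`fire_parallel('http://localhost:8069/web/login', 6)`. Connection failures
other than HTTP error statuses (for example a refused connection) are not
handled and propagate to the caller.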

opw-2008340

closes odoo/odoo#34071

Signed-off-by: Denis Ledoux <[email protected]>

beledouxdenis committed Jun 12, 2019

1 parent 92ef6da commit 242e485

odoo/http.py

@@ -278,13 +278,15 @@ def __exit__(self, exc_type, exc_value, traceback):
             _request_stack.pop()
             if self._cr:
-                if exc_type is None and not self._failed:
-                    self._cr.commit()
-                    if self.registry:
-                        self.registry.signal_changes()
-                elif self.registry:
-                    self.registry.reset_changes()
-                self._cr.close()
+                try:
+                    if exc_type is None and not self._failed:
+                        self._cr.commit()
+                        if self.registry:
+                            self.registry.signal_changes()
+                    elif self.registry:
+                        self.registry.reset_changes()
+                finally:
+                    self._cr.close()
             # just to be sure no one tries to re-use the request
             self.disable_db = True
             self.uid = None

0 comments on commit `242e485`
