This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Linearize calls to _generate_user_id #3029

Merged
merged 4 commits into from
Mar 28, 2018
14 changes: 10 additions & 4 deletions synapse/handlers/register.py
@@ -24,7 +24,7 @@
 from synapse.http.client import CaptchaServerHttpClient
 from synapse import types
 from synapse.types import UserID
-from synapse.util.async import run_on_reactor
+from synapse.util.async import run_on_reactor, Linearizer
 from synapse.util.threepids import check_3pid_allowed
 from ._base import BaseHandler

@@ -46,6 +46,10 @@ def __init__(self, hs):

         self.macaroon_gen = hs.get_macaroon_generator()

+        self._generate_user_id_linearizer = Linearizer(
Member

With the help of the Twisted mailing list, I realised recently that our Linearizer is pretty inefficient, especially when lots of things get queued up on it. In cases like this, where you really just want to do a thing the first time you reach a bit of code, it's better to do it with a single Deferred that everything else hangs off:

if thing_result is None:
    if thing_deferred is None:
        thing_deferred = run_in_background(do_the_thing)
    yield make_deferred_yieldable(thing_deferred)

(which is probably more-or-less equivalent to what some of our cache wrappers do)

Having said that, if you feel disinclined to rewrite all this right now, I won't insist. I just think we should bear it in mind for the future.
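
For readers unfamiliar with the pattern suggested in the comment above, here is a minimal sketch of "one shared result that everything else hangs off". It is an analogy, not Synapse code: it uses `asyncio` from the standard library as a stand-in for Twisted's Deferred, `run_in_background` and `make_deferred_yieldable`, and the names `OneShotLoader` and `demo`, as well as the body of `do_the_thing`, are hypothetical.

```python
import asyncio


class OneShotLoader:
    """Runs an async loader at most once and shares the result with all callers."""

    def __init__(self, loader):
        self._loader = loader  # hypothetical coroutine function doing the expensive work
        self._task = None      # the single in-flight (or already finished) task

    async def get(self):
        if self._task is None:
            # The first caller starts the work; later callers await the same task.
            self._task = asyncio.ensure_future(self._loader())
        return await self._task


async def demo():
    calls = []

    async def do_the_thing():
        calls.append(1)
        await asyncio.sleep(0)  # yield control so the other callers pile up
        return 42

    shared = OneShotLoader(do_the_thing)
    results = await asyncio.gather(*(shared.get() for _ in range(5)))
    return results, len(calls)


results, call_count = asyncio.run(demo())
```

All five callers receive the same value while the loader runs only once. Unlike a lock, nothing is queued: waiters simply attach to the one pending result.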

Member Author

Yup, we could use an ObservableDeferred here. I think with the reseed logic it becomes a non-trivial rewrite, so I think I'll punt until we have a chance to look at how we should be doing this properly.

+            name="_generate_user_id_linearizer",
+        )
+
     @defer.inlineCallbacks
     def check_username(self, localpart, guest_access_token=None,
                        assigned_user_id=None):
@@ -345,9 +349,11 @@ def check_user_id_not_appservice_exclusive(self, user_id, allowed_appservice=Non
     @defer.inlineCallbacks
     def _generate_user_id(self, reseed=False):
         if reseed or self._next_generated_user_id is None:
-            self._next_generated_user_id = (
-                yield self.store.find_next_generated_user_id_localpart()
-            )
+            with (yield self._generate_user_id_linearizer.queue(())):
Member

Surely we should have a test for self._next_generated_user_id is None inside the linearizer?

Member Author

Oops, yes! I've added a guard inside too, but kept the outer one for the fast path.

+                if reseed or self._next_generated_user_id is None:
+                    self._next_generated_user_id = (
+                        yield self.store.find_next_generated_user_id_localpart()
+                    )

         id = self._next_generated_user_id
         self._next_generated_user_id += 1
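
The guard discussed in the review thread (an outer check for the fast path, plus a second check once the linearizer has been acquired) can be sketched roughly as follows. This is an illustration under assumptions, not the Synapse implementation: `asyncio.Lock` stands in for Synapse's `Linearizer`, and `IdGenerator`, `find_next` and the starting value 10 are hypothetical.

```python
import asyncio


class IdGenerator:
    def __init__(self, find_next):
        self._find_next = find_next  # hypothetical coroutine returning the next free integer
        self._next_id = None
        self._lock = asyncio.Lock()  # stand-in for Synapse's Linearizer

    async def generate(self, reseed=False):
        # Outer check: once seeded, callers skip the lock entirely.
        if reseed or self._next_id is None:
            async with self._lock:
                # Inner check: another caller may have seeded the counter
                # while this one was waiting for the lock.
                if reseed or self._next_id is None:
                    self._next_id = await self._find_next()
        next_id = self._next_id
        self._next_id += 1
        return next_id


async def demo():
    lookups = []

    async def find_next():
        lookups.append(1)
        await asyncio.sleep(0)  # simulate a database round trip
        return 10

    gen = IdGenerator(find_next)  # created inside the running event loop
    ids = await asyncio.gather(*(gen.generate() for _ in range(3)))
    return ids, len(lookups)


ids, lookup_count = asyncio.run(demo())
```

Without the inner check, each caller that queued up behind the first would repeat the lookup and reset the counter, so several callers could be handed the same ID.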
4 changes: 1 addition & 3 deletions synapse/storage/registration.py
@@ -460,14 +460,12 @@ def find_next_generated_user_id_localpart(self):
         """
         def _find_next_generated_user_id(txn):
             txn.execute("SELECT name FROM users")
-            rows = self.cursor_to_dict(txn)

             regex = re.compile("^@(\d+):")

             found = set()

-            for r in rows:
-                user_id = r["name"]
+            for user_id, in txn:
                 match = regex.search(user_id)
                 if match:
                     found.add(int(match.group(1)))
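
The change above iterates the cursor directly and unpacks each one-column row with `for user_id, in txn`, rather than first building a list of dicts. A small self-contained sketch of the same idiom, assuming an in-memory `sqlite3` database as a stand-in for Synapse's transaction object and a made-up set of user names:

```python
import re
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?)",
    [("@1:example.org",), ("@alice:example.org",), ("@7:example.org",)],  # hypothetical rows
)

regex = re.compile(r"^@(\d+):")
found = set()

cur = conn.execute("SELECT name FROM users")
# Each row is a 1-tuple; the trailing comma unpacks its single column.
for user_id, in cur:
    match = regex.search(user_id)
    if match:
        found.add(int(match.group(1)))

conn.close()
```

Here `found` ends up holding only the numeric localparts, and no intermediate list of row dicts is built.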