Add all local users to the user_directory and optionally search them #2723
…lts to additional users Initial commit; this doesn't work yet - the LIKE filtering seems too aggressive. It also needs _do_initial_spam to be aware of prepopulating the whole user_directory_search table with all users... ...and it needs a handle_user_signup() or something to be added so that new signups get incrementally added to the table too. Committing it here as a WIP
@erikjohnston ptal. i'm particularly baffled on how the FTS match could ever have worked on sqlite - I see no evidence that …
@erikjohnston one design thought on this: i have a feeling that the …
Yeah, I think having a flag is going to be best here. I think using a LIKE expression may tie our hands if we need to change how the lookup works (if e.g. it turns out to be slow)
synapse/handlers/profile.py
Outdated
profile = yield self.store.get_profileinfo(target_user.localpart)
yield self.user_directory_handler.handle_local_profile_change(
    target_user.user_id, profile
)
I would move the `if` and the fetching of the profile into `handle_local_profile_change` rather than duplicating it everywhere
mm, point.
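A minimal sketch of that suggestion, assuming a synchronous in-memory store in place of the real Deferred-based one. `FakeStore`, `update_profile_in_user_dir`, and the `include_all_local_users` flag are hypothetical names used only for illustration; only `get_profileinfo` and `handle_local_profile_change` appear in the diff above.

```python
class FakeStore:
    """Hypothetical in-memory stand-in for the profile/user-directory store."""

    def __init__(self, profiles):
        self._profiles = profiles  # localpart -> profile dict
        self.directory = {}        # user_id -> profile dict

    def get_profileinfo(self, localpart):
        return self._profiles[localpart]

    def update_profile_in_user_dir(self, user_id, profile):
        # Hypothetical helper: records the profile in the directory.
        self.directory[user_id] = profile


class UserDirectoryHandler:
    def __init__(self, store, include_all_local_users):
        self.store = store
        # Hypothetical flag standing in for the config check ("the if").
        self.include_all_local_users = include_all_local_users

    def handle_local_profile_change(self, user_id):
        """Do the flag check and the profile fetch here, so callers
        only pass the user ID and nothing is duplicated at call sites."""
        if not self.include_all_local_users:
            return False
        # "@alice:example.org" -> "alice"
        localpart = user_id[1:].split(":", 1)[0]
        profile = self.store.get_profileinfo(localpart)
        self.store.update_profile_in_user_dir(user_id, profile)
        return True
```

Each call site in the profile handler would then shrink to a single `handle_local_profile_change(target_user.user_id)` call (wrapped in `yield` in the real, Deferred-based code).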
synapse/handlers/user_directory.py
Outdated
    yield self._handle_initial_room(room_id)
    num_processed_rooms += 1
    yield sleep(self.INITIAL_SLEEP_MS / 1000.)

logger.info("Processed all rooms.")

if self.include_pattern:
In fact, I would be tempted to do this all the time and do the checks when we filter. Otherwise we need to keep track of people toggling the option on and off
i deliberately wasn't doing this (and instead thought the user can blow away the user_dir caches if they change the option), as for bigger servers it's going to make the user_directory and user_directory_search tables enormous, and slow down the FTS for a feature which may not even be being used...
Given that we add users to the table when they are in a room with anyone else, wouldn't the vast majority of users already be in there?
synapse/handlers/user_directory.py
Outdated
    yield self._handle_initial_room(room_id)
    num_processed_rooms += 1
    yield sleep(self.INITIAL_SLEEP_MS / 1000.)

logger.info("Processed all rooms.")

if self.include_pattern:
    num_processed_users = 0
    user_ids = yield self.store.get_all_local_users()
Do we want to include appservice users here?
I'm also worried that pulling out millions of users is going to be painful.
And it's going to take matrix.org more than a day on the default settings to get through all its users
hm, point. so your suggestion is to batch? and yes, we probably should be including AS users.
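One way the batching idea could look, as a rough sketch rather than what the PR ends up doing: page through local users by ID (keyset pagination) instead of loading them all with `get_all_local_users()`. The `fetch_batch(after, limit)` callable and `make_fake_fetch` are hypothetical; a real store method would run something like `SELECT name FROM users WHERE name > ? ORDER BY name LIMIT ?`.

```python
def iter_user_batches(fetch_batch, batch_size):
    """Yield lists of user IDs, at most batch_size at a time.

    fetch_batch(after, limit) is a hypothetical callable returning up to
    `limit` user IDs that sort strictly after `after`, in ascending order.
    """
    last_seen = ""
    while True:
        batch = fetch_batch(last_seen, batch_size)
        if not batch:
            return
        yield batch
        # Resume after the last ID we saw, so memory use stays bounded
        # and the job can be checkpointed between batches.
        last_seen = batch[-1]


def make_fake_fetch(all_users):
    """In-memory stand-in for the database query (illustration only)."""
    ordered = sorted(all_users)

    def fetch(after, limit):
        return [u for u in ordered if u > after][:limit]

    return fetch
```

The caller would process one batch, sleep, and then ask for the next, so only `batch_size` user IDs are in memory at any time.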
Also, you broke the unit tests and flake8 is sad
@erikjohnston PTAL now.
synapse/storage/profile.py
Outdated
        retcols=("displayname", "avatar_url"),
        desc="get_profileinfo",
    )
except StoreError, e:
Should really be using the `except StoreError as e` syntax
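For illustration, a self-contained sketch of the `as` form. The comma form (`except StoreError, e:`) is Python 2 only, while `as` works on Python 2.6+ and Python 3. The `StoreError` class, `lookup` helper, and the 404 handling below are stand-ins assumed for the example, since the body of the real `except` block is not visible in the diff.

```python
class StoreError(Exception):
    """Hypothetical stand-in for the storage-layer error type."""

    def __init__(self, code, msg):
        super().__init__(msg)
        self.code = code


def lookup(table, key):
    """Illustrative lookup that raises StoreError(404) on a missing row."""
    if key not in table:
        raise StoreError(404, "No row found")
    return table[key]


def get_profileinfo(table, localpart):
    try:
        return lookup(table, localpart)
    except StoreError as e:  # "as" binds the exception on Python 2.6+ and 3
        if e.code == 404:
            # Assumed behaviour for the sketch: no profile yet.
            return None
        raise  # anything else propagates unchanged
```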
Just dirty, I'd say :/
might need slightly more detailed review than that...
Quick and dirty fix to #2720.