-
-
Notifications
You must be signed in to change notification settings - Fork 2.1k
Conversation
Codecov Report
@@ Coverage Diff @@
## develop #4537 +/- ##
===========================================
+ Coverage 75.13% 75.34% +0.21%
===========================================
Files 340 340
Lines 34984 34878 -106
Branches 5739 5705 -34
===========================================
- Hits 26285 26279 -6
+ Misses 7082 6986 -96
+ Partials 1617 1613 -4 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I said to you earlier: there's a lot of stuff going on here; it's hard to get a handle on all of it at once, and breaking it up into smaller chunks would be very helpful, both to help my understanding of what's going on, and in giving confidence that the changes are sensible.
In terms of the actual schema change: it looks like we're simply splitting users_who_share_rooms
in two, based on the value of the share_private
flag. I'm certainly not saying that's a silly thing to do, but at face value it looks like we end up doing twice the read queries (since we have to check each table separately), and I'm not following why this gives a dramatic improvement in the write path. Most likely I've just missed the relevant bit of code among all the other changes: can you spell it out?
It looks like we're dropping the old table but I'm not immediately seeing any code to populate the new ones: am I missing it, is it yet to be written, has it been forgotten?
As a general note on this sort of PR (and following up from what I was saying earlier): it's really helpful to a potential reviewer if:
- you can give a high-level walkthrough of the changes - it's very hard to see where to start when github just presents it as a big diff sorted alphabetically.
- you can give some sort of overview of the state of the PR ("I'm aware that this is going to need more comments, tests, and better naming, but feedback on the schema changes would be useful") - otherwise I tend to assume it's a done deal.
Obviously no need for an essay on either front, but "Rewrite userdir to be faster" doesn't give me much to go on!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this certainly looks way saner. thank you.
A few thoughts and questions...
synapse/handlers/user_directory.py
Outdated
@@ -51,14 +51,14 @@ class UserDirectoryHandler(object): | |||
INITIAL_USER_SLEEP_MS = 10 | |||
|
|||
def __init__(self, hs): | |||
self.hs = hs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As a rule, we try to avoid generic self.hs
references, in favour of pulling things out of hs in the constructor. It makes dependencies more explicit and gives a better chance of detecting broken references early.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It also makes it impossible to reload or edit config during tests :/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, if you want to argue that the advantages of this style outweigh the disadvantages, it's a topic for a retro or something. For now please put it back as it was.
|
||
# Then, re-add them to the tables. | ||
for user_id, profile in iteritems(users_with_profile): | ||
yield self._handle_new_user(room_id, user_id, profile) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it feels like we're (still) repeating ourselves here.
Suppose there are 3 local users in the room: A, B, C.
First we call _handle_new_user with A. That will add entries: (A, A), (A, B), (A, C), (B, A), (C, A).
Then we call _handle_new_user with B. That will add entries: (B, B), (B, A), (B, C), (A, B), (C, B).
Finally we call _handle_new_user with C. That will add entries: (C, C), (C, A), (C, B), (A, C), (B, C).
So most of those entries are being added twice.
(happy if you want to say that's an existing problem, to be ignored for now, but since we're rewriting this it seems a reasonable time to consider it, or at least add a comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Without the adding yourself, that drops one of those, but yes, it does repeat that, but that's an existing problem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Without the adding yourself, that drops one of those
Ironically the "adding yourself" was the only entry which wasn't being redone.
yes, it does repeat that, but that's an existing problem.
Fair enough. I'd still be happier if you could add a TODO or something so that I know I'm not going mad next time I look at this code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks good now other than #4537 (comment) and #4537 (comment).
Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
\o/
Fixes #4581