narrow/recipient: Always centrally filter and sort PM users, and verify that. #4335

gnprice · 2020-12-11T05:31:17Z

This is the larger refactoring branch foreshadowed by #4332, following up further on #4330. This takes care of most of #4035 which has been the main obstacle to #3133, and brings us much closer to being able to handle #4333.

The main thrust of this series is to further centralize responsibility for the fiddly details of how we describe PM conversations. In particular, at the end of this series:

All PM narrows we construct, outside of test code, use data that is verified -- with the help of the type-checker -- to have the right set of users, including or excluding self in a consistent way.

We've previously had a number of bugs where these didn't match. The last few of those were fixed in Fix some "is message in narrow" bugs, and test much more thoroughly #4294; this part doesn't change any behavior, but effectively consolidates those fixes.
All PM narrows we construct are similarly verified to have consistent sorting of the users.

Here we still had some mismatches even after Fix some "is message in narrow" bugs, and test much more thoroughly #4294, which represented at least latent bugs. We fix those here along the way.
Almost all PM narrows we construct are through constructors that receive user IDs as well as emails, setting us up to start recording user IDs in the Narrow objects in the future, i.e. Identify narrows with user IDs and stream IDs, not emails and names #4333.

The remaining exceptions are some 1:1 narrows constructed through pmNarrowFromEmail. These consist of some test data and one remaining non-test callsite.

For enlisting the type-checker to verify that we apply the right processing to the data, we use Flow's "opaque type" feature: the helper functions that do that work return types which code outside the recipient module cannot produce, and then the Narrow factories require those types.

Conveniently, we can still let other code consume the data: we export that the types are subtypes of certain array types (and secretly they really are just those array types), so app code can use the results as usual to get user IDs, etc., in addition to creating narrows.

chrisbobbe · 2020-12-11T17:05:59Z

data that is verified -- with the help of the type-checker -- to have the right set of users, including or excluding self in a consistent way.

All PM narrows we construct are similarly verified to have consistent sorting of the users.

Fascinating! I haven't started reading the branch yet, but I'm curious how we'd manage to get Flow to check these kinds of things.

...Ah, and later in your comment you mention "opaque types". Huh, interesting; I guess I'll find out what that code looks like soon. 😛

chrisbobbe

There's a lot in here, but I think it looks good! Just a few small comments below.

chrisbobbe · 2020-12-11T17:13:48Z

src/utils/narrow.js

@@ -71,11 +71,12 @@ const pmNarrowByString = (emails: string): Narrow => [
 //      the webapp, where the URL that appears in the location bar for a
 //      group PM conversation excludes self -- so it's unusable if you try
 //      to give someone else in it a link to a particular message, say.
-//  * OK, unsorted: getNarrowForReply


narrow [nfc]: Update comment classifying PM-constructor call stacks. In 034da03 the other day, we converted getNarrowForReply (then called getNarrowFromMessage) to use pmKeyRecipientsForMessage. Then in 436d971 we added getNarrowsForMessage, which relies on the same helper. Those are the only new call sites I see with `git log --stat -p -S`, searching for `groupNarrow` or `pmNarrowFromEmails`, since this comment was originally compiled in the series ending at ff872f6.

Nit in commit message: s/pmKeyRecipientsForMessage/pmKeyRecipientsFromMessage

chrisbobbe · 2020-12-11T17:52:58Z

src/utils/recipient.js

+ * outside this module can freely inspect and use the data from a value of
+ * this type, but only code within this module can create one.  See Flow docs:
+ *   https://flow.org/en/docs/types/opaque-types/
+ *


IIUC, this jsdoc could helpfully grow a mention that "freely inspect and use" means freely inspect and use as a subtype of $ReadOnlyArray<PmRecipientUser>; do I understand that correctly? 🙂

I think it's the : $ReadOnlyArray<PmRecipientUser> bit that accomplishes that, right? (If I remove it at this commit, the .map call in pmNarrowFromRecipients becomes suspect.)

This bit from your comment on the PR was helpful for my understanding :

Conveniently, we can still let other code consume the data: we export that the types are subtypes of certain array types (and secretly they really are just those array types), so app code can use the results as usual to get user IDs, etc., in addition to creating narrows.

I got a bit stuck when reading the linked Flow doc; in particular (IIUC), I think they might have done better, in the following, to say something like "outside the defining file, the opaque type is known to be a subtype of the specified supertype" (maybe with a link to how subtypes and supertypes work, if someone needs a refresher). Saying "we allow" may invite an already-confused reader to think Flow is making an exception to its normal way of checking things (exceptions already exist, like with any).

When you add a subtyping constraint to an opaque type alias, we allow the opaque type to be used as the super type when outside of the defining file.

I think it's the : $ReadOnlyArray<PmRecipientUser> bit that accomplishes that, right? (If I remove it at this commit, the .map call in pmNarrowFromRecipients becomes suspect.)

Yep! That's the "subtyping constraint", in that Flow doc's terminology.

"freely inspect and use" means freely inspect and use as a subtype of $ReadOnlyArray<PmRecipientUser>; do I understand that correctly?

That's right. But because that's also the underlying type (the type that PmKeyRecipients is secretly an alias for), that's the same thing as saying it can freely inspect and use all the data inside it.

Another thing we could have done -- which hmm, really that doc should have an example for -- is give a bound / a "subtyping constraint" which is not quite as specific as the underlying type. Like this:

export opaque type PmKeyRecipients: $ReadOnlyArray<{ +email: string, ... }> = $ReadOnlyArray<PmRecipientUser>;

I.e., we'd say "this is an array of some kind of object, where you can read .email and get a string; and there might be more but that's all you get to know about."

Then that .map call:

export const pmNarrowFromRecipients = (recipients: PmKeyRecipients): Narrow => pmNarrowFromEmails(recipients.map(r => r.email));

would still be OK, because it only needs the email property. But we couldn't e.g. look at the full_name property, even though it's there on PmKeyRecipients and therefore should definitely be there on the actual objects at runtime.

Expanded this jsdoc for my upcoming revision.

chrisbobbe · 2020-12-11T19:51:34Z

src/utils/recipient.js

 //
 // Specifically: all users, except the self-user, except if it's the
-// self-1:1 thread then include the self-user after all.
+// self-1:1 thread then include the self-user after all.  Then sort by ID.


6cab20a recipient: Almost always sort recipient lists, by user ID.

Hmm, I'm probably missing something, but I see several comment changes in this commit and wonder if they belong in the previous commit. filterRecipients was made to sort in the previous commit, and I don't yet understand why its implementation comment (and some implementation and interface comments for its callers) are updated (to reflect the sorting) here instead of there.

Yeah. The core reason is that I didn't want to make it part of the interfaces until I could say it was part of the interface for all the validating helpers -- so I wouldn't have to describe a situation where there's one set that validates both filtering and sorting, and another that validates only filtering. That'd be messier, and I basically just didn't want to write the text for it 🙂 .

This comment is formatted as non-jsdoc, but its content is really in the nature of an interface comment, so the same reasoning applies. I guess I originally formatted it as non-jsdoc to emphasize that it's internal to the module.

chrisbobbe · 2020-12-11T20:21:58Z

src/utils/recipient.js

@@ -64,6 +64,15 @@ const filterRecipients = (
    ? recipients
    : recipients.filter(r => r.id !== ownUserId).sort((a, b) => a.id - b.id);

+// Like filterRecipients, but on User objects.


// Like filterRecipients, but on User objects.

UserOrBot objects, I think.

Yeah. Consider this some foreshadowing, perhaps -- I hope to kill that type distinction soon in favor of just User everywhere:
master...gnprice:dev-crossrealmbot-eliminate
Until then, I think of UserOrBot as morally just a flavor of User.

For this comment perhaps I'll spell it "user objects" so there isn't that mismatch.

chrisbobbe · 2020-12-11T21:19:02Z

src/pm-conversations/pmConversationsSelectors.js

@@ -58,7 +53,7 @@ export const getRecentConversations: Selector<PmConversationData[]> = createSele
        /* $FlowFixMe: The keys of unreadPms are logically numbers, but because it's an object they
         end up converted to strings, so this access with string keys works.  We should probably use
         a Map for this and similar maps. */
-        unreadPms[recipient.ids] || unreadHuddles[recipient.ids],
+        unreadPms[recipient.unreadsKey] || unreadHuddles[recipient.unreadsKey],


Hmm, I'm seeing that recipient.unreadsKey ends up being string | any, not sure why. Then again, recipient.ids before this commit is also string | any. 🤷

Yeah, that seems to be some kind of bug in the types Flow shows for the IDE.

If you try using the value by e.g. taking some property that strings don't have, it'll show exactly the error you'd hope it would.

In 034da03 the other day, we converted getNarrowForReply (then called getNarrowFromMessage) to use pmKeyRecipientsFromMessage. Then in 436d971 we added getNarrowsForMessage, which relies on the same helper. Those are the only new call sites I see with `git log --stat -p -S`, searching for `groupNarrow` or `pmNarrowFromEmails`, since this comment was originally compiled in the series ending at ff872f6.

… data. The type on this function's argument, an "opaque type alias", guarantees that the data comes from one of our helper functions in `recipient.js`. See jsdoc within. This is one of a series of commits which will ultimately let us ensure that all our PM narrows are constructed with data that (a) observes the right convention for whether to include the self user, (b) is sorted consistently, and (c) contains user IDs as well as emails, so that we can subsequently switch to using user IDs in the Narrow values themselves. To enlist Flow to help us with (a) and (b), we'll use these opaque types to ensure that the data always passes through one of a few helper functions responsible for those conditions.

This is just like the old pmNarrowFromEmail, but takes a whole user object. Using this function guarantees that we *have* a user object on hand -- and therefore not just an email but a user ID. Once we're using this and other such constructors exclusively, that will enable us to start identifying PM narrows by user IDs, rather than by emails.

…mUser. Done like so: $ perl -i -0pe ' s/pmNarrowFromEmail$(.*?)\.email$/pm1to1NarrowFromUser($1)/g; s/\bpmNarrowFromEmail\b/pm1to1NarrowFromUser/g ' src/typing/__tests__/typingSelectors-test.js src/chat/__tests__/narrowsReducer-test.js $ tools/fmt The file limitation is mainly to keep it to files where the old can be eliminated entirely this way. A few other files call it with literal strings, so need a bit more work first.

Using this instead of pmNarrowFromEmails will do two things for us: * It helps us reduce to zero the usage of the email-based constructors, so that we can switch to using user IDs in our Narrow objects. * It sorts the users by ID automatically. As we move over the next few commits toward doing the same thing consistently in app code, we can have tests switch to this function to follow along, saving us from manually writing out the sort in a bunch of spots in our tests.

This also starts us sorting by ID in pmUiRecipientsFromMessage. But pmKeyRecipientsFromMessage is more interesting, because there it'll shortly make sense to make that part of its interface. In practice, for both functions it's already sorted because that's how a group PM's display_recipient comes from the server. The similar change in filterRecipientsAsUserIds is there for parallelism, but has no effect because its caller already sorts the same way anyway. On filterRecipientsByEmail we don't have the user IDs, so we can't do our standard sorting. This is OK because its only caller does its own sorting anyway.

This is "almost" because we don't do so in the deprecated `normalizeRecipientsSansMe`, which only takes emails as input. There's also still one codepath (through CreateGroupScreen) where we construct a PM narrow without going through the recipient helpers at all. In upcoming commits we'll make this rule universal for constructing PM narrows.

Better yet will be to make this use whole User objects, and go through a central, validating helper. We'll do that a bit later in the branch, but first this is a very easy change that gets this callsite out of the way of universally consistent sorting.

… true. This doesn't actually change what data we use in any of these codepaths -- just the names. Ultimately we probably want to be have just one map, namely "all users, by ID", and drop the "all" from its name because we'll stop having to think about the distinction. But until we're using "all users" maps everywhere, the names will help us track where we are vs. where we aren't yet. A few of these were recently changed to be "all users"; most of them have been for a long time.

This ensures that we can adequately test whether and how we're sorting the users in the output.

We're still getting joined-email strings as input here, though; we'll convert those to real user objects too in a subsequent commit.

In the previous commit, we stopped passing these joined-email strings down to GroupPmConversationItem. This commit continues that by eliminating them at the next step back, where we were generating them; and it lets us construct the chosen PM narrow using real users, with a type that guarantees they've been appropriately filtered and sorted by our recipient helpers. By relying on our centralized recipient-processing helpers, this also gets the users sorted by ID -- which eliminates the last remaining mismatch between how we sort users to describe a PM conversation, fixing a latent bug.

…site! We're now down to zero callsites (in non-test code) that call pmNarrowFromEmails directly! All callers that construct a PM narrow now use one of our new constructors that require types containing user IDs as well as emails, and constructible only by helpers that guarantee they properly filter and sort the list. This change is NFC because this callsite was already sorting the list, and the users in it are chosen from a UI which doesn't include the self user in the first place.

…strings. These are prompted by having calls to pmNarrowFromEmails, which we're about to systematically eliminate.

Now the only calls to this function are in the higher-level Narrow constructors, and in its own tests. Done automatically with these commands: $ perl -i -0pe ' s#pmNarrowFromEmails$\[(\S+\.email,?\s*)*\]$# $& =~ s/(\S+)\.email/$1/gr =~ s/^\w+$\[(.*)\]$/pmNarrowFromUsersUnsafe([$1])/r #eg && s/pmNarrowFromEmails(?=[, ])/pmNarrowFromUsersUnsafe/g ' src/**/__tests__/*.js $ tools/fmt followed by restoring the handful of references in its own tests.

The only remaining calls to this function are the ones within its own module, including the higher-level, validating constructors. As a result: * All PM narrows we construct are validated to have the right set of users, including or excluding self consistently (except some in test data that use pmNarrowFromUsersUnsafe for convenience.) * All PM narrows we construct are validated to have consistent sorting of the users. * Almost all PM narrows we construct are through constructors that receive user IDs as well as emails, setting us up to start recording user IDs in the Narrow objects in the future. The remaining exceptions are some 1:1 narrows constructed through pmNarrowFromEmail; these consist of some test data and one remaining non-test callsite.

gnprice · 2020-12-12T00:10:54Z

Pushed a new revision!

chrisbobbe · 2020-12-12T00:53:40Z

Great, merged! Thanks for tightening this up so much, and introducing opaque types to the code. 🙂

gnprice · 2020-12-12T01:15:20Z

Thanks for the review!

This completes a major objective of the long string of refactoring that appeared in the series of PRs zulip#4330, zulip#4332, zulip#4335, zulip#4339, zulip#4342, then zulip#4346, zulip#4356, zulip#4361, zulip#4364, and most recently zulip#4368. After this change, the portion of zulip#4333 that's about PMs, emails, and user IDs -- aka the portion of zulip#3764 that's about narrows -- is complete. Still open from zulip#4333 is to convert stream and topic narrows from stream names to stream IDs. I'm hopeful that that will be simpler: (a) unlike the situation for PMs, there's just one stream mentioned at a time, so there's no question of sorting, and (b) there isn't the "include self or not" complication that's bedeviled much of our code related to PMs. In other words, stream and topic narrows don't suffer from zulip#4035.

gnprice requested a review from chrisbobbe December 11, 2020 05:31

chrisbobbe reviewed Dec 11, 2020

View reviewed changes

gnprice added 17 commits December 11, 2020 16:08

narrow [nfc]: Add another checked-input constructor, pmNarrowFromUsers.

bed2f2a

pm-conversations tests: Drop reliance on specific user IDs.

4ed0c13

This ensures that we can adequately test whether and how we're sorting the users in the output.

PmConversationList [nfc]: Pass real Users, not joined-emails string.

077574e

We're still getting joined-email strings as input here, though; we'll convert those to real user objects too in a subsequent commit.

tests [nfc]: Well-type a few more tests, and avoid magic-coincidence …

5814cb5

…strings. These are prompted by having calls to pmNarrowFromEmails, which we're about to systematically eliminate.

gnprice force-pushed the pr-recipient branch from e5b7df2 to 998947d Compare December 12, 2020 00:10

chrisbobbe merged commit 998947d into zulip:master Dec 12, 2020

gnprice deleted the pr-recipient branch December 12, 2020 01:16

This was referenced Dec 15, 2020

narrow: Encode our own structured keys instead of JSON. #4339

Merged

narrow: Encapsulate the Narrow type within its module, checked with an opaque type #4342

Merged

Pin down the main forms of identifying a PM thread, and their conversion helpers #4035

Closed

gnprice mentioned this pull request Dec 23, 2020

Opening a notification for nonexistent user/stream causes crash #4309

Open

chrisbobbe mentioned this pull request Jan 5, 2021

narrow: Stop storing emails! #4382

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

narrow/recipient: Always centrally filter and sort PM users, and verify that. #4335

narrow/recipient: Always centrally filter and sort PM users, and verify that. #4335

gnprice commented Dec 11, 2020

chrisbobbe commented Dec 11, 2020

chrisbobbe left a comment

chrisbobbe Dec 11, 2020

chrisbobbe Dec 11, 2020

gnprice Dec 11, 2020

gnprice Dec 11, 2020

chrisbobbe Dec 11, 2020

gnprice Dec 11, 2020

chrisbobbe Dec 11, 2020

gnprice Dec 11, 2020

chrisbobbe Dec 11, 2020

gnprice Dec 11, 2020

gnprice commented Dec 12, 2020

chrisbobbe commented Dec 12, 2020

gnprice commented Dec 12, 2020

narrow/recipient: Always centrally filter and sort PM users, and verify that. #4335

narrow/recipient: Always centrally filter and sort PM users, and verify that. #4335

Conversation

gnprice commented Dec 11, 2020

chrisbobbe commented Dec 11, 2020

chrisbobbe left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

gnprice commented Dec 12, 2020

chrisbobbe commented Dec 12, 2020

gnprice commented Dec 12, 2020