Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

narrow/recipient: Always centrally filter and sort PM users, and verify that. #4335

Merged
merged 17 commits into from
Dec 12, 2020

Conversation

gnprice
Copy link
Member

@gnprice gnprice commented Dec 11, 2020

This is the larger refactoring branch foreshadowed by #4332, following up further on #4330. This takes care of most of #4035 which has been the main obstacle to #3133, and brings us much closer to being able to handle #4333.

The main thrust of this series is to further centralize responsibility for the fiddly details of how we describe PM conversations. In particular, at the end of this series:

  • All PM narrows we construct, outside of test code, use data that is verified -- with the help of the type-checker -- to have the right set of users, including or excluding self in a consistent way.

    We've previously had a number of bugs where these didn't match. The last few of those were fixed in Fix some "is message in narrow" bugs, and test much more thoroughly #4294; this part doesn't change any behavior, but effectively consolidates those fixes.

  • All PM narrows we construct are similarly verified to have consistent sorting of the users.

    Here we still had some mismatches even after Fix some "is message in narrow" bugs, and test much more thoroughly #4294, which represented at least latent bugs. We fix those here along the way.

  • Almost all PM narrows we construct are through constructors that receive user IDs as well as emails, setting us up to start recording user IDs in the Narrow objects in the future, i.e. Identify narrows with user IDs and stream IDs, not emails and names #4333.

    The remaining exceptions are some 1:1 narrows constructed through pmNarrowFromEmail. These consist of some test data and one remaining non-test callsite.

For enlisting the type-checker to verify that we apply the right processing to the data, we use Flow's "opaque type" feature: the helper functions that do that work return types which code outside the recipient module cannot produce, and then the Narrow factories require those types.

Conveniently, we can still let other code consume the data: we export that the types are subtypes of certain array types (and secretly they really are just those array types), so app code can use the results as usual to get user IDs, etc., in addition to creating narrows.

@gnprice gnprice requested a review from chrisbobbe December 11, 2020 05:31
@chrisbobbe
Copy link
Contributor

data that is verified -- with the help of the type-checker -- to have the right set of users, including or excluding self in a consistent way.

All PM narrows we construct are similarly verified to have consistent sorting of the users.

Fascinating! I haven't started reading the branch yet, but I'm curious how we'd manage to get Flow to check these kinds of things.

...Ah, and later in your comment you mention "opaque types". Huh, interesting; I guess I'll find out what that code looks like soon. 😛

Copy link
Contributor

@chrisbobbe chrisbobbe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a lot in here, but I think it looks good! Just a few small comments below.

@@ -71,11 +71,12 @@ const pmNarrowByString = (emails: string): Narrow => [
// the webapp, where the URL that appears in the location bar for a
// group PM conversation excludes self -- so it's unusable if you try
// to give someone else in it a link to a particular message, say.
// * OK, unsorted: getNarrowForReply
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

narrow [nfc]: Update comment classifying PM-constructor call stacks.

In 034da03 the other day, we converted getNarrowForReply (then
called getNarrowFromMessage) to use pmKeyRecipientsForMessage.

Then in 436d971 we added getNarrowsForMessage, which relies on the
same helper.

Those are the only new call sites I see with `git log --stat -p -S`,
searching for `groupNarrow` or `pmNarrowFromEmails`, since this comment
was originally compiled in the series ending at ff872f6.

Nit in commit message: s/pmKeyRecipientsForMessage/pmKeyRecipientsFromMessage

* outside this module can freely inspect and use the data from a value of
* this type, but only code within this module can create one. See Flow docs:
* https://flow.org/en/docs/types/opaque-types/
*
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC, this jsdoc could helpfully grow a mention that "freely inspect and use" means freely inspect and use as a subtype of $ReadOnlyArray<PmRecipientUser>; do I understand that correctly? 🙂

I think it's the : $ReadOnlyArray<PmRecipientUser> bit that accomplishes that, right? (If I remove it at this commit, the .map call in pmNarrowFromRecipients becomes suspect.)

This bit from your comment on the PR was helpful for my understanding :

Conveniently, we can still let other code consume the data: we export that the types are subtypes of certain array types (and secretly they really are just those array types), so app code can use the results as usual to get user IDs, etc., in addition to creating narrows.

I got a bit stuck when reading the linked Flow doc; in particular (IIUC), I think they might have done better, in the following, to say something like "outside the defining file, the opaque type is known to be a subtype of the specified supertype" (maybe with a link to how subtypes and supertypes work, if someone needs a refresher). Saying "we allow" may invite an already-confused reader to think Flow is making an exception to its normal way of checking things (exceptions already exist, like with any).

When you add a subtyping constraint to an opaque type alias, we allow the opaque type to be used as the super type when outside of the defining file.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's the : $ReadOnlyArray<PmRecipientUser> bit that accomplishes that, right? (If I remove it at this commit, the .map call in pmNarrowFromRecipients becomes suspect.)

Yep! That's the "subtyping constraint", in that Flow doc's terminology.

"freely inspect and use" means freely inspect and use as a subtype of $ReadOnlyArray<PmRecipientUser>; do I understand that correctly?

That's right. But because that's also the underlying type (the type that PmKeyRecipients is secretly an alias for), that's the same thing as saying it can freely inspect and use all the data inside it.

Another thing we could have done -- which hmm, really that doc should have an example for -- is give a bound / a "subtyping constraint" which is not quite as specific as the underlying type. Like this:

export opaque type PmKeyRecipients: $ReadOnlyArray<{ +email: string, ... }> =
  $ReadOnlyArray<PmRecipientUser>;

I.e., we'd say "this is an array of some kind of object, where you can read .email and get a string; and there might be more but that's all you get to know about."

Then that .map call:

export const pmNarrowFromRecipients = (recipients: PmKeyRecipients): Narrow =>
  pmNarrowFromEmails(recipients.map(r => r.email));

would still be OK, because it only needs the email property. But we couldn't e.g. look at the full_name property, even though it's there on PmKeyRecipients and therefore should definitely be there on the actual objects at runtime.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Expanded this jsdoc for my upcoming revision.

//
// Specifically: all users, except the self-user, except if it's the
// self-1:1 thread then include the self-user after all.
// self-1:1 thread then include the self-user after all. Then sort by ID.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

6cab20a recipient: Almost always sort recipient lists, by user ID.

Hmm, I'm probably missing something, but I see several comment changes in this commit and wonder if they belong in the previous commit. filterRecipients was made to sort in the previous commit, and I don't yet understand why its implementation comment (and some implementation and interface comments for its callers) are updated (to reflect the sorting) here instead of there.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah. The core reason is that I didn't want to make it part of the interfaces until I could say it was part of the interface for all the validating helpers -- so I wouldn't have to describe a situation where there's one set that validates both filtering and sorting, and another that validates only filtering. That'd be messier, and I basically just didn't want to write the text for it 🙂 .

This comment is formatted as non-jsdoc, but its content is really in the nature of an interface comment, so the same reasoning applies. I guess I originally formatted it as non-jsdoc to emphasize that it's internal to the module.

@@ -64,6 +64,15 @@ const filterRecipients = (
? recipients
: recipients.filter(r => r.id !== ownUserId).sort((a, b) => a.id - b.id);

// Like filterRecipients, but on User objects.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

// Like filterRecipients, but on User objects.

UserOrBot objects, I think.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah. Consider this some foreshadowing, perhaps -- I hope to kill that type distinction soon in favor of just User everywhere:
master...gnprice:dev-crossrealmbot-eliminate
Until then, I think of UserOrBot as morally just a flavor of User.

For this comment perhaps I'll spell it "user objects" so there isn't that mismatch.

@@ -58,7 +53,7 @@ export const getRecentConversations: Selector<PmConversationData[]> = createSele
/* $FlowFixMe: The keys of unreadPms are logically numbers, but because it's an object they
end up converted to strings, so this access with string keys works. We should probably use
a Map for this and similar maps. */
unreadPms[recipient.ids] || unreadHuddles[recipient.ids],
unreadPms[recipient.unreadsKey] || unreadHuddles[recipient.unreadsKey],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I'm seeing that recipient.unreadsKey ends up being string | any, not sure why. Then again, recipient.ids before this commit is also string | any. 🤷

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that seems to be some kind of bug in the types Flow shows for the IDE.

If you try using the value by e.g. taking some property that strings don't have, it'll show exactly the error you'd hope it would.

In 034da03 the other day, we converted getNarrowForReply (then
called getNarrowFromMessage) to use pmKeyRecipientsFromMessage.

Then in 436d971 we added getNarrowsForMessage, which relies on the
same helper.

Those are the only new call sites I see with `git log --stat -p -S`,
searching for `groupNarrow` or `pmNarrowFromEmails`, since this comment
was originally compiled in the series ending at ff872f6.
… data.

The type on this function's argument, an "opaque type alias",
guarantees that the data comes from one of our helper functions in
`recipient.js`.  See jsdoc within.

This is one of a series of commits which will ultimately let us ensure
that all our PM narrows are constructed with data that (a) observes
the right convention for whether to include the self user, (b) is
sorted consistently, and (c) contains user IDs as well as emails, so
that we can subsequently switch to using user IDs in the Narrow values
themselves.  To enlist Flow to help us with (a) and (b), we'll use
these opaque types to ensure that the data always passes through one
of a few helper functions responsible for those conditions.
This is just like the old pmNarrowFromEmail, but takes a whole user
object.  Using this function guarantees that we *have* a user object
on hand -- and therefore not just an email but a user ID.  Once we're
using this and other such constructors exclusively, that will enable
us to start identifying PM narrows by user IDs, rather than by emails.
…mUser.

Done like so:

  $ perl -i -0pe '
      s/pmNarrowFromEmail\((.*?)\.email\)/pm1to1NarrowFromUser($1)/g;
      s/\bpmNarrowFromEmail\b/pm1to1NarrowFromUser/g
    ' src/typing/__tests__/typingSelectors-test.js src/chat/__tests__/narrowsReducer-test.js
  $ tools/fmt

The file limitation is mainly to keep it to files where the old can be
eliminated entirely this way.  A few other files call it with literal
strings, so need a bit more work first.
Using this instead of pmNarrowFromEmails will do two things for us:

 * It helps us reduce to zero the usage of the email-based constructors,
   so that we can switch to using user IDs in our Narrow objects.

 * It sorts the users by ID automatically.  As we move over the next
   few commits toward doing the same thing consistently in app code,
   we can have tests switch to this function to follow along, saving us
   from manually writing out the sort in a bunch of spots in our tests.
This also starts us sorting by ID in pmUiRecipientsFromMessage.  But
pmKeyRecipientsFromMessage is more interesting, because there it'll
shortly make sense to make that part of its interface.  In practice,
for both functions it's already sorted because that's how a group PM's
display_recipient comes from the server.

The similar change in filterRecipientsAsUserIds is there for
parallelism, but has no effect because its caller already sorts the
same way anyway.

On filterRecipientsByEmail we don't have the user IDs, so we can't do
our standard sorting.  This is OK because its only caller does its own
sorting anyway.
This is "almost" because we don't do so in the deprecated
`normalizeRecipientsSansMe`, which only takes emails as input.
There's also still one codepath (through CreateGroupScreen) where
we construct a PM narrow without going through the recipient
helpers at all.

In upcoming commits we'll make this rule universal for constructing
PM narrows.
Better yet will be to make this use whole User objects, and go through
a central, validating helper.  We'll do that a bit later in the
branch, but first this is a very easy change that gets this callsite
out of the way of universally consistent sorting.
… true.

This doesn't actually change what data we use in any of these
codepaths -- just the names.

Ultimately we probably want to be have just one map, namely
"all users, by ID", and drop the "all" from its name because we'll
stop having to think about the distinction.  But until we're using
"all users" maps everywhere, the names will help us track where we
are vs. where we aren't yet.

A few of these were recently changed to be "all users"; most of
them have been for a long time.
This ensures that we can adequately test whether and how we're
sorting the users in the output.
We're still getting joined-email strings as input here, though;
we'll convert those to real user objects too in a subsequent commit.
In the previous commit, we stopped passing these joined-email strings
down to GroupPmConversationItem.  This commit continues that by
eliminating them at the next step back, where we were generating them;
and it lets us construct the chosen PM narrow using real users, with a
type that guarantees they've been appropriately filtered and sorted by
our recipient helpers.

By relying on our centralized recipient-processing helpers, this also
gets the users sorted by ID -- which eliminates the last remaining
mismatch between how we sort users to describe a PM conversation,
fixing a latent bug.
…site!

We're now down to zero callsites (in non-test code) that call
pmNarrowFromEmails directly!  All callers that construct a PM narrow
now use one of our new constructors that require types containing
user IDs as well as emails, and constructible only by helpers that
guarantee they properly filter and sort the list.

This change is NFC because this callsite was already sorting the list,
and the users in it are chosen from a UI which doesn't include the
self user in the first place.
…strings.

These are prompted by having calls to pmNarrowFromEmails, which we're
about to systematically eliminate.
Now the only calls to this function are in the higher-level Narrow
constructors, and in its own tests.

Done automatically with these commands:

  $ perl -i -0pe '
      s#pmNarrowFromEmails\(\[(\S+\.email,?\s*)*\]\)#
         $& =~ s/(\S+)\.email/$1/gr
            =~ s/^\w+\(\[(.*)\]\)/pmNarrowFromUsersUnsafe([$1])/r
       #eg
      && s/pmNarrowFromEmails(?=[, ])/pmNarrowFromUsersUnsafe/g
    ' src/**/__tests__/*.js
  $ tools/fmt

followed by restoring the handful of references in its own tests.
The only remaining calls to this function are the ones within its
own module, including the higher-level, validating constructors.

As a result:

 * All PM narrows we construct are validated to have the right set
   of users, including or excluding self consistently (except some
   in test data that use pmNarrowFromUsersUnsafe for convenience.)

 * All PM narrows we construct are validated to have consistent
   sorting of the users.

 * Almost all PM narrows we construct are through constructors that
   receive user IDs as well as emails, setting us up to start
   recording user IDs in the Narrow objects in the future.

   The remaining exceptions are some 1:1 narrows constructed through
   pmNarrowFromEmail; these consist of some test data and one
   remaining non-test callsite.
@gnprice
Copy link
Member Author

gnprice commented Dec 12, 2020

Pushed a new revision!

@chrisbobbe
Copy link
Contributor

Great, merged! Thanks for tightening this up so much, and introducing opaque types to the code. 🙂

@chrisbobbe chrisbobbe merged commit 998947d into zulip:master Dec 12, 2020
@gnprice
Copy link
Member Author

gnprice commented Dec 12, 2020

Thanks for the review!

gnprice added a commit to gnprice/zulip-mobile that referenced this pull request Jan 6, 2021
This completes a major objective of the long string of refactoring
that appeared in the series of PRs zulip#4330, zulip#4332, zulip#4335, zulip#4339, zulip#4342,
then zulip#4346, zulip#4356, zulip#4361, zulip#4364, and most recently zulip#4368.  After this
change, the portion of zulip#4333 that's about PMs, emails, and user IDs --
aka the portion of zulip#3764 that's about narrows -- is complete.

Still open from zulip#4333 is to convert stream and topic narrows from
stream names to stream IDs.  I'm hopeful that that will be simpler:
(a) unlike the situation for PMs, there's just one stream mentioned
at a time, so there's no question of sorting, and (b) there isn't
the "include self or not" complication that's bedeviled much of our
code related to PMs.  In other words, stream and topic narrows
don't suffer from zulip#4035.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants