Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

redux: Provide the global state to each sub-reducer #4437

Merged
merged 18 commits into from
Jan 27, 2021

Conversation

gnprice
Copy link
Member

@gnprice gnprice commented Jan 27, 2021

There are a lot of places in our various reducers where to best do their job, they really want some data from elsewhere in our app state, outside the bit that that particular reducer maintains.

For example, several different reducers want to know when handling an EVENT_NEW_MESSAGE whether the message was sent by the self user, because they need to treat it differently in that case. We currently accommodate this with a hack, where in eventToAction where we turn the server event into a Redux action, we stick in an extra property ownUserId that's selected straight off the current state.

We have similar hacks to provide state.caughtUp on the same event for the sake of some other reducers, and to provide ownUserId or state.messages on some other event actions. These hacks aren't limited to actions for server events, either: we have ownUserId on MESSAGE_FETCH_COMPLETE actions, too.

And then there's more places where we should be using information from elsewhere in the state but haven't been, and as a result doing far more computation than should be necessary to perform some tasks that ought to be very fast.

For example, when some messages get marked as read -- a very common operation, sitting in the critical path for updating the "N unreads" banner, which the user is probably noticing right at that moment -- the "unread" reducer has to sweep through its entire data structure searching for the given message IDs so it can remove them. If there are a lot of unreads, this can take hundreds of ms, which is painfully long. (This is #4438; it's an example of #3339 and related to #3949.) What we should be doing there is to consult state.messages, see exactly what conversation each message appears in, and go straight to that part of the unreads data structure and touch that and nothing else.

Why haven't we been better using the information available to us in places like that? One factor is that it's rather a pain to add one of these hacks. Another, probably, is that it's visibly a hack -- and perhaps even because we've felt that the Redux Way is to have each sub-reducer know nothing about the rest of the state and so there's something somehow wrong with doing so.

Well, that version of the Redux Way might be a good idea in situations where different sub-reducers are managing areas of data that are truly independent of each other, and benefit from being isolated. But as these examples show, many or most of our state subtrees are in reality closely interconnected. In particular they're full of IDs that refer to things in other subtrees -- messages, users, streams -- and they're naturally structured by facts about those things, facts which are maintained in those other subtrees. So let's put that idea to rest:

  • Our reducers often should use data from elsewhere in the state.

    If you think one should, it probably should.

And let's start providing a non-hacky way for them to do that. (Or at least a less-hacky way.)


At the end of this PR, each sub-reducer is now simply passed the global state as an argument. For the moment, most of them just ignore it.

Converting any given sub-reducer to actually use that argument will be a bit of work, just because its tests will all need to be updated to pass it.

To make an end-to-end demo, we go ahead and do this for the unread reducer, and use that to eliminate one use of the state-embedded-in-action hack.

Before making these changes, we need to take control of the overall structure of our main reducer, which is currently built using Redux's combineReducers. The initial part of the PR does that, inlining combineReducers and then simplifying the result, so that when we get to actually passing the global state it's a very straightforward change.

Copy link
Contributor

@chrisbobbe chrisbobbe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great, thanks! Just some small comments below.

@@ -61,7 +60,78 @@ const reducers = {

export const ALL_KEYS: string[] = Object.keys(reducers);

const combinedReducer: CombinedReducer<GlobalState, Action> = combineReducers(
// Inlined just now from React upstream.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inlined just now from React upstream.

From Redux upstream, right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah indeed, thanks 🙂

@@ -61,8 +61,7 @@ function applyReducer<Key: $Keys<GlobalState>, State>(
return nextState;
}

// Inlined just now from React upstream.
// We'll clean this up in the next few commits.
// Based on React upstream's combineReducers.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on React upstream's combineReducers.

Redux upstream, I think?

globalState is not void.

This is OK because it's only ever void at the initialization action,
and no reducer should do anythng there other than return its initial
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

anythng

nit: "anything"

*
* See `reduxState` for a version starting from a minimal state.
*/
export const reduxStatePlus = (extra?: $Rest<GlobalState, {}>): GlobalState =>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you say a bit about the choice of names for reduxStatePlus and plusReduxState? They seem pretty similar.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I don't love the names, and probably better names are possible. Happy to hear suggestions 🙂

I think having them similar to each other is pretty OK because they're basically the same underlying content with different forms of input. (In particular eg.plusReduxState is exactly equivalent to eg.reduxStatePlus(), with no argument.) So the only differences between them are the ones you see right in the source where they're being used; there's no hidden subtlety to have to keep straight, and if you just think of them as basically the same thing then that's pretty much ideal.

The "plus" is there because we'll still want to have eg.baseReduxState and the helper for adding stuff to it -- that one is canonical in a way that this one won't be. And because I want to communicate explicitly somehow that this is a state that has extra things added to it, things which we try to make more or less generic and reusable across our tests, but which are arbitrary and made up for the purpose of tests.

@@ -453,15 +453,75 @@ export const makeOutboxMessage = (data: $Shape<$Diff<Outbox, {| id: mixed |}>>):

const privateReduxStore = createStore(rootReducer);

/** The global Redux state, at its initial value. */
/**
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

example data: Add a batteries-included version of the Redux state.

Cool, sounds good! Linking some prior brainstorming about this: #4299 (comment).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, thanks for locating that 🙂 Rereading, it looks like this covers all the thoughts there, either directly or as TODO comments; so that's reassuring.

We'll use this to give ourselves some more flexibility in the
structure of our overall reducer function.

The upstream combineReducers has a number of dev-only checks to make
sure that the reducers object has the right shape and the individual
sub-reducers have the right behavior.  These checks aren't helpful
for us (they're largely subsumed by type-checking, and they're mostly
helpful when first figuring out how to use Redux in any case), so
we'll delete them shortly.  This version just comments out the key
bits of them, to save us from having to copy all the detailed helper
functions for those checks only to delete them.
As mentioned in the previous commit, we don't really need these
checks, and so we disabled them when inlining this code in the
first place.
Now our normal auto-formatting applies here as usual.
This test never actually quite checked the thing it said it did:
if you read the code, it was saying the converse, that every key
that *was* listed in the store config was indeed the name of some
reducer.  Then also checking that the lengths matched... but
nothing ensured that the keys listed in the store config are
distinct, so if we were to accidentally have a duplicate then we
could also have one missing, and this test wouldn't notice.

Fix the test, and while we're at it also get the list of keys
straight from the behavior of the overall reducer -- from the state
it returns on initialization -- rather than from the internals of
its implementation.  This lets us stop exporting that bit of its
internals, too.
Now that we control the combined reducer's code, we can put this logic
there explicitly!

This will in turn help us unpack the combined-reducer logic further,
to help open up substantive changes we want to make.
This simplifies the loop body and sets us up to unroll it explicitly,
which in particular will make everything visible to Flow and let us
remove the fixmes.
This will just simplify a bit an upcoming refactoring.
The list of our reducers is already right here in the same source
file, so we don't really need all this metaprogramming on it.  By
unrolling the loop and unpacking what it does, we make everything
more explicit, and in particular make it explicit to the type-checker
so that it's able to check the types and we no longer need any fixme
comments to turn it off.
We've already done, for our overall reducer in src/boot/reducers.js,
the work of working through all the details of what combineReducers
does and unpacking it for an explicit set of sub-reducers.

So rather than repeat that, go in one step by basically copying the
structure of that resulting code.  Because we don't do the
maybeLogSlowReducer thing within this sub-reducer, we can simplify a
bit more by inlining
  applyReducer(key, subReducer, subState, action)
to just
  subReducer(subState, action)
.
There are a lot of places in our various reducers where to best do
their job, they really want some data from elsewhere in our app state,
outside the bit that that particular reducer maintains.

For example, several different reducers want to know when handling an
EVENT_NEW_MESSAGE whether the message was sent by the self user,
because they need to treat it differently in that case.  We currently
accommodate this with a hack, where in `eventToAction` where we turn
the server event into a Redux action, we stick in an extra property
`ownUserId` that's selected straight off the current state.

We have similar hacks to provide `state.caughtUp` on the same event
for the sake of some other reducers, and to provide `ownUserId` or
`state.messages` on some other event actions.  These hacks aren't
limited to actions for server events, either: we have `ownUserId` on
`MESSAGE_FETCH_COMPLETE` actions, too.

And then there's more places where we should be using information
from elsewhere in the state but haven't been, and as a result doing
far more computation than should be necessary to perform some tasks
that ought to be very fast.

For example, when some messages get marked as read -- a very common
operation, sitting in the critical path for updating the "N unreads"
banner, which the user is probably noticing right at that moment --
the "unread" reducer has to sweep through its entire data structure
searching for the given message IDs so it can remove them.  If there
are a lot of unreads, this can take hundreds of ms, which is
painfully long.  What we should be doing there is to consult
`state.messages`, see exactly what conversation each message appears
in, and go straight to that part of the unreads data structure and
touch that and nothing else.

Why haven't we been better using the information available to us in
places like that?  One factor is that it's rather a pain to add one
of these hacks.  Another, probably, is that it's visibly a hack --
and perhaps even because we've felt that the Redux Way is to have
each sub-reducer know nothing about the rest of the state and so
there's something somehow wrong with doing so.

Well, that version of the Redux Way might be a good idea in
situations where different sub-reducers are managing areas of data
that are truly independent of each other, and benefit from being
isolated.  But as these examples show, many or most of our state
subtrees are in reality closely interconnected.  In particular
they're full of IDs that refer to things in other subtrees --
messages, users, streams -- and they're naturally structured by
facts about those things, facts which are maintained in those other
subtrees.  So let's put that idea to rest:

  Our reducers often *should* use data from elsewhere in the state.

  If you think one should, it probably should.

And, with this commit, let's start providing a non-hacky way for
them to do that.  (Or at least a less-hacky way.)

Each sub-reducer is now simply passed the global state as an
argument.  For the moment, they all just ignore it.

Converting any given sub-reducer to actually use that argument will
be a bit of work, just because its tests will all need to be updated
to pass it.

But then once that's done for any given sub-reducer, it can go on to
inspect the state however it pleases, and use any part of the state
that it needs to.
As we start having reducers take the global state as an additional
argument, we're going to more frequently have a need for an example
global state that includes common data like the self user.  It's
already perhaps past time to have such a thing provided centrally in
exampleData.  Add one now.
This change consists almost entirely of just having the tests start
passing this argument, now that it's required, and using the new
batteries-included state `eg.plusReduxState` for it.
This is one of the places where we've already been using information
from elsewhere in the state, and have done so via a hack where that
information gets stuck into the action as an extra property.

Instead, use our new more-direct mechanism for getting at such data
in a reducer, where it's simply passed down as an argument.

This is a step toward cleaning up that `ownUserId` hack.  It also
serves as a demonstration of the new mechanism, by being the first
place we substantively use it.
@gnprice
Copy link
Member Author

gnprice commented Jan 27, 2021

Pushed a new revision, with those nits fixed.

@chrisbobbe chrisbobbe merged commit a2000b9 into zulip:master Jan 27, 2021
@chrisbobbe
Copy link
Contributor

Great, thanks! Merged.

@gnprice gnprice deleted the pr-reducer branch January 28, 2021 02:13
gnprice added a commit to gnprice/zulip-mobile that referenced this pull request Jan 30, 2021
Before this commit, when a message was marked as read we'd have to
search through this entire data structure to find where it was so we
could remove it.  In fact, even if the message was actually a PM we'd
end up searching through this whole data structure which is entirely
about stream messages, because we didn't use that information.

The same thing was true with the old data structure, before this
series.

Much better would be, when a message in a particular conversation gets
marked as read, to go straight to that particular conversation's part
of the data structure and update that without having to search through
anything else.  Do that.

Knowing what conversation the message is in requires looking that
information up in our data structures.  Happily we can do that now
(and without an intrusive hack like we've sometimes done in the past):
that was zulip#4437.

This reduces the time spent in this reducer to 7ms in the slowest
sample I've seen, or as little as <1ms (the threshold of measurement),
and the total time spent in dispatch to 110-120ms.  Those compare
with 30-50ms reducer / 150-200ms total before this commit, and with
70ms reducer / 300ms total before the whole series, using the old
data structure.  (Based on measuring the same way as described a few
commits ago.)  So that's an improvement of about 2.5x, or 180ms!

The 110-120ms we're still spending, almost all of it now outside the
reducer, still isn't *great*.  But it's enough better that I think
further optimization is no longer a top-priority thing for me to work
on; and because the remaining problem isn't inside the reducer where
I've been working and have built up the perf-logging tools to iterate
on, it's beyond the scope of where it's just easy to keep going.  So
with this I'm declaring victory on zulip#4438, the task of making this
"unread" model efficient at marking a message as read.

Fixes: zulip#4438
gnprice added a commit to gnprice/zulip-mobile that referenced this pull request Mar 18, 2021
Before this commit, when a message was marked as read we'd have to
search through this entire data structure to find where it was so we
could remove it.  In fact, even if the message was actually a PM we'd
end up searching through this whole data structure which is entirely
about stream messages, because we didn't use that information.

The same thing was true with the old data structure, before this
series.

Much better would be, when a message in a particular conversation gets
marked as read, to go straight to that particular conversation's part
of the data structure and update that without having to search through
anything else.  Do that.

Knowing what conversation the message is in requires looking that
information up in our data structures.  Happily we can do that now
(and without an intrusive hack like we've sometimes done in the past):
that was zulip#4437.

This reduces the time spent in this reducer to 7ms in the slowest
sample I've seen, or as little as <1ms (until recently the threshold
of measurement), and the total time spent in dispatch to 110-120ms.
Those compare with 30-50ms reducer / 150-200ms total before this
commit, and with 70ms reducer / 300ms total before the whole series,
using the old data structure.  (Based on measuring the same way as
described a few commits ago.)  So that's an improvement of about 2.5x,
or 180ms!

The 110-120ms we're still spending, almost all of it now outside the
reducer, still isn't *great*.  But it's enough better that I think
further optimization is no longer a top-priority thing for me to work
on; and because the remaining problem isn't inside the reducer where
I've been working and have built up the perf-logging tools to iterate
on, it's beyond the scope of where it's just easy to keep going.  So
with this I'm declaring victory on zulip#4438, the task of making this
"unread" model efficient at marking a message as read.

Fixes: zulip#4438
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants