Updates to the Room DAG concepts development document #12179
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
Updates to the Room DAG concepts development document. |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -30,37 +30,72 @@ rather than skipping any that arrived late; whereas if you're looking at a | |
historical section of timeline (i.e. `/messages`), you want to see the best | ||
representation of the state of the room as others were seeing it at the time. | ||
|
||
## Outliers | ||
|
||
## Forward extremity | ||
We mark an event as an `outlier` when we haven't figured out the state for the | ||
room at that point in the DAG yet. They are "floating" events that we haven't | ||
yet correlated to the DAG. | ||
|
||
Most-recent-in-time events in the DAG which are not referenced by any other events' `prev_events` yet. | ||
Outliers typically arise when we fetch the auth chain or state for a given | ||
event. When that happens, we just grab the events in the state/auth chain, | ||
without calculating the state at those events, or backfilling their | ||
`prev_events`. | ||
|
||
The forward extremities of a room are used as the `prev_events` when the next event is sent. | ||
So, typically, we won't have the `prev_events` of an `outlier` in the database, | ||
(though it's entirely possible that we *might* have them for some other | ||
reason). Other things that make outliers different from regular events: | ||
Comment on lines +45 to +46
Not saying we need to include them all here, but could you give a couple of examples for my own understanding? I imagine it is something like we have one, but we don't know the connection between the outlier and the previous events for some reason (maybe we previously left a room and were re-invited or something weird)?
a couple of examples:
I can stick this in the body of the doc if it would be helpful.
Thanks! I don't know if we really need the examples, as you mentioned in chat it might be taken as a list of the only ways it happens.
richvdh marked this conversation as resolved.
|
||
|
||
* We don't have state for them, so there should be no entry in | ||
`event_to_state_groups` for an outlier. (In practice this isn't always | ||
the case, though I'm not sure why: see https://github.com/matrix-org/synapse/issues/12201). | ||
|
||
## Backward extremity | ||
* We don't record entries for them in the `event_edges`, | ||
`event_forward_extremities` or `event_backward_extremities` tables. | |
|
||
The current marker of where we have backfilled up to; it will generally be the
`prev_events` of the oldest-in-time events we have in the DAG. This gives a starting point when | ||
backfilling history. | ||
Since outliers are not tied into the DAG, they do not normally form part of the | ||
timeline sent down to clients via `/sync` or `/messages`; however there is an | ||
exception: | ||
|
||
When we persist a non-outlier event, we clear it as a backward extremity and set | ||
all of its `prev_events` as the new backward extremities if they aren't already | ||
persisted in the `events` table. | ||
### Out-of-band membership events | ||
|
||
A special case of outlier events are some membership events for federated rooms | ||
that we aren't full members of. For example: | ||
|
||
## Outliers | ||
* invites received over federation, before we join the room | ||
If we have local users Alice and Bob; remote user Charlie. Charlie and Bob share a room. Charlie invites Alice to join them. In this case, do we still end up with an outlier / out-of-band membership event? The homeserver will have the full auth chain, etc. from Bob so I don't think so? (I think the below cases are similar.)
Good question! I think the invite still follows the same codepath, even though we share a room. So yeah, it probably still ends up stored as an out-of-band-membership event. I don't really know, without trying it out.
Thanks for the answer! I suspect it isn't important to figure out (it should do the right thing?), but figured I'd ask!
richvdh marked this conversation as resolved.
|
||
* *rejections* for said invites | ||
* knock events for rooms that we would like to join but have not yet joined. | ||
|
||
We mark an event as an `outlier` when we haven't figured out the state for the | ||
room at that point in the DAG yet. | ||
In all the above cases, we don't have the state for the room, which is why they | ||
are treated as outliers. They are a bit special though, in that they are | ||
proactively sent to clients via `/sync`. | ||
|
||
We won't *necessarily* have the `prev_events` of an `outlier` in the database, | ||
but it's entirely possible that we *might*. | ||
## Forward extremity | ||
|
||
Most-recent-in-time events in the DAG which are not referenced by any other | ||
events' `prev_events` yet. (In this definition, outliers, rejected events, and | ||
soft-failed events don't count.) | ||
|
||
The forward extremities of a room (or at least, a subset of them, if there are | ||
more than ten) are used as the `prev_events` when the next event is sent. | ||
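
As a rough illustration of the definition above, here is a minimal Python sketch. It is not Synapse's implementation: the function names, the in-memory `dict` representation of the DAG, and the way the subset of ten is chosen (sorted order, purely for determinism) are all assumptions made for the example. It also assumes that "don't count" means excluded events are neither extremities themselves nor able to cover another event by referencing it.

```python
from typing import Dict, List, Set

# Limit taken from the text above ("if there are more than ten").
MAX_PREV_EVENTS = 10


def forward_extremities(
    prev_events: Dict[str, List[str]], excluded: Set[str]
) -> Set[str]:
    """Return the events not referenced by any other counted event.

    prev_events maps each event ID to the IDs listed in its `prev_events`.
    excluded holds outliers, rejected events and soft-failed events.
    """
    counted = {event_id for event_id in prev_events if event_id not in excluded}
    referenced = {prev for event_id in counted for prev in prev_events[event_id]}
    return counted - referenced


def pick_prev_events(extremities: Set[str]) -> List[str]:
    """Pick at most MAX_PREV_EVENTS extremities for a new event.

    Hypothetical selection rule: sorted order is a placeholder only.
    """
    return sorted(extremities)[:MAX_PREV_EVENTS]


# A <- B, A <- C, C <- D, where D is soft-failed and so does not count.
dag = {"A": [], "B": ["A"], "C": ["A"], "D": ["C"]}
extremities = forward_extremities(dag, excluded={"D"})
new_event_prev_events = pick_prev_events(extremities)
```

In this example `C` remains a forward extremity because the only event pointing at it is excluded.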
|
||
The "current state" of a room (ie: the state which would be used if we | ||
generated a new event) is, therefore, the resolution of the room states | ||
at each of the forward extremities. | ||
|
||
## Backward extremity | ||
|
||
The current marker of where we have backfilled up to; it will generally be the | |
`prev_events` of the oldest-in-time events we have in the DAG. This gives a starting point when | ||
backfilling history. | ||
|
||
For example, when we fetch the event auth chain or state for a given event, we | ||
mark all of those claimed auth events as outliers because we haven't done the | ||
state calculation ourself. | ||
Note that, unlike forward extremities, we typically don't have any backward | ||
extremity events themselves in the database - or, if we do, they will be "outliers" (see | ||
above). Either way, we don't expect to have the room state at a backward extremity. | ||
|
||
When we persist a non-outlier event, if it was previously a backward extremity, | ||
we clear it as a backward extremity and set all of its `prev_events` as the new | ||
backward extremities if they aren't already persisted as non-outliers. This | ||
therefore keeps the backward extremities up-to-date. | ||
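
The bookkeeping described in this paragraph can be sketched in Python as follows. This is a toy model under stated assumptions, not the actual Synapse code: the function name and the two in-memory sets (standing in for the backward extremities and for the events persisted as non-outliers) are hypothetical. It also takes one reading of the paragraph, in which missing `prev_events` are recorded whether or not the event was previously a backward extremity.

```python
from typing import Iterable, Set


def persist_non_outlier(
    event_id: str,
    prev_events: Iterable[str],
    backward_extremities: Set[str],
    persisted_non_outliers: Set[str],
) -> None:
    """Update both sets in place after persisting a non-outlier event."""
    persisted_non_outliers.add(event_id)

    # The event is now in the DAG, so it is no longer a backward extremity.
    # discard() is a no-op if it was never one.
    backward_extremities.discard(event_id)

    # Any prev_event we do not hold as a non-outlier marks where
    # backfilling would need to continue from.
    for prev in prev_events:
        if prev not in persisted_non_outliers:
            backward_extremities.add(prev)


# We hold C (whose prev_event is B) and have not fetched B yet.
backward = {"B"}
persisted = {"C"}

# Backfilling delivers B, whose prev_event is A.
persist_non_outlier("B", ["A"], backward, persisted)
```

After the call, `B` has been cleared and `A` has taken its place as the point from which to backfill next.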
|
||
## State groups | ||
|
||
|
This sentence makes less practical sense to me than the previous one:
Perhaps:
I don't really understand what you're trying to get at with "we haven't done the state calculation ourself". There are other times where we don't calculate the state at an event and yet those events aren't outliers (eg: the join event when joining a room over federation).
Also: the fact that we haven't backfilled `prev_events` doesn't, in itself, make it an outlier.
I think the big differentiator phrase to me is "mark all of those claimed auth events as outliers". Without something like that, it's not clear to me what we do after "we just grab the events in the state/auth chain".
The other phrasing is just notes on what Synapse is doing. By "ourself", I mean the local homeserver doing the work that we trust as a user on that server. Vs other servers who already calculated the `auth_events` on the `outlier` since it has `auth_events` and is available over federation but we don't know if that's absolutely correct.
well, I don't disagree that the wording could be clarified, but I'm still struggling to make improvements without saying things that are actually wrong. For example,
isn't really true: we only do so for (claimed) auth events that we didn't already have.
How about just adding a sentence:
That's better 👍
For some reason "claimed" clarifies the point that they can't be trusted for me which would be nice to include.
I feel like your suggestion also doesn't clarify this point but I feel like it doesn't matter whether this is clarified anyway. It's a separate fact that we don't re-outlier a persisted normal event. And the paragraph below slightly touches a bit on this point in a different way.
I'm not coming up with anything better than the following, but your suggestion is also a good improvement.
Updated in #12345 with your suggestion 👍