This repository has been archived by the owner on Apr 26, 2024. It is now read-only.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Fix missing sync events during historical batch imports #12319
Fix missing sync events during historical batch imports #12319
Changes from 8 commits
78e3df8
00f81f0
e974d6c
28880bb
446d647
e440db9
ef07fcd
0df9e93
bd87a3a
b2c6c20
9a78d14
8e319cc
77ac51c
361dd38
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/messages
could be updated to always return intopological_ordering
regardless of what type of pagination token is given. That's what it was doing before this PR anyway.I think we have two assumptions that we should probably enforce:
/messages
should always sort bytopological_ordering
regardless of pagination token given/sync
should always sort bystream_ordering
regardless of pagination token givenThis also aligns with what we have documented:
The current behavior of
/messages
and/sync
returnsevents
as expected but notstate
.There is a bug in how incremental
/sync
returnsstate
which this PR aims to fix. The best way to understand the problem is probably reading #12281 (comment) - but basically when calculating whatstate
to return,/sync
is ordering events bytopological_ordering
even though it should bestream_ordering
.I initially assumed we would want flexibility in how these endpoints sorted events according to the type of pagination token given but it's sounding like we actually want to enforce a given sort according to the endpoint. In which case, we can revert to @Fizzadar initial approach of plumbing a
order_by
argument which we can set.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Sorry for not getting to this yesterday, it sat at about the second thing on my todo list all day :/)
Thanks for the clarification. There's the added bonus that
/sync
will return events in pagination order for the first batch of events for the room, i.e. in an initial sync or when a user joins the room. (The rationale being that it's equivalent to getting no events then immediately doing a back pagination). This is relevant as we useget_recent_events_for_room
for that case as well.I wonder if we're just making this harder for ourselves by re-using the same query to do both
/messages
and when looking up the state for/sync
? I think the query we want to run is simply:So it might be simpler to just have a separate
get_last_event_in_room_before
function that we call when getting the state instead? Rather than extending the already slightly horrific_paginate_room_events_txn
function?Entirely separately, and I'm not suggesting we necessarily do this now, but it occurs to me that we have a
get_forward_extremities_for_room_at_stream_ordering
store function that maps fromroom_id
to the list of forward extremities in the room at the time. This can then be used to calculate the "current" state of the room at the time more accurately than the current method. The downside of this approach is thatget_forward_extremities_for_room_at_stream_ordering
will fail if the stream ordering is too old.Another approach that has also literally just now occurred to me is that we could use the
current_state_delta_stream
to work out the changes in state between two stream orderings in the room, as thestream_id
column correspond to the stream orderings of the changes (I think?).Anyway, just wanted to record my thoughts here before I forget them
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Fizzadar sorry for going around the houses really slowly on this 😞
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No worries at all @erikjohnston - totally agree a separate function makes sense here; I've pushed that up in 9a78d14. I just wrapped an existing function
get_room_event_before_stream_ordering
(which doesn't actually return an event) and pulled out the event.Have undone the pagination ordering changes too!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you! Will have a look after I've finished doing the v1.57.0rc1 release ❤️
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should add a test for #3305 so it doesn't regress.
I don't have clear reproduction steps for this one and maybe we can just consider the test we write for #12281 as a subset of this problem and only need one test 🤷
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tested and fails on
develop
but passes with this PR ✅ (as expected)