Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Add endpoints for backfilling history (MSC2716) #9247

Merged
merged 96 commits into from
Jun 22, 2021
Merged
Show file tree
Hide file tree
Changes from 63 commits
Commits
Show all changes
96 commits
Select commit Hold shift + click to select a range
22c038e
Add endpoints for backfilling history (MSC2716)
MadLittleMods Jan 28, 2021
4836954
Add querystring prev_event to pass into message send API
MadLittleMods Jan 29, 2021
c0b0936
Allow override origin_server_ts
MadLittleMods Jan 30, 2021
bf90053
Remove origin_server_ts in favor of spec'ed ts query param
MadLittleMods Feb 2, 2021
bcc6943
Use previous depth if overriding prev_events to insert into history
MadLittleMods Feb 2, 2021
7f4c3a6
Remove m.historical messages from /sync
MadLittleMods Feb 5, 2021
800f3a3
Remove debug logs
MadLittleMods Feb 5, 2021
9b5e057
Fix some lint
MadLittleMods Feb 5, 2021
447eaa8
Add changelog
MadLittleMods Feb 5, 2021
7ec22b5
Fix tox mypy check
MadLittleMods Feb 5, 2021
afa5e5d
Remove debugging fields for TARDIS visualization
MadLittleMods Feb 5, 2021
f1f3fb0
Pass inherit_depth all the way down the line which we define only wh…
MadLittleMods Feb 5, 2021
e7d7f92
Fix some lints
MadLittleMods Feb 5, 2021
b9024f7
Type hinting and docstrings
MadLittleMods Feb 9, 2021
d204880
Merge branch 'develop' into eric/msc2716-backfilling-history
MadLittleMods Feb 9, 2021
dbba361
Add experimental feature flag for MSC2716
MadLittleMods Feb 9, 2021
f1c31f1
Fix isort linting
MadLittleMods Feb 9, 2021
19aa93c
Fix experimental msc2716 feature flag
MadLittleMods Feb 9, 2021
c074584
Try fix type hints and tox errors
MadLittleMods Feb 9, 2021
c02079d
Fix extra new line lint
MadLittleMods Feb 9, 2021
412ffc3
Update changelog.d/9247.feature
MadLittleMods Feb 11, 2021
7160f3b
Update synapse/events/builder.py
MadLittleMods Feb 11, 2021
5ff398d
Update synapse/visibility.py
MadLittleMods Feb 11, 2021
864b98f
Clean up docstrings and address review
MadLittleMods Feb 11, 2021
6f174f1
Merge branch 'eric/msc2716-backfilling-history' of github.com:matrix-…
MadLittleMods Feb 11, 2021
ba1eb39
Simplify prev_event fetching off the event dictionary
MadLittleMods Feb 11, 2021
6d4fcb6
Filter recent events in both places that we fetch them
MadLittleMods Feb 11, 2021
5d5fb8b
Fix lint
MadLittleMods Feb 11, 2021
ed76d5f
Merge branch 'develop' into eric/msc2716-backfilling-history
MadLittleMods Feb 11, 2021
1aa3af9
Fix newline lint
MadLittleMods Feb 11, 2021
e63ef8e
Remove return strict type to avoid downstream lint problems
MadLittleMods Feb 11, 2021
c90af9e
Fix test failure with mocked build function
MadLittleMods Feb 11, 2021
270e5ee
Fix lint
MadLittleMods Feb 11, 2021
4359ab8
Fix lint again wanting to reverse what black wants...
MadLittleMods Feb 11, 2021
641e871
Merge branch 'develop' into eric/msc2716-backfilling-history
MadLittleMods Feb 17, 2021
6d8514f
Run lint.sh
MadLittleMods Feb 17, 2021
5dacc86
Remove default for inherit_depth
MadLittleMods Feb 17, 2021
61dc89f
WIP: Use depth from successor event
MadLittleMods Mar 11, 2021
e12a77d
Fix depth on historical forward extremeties
MadLittleMods Mar 11, 2021
8549219
403 when non appservice trying to use prev_event
MadLittleMods Mar 13, 2021
6c96622
Merge branch 'develop' into eric/msc2716-backfilling-history
MadLittleMods Mar 30, 2021
23d0379
Bulk send endpoint for backfilling history (MSC2716)
MadLittleMods Mar 30, 2021
ea37564
Add bulk send endpoint
MadLittleMods Mar 31, 2021
7008ee0
Skip everything and just persist the event
MadLittleMods Mar 31, 2021
0d4736f
Scratch commit trying to use existing functions
MadLittleMods Apr 6, 2021
49631e5
Scratch commit 2
MadLittleMods Apr 6, 2021
06dccec
Scratch commit 3
MadLittleMods Apr 7, 2021
8e0684d
Merge branch 'develop' into eric/msc2716-backfilling-history
MadLittleMods Apr 10, 2021
cbd16b8
Working bulksend endpoint for sending state and historical messages (…
MadLittleMods Apr 10, 2021
044a761
Fix up some CI lints
MadLittleMods Apr 12, 2021
86ec915
Fix some more CI lint and tests
MadLittleMods Apr 14, 2021
81b45b5
Copy over origin_server_ts when bulksend'ing
MadLittleMods Apr 14, 2021
02b7335
Wrap bulksend endpoint around experimental feature flag and only apps…
MadLittleMods Apr 14, 2021
82708cb
Add comment docs for new parameters
MadLittleMods Apr 14, 2021
f0fb732
Add historical messages to /backfill response
MadLittleMods May 1, 2021
c525c5d
Logging to better debug federated event not authing
MadLittleMods May 1, 2021
5deee7c
Add local copy of signedjson to debug sign/verify steps
MadLittleMods May 5, 2021
960ec21
More log debugging
MadLittleMods May 6, 2021
779ef25
Revert debugging commits
MadLittleMods May 6, 2021
7dcc0fa
Fix signature check failing for historical state events
MadLittleMods May 6, 2021
52b1e7b
Clean up remaining debug logs
MadLittleMods May 6, 2021
1d6cf78
Add insertion events to the end of chunks
MadLittleMods May 14, 2021
13b18a8
Add insertion initially, end of chunk, and add live markers
MadLittleMods May 14, 2021
3d513bf
Remove partial federation code in favor of future insertion/marker logic
MadLittleMods May 20, 2021
e92a9e9
Start of new approach for chronolgoical events in chunk
MadLittleMods May 21, 2021
e9ae5e1
Chronological events but persist in reverse-chronological
MadLittleMods May 21, 2021
a978f38
Skip push notification actions for historical messages
MadLittleMods May 26, 2021
2c22929
Use more explicit comparison
MadLittleMods May 26, 2021
0501a11
Fix lint
MadLittleMods May 27, 2021
e231a99
Merge branch 'develop' into eric/msc2716-backfilling-history
MadLittleMods May 27, 2021
176a854
Uncomment dropped fields to see if tests pass
MadLittleMods May 27, 2021
a6a05f8
Merge branch 'develop' into eric/msc2716-backfilling-history
MadLittleMods May 29, 2021
25aef56
Return early instead of big if nesting
MadLittleMods May 27, 2021
0580d09
Pass in prev_events as function parameter so clients can't set it in …
MadLittleMods Jun 2, 2021
ef68832
Switch to passing depth directly instead of inherit_depth
MadLittleMods Jun 5, 2021
d8316d6
Calculate and pass in depth directly
MadLittleMods Jun 6, 2021
513d7a2
Only get_max_depth_of where we use it
MadLittleMods Jun 7, 2021
f36bdde
Remove random logs used while developing
MadLittleMods Jun 7, 2021
c2e1924
Use nice negative list index to grab last item
MadLittleMods Jun 7, 2021
88327fb
Remove unneeded body from insertion events
MadLittleMods Jun 7, 2021
34f130c
Switch wording from bulk to batch
MadLittleMods Jun 7, 2021
2f35954
Always use auth_event_ids from the event itself
MadLittleMods Jun 7, 2021
6ce47d3
Protect from clients from using the historical logic (only /batchsend…
MadLittleMods Jun 7, 2021
b032370
Merge branch 'develop' into eric/msc2716-backfilling-history
MadLittleMods Jun 8, 2021
22881e2
Fix incorrect logic when refactoring to is_historical (failed tests a…
MadLittleMods Jun 8, 2021
ae85719
Fix tests using False as a depth value (should be None)
MadLittleMods Jun 9, 2021
d59d6be
Merge branch 'develop' into eric/msc2716-backfilling-history
MadLittleMods Jun 11, 2021
c236a3c
Fix greedy find/replace
MadLittleMods Jun 11, 2021
429e130
Add docstring to explain batchsend endpoint
MadLittleMods Jun 16, 2021
29c3708
Only use necessary auth_events
MadLittleMods Jun 16, 2021
89e08c5
Merge branch 'develop' into eric/msc2716-backfilling-history
MadLittleMods Jun 16, 2021
bfe458c
Use unstable endpoint for MSC2716 batch send
MadLittleMods Jun 17, 2021
2c1750f
Merge branch 'develop' into eric/msc2716-backfilling-history
MadLittleMods Jun 17, 2021
e851dac
Remove complement test jig changes
MadLittleMods Jun 17, 2021
7dcbba9
Merge branch 'develop' into eric/msc2716-backfilling-history
MadLittleMods Jun 21, 2021
f7158ff
Correct comment to make more sense to what the code was doing
MadLittleMods Jun 22, 2021
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions changelog.d/9247.feature
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
Add experimental support for backfilling history into rooms ([MSC2716](https://github.com/matrix-org/matrix-doc/pull/2716)).
clokep marked this conversation as resolved.
Show resolved Hide resolved
MadLittleMods marked this conversation as resolved.
Show resolved Hide resolved
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we not need to change the behaviour of getting backfill from the DB? Or are we punting that to a separate PR?

Copy link
Contributor Author

@MadLittleMods MadLittleMods Jun 7, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm going to work on the insertion/marker event logic in another PR after this merges so it doesn't grow more and I don't have to keep refactoring across PR's.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SGTM!

2 changes: 1 addition & 1 deletion scripts-dev/complement.sh
Original file line number Diff line number Diff line change
Expand Up @@ -46,4 +46,4 @@ if [[ -n "$1" ]]; then
fi

# Run the tests!
COMPLEMENT_BASE_IMAGE=complement-synapse go test -v -tags synapse_blacklist,msc2946,msc3083 -count=1 $EXTRA_COMPLEMENT_ARGS ./tests
COMPLEMENT_BASE_IMAGE=complement-synapse go test -tags msc2716 -v -count=1 ./tests/main_test.go ./tests/msc2716_test.go
12 changes: 7 additions & 5 deletions synapse/api/auth.py
Original file line number Diff line number Diff line change
Expand Up @@ -87,11 +87,13 @@ def __init__(self, hs):
async def check_from_context(
self, room_version: str, event, context, do_sig_check=True
):
prev_state_ids = await context.get_prev_state_ids()
auth_events_ids = self.compute_auth_events(
event, prev_state_ids, for_verification=True
)
auth_events = await self.store.get_events(auth_events_ids)
auth_event_ids = event.auth_event_ids()
MadLittleMods marked this conversation as resolved.
Show resolved Hide resolved
if auth_event_ids is None:
prev_state_ids = await context.get_prev_state_ids()
auth_event_ids = self.compute_auth_events(
event, prev_state_ids, for_verification=True
)
auth_events = await self.store.get_events(auth_event_ids)
auth_events = {(e.type, e.state_key): e for e in auth_events.values()}

room_version_obj = KNOWN_ROOM_VERSIONS[room_version]
Expand Down
10 changes: 10 additions & 0 deletions synapse/api/constants.py
Original file line number Diff line number Diff line change
Expand Up @@ -106,6 +106,9 @@ class EventTypes:
MSC1772_SPACE_CHILD = "org.matrix.msc1772.space.child"
MSC1772_SPACE_PARENT = "org.matrix.msc1772.space.parent"

MSC2716_INSERTION = "org.matrix.msc2716.insertion"
MSC2716_MARKER = "org.matrix.msc2716.marker"
MadLittleMods marked this conversation as resolved.
Show resolved Hide resolved


class EduTypes:
Presence = "m.presence"
Expand Down Expand Up @@ -169,6 +172,13 @@ class EventContentFields:
# cf https://github.com/matrix-org/matrix-doc/pull/1772
MSC1772_ROOM_TYPE = "org.matrix.msc1772.type"

# For "insertion" events
MSC2716_NEXT_CHUNK_ID = "org.matrix.msc2716.next_chunk_id"
# Used on normal message events to indicate where the chunk connects to
MSC2716_CHUNK_ID = "org.matrix.msc2716.chunk_id"
# For "marker" events
MSC2716_PREV_INSERTION = "org.matrix.msc2716.prev_insertion"


class RoomEncryptionAlgorithms:
MEGOLM_V1_AES_SHA2 = "m.megolm.v1.aes-sha2"
Expand Down
2 changes: 2 additions & 0 deletions synapse/config/experimental.py
Original file line number Diff line number Diff line change
Expand Up @@ -36,3 +36,5 @@ def read_config(self, config: JsonDict, **kwargs):

# MSC3026 (busy presence state)
self.msc3026_enabled = experimental.get("msc3026_enabled", False) # type: bool
# MSC2716 (backfill existing history)
self.msc2716_enabled = experimental.get("msc2716_enabled", False) # type: bool
48 changes: 46 additions & 2 deletions synapse/events/builder.py
Original file line number Diff line number Diff line change
Expand Up @@ -12,6 +12,7 @@
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import logging
from typing import Any, Dict, List, Optional, Tuple, Union

import attr
Expand All @@ -34,6 +35,8 @@
from synapse.util import Clock
from synapse.util.stringutils import random_string

logger = logging.getLogger(__name__)


@attr.s(slots=True, cmp=False, frozen=True)
class EventBuilder:
Expand Down Expand Up @@ -101,6 +104,7 @@ async def build(
self,
prev_event_ids: List[str],
auth_event_ids: Optional[List[str]],
inherit_depth: bool = False,
MadLittleMods marked this conversation as resolved.
Show resolved Hide resolved
clokep marked this conversation as resolved.
Show resolved Hide resolved
) -> EventBase:
"""Transform into a fully signed and hashed event

Expand All @@ -109,6 +113,8 @@ async def build(
auth_event_ids: The event IDs to use as the auth events.
Should normally be set to None, which will cause them to be calculated
based on the room state at the prev_events.
inherit_depth: True to inherit the depth from the successor of the most
recent event from prev_event_ids. False to progress the depth as normal.

Returns:
The signed and hashed event.
Expand All @@ -132,8 +138,46 @@ async def build(
auth_events = auth_event_ids
prev_events = prev_event_ids

old_depth = await self._store.get_max_depth_of(prev_event_ids)
depth = old_depth + 1
(
most_recent_prev_event_id,
most_recent_prev_event_depth,
) = await self._store.get_max_depth_of(prev_event_ids)

# We want to insert the historical event after the `prev_event` but before the successor event
#
# We inherit depth from the successor event instead of the `prev_event`
# because events returned from `/messages` are first sorted by `topological_ordering`
# which is just the `depth` and then tie-break with `stream_ordering`.
#
# We mark these inserted historical events as "backfilled" which gives them a
# negative `stream_ordering`. If we use the same depth as the `prev_event`,
# then our historical event will tie-break and be sorted before the `prev_event`
# when it should come after.
#
# We want to use the successor event depth so they appear after `prev_event` because
# it has a larger `depth` but before the successor event because the `stream_ordering`
# is negative before the successor event.
if inherit_depth:
successor_event_ids = await self._store.get_successor_events(
[most_recent_prev_event_id]
)

# If we can't find any successor events, then it's a forward extremity of
# historical messages and we can just inherit from the previous historical
# event which we can already assume has the correct depth where we want
# to insert into.
if not successor_event_ids:
depth = most_recent_prev_event_depth
else:
(
_,
oldest_successor_depth,
) = await self._store.get_min_depth_of(successor_event_ids)

depth = oldest_successor_depth
# Otherwise, progress the depth as normal
else:
depth = most_recent_prev_event_depth + 1

# we cap depth of generated events, to ensure that they are not
# rejected by other servers (and so that they can be persisted in
Expand Down
14 changes: 7 additions & 7 deletions synapse/events/utils.py
Original file line number Diff line number Diff line change
Expand Up @@ -253,13 +253,13 @@ def format_event_for_client_v1(d):

def format_event_for_client_v2(d):
drop_keys = (
"auth_events",
"prev_events",
"hashes",
"signatures",
"depth",
"origin",
"prev_state",
# "auth_events",
# "prev_events",
# "hashes",
# "signatures",
# "depth",
# "origin",
# "prev_state",
MadLittleMods marked this conversation as resolved.
Show resolved Hide resolved
)
for key in drop_keys:
d.pop(key, None)
Expand Down
66 changes: 62 additions & 4 deletions synapse/handlers/message.py
Original file line number Diff line number Diff line change
Expand Up @@ -456,6 +456,8 @@ async def create_event(
prev_event_ids: Optional[List[str]] = None,
auth_event_ids: Optional[List[str]] = None,
require_consent: bool = True,
outlier: bool = False,
MadLittleMods marked this conversation as resolved.
Show resolved Hide resolved
inherit_depth: bool = False,
) -> Tuple[EventBase, EventContext]:
"""
Given a dict from a client, create a new event.
Expand All @@ -482,6 +484,13 @@ async def create_event(

require_consent: Whether to check if the requester has
consented to the privacy policy.

outlier: Indicates whether the event is an `outlier`, i.e. if
it's from an arbitrary point and floating in the DAG as
opposed to being inline with the current DAG.

inherit_depth: True to inherit the depth from the successor of the most
recent event from prev_event_ids. False to progress the depth as normal.
Raises:
ResourceLimitError if server is blocked to some resource being
exceeded
Expand Down Expand Up @@ -537,11 +546,14 @@ async def create_event(
if txn_id is not None:
builder.internal_metadata.txn_id = txn_id

builder.internal_metadata.outlier = outlier

event, context = await self.create_new_client_event(
builder=builder,
requester=requester,
prev_event_ids=prev_event_ids,
auth_event_ids=auth_event_ids,
inherit_depth=inherit_depth,
)

# In an ideal world we wouldn't need the second part of this condition. However,
Expand Down Expand Up @@ -698,9 +710,12 @@ async def create_and_send_nonmember_event(
self,
requester: Requester,
event_dict: dict,
auth_event_ids: Optional[List[str]] = None,
ratelimit: bool = True,
txn_id: Optional[str] = None,
ignore_shadow_ban: bool = False,
outlier: bool = False,
inherit_depth: bool = False,
) -> Tuple[EventBase, int]:
"""
Creates an event, then sends it.
Expand All @@ -710,10 +725,19 @@ async def create_and_send_nonmember_event(
Args:
requester: The requester sending the event.
event_dict: An entire event.
auth_event_ids:
The event ids to use as the auth_events for the new event.
Should normally be left as None, which will cause them to be calculated
based on the room state at the prev_events.
ratelimit: Whether to rate limit this send.
txn_id: The transaction ID.
ignore_shadow_ban: True if shadow-banned users should be allowed to
send this event.
outlier: Indicates whether the event is an `outlier`, i.e. if
it's from an arbitrary point and floating in the DAG as
opposed to being inline with the current DAG.
inherit_depth: True to inherit the depth from the successor of the most
recent event from prev_event_ids. False to progress the depth as normal.

Returns:
The event, and its stream ordering (if deduplication happened,
Expand Down Expand Up @@ -752,8 +776,16 @@ async def create_and_send_nonmember_event(
assert event.internal_metadata.stream_ordering
return event, event.internal_metadata.stream_ordering

prev_events = event_dict.get("prev_events")
MadLittleMods marked this conversation as resolved.
Show resolved Hide resolved

event, context = await self.create_event(
requester, event_dict, txn_id=txn_id
requester,
event_dict,
txn_id=txn_id,
prev_event_ids=prev_events,
auth_event_ids=auth_event_ids,
outlier=outlier,
inherit_depth=inherit_depth,
)

assert self.hs.is_mine_id(event.sender), "User must be our own: %s" % (
Expand Down Expand Up @@ -785,6 +817,7 @@ async def create_new_client_event(
requester: Optional[Requester] = None,
prev_event_ids: Optional[List[str]] = None,
auth_event_ids: Optional[List[str]] = None,
inherit_depth: bool = False,
) -> Tuple[EventBase, EventContext]:
"""Create a new event for a local client

Expand All @@ -802,6 +835,9 @@ async def create_new_client_event(
Should normally be left as None, which will cause them to be calculated
based on the room state at the prev_events.

inherit_depth: True to inherit the depth from the successor of the most
recent event from prev_event_ids. False to progress the depth as normal.

Returns:
Tuple of created event, context
"""
Expand All @@ -825,9 +861,23 @@ async def create_new_client_event(
), "Attempting to create an event with no prev_events"

event = await builder.build(
prev_event_ids=prev_event_ids, auth_event_ids=auth_event_ids
prev_event_ids=prev_event_ids,
auth_event_ids=auth_event_ids,
inherit_depth=inherit_depth,
)
context = await self.state.compute_event_context(event)

old_state = None

# Pass on the outlier property from the builder to the event
# after it is created
if builder.internal_metadata.outlier:
event.internal_metadata.outlier = builder.internal_metadata.outlier

# Calculate the state for outliers that pass in their own `auth_event_ids`
if auth_event_ids:
old_state = await self.store.get_events_as_list(auth_event_ids)

context = await self.state.compute_event_context(event, old_state=old_state)
if requester:
context.app_service = requester.app_service

Expand Down Expand Up @@ -1236,13 +1286,21 @@ async def persist_and_notify_client_event(
if prev_state_ids:
raise AuthError(403, "Changing the room create event is forbidden")

# Mark any `m.historical` messages as backfilled so they don't appear
# in `/sync` and have the proper decrementing `stream_ordering` as we import
backfilled = False
if event.content.get("m.historical", None):
backfilled = True

# Note that this returns the event that was persisted, which may not be
# the same as we passed in if it was deduplicated due transaction IDs.
(
event,
event_pos,
max_stream_token,
) = await self.storage.persistence.persist_event(event, context=context)
) = await self.storage.persistence.persist_event(
event, context=context, backfilled=backfilled
)

if self._ephemeral_events_enabled:
# If there's an expiry timestamp on the event, schedule its expiry.
Expand Down
Loading