-
-
Notifications
You must be signed in to change notification settings - Fork 2.1k
Prevent historical state from being pushed to an application service via /transactions
(MSC2716)
#11265
Prevent historical state from being pushed to an application service via /transactions
(MSC2716)
#11265
Changes from all commits
95624a6
c30a35b
64df8cd
e580d47
15e3246
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
Prevent [MSC2716](https://github.com/matrix-org/matrix-doc/pull/2716) historical state events from being pushed to an application service via `/transactions`. | ||
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -231,13 +231,32 @@ async def push_bulk( | |
json_body=body, | ||
args={"access_token": service.hs_token}, | ||
) | ||
if logger.isEnabledFor(logging.DEBUG): | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I came back to this and was confused why I did this since it's guarding debug against debug. AFAICT, this is to avoid potentially expensive string construction when we loop through all of the |
||
logger.debug( | ||
"push_bulk to %s succeeded! events=%s", | ||
uri, | ||
[event.get("event_id") for event in events], | ||
) | ||
sent_transactions_counter.labels(service.id).inc() | ||
sent_events_counter.labels(service.id).inc(len(events)) | ||
return True | ||
except CodeMessageException as e: | ||
logger.warning("push_bulk to %s received %s", uri, e.code) | ||
logger.warning( | ||
"push_bulk to %s received code=%s msg=%s", | ||
uri, | ||
e.code, | ||
e.msg, | ||
exc_info=logger.isEnabledFor(logging.DEBUG), | ||
) | ||
except Exception as ex: | ||
logger.warning("push_bulk to %s threw exception %s", uri, ex) | ||
logger.warning( | ||
"push_bulk to %s threw exception(%s) %s args=%s", | ||
uri, | ||
type(ex).__name__, | ||
ex, | ||
ex.args, | ||
exc_info=logger.isEnabledFor(logging.DEBUG), | ||
) | ||
MadLittleMods marked this conversation as resolved.
Show resolved
Hide resolved
|
||
failed_transactions_counter.labels(service.id).inc() | ||
return False | ||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -221,6 +221,7 @@ async def persist_state_events_at_start( | |
action=membership, | ||
content=event_dict["content"], | ||
outlier=True, | ||
historical=True, | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. All of these changes are just piping
|
||
prev_event_ids=[prev_event_id_for_state_chain], | ||
# Make sure to use a copy of this list because we modify it | ||
# later in the loop here. Otherwise it will be the same | ||
|
@@ -240,6 +241,7 @@ async def persist_state_events_at_start( | |
), | ||
event_dict, | ||
outlier=True, | ||
historical=True, | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. All of these changes are just piping
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. whatever else happens, please please please make sure that any parameters are added to the docstrings at each level, with the intended purpose clearly described. Relatedly: as a general rule, I would say it is preferable for method parameters to describe a change in behaviour (" [Currently we have a bit of a mess in some places ( There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Will do 🙇
Seems reasonable. I've created a separate issue to track this #11300 |
||
prev_event_ids=[prev_event_id_for_state_chain], | ||
# Make sure to use a copy of this list because we modify it | ||
# later in the loop here. Otherwise it will be the same | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a bit confused why these would come down
/transactions
and not down/sync
?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was a little bit curious about this as well and saw that
/transactions
only cares aboutstream_ordering
so it made sense why it still came down and didn't look into it further.Diving into this more now, here are the details:
/sync
looks forstream_ordering
but excludes alloutliers
.synapse/synapse/storage/databases/main/stream.py
Lines 519 to 527 in b09d90c
Whereas
/transactions
only cares aboutstream_ordering
. Perhaps we should excludeoutliers
in/transactions
? I can tackle that in a separate PR if we decide.synapse/synapse/storage/databases/main/appservice.py
Lines 358 to 368 in b09d90c
/sync
stack traceSyncRestServlet.on_GET
wait_for_sync_for_user
_wait_for_sync_for_user
current_sync_for_user
generate_sync_result
_generate_sync_entry_for_rooms
_get_rooms_changed
get_room_events_stream_for_rooms
get_room_events_stream_for_room
/transactions
stack tracenotify_interested_services
_notify_interested_services
get_new_events_for_appservice
submit_event_for_as
submit_event_for_as
->enqueue_event
_send_request
appservice.scheduler.send
create_appservice_txn
AppServiceTransaction.send
push_bulk
/transactions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apologies for jumping in without context, but
the outlier flag seems like a poor way to decide whether we should push this data. (Yes, outliers shouldn't be sent over
/transactions
, but there are probably many other events which shouldn't be sent).FederationEventHandler._process_received_pdu
has abackfilled
parameter (seesynapse/synapse/handlers/federation_event.py
Line 945 in b09d90c
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whatever we decide on to make
/sync
and/transactions
behave similarly, we can tackle in another PR. This PR is still good to ship on its own, i.e. It's a good idea to mark historical state ashistorical
For
/transactions
, we can't tell whether the event wasbackfilled
. The only indication is that thestream_ordering
would be negative which is what the/transactions
code already takes into account. And is why the fix in this PR to mark the historical state eventshistorical
->backfilled
works to not show up in/transactions
.I just suggested adding
outlier
to/transactions
query as well so it would match the query in/sync
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Created #11394 to track this ⏩