-
Notifications
You must be signed in to change notification settings - Fork 220
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix up logic for delaying sending read receipts over federation. #17933
Conversation
0bb3b58
to
801983c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
seems to make sense
# destination is registered for the flush | ||
if queues_pending_flush is not None: | ||
queues_pending_flush.add(queue) | ||
if self.clock.time_msec() - metadata.received_ts < 60_000: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know it's a silly case, but maybe
if self.clock.time_msec() - metadata.received_ts < 60_000: | |
if 0 <= self.clock.time_msec() - metadata.received_ts < 60_000: |
to guard against the case where someone's clock was wrong (in the future) for a while, or is currently in the past, from causing them to treat all events as recent?
Or perhaps it should have some tolerance for workers being out of sync, so perhaps -60_000 <=
???...
self.reactor.advance(10) | ||
mock_send_transaction.assert_called_once() | ||
json_cb = mock_send_transaction.call_args[0][1] | ||
transaction = mock_send_transaction.call_args_list[0][0][0] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
transaction = mock_send_transaction.call_args_list[0][0][0] | |
transaction = mock_send_transaction.call_args_list[0].args[0] |
I think this is correct and clearer? (ditto for other assertions)
|
||
# expect a call to send_transaction for each host | ||
self.assertEqual(mock_send_transaction.call_count, 20) | ||
self._assert_edu_in_call(mock_send_transaction.call_args[0][1]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(also for these
self._assert_edu_in_call(mock_send_transaction.call_args[0][1]) | |
self._assert_edu_in_call(mock_send_transaction.call_args.args[1]) |
)
…ment-hq#17933) For context of why we delay read receipts, see matrix-org/synapse#4730. Element Web often sends read receipts in quick succession, if it reloads the timeline it'll send one for the last message in the old timeline and again for the last message in the new timeline. This caused remote users to see a read receipt for older messages come through quickly, but then the second read receipt taking a while to arrive for the most recent message. There are two things going on in this PR: 1. There was a mismatch between seconds and milliseconds, and so we ended up delaying for far longer than intended. 2. Changing the logic to reuse the `DestinationWakeupQueue` (used for presence) The changes in logic are: - Treat the first receipt and subsequent receipts in a room in the same way - Whitelist certain classes of receipts to never delay being sent, i.e. receipts in small rooms, receipts for events that were sent within the last 60s, and sending receipts to the event sender's server. - The maximum delay a receipt can have before being sent to a server is 30s, and we'll send out receipts to remotes at least at 50Hz (by default) The upshot is that this should make receipts feel more snappy over federation. This new logic should send roughly between 10%–20% of transactions immediately on matrix.org.
This release contains the security fixes from [v1.120.2](https://github.com/element-hq/synapse/releases/tag/v1.120.2). - Fix release process to not create duplicate releases. ([\#18025](element-hq/synapse#18025)) - Support for [MSC4190](matrix-org/matrix-spec-proposals#4190): device management for Application Services. ([\#17705](element-hq/synapse#17705)) - Update [MSC4186](matrix-org/matrix-spec-proposals#4186) Sliding Sync to include invite, ban, kick, targets when `$LAZY`-loading room members. ([\#17947](element-hq/synapse#17947)) - Use stable `M_USER_LOCKED` error code for locked accounts, as per [Matrix 1.12](https://spec.matrix.org/v1.12/client-server-api/#account-locking). ([\#17965](element-hq/synapse#17965)) - [MSC4076](matrix-org/matrix-spec-proposals#4076): Add `disable_badge_count` to pusher configuration. ([\#17975](element-hq/synapse#17975)) - Fix long-standing bug where read receipts could get overly delayed being sent over federation. ([\#17933](element-hq/synapse#17933)) - Add OIDC example configuration for Forgejo (fork of Gitea). ([\#17872](element-hq/synapse#17872)) - Link to element-docker-demo from contrib/docker*. ([\#17953](element-hq/synapse#17953)) - [MSC4108](matrix-org/matrix-spec-proposals#4108): Add a `Content-Type` header on the `PUT` response to work around a faulty behavior in some caching reverse proxies. ([\#17253](element-hq/synapse#17253)) - Fix incorrect comment in new schema delta. ([\#17936](element-hq/synapse#17936)) - Raise setuptools_rust version cap to 1.10.2. ([\#17944](element-hq/synapse#17944)) - Enable encrypted appservice related experimental features in the complement docker image. ([\#17945](element-hq/synapse#17945)) - Return whether the user is suspended when querying the user account in the Admin API. ([\#17952](element-hq/synapse#17952)) - Fix new scheduled tasks jumping the queue. ([\#17962](element-hq/synapse#17962)) - Bump pyo3 and dependencies to v0.23.2. ([\#17966](element-hq/synapse#17966)) - Update setuptools-rust and fix building abi3 wheels in latest version. ([\#17969](element-hq/synapse#17969)) - Consolidate SSO redirects through `/_matrix/client/v3/login/sso/redirect(/{idpId})`. ([\#17972](element-hq/synapse#17972)) - Fix Docker and Complement config to be able to use `public_baseurl`. ([\#17986](element-hq/synapse#17986)) - Fix building wheels for MacOS which was temporarily disabled in Synapse 1.120.2. ([\#17993](element-hq/synapse#17993)) - Fix release process to not create duplicate releases. ([\#17970](element-hq/synapse#17970), [\#17995](element-hq/synapse#17995)) * Bump bytes from 1.8.0 to 1.9.0. ([\#17982](element-hq/synapse#17982)) * Bump pysaml2 from 7.3.1 to 7.5.0. ([\#17978](element-hq/synapse#17978)) * Bump serde_json from 1.0.132 to 1.0.133. ([\#17939](element-hq/synapse#17939)) * Bump tomli from 2.0.2 to 2.1.0. ([\#17959](element-hq/synapse#17959)) * Bump tomli from 2.1.0 to 2.2.1. ([\#17979](element-hq/synapse#17979)) * Bump tornado from 6.4.1 to 6.4.2. ([\#17955](element-hq/synapse#17955))
For context of why we delay read receipts, see matrix-org/synapse#4730.
Element Web often sends read receipts in quick succession, if it reloads the timeline it'll send one for the last message in the old timeline and again for the last message in the new timeline. This caused remote users to see a read receipt for older messages come through quickly, but then the second read receipt taking a while to arrive for the most recent message.
There are two things going on in this PR:
DestinationWakeupQueue
(used for presence)The changes in logic are:
The upshot is that this should make receipts feel more snappy over federation.
This new logic should send roughly between 10%–20% of transactions immediately on matrix.org.
SQL query and output
Looking at the last 100,000 local receipts, we look at the number of servers we need to send those receipts to and bucket them into different classes. First, if they are in a small room (i.e. less than 5 domains), and then by age of event.