Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Use a database table to hold the users that should have full presence sent to them, instead of something in-memory #9823

Merged
merged 30 commits into from
May 18, 2021

Conversation

anoadragon453
Copy link
Member

@anoadragon453 anoadragon453 commented Apr 15, 2021

The ModuleApi.send_local_online_presence_to method allows Synapse modules to state that a local or remote user should receive all online presence that they are considered "interested in" on their next incremental sync. This would be the same presence information they would receive then if they were doing an initial sync.

The method would simply add user IDs to the in-memory set ModuleApi._send_full_presence_to_local_users. The SyncHandler would then pop them from there when that specific user did a sync. However, this information would only be local to the worker that the module was running on.

A quirk of ModuleApi.send_local_online_presence_to is that it should only be run on workers that have the capability to send federation, such that we can send presence updates to remote users. However, it also needs to be called on sync-capable workers so that they can get access to the in-memory data structure of which users to send presence to on their next sync.

This was a real footgun and something we don't really want to put on the shoulders of sysadmins when they're designing their worker topology. Thus, instead of using the in-memory set, we store this information in a database table instead, which acts as one central list that's shared amongst all users. We also tackle the problem of the list having users added to it, but potentially never removing those users if they don't perform a sync, by expiring entries in the table. This has changed, but there will only ever be at most X rows in the table, where X is the total count of local users.

Reviewable commit-by-commit. Blocked on #9819, as it relies on some changes made in ModuleApi.send_local_online_presence_to. Specifically the bit where it allows non-federation-capable workers to call the function. This has now merged.

@anoadragon453 anoadragon453 force-pushed the anoa/module_api_full_presence_fix branch 3 times, most recently from 471203f to fc341f8 Compare April 16, 2021 17:13
@anoadragon453 anoadragon453 requested a review from a team May 4, 2021 12:04
@erikjohnston erikjohnston self-assigned this May 4, 2021
Copy link
Member

@erikjohnston erikjohnston left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for fixing this up. Can we have some documentation around how to write presence module for workers? I'm trying to figure out of how this all works in practice.

synapse/storage/databases/main/presence.py Outdated Show resolved Hide resolved
synapse/handlers/presence.py Outdated Show resolved Hide resolved
@erikjohnston erikjohnston removed their assignment May 4, 2021
@anoadragon453
Copy link
Member Author

@erikjohnston Great suggestions! We now store a stream ID in the table instead, so that receiving full presence from multiple devices works. A couple outstanding questions I had:

  • I'm not sure whether I need an index for (user_id, presence_stream_id) on users_to_send_full_presence_to, as there's already an index on user_id via PRIMARY KEY. I tried EXPLAIN ANALYZEing on a test table, but the primary and foreign keys in there makes inserting test data a bit complicated.
  • Do I need to do anything special to have the new database methods run properly on workers? I'm currently configuring the test_send_local_online_presence_to and test_send_local_online_presence_to_federation unit tests to run on workers.

@erikjohnston
Copy link
Member

@erikjohnston Great suggestions! We now store a stream ID in the table instead, so that receiving full presence from multiple devices works. A couple outstanding questions I had:

  • I'm not sure whether I need an index for (user_id, presence_stream_id) on users_to_send_full_presence_to, as there's already an index on user_id via PRIMARY KEY. I tried EXPLAIN ANALYZEing on a test table, but the primary and foreign keys in there makes inserting test data a bit complicated.

No you don't need an index as there is only one row per user_id, i.e. looking up all rows that match the user_id in the index will return a single row and so there is no need to split the index further. (If it returned many many rows than adding the index (user_id, presence_stream_id) would allow you to split the returned rows by stream ID).

  • Do I need to do anything special to have the new database methods run properly on workers? I'm currently configuring the test_send_local_online_presence_to and test_send_local_online_presence_to_federation unit tests to run on workers.

The DB function needs to be in a class that gets added to the generic worker store in synapse.app.generic_worker

@anoadragon453 anoadragon453 added the X-Release-Blocker Must be resolved before making a release label May 11, 2021
(This constant is used in the next commit.)

So there's not much here right now, and perhaps this could fit better in
another file. Though I didn't see anything quite appropriate.
We now return full presence in a /sync response based on the provided sync token,
rather than if the user has received the response already. This is necessary in
circumstances where users have more than one device - which they often do!

This comment also modifies the spot in PresenceHandler where we check if
the user should receive all presence. We only need to make the check if
from_key is not None. If from_key is None, the user will be receiving
all presence states anyways.
If force_notify is True, we will create a new stream ID for this
presence update, even if it matches the user's previous presence update
state. Creating a new stream ID will notify clients, hence
'force_notify'.

We're resending the current state of the user who's getting full presence
in order to force the presence stream ID to advance 1 - as we compare
stream IDs when checking if a device (who's presencing a presence stream
ID as part of their sync token) has already received full online
presence as part of a previous sync.

If we didn't do so, it's possible for a syncing device to receive full
online presence multiple times if the presence stream doesn't advance
otherwise.
It turned out we needed to advance the reactor in order to successfully
write the body of the replication request from the worker to the main
process.

We do this so that we can test calling send_local_online_presence_to on
a worker, while a client is syncing via the main process.
@anoadragon453 anoadragon453 force-pushed the anoa/module_api_full_presence_fix branch from 46eccf1 to 0b15230 Compare May 12, 2021 14:41
Copy link
Member Author

@anoadragon453 anoadragon453 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, this PR has been updated to fix the worker-based tests. They should pass now.

No you don't need an index as there is only one row per user_id, i.e. looking up all rows that match the user_id in the index will return a single row and so there is no need to split the index further. (If it returned many many rows than adding the index (user_id, presence_stream_id) would allow you to split the returned rows by stream ID).

Thanks for the answer, and for explaining how multi-column indexes work in an intuitive way!

The DB function needs to be in a class that gets added to the generic worker store in synapse.app.generic_worker

PresenceStore.add_users_to_send_full_presence_to looks to work on both workers and the main process. Rejoice!

synapse/module_api/__init__.py Outdated Show resolved Hide resolved
@anoadragon453 anoadragon453 force-pushed the anoa/module_api_full_presence_fix branch from f48fb39 to 6b54010 Compare May 12, 2021 14:58
@babolivier babolivier removed the X-Release-Blocker Must be resolved before making a release label May 12, 2021
synapse/module_api/__init__.py Outdated Show resolved Hide resolved
synapse/handlers/presence.py Outdated Show resolved Hide resolved
@anoadragon453
Copy link
Member Author

I'd like to also update the worker docs to mention that we can write to the presence stream now, since we link to them here. But that can be done in a separate PR.

synapse/module_api/__init__.py Outdated Show resolved Hide resolved
It's good code hygiene to not name functions the exact same in different classes - especially
for those people who navigate the code via global search :)
@anoadragon453 anoadragon453 merged commit 4d6e5a5 into develop May 18, 2021
@anoadragon453 anoadragon453 deleted the anoa/module_api_full_presence_fix branch May 18, 2021 13:13
netbsd-srcmastr pushed a commit to NetBSD/pkgsrc that referenced this pull request Jun 5, 2021
Synapse 1.35.1 (2021-06-03)
===========================

Bugfixes
--------

- Fix a bug introduced in v1.35.0 where invite-only rooms would be shown to all users in a space, regardless of if the user had access to it. ([\#10109](matrix-org/synapse#10109))


Synapse 1.35.0 (2021-06-01)
===========================

Note that [the tag](https://github.com/matrix-org/synapse/releases/tag/v1.35.0rc3) and [docker images](https://hub.docker.com/layers/matrixdotorg/synapse/v1.35.0rc3/images/sha256-34ccc87bd99a17e2cbc0902e678b5937d16bdc1991ead097eee6096481ecf2c4?context=explore) for `v1.35.0rc3` were incorrectly built. If you are experiencing issues with either, it is recommended to upgrade to the equivalent tag or docker image for the `v1.35.0` release.

Deprecations and Removals
-------------------------

- The core Synapse development team plan to drop support for the [unstable API of MSC2858](https://github.com/matrix-org/matrix-doc/blob/master/proposals/2858-Multiple-SSO-Identity-Providers.md#unstable-prefix), including the undocumented `experimental.msc2858_enabled` config option, in August 2021. Client authors should ensure that their clients are updated to use the stable API (which has been supported since Synapse 1.30) well before that time, to give their users time to upgrade. ([\#10101](matrix-org/synapse#10101))

Bugfixes
--------

- Fixed a bug causing replication requests to fail when receiving a lot of events via federation. Introduced in v1.33.0. ([\#10082](matrix-org/synapse#10082))
- Fix HTTP response size limit to allow joining very large rooms over federation. Introduced in v1.33.0. ([\#10093](matrix-org/synapse#10093))


Internal Changes
----------------

- Log method and path when dropping request due to size limit. ([\#10091](matrix-org/synapse#10091))


Synapse 1.35.0rc2 (2021-05-27)
==============================

Bugfixes
--------

- Fix a bug introduced in v1.35.0rc1 when calling the spaces summary API via a GET request. ([\#10079](matrix-org/synapse#10079))


Synapse 1.35.0rc1 (2021-05-25)
==============================

Features
--------

- Add experimental support to allow a user who could join a restricted room to view it in the spaces summary. ([\#9922](matrix-org/synapse#9922), [\#10007](matrix-org/synapse#10007), [\#10038](matrix-org/synapse#10038))
- Reduce memory usage when joining very large rooms over federation. ([\#9958](matrix-org/synapse#9958))
- Add a configuration option which allows enabling opentracing by user id. ([\#9978](matrix-org/synapse#9978))
- Enable experimental support for [MSC2946](matrix-org/matrix-spec-proposals#2946) (spaces summary API) and [MSC3083](matrix-org/matrix-spec-proposals#3083) (restricted join rules) by default. ([\#10011](matrix-org/synapse#10011))


Bugfixes
--------

- Fix a bug introduced in v1.26.0 which meant that `synapse_port_db` would not correctly initialise some postgres sequences, requiring manual updates afterwards. ([\#9991](matrix-org/synapse#9991))
- Fix `synctl`'s `--no-daemonize` parameter to work correctly with worker processes. ([\#9995](matrix-org/synapse#9995))
- Fix a validation bug introduced in v1.34.0 in the ordering of spaces in the space summary API. ([\#10002](matrix-org/synapse#10002))
- Fixed deletion of new presence stream states from database. ([\#10014](matrix-org/synapse#10014), [\#10033](matrix-org/synapse#10033))
- Fixed a bug with very high resolution image uploads throwing internal server errors. ([\#10029](matrix-org/synapse#10029))


Updates to the Docker image
---------------------------

- Fix bug introduced in Synapse 1.33.0 which caused a `Permission denied: '/homeserver.log'` error when starting Synapse with the generated log configuration. Contributed by Sergio Miguéns Iglesias. ([\#10045](matrix-org/synapse#10045))


Improved Documentation
----------------------

- Add hardened systemd files as proposed in [#9760](matrix-org/synapse#9760) and added them to `contrib/`. Change the docs to reflect the presence of these files. ([\#9803](matrix-org/synapse#9803))
- Clarify documentation around SSO mapping providers generating unique IDs and localparts. ([\#9980](matrix-org/synapse#9980))
- Updates to the PostgreSQL documentation (`postgres.md`). ([\#9988](matrix-org/synapse#9988), [\#9989](matrix-org/synapse#9989))
- Fix broken link in user directory documentation. Contributed by @junquera. ([\#10016](matrix-org/synapse#10016))
- Add missing room state entry to the table of contents of room admin API. ([\#10043](matrix-org/synapse#10043))


Deprecations and Removals
-------------------------

- Removed support for the deprecated `tls_fingerprints` configuration setting. Contributed by Jerin J Titus. ([\#9280](matrix-org/synapse#9280))


Internal Changes
----------------

- Allow sending full presence to users via workers other than the one that called `ModuleApi.send_local_online_presence_to`. ([\#9823](matrix-org/synapse#9823))
- Update comments in the space summary handler. ([\#9974](matrix-org/synapse#9974))
- Minor enhancements to the `@cachedList` descriptor. ([\#9975](matrix-org/synapse#9975))
- Split multipart email sending into a dedicated handler. ([\#9977](matrix-org/synapse#9977))
- Run `black` on files in the `scripts` directory. ([\#9981](matrix-org/synapse#9981))
- Add missing type hints to `synapse.util` module. ([\#9982](matrix-org/synapse#9982))
- Simplify a few helper functions. ([\#9984](matrix-org/synapse#9984), [\#9985](matrix-org/synapse#9985), [\#9986](matrix-org/synapse#9986))
- Remove unnecessary property from SQLBaseStore. ([\#9987](matrix-org/synapse#9987))
- Remove `keylen` param on `LruCache`. ([\#9993](matrix-org/synapse#9993))
- Update the Grafana dashboard in `contrib/`. ([\#10001](matrix-org/synapse#10001))
- Add a batching queue implementation. ([\#10017](matrix-org/synapse#10017))
- Reduce memory usage when verifying signatures on large numbers of events at once. ([\#10018](matrix-org/synapse#10018))
- Properly invalidate caches for destination retry timings every (instead of expiring entries every 5 minutes). ([\#10036](matrix-org/synapse#10036))
- Fix running complement tests with Synapse workers. ([\#10039](matrix-org/synapse#10039))
- Fix typo in `get_state_ids_for_event` docstring where the return type was incorrect. ([\#10050](matrix-org/synapse#10050))
babolivier added a commit to matrix-org/synapse-dinsic that referenced this pull request Sep 1, 2021
Synapse 1.35.0 (2021-06-01)
===========================

Note that [the tag](https://github.com/matrix-org/synapse/releases/tag/v1.35.0rc3) and [docker images](https://hub.docker.com/layers/matrixdotorg/synapse/v1.35.0rc3/images/sha256-34ccc87bd99a17e2cbc0902e678b5937d16bdc1991ead097eee6096481ecf2c4?context=explore) for `v1.35.0rc3` were incorrectly built. If you are experiencing issues with either, it is recommended to upgrade to the equivalent tag or docker image for the `v1.35.0` release.

Deprecations and Removals
-------------------------

- The core Synapse development team plan to drop support for the [unstable API of MSC2858](https://github.com/matrix-org/matrix-doc/blob/master/proposals/2858-Multiple-SSO-Identity-Providers.md#unstable-prefix), including the undocumented `experimental.msc2858_enabled` config option, in August 2021. Client authors should ensure that their clients are updated to use the stable API (which has been supported since Synapse 1.30) well before that time, to give their users time to upgrade. ([\#10101](matrix-org/synapse#10101))

Bugfixes
--------

- Fixed a bug causing replication requests to fail when receiving a lot of events via federation. Introduced in v1.33.0. ([\#10082](matrix-org/synapse#10082))
- Fix HTTP response size limit to allow joining very large rooms over federation. Introduced in v1.33.0. ([\#10093](matrix-org/synapse#10093))

Internal Changes
----------------

- Log method and path when dropping request due to size limit. ([\#10091](matrix-org/synapse#10091))

Synapse 1.35.0rc2 (2021-05-27)
==============================

Bugfixes
--------

- Fix a bug introduced in v1.35.0rc1 when calling the spaces summary API via a GET request. ([\#10079](matrix-org/synapse#10079))

Synapse 1.35.0rc1 (2021-05-25)
==============================

Features
--------

- Add experimental support to allow a user who could join a restricted room to view it in the spaces summary. ([\#9922](matrix-org/synapse#9922), [\#10007](matrix-org/synapse#10007), [\#10038](matrix-org/synapse#10038))
- Reduce memory usage when joining very large rooms over federation. ([\#9958](matrix-org/synapse#9958))
- Add a configuration option which allows enabling opentracing by user id. ([\#9978](matrix-org/synapse#9978))
- Enable experimental support for [MSC2946](matrix-org/matrix-spec-proposals#2946) (spaces summary API) and [MSC3083](matrix-org/matrix-spec-proposals#3083) (restricted join rules) by default. ([\#10011](matrix-org/synapse#10011))

Bugfixes
--------

- Fix a bug introduced in v1.26.0 which meant that `synapse_port_db` would not correctly initialise some postgres sequences, requiring manual updates afterwards. ([\#9991](matrix-org/synapse#9991))
- Fix `synctl`'s `--no-daemonize` parameter to work correctly with worker processes. ([\#9995](matrix-org/synapse#9995))
- Fix a validation bug introduced in v1.34.0 in the ordering of spaces in the space summary API. ([\#10002](matrix-org/synapse#10002))
- Fixed deletion of new presence stream states from database. ([\#10014](matrix-org/synapse#10014), [\#10033](matrix-org/synapse#10033))
- Fixed a bug with very high resolution image uploads throwing internal server errors. ([\#10029](matrix-org/synapse#10029))

Updates to the Docker image
---------------------------

- Fix bug introduced in Synapse 1.33.0 which caused a `Permission denied: '/homeserver.log'` error when starting Synapse with the generated log configuration. Contributed by Sergio Miguéns Iglesias. ([\#10045](matrix-org/synapse#10045))

Improved Documentation
----------------------

- Add hardened systemd files as proposed in [#9760](matrix-org/synapse#9760) and added them to `contrib/`. Change the docs to reflect the presence of these files. ([\#9803](matrix-org/synapse#9803))
- Clarify documentation around SSO mapping providers generating unique IDs and localparts. ([\#9980](matrix-org/synapse#9980))
- Updates to the PostgreSQL documentation (`postgres.md`). ([\#9988](matrix-org/synapse#9988), [\#9989](matrix-org/synapse#9989))
- Fix broken link in user directory documentation. Contributed by @junquera. ([\#10016](matrix-org/synapse#10016))
- Add missing room state entry to the table of contents of room admin API. ([\#10043](matrix-org/synapse#10043))

Deprecations and Removals
-------------------------

- Removed support for the deprecated `tls_fingerprints` configuration setting. Contributed by Jerin J Titus. ([\#9280](matrix-org/synapse#9280))

Internal Changes
----------------

- Allow sending full presence to users via workers other than the one that called `ModuleApi.send_local_online_presence_to`. ([\#9823](matrix-org/synapse#9823))
- Update comments in the space summary handler. ([\#9974](matrix-org/synapse#9974))
- Minor enhancements to the `@cachedList` descriptor. ([\#9975](matrix-org/synapse#9975))
- Split multipart email sending into a dedicated handler. ([\#9977](matrix-org/synapse#9977))
- Run `black` on files in the `scripts` directory. ([\#9981](matrix-org/synapse#9981))
- Add missing type hints to `synapse.util` module. ([\#9982](matrix-org/synapse#9982))
- Simplify a few helper functions. ([\#9984](matrix-org/synapse#9984), [\#9985](matrix-org/synapse#9985), [\#9986](matrix-org/synapse#9986))
- Remove unnecessary property from SQLBaseStore. ([\#9987](matrix-org/synapse#9987))
- Remove `keylen` param on `LruCache`. ([\#9993](matrix-org/synapse#9993))
- Update the Grafana dashboard in `contrib/`. ([\#10001](matrix-org/synapse#10001))
- Add a batching queue implementation. ([\#10017](matrix-org/synapse#10017))
- Reduce memory usage when verifying signatures on large numbers of events at once. ([\#10018](matrix-org/synapse#10018))
- Properly invalidate caches for destination retry timings every (instead of expiring entries every 5 minutes). ([\#10036](matrix-org/synapse#10036))
- Fix running complement tests with Synapse workers. ([\#10039](matrix-org/synapse#10039))
- Fix typo in `get_state_ids_for_event` docstring where the return type was incorrect. ([\#10050](matrix-org/synapse#10050))
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants