This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Speed up @cachedList #13591

Merged
merged 10 commits into from
Aug 23, 2022

@erikjohnston erikjohnston commented Aug 23, 2022

This speeds things up by ~2x.

The vast majority of the time is now spent in LruCache moving things around the linked lists.

The speed-up comes from two changes:

  1. Don't create a deferred per-key during bulk set operations in DeferredCache. Instead, only create them if a subsequent caller asks for the key.
  2. Add a bulk lookup API to DeferredCache rather than use a loop.
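The two changes can be sketched roughly as follows. This is a toy model, not the code from this PR: `ToyCache`, `PendingBulkFetch`, `begin_bulk_fetch`, `finish_bulk_fetch` and `lookup_many` are hypothetical names, and the standard library's `concurrent.futures.Future` stands in for a Twisted `Deferred` so that the example runs without extra dependencies.

```python
from concurrent.futures import Future
from typing import Any, Dict, Hashable, Iterable, List, Tuple


class PendingBulkFetch:
    """One in-flight bulk fetch covering many keys (hypothetical helper)."""

    def __init__(self, keys: Iterable[Hashable]) -> None:
        self.keys = set(keys)
        # Per-key futures are created lazily, only when a caller asks for one.
        self.waiters: Dict[Hashable, Future] = {}

    def future_for(self, key: Hashable) -> Future:
        fut = self.waiters.get(key)
        if fut is None:
            fut = self.waiters[key] = Future()
        return fut

    def complete(self, results: Dict[Hashable, Any]) -> None:
        # Only keys that somebody actually waited on need resolving.
        for key, fut in self.waiters.items():
            fut.set_result(results[key])


class ToyCache:
    """Toy stand-in for a deferred-aware cache (assumption, not Synapse's API)."""

    def __init__(self) -> None:
        self.completed: Dict[Hashable, Any] = {}
        self.pending: Dict[Hashable, PendingBulkFetch] = {}

    def begin_bulk_fetch(self, keys: Iterable[Hashable]) -> PendingBulkFetch:
        # Change 1: register one shared entry for all keys, with no per-key future.
        fetch = PendingBulkFetch(keys)
        for key in fetch.keys:
            self.pending[key] = fetch
        return fetch

    def finish_bulk_fetch(
        self, fetch: PendingBulkFetch, results: Dict[Hashable, Any]
    ) -> None:
        for key in fetch.keys:
            if self.pending.get(key) is fetch:
                del self.pending[key]
            self.completed[key] = results[key]
        fetch.complete(results)

    def lookup_many(
        self, keys: Iterable[Hashable]
    ) -> Tuple[Dict[Hashable, Any], Dict[Hashable, Future], List[Hashable]]:
        # Change 2: a single bulk call that sorts keys into cached, in-flight
        # and missing, instead of the caller looping over single-key lookups.
        cached: Dict[Hashable, Any] = {}
        waiting: Dict[Hashable, Future] = {}
        missing: List[Hashable] = []
        for key in keys:
            if key in self.completed:
                cached[key] = self.completed[key]
            elif key in self.pending:
                waiting[key] = self.pending[key].future_for(key)
            else:
                missing.append(key)
        return cached, waiting, missing


cache = ToyCache()
fetch = cache.begin_bulk_fetch(["a", "b", "c"])
cached, waiting, missing = cache.lookup_many(["a", "z"])
cache.finish_bulk_fetch(fetch, {"a": 1, "b": 2, "c": 3})
print(len(fetch.waiters), waiting["a"].result(), missing)
```

In this sketch a bulk fetch of three keys allocates a future only for the one key that a later caller asked about, which is the saving the first change is after.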

Reviewable commit-by-commit.

@erikjohnston erikjohnston force-pushed the erikj/cached_list_speed branch from 90c76f0 to a5c3fec Compare August 23, 2022 07:43
This is mostly around making `CacheEntry` a generic class and moving
some of the callbacks out into methods so that we can reuse them.
This allows us to avoid creating a deferred per key up front when doing
bulk fetch operations from the DB.
@erikjohnston erikjohnston force-pushed the erikj/cached_list_speed branch from a5c3fec to ee80912 Compare August 23, 2022 07:45
@erikjohnston erikjohnston force-pushed the erikj/cached_list_speed branch from ee80912 to ab035d2 Compare August 23, 2022 08:04
@erikjohnston erikjohnston marked this pull request as ready for review August 23, 2022 08:37
@erikjohnston erikjohnston requested a review from a team as a code owner August 23, 2022 08:37
@reivilibre reivilibre requested review from reivilibre and removed request for a team August 23, 2022 10:51
@reivilibre reivilibre left a comment

Looks good overall, mostly just naming concerns to make it easier to follow for us noobs

synapse/util/caches/deferred_cache.py: four review threads (outdated, resolved)
@erikjohnston erikjohnston enabled auto-merge (squash) August 23, 2022 13:38
@erikjohnston erikjohnston merged commit f7ddfe1 into develop Aug 23, 2022
@erikjohnston erikjohnston deleted the erikj/cached_list_speed branch August 23, 2022 14:53
@MadLittleMods MadLittleMods added A-Performance Performance, both client-facing and admin-facing T-Task Refactoring, removal, replacement, enabling or disabling functionality, other engineering tasks. labels Sep 6, 2022
@MadLittleMods MadLittleMods mentioned this pull request Sep 6, 2022
6 tasks
@@ -0,0 +1 @@
Improve performance of `@cachedList`.

Great progress @erikjohnston 🎉

Here are the before/after timings for this PR, using the `have_seen_events` benchmark from #13561:

| # events | Benchmark run | Before | After |
| --- | --- | --- | --- |
| 50k | 1, cold cache | 3.7170820236206055 | 1.062384843826294 |
| 50k | 2, warm cache | 0.2985079288482666 | 0.18310809135437012 |
| 50k | 3, warm cache | 0.28847789764404297 | 0.17781400680541992 |
| 50k | 4, odds | 0.1537461280822754 | 0.09939789772033691 |
| 50k | 5, odds | 0.14780497550964355 | 0.09195113182067871 |
| 50k | 6, evens | 0.1475691795349121 | 0.08882665634155273 |
| 50k | 7, evens | 0.14868617057800293 | 0.08661413192749023 |
| 100k | 1, cold cache | 8.10055136680603 | 2.264657735824585 |
| 100k | 2, warm cache | 0.6121761798858643 | 0.3539161682128906 |
| 100k | 3, warm cache | 0.6093218326568604 | 0.3556997776031494 |
| 100k | 4, odds | 0.29950785636901855 | 0.17580008506774902 |
| 100k | 5, odds | 0.3049640655517578 | 0.17888712882995605 |
| 100k | 6, evens | 0.3025388717651367 | 0.1756150722503662 |
| 100k | 7, evens | 0.29833483695983887 | 0.17418384552001953 |
| 200k | 1, cold cache | 19.106724977493286 | 4.827088117599487 |
| 200k | 2, warm cache | 22.98161005973816 | 4.958348035812378 |
| 200k | 3, warm cache | 23.126408100128174 | 4.923892974853516 |
| 200k | 4, odds | 11.401129007339478 | 2.5139119625091553 |
| 200k | 5, odds | 0.6159579753875732 | 0.3530576229095459 |
| 200k | 6, evens | 12.087002992630005 | 2.465486764907837 |
| 200k | 7, evens | 0.6241748332977295 | 0.3511021137237549 |
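
For a quick sense of scale, the cold-cache speed-up can be computed directly from the timings above. This is only a small arithmetic sketch: the numbers are copied from the benchmark output, while the `speedup` helper and the dictionary layout are my own illustration and not part of the benchmark script.

```python
# (before, after) cold-cache timings copied from the benchmark output above,
# in whatever unit the benchmark reports.
cold_cache = {
    "50k": (3.7170820236206055, 1.062384843826294),
    "100k": (8.10055136680603, 2.264657735824585),
    "200k": (19.106724977493286, 4.827088117599487),
}


def speedup(before: float, after: float) -> float:
    """Return how many times faster the 'after' run is than the 'before' run."""
    return before / after


speedups = {n: speedup(before, after) for n, (before, after) in cold_cache.items()}

for n, ratio in speedups.items():
    print(f"{n} events, cold cache: {ratio:.2f}x faster")
```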

Fizzadar added a commit to beeper/synapse-legacy-fork that referenced this pull request Sep 15, 2022
Synapse 1.67.0 (2022-09-13)
===========================

This release removes using the deprecated direct TCP replication configuration
for workers. Server admins should use Redis instead. See the [upgrade
notes](https://matrix-org.github.io/synapse/v1.67/upgrade.html#upgrading-to-v1670).

The minimum version of `poetry` supported for managing source checkouts is now
1.2.0.

**Notice:** from the next major release (1.68.0) installing Synapse from a source
checkout will require a recent Rust compiler. Those using packages or
`pip install matrix-synapse` will not be affected. See the [upgrade
notes](https://matrix-org.github.io/synapse/v1.67/upgrade.html#upgrading-to-v1670).

**Notice:** from the next major release (1.68.0), running Synapse with a SQLite
database will require SQLite version 3.27.0 or higher. (The [current minimum
 version is SQLite 3.22.0](https://github.com/matrix-org/synapse/blob/release-v1.67/synapse/storage/engines/sqlite.py#L69-L78).)
See [matrix-org#12983](matrix-org#12983) and the [upgrade notes](https://matrix-org.github.io/synapse/v1.67/upgrade.html#upgrading-to-v1670) for more details.

No significant changes since 1.67.0rc1.

Synapse 1.67.0rc1 (2022-09-06)
==============================

Features
--------

- Support setting the registration shared secret in a file, via a new `registration_shared_secret_path` configuration option. ([\matrix-org#13614](matrix-org#13614))
- Change the default startup behaviour so that any missing "additional" configuration files (signing key, etc) are generated automatically. ([\matrix-org#13615](matrix-org#13615))
- Improve performance of sending messages in rooms with thousands of local users. ([\matrix-org#13634](matrix-org#13634))

Bugfixes
--------

- Fix a bug introduced in Synapse 1.13 where the [List Rooms admin API](https://matrix-org.github.io/synapse/develop/admin_api/rooms.html#list-room-api) would return integers instead of booleans for the `federatable` and `public` fields when using a Sqlite database. ([\matrix-org#13509](matrix-org#13509))
- Fix bug that user cannot `/forget` rooms after the last member has left the room. ([\matrix-org#13546](matrix-org#13546))
- Faster Room Joins: fix `/make_knock` blocking indefinitely when the room in question is a partial-stated room. ([\matrix-org#13583](matrix-org#13583))
- Fix loading the current stream position behind the actual position. ([\matrix-org#13585](matrix-org#13585))
- Fix a longstanding bug in `register_new_matrix_user` which meant it was always necessary to explicitly give a server URL. ([\matrix-org#13616](matrix-org#13616))
- Fix the running of [MSC1763](matrix-org/matrix-spec-proposals#1763) retention purge_jobs in deployments with background jobs running on a worker by forcing them back onto the main worker. Contributed by Brad @ Beeper. ([\matrix-org#13632](matrix-org#13632))
- Fix a long-standing bug that downloaded media for URL previews was not deleted while database background updates were running. ([\matrix-org#13657](matrix-org#13657))
- Fix [MSC3030](matrix-org/matrix-spec-proposals#3030) `/timestamp_to_event` endpoint to return the correct next event when the events have the same timestamp. ([\matrix-org#13658](matrix-org#13658))
- Fix bug where we wedge media plugins if clients disconnect early. Introduced in v1.22.0. ([\matrix-org#13660](matrix-org#13660))
- Fix a long-standing bug which meant that keys for unwhitelisted servers were not returned by `/_matrix/key/v2/query`. ([\matrix-org#13683](matrix-org#13683))
- Fix a bug introduced in Synapse v1.20.0 that would cause the unstable unread counts from [MSC2654](matrix-org/matrix-spec-proposals#2654) to be calculated even if the feature is disabled. ([\matrix-org#13694](matrix-org#13694))

Updates to the Docker image
---------------------------

- Update docker image to use a stable version of poetry. ([\matrix-org#13688](matrix-org#13688))

Improved Documentation
----------------------

- Improve the description of the ["chain cover index"](https://matrix-org.github.io/synapse/latest/auth_chain_difference_algorithm.html) used internally by Synapse. ([\matrix-org#13602](matrix-org#13602))
- Document how ["monthly active users"](https://matrix-org.github.io/synapse/latest/usage/administration/monthly_active_users.html) is calculated and used. ([\matrix-org#13617](matrix-org#13617))
- Improve documentation around user registration. ([\matrix-org#13640](matrix-org#13640))
- Remove documentation of legacy `frontend_proxy` worker app. ([\matrix-org#13645](matrix-org#13645))
- Clarify documentation that HTTP replication traffic can be protected with a shared secret. ([\matrix-org#13656](matrix-org#13656))
- Remove unintentional colons from [config manual](https://matrix-org.github.io/synapse/latest/usage/configuration/config_documentation.html) headers. ([\matrix-org#13665](matrix-org#13665))
- Update docs to make enabling metrics more clear. ([\matrix-org#13678](matrix-org#13678))
- Clarify `(room_id, event_id)` global uniqueness and how we should scope our database schemas. ([\matrix-org#13701](matrix-org#13701))

Deprecations and Removals
-------------------------

- Drop support for calling `/_matrix/client/v3/rooms/{roomId}/invite` without an `id_access_token`, which was not permitted by the spec. Contributed by @Vetchu. ([\matrix-org#13241](matrix-org#13241))
- Remove redundant `_get_joined_users_from_context` cache. Contributed by Nick @ Beeper (@Fizzadar). ([\matrix-org#13569](matrix-org#13569))
- Remove the ability to use direct TCP replication with workers. Direct TCP replication was deprecated in Synapse v1.18.0. Workers now require using Redis. ([\matrix-org#13647](matrix-org#13647))
- Remove support for unstable [private read receipts](matrix-org/matrix-spec-proposals#2285). ([\matrix-org#13653](matrix-org#13653), [\matrix-org#13692](matrix-org#13692))

Internal Changes
----------------

- Extend the release script to wait for GitHub Actions to finish and to be usable as a guide for the whole process. ([\matrix-org#13483](matrix-org#13483))
- Add experimental configuration option to allow disabling legacy Prometheus metric names. ([\matrix-org#13540](matrix-org#13540))
- Cache user IDs instead of profiles to reduce cache memory usage. Contributed by Nick @ Beeper (@Fizzadar). ([\matrix-org#13573](matrix-org#13573), [\matrix-org#13600](matrix-org#13600))
- Optimize how Synapse calculates domains to fetch from during backfill. ([\matrix-org#13575](matrix-org#13575))
- Comment about a better future where we can get the state diff between two events. ([\matrix-org#13586](matrix-org#13586))
- Instrument `_check_sigs_and_hash_and_fetch` to trace time spent in child concurrent calls for understandable traces in Jaeger. ([\matrix-org#13588](matrix-org#13588))
- Improve performance of `@cachedList`. ([\matrix-org#13591](matrix-org#13591))
- Minor speed up of fetching large numbers of push rules. ([\matrix-org#13592](matrix-org#13592))
- Optimise push action fetching queries. Contributed by Nick @ Beeper (@Fizzadar). ([\matrix-org#13597](matrix-org#13597))
- Rename `event_map` to `unpersisted_events` when computing the auth differences. ([\matrix-org#13603](matrix-org#13603))
- Refactor `get_users_in_room(room_id)` mis-use with dedicated `get_current_hosts_in_room(room_id)` function. ([\matrix-org#13605](matrix-org#13605))
- Use dedicated `get_local_users_in_room(room_id)` function to find local users when calculating `join_authorised_via_users_server` of a `/make_join` request. ([\matrix-org#13606](matrix-org#13606))
- Refactor `get_users_in_room(room_id)` mis-use to lookup single local user with dedicated `check_local_user_in_room(...)` function. ([\matrix-org#13608](matrix-org#13608))
- Drop unused column `application_services_state.last_txn`. ([\matrix-org#13627](matrix-org#13627))
- Improve readability of Complement CI logs by printing failure results last. ([\matrix-org#13639](matrix-org#13639))
- Generalise the `@cancellable` annotation so it can be used on functions other than just servlet methods. ([\matrix-org#13662](matrix-org#13662))
- Introduce a `CommonUsageMetrics` class to share some usage metrics between the Prometheus exporter and the phone home stats. ([\matrix-org#13671](matrix-org#13671))
- Add some logging to help track down matrix-org#13444. ([\matrix-org#13679](matrix-org#13679))
- Update poetry lock file for v1.2.0. ([\matrix-org#13689](matrix-org#13689))
- Add cache to `is_partial_state_room`. ([\matrix-org#13693](matrix-org#13693))
- Update the Grafana dashboard that is included with Synapse in the `contrib` directory. ([\matrix-org#13697](matrix-org#13697))
- Only run trial CI on all python versions on non-PRs. ([\matrix-org#13698](matrix-org#13698))
- Fix typechecking with latest types-jsonschema. ([\matrix-org#13712](matrix-org#13712))
- Reduce number of CI checks we run for PRs. ([\matrix-org#13713](matrix-org#13713))

# Conflicts:
#	synapse/config/experimental.py
#	synapse/push/bulk_push_rule_evaluator.py
#	synapse/storage/databases/main/event_push_actions.py
#	synapse/util/caches/descriptors.py