Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Limit how often GC happens by time. #9902

Merged
merged 6 commits into from
May 5, 2021
Merged

Conversation

erikjohnston
Copy link
Member

Synapse can be quite memory intensive, and unless care is taken to tune
the GC thresholds it can end up thrashing, causing noticable performance
problems for large servers. We fix this by limiting how often we GC a
given generation, regardless of current counts/thresholds.

This does not help with the reverse problem where the thresholds are set
too high, but that should only happen in situations where they've been
manually configured.

Adds a gc_min_seconds_between config option to override the defaults.

Fixes #9890.

Synapse can be quite memory intensive, and unless care is taken to tune
the GC thresholds it can end up thrashing, causing noticable performance
problems for large servers. We fix this by limiting how often we GC a
given generation, regardless of current counts/thresholds.

This does not help with the reverse problem where the thresholds are set
too high, but that should only happen in situations where they've been
manually configured.

Adds a `gc_min_seconds_between` config option to override the defaults.

Fixes #9890.
@erikjohnston erikjohnston requested a review from a team April 28, 2021 14:27
# We check if we need to do one based on a straightforward
# comparison between the threshold and count. We also do an extra
# check to make sure that we don't a GC too often.
if threshold[i] < counts[i] and MIN_TIME_BETWEEN_GCS[i] < end - _last_gc[i]:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can become out-of-bounds if someone specifies [1, 10] or similar accidentally.

Suggested change
if threshold[i] < counts[i] and MIN_TIME_BETWEEN_GCS[i] < end - _last_gc[i]:
if threshold[i] < counts[i] and len(MIN_TIME_BETWEEN_GCS) => i and MIN_TIME_BETWEEN_GCS[i] < end - _last_gc[i]:

I'm also wary of mutating a "constant" like this, because it can give mixed signals in the code.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, mutating is a bit icky, but I can't think of a better way of injecting the values at this point.

This can become out-of-bounds if someone specifies [1, 10] or similar accidentally.

We check for the right format in the config, so I think its fine that this explodes if a developer messes up and sets it incorrectly.

Copy link
Member

@richvdh richvdh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry, I have Opinions about configuration settings.

synapse/metrics/__init__.py Outdated Show resolved Hide resolved
synapse/config/server.py Outdated Show resolved Hide resolved
synapse/config/server.py Outdated Show resolved Hide resolved
synapse/config/server.py Outdated Show resolved Hide resolved
synapse/config/server.py Outdated Show resolved Hide resolved
synapse/metrics/__init__.py Outdated Show resolved Hide resolved
@erikjohnston erikjohnston requested a review from richvdh May 4, 2021 12:01
docs/sample_config.yaml Outdated Show resolved Hide resolved
synapse/config/server.py Outdated Show resolved Hide resolved
synapse/config/server.py Show resolved Hide resolved
@erikjohnston erikjohnston requested a review from richvdh May 4, 2021 13:51
Copy link
Member

@richvdh richvdh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know why we're working in seconds here where we use ms literally everwhere else, but lgtm otherwise.

@erikjohnston erikjohnston merged commit 1fb9a2d into develop May 5, 2021
@erikjohnston erikjohnston deleted the erikj/limit_how_often_gc branch May 5, 2021 15:53
@erikjohnston
Copy link
Member Author

I don't know why we're working in seconds here where we use ms literally everwhere else, but lgtm otherwise.

Where we using it everything is done in seconds (as we don't have access to the standard clocks), so using milliseconds would involve a lot of repeated conversion

aaronraimist added a commit to aaronraimist/synapse that referenced this pull request May 19, 2021
Synapse 1.34.0 (2021-05-17)
===========================

This release deprecates the `room_invite_state_types` configuration setting. See the [upgrade notes](https://github.com/matrix-org/synapse/blob/release-v1.34.0/UPGRADE.rst#upgrading-to-v1340) for instructions on updating your configuration file to use the new `room_prejoin_state` setting.

This release also deprecates the `POST /_synapse/admin/v1/rooms/<room_id>/delete` admin API route. Server administrators are encouraged to update their scripts to use the new `DELETE /_synapse/admin/v1/rooms/<room_id>` route instead.

No significant changes since v1.34.0rc1.

Synapse 1.34.0rc1 (2021-05-12)
==============================

Features
--------

- Add experimental option to track memory usage of the caches. ([\matrix-org#9881](matrix-org#9881))
- Add support for `DELETE /_synapse/admin/v1/rooms/<room_id>`. ([\matrix-org#9889](matrix-org#9889))
- Add limits to how often Synapse will GC, ensuring that large servers do not end up GC thrashing if `gc_thresholds` has not been correctly set. ([\matrix-org#9902](matrix-org#9902))
- Improve performance of sending events for worker-based deployments using Redis. ([\matrix-org#9905](matrix-org#9905), [\matrix-org#9950](matrix-org#9950), [\matrix-org#9951](matrix-org#9951))
- Improve performance after joining a large room when presence is enabled. ([\matrix-org#9910](matrix-org#9910), [\matrix-org#9916](matrix-org#9916))
- Support stable identifiers for [MSC1772](matrix-org/matrix-spec-proposals#1772) Spaces. `m.space.child` events will now be taken into account when populating the experimental spaces summary response. Please see [the upgrade notes](https://github.com/matrix-org/synapse/blob/release-v1.34.0/UPGRADE.rst#upgrading-to-v1340) if you have customised `room_invite_state_types` in your configuration. ([\matrix-org#9915](matrix-org#9915), [\matrix-org#9966](matrix-org#9966))
- Improve performance of backfilling in large rooms. ([\matrix-org#9935](matrix-org#9935))
- Add a config option to allow you to prevent device display names from being shared over federation. Contributed by @aaronraimist. ([\matrix-org#9945](matrix-org#9945))
- Update support for [MSC2946](matrix-org/matrix-spec-proposals#2946): Spaces Summary. ([\matrix-org#9947](matrix-org#9947), [\matrix-org#9954](matrix-org#9954))

Bugfixes
--------

- Fix a bug introduced in v1.32.0 where the associated connection was improperly logged for SQL logging statements. ([\matrix-org#9895](matrix-org#9895))
- Correct the type hint for the `user_may_create_room_alias` method of spam checkers. It is provided a `RoomAlias`, not a `str`. ([\matrix-org#9896](matrix-org#9896))
- Fix bug where user directory could get out of sync if room visibility and membership changed in quick succession. ([\matrix-org#9910](matrix-org#9910))
- Include the `origin_server_ts` property in the experimental [MSC2946](matrix-org/matrix-spec-proposals#2946) support to allow clients to properly sort rooms. ([\matrix-org#9928](matrix-org#9928))
- Fix bugs introduced in v1.23.0 which made the PostgreSQL port script fail when run with a newly-created SQLite database. ([\matrix-org#9930](matrix-org#9930))
- Fix a bug introduced in Synapse 1.29.0 which caused `m.room_key_request` to-device messages sent from one user to another to be dropped. ([\matrix-org#9961](matrix-org#9961), [\matrix-org#9965](matrix-org#9965))
- Fix a bug introduced in v1.27.0 preventing users and appservices exempt from ratelimiting from creating rooms with many invitees. ([\matrix-org#9968](matrix-org#9968))

Updates to the Docker image
---------------------------

- Add `startup_delay` to docker healthcheck to reduce waiting time for coming online and update the documentation with extra options. Contributed by @maquis196. ([\matrix-org#9913](matrix-org#9913))

Improved Documentation
----------------------

- Add `port` argument to the Postgres database sample config section. ([\matrix-org#9911](matrix-org#9911))

Deprecations and Removals
-------------------------

- Mark as deprecated `POST /_synapse/admin/v1/rooms/<room_id>/delete`. ([\matrix-org#9889](matrix-org#9889))

Internal Changes
----------------

- Reduce the length of Synapse's access tokens. ([\matrix-org#5588](matrix-org#5588))
- Export jemalloc stats to Prometheus if it is being used. ([\matrix-org#9882](matrix-org#9882))
- Add type hints to presence handler. ([\matrix-org#9885](matrix-org#9885))
- Reduce memory usage of the LRU caches. ([\matrix-org#9886](matrix-org#9886))
- Add type hints to the `synapse.handlers` module. ([\matrix-org#9896](matrix-org#9896))
- Time response time for external cache requests. ([\matrix-org#9904](matrix-org#9904))
- Minor fixes to the `make_full_schema.sh` script. ([\matrix-org#9931](matrix-org#9931))
- Move database schema files into a common directory. ([\matrix-org#9932](matrix-org#9932))
- Add debug logging for lost/delayed to-device messages. ([\matrix-org#9959](matrix-org#9959))
vy-let added a commit to vy-let/synapse that referenced this pull request May 31, 2021
Synapse 1.34.0 (2021-05-17)
===========================

This release deprecates the `room_invite_state_types` configuration setting. See the [upgrade notes](https://github.com/matrix-org/synapse/blob/release-v1.34.0/UPGRADE.rst#upgrading-to-v1340) for instructions on updating your configuration file to use the new `room_prejoin_state` setting.

This release also deprecates the `POST /_synapse/admin/v1/rooms/<room_id>/delete` admin API route. Server administrators are encouraged to update their scripts to use the new `DELETE /_synapse/admin/v1/rooms/<room_id>` route instead.

No significant changes since v1.34.0rc1.

Synapse 1.34.0rc1 (2021-05-12)
==============================

Features
--------

- Add experimental option to track memory usage of the caches. ([\matrix-org#9881](matrix-org#9881))
- Add support for `DELETE /_synapse/admin/v1/rooms/<room_id>`. ([\matrix-org#9889](matrix-org#9889))
- Add limits to how often Synapse will GC, ensuring that large servers do not end up GC thrashing if `gc_thresholds` has not been correctly set. ([\matrix-org#9902](matrix-org#9902))
- Improve performance of sending events for worker-based deployments using Redis. ([\matrix-org#9905](matrix-org#9905), [\matrix-org#9950](matrix-org#9950), [\matrix-org#9951](matrix-org#9951))
- Improve performance after joining a large room when presence is enabled. ([\matrix-org#9910](matrix-org#9910), [\matrix-org#9916](matrix-org#9916))
- Support stable identifiers for [MSC1772](matrix-org/matrix-spec-proposals#1772) Spaces. `m.space.child` events will now be taken into account when populating the experimental spaces summary response. Please see [the upgrade notes](https://github.com/matrix-org/synapse/blob/release-v1.34.0/UPGRADE.rst#upgrading-to-v1340) if you have customised `room_invite_state_types` in your configuration. ([\matrix-org#9915](matrix-org#9915), [\matrix-org#9966](matrix-org#9966))
- Improve performance of backfilling in large rooms. ([\matrix-org#9935](matrix-org#9935))
- Add a config option to allow you to prevent device display names from being shared over federation. Contributed by @aaronraimist. ([\matrix-org#9945](matrix-org#9945))
- Update support for [MSC2946](matrix-org/matrix-spec-proposals#2946): Spaces Summary. ([\matrix-org#9947](matrix-org#9947), [\matrix-org#9954](matrix-org#9954))

Bugfixes
--------

- Fix a bug introduced in v1.32.0 where the associated connection was improperly logged for SQL logging statements. ([\matrix-org#9895](matrix-org#9895))
- Correct the type hint for the `user_may_create_room_alias` method of spam checkers. It is provided a `RoomAlias`, not a `str`. ([\matrix-org#9896](matrix-org#9896))
- Fix bug where user directory could get out of sync if room visibility and membership changed in quick succession. ([\matrix-org#9910](matrix-org#9910))
- Include the `origin_server_ts` property in the experimental [MSC2946](matrix-org/matrix-spec-proposals#2946) support to allow clients to properly sort rooms. ([\matrix-org#9928](matrix-org#9928))
- Fix bugs introduced in v1.23.0 which made the PostgreSQL port script fail when run with a newly-created SQLite database. ([\matrix-org#9930](matrix-org#9930))
- Fix a bug introduced in Synapse 1.29.0 which caused `m.room_key_request` to-device messages sent from one user to another to be dropped. ([\matrix-org#9961](matrix-org#9961), [\matrix-org#9965](matrix-org#9965))
- Fix a bug introduced in v1.27.0 preventing users and appservices exempt from ratelimiting from creating rooms with many invitees. ([\matrix-org#9968](matrix-org#9968))

Updates to the Docker image
---------------------------

- Add `startup_delay` to docker healthcheck to reduce waiting time for coming online and update the documentation with extra options. Contributed by @maquis196. ([\matrix-org#9913](matrix-org#9913))

Improved Documentation
----------------------

- Add `port` argument to the Postgres database sample config section. ([\matrix-org#9911](matrix-org#9911))

Deprecations and Removals
-------------------------

- Mark as deprecated `POST /_synapse/admin/v1/rooms/<room_id>/delete`. ([\matrix-org#9889](matrix-org#9889))

Internal Changes
----------------

- Reduce the length of Synapse's access tokens. ([\matrix-org#5588](matrix-org#5588))
- Export jemalloc stats to Prometheus if it is being used. ([\matrix-org#9882](matrix-org#9882))
- Add type hints to presence handler. ([\matrix-org#9885](matrix-org#9885))
- Reduce memory usage of the LRU caches. ([\matrix-org#9886](matrix-org#9886))
- Add type hints to the `synapse.handlers` module. ([\matrix-org#9896](matrix-org#9896))
- Time response time for external cache requests. ([\matrix-org#9904](matrix-org#9904))
- Minor fixes to the `make_full_schema.sh` script. ([\matrix-org#9931](matrix-org#9931))
- Move database schema files into a common directory. ([\matrix-org#9932](matrix-org#9932))
- Add debug logging for lost/delayed to-device messages. ([\matrix-org#9959](matrix-org#9959))
babolivier added a commit to matrix-org/synapse-dinsic that referenced this pull request Sep 1, 2021
Synapse 1.34.0 (2021-05-17)
===========================

This release deprecates the `room_invite_state_types` configuration setting. See the [upgrade notes](https://github.com/matrix-org/synapse/blob/release-v1.34.0/UPGRADE.rst#upgrading-to-v1340) for instructions on updating your configuration file to use the new `room_prejoin_state` setting.

This release also deprecates the `POST /_synapse/admin/v1/rooms/<room_id>/delete` admin API route. Server administrators are encouraged to update their scripts to use the new `DELETE /_synapse/admin/v1/rooms/<room_id>` route instead.

No significant changes since v1.34.0rc1.

Synapse 1.34.0rc1 (2021-05-12)
==============================

Features
--------

- Add experimental option to track memory usage of the caches. ([\#9881](matrix-org/synapse#9881))
- Add support for `DELETE /_synapse/admin/v1/rooms/<room_id>`. ([\#9889](matrix-org/synapse#9889))
- Add limits to how often Synapse will GC, ensuring that large servers do not end up GC thrashing if `gc_thresholds` has not been correctly set. ([\#9902](matrix-org/synapse#9902))
- Improve performance of sending events for worker-based deployments using Redis. ([\#9905](matrix-org/synapse#9905), [\#9950](matrix-org/synapse#9950), [\#9951](matrix-org/synapse#9951))
- Improve performance after joining a large room when presence is enabled. ([\#9910](matrix-org/synapse#9910), [\#9916](matrix-org/synapse#9916))
- Support stable identifiers for [MSC1772](matrix-org/matrix-spec-proposals#1772) Spaces. `m.space.child` events will now be taken into account when populating the experimental spaces summary response. Please see [the upgrade notes](https://github.com/matrix-org/synapse/blob/release-v1.34.0/UPGRADE.rst#upgrading-to-v1340) if you have customised `room_invite_state_types` in your configuration. ([\#9915](matrix-org/synapse#9915), [\#9966](matrix-org/synapse#9966))
- Improve performance of backfilling in large rooms. ([\#9935](matrix-org/synapse#9935))
- Add a config option to allow you to prevent device display names from being shared over federation. Contributed by @aaronraimist. ([\#9945](matrix-org/synapse#9945))
- Update support for [MSC2946](matrix-org/matrix-spec-proposals#2946): Spaces Summary. ([\#9947](matrix-org/synapse#9947), [\#9954](matrix-org/synapse#9954))

Bugfixes
--------

- Fix a bug introduced in v1.32.0 where the associated connection was improperly logged for SQL logging statements. ([\#9895](matrix-org/synapse#9895))
- Correct the type hint for the `user_may_create_room_alias` method of spam checkers. It is provided a `RoomAlias`, not a `str`. ([\#9896](matrix-org/synapse#9896))
- Fix bug where user directory could get out of sync if room visibility and membership changed in quick succession. ([\#9910](matrix-org/synapse#9910))
- Include the `origin_server_ts` property in the experimental [MSC2946](matrix-org/matrix-spec-proposals#2946) support to allow clients to properly sort rooms. ([\#9928](matrix-org/synapse#9928))
- Fix bugs introduced in v1.23.0 which made the PostgreSQL port script fail when run with a newly-created SQLite database. ([\#9930](matrix-org/synapse#9930))
- Fix a bug introduced in Synapse 1.29.0 which caused `m.room_key_request` to-device messages sent from one user to another to be dropped. ([\#9961](matrix-org/synapse#9961), [\#9965](matrix-org/synapse#9965))
- Fix a bug introduced in v1.27.0 preventing users and appservices exempt from ratelimiting from creating rooms with many invitees. ([\#9968](matrix-org/synapse#9968))

Updates to the Docker image
---------------------------

- Add `startup_delay` to docker healthcheck to reduce waiting time for coming online and update the documentation with extra options. Contributed by @maquis196. ([\#9913](matrix-org/synapse#9913))

Improved Documentation
----------------------

- Add `port` argument to the Postgres database sample config section. ([\#9911](matrix-org/synapse#9911))

Deprecations and Removals
-------------------------

- Mark as deprecated `POST /_synapse/admin/v1/rooms/<room_id>/delete`. ([\#9889](matrix-org/synapse#9889))

Internal Changes
----------------

- Reduce the length of Synapse's access tokens. ([\#5588](matrix-org/synapse#5588))
- Export jemalloc stats to Prometheus if it is being used. ([\#9882](matrix-org/synapse#9882))
- Add type hints to presence handler. ([\#9885](matrix-org/synapse#9885))
- Reduce memory usage of the LRU caches. ([\#9886](matrix-org/synapse#9886))
- Add type hints to the `synapse.handlers` module. ([\#9896](matrix-org/synapse#9896))
- Time response time for external cache requests. ([\#9904](matrix-org/synapse#9904))
- Minor fixes to the `make_full_schema.sh` script. ([\#9931](matrix-org/synapse#9931))
- Move database schema files into a common directory. ([\#9932](matrix-org/synapse#9932))
- Add debug logging for lost/delayed to-device messages. ([\#9959](matrix-org/synapse#9959))
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Limit how often we do a GC.
3 participants