Summary of performance impact of running on resource constrained devices such as SBCs #8428
Comments
Note the |
Once this is complete, it may be nice to place a formatted version into the repo's wiki. |
I've now added the section about presence, since that's the high-profile one as far as I'm concerned. @ptman thanks, I'm not familiar with that one, could you expand on its impact for me, or give an example room that would be blocked? I've joined several of the common so-called "large" rooms with 5000+ users and a long history without issue (after the initial join and disabling presence). @anoadragon453 whatever you think is best. Could you also tag this with the "performance" label so it shows up in the issue searches? |
@emorrp1 it's not about user count, more about participating homeserver count. Check out
|
So if I understand you correctly, you're saying that limit_remote_rooms is a mitigation for #7671? Do you have a suggested complexity value, since 1.0 seems small? Here are the 5 most complex rooms I've joined. The 3rd is one I participate in regularly without issue, though I've just sent you a message in Matrix HQ and can see that it maxed out my CPU, but not for long.
|
It was created by New Vector to limit problems on their smallest hosted instances: #5783 |
Note that this only disables the UI and idle tracking that element-web does. It mostly only makes sense to add to that config if the homeserver also has presence disabled. |
Hi all, I think I've covered all the issues I've seen. Please tell me if you experience any others or if you have any wording improvements. |
It's been a couple of weeks without comment, so I have converted the issue to a wiki page as suggested by @anoadragon453. Please make improvements there. https://github.com/matrix-org/synapse/wiki/Running-synapse-on-Single-board-computers |
The latest url is: https://matrix-org.github.io/synapse/latest/other/running_synapse_on_single_board_computers.html |
Relevant bit of source code: synapse/synapse/config/server.py Lines 231 to 232 in 119edf5
|
Description
I've been running my homeserver on a cubietruck at home now for some time and am often replying to statements like "you need loads of ram to join large rooms" with "it works fine for me". I thought it might be useful to curate a summary of the issues you're likely to run into to help as a scaling-down guide, maybe highlight these for development work or end up as documentation.
Performance Issues
Presence
This is the main reason people have a poor Matrix experience on resource-constrained homeservers. Element web will frequently report that the server is offline while the Python process is pegged at 100% CPU. This feature is used to tell when other users are active (have a client app in the foreground) and therefore more likely to respond, but it requires a lot of network activity to maintain even when nobody is talking in a room.
While synapse does have some performance issues with presence (#3971), the fundamental problem is that this is an easy feature to implement for a centralised service at nearly no overhead, but federation makes it combinatorial (#8055). There is also a client-side config option, enable_presence_by_hs_url, which disables the UI and idle tracking and can be used to blacklist the largest instances, but I didn't notice much difference, so I recommend disabling the feature entirely at the server level as well.
Joining
Joining a "large", federated room will initially fail with the below message in Element web, but waiting a while (10-60 mins) and trying again will succeed without any issue. What counts as "large" is not message history, user count, connections to homeservers or even a simple count of the state events; it is instead how long the state resolution algorithm takes. However, each of those numbers is a reasonable proxy, so we can use them as estimates, since user count is one of the few things you see before joining.
This is #1211 and will also hopefully be mitigated by peeking (matrix-org/matrix-spec-proposals#2753), so at least you don't need to wait for a join to complete before finding out if it's the kind of room you want. Note that you should first disable presence, otherwise it'll just make the situation worse (#3120). There is a lot of database interaction too, so make sure you've migrated your data from the default SQLite to PostgreSQL. Personally, I recommend patience: once the initial join is complete there are rarely any issues with actually interacting with the room, but if you like you can just block "large" rooms entirely.
Sessions
Anything that requires modifying the device list (#7721) will take a while to propagate, again showing the client as "Offline" until it's complete. This includes signing in and out, editing the public name and verifying e2ee. The main mitigation I recommend is to keep long-running sessions open, e.g. by using Firefox SSB "Use this site in App mode" or Chromium PWA "Install Element".
Recommended configuration
Put the below in a new file at /etc/matrix-synapse/conf.d/sbc.yaml to override the defaults in homeserver.yaml.
Currently the complexity is measured by current_state_events / 500. You can find join times and your most complex rooms like this:
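To illustrate only the formula quoted above, here is a minimal Python sketch that turns a room's state event count into a complexity score and compares it against a limit. The divisor of 500 comes from the text. The function names, the example rooms and the "strictly greater than the limit means blocked" comparison are assumptions made for illustration, not a copy of synapse's code.

```python
# Sketch of the complexity formula quoted in the text:
#   complexity = current_state_events / 500
# Everything named here is hypothetical and for illustration only.

STATE_EVENTS_PER_COMPLEXITY_UNIT = 500  # divisor taken from the text


def room_complexity(current_state_events: int) -> float:
    """Return the complexity score for a room with this many state events."""
    return current_state_events / STATE_EVENTS_PER_COMPLEXITY_UNIT


def would_be_blocked(current_state_events: int, complexity_limit: float) -> bool:
    """Assumption: a room is refused when its score exceeds the limit."""
    return room_complexity(current_state_events) > complexity_limit


if __name__ == "__main__":
    # Made-up rooms and state event counts, not measurements.
    example_rooms = {
        "#small:example.org": 250,
        "#busy:example.org": 4000,
    }
    limit = 1.0  # the value discussed in the comments as seeming small
    for alias, state_events in example_rooms.items():
        score = room_complexity(state_events)
        verdict = "blocked" if would_be_blocked(state_events, limit) else "allowed"
        print(f"{alias}: complexity {score:.1f} -> {verdict}")
```

Under this formula, a limit of 1.0 corresponds to about 500 current state events, which gives a feel for why that value can seem small for rooms federated across many homeservers.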
Version information