high CPU load by "matrix- _process_incoming_pdus_in_room_inner" #11818
Comments
What EMS was seeing sounds a little different, since I'm assuming @zxyz has restarted the service sometime since this started? It might be useful to grep the logs for `state-res`.
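A minimal sketch of what that grep might look like, assuming a standard log location (adjust the path to your deployment); Synapse periodically writes debug lines about the "biggest rooms for state-res" by CPU and DB time:

```shell
# Assumed log path; adjust to your deployment.
# Look for the periodic "biggest rooms for state-res" log lines.
grep 'state-res' /var/log/matrix-synapse/homeserver.log | tail -n 20
```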
Thanks @clokep for the hint. Synapse and also the VM got restarted/upgraded several times. I let the debug log run overnight; there are ~300 entries with `state-res`. Here is an excerpt:
"!OGEhHVWSdvArJzumhm:matrix.org" is Matrix HQ if I'm not mistaken. The other rooms in case of "3 biggest rooms" had much lower values for DB and CPU , not posting them here because I don't know if they're private rooms and I don't want to reveal anything private :). We were already considering setting "complexity: 200.0" on the server to basically just kick out "Matrix HQ" because of to many resource usage (state...). So maybe I'll try this now. Or are there other options? But with your
There are a few rooms with a c of 3 or 4 and one with 5. But there is one room with a c of 109 (!) (a federated room at privacytools.io). My Prometheus data goes back 90 days and this c=109 room was there the whole time. This room also has higher state-res values than most others in the debug log:
But should/can I do anything about this room? Thanks a lot, much appreciated! PS: Synapse was upgraded to 1.51.0 in between, no change.
Note that I don't believe setting that now will have your users leave Matrix HQ. You would want to delete the room afterward: https://matrix-org.github.io/synapse/develop/admin_api/rooms.html#delete-room-api (please read through that first before just running the commands!)
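For reference, a minimal sketch of that call, with placeholder hostname and admin access token (read the linked docs first; `purge` and `block` are destructive and not reversible):

```shell
# <ADMIN_ACCESS_TOKEN> is a server admin's token; URL-encode the room ID
# ("!" -> %21, ":" -> %3A).
curl -X POST \
  -H "Authorization: Bearer <ADMIN_ACCESS_TOKEN>" \
  -H "Content-Type: application/json" \
  -d '{"block": true, "purge": true}' \
  "https://matrix.example.com/_synapse/admin/v1/rooms/<room_id>/delete"
```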
See the forward extremities admin APIs: https://matrix-org.github.io/synapse/develop/admin_api/rooms.html#forward-extremities-admin-api. You might want to delete the forward extremities (this pretty much makes an event to simplify the overall DAG, which should help with state res). It mentions reading through #1760, though; we should probably move the important bits of that into our documentation. 😢
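A hedged sketch of the two calls (same placeholders as above):

```shell
# List a room's current forward extremities:
curl -H "Authorization: Bearer <ADMIN_ACCESS_TOKEN>" \
  "https://matrix.example.com/_synapse/admin/v1/rooms/<room_id>/forward_extremities"

# Delete the surplus extremities (Synapse keeps the most recent one):
curl -X DELETE \
  -H "Authorization: Bearer <ADMIN_ACCESS_TOKEN>" \
  "https://matrix.example.com/_synapse/admin/v1/rooms/<room_id>/forward_extremities"
```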
Thanks a lot @clokep, deleting the forward extremities did the trick!
I set `complexity: 200.0`, but the room still worked. Any ideas why? Then I deleted the room via the delete room API.
This failed after ~15 minutes with an error, and re-running the command fails again. But the data is still there; querying for the room still yields results.
So now I'd have to delete it by hand in the database (any hints on how best to do this)?
I deleted the "forward extremities" in the database with the postgres command from [1] and the CPU load normalized Tbh I still don't understand what "forward extremities" is and wasn't able to enlighten myself with a web research. I'd be really thankful if someone could enlighten me here! :) Having this documented would be really nice! 🐝 I'm closing the ticket. Thanks so much for your help @clokep! 💜 [1] #1760 (comment) |
The setting only applies to new rooms; this allows admins to join rooms that are more complex and still have them work.
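For reference, a hedged sketch of where that setting lives in homeserver.yaml, using the value mentioned earlier in this thread (check the config docs for your Synapse version):

```yaml
# Refuses *new* joins to remote rooms above the given complexity;
# rooms you are already in keep working.
limit_remote_rooms:
  enabled: true
  complexity: 200.0
  complexity_error: "This room is too complex for this homeserver."
```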
It probably would have been better to use the
Sounds like something in the purging broke, unfortunately. I don't have any ideas for a good way to poke at that. Most of the tables have a `room_id` column you could filter on if you do end up cleaning things up by hand.
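A rough sketch of the kind of manual cleanup that implies, on PostgreSQL (take a backup and stop Synapse first; this bypasses Synapse entirely):

```sql
-- Find the tables that carry per-room data.
SELECT table_name
FROM information_schema.columns
WHERE column_name = 'room_id'
  AND table_schema = 'public';

-- Then, for each table, something along the lines of:
-- DELETE FROM <table_name> WHERE room_id = '<room_id>';
```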
I'm glad it worked for you! Our docs have a bit of info on what a forward extremity is.
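In short, a forward extremity is an event with no children in the room's event graph, i.e. one of the current "latest" events in that room; when a room accumulates many of them, new events have to be resolved against all of those branches, which is what makes state resolution expensive. A minimal sketch for spotting such rooms directly in PostgreSQL (standard Synapse schema assumed):

```sql
-- Count forward extremities per room; rooms with many of them are the usual
-- suspects for heavy state resolution.
SELECT room_id, count(*) AS extremities
FROM event_forward_extremities
GROUP BY room_id
ORDER BY extremities DESC
LIMIT 10;
```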
You're welcome!
Description
Since around November 24th last year I have been observing strange metrics in Synapse.
This is the day I upgraded Synapse from 1.45.1 to 1.46 (that was the only change that day).
CPU load went up together with some other metrics (I'm referring to the Synapse Grafana dashboard; see below for screenshots):
What I tried:
Version information
Install method: matrix-docker-ansible-deploy playbook

Graphs:
Please let me know if you'd like to have some other screenshots or information.
Thanks a lot!