This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Homeserver not federating anymore after hour of downtime #3083

Closed
MartijnBraam opened this issue Apr 10, 2018 · 7 comments

Comments

@MartijnBraam

MartijnBraam commented Apr 10, 2018

Description

Synapse is not federating anymore after a reboot. I had about one hour of downtime on my Synapse server, and after two days I'm still not getting any messages in the federated rooms I've joined.

The log also doesn't seem useful, since it contains nothing other than INFO and DEBUG messages. I usually have a lot of ERROR and WARNING messages, since Twisted seems to be a dumpster fire if IPv6 is enabled.

I still see the typing notifications in the rooms but most new messages don't show up. I have one room in which I get a few messages a day, but things I send never appear in the same room on matrix.org.

Steps to reproduce

  • Stop synapse for an hour

This is the only thing in my logs getting repeated:

2018-04-10 13:49:24,946 - synapse.handlers.presence - 328 - INFO - - Handling presence timeouts
2018-04-10 13:49:24,946 - synapse.handlers.typing - 78 - INFO - - Checking for typing timeouts
2018-04-10 13:49:25,027 - synapse.storage.TIME - 213 - INFO - - Total database time: 0.003% {_update_client_ips_batch(2): 0.003%, update_cached_last_access_time(0): 0.000%, get_user_directory_stream_pos(0): 0.000%} {}
2018-04-10 13:49:29,927 - synapse.storage.txn - 229 - DEBUG - None- [TXN START] {_update_client_ips_batch-2f52}
2018-04-10 13:49:29,928 - synapse.storage.txn - 286 - DEBUG - None- [TXN END] {_update_client_ips_batch-2f52} 0.169189
2018-04-10 13:49:29,946 - synapse.handlers.presence - 328 - INFO - - Handling presence timeouts
2018-04-10 13:49:29,946 - synapse.handlers.typing - 78 - INFO - - Checking for typing timeouts
2018-04-10 13:49:34,927 - synapse.storage.txn - 229 - DEBUG - None- [TXN START] {_update_client_ips_batch-2f53}
2018-04-10 13:49:34,928 - synapse.storage.txn - 286 - DEBUG - None- [TXN END] {_update_client_ips_batch-2f53} 0.199951
2018-04-10 13:49:34,946 - synapse.handlers.presence - 328 - INFO - - Handling presence timeouts
2018-04-10 13:49:34,946 - synapse.handlers.presence - 243 - INFO - - Performing _persist_unpersisted_changes. Persisting 0 unpersisted changes
2018-04-10 13:49:34,946 - synapse.handlers.presence - 255 - INFO - - Finished _persist_unpersisted_changes
2018-04-10 13:49:34,946 - synapse.handlers.typing - 78 - INFO - - Checking for typing timeouts
2018-04-10 13:49:34,953 - synapse.storage.txn - 229 - DEBUG - None- [TXN START] {update_cached_last_access_time-2f54}
2018-04-10 13:49:34,954 - synapse.storage.txn - 286 - DEBUG - None- [TXN END] {update_cached_last_access_time-2f54} 0.170898
2018-04-10 13:49:35,027 - synapse.storage.TIME - 213 - INFO - - Total database time: 0.005% {_update_client_ips_batch(2): 0.004%, update_cached_last_access_time(1): 0.002%, get_user_directory_stream_pos(0): 0.000%} {}
2018-04-10 13:49:35,027 - synapse.util.caches.expiringcache - 139 - DEBUG - - [get_pdu_cache] _prune_cache before: 0, after len: 0
2018-04-10 13:49:39,927 - synapse.storage.txn - 229 - DEBUG - None- [TXN START] {_update_client_ips_batch-2f55}
2018-04-10 13:49:39,928 - synapse.storage.txn - 286 - DEBUG - None- [TXN END] {_update_client_ips_batch-2f55} 0.187012
2018-04-10 13:49:39,946 - synapse.handlers.presence - 328 - INFO - - Handling presence timeouts
2018-04-10 13:49:39,946 - synapse.handlers.typing - 78 - INFO - - Checking for typing timeouts
2018-04-10 13:49:44,927 - synapse.storage.txn - 229 - DEBUG - None- [TXN START] {_update_client_ips_batch-2f56}
2018-04-10 13:49:44,928 - synapse.storage.txn - 286 - DEBUG - None- [TXN END] {_update_client_ips_batch-2f56} 0.184082
2018-04-10 13:49:44,946 - synapse.handlers.presence - 328 - INFO - - Handling presence timeouts
2018-04-10 13:49:44,946 - synapse.handlers.typing - 78 - INFO - - Checking for typing timeouts
2018-04-10 13:49:45,027 - synapse.storage.TIME - 213 - INFO - - Total database time: 0.004% {_update_client_ips_batch(2): 0.004%, update_cached_last_access_time(0): 0.000%, get_user_directory_stream_pos(0): 0.000%} {}
2018-04-10 13:49:49,927 - synapse.storage.txn - 229 - DEBUG - None- [TXN START] {_update_client_ips_batch-2f57}
2018-04-10 13:49:49,928 - synapse.storage.txn - 286 - DEBUG - None- [TXN END] {_update_client_ips_batch-2f57} 0.173096
2018-04-10 13:49:49,946 - synapse.handlers.presence - 328 - INFO - - Handling presence timeouts
2018-04-10 13:49:49,946 - synapse.handlers.typing - 78 - INFO - - Checking for typing timeouts
2018-04-10 13:49:54,927 - synapse.storage.txn - 229 - DEBUG - None- [TXN START] {_update_client_ips_batch-2f58}
2018-04-10 13:49:54,928 - synapse.storage.txn - 286 - DEBUG - None- [TXN END] {_update_client_ips_batch-2f58} 0.201904
2018-04-10 13:49:54,946 - synapse.handlers.presence - 328 - INFO - - Handling presence timeouts
2018-04-10 13:49:54,946 - synapse.handlers.typing - 78 - INFO - - Checking for typing timeouts
2018-04-10 13:49:55,027 - synapse.storage.TIME - 213 - INFO - - Total database time: 0.004% {_update_client_ips_batch(2): 0.004%, update_cached_last_access_time(0): 0.000%, get_user_directory_stream_pos(0): 0.000%} {}
2018-04-10 13:49:59,927 - synapse.storage.txn - 229 - DEBUG - None- [TXN START] {_update_client_ips_batch-2f59}
2018-04-10 13:49:59,928 - synapse.storage.txn - 286 - DEBUG - None- [TXN END] {_update_client_ips_batch-2f59} 0.171143
2018-04-10 13:49:59,946 - synapse.handlers.presence - 328 - INFO - - Handling presence timeouts
2018-04-10 13:49:59,946 - synapse.handlers.typing - 78 - INFO - - Checking for typing timeouts
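To back up the observation that the log holds only INFO and DEBUG entries, the excerpt can be tallied by level. This is a minimal sketch that assumes the line layout shown above (`timestamp - logger - line - LEVEL - request - message`). The names `count_levels` and `federation_lines` are hypothetical helpers, and the idea that federation activity logs under a logger whose name contains `federation` is an assumption about the logging setup.

```python
import re
from collections import Counter

# Matches the prefix of the log lines quoted in this report:
# "<timestamp> - <logger> - <line number> - <LEVEL> - ..."
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) - "
    r"(?P<logger>\S+) - (?P<lineno>\d+) - (?P<level>[A-Z]+) - "
)


def count_levels(lines):
    """Count log lines per level; lines that do not match are skipped."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match:
            counts[match.group("level")] += 1
    return counts


def federation_lines(lines):
    """Return lines whose logger name contains 'federation' (assumed naming)."""
    result = []
    for line in lines:
        match = LINE_RE.match(line)
        if match and "federation" in match.group("logger"):
            result.append(line)
    return result


# Two lines copied from the excerpt above.
sample = [
    "2018-04-10 13:49:24,946 - synapse.handlers.presence - 328 - INFO - - Handling presence timeouts",
    "2018-04-10 13:49:29,927 - synapse.storage.txn - 229 - DEBUG - None- [TXN START] {_update_client_ips_batch-2f52}",
]
print(count_levels(sample))
print(federation_lines(sample))
```

If a tally over the full log shows no federation-related loggers at all, that suggests no federation traffic is reaching this process, as opposed to traffic arriving and failing.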

Version information

  • Homeserver: brixit.nl
  • Version: 0.27.2-1
  • Install method: Debian package
  • Platform: Running in a Debian 9 LXC container on Proxmox
@ara4n
Member

ara4n commented Apr 10, 2018

I'm surprised there's not something in there at ERROR level. As a hunch, can you check to see if this is fixed by #3082?

@MartijnBraam
Author

That patch doesn't apply cleanly on the current version in the Debian repository.

@MartijnBraam
Author

I've just upgraded to 0.27.2 and the problem persists. Here is the full homeserver log after a restart:

homeserver.log

@fuzzy76

fuzzy76 commented Apr 18, 2018

I had something very similar. The federation tester described at https://github.com/matrix-org/synapse/blob/master/README.rst#id35 helped me identify my DNS misconfiguration.

@MartijnBraam
Author

Nothing has changed in my DNS setup, and the checker is also happy with my config: https://matrix.org/federationtester/api/report?server_name=brixit.nl

@Half-Shot
Collaborator

I still see the typing notifications in the rooms but most new messages don't show up

Most, or all?

I'm wondering if your server is being federated to but is rejecting the messages. My hunch is that EDUs (such as typing notifications) are not validated much, while PDUs (such as messages) might be getting kicked out.
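To make the EDU/PDU distinction concrete: a federation transaction body carries persistent events in a `pdus` list and ephemeral ones in an `edus` list, so a transaction that delivers only typing notifications has entries in the latter and none in the former. The sketch below summarizes such a body. The helper name `summarize_transaction` and the sample payload are made up for illustration; this is not how Synapse itself processes transactions.

```python
from collections import Counter


def summarize_transaction(txn):
    """Count PDUs by event type and EDUs by edu_type in a transaction body."""
    return {
        "pdus": Counter(p.get("type", "?") for p in txn.get("pdus", [])),
        "edus": Counter(e.get("edu_type", "?") for e in txn.get("edus", [])),
    }


# Hypothetical transaction carrying only a typing notification.
example_txn = {
    "origin": "example.org",
    "pdus": [],
    "edus": [{"edu_type": "m.typing", "content": {}}],
}
summary = summarize_transaction(example_txn)
print(summary)
```

If received transactions consistently look like this, with EDUs present and PDUs absent or rejected, that would match the symptom of seeing typing notifications without messages.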

@MartijnBraam
Author

It turns out that this isn't a Synapse issue at all. I had two VMs running two versions of Synapse; one of them shouldn't have been running and isn't port-forwarded, and that was the one I was debugging...
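For anyone hitting the same confusion, one way to check which instance actually answers federation traffic is to compare the version reported at the public server name with the version reported by the machine being debugged. The sketch below uses the federation version endpoint (`/_matrix/federation/v1/version`). The helper names, the default port 8448, and the sample payloads are assumptions; a server that delegates federation to another host or port, or that uses a self-signed certificate, would need adjustments.

```python
import json
import urllib.request


def version_url(host, port=8448):
    """Build the federation version URL (port 8448 is an assumed default)."""
    return f"https://{host}:{port}/_matrix/federation/v1/version"


def fetch_version(host, port=8448, timeout=10):
    """Fetch the version payload; fails on certificates that do not verify."""
    with urllib.request.urlopen(version_url(host, port), timeout=timeout) as resp:
        return json.load(resp)


def server_version(payload):
    """Extract (name, version) from a version payload."""
    server = payload.get("server", {})
    return server.get("name"), server.get("version")


def same_instance(public_payload, local_payload):
    """True if both payloads report the same server name and version."""
    return server_version(public_payload) == server_version(local_payload)


# Hypothetical payloads standing in for the two VMs described above.
public = {"server": {"name": "Synapse", "version": "0.27.2"}}
local = {"server": {"name": "Synapse", "version": "0.26.0"}}
print(same_instance(public, local))
```

A mismatch only shows that two different builds are answering. Matching versions do not prove it is the same machine, so checking the port-forwarding target directly is still worthwhile.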
