This repository has been archived by the owner on Nov 25, 2024. It is now read-only.

Dendrite getting off line on and off for the last 2 hours for no apparent reason. #2015

Closed
r3k2 opened this issue Sep 27, 2021 · 8 comments

Comments

@r3k2

r3k2 commented Sep 27, 2021

For the last 2 hours I have been getting a flood of errors in my logs, all of a sudden and for no major reason. None of the users I have spoken to through other channels can connect; the connection goes on and off. The daemon is up, nginx is up, the IP has not changed, and the federation test passes with all green. I do notice something that may be unrelated but looks suspicious: one particular domain, t2bot.io, accounts for 95% of the errors.

or room!JMNJGukdnCivwastWM:tomesh.net" func="processTransaction\n\t" file=" [send.go:280]" error="context canceled" req.id=jFghmtyI1jWC req.method=PUT req.path=/_matrix/federation/v1/send/1629439834243
Sep 27 21:12:03 matrix dendrite-monolith-server[389]: time="2021-09-27T21:12:03.863307403Z" level=error msg="failed to query device keys for some users" func="processServer\n\t" file=" [device_list_update.go:401]" context=missing failed=1 server_name=t2bot.io total=1 wait=2s
Sep 27 21:12:06 matrix dendrite-monolith-server[389]: time="2021-09-27T21:12:06.871382078Z" level=error msg="failed to query device keys for some users" func="processServer\n\t" file=" [device_list_update.go:401]" context=missing failed=1 server_name=t2bot.io total=1 wait=2s
Sep 27 21:12:09 matrix dendrite-monolith-server[389]: time="2021-09-27T21:12:09.874638711Z" level=error msg="failed to query device keys for some users" func="processServer\n\t" file=" [device_list_update.go:401]" context=missing failed=1 server_name=t2bot.io total=1 wait=2s
Sep 27 21:12:12 matrix dendrite-monolith-server[389]: time="2021-09-27T21:12:12.884090429Z" level=error msg="failed to query device keys for some users" func="processServer\n\t" file=" [device_list_update.go:401]" context=missing failed=1 server_name=t2bot.io total=1 wait=2s
Sep 27 21:12:15 matrix dendrite-monolith-server[389]: time="2021-09-27T21:12:15.884973636Z" level=error msg="failed to query device keys for some users" func="processServer\n\t" file=" [device_list_update.go:401]" context=missing failed=1 server_name=t2bot.io total=1 wait=2s
Sep 27 21:12:15 matrix dendrite-monolith-server[389]: time="2021-09-27T21:12:15.922186909Z" level=warning msg="Transaction: Failed to query room version for room!joxsyRkUcrElcVOMHt:matrix.org" func="processTransaction\n\t" file=" [send.go:280]" error="context canceled" req.id=3CnMdoIb39lv req.method=PUT req.path=/_matrix/federation/v1/send/1632757076333
Sep 27 21:12:15 matrix dendrite-monolith-server[389]: time="2021-09-27T21:12:15.922512481Z" level=warning msg="Transaction: Failed to query room version for room!joxsyRkUcrElcVOMHt:matrix.org" func="processTransaction\n\t" file=" [send.go:280]" error="context canceled" req.id=3CnMdoIb39lv req.method=PUT req.path=/_matrix/federation/v1/send/1632757076333
Sep 27 21:12:18 matrix dendrite-monolith-server[389]: time="2021-09-27T21:12:18.884131920Z" level=error msg="failed to query device keys for some users" func="processServer\n\t" file=" [device_list_update.go:401]" context=missing failed=1 server_name=t2bot.io total=1 wait=2s
Sep 27 21:12:21 matrix dendrite-monolith-server[389]: time="2021-09-27T21:12:21.895686808Z" level=error msg="failed to query device keys for some users" func="processServer\n\t" file=" [device_list_update.go:401]" context=missing failed=1 server_name=t2bot.io total=1 wait=2s
Sep 27 21:12:24 matrix dendrite-monolith-server[389]: time="2021-09-27T21:12:24.898649940Z" level=error msg="failed to query device keys for some users" func="processServer\n\t" file=" [device_list_update.go:401]" context=missing failed=1 server_name=t2bot.io total=1 wait=2s
Sep 27 21:12:27 matrix dendrite-monolith-server[389]: time="2021-09-27T21:12:27.896609323Z" level=error msg="failed to query device keys for some users" func="processServer\n\t" file=" [device_list_update.go:401]" context=missing failed=1 server_name=t2bot.io total=1 wait=2s
Sep 27 21:12:30 matrix dendrite-monolith-server[389]: time="2021-09-27T21:12:30.898653007Z" level=error msg="failed to query device keys for some users" func="processServer\n\t" file=" [device_list_update.go:401]" context=missing failed=1 server_name=t2bot.io total=1 wait=2s
Sep 27 21:12:33 matrix dendrite-monolith-server[389]: time="2021-09-27T21:12:33.900644934Z" level=error msg="failed to query device keys for some users" func="processServer\n\t" file=" [device_list_update.go:401]" context=missing failed=1 server_name=t2bot.io total=1 wait=2s
Sep 27 21:12:36 matrix dendrite-monolith-server[389]: time="2021-09-27T21:12:36.742228063Z" level=warning msg="Transaction: Failed to query room version for room!icQFxNJphEJgpvupiL:matrix.org" func="processTransaction\n\t" file=" [send.go:280]" error="context canceled" req.id=ShcJtDrSDTaY req.method=PUT req.path=/_matrix/federation/v1/send/1628416781095
Sep 27 21:12:36 matrix dendrite-monolith-server[389]: time="2021-09-27T21:12:36.742547320Z" level=warning msg="Transaction: Failed to query room version for room!icQFxNJphEJgpvupiL:matrix.org" func="processTransaction\n\t" file=" [send.go:280]" error="context canceled" req.id=ShcJtDrSDTaY req.method=PUT req.path=/_matrix/federation/v1/send/1628416781095
Sep 27 21:12:36 matrix dendrite-monolith-server[389]: time="2021-09-27T21:12:36.902803268Z" level=error msg="failed to query device keys for some users" func="processServer\n\t" file=" [device_list_update.go:401]" context=missing failed=1 server_name=t2bot.io total=1 wait=2s
Sep 27 21:12:37 matrix dendrite-monolith-server[389]: time="2021-09-27T21:12:37.231419802Z" level=warning msg="Transaction: Failed to query room version for room!UOiWshZxoMFennjjxm:matrix.org" func="processTransaction\n\t" file=" [send.go:280]" error="context canceled" req.id=xeM0jL5Kam89 req.method=PUT req.path=/_matrix/federation/v1/send/1630467718321
Sep 27 21:12:38 matrix dendrite-monolith-server[389]: time="2021-09-27T21:12:38.189033943Z" level=error msg="QueryPublishedRooms failed" func="publicRooms\n\t" file=" [publicrooms.go:63]" error="context canceled" req.id=9U7HigOflRPG req.method=GET req.path=/_matrix/federation/v1/publicRooms
Sep 27 21:12:38 matrix dendrite-monolith-server[389]: time="2021-09-27T21:12:38.869774081Z" level=warning msg="Failed to process incoming federation event, skipping" func="func1\n\t" file=" [send.go:391]" error="t.rsAPI.QueryServerJoinedToRoom: r.DB.GetLocalServerInRoom: context deadline exceeded" event_id="$asX1Ynabs29N0K3lst98HEr-K4UW6ffvO5o659LhUVM" rejected=false req.id=1wYTMy3elFGh req.method=PUT req.path=/_matrix/federation/v1/send/1631953960450
Sep 27 21:12:39 matrix dendrite-monolith-server[389]: time="2021-09-27T21:12:39.903363338Z" level=error msg="failed to query device keys for some users" func="processServer\n\t" file=" [device_list_update.go:401]" context=missing failed=1 server_name=t2bot.io total=1 wait=2s
Sep 27 21:12:42 matrix dendrite-monolith-server[389]: time="2021-09-27T21:12:42.904567902Z" level=error msg="failed to query device keys for some users" func="processServer\n\t" file=" [device_list_update.go:401]" context=missing failed=1 server_name=t2bot.io total=1 wait=2s
Sep 27 21:12:45 matrix dendrite-monolith-server[389]: time="2021-09-27T21:12:45.902829360Z" level=error msg="failed to query device keys for some users" func="processServer\n\t" file=" [device_list_update.go:401]" context=missing failed=1 server_name=t2bot.io total=1 wait=2s
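
The claim that one domain produces most of the errors can be checked mechanically rather than by eye. Below is a minimal Go sketch, assuming logrus-style key=value lines like the ones pasted above. The helper names parseFields and countErrorsByServer are hypothetical and are not part of Dendrite.

```go
package main

import (
	"bufio"
	"fmt"
	"regexp"
	"sort"
	"strings"
)

// fieldRe matches key=value pairs where the value is either a
// double-quoted string (with backslash escapes) or a run of non-space
// characters. Keys may contain dots, e.g. req.id.
var fieldRe = regexp.MustCompile(`(\w[\w.]*)=("(?:[^"\\]|\\.)*"|\S+)`)

// parseFields extracts the key=value pairs from one log line.
// Surrounding double quotes are stripped from values.
func parseFields(line string) map[string]string {
	fields := make(map[string]string)
	for _, m := range fieldRe.FindAllStringSubmatch(line, -1) {
		fields[m[1]] = strings.Trim(m[2], `"`)
	}
	return fields
}

// countErrorsByServer tallies level=error lines per server_name.
// Error lines without a server_name field are counted under "(none)".
func countErrorsByServer(log string) map[string]int {
	counts := make(map[string]int)
	sc := bufio.NewScanner(strings.NewReader(log))
	for sc.Scan() {
		f := parseFields(sc.Text())
		if f["level"] != "error" {
			continue
		}
		name := f["server_name"]
		if name == "" {
			name = "(none)"
		}
		counts[name]++
	}
	return counts
}

func main() {
	// Three lines shortened from the log excerpt in this report.
	sample := `time="2021-09-27T21:12:36.902803268Z" level=error msg="failed to query device keys for some users" func="processServer\n\t" context=missing failed=1 server_name=t2bot.io total=1 wait=2s
time="2021-09-27T21:12:37.231419802Z" level=warning msg="Transaction: Failed to query room version" error="context canceled" req.method=PUT
time="2021-09-27T21:12:38.189033943Z" level=error msg="QueryPublishedRooms failed" error="context canceled" req.method=GET`

	counts := countErrorsByServer(sample)
	names := make([]string, 0, len(counts))
	for n := range counts {
		names = append(names, n)
	}
	// Most frequent first.
	sort.Slice(names, func(i, j int) bool { return counts[names[i]] > counts[names[j]] })
	for _, n := range names {
		fmt.Printf("%-12s %d\n", n, counts[n])
	}
}
```

To run this over real logs, the sample string could be replaced with the saved output of journalctl; lines at warning level are deliberately ignored so that only level=error entries are compared.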

Any help or confirmation of an issue would be helpful.

Sorry, I usually post this on the Dendrite channel, but I can't connect. O_o

and more:

[root@matrix nginx]# journalctl -f -u dendrite  | grep error
Sep 27 21:16:28 matrix dendrite-monolith-server[389]: time="2021-09-27T21:16:28.097125337Z" level=error msg="failed to query device keys for some users" func="processServer\n\t" file=" [device_list_update.go:401]" context=missing failed=1 server_name=t2bot.io total=1 wait=2s
Sep 27 21:16:31 matrix dendrite-monolith-server[389]: time="2021-09-27T21:16:31.106987157Z" level=error msg="failed to query device keys for some users" func="processServer\n\t" file=" [device_list_update.go:401]" context=missing failed=1 server_name=t2bot.io total=1 wait=2s
Sep 27 21:16:34 matrix dendrite-monolith-server[389]: time="2021-09-27T21:16:34.115231278Z" level=error msg="failed to query device keys for some users" func="processServer\n\t" file=" [device_list_update.go:401]" context=missing failed=1 server_name=t2bot.io total=1 wait=2s
Sep 27 21:16:34 matrix dendrite-monolith-server[389]: time="2021-09-27T21:16:34.597664004Z" level=warning msg="Transaction: Failed to query room version for room!jxlRxnrZCsjpjDubDX:matrix.org" func="processTransaction\n\t" file=" [send.go:280]" error="context canceled" req.id=j7JtwRLO63Kt req.method=PUT req.path=/_matrix/federation/v1/send/1632243634247
Sep 27 21:16:34 matrix dendrite-monolith-server[389]: time="2021-09-27T21:16:34.597987816Z" level=warning msg="Transaction: Failed to query room version for room!jxlRxnrZCsjpjDubDX:matrix.org" func="processTransaction\n\t" file=" [send.go:280]" error="context canceled" req.id=j7JtwRLO63Kt req.method=PUT req.path=/_matrix/federation/v1/send/1632243634247
Sep 27 21:16:37 matrix dendrite-monolith-server[389]: time="2021-09-27T21:16:37.115248783Z" level=error msg="failed to query device keys for some users" func="processServer\n\t" file=" [device_list_update.go:401]" context=missing failed=1 server_name=t2bot.io total=1 wait=2s
Sep 27 21:16:40 matrix dendrite-monolith-server[389]: time="2021-09-27T21:16:40.122940318Z" level=error msg="failed to query device keys for some users" func="processServer\n\t" file=" [device_list_update.go:401]" context=missing failed=1 server_name=t2bot.io total=1 wait=2s
Sep 27 21:16:43 matrix dendrite-monolith-server[389]: time="2021-09-27T21:16:43.120908562Z" level=error msg="failed to query device keys for some users" func="processServer\n\t" file=" [device_list_update.go:401]" context=missing failed=1 server_name=t2bot.io total=1 wait=2s
Sep 27 21:16:44 matrix dendrite-monolith-server[389]: time="2021-09-27T21:16:44.773123181Z" level=warning msg="Transaction: Failed to query room version for room!WlugbpCHnWaiQIhGPl:luki.org" func="processTransaction\n\t" file=" [send.go:280]" error="context canceled" req.id=X3IW2Op9xBhf req.method=PUT req.path=/_matrix/federation/v1/send/1632621035552
Sep 27 21:16:46 matrix dendrite-monolith-server[389]: time="2021-09-27T21:16:46.130374142Z" level=error msg="failed to query device keys for some users" func="processServer\n\t" file=" [device_list_update.go:401]" context=missing failed=1 server_name=t2bot.io total=1 wait=2s
Sep 27 21:16:49 matrix dendrite-monolith-server[389]: time="2021-09-27T21:16:49.128578274Z" level=error msg="failed to query device keys for some users" func="processServer\n\t" file=" [device_list_update.go:401]" context=missing failed=1 server_name=t2bot.io total=1 wait=2s

A user just sent me this from one of his matrix.org accounts; he got it when trying to search for rooms on our server. I am not sure whether it is always like this or whether it is related:

request failed: CORS request rejected: https://matrix-client.matrix.org/_matrix/client/r0/publicRooms?server=hispagatos.org
@r3k2
Author

r3k2 commented Sep 27, 2021

I don't know why, but t2bot.io keeps doing this very fast now. This is not normal at all.

@r3k2
Author

r3k2 commented Sep 27, 2021

I have blocked that domain with iptables in both the INPUT and OUTPUT chains. Now I can log in but can't type anything in any channel, and I now get this all the time:

ot.io/_matrix/federation/v1/user/devices/@voyager:t2bot.io"
time="2021-09-27T21:45:23.614398793Z" level=warning msg="Error sending request to https://synapse.02.fsn1.ht.t2host.io:443/_matrix/federation/v1/user/devices/@voyager:t2bot.io: context deadline exceeded" func="github.com/matrix-org/gomatrixserverlib.(*federationTripper).RoundTrip" file="github.com/matrix-org/gomatrixserverlib@…/client.go:229" context=missing out.req.ID=2zzlpc09CBBc out.req.method=GET out.req.uri="matrix://t2bot.io/_matrix/federation/v1/user/devices/@voyager:t2bot.io"
time="2021-09-27T21:45:23.614540885Z" level=warning msg="Outgoing request failed" func="github.com/matrix-org/gomatrixserverlib.(*Client).DoHTTPRequest" file="github.com/matrix-org/gomatrixserverlib@…/client.go:509" context=missing error="Get \"matrix://t2bot.io/_matrix/federation/v1/user/devices/@voyager:t2bot.io\": context deadline exceeded" out.req.ID=2zzlpc09CBBc out.req.method=GET out.req.uri="matrix://t2bot.io/_matrix/federation/v1/user/devices/@voyager:t2bot.io"
time="2021-09-27T21:45:23.614719402Z" level=error msg="failed to query device keys for some users" func="github.com/matrix-org/dendrite/keyserver/internal.(*DeviceListUpdater).processServer" file="github.com/matrix-org/dendrite/keyserver/internal/device_list_update.go:401" context=missing failed=1 server_name=t2bot.io total=1 wait=15.999996834s

Any idea?

@r3k2
Author

r3k2 commented Sep 27, 2021

OK, it seems to be working now. I am not sure whether that is because of the change I am about to explain, because I rebooted, or a mix of both.
Last week or a bit earlier I noticed there was a config change: #1988.
At the time I thought I did not need it, because I already do this on the website for the main domain, https://hispagatos.org, as I pasted above. But since things were broken, I figured I might as well try it.
So I added the config change with the same info I always had before:
well_known_server_name: "matrix.hispagatos.org:443"
I rebooted and now it is working.
Why?
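
For context on what that setting relates to: in the Matrix server-server specification, a GET request to /.well-known/matrix/server on the main domain returns a JSON object whose m.server field names the host and port that actually handle federation. The well_known_server_name option appears to let Dendrite serve that document itself, though that reading of #1988 is an assumption here. The Go sketch below only shows how such a response body can be parsed; the type wellKnownServer and the function parseWellKnown are hypothetical names, and the sample body is built from the value quoted above rather than fetched from the live server.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// wellKnownServer mirrors the JSON body of /.well-known/matrix/server
// as described in the Matrix server-server specification.
type wellKnownServer struct {
	Server string `json:"m.server"`
}

// parseWellKnown returns the delegated "host[:port]" from a well-known
// response body, or an error if the body is malformed or incomplete.
func parseWellKnown(body []byte) (string, error) {
	var wk wellKnownServer
	if err := json.Unmarshal(body, &wk); err != nil {
		return "", fmt.Errorf("invalid well-known JSON: %w", err)
	}
	if wk.Server == "" {
		return "", errors.New("well-known response has no m.server field")
	}
	return wk.Server, nil
}

func main() {
	// Sample body (assumed, not fetched) using the value from this report.
	body := []byte(`{"m.server": "matrix.hispagatos.org:443"}`)
	target, err := parseWellKnown(body)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("federation traffic is delegated to", target)
}
```

Comparing the m.server value served by the website with the one in the Dendrite config is a quick way to rule out a mismatch between the two sources of delegation.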

@r3k2
Author

r3k2 commented Oct 1, 2021

This is happening again. It seems to have been going on since the upgrade to room version 9; before that it was working perfectly for a long time.

m/matrix-org/dendrite/federationapi/routing.(*txnReq).processTransaction" file="github.com/matrix-org/dendrite/federationapi/routing/send.go:280" error="context canceled" req.id=Yn9KjBjyMbBx req.method=PUT req.path=/_matrix/federation/v1/send/1630884595329
time="2021-10-01T16:52:33.775314290Z" level=warning msg="Failed to process incoming federation event, skipping" func="github.com/matrix-org/dendrite/federationapi/routing.(*inputWorker).run.func1" file="github.com/matrix-org/dendrite/federationapi/routing/send.go:391" error="t.rsAPI.QueryServerJoinedToRoom: r.DB.GetLocalServerInRoom: context deadline exceeded" event_id="$11LqTnuQEP51pGaL92MRPDxHOHe_y1FZc7ZKz_bxGlI" rejected=false req.id=JnJB251vHQ1B req.method=PUT req.path=/_matrix/federation/v1/send/1632681449119
time="2021-10-01T16:53:05.664052018Z" level=warning msg="Transaction: Failed to query room version for room!jxlRxnrZCsjpjDubDX:matrix.org" func="github.com/matrix-org/dendrite/federationapi/routing.(*txnReq).processTransaction" file="github.com/matrix-org/dendrite/federationapi/routing/send.go:280" error="context canceled" req.id=yT7I2K75z4wk req.method=PUT req.path=/_matrix/federation/v1/send/1632520068900
time="2021-10-01T16:53:05.664446085Z" level=warning msg="Transaction: Failed to query room version for room!SEgsRQLScqPxYtucHl:archlinux.org" func="github.com/matrix-org/dendrite/federationapi/routing.(*txnReq).processTransaction" file="github.com/matrix-org/dendrite/federationapi/routing/send.go:280" error="context canceled" req.id=2xkzCk7XPQw5 req.method=PUT req.path=/_matrix/federation/v1/send/1625242611063
time="2021-10-01T16:53:05.666450566Z" level=error msg="eventutil.BuildEvent failed" func=github.com/matrix-org/dendrite/clientapi/routing.generateSendEvent file="github.com/matrix-org/dendrite/clientapi/routing/sendevent.go:214" error="context canceled" req.id=2WXikKeYDxTU req.method=PUT req.path="/_matrix/client/r0/rooms/!0bHdZkTyCe4eNfIH:hispagatos.org/send/m.room.encrypted/m1633106987629.3" user_id="@krispis:hispagatos.org"

@r3k2
Author

r3k2 commented Oct 2, 2021

This happened again this morning for 4 hours, then it automatically restored itself.

@kegsay
Member

kegsay commented Oct 5, 2021

The problem here is "context deadline exceeded", which occurs when we give up trying to process something because it took too long. Many different contexts are used throughout Dendrite, so it is tricky to unpick which context has expired and why.

I need to know your server configuration information; please include it in your bug reports.

@r3k2
Author

r3k2 commented Oct 5, 2021

dendrite.yaml.txt
@kegsay ok here it is, I removed passwords etc.

@neilalexander
Contributor

Please reopen if there is still an issue in 0.8.1.
