This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

A user joined a v3 room and a lot of users were left. #4980

Closed
Half-Shot opened this issue Apr 1, 2019 · 5 comments
Labels
z-bug (Deprecated Label)

Comments

Collaborator

Half-Shot commented Apr 1, 2019

This is an ongoing bug report: there are a few moving parts and I am not sure of all the details yet, but I will try to explain as best I can.

#watercooler:half-shot.uk (!gdRMqOrTFdOCYHNwOo:half-shot.uk) is a v3 room that has been upgraded several times before.

  • @Magnap:magnap.dk's homeserver is offline.
  • #watercooler:half-shot.uk exists and people are joined.
  • magnap.dk goes back online after a year or so's hiatus and tries to backfill from all the places at once. It struggles and 5XXs most requests to it as a result.
  • @Magnap:magnap.dk tries to join #watercooler:half-shot.uk with event $gzD61zfxtq5zoyrFLiY2ILt0SzUEwGJu90c229pKHww and a lot of servers reject the event. My server (half-shot.uk) did not.
  • @Magnap:magnap.dk successfully joins the room from my and his pov. However, some users report that they couldn't see his member event (due to rejections) in the timeline and his messages were appearing without any displayname/avatar.
  • Because the servers rejected the event, they failed to look the event up for other operations on the server and presumably got state reset out of the room (Speculation)
  • Users from these servers started reporting that #watercooler:half-shot.uk showed up in their room list, however they were not seeing any new traffic. This might be a combination of sync caching, but the users were not in the room.
  • Rejoining the room seemed to fix it for these users, and they report that the missing messages appeared in their clients.

Magnap commented Apr 1, 2019

Some additional detail from my POV: for many HSs I can't see their events until someone from the set of HSs that do recognize my join sends an event that links back to it (transitively, I believe, but I can't say with 100% certainty that I'm not missing any messages). Presumably I'm not being sent those events because those servers have rejected my join; however, this does not square with the fact that it also happens with messages from @cadair:cadair.com, even though he has deleted my join event from his rejections table. I also can't search in #watercooler:half-shot.uk (current room ID !gdRMqOrTFdOCYHNwOo:half-shot.uk); Riot Web shows the error "Unknown room !ruaviCwHdJSWfKcBam:half-shot.uk", which is the old room ID. I'll be happy to upload my logs if requested, and I can run queries on my (PostgreSQL) DB if information to help diagnose this issue can be gleaned that way.
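
One possible reading of the "transitively links back" observation, as a minimal sketch: when a server receives an event it accepts, it can walk that event's `prev_events` links to discover earlier events it was never sent. The toy model below is not Synapse's code; the function name `missing_ancestors` and the event IDs are hypothetical and only illustrate the graph walk.

```python
def missing_ancestors(new_event, prev_events, known):
    """Return IDs of events reachable from new_event via prev_events
    that the local server does not already know about."""
    missing = []
    seen = set()
    stack = list(prev_events.get(new_event, []))
    while stack:
        event_id = stack.pop()
        # Stop walking at events we already hold, and skip repeats.
        if event_id in known or event_id in seen:
            continue
        seen.add(event_id)
        missing.append(event_id)
        stack.extend(prev_events.get(event_id, []))
    return sorted(missing)


# Hypothetical room graph: "$a" and "$b" came from servers that never
# sent them to us; "$c" comes from a server that does send to us.
prev_events = {
    "$join": [],
    "$a": ["$join"],
    "$b": ["$a"],
    "$c": ["$b"],
}
known = {"$join"}

print(missing_ancestors("$c", prev_events, known))
```

In this model the two unseen events only become discoverable once `$c` arrives, which would match messages showing up late and in batches.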

Magnap commented Apr 1, 2019

I should note that the join event mentioned ($gzD61zfxtq5zoyrFLiY2ILt0SzUEwGJu90c229pKHww) was actually a second join event, sent because, according to my client (Riot Web), I was not present in the room. It's probably worth looking into whether that was just one of the effects of this (single) bug, or whether it was due to some other bug and my response to it (re-joining) then triggered this state-messy behavior.

Member

richvdh commented Aug 15, 2019

We seem to be failing to find time to investigate this :/

Member

richvdh commented Aug 15, 2019

This certainly doesn't sound like a conventional "state reset" (which affects all servers equally); rather it's a different problem which we have no reason to believe would be affected by the change in state resolution algorithm.

> However, some users report that they couldn't see his member event (due to rejections) in the timeline and his messages were appearing without any displayname/avatar.

Do we know why the member event was rejected? Any logs from any affected servers?

> Users from these servers started reporting that #watercooler:half-shot.uk showed up in their room list, however they were not seeing any new traffic. This might be a combination of sync caching, but the users were not in the room.

"Not in the room" according to what metric?

Member

richvdh commented Dec 18, 2019

This just doesn't feel like a thing that we're ever going to investigate.

richvdh closed this as completed Dec 18, 2019