Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

state resolution failing with KeyError: '$S38_aRG4Qz8eIdrTyzJuQdH3Ieus1Y1dhczoFKnOB7M' #6605

Closed
anoadragon453 opened this issue Dec 30, 2019 · 8 comments

Comments

@anoadragon453
Copy link
Member

Description

My client is getting continuous 500's from my synchrotrons due to a KeyError from _get_event, and thus I'm unable to sync. Restarting does not help:

2019-12-30 17:33:07,540 - synapse.http.server - 109 - ERROR - GET-18029- Failed handle request via 'SyncRestServlet': <SynapseRequest at 0x7f272183c5f8 method='GET' uri='/_matrix/client/r0/sync?filter=7&timeout=...
Traceback (most recent call last):
  File "/home/synapse/env/lib/python3.6/site-packages/twisted/internet/defer.py", line 1418, in _inlineCallbacks
    result = g.send(result)
StopIteration: {}

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/synapse/env/lib/python3.6/site-packages/twisted/internet/defer.py", line 1418, in _inlineCallbacks
    result = g.send(result)
StopIteration: {}

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/synapse/env/lib/python3.6/site-packages/twisted/internet/defer.py", line 1418, in _inlineCallbacks
    result = g.send(result)
StopIteration: {}

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/synapse/env/lib/python3.6/site-packages/twisted/internet/defer.py", line 1418, in _inlineCallbacks
    result = g.send(result)
StopIteration: []

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/synapse/env/lib/python3.6/site-packages/twisted/internet/defer.py", line 1418, in _inlineCallbacks
    result = g.send(result)
StopIteration: {}

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/synapse/env/lib/python3.6/site-packages/synapse/http/server.py", line 77, in wrapped_request_handler
    await h(self, request)
  File "/home/synapse/env/lib/python3.6/site-packages/synapse/http/server.py", line 326, in _async_render
    callback_return = await callback_return
  File "/home/synapse/env/lib/python3.6/site-packages/synapse/rest/client/v2_alpha/sync.py", line 178, in on_GET
    full_state=full_state,
  File "/home/synapse/env/lib/python3.6/site-packages/synapse/handlers/sync.py", line 263, in wait_for_sync_for_user
    full_state,
  File "/home/synapse/env/lib/python3.6/site-packages/twisted/internet/defer.py", line 1416, in _inlineCallbacks
    result = result.throwExceptionIntoGenerator(g)
  File "/home/synapse/env/lib/python3.6/site-packages/twisted/python/failure.py", line 512, in throwExceptionIntoGenerator
    return g.throw(self.type, self.value, self.tb)
  File "/home/synapse/env/lib/python3.6/site-packages/synapse/handlers/sync.py", line 285, in _wait_for_sync_for_user
    sync_config, since_token, full_state=full_state
  File "/home/synapse/env/lib/python3.6/site-packages/synapse/handlers/sync.py", line 956, in generate_sync_result
    sync_result_builder, account_data_by_room
  File "/home/synapse/env/lib/python3.6/site-packages/synapse/handlers/sync.py", line 1367, in _generate_sync_entry_for_rooms
    await concurrently_execute(handle_room_entries, room_entries, 10)
  File "/home/synapse/env/lib/python3.6/site-packages/twisted/internet/defer.py", line 1416, in _inlineCallbacks
    result = result.throwExceptionIntoGenerator(g)
  File "/home/synapse/env/lib/python3.6/site-packages/twisted/python/failure.py", line 512, in throwExceptionIntoGenerator
    return g.throw(self.type, self.value, self.tb)
  File "/home/synapse/env/lib/python3.6/site-packages/synapse/util/async_helpers.py", line 155, in _concurrently_execute_inner
    await maybe_awaitable(func(next(it)))
  File "/home/synapse/env/lib/python3.6/site-packages/synapse/handlers/sync.py", line 1762, in _generate_room_entry
    newly_joined_room=newly_joined,
  File "/home/synapse/env/lib/python3.6/site-packages/synapse/handlers/sync.py", line 413, in _load_filtered_recents
    current_state_ids = await self.state.get_current_state_ids(room_id)
  File "/home/synapse/env/lib/python3.6/site-packages/twisted/internet/defer.py", line 1416, in _inlineCallbacks
    result = result.throwExceptionIntoGenerator(g)
  File "/home/synapse/env/lib/python3.6/site-packages/twisted/python/failure.py", line 512, in throwExceptionIntoGenerator
    return g.throw(self.type, self.value, self.tb)
  File "/home/synapse/env/lib/python3.6/site-packages/synapse/state/__init__.py", line 171, in get_current_state_ids
    ret = yield self.resolve_state_groups_for_events(room_id, latest_event_ids)
  File "/home/synapse/env/lib/python3.6/site-packages/twisted/internet/defer.py", line 1416, in _inlineCallbacks
    result = result.throwExceptionIntoGenerator(g)
  File "/home/synapse/env/lib/python3.6/site-packages/twisted/python/failure.py", line 512, in throwExceptionIntoGenerator
    return g.throw(self.type, self.value, self.tb)
  File "/home/synapse/env/lib/python3.6/site-packages/synapse/util/metrics.py", line 82, in measured_func
    r = yield func(self, *args, **kwargs)
  File "/home/synapse/env/lib/python3.6/site-packages/twisted/internet/defer.py", line 1416, in _inlineCallbacks
    result = result.throwExceptionIntoGenerator(g)
  File "/home/synapse/env/lib/python3.6/site-packages/twisted/python/failure.py", line 512, in throwExceptionIntoGenerator
    return g.throw(self.type, self.value, self.tb)
  File "/home/synapse/env/lib/python3.6/site-packages/synapse/state/__init__.py", line 402, in resolve_state_groups_for_events
    state_res_store=StateResolutionStore(self.store),
  File "/home/synapse/env/lib/python3.6/site-packages/twisted/internet/defer.py", line 1416, in _inlineCallbacks
    result = result.throwExceptionIntoGenerator(g)
  File "/home/synapse/env/lib/python3.6/site-packages/twisted/python/failure.py", line 512, in throwExceptionIntoGenerator
    return g.throw(self.type, self.value, self.tb)
  File "/home/synapse/env/lib/python3.6/site-packages/synapse/state/__init__.py", line 525, in resolve_state_groups
    state_res_store=state_res_store,
  File "/home/synapse/env/lib/python3.6/site-packages/twisted/internet/defer.py", line 1416, in _inlineCallbacks
    result = result.throwExceptionIntoGenerator(g)
  File "/home/synapse/env/lib/python3.6/site-packages/twisted/python/failure.py", line 512, in throwExceptionIntoGenerator
    return g.throw(self.type, self.value, self.tb)
  File "/home/synapse/env/lib/python3.6/site-packages/synapse/state/v2.py", line 116, in resolve_events_with_store
    room_id, power_events, event_map, state_res_store, full_conflicted_set
  File "/home/synapse/env/lib/python3.6/site-packages/twisted/internet/defer.py", line 1416, in _inlineCallbacks
    result = result.throwExceptionIntoGenerator(g)
  File "/home/synapse/env/lib/python3.6/site-packages/twisted/python/failure.py", line 512, in throwExceptionIntoGenerator
    return g.throw(self.type, self.value, self.tb)
  File "/home/synapse/env/lib/python3.6/site-packages/synapse/state/v2.py", line 364, in _reverse_topological_power_sort
    room_id, event_id, event_map, state_res_store
  File "/home/synapse/env/lib/python3.6/site-packages/twisted/internet/defer.py", line 1416, in _inlineCallbacks
    result = result.throwExceptionIntoGenerator(g)
  File "/home/synapse/env/lib/python3.6/site-packages/twisted/python/failure.py", line 512, in throwExceptionIntoGenerator
    return g.throw(self.type, self.value, self.tb)
  File "/home/synapse/env/lib/python3.6/site-packages/synapse/state/v2.py", line 186, in _get_power_level_for_sender
    aev = yield _get_event(room_id, aid, event_map, state_res_store)
  File "/home/synapse/env/lib/python3.6/site-packages/twisted/internet/defer.py", line 1418, in _inlineCallbacks
    result = g.send(result)
  File "/home/synapse/env/lib/python3.6/site-packages/synapse/state/v2.py", line 535, in _get_event
    event = event_map[event_id]
KeyError: '$S38_aRG4Qz8eIdrTyzJuQdH3Ieus1Y1dhczoFKnOB7M'
2019-12-30 17:33:07,544 - synapse.access.http.9183 - 302 - INFO - GET-18029- 127.0.0.1 - 9183 - {@andrewm:amorgan.xyz} Processed request: 16.942sec/0.001sec (6.428sec, 0.142sec) (9.334sec/43.300sec/259) 55B 500

The event ID is $S38_aRG4Qz8eIdrTyzJuQdH3Ieus1Y1dhczoFKnOB7M, in room !BAXLHOFjvDKUeLafmO:matrix.org (IRC Matrix bridges). I don't see any event with that ID in my events table, nor can I see it in the content of any event in my db.

Version information

Synapse v1.7.2

@turt2live
Copy link
Member

Likewise for t2bot.io

@heftig
Copy link

heftig commented Dec 30, 2019

Riot can no longer sync with my server; the event ID is the same. I'm not sure the room belongs to it, but it does appear in the logs preceding each error:

synapse.state: [GET-1676] Resolving state for !BAXLHOFjvDKUeLafmO:matrix.org with 2 groups
synapse.state: [GET-1676] Resolving conflicted state for '!BAXLHOFjvDKUeLafmO:matrix.org'

@turt2live
Copy link
Member

@heftig does your server use workers at all or is it one giant synapse?

@heftig
Copy link

heftig commented Dec 30, 2019

No workers here.

@richvdh
Copy link
Member

richvdh commented Dec 30, 2019

for those encountering this issue: b3d344c is a patch to work around it; I'm continuing to investigate what has happened.

@richvdh
Copy link
Member

richvdh commented Dec 30, 2019

conclusions so far:

$vxKZZEW3pJmuKI4V320ZzWOr0d2O9MqJ+/WMPsUIEqM was an event sent (to the IRC bridges room !BAXLHOFjvDKUeLafmO:matrix.org) back in August; it (along with a few other events around that time) referenced a non-existent event in its auth_events. It therefore should have been rejected (as indeed it seems to have been on many servers); however, matrix.org appears to have incorrectly accepted it. The event in question was a join event.

Today, the IRC bridge kicked all users deemed to have been idle for 30 days or more; this included the user who joined in $vxK.... The kick event ($iFQmiui69oTYNtGPMQilJEfe5U6HYWQooMFicYRhXBI) referenced the broken event in its auth events.

That means that $vxK... is now getting involved in the state-resolution algorithm, which apparently explodes when it fails to find the auth event referenced by $vxK.

@richvdh
Copy link
Member

richvdh commented Dec 30, 2019

another odd thing is that $iFQmiui69oTYNtGPMQilJEfe5U6HYWQooMFicYRhXBI comes up as a forward extremity, even though its successor ($8l7/U97QjDXLjc+Ym9qzyBaO0tFF9KeYUPgH5x/vdTA) has been correctly received and persisted.

This appears to have happened because $8I7... was initially persisted as an outlier: see #6607.

@richvdh richvdh changed the title Synchrotron continuously 500's, getting stuck on serving a missing event down sync state resolution failing with KeyError: '$S38_aRG4Qz8eIdrTyzJuQdH3Ieus1Y1dhczoFKnOB7M' Dec 30, 2019
@richvdh
Copy link
Member

richvdh commented Dec 31, 2019

the fact that some servers accepted $vxKZZEW3pJmuKI4V320ZzWOr0d2O9MqJ+/WMPsUIEqM whereas others rejected it appears to be because different code paths use different values of auth_events when performing event auth: some use a locally-calculated set of events, whereas others use the auth events provided by the event itself.

The spec says:

The receiving server must ensure that the event:
...

  • Passes authorization rules based on the event's auth events, otherwise it is rejected.
  • Passes authorization rules based on the state at the event, otherwise it is rejected.

I'm far from convinced that both of these are happening on all code paths.

babolivier pushed a commit that referenced this issue Sep 1, 2021
* commit '92eac974b':
  Hacks to work around #6605 (#6608)
  sample log config
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants