-
-
Notifications
You must be signed in to change notification settings - Fork 2.1k
Add a new third party callback check_event_allowed_v2
that is compatible with new batch persisting mechanisms.
#15131
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I owe you a proper review but I think these questions are probably blockers for doing a fair review.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems sound.
I bring a few comments (mostly just pedantry) and ideas, but nothing wrong other than I don't see anywhere we let modules register these (...which could be me being bad at reading!).
I'd be tempted to support persisting multiple-events because I can see it being useful (e.g. in the frozen room use case: we might post an informational message about the room being frozen alongside freezing it. In other cases, we might want to update multiple state keys in one go) and it'd seem a shame to have to think about a v3 instead of introducing a list here and now (it doesn't seem too hard to do?).
@@ -234,15 +241,16 @@ async def check_event_allowed( | |||
self, | |||
event: EventBase, | |||
context: UnpersistedEventContextBase, | |||
) -> Tuple[bool, Optional[dict]]: | |||
) -> Tuple[bool, Optional[dict], Optional[dict]]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bikeshed risk: should we allow the creation of multiple events so we don't wind up having to introduce a v3 for that later? :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't tell if I am being a curmudgeon here but my instinct would be to say no, as honestly in a perfect world one would not be able to inject events here at all - but since it's already been done this is a way to gracefully work around it. I feel like adding multiple events creates needless complexity, but again, I might just be being a curmudgeon.
else: | ||
for v2_callback in self._check_event_allowed_v2_callbacks: | ||
try: | ||
res, replacement_data, new_event = await delay_cancellation( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of introducing a v2, I wonder if we could just widen the return type to accept either doubles or triples, with doubles being interpreted the same as before and triples being interpreted the same as what you're doing for v2.
This could be done with a somewhat nasty Union
for the return type.
I'm not sure how that sounds to you?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think since we don't want to mix v1 and v2 callbacks (ie we want to eventually deprecate the v1 callbacks) it makes sense to not do this. I also have a distaste for Union
return types, it always seems unwieldy to deal with them if you can avoid it.
@@ -32,6 +32,10 @@ | |||
CHECK_EVENT_ALLOWED_CALLBACK = Callable[ | |||
[EventBase, StateMap[EventBase]], Awaitable[Tuple[bool, Optional[dict]]] | |||
] | |||
CHECK_EVENT_ALLOWED_V2_CALLBACK = Callable[ | |||
[EventBase, StateMap[EventBase]], | |||
Awaitable[Tuple[bool, Optional[dict], Optional[dict]]], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is merely a suggestion, but I'd be tempted to replace this tuple with an attrs
or dataclass
class. That buys you:
- named fields (helps disambiguate the two
Optional[dict]
s; also means we can document the individual pieces more easily IMO) - an easier time for extending the return type in the future (we can add new fields with default values)
At this stage it's not really critical, but if this tuple grew any larger I'd start to think it was getting a bit unwieldy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm on the fence about this, here's my thinking: I agree that a class might be cleaner (although only slightly, my hope is that the tuple does not grow any larger!) but I wonder about the utility for module-writers. It seems easier to say "just give us a dict" rather than describing a class and expecting them to match their data to the class, but I may be wrong here. I'm open to being convinced otherwise.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
under this hypothetical situation, we would provide the dataclass and they would just instantiate it, e.g. they'd return something like
return EventAllowedResult(
allowed=True,
replace={"x": "y"},
)
return res, None, None | ||
elif isinstance(replacement_data, dict): | ||
return True, replacement_data, None | ||
else: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm curious as to why we don't support a mix of v1 and v2 callbacks. Pedantically this could make it hard to upgrade as you'd need to upgrade all relevant modules at once, rather than doing it bit by bit. Realistically, this may not be an issue as I don't know if anyone runs with more than one such module anyway?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The main reason is that we'd like to phase out the v1 callbacks - they are not compatible with the batching mechanism, which is why the v2 callback which is compatible with the batching mechanism was proposed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought it was just the sending of additional events that was not compatible when done in v1 callbacks and with batching?
I think it would be OK to support v1+v2 as a transitional period with the caveat that v1 can't send additional events.
I think this would be good practice but I don't know if it's truly warranted here given the relatively low use of modules, so I'm tempted to actually just accept use of either v1 or v2 if that prevents having to think harder about this.
@@ -155,6 +159,9 @@ def __init__(self, hs: "HomeServer"): | |||
self._storage_controllers = hs.get_storage_controllers() | |||
|
|||
self._check_event_allowed_callbacks: List[CHECK_EVENT_ALLOWED_CALLBACK] = [] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(just as a note if it interests you, particularly relevant when combined with 'why not both v1 and v2?'):
In the past, for callback upgrades like these, we've sometimes made an adapter for the old callbacks and registered the adapter as a v2 callback (and then removed the list of v1 callbacks, since they're all wrapped).
This has sometimes proved easier than trying to call each callback in the way it was originally defined.
To illustrate (but I'd put proper variable names and type annotations on, just a tad fiddly to do from within GitHub without having it all fresh in mind):
def check_event_allowed_v1_v2_adapter(v1: CHECK_EVENT_ALLOWED_CALLBACK) -> CHECK_EVENT_ALLOWED_V2_CALLBACK:
async def adapter(x, y):
a, b = await v1(x, y)
return a, b, None
return adapter
# FIXME: Being able to throw SynapseErrors is relied upon by | ||
# some modules. PR #10386 accidentally broke this ability. | ||
# That said, we aren't keen on exposing this implementation detail | ||
# to modules and we should one day have a proper way to do what | ||
# is wanted. | ||
# This module callback needs a rework so that hacks such as | ||
# this one are not necessary. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bikeshed risk: is this a good time to address this? If it looks like more than a tiny amount of work I'm happy to leave it, but if the only reason we haven't done this was 'it's not pressing enough to introduce a breaking change' then now might be a good opportunity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fascinatingly you were the one to add this todo: https://github.com/matrix-org/synapse/pull/11042/files
Do you remember what the discussion was at time/why it wasn't fixed then? I don't see an issue for the regression so I am a little unclear on what the original problem was.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the vague problem is that some modules would reject events by return False
, others would raise SynapseError(403, "This event is not allowed because blahblahblah")
— in effect, they could give it a custom error code and error message.
We don't like the latter — it's not explicitly supported but makes use of the fact that SynapseError
s bubble up automatically and get turned into Matrix/API-level errors (in the servlet? I think).
If we care about this mechanism then we should spec support for it without relying on exceptions, it feels like.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm well it seems to me given the context that fixing this might be outside the purview of this particular PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
well OK, it's just that this needs a new version of the callback so it would have been possibly better to do the changes together for the same version
with the event being checked, if this is the case the module writer must provide a dict that | ||
will form the basis of the event that is to be added to the room and it must be returned by `check_event_allowed_v2`. | ||
This dict will then be turned into an event at the appropriate time and it will be persisted after the event | ||
that triggered it, and if the event that triggered it is in a batch of events for persisting, it will be added to the | ||
end of that batch. | ||
end of that batch. Note that the event MAY NOT be a membership event. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
end of that batch. Note that the event MAY NOT be a membership event. | |
end of that batch. Note that the event MUST NOT be a membership event. |
if we're going for RFC conventions (the capitals make me feel like it), MUST NOT is the term to use
( | ||
event, | ||
unpersisted_context, | ||
_, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so (if I can read correctly) we don't let the callbacks on third-party events themselves generate more events? This could be worth putting in a comment and in the docs for the callback?
@H-Shay Is this still useful? Should we revive it? |
It's not totally useless as it unlocks a further (minimal-moderate) increase in the speed of creating rooms - but I was going to close it as I don't think the benefits are worth the time to rehab it. |
This is a suggestion for how to rework the third party rules callback
check_event_allowed
so that it is compatible with the newly added batch persisting mechanism, added in #13487 and #13800. We currently only batch up events during room creation, but there is a desire to integrate this functionality elsewhere as it has the potential to improve performance (by reducing round trips to the DB).However,
check_event_allowed
as it is currently implemented allows for a module writer to use this callback to send an event into a room, triggered by the event being checked bycheck_event_allowed
. This causes a problem when the event triggering the secondary event is part of a batch, as it forks the DAG.To alleviate this, I propose adding a second version of
check_event_allowed
,check_event_allowed_v2
which explicitly allows for a module writer to optionally provide an event dict that will form the basis of the new additional event they would like to send into the room. The event dict is then either turned into an event immediately after the original event, if the original event is not part of a batch, or if the original event is part of a batch the event dict is turned into an event after each of the batched events have already been created and added to the end of the batch for persisting.In this scheme only one of
check_event_allowed
orcheck_event_allowed_v2
could be active at a time, and if both are providedcheck_event_allowed
will be executed rather thancheck_event_allowed_v2
. The idea would be to eventually deprecatecheck_event_allowed
in favor ofcheck_event_allowed_v2
.Best reviewable commit-by-commit.