-
-
Notifications
You must be signed in to change notification settings - Fork 2.1k
Add transactional API to history purge #2962
Conversation
da42238
to
643b9eb
Compare
retest this please |
remaining test failure is because the test was broken (now fixed) and PR builder doesn't rerun the commit tests |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Broadly LGTM, I've commented on a lot of nits, most of which are just opinions so shrug. Could do with a couple of docstrings and s/error/failed/
in the docs though.
synapse/handlers/message.py
Outdated
self._purges_in_progress_by_room.add(room_id) | ||
try: | ||
with (yield self.pagination_lock.write(room_id)): | ||
yield self.store.purge_history( | ||
room_id, topological_ordering, delete_local_events, | ||
) | ||
logger.info("[purge] complete") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we should log the purge_id too for all these log lines. Maybe set the request_id for the context or something?>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah could do. Or we could make sure we log the purge id against the existing request_id so that you could tie the log lines together through that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup, happy either way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, changed to log the purge_id
synapse/handlers/message.py
Outdated
logger.info("[purge] complete") | ||
self._purges_by_id[purge_id].status = "complete" | ||
except Exception: | ||
logger.error("[purge] failed: %s", Failure().getTraceback().rstrip()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we not logging as an exception?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what do you mean, and why would we do so?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As in, why aren't we using logger.exception(...)
rather than manually building a stack trace?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah right. Because Failure().getTraceback()
will give us the proper stacktrace, whereas sys.exc_info
(and hence logger.exception
) get confused by the deferreds and only go one frame down.
synapse/handlers/message.py
Outdated
finally: | ||
self._purges_in_progress_by_room.discard(room_id) | ||
|
||
# remove the purge from the list an hour after it completes | ||
def clear_purge(): | ||
del self._purges_by_id[purge_id] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For paranoia's sake I tend to prefer d.pop(key, None)
, as that won't throw if the key doesn't exist, but shrug.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd argue that knowing the key has gone missing would be a good thing...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fair
synapse/handlers/message.py
Outdated
self.status = "active" | ||
|
||
def asdict(self): | ||
return self.__dict__ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not entirely convinced this is better than just using a regular dict. Having a dedicated PurgeStatus
class is useful for documenting and enforcing the return values, but I don't think that this particular construct really does either of those.
I'd prefer either a subclass of namedtuple (it does helpfully provide an _asdict already), or have asdict
be explicit about the properties it returns, i.e. {"status": self.status}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, a short docstring would be good.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried a namedtuple to start with, but I wanted the individual fields to be mutable so that didn't work.
So yes, I can make it a more explicit dict.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried a namedtuple to start with, but I wanted the individual fields to be mutable so that didn't work.
Ah, of course.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
have made asdict
more explicit, and turned status
into a more enummy thing while I'm at it
synapse/handlers/message.py
Outdated
def purge_history(self, room_id, topological_ordering, | ||
delete_local_events=False): | ||
def start_purge_history(self, room_id, topological_ordering, | ||
delete_local_events=False): | ||
if room_id in self._purges_in_progress_by_room: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Docstring?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
synapse/handlers/message.py
Outdated
# remove the purge from the list an hour after it completes | ||
def clear_purge(): | ||
del self._purges_by_id[purge_id] | ||
reactor.callLater(3600, clear_purge) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given this can take hours, I'd probably have it much higher, like 24hours or something
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the logic went that if you were polling, you'd get what you needed within an hour either way. Can change it to 24h if you like though...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was just thinking in terms of people manually curling. Given how small PurgeStatus
is I think we may as well keep them for longer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
|
||
purge_status = self.handlers.message_handler.get_purge_status(purge_id) | ||
if purge_status is None: | ||
raise NotFoundError("purge id '%s' not found" % purge_id) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if it'd be better to merge completed status with an unknown token status, so that even if we remove the entry we get the same result?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wat
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently a 404 means that either the token was invalid or the purge has completed but we've forgotten the ID.... Oh, I guess that's not true if we restart half way through. Nevermind.
docs/admin_api/purge_history_api.rst
Outdated
"status": "active" | ||
} | ||
|
||
The status will be one of ``active``, ``complete``, or ``error``. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We return failed
not error
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed, thanks
synapse/handlers/message.py
Outdated
|
||
# we log the purge_id here so that it can be tied back to the | ||
# request id in the log lines. | ||
logger.info("[purge] starting purge_id %s" % purge_id) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be logger.info("...", purge_id)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Other than using %
inside a logger.info
, LGTM
Queuing up purges doesn't sound like a good thing.
Make the purge request return quickly, and allow scripts to poll for updates.
0695257
to
e48c7aa
Compare
Make the purge request return quickly, and allow scripts to poll for updates.
(includes #2961)