Add transactional API to history purge #2962

richvdh · 2018-03-08T11:52:51Z

Make the purge request return quickly, and allow scripts to poll for updates.

(includes #2961)

richvdh · 2018-03-08T12:23:05Z

retest this please

richvdh · 2018-03-08T12:34:28Z

remaining test failure is because the test was broken (now fixed) and PR builder doesn't rerun the commit tests

erikjohnston

Broadly LGTM, I've commented on a lot of nits, most of which are just opinions so shrug. Could do with a couple of docstrings and s/error/failed/ in the docs though.

erikjohnston · 2018-03-12T10:01:32Z

synapse/handlers/message.py

        self._purges_in_progress_by_room.add(room_id)
        try:
            with (yield self.pagination_lock.write(room_id)):
                yield self.store.purge_history(
                    room_id, topological_ordering, delete_local_events,
                )
+            logger.info("[purge] complete")


I wonder if we should log the purge_id too for all these log lines. Maybe set the request_id for the context or something?

Yeah could do. Or we could make sure we log the purge id against the existing request_id so that you could tie the log lines together through that?

Yup, happy either way.

ok, changed to log the purge_id

erikjohnston · 2018-03-12T10:01:53Z

synapse/handlers/message.py

+            logger.info("[purge] complete")
+            self._purges_by_id[purge_id].status = "complete"
+        except Exception:
+            logger.error("[purge] failed: %s", Failure().getTraceback().rstrip())


Why are we not logging as an exception?

what do you mean, and why would we do so?

As in, why aren't we using logger.exception(...) rather than manually building a stack trace?

ah right. Because Failure().getTraceback() will give us the proper stacktrace, whereas sys.exc_info (and hence logger.exception) get confused by the deferreds and only go one frame down.

erikjohnston · 2018-03-12T10:04:05Z

synapse/handlers/message.py

        finally:
            self._purges_in_progress_by_room.discard(room_id)

+            # remove the purge from the list an hour after it completes
+            def clear_purge():
+                del self._purges_by_id[purge_id]


For paranoia's sake I tend to prefer d.pop(key, None), as that won't throw if the key doesn't exist, but shrug.

I'd argue that knowing the key has gone missing would be a good thing...
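
For readers weighing the two options, a minimal sketch of the behavioural difference follows. The `purges_by_id` dict and its contents are hypothetical stand-ins for `self._purges_by_id`, not code from this PR.

```python
# Hypothetical stand-in for self._purges_by_id
purges_by_id = {"abc123": "complete"}

# pop() with a default stays silent when the key has already gone missing
purges_by_id.pop("missing", None)

# del raises KeyError instead, which surfaces the unexpected state
try:
    del purges_by_id["missing"]
    raised = False
except KeyError:
    raised = True

# Both forms remove a key that is present
del purges_by_id["abc123"]
```

Which one is preferable depends on whether a missing key should be treated as a bug worth hearing about (`del`) or as a harmless race (`pop`).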

erikjohnston · 2018-03-12T10:14:30Z

synapse/handlers/message.py

+        self.status = "active"
+
+    def asdict(self):
+        return self.__dict__


I'm not entirely convinced this is better than just using a regular dict. Having a dedicated PurgeStatus class is useful for documenting and enforcing the return values, but I don't think that this particular construct really does either of those.

I'd prefer either a subclass of namedtuple (it does helpfully provide an _asdict already), or have asdict be explicit about the properties it returns, i.e. {"status": self.status}

Also, a short docstring would be good.

I tried a namedtuple to start with, but I wanted the individual fields to be mutable so that didn't work.

So yes, I can make it a more explicit dict.

> I tried a namedtuple to start with, but I wanted the individual fields to be mutable so that didn't work.

Ah, of course.

have made asdict more explicit, and turned status into a more enummy thing while I'm at it
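
A rough sketch of what "explicit asdict plus enum-like status" could look like is below. The constant names and the `STATUS_TEXT` mapping are assumptions made for illustration; the thread does not show the final code.

```python
class PurgeStatus(object):
    """Tracks the state of a single history purge.

    Attributes:
        status (int): one of the STATUS_* constants below
    """

    # Assumed constant names; only the three status strings come from the thread
    STATUS_ACTIVE = 0
    STATUS_COMPLETE = 1
    STATUS_FAILED = 2

    STATUS_TEXT = {
        STATUS_ACTIVE: "active",
        STATUS_COMPLETE: "complete",
        STATUS_FAILED: "failed",
    }

    def __init__(self):
        # Fields stay mutable, which is why a namedtuple did not fit
        self.status = PurgeStatus.STATUS_ACTIVE

    def asdict(self):
        # Explicit about what is returned, rather than exposing __dict__
        return {"status": PurgeStatus.STATUS_TEXT[self.status]}


purge = PurgeStatus()
initial = purge.asdict()
purge.status = PurgeStatus.STATUS_FAILED
after_failure = purge.asdict()
```

Keeping the mapping in one place means the strings in the API response and the strings in the docs only need to agree with a single table.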

erikjohnston · 2018-03-12T10:15:24Z

synapse/handlers/message.py

-    def purge_history(self, room_id, topological_ordering,
-                      delete_local_events=False):
+    def start_purge_history(self, room_id, topological_ordering,
+                            delete_local_events=False):
        if room_id in self._purges_in_progress_by_room:


erikjohnston · 2018-03-12T10:16:16Z

synapse/handlers/message.py

+            # remove the purge from the list an hour after it completes
+            def clear_purge():
+                del self._purges_by_id[purge_id]
+            reactor.callLater(3600, clear_purge)


Given this can take hours, I'd probably have it much higher, like 24 hours or something

the logic went that if you were polling, you'd get what you needed within an hour either way. Can change it to 24h if you like though...

I was just thinking in terms of people manually curling. Given how small PurgeStatus is I think we may as well keep them for longer.

erikjohnston · 2018-03-12T10:18:13Z

synapse/rest/client/v1/admin.py

+
+        purge_status = self.handlers.message_handler.get_purge_status(purge_id)
+        if purge_status is None:
+            raise NotFoundError("purge id '%s' not found" % purge_id)


I wonder if it'd be better to merge completed status with an unknown token status, so that even if we remove the entry we get the same result?

Currently a 404 means that either the token was invalid or the purge has completed but we've forgotten the ID... Oh, I guess that's not true if we restart halfway through. Never mind.

erikjohnston · 2018-03-12T10:18:50Z

docs/admin_api/purge_history_api.rst

+        "status": "active"
+    }
+
+The status will be one of ``active``, ``complete``, or ``error``.


We return failed, not error

fixed, thanks
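
To illustrate how a script might poll against the three documented statuses, here is a minimal sketch. `wait_for_purge` and the `fetch_status` callable are hypothetical: the real call would be an HTTP request to the status endpoint, which is not reproduced here.

```python
import time


def wait_for_purge(fetch_status, purge_id, interval=1.0, max_polls=100,
                   sleep=time.sleep):
    """Poll until the purge leaves the "active" state.

    fetch_status is a hypothetical callable that returns the status dict
    (e.g. {"status": "active"}) for a purge_id, or None if the id is unknown.
    """
    for _ in range(max_polls):
        result = fetch_status(purge_id)
        if result is None:
            # Mirrors the 404 case: unknown or already-forgotten purge id
            raise LookupError("purge id %r not found" % (purge_id,))
        if result["status"] != "active":
            return result["status"]  # "complete" or "failed"
        sleep(interval)
    raise RuntimeError(
        "purge %r still active after %d polls" % (purge_id, max_polls)
    )


# Usage with canned responses in place of real HTTP calls
responses = iter([
    {"status": "active"},
    {"status": "active"},
    {"status": "complete"},
])
final = wait_for_purge(lambda _id: next(responses), "abc123",
                       sleep=lambda seconds: None)
```

Passing `sleep` in as a parameter is only there so the loop can be exercised without real delays.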

erikjohnston · 2018-03-12T14:41:00Z

synapse/handlers/message.py

+
+        # we log the purge_id here so that it can be tied back to the
+        # request id in the log lines.
+        logger.info("[purge] starting purge_id %s" % purge_id)


Should be logger.info("...", purge_id)
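
A short sketch of the difference, using only the standard `logging` module. The logger name, `ListHandler` and the `purge_id` value are made up for the example.

```python
import logging


class ListHandler(logging.Handler):
    """Collects log records in memory so they can be inspected."""

    def __init__(self):
        logging.Handler.__init__(self)
        self.records = []

    def emit(self, record):
        self.records.append(record)


logger = logging.getLogger("purge_example")  # hypothetical logger name
logger.setLevel(logging.INFO)
handler = ListHandler()
logger.addHandler(handler)

purge_id = "abc123"  # hypothetical id

# Arguments are passed separately; logging only interpolates them if the
# record is actually emitted, and the template stays available to handlers.
logger.info("[purge] starting purge_id %s", purge_id)

record = handler.records[0]
message = record.getMessage()
```

With the `%` operator, the string is built before `logger.info` is called, regardless of whether the INFO level is enabled.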

erikjohnston

Other than using % inside a logger.info, LGTM

richvdh mentioned this pull request Mar 8, 2018

Tests for history purge transactional API matrix-org/sytest#434

Merged

richvdh force-pushed the rav/purge_history_txns branch from da42238 to 643b9eb Compare March 8, 2018 12:06

richvdh assigned erikjohnston Mar 12, 2018

erikjohnston suggested changes Mar 12, 2018

erikjohnston reviewed Mar 12, 2018

erikjohnston approved these changes Mar 12, 2018

richvdh added 2 commits March 12, 2018 16:22

Return an error when doing two purges on a room

1708412

Queuing up purges doesn't sound like a good thing.

Add transactional API to history purge

e48c7aa

Make the purge request return quickly, and allow scripts to poll for updates.

richvdh force-pushed the rav/purge_history_txns branch from 0695257 to e48c7aa Compare March 12, 2018 16:23

richvdh merged commit d65ceb4 into develop Mar 12, 2018

richvdh deleted the rav/purge_history_txns branch March 12, 2018 16:34
