-
-
Notifications
You must be signed in to change notification settings - Fork 2.1k
Make repl send_event idempotent and retry on timeouts #2920
Conversation
If we treated timeouts as failures on the worker we would attempt to clean up e.g. push actions while the master might still process the event.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
neat!
think it needs a couple of log lines for the uncommon cases, but lgtm otherwise
|
||
def on_PUT(self, request, event_id): | ||
result = self.response_cache.get(event_id) | ||
if not result: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd suggest adding a line of logging for the case where there is an entry in the cache already.
except CodeMessageException as e: | ||
if e.code != 504: | ||
raise | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggest logging something here.
if not result: | ||
result = self.response_cache.set( | ||
event_id, | ||
preserve_fn(self._handle_request)(request) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you preserve_fn this once rather than on every request?
break | ||
except CodeMessageException as e: | ||
if e.code != 504: | ||
logger.warn("send_event request timed out") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is in the wrong place
def on_PUT(self, request, event_id): | ||
result = self.response_cache.get(event_id) | ||
if not result: | ||
logger.warn("Returning cached response") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so is this
self._handle_request(request) | ||
) | ||
else: | ||
logger.warn("Returning cached response") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wouldn't mind if you logged the state of result.called
(ie whether we're returning a completed response or are still waiting for the request to complete), but this is fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
If we treated timeouts as failures on the worker we would attempt to
clean up e.g. push actions while the master might still process the
event.