Store Promise<Response> instead of Response for HTTP API transactions #1624
Conversation
This fixes a race whereby:

- User hits an endpoint.
- No cached transaction, so it executes the main code.
- User hits the same endpoint.
- No cached transaction, so it executes the main code.
- Main code finishes executing, caches the response, and returns.
- Main code finishes executing, caches the response, and returns.

This race is common in the wild when Synapse is struggling under load.

This commit fixes the race by:

- User hits an endpoint.
- Caches the promise to execute the main code, and executes the main code.
- User hits the same endpoint.
- Yields on the same promise as the first request.
- Main code finishes executing and returns, unblocking both requests.
I'm wondering if a nicer API would be something like:

```python
self.transactions.fetch_or_execute(
    self.handler.do_foo, txn_id,
    arg1, arg2, arg3=arg3
)
```

where …
Also, it would be totally awesome if …

Also, I don't mind moving it to …

I'm guessing you're proposing I make it more generic (so …
Nah, I just made it up.
At the very least it should be moved up, but generally I quite like helpers like this to live a bit separately, rather than being dumped alongside the REST servlets themselves.

Well, it is currently implemented in a generic fashion. I'm happy for the arg name to be …

Do you want the implementation to be generic, or are you happy with it in its current form (accepting …

That's fine, I suppose.

Oh, I misread. Yeah, OK, I guess the generation of the key is non-trivial. Though I'd still be tempted to move the …

SGTM

Hmmm. The old implementation was using transaction IDs as a way to prune the cache, but it meant that you couldn't have multiple in-flight requests at the same time and get idempotency, which feels bad. I've removed that code in my fix, but now the cache will grow unbounded. How do you propose I clear the cache? Periodic interval? 10 minutes? The generic form now just takes a key, so I can't be more intelligent, like basing it off the given user (the access_token, which is now concatenated into the key).
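A sketch of what scoping the cache key to the caller might look like: concatenating the access token (and request path) into the key means two different clients reusing the same transaction ID don't collide. The key format and function name here are assumptions for illustration, not the actual Synapse code.

```python
def make_txn_key(access_token, path, txn_id):
    # Hypothetical key format: scoping by access token keeps one
    # client's txn_id from colliding with another client's.
    return "%s/%s/%s" % (access_token, path, txn_id)

key_a = make_txn_key("tok_alice", "/rooms/!r:hs/send/m.room.message", "txn1")
key_b = make_txn_key("tok_bob", "/rooms/!r:hs/send/m.room.message", "txn1")
print(key_a != key_b)  # True: same txn_id, different callers, different keys
```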
```python
    of (response_code, response_dict).
    """
    try:
        return self.transactions[txn_key]
```
I think you need a `.observe()` on the end.
Done.
```python
deferred = fn(*args, **kwargs)
observable = ObservableDeferred(deferred)
self.transactions[txn_key] = observable
return observable
```
Ditto a `.observe()` here too.
Done.
```python
observable = self.txns.fetch_or_execute_request(
    request, self.on_POST, request
)
res = yield observable.observe()
```
Ah, I'd move this `.observe()` up into the actual cache to make things neater:

```python
def on_PUT(self, request, txn_id):
    return self.txns.fetch_or_execute_request(
        request, self.on_POST, request
    )
```
Done.
For now, I'd probably expire after 30 mins (10 is probably a bit on the low side). Ideally, I'd guess we'd batch-persist these txn_ids to the DB so they survive restarts, and then purge that table after a few hours/days.

(Also, a Python test case for the …
For cleaning entries, I'm just periodically checking every 30 minutes and timestamping when functions were invoked (which means the actual time in the cache is between 30 and 60 minutes). This feels simpler and less wasteful than registering timeouts for each entry in the cache, which has comparatively more function-call overhead.
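The cleaning scheme described above can be sketched as follows: each entry records its insertion time, and a sweep run once per interval drops anything older than one interval, so an entry actually survives between one and two intervals (30–60 minutes for a 30-minute sweep). This is a simplified model with made-up names, not the Synapse implementation.

```python
CLEANUP_INTERVAL_MS = 30 * 60 * 1000  # sweep every 30 minutes

class ExpiringTransactionCache:
    def __init__(self, clock):
        self._clock = clock   # callable returning milliseconds since epoch
        self._txns = {}       # txn_key -> (timestamp_ms, value)

    def set(self, txn_key, value):
        self._txns[txn_key] = (self._clock(), value)

    def get(self, txn_key):
        entry = self._txns.get(txn_key)
        return entry[1] if entry else None

    def cleanup(self):
        # Called once per CLEANUP_INTERVAL_MS: anything older than one
        # interval is dropped, so entries live 30-60 minutes in total.
        now = self._clock()
        self._txns = {
            k: (ts, v) for k, (ts, v) in self._txns.items()
            if now - ts <= CLEANUP_INTERVAL_MS
        }

# Simulated clock to demonstrate the 30-60 minute window.
t = [0]
cache = ExpiringTransactionCache(lambda: t[0])
cache.set("txn1", (200, {}))
t[0] = 29 * 60 * 1000
cache.cleanup()
print(cache.get("txn1"))  # (200, {}): still cached at 29 minutes
t[0] = 31 * 60 * 1000
cache.cleanup()
print(cache.get("txn1"))  # None: dropped by the next sweep
```

One timestamp comparison per entry per sweep replaces one timer object per entry, which is the "less wasteful" trade-off mentioned above.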
@erikjohnston PTAL. Also, are the Dendron tests just flakey, or should I be worried? Looking at the previous builds on http://matrix.org/jenkins/job/SynapseSytestDendronCommit/ makes me think flakey, but I don't know.

Yes :(

LGTM
NOTE: According to <https://matrix.org/docs/spec/client_server/r0.3.0.html#id183>, the transaction ID should be scoped to the access token, so we should preserve it with the token. However, if the client crashes and fails to save the TID, and then reuses it in the future...what happens? The server seems to accept messages with already-used TIDs. Maybe it has some kind of heuristic... I found these: <matrix-org/synapse#1481> and <matrix-org/synapse#1624>.
Now with bonus sytests!