Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug: sync request cache ignores most parameters #336

Closed
olivia-fl opened this issue May 4, 2024 · 0 comments · Fixed by #380
Closed

Bug: sync request cache ignores most parameters #336

olivia-fl opened this issue May 4, 2024 · 0 comments · Fixed by #380
Labels
bug Something isn't working core-matrix

Comments

@olivia-fl
Copy link
Contributor

The sync endpoint caches the response for multiple requests from the same device, and only allows one request to run concurrently for a given device. This cache is keyed off the user id, device id, and the since parameter of the request body.

The request body has several other parameters that influence the response and database state-changes. full_state affects the response, set_presence affects the database state-change, and filter affects both. Because the cache does not take these parameters into account, requests that should have different responses are treated interchangeably, and requests that should have side effects are skipped.

I haven't tracked down the user-visible effect of this set of bugs yet. While working on the filter implementation, it's been very obvious because I will change the filter parameters and get the same response until I restart the server. I will probably end up fixing this, at least for the filter parameter, as part of the filtering implementation.

@girlbossceo girlbossceo added bug Something isn't working core-matrix labels May 4, 2024
olivia-fl added a commit to olivia-fl/conduwuit that referenced this issue May 17, 2024
This cache can serve invalid responses, and has an extremely low hit
rate.

It serves invalid responses because because it's only keyed off
the `since` parameter, but many of the other request parameters also
affect the response or it's side effects. This will become worse once we
implement filtering, because there will be a wider space of parameters
with different responses. This problem is fixable, but not worth it
because of the low hit rate.

The low hit rate is because normal clients will always issue the next
sync request with `since` set to the `prev_batch` value of the previous
response. The only time we expect to see multiple requests with the same
`since` is when the response is empty, but we don't cache empty
responses.

This was confirmed experimentally by logging cache hits and misses over
15 minutes with a wide variety of clients. This test was run on
matrix.computer.surgery, which has only a few active users, but a
large volume of sync traffic from many rooms. Over the test period, we
had 3 hits and 5309 misses. All hits occurred in the first minute, so I
suspect that they had something to do with client recovery from an
offline state. The clients that were connected during the test are:

 - element web
 - schildichat web
 - iamb
 - gomuks
 - nheko
 - fractal
 - fluffychat web
 - fluffychat android
 - cinny web
 - element android
 - element X android

Fixes: girlbossceo#336
olivia-fl added a commit to olivia-fl/conduwuit that referenced this issue May 17, 2024
This cache can serve invalid responses, and has an extremely low hit
rate.

It serves invalid responses because because it's only keyed off
the `since` parameter, but many of the other request parameters also
affect the response or it's side effects. This will become worse once we
implement filtering, because there will be a wider space of parameters
with different responses. This problem is fixable, but not worth it
because of the low hit rate.

The low hit rate is because normal clients will always issue the next
sync request with `since` set to the `prev_batch` value of the previous
response. The only time we expect to see multiple requests with the same
`since` is when the response is empty, but we don't cache empty
responses.

This was confirmed experimentally by logging cache hits and misses over
15 minutes with a wide variety of clients. This test was run on
matrix.computer.surgery, which has only a few active users, but a
large volume of sync traffic from many rooms. Over the test period, we
had 3 hits and 5309 misses. All hits occurred in the first minute, so I
suspect that they had something to do with client recovery from an
offline state. The clients that were connected during the test are:

 - element web
 - schildichat web
 - iamb
 - gomuks
 - nheko
 - fractal
 - fluffychat web
 - fluffychat android
 - cinny web
 - element android
 - element X android

Fixes: girlbossceo#336
olivia-fl added a commit to olivia-fl/conduwuit that referenced this issue May 17, 2024
This cache can serve invalid responses, and has an extremely low hit
rate.

It serves invalid responses because because it's only keyed off
the `since` parameter, but many of the other request parameters also
affect the response or it's side effects. This will become worse once we
implement filtering, because there will be a wider space of parameters
with different responses. This problem is fixable, but not worth it
because of the low hit rate.

The low hit rate is because normal clients will always issue the next
sync request with `since` set to the `prev_batch` value of the previous
response. The only time we expect to see multiple requests with the same
`since` is when the response is empty, but we don't cache empty
responses.

This was confirmed experimentally by logging cache hits and misses over
15 minutes with a wide variety of clients. This test was run on
matrix.computer.surgery, which has only a few active users, but a
large volume of sync traffic from many rooms. Over the test period, we
had 3 hits and 5309 misses. All hits occurred in the first minute, so I
suspect that they had something to do with client recovery from an
offline state. The clients that were connected during the test are:

 - element web
 - schildichat web
 - iamb
 - gomuks
 - nheko
 - fractal
 - fluffychat web
 - fluffychat android
 - cinny web
 - element android
 - element X android

Fixes: girlbossceo#336
girlbossceo pushed a commit that referenced this issue May 17, 2024
This cache can serve invalid responses, and has an extremely low hit
rate.

It serves invalid responses because because it's only keyed off
the `since` parameter, but many of the other request parameters also
affect the response or it's side effects. This will become worse once we
implement filtering, because there will be a wider space of parameters
with different responses. This problem is fixable, but not worth it
because of the low hit rate.

The low hit rate is because normal clients will always issue the next
sync request with `since` set to the `prev_batch` value of the previous
response. The only time we expect to see multiple requests with the same
`since` is when the response is empty, but we don't cache empty
responses.

This was confirmed experimentally by logging cache hits and misses over
15 minutes with a wide variety of clients. This test was run on
matrix.computer.surgery, which has only a few active users, but a
large volume of sync traffic from many rooms. Over the test period, we
had 3 hits and 5309 misses. All hits occurred in the first minute, so I
suspect that they had something to do with client recovery from an
offline state. The clients that were connected during the test are:

 - element web
 - schildichat web
 - iamb
 - gomuks
 - nheko
 - fractal
 - fluffychat web
 - fluffychat android
 - cinny web
 - element android
 - element X android

Fixes: #336
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working core-matrix
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants