Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

session_test: Increase tracing info timeout #966

Merged
merged 3 commits into from
Mar 19, 2024

Conversation

Lorak-mmk
Copy link
Collaborator

@Lorak-mmk Lorak-mmk commented Mar 19, 2024

Default values for fetching tracing info are quite strict in the driver: it will make 5 attempts to fetch the data, waiting 3ms between attempts.

For Scylla this is enough, but for Cassandra it is not. For this reason our tests sometimes fail.

This was partially (for one test) fixed previously: 61ad878
This PR fixes other test case that might be affected.
Previous fix used 50 attempts with 200ms wait, I changed it to 200 attempts with 50ms wait in order to not slow down Scylla tests unnecessarily.

Increased default retry attempts in Session from 5 to 10 to make tracing fetching more reliable for Scylla users without increasing latency.
I didn't touch default sleep between attempts - for Scylla new values should be ok, and Cassandra users can change them in config. I added a warning in docs so that they are aware of the problem.

Fixes: #957
Fixes: #709

Pre-review checklist

  • I have split my patch into logically separate commits.
  • All commit messages clearly explain what they change and why.
  • I added relevant tests for new features and bug fixes.
  • All commits compile, pass static checks and pass test.
  • PR description sums up the changes and reasons why they should be introduced.
  • I have provided docstrings for the public items that I want to introduce.
  • I have adjusted the documentation in ./docs/source/.
  • I added appropriate Fixes: annotations to PR description.

@Lorak-mmk
Copy link
Collaborator Author

While at it, we should decide what to do about #709
I'd be in favor of closing it, because this default seems good for Scylla (I don't think tracing tests ever failed for Scylla in CI) so I don't want to increase latency for Scylla users because of Cassandra.

Copy link

github-actions bot commented Mar 19, 2024

cargo semver-checks found no API-breaking changes in this PR! 🎉🥳
Checked commit: c70faec

@Lorak-mmk Lorak-mmk force-pushed the fix-tracing-defaults branch 2 times, most recently from aaeef26 to 963daa7 Compare March 19, 2024 13:11
@Lorak-mmk
Copy link
Collaborator Author

While at it, we should decide what to do about #709 I'd be in favor of closing it, because this default seems good for Scylla (I don't think tracing tests ever failed for Scylla in CI) so I don't want to increase latency for Scylla users because of Cassandra.

Added comments about Cassandra in session builder, and increased retries amount from 5 to 10, I think we can close #709 with this.

@wprzytula
Copy link
Collaborator

Added comments about Cassandra in session builder, and increased retries amount from 5 to 10, I think we can close #709 with this.

I agree, let's close #709.

Default values for fetching tracing info are quite strict in the driver:
it will make 5 attempts to fetch the data, waiting 3ms between attempts.

For Scylla this is enough, but for Cassandra it is not. For this reason
our tests sometimes fail.

This was partially (for one test) fixed previously:
scylladb@61ad878

This commit fixes rest of the tests in session_test that could be affected.
It also changes timeout parameters. Previous fixed used 50 attempts with
200ms wait, I changed it to 200 attempts with 50ms wait in order to not
slow down Scylla tests unnecessarily.
Cassandra sometimes takes a lot of time to update tracing tables.
Default values in session builder are suitable for Scylla, so Cassandra
users may encounter unexpected timeouts. This commit adds a note
suggesting them to increase the values.
Default values are a bit optimistic: fetching will be attempted 5 times,
with 3ms sleeps between attempts. I am reluctant to increase sleep time,
because it will increase latency in optimistic case.

This commit increases default attempts number from 5 to 10, so that
fetching tracing is more reliable, bot not overly lengthy.
@Lorak-mmk Lorak-mmk force-pushed the fix-tracing-defaults branch from d0e8ca2 to c70faec Compare March 19, 2024 17:16
@Lorak-mmk Lorak-mmk requested a review from wprzytula March 19, 2024 17:16
@wprzytula wprzytula merged commit 9634a0f into scylladb:main Mar 19, 2024
10 of 11 checks passed
@Lorak-mmk Lorak-mmk deleted the fix-tracing-defaults branch March 20, 2024 11:38
@Lorak-mmk Lorak-mmk mentioned this pull request May 9, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

test_tracing is probably flaky with Cassandra Default retry delay for getting tracing info may be too low
2 participants