Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Slim down schema agreement API #779

Merged
merged 4 commits into from
Aug 22, 2023

Conversation

wprzytula
Copy link
Collaborator

@wprzytula wprzytula commented Jul 31, 2023

Schema agreement API is another place in the driver's code that can be polished towards API stability.

What's done

  1. fetch_schema_version() is removed.
    It can be a nasty source of race bugs when schema version is fetched without awaiting for schema agreement. It is especially risky when one awaits schema agreement and subsequently fetches schema version, because the schema version can change in the meantime and schema may no longer be in agreement. To cope with the problem, await_schema_agreement() now returns the agreed schema version, and fetch_schema_version() is removed alltogether.

  2. await_schema_agreement() gets implicit timeout.
    Similarly to

    • what Java driver 3 does in regard to awaiting schema agreement's API,
    • what was recently done in Rust driver in regard to tracing info's API,

    the API of awaiting schema agreement is altered so that:

    • await_timed_schema_agreement(), which takes explicit timeout, is
      removed,
    • await_schema_agreement() is now bound with a timeout that is set
      globally per-Session in SessionConfig.

    The motivation is that it could lead to application's deadlock when await_schema_agreement() was called with no timeout given and the cluster stoped synchronising (so schema agreement would be never reached), It is rarely desirable to block the calling application for arbitrarily long time, especially that some extreme situations are possible, such as a network partition between nodes.

  3. SessionBuilders API is accomodated to the changes. Docs are updated as well.

Bonus

  • missing docstrings were added for AddressTranslator.

Refs: #660

Pre-review checklist

  • I have split my patch into logically separate commits.
  • All commit messages clearly explain what they change and why.
  • [ ] I added relevant tests for new features and bug fixes.
  • All commits compile, pass static checks and pass test.
  • PR description sums up the changes and reasons why they should be introduced.
  • I have provided docstrings for the public items that I want to introduce.
  • I have adjusted the documentation in ./docs/source/.
  • [ ] I added appropriate Fixes: annotations to PR description.

&self,
timeout_duration: Duration,
) -> Result<bool, QueryError> {
timeout(timeout_duration, self.await_schema_agreement())
) -> Result<Option<Uuid>, QueryError> {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
) -> Result<Option<Uuid>, QueryError> {
) -> Result<Uuid, QueryError> {

maybe the timeout should be represented as a timeout QueryError?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea!

Copy link
Collaborator Author

@wprzytula wprzytula Jul 31, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only problem I see here is that we lose a way to conveniently distinguish cases of timeout (not yet in agreement) from cases of server-side or connection-related errors.

In version with Option<Uuid>:

if session
        .await_schema_agreement(Duration::from_secs(5))
        .await?
        .is_some()
    {
        println!("Schema is in agreement in time");
    } else {
        println!("Schema is NOT in agreement in time");
    }

whereas in the version you proposed:

match session
    .await_schema_agreement(Duration::from_secs(5))
    .await {
    Ok(_schema_version) => println!("Schema is in agreement in time"),
    Err(QueryError::RequestTimeout(_)) => println!("Schema is NOT in agreement in time"),
    err => return err,
};

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've implemented it anyway.

@wprzytula wprzytula force-pushed the slim-schema-agreement-api branch 2 times, most recently from 8dca004 to 4c6f5d5 Compare August 3, 2023 08:29
@wprzytula wprzytula marked this pull request as ready for review August 3, 2023 08:42
@wprzytula
Copy link
Collaborator Author

I've documented automatic schema agreement awaiting.

@wprzytula wprzytula force-pushed the slim-schema-agreement-api branch 2 times, most recently from 1422985 to 542b706 Compare August 18, 2023 14:24
scylla/src/transport/session.rs Outdated Show resolved Hide resolved
@piodul
Copy link
Collaborator

piodul commented Aug 22, 2023

Requires a rebase and dropping one of the commits, but otherwise LGTM.

It was a function that is exclusively used by `fetch_schema_version()`,
and is moreover misnamed. It is inlined into `fetch_schema_version()`.
Similarly to
- what Java driver 3 does in regard to awaiting schema
agreement's API,
- what was recently done in Rust driver in regard to tracing info's API,

the API of awaiting schema agreement is altered so that:
- `await_timed_schema_agreement()`, which takes explicit timeout, is
  removed,
- `await_schema_agreement()` is now bound with a timeout that is set
  globally per-`Session` in `SessionConfig`.

The motivation is that it could lead to application's deadlock when
`await_schema_agreement()` was called with no timeout given and the
cluster stoped synchronising (so schema agreement would be never
reached), It is rarely desirable to block the calling application for
arbitrarily long time, especially that some extreme situations are
possible, such as a network partition between nodes.

`SessionBuilder`s API is accomodated to the changes. Docs are updated as
well.
It can be a nasty source of race bugs when schema version is fetched
without awaiting for schema agreement. It is especially risky when one
awaits schema agreement and subsequently fetches schema version, because
the schema version can change in the meantime and schema may no longer
be in agreement.
To cope with the problem, `await_schema_agreement()` now returns the
agreed schema version, and `fetch_schema_version()` is removed
alltogether.
The mechanism of automatically awaiting schema agreement after a
schema-altering request is executed was missing mention in docs.
@wprzytula
Copy link
Collaborator Author

Rebased and got rid of irrelevant commit.

@piodul
Copy link
Collaborator

piodul commented Aug 22, 2023

Thanks. I'll merge after the CI checks finish.

@piodul piodul merged commit 14dc668 into scylladb:main Aug 22, 2023
9 checks passed
@wprzytula wprzytula deleted the slim-schema-agreement-api branch August 22, 2023 10:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants