
sql: SET CLUSTER SETTING doesn't reliably wait for propagation when run on a tenant #87201

Closed
HonoreDB opened this issue Aug 31, 2022 · 6 comments
Labels
C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. docs-known-limitation T-sql-foundations SQL Foundations Team (formerly SQL Schema + SQL Sessions)

Comments

@HonoreDB
Contributor

HonoreDB commented Aug 31, 2022

This test fails or flakes for large enough n: you can't quite rely on logic that does things like

for mode in modes:
    run SQL to set cluster settings for mode
    run more SQL that depends on those settings

because even if you're the only database client, it's possible that the cluster settings won't be what you think they are after setting them.

This has come up a few times in tests, since they often exercise behavior under multiple cluster settings in the same session. In theory, though, it could also happen in real life: some cluster settings are the only way to control certain behavior, so if you want to set them "per statement", you have to change them repeatedly.

This is probably related to the implementation of waitForSettingUpdate and/or the rangefeed consumer that propagates settings, but it's not obvious to me where the race condition is. Ideally, I think waitForSettingUpdate would wait until it sees a propagated setting tagged with its own unique statement ID, or we should document that just as you can't set cluster settings in a transaction or multi-statement block, you can't rely on sessions being ordered with respect to them.
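A minimal sketch of the "set, then poll" workaround (illustrative only; `setAndWait`, the 30s deadline, and the string-valued setting are assumptions, not anything in the tree):

```
package settingsutil

import (
	"database/sql"
	"fmt"
	"time"
)

// setAndWait sets a cluster setting and then polls SHOW CLUSTER SETTING on
// the same gateway until the new value is observed.
func setAndWait(db *sql.DB, name, want string) error {
	// SET CLUSTER SETTING takes the setting name as part of the statement,
	// so the string is built directly; only do this with trusted input.
	if _, err := db.Exec(fmt.Sprintf("SET CLUSTER SETTING %s = '%s'", name, want)); err != nil {
		return err
	}
	deadline := time.Now().Add(30 * time.Second)
	for time.Now().Before(deadline) {
		var got string
		if err := db.QueryRow("SHOW CLUSTER SETTING " + name).Scan(&got); err != nil {
			return err
		}
		if got == want {
			return nil // this gateway has observed the new value
		}
		time.Sleep(100 * time.Millisecond)
	}
	return fmt.Errorf("setting %s did not reach %q before the deadline", name, want)
}
```

Even this is not airtight if the same setting is toggled back and forth, since a matching value could be a stale echo of an earlier write; that's the causality problem discussed in the comments below.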

Jira issue: CRDB-19212

@HonoreDB HonoreDB added C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. T-sql-queries SQL Queries Team labels Aug 31, 2022
@michae2
Collaborator

michae2 commented Sep 6, 2022

Notes from triage meeting:

We don't currently guarantee that SET CLUSTER SETTING is synchronous, do we? Just session settings? @yuzefovich suggests maybe we should add a variant of SET CLUSTER SETTING that blocks until all nodes have the new value.

or we should document that just as you can't set cluster settings in a transaction or multi-statement block, you can't rely on sessions being ordered with respect to them.

Yeah, maybe simply documenting this is the way to go.

@ajwerner do you have any thoughts?

@blathers-crl blathers-crl bot added the T-sql-schema-deprecated Use T-sql-foundations instead label Sep 6, 2022
@ajwerner
Contributor

ajwerner commented Sep 6, 2022

  1. This issue seems to talk about tenants, but I don't think it's particularly tenant-specific. I don't see how tenants come into play.

  2. I don't think waiting for all nodes to have the new value solves the underlying problem. Indeed, the experiment Aaron ran was on a single node. The problem is that if you toggle the setting back and forth, the current polling logic isn't smart enough to know whether an observed value was due to some previous change. If we wanted to make the waiting robust, we'd need some notion of time and causality, and to wait for that causality token to propagate. The truth is that that's not crazy hard to do -- there is a uniform notion of time here. We'd just need to plumb the timestamp corresponding to the value into the settings container and then wait for it to take on the correct value at a timestamp greater than or equal to the timestamp at which the setting was written.
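A minimal sketch of that timestamp-tracking idea (not the actual settingswatcher code; `hlc.Timestamp` and its `Less` method are CockroachDB's, the rest of the names here are made up):

```
package settingsutil

import (
	"sync"

	"github.com/cockroachdb/cockroach/pkg/util/hlc"
)

// settingState remembers the MVCC timestamp of the write that produced the
// currently applied value of a setting.
type settingState struct {
	value string
	ts    hlc.Timestamp
}

// watcher applies rangefeed updates, ignoring re-deliveries of older values.
type watcher struct {
	mu       sync.Mutex
	settings map[string]settingState
}

// maybeApply installs an update only if it is at least as new as whatever we
// last applied for the same setting, so observed values never regress.
func (w *watcher) maybeApply(name, value string, ts hlc.Timestamp) bool {
	w.mu.Lock()
	defer w.mu.Unlock()
	if prev, ok := w.settings[name]; ok && ts.Less(prev.ts) {
		return false // stale re-delivery from the rangefeed: skip it
	}
	w.settings[name] = settingState{value: value, ts: ts}
	return true
}
```

A waiter could then block until `maybeApply` has accepted a value at a timestamp greater than or equal to the one at which its own write committed, at which point the new value is definitely in effect on that node.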

@michae2
Collaborator

michae2 commented Sep 6, 2022

@HonoreDB regarding the linked test, do we need this blocking behavior from SET CLUSTER SETTING security.ocsp.mode or is this only needed for testing?

@ajwerner
Contributor

ajwerner commented Sep 6, 2022

I'd say this is something of an edge case we can document away, and is primarily painful only for testing.

@michae2
Collaborator

michae2 commented Sep 6, 2022

I guess we do document this already:

Changing a cluster setting is not instantaneous, as the change must be propagated to other nodes in the cluster.

Closing.

@michae2 michae2 closed this as completed Sep 6, 2022
ajwerner added a commit to ajwerner/cockroach that referenced this issue Sep 8, 2022
A rangefeed is allowed to send previously seen values. When it did, it would
result in the observed value of a setting regressing. There's no need for this:
we can track some timestamps and ensure we do not regress.

Fixes cockroachdb#87502

Relates to cockroachdb#87201

Release note (bug fix): In rare cases, the value of a cluster setting could
regress soon after it was set. This no longer happens for a given gateway node.
@ajwerner
Contributor

ajwerner commented Sep 8, 2022

I did something about this here: #87564.

ajwerner added a commit to ajwerner/cockroach that referenced this issue Sep 8, 2022
ajwerner added a commit to ajwerner/cockroach that referenced this issue Sep 9, 2022
craig bot pushed a commit that referenced this issue Sep 9, 2022
87564: server/settingswatcher: track timestamps so values do not regress r=ajwerner a=ajwerner

Co-authored-by: Andrew Werner <[email protected]>
blathers-crl bot pushed a commit that referenced this issue Sep 9, 2022
@exalate-issue-sync exalate-issue-sync bot removed the T-sql-queries SQL Queries Team label Sep 30, 2022
ajwerner added a commit to ajwerner/cockroach that referenced this issue Nov 9, 2022
This knob was being used by default to subvert the settings infrastructure in
tenants on the local node. This led to hazardous interactions with the
settingswatcher behavior. That library tries quite hard to synchronize updates
to settings and ensure that they do not regress. By setting the setting above
that layer, we could very much see them regress.

As far as I can tell, this code came about before tenants could actually manage
settings for themselves. In practice, this code would run before the
transaction writing the setting ran, which generally meant that so long
as you didn't flip settings back and forth, things would work out.
Nevertheless, it was tech debt and is now removed.

Fixes cockroachdb#87017
Informs cockroachdb#87201

Release note: None
craig bot pushed a commit that referenced this issue Nov 9, 2022
91565: server,settings: remove vestigial tenant-only testing knob r=ajwerner a=ajwerner

Co-authored-by: Andrew Werner <[email protected]>
renatolabs added a commit to renatolabs/cockroach that referenced this issue Nov 18, 2022
Previously, the `tpcc/mixed-headroom` roachtests would reset the
`preserve_downgrade_option` setting and then wait for the upgrade to
finish by running a `SET CLUSTER SETTING version = '...'`
statement. However, that is not reliable, as it's possible for that
statement to return an error if the reset of
`preserve_downgrade_option` has not propagated yet (see cockroachdb#87201).

To avoid this type of flake (which has been observed in manual runs),
we use a retry loop waiting for the cluster version to converge, as is
done by the majority of upgrade-related roachtests.

Epic: None.
Release note: None
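
A sketch of that retry-loop approach (illustrative; the helper name and the one-`*sql.DB`-per-node shape are assumptions, not the actual roachtest code):

```
package settingsutil

import (
	"database/sql"
	"fmt"
	"time"
)

// waitForVersion polls every node until SHOW CLUSTER SETTING version
// converges on the expected value, or the timeout expires.
func waitForVersion(dbs []*sql.DB, want string, timeout time.Duration) error {
	deadline := time.Now().Add(timeout)
	for time.Now().Before(deadline) {
		converged := true
		for _, db := range dbs {
			var v string
			if err := db.QueryRow("SHOW CLUSTER SETTING version").Scan(&v); err != nil {
				return err
			}
			if v != want {
				converged = false
				break
			}
		}
		if converged {
			return nil
		}
		time.Sleep(time.Second)
	}
	return fmt.Errorf("cluster version did not converge to %s within %s", want, timeout)
}
```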
craig bot pushed a commit that referenced this issue Nov 18, 2022
92140: roachtest: geo distributed roachtests which specify a number of nodes… r=smg260 a=smg260

…strictly less than the default number of gcloud zones (currently 9) result in a gcloud syntax error for omission of instance name. This happens because we loop through `len(zones)` instead of `len(nodes)`.

Resolves: #92150
Release note: none
Epic: none

92147: testcluster: don't swallow node start error r=andreimatei a=andreimatei

The buggy code path had the structure:
```
func bug() error {
	var err error
	if err := foo(); err != nil { // := declares a new err, shadowing the outer one
		if cond {
			runtime.Goexit() // early-exit path
		}
	}
	return err // when !cond, this is always nil: foo()'s error was shadowed
}
```
This inadvertently swallows foo()'s error when `!cond`, because foo's error has a smaller scope than the error that ends up being returned.

Release note: None
Epic: None
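
For contrast, a minimal corrected structure (illustrative, not the actual testcluster fix) assigns to a single err so nothing is shadowed:

```
func fixed() error {
	err := foo() // one err, assigned here and returned below
	if err != nil && cond {
		runtime.Goexit() // same early-exit path as before
	}
	return err // foo()'s error now propagates when !cond
}
```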

92153: roachtest: wait for upgrade to complete using retry loop in tpcc r=srosenberg a=renatolabs

Previously, the `tpcc/mixed-headroom` roachtests would reset the `preserve_downgrade_option` setting and then wait for the upgrade to finish by running a `SET CLUSTER SETTING version = '...'` statement. However, that is not reliable, as it's possible for that statement to return an error if the reset of `preserve_downgrade_option` has not propagated yet (see #87201).

To avoid this type of flake (which has been observed in manual runs), we use a retry loop waiting for the cluster version to converge, as is done by the majority of upgrade-related roachtests.

Epic: None.
Release note: None

92166: cmd: increase logictestccl stress timeout to 2h r=rytaft a=rytaft

The default timeout of 1h is not enough. Increase it to 2h to match the regular logictests.

Fixes #92108

Release note: None

Co-authored-by: Miral Gadani <[email protected]>
Co-authored-by: Andrei Matei <[email protected]>
Co-authored-by: Renato Costa <[email protected]>
Co-authored-by: Rebecca Taft <[email protected]>
@healthy-pod healthy-pod added T-sql-foundations SQL Foundations Team (formerly SQL Schema + SQL Sessions) and removed T-sql-schema-deprecated Use T-sql-foundations instead labels May 17, 2023
@mgartner mgartner moved this to Done in SQL Queries Jul 24, 2023