
sql: SET CLUSTER SETTING doesn't reliably wait for propagation when run on a tenant #87201

Closed
HonoreDB opened this issue Aug 31, 2022 · 6 comments
Labels
C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. docs-known-limitation T-sql-foundations SQL Foundations Team (formerly SQL Schema + SQL Sessions)

Comments

@HonoreDB
Contributor

HonoreDB commented Aug 31, 2022

This test fails or flakes for large enough n: you can't quite rely on logic that does things like

for mode in modes:
    run SQL to set cluster settings for mode
    run more SQL that depends on those settings

because even if you're the only database client, it's possible that the cluster settings won't be what you think they are after setting them.

This has come up a few times in tests, since they often exercise behavior under multiple cluster settings in the same session. In theory, though, it could also happen in real life: some cluster settings are the only way to control certain behavior, so if you want to set them "per statement", you have to change them repeatedly.

This is probably related to the implementation of waitForSettingUpdate and/or the rangefeed consumer that propagates settings, but it's not obvious to me where the race condition is. Ideally, I think waitForSettingUpdate would wait until it sees a propagated setting tagged with its own unique statement ID, or we should document that just as you can't set cluster settings in a transaction or multi-statement block, you can't rely on sessions being ordered with respect to them.
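A minimal sketch of the "set, then poll" workaround (illustrative only; `setAndWait`, the 30s deadline, and the string-valued setting are assumptions, not anything in the tree):

```
package settingsutil

import (
	"database/sql"
	"fmt"
	"time"
)

// setAndWait sets a cluster setting and then polls SHOW CLUSTER SETTING on
// the same gateway until the new value is observed.
func setAndWait(db *sql.DB, name, want string) error {
	// SET CLUSTER SETTING takes the setting name as part of the statement,
	// so the string is built directly; only do this with trusted input.
	if _, err := db.Exec(fmt.Sprintf("SET CLUSTER SETTING %s = '%s'", name, want)); err != nil {
		return err
	}
	deadline := time.Now().Add(30 * time.Second)
	for time.Now().Before(deadline) {
		var got string
		if err := db.QueryRow("SHOW CLUSTER SETTING " + name).Scan(&got); err != nil {
			return err
		}
		if got == want {
			return nil // this gateway has observed the new value
		}
		time.Sleep(100 * time.Millisecond)
	}
	return fmt.Errorf("setting %s did not reach %q before the deadline", name, want)
}
```

Even this is not airtight if the same setting is toggled back and forth, since a matching value could be a stale echo of an earlier write; that's the causality problem discussed in the comments below.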

Jira issue: CRDB-19212

@HonoreDB HonoreDB added C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. T-sql-queries SQL Queries Team labels Aug 31, 2022
@michae2
Collaborator

michae2 commented Sep 6, 2022

Notes from triage meeting:

We don't currently guarantee that SET CLUSTER SETTING is synchronous, do we? Just session settings? @yuzefovich suggests maybe we should add a variant of SET CLUSTER SETTING that blocks until all nodes have the new value.

or we should document that just as you can't set cluster settings in a transaction or multi-statement block, you can't rely on sessions being ordered with respect to them.

Yeah, maybe simply documenting this is the way to go.

@ajwerner do you have any thoughts?

@blathers-crl blathers-crl bot added the T-sql-schema-deprecated Use T-sql-foundations instead label Sep 6, 2022
@ajwerner
Contributor

ajwerner commented Sep 6, 2022

  1. This issue seems to talk about tenants, but I don't think it's particularly tenant-specific. I don't see how tenants come into play.

  2. I don't think waiting for all nodes to have the new value solves the underlying problem. Indeed, the experiment Aaron ran was on a single node. The problem is that if you toggle the setting back and forth, the current polling logic isn't smart enough to know whether an observed value was due to some previous change. If we wanted to make the waiting robust, we'd need some notion of time and causality, and to wait for that causality token to propagate. The truth is that that's not crazy hard to do -- there is a uniform notion of time here. We'd just need to plumb the timestamp corresponding to the value into the settings container and then wait for it to take on the correct value at a timestamp greater than or equal to the timestamp at which the setting was written.
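A minimal sketch of that timestamp-tracking idea (not the actual settingswatcher code; `hlc.Timestamp` and its `Less` method are CockroachDB's, the rest of the names here are made up):

```
package settingsutil

import (
	"sync"

	"github.com/cockroachdb/cockroach/pkg/util/hlc"
)

// settingState remembers the MVCC timestamp of the write that produced the
// currently applied value of a setting.
type settingState struct {
	value string
	ts    hlc.Timestamp
}

// watcher applies rangefeed updates, ignoring re-deliveries of older values.
type watcher struct {
	mu       sync.Mutex
	settings map[string]settingState
}

// maybeApply installs an update only if it is at least as new as whatever we
// last applied for the same setting, so observed values never regress.
func (w *watcher) maybeApply(name, value string, ts hlc.Timestamp) bool {
	w.mu.Lock()
	defer w.mu.Unlock()
	if prev, ok := w.settings[name]; ok && ts.Less(prev.ts) {
		return false // stale re-delivery from the rangefeed: skip it
	}
	w.settings[name] = settingState{value: value, ts: ts}
	return true
}
```

A waiter could then block until `maybeApply` has accepted a value at a timestamp greater than or equal to the one at which its own write committed, at which point the new value is definitely in effect on that node.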

@michae2
Collaborator

michae2 commented Sep 6, 2022

@HonoreDB regarding the linked test, do we need this blocking behavior from SET CLUSTER SETTING security.ocsp.mode or is this only needed for testing?

@ajwerner
Contributor

ajwerner commented Sep 6, 2022

I'd say this is something of an edge case we can document away, and is primarily painful only for testing.

@michae2
Collaborator

michae2 commented Sep 6, 2022

I guess we do document this already:

Changing a cluster setting is not instantaneous, as the change must be propagated to other nodes in the cluster.

Closing.

@michae2 michae2 closed this as completed Sep 6, 2022
ajwerner added a commit to ajwerner/cockroach that referenced this issue Sep 8, 2022
A rangefeed is allowed to send previously seen values. When it did, it would
result in the observed value of a setting regressing. There's no need for this:
we can track some timestamps and ensure we do not regress.

Fixes cockroachdb#87502

Relates to cockroachdb#87201

Release note (bug fix): In rare cases, the value of a cluster setting could
regress soon after it was set. This no longer happens for a given gateway node.
@ajwerner
Contributor

ajwerner commented Sep 8, 2022

I did something about this here: #87564.

ajwerner added a commit to ajwerner/cockroach that referenced this issue Sep 8, 2022
ajwerner added a commit to ajwerner/cockroach that referenced this issue Sep 9, 2022
craig bot pushed a commit that referenced this issue Sep 9, 2022
87564: server/settingswatcher: track timestamps so values do not regress r=ajwerner a=ajwerner

Co-authored-by: Andrew Werner <[email protected]>
blathers-crl bot pushed a commit that referenced this issue Sep 9, 2022
@exalate-issue-sync exalate-issue-sync bot removed the T-sql-queries SQL Queries Team label Sep 30, 2022
ajwerner added a commit to ajwerner/cockroach that referenced this issue Nov 9, 2022
This knob was being used by default to subvert the settings infrastructure in
tenants on the local node. This led to hazardous interactions with the
settingswatcher behavior. That library tries quite hard to synchronize updates
to settings and ensure that they do not regress. By setting the setting above
that layer, we could very much see them regress.

As far as I can tell, this code came about before tenants could actually manage
settings for themselves. In practice, this code would run before the
transaction writing the setting ran, which generally meant that so long
as you didn't flip settings back and forth, things would work out.
Nevertheless, it was tech debt and is now removed.

Fixes cockroachdb#87017
Informs cockroachdb#87201

Release note: None
craig bot pushed a commit that referenced this issue Nov 9, 2022
91565: server,settings: remove vestigial tenant-only testing knob r=ajwerner a=ajwerner

Co-authored-by: Andrew Werner <[email protected]>
renatolabs added a commit to renatolabs/cockroach that referenced this issue Nov 18, 2022
Previously, the `tpcc/mixed-headroom` roachtests would reset the
`preserve_downgrade_option` setting and then wait for the upgrade to
finish by running a `SET CLUSTER SETTING version = '...'`
statement. However, that is not reliable, as it's possible for that
statement to return an error if the reset of
`preserve_downgrade_option` has not propagated yet (see cockroachdb#87201).

To avoid this type of flake (which has been observed in manual runs),
we use a retry loop waiting for the cluster version to converge, as is
done by the majority of upgrade-related roachtests.

Epic: None.
Release note: None
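
A sketch of that retry-loop approach (illustrative; the helper name and the one-`*sql.DB`-per-node shape are assumptions, not the actual roachtest code):

```
package settingsutil

import (
	"database/sql"
	"fmt"
	"time"
)

// waitForVersion polls every node until SHOW CLUSTER SETTING version
// converges on the expected value, or the timeout expires.
func waitForVersion(dbs []*sql.DB, want string, timeout time.Duration) error {
	deadline := time.Now().Add(timeout)
	for time.Now().Before(deadline) {
		converged := true
		for _, db := range dbs {
			var v string
			if err := db.QueryRow("SHOW CLUSTER SETTING version").Scan(&v); err != nil {
				return err
			}
			if v != want {
				converged = false
				break
			}
		}
		if converged {
			return nil
		}
		time.Sleep(time.Second)
	}
	return fmt.Errorf("cluster version did not converge to %s within %s", want, timeout)
}
```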
craig bot pushed a commit that referenced this issue Nov 18, 2022
92140: roachtest: geo distributed roachtests which specify a number of nodes… r=smg260 a=smg260

…strictly less than the default number of gcloud zones (currently 9) result in a gcloud syntax error for omission of instance name. This happens because we loop through `len(zones)` instead of `len(nodes)`.

Resolves: #92150
Release note: none
Epic: none

92147: testcluster: don't swallow node start error r=andreimatei a=andreimatei

The buggy code path had the structure:
```
func bug() error {
	var err error
	if err := foo(); err != nil { // := declares a new err, shadowing the outer one
		if cond {
			runtime.Goexit() // early-exit path
		}
	}
	return err // when !cond, this is always nil: foo()'s error was shadowed
}
```
This inadvertently swallows foo()'s error when `!cond`, because foo's error has a smaller scope than the error that ends up being returned.

Release note: None
Epic: None
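
For contrast, a minimal corrected structure (illustrative, not the actual testcluster fix) assigns to a single err so nothing is shadowed:

```
func fixed() error {
	err := foo() // one err, assigned here and returned below
	if err != nil && cond {
		runtime.Goexit() // same early-exit path as before
	}
	return err // foo()'s error now propagates when !cond
}
```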

92153: roachtest: wait for upgrade to complete using retry loop in tpcc r=srosenberg a=renatolabs

Previously, the `tpcc/mixed-headroom` roachtests would reset the `preserve_downgrade_option` setting and then wait for the upgrade to finish by running a `SET CLUSTER SETTING version = '...'` statement. However, that is not reliable, as it's possible for that statement to return an error if the reset of `preserve_downgrade_option` has not propagated yet (see #87201).

To avoid this type of flake (which has been observed in manual runs), we use a retry loop waiting for the cluster version to converge, as is done by the majority of upgrade-related roachtests.

Epic: None.
Release note: None

92166: cmd: increase logictestccl stress timeout to 2h r=rytaft a=rytaft

The default timeout of 1h is not enough. Increase it to 2h to match the regular logictests.

Fixes #92108

Release note: None

Co-authored-by: Miral Gadani <[email protected]>
Co-authored-by: Andrei Matei <[email protected]>
Co-authored-by: Renato Costa <[email protected]>
Co-authored-by: Rebecca Taft <[email protected]>
@healthy-pod healthy-pod added T-sql-foundations SQL Foundations Team (formerly SQL Schema + SQL Sessions) and removed T-sql-schema-deprecated Use T-sql-foundations instead labels May 17, 2023
@mgartner mgartner moved this to Done in SQL Queries Jul 24, 2023