
server: meta-issue to organize part of the work in CRDB-26691 #98431

Open
knz opened this issue Mar 11, 2023 · 0 comments
Labels
C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.

Comments

@knz
Contributor

knz commented Mar 11, 2023

This issue covers 4 related threads of work in epic CRDB-26691:

Architectural overview / tech strategy

We will choose a common, shared architecture for all 4 threads of work, which will reduce the overall amount of work by sharing a large chunk of technology between them (a hedged sketch of the shared structure follows the list):

  1. on each KV node, tenant-specific configuration will be loaded up from system tables into a single in-memory data structure
    • each tenant ID mapped to its own config
    • this will integrate data from "settingswatcher" and "capwatcher" and will be extended with service mode and injected SQL
    • wherever we do not yet have a rangefeed-based approach, we will prototype the solution using SQL polling, with the expectation that rangefeeds will be added later (as an optimization)
  2. we will use a single streaming RPC to propagate changes to the in-memory data structure to the tenant connector on each tenant. We will extend TenantSettings and its result payload for this.
  3. we will modify the connector startup logic to wait on the initial config before reporting that startup has completed
    (this will add a tiny extra startup latency to SQL pods, but we have sign off from the Serverless team on that -- it will not incur cross-region latency since the retrieved data is already in RAM in each KV node)
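
As a concrete illustration, here is a minimal sketch of what the shared in-memory structure could look like in Go. All names here (`Registry`, `TenantConfig`, the field list) are hypothetical placeholders, not the actual CockroachDB types:

```go
// Hypothetical sketch of the shared per-KV-node structure described above.
package tenantconfig

import "sync"

type TenantID uint64

// TenantConfig merges the data currently produced by "settingswatcher" and
// "capwatcher", extended with service mode and injected SQL.
type TenantConfig struct {
	SettingOverrides map[string]string
	Capabilities     map[string]bool
	ServiceMode      string
	InjectedSQL      []string
}

// Registry holds one config per tenant ID, plus a notification channel that
// streaming RPC handlers can select on instead of polling.
type Registry struct {
	mu      sync.RWMutex
	configs map[TenantID]*TenantConfig
	changed chan struct{} // closed and replaced on every update
}

func NewRegistry() *Registry {
	return &Registry{
		configs: make(map[TenantID]*TenantConfig),
		changed: make(chan struct{}),
	}
}

// Update installs a new config for a tenant and wakes up all waiters.
func (r *Registry) Update(id TenantID, cfg *TenantConfig) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.configs[id] = cfg
	close(r.changed) // wake everyone waiting on the previous channel
	r.changed = make(chan struct{})
}

// Get returns the tenant's current config together with a channel that is
// closed on the next update, so callers can wait for changes.
func (r *Registry) Get(id TenantID) (*TenantConfig, <-chan struct{}) {
	r.mu.RLock()
	defer r.mu.RUnlock()
	return r.configs[id], r.changed
}
```

The close-and-replace channel idiom gives every waiter a broadcast without maintaining a subscriber list; the streaming RPC in item 2 can simply select on the channel returned by `Get`.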

Work threads

The following is a logical description of the work; the actual work items are shared across the groups of tasks (see the summary of work at the bottom).

A. Group capabilities and setting overrides into a single data structure (this is an infra dependency for the work below)

  1. gather the rangefeeds from system.tenant_settings and system.tenants into the same place
  2. link up the notification channel (already defined for setting overrides) to updates in system.tenants

B. Propagate capabilities to tenants

  1. Define RPC response payload
  2. Use notif channel (dependency on A.2) in streaming RPC server handler (see the sketch after this list)
  3. Receive data in tenant connector, cache it there
  4. Define API for use by SQL
  5. Objective: notify the team of the availability of capabilities tenant-side and educate on API use
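
A hedged sketch of B.2/B.3, reusing the hypothetical `Registry` from the overview sketch above (assumes `import "context"`); `eventStream` is a stand-in for the extended TenantSettings stream, not the real RPC interface:

```go
// eventStream is a stand-in for the server side of the extended
// TenantSettings streaming RPC (hypothetical interface).
type eventStream interface {
	Send(cfg *TenantConfig) error
	Context() context.Context
}

// serveTenantConfig re-sends the tenant's config whenever the registry
// signals a change, until the stream's context is canceled.
func serveTenantConfig(r *Registry, id TenantID, stream eventStream) error {
	for {
		cfg, changed := r.Get(id)
		if cfg != nil {
			if err := stream.Send(cfg); err != nil {
				return err
			}
		}
		select {
		case <-changed: // registry updated; loop and re-send
		case <-stream.Context().Done():
			return stream.Context().Err()
		}
	}
}
```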

C. Propagate Service mode to tenants

  1. Decode service mode from system.tenants rangefeed conditionally (the column was added recently)
  2. Load up the value in the in-memory data structure (dependency on A.1)
  3. Define RPC response payload
  4. Use notif channel (dependency on A.2) in streaming RPC server handler
  5. Receive data in tenant connector, cache it there
  6. Define API for use by SQL server initialization (see the sketch after this list)
  7. Objective: rebase "server: honor and validate the service mode for SQL pods" #96144 on top of this thread and get it merged
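
For C.5/C.6 (and the "wait on the initial config" step from the overview), a hypothetical connector-side cache could look like this; all names are illustrative, not the actual connector API:

```go
// connectorCache caches the last service mode received over the stream and
// lets SQL server initialization block until the first value has arrived.
type connectorCache struct {
	mu          sync.Mutex
	serviceMode string
	haveInitial chan struct{} // closed once the first config arrives
	once        sync.Once
}

func newConnectorCache() *connectorCache {
	return &connectorCache{haveInitial: make(chan struct{})}
}

// setServiceMode is called by the stream receive loop (C.5).
func (c *connectorCache) setServiceMode(mode string) {
	c.mu.Lock()
	c.serviceMode = mode
	c.mu.Unlock()
	c.once.Do(func() { close(c.haveInitial) })
}

// WaitForServiceMode is the API consumed by SQL server initialization (C.6):
// it returns only once the initial config has been received.
func (c *connectorCache) WaitForServiceMode(ctx context.Context) (string, error) {
	select {
	case <-c.haveInitial:
	case <-ctx.Done():
		return "", ctx.Err()
	}
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.serviceMode, nil
}
```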

D. Propagate Injected SQL

  1. Goal 1: define the basic data model for injected SQL using Go structs and merge the initial "config profiles" PR for use by the rest of the team (this will de-risk "server: separate config defaults for serverless vs dedicated/SH secondary tenants" #94856)
  2. Add injected SQL fields to in-memory data structure (not populated yet)
  3. Define RPC response payload
  4. Use notif channel (dependency on A.2) in streaming RPC server handler
  5. Receive data in tenant connector, cache it there
  6. Define system tables and hook them up to the data structure via a polling loop (see the sketch after this list); leave TODOs and file an issue to add a rangefeed later
  7. Objective: rebase "cli,server: initial cluster configuration via job injection" #98380 on top of this thread and get it merged
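
A sketch of the interim polling loop from D.6, again with hypothetical names (assumes `context` and `time` are imported); the `load` callback stands in for a SQL query against the new system tables:

```go
// pollInjectedSQL periodically reloads injected SQL from the system tables
// into the registry.
// TODO: replace polling with a rangefeed on the system tables.
func pollInjectedSQL(
	ctx context.Context,
	r *Registry,
	load func(context.Context) (map[TenantID][]string, error),
) {
	ticker := time.NewTicker(30 * time.Second) // interval is illustrative
	defer ticker.Stop()
	for {
		select {
		case <-ticker.C:
			rows, err := load(ctx)
			if err != nil {
				continue // transient error: retry on the next tick
			}
			for id, stmts := range rows {
				cfg, _ := r.Get(id)
				updated := TenantConfig{}
				if cfg != nil {
					updated = *cfg
				}
				updated.InjectedSQL = stmts
				r.Update(id, &updated)
			}
		case <-ctx.Done():
			return
		}
	}
}
```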

E. "Exec response" API - eventually needed for multiregion serverless and other purposes

  1. Add response fields to in-memory data structure
  2. Define new unary RPC for use by tenants (see the sketch after this list)
  3. Filter injected SQL upon load based on already-complete payloads
  4. Define a new system table to contain responses
  5. Write-through from the response cache to the system table
  6. Also poll from system table to refresh cache (leave TODO and file issue to add rangefeed later)
  7. Objective: follow up on "cli,server: initial cluster configuration via job injection" #98380 to add a response callback upon job completion
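
A hedged sketch of the unary "exec response" RPC from E.2/E.5/E.6; all names are hypothetical, and persistence to the new system table is abstracted behind a callback:

```go
// ExecResponse is the payload a tenant reports back for a completed
// injected-SQL task.
type ExecResponse struct {
	TenantID TenantID
	TaskID   string
	Result   string
}

// execResponseServer caches responses in memory and writes them through to
// the new system table (E.5). A separate polling loop would also refresh the
// cache from the table (E.6), and loads of injected SQL would consult the
// cache to filter out already-completed payloads (E.3).
type execResponseServer struct {
	mu      sync.Mutex
	cache   map[string]ExecResponse // keyed by task ID
	persist func(context.Context, ExecResponse) error
}

func newExecResponseServer(persist func(context.Context, ExecResponse) error) *execResponseServer {
	return &execResponseServer{cache: make(map[string]ExecResponse), persist: persist}
}

// ReportExecResponse is the unary RPC handler (E.2).
func (s *execResponseServer) ReportExecResponse(ctx context.Context, resp ExecResponse) error {
	if err := s.persist(ctx, resp); err != nil { // write-through first (E.5)
		return err
	}
	s.mu.Lock()
	defer s.mu.Unlock()
	s.cache[resp.TaskID] = resp
	return nil
}
```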

Actual work (execution)

X1. Extract subset of #94856 that produces initial config for system tenant (D.1)
X2. Define common data structure and notification mechanism (A.1, A.2, C.2, D.2, E.1)
X3. Extend TenantSettings response payload to support more payload types (B.1, C.3, D.3)
X4. Extend TenantSettings RPC handler to propagate more data upon change notifications (B.2, C.4, D.4)
X5. Extend tenant connector to receive more data from TenantSettings (B.3, C.5, D.5)
X6. Advertise availability of caps tenant-side to team (B.5)
X7. Define new system tables and poll from them to refresh in-memory cache (D.6, E.4, E.5, E.6)
X8. Rebase #98380 to define injected SQL, close #94856 and related RFC (D.7)
X9. Rebase #96144, close #93145 and #83650. -- in progress
X10. Implement ExecResponse API, write-through to system table and filter injected SQL (E.2, E.3)
X11. Follow-up to #98380 to ping back the storage cluster via the ExecResponse API on job completion (E.7)

Jira issue: CRDB-25263

Epic CRDB-26691

@knz knz added the C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. label Mar 11, 2023
@knz knz changed the title from "server: meta-issue to organize the work in CRDB-23559" to "server: meta-issue to organize part of work in CRDB-23559" Mar 11, 2023
@knz knz changed the title from "server: meta-issue to organize part of work in CRDB-23559" to "server: meta-issue to organize part of the work in CRDB-23559" Mar 11, 2023
craig bot pushed a commit that referenced this issue Mar 16, 2023
98440: upgrades,jobs: stub the auto config runner r=stevendanna,ajwerner a=knz

Epic: CRDB-23559
Informs #98431.

This change is part of a larger project to perform auto-configuration in newly created clusters. One of the components will be the "auto config runner", a job which receives configuration to apply from the environment, creates other jobs to apply this configuration, and stores the execution results.

This commit stubs the job, which means allocating a cluster version for it and instantiating it via a cluster migration. The job is otherwise idle at this point.

Release note: None

Co-authored-by: Raphael 'kena' Poss <[email protected]>
craig bot pushed a commit that referenced this issue Apr 3, 2023
98459: server,autoconfig: automatic configuration via config tasks r=adityamaru a=knz

Epic: CRDB-23559
Informs #98431.
All commits but the last are from #98993.

This change introduces "auto config tasks", a mechanism through which
configuration payloads ("tasks") can be injected into a running SQL
service.

This is driven via the "auto config runner" job that was introduced in
the previous commit. The job listens for the arrival of new task
definitions via a `Provider` interface. When new tasks are known, and
previous tasks have completed, the runner creates a job for the next
task in line.
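
The commit text above suggests a loop along these lines; this is a hypothetical sketch (assumes `import "context"`), not the actual `Provider` interface or job machinery:

```go
// Task is an illustrative stand-in for an auto-config task definition.
type Task struct {
	ID  string
	SQL []string
}

// Provider is a stand-in for the interface the runner listens on.
type Provider interface {
	// NextTasks blocks until task definitions are available, then returns
	// the tasks that have not been executed yet, in order.
	NextTasks(ctx context.Context) ([]Task, error)
}

// runAutoConfig creates a job for the next pending task and waits for it to
// complete before asking the provider again.
func runAutoConfig(
	ctx context.Context,
	p Provider,
	runTaskJob func(context.Context, Task) error,
) error {
	for {
		tasks, err := p.NextTasks(ctx)
		if err != nil {
			return err
		}
		if len(tasks) == 0 {
			continue // spurious wakeup: keep listening
		}
		if err := runTaskJob(ctx, tasks[0]); err != nil {
			return err
		}
	}
}
```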

Release note: None

100476: server/drain: shut down SQL subsystems gracefully before releasing table leases r=JeffSwenson,rytaft a=knz

Needed for #99941 and #99958.
Epic: CRDB-23559

See individual commits for details.

100511: sqlccl: deflake TestGCTenantJobWaitsForProtectedTimestamps r=adityamaru,arulajmani a=knz

Fixes #94808

The tenant server must be shut down before the tenant record is removed; otherwise the tenant's SQL server will self-terminate by calling Stop() on its stopper, which in this case was shared with the parent cluster.

Release note: None

Co-authored-by: Raphael 'kena' Poss <[email protected]>
craig bot pushed a commit that referenced this issue Apr 20, 2023
98466: cli,server: static configuration profiles r=stevendanna a=knz

Epic: CRDB-23559
Informs #98431. 
Fixes #94856.
(Based off #98459)
Supersedes #98380.

This change introduces a mechanism through which an operator can
select a "configuration profile" via the command-line flag
`--config-profile` or env var `COCKROACH_CONFIG_PROFILE`. The SQL
initialization defined by the profile is applied during server
start-up.

The profiles are (currently) hardcoded inside CockroachDB.

The following profiles are predefined:

- `default`: no configuration.

- `multitenant+noapp`: no predefined `application` tenant, but with a
  predefined application tenant template that is used whenever a new
  tenant is defined. This config profile is meant for use with C2C
  replication target clusters.

- `multitenant+app+sharedservice`: shared-process multitenancy with
  pre-defined `application` tenant, based off the same configuration as
  `multitenant+noapp`.

Release note: None

101907: multitenant: misc fixes related to tenant capabilities r=arulajmani a=knz

See individual commits for details.

The last commit in particular probably addresses #99087.

Epic: CRDB-23559

101935: sql: add issue number to todo r=rharding6373 a=rharding6373

Epic: none
Informs: #101934

Release note: none

Co-authored-by: Raphael 'kena' Poss <[email protected]>
Co-authored-by: rharding6373 <[email protected]>
@knz knz changed the title from "server: meta-issue to organize part of the work in CRDB-23559" to "server: meta-issue to organize part of the work in CRDB-26691" Apr 21, 2023
craig bot pushed a commit that referenced this issue Jul 23, 2023
106414: server: properly check existence of tenant record r=yuzefovich,stevendanna a=knz

PR extracted from #105441.

Informs #83650.
Part of #98431.
Epic: CRDB-26691

Prior to this patch, the API on the tenant info watcher would return
an "undefined" metadata payload if called prior to the tenant record
being created.

This was most visible in the following scenario:

1. new cluster starts. watcher+rangefeed start successfully (tenant
   table empty)
2. tenant client connects. At this time there is no metadata for
   its tenant ID, so the metadata payload is available but empty.
3. CREATE TENANT is executed for the new tenant.
4. only then (later) the rangefeed introduces the metadata
   into the cache.

This is insufficient for use by the KV tenant client RPCs: there we
only want to accept incoming requests from tenant clients after we
actually have seen metadata for them.

This patch improves the situation by checking whether the tenant
record is present before returning bogus data to the SQL tenant
client.

Simultaneously, we add error handling logic in the SQL tenant client
to handle this case gracefully.
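
Illustratively, the client-side handling could look like the following retry loop (hypothetical names and sentinel error, assuming `context`, `errors`, and `time` are imported; the real APIs differ):

```go
// errTenantRecordNotFound is an illustrative sentinel for "no tenant record yet".
var errTenantRecordNotFound = errors.New("tenant record not found")

// waitForTenantRecord loops until the tenant metadata lookup succeeds,
// i.e. until the tenant record has been created on the storage cluster.
func waitForTenantRecord(ctx context.Context, lookup func(context.Context) error) error {
	for {
		err := lookup(ctx)
		if err == nil {
			return nil // record present: safe to finish startup
		}
		if !errors.Is(err, errTenantRecordNotFound) {
			return err // unrelated error: fail startup
		}
		select {
		case <-time.After(time.Second): // retry until the record appears
		case <-ctx.Done():
			return ctx.Err()
		}
	}
}
```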

In a mixed-version deployment (with and without this patch applied),
the following behaviors will occur if one starts the SQL tenant server
without a tenant record defined:


- Unpatched server: Late startup error in client with "database 1 does not exist".
- Patched server: Client loops forever, waiting for tenant record.

Behavior when the tenant record is created shortly *after* the SQL
tenant server starts up:

- Unpatched server: Inconsistent / non-deterministic behavior.
- Patched server: Clean startup of client after tenant record appears.



Co-authored-by: Raphael 'kena' Poss <[email protected]>
craig bot pushed a commit that referenced this issue Jul 26, 2023
105441: server: notify tenant servers of metadata changes r=yuzefovich,stevendanna a=knz

Informs (towards fixing) #98431.
Fixes #105721.
Epic: CRDB-26691

See the individual commits for details.


106325: sql/parser: parse CREATE PROCEDURE r=mgartner a=mgartner

#### sql/parser: parse CREATE PROCEDURE

`CREATE PROCEDURE` statements can now be parsed by the SQL parser.
Currently, executing `CREATE PROCEDURE` will produce an unimplemented
error.

Epic: CRDB-25388

Release note: None

#### sem/tree: rename function AST nodes

This commit renames UDF-related AST nodes with "function" in the name to
use the more general term "routine". The term "routine" is inclusive of
both UDFs and procedures (e.g., Postgres implements a `DROP ROUTINE`
statement which drops both UDFs and procedures), which is fitting
because we'll be using the same AST nodes for both `CREATE FUNCTION` and
`CREATE PROCEDURE` statements.

Release note: None

#### sem/tree: rename udf.go to create_routine.go

Release note: None


107470: server, ccl: adjust more tests to work with test tenant r=yuzefovich a=yuzefovich

Informs: #76378.
Epic: CRDB-18499.


107507:  kvserver: transfer lease when acquiring lease outside preferences r=erikgrinaker,andrewbaptist a=kvoli

First 2 commits are from #107505.

When a leaseholder is lost, any surviving replica may acquire the lease, even if it violates lease preferences. There are two main reasons for this: we need to elect a new Raft leader who will acquire the lease, which is agnostic to lease preferences, and there may not even be any surviving replicas that satisfy the lease preferences at all, so we don't want to keep the range unavailable while we try to figure this out (considering e.g. network timeouts can delay this for many seconds).

However, after acquiring a lease, we rely on the replicate queue to transfer the lease back to a replica that conforms with the preferences, which can take several minutes. In multi-region clusters, this can cause severe latency degradation if the lease is acquired in a remote region.

This PR will detect lease preference violations when a replica acquires a new lease, and eagerly enqueue it in the replicate queue for transfer (if possible). If this first transfer attempt fails, the replica will be sent to purgatory and retried soon after.
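
In sketch form (illustrative interfaces, not the actual kvserver types), the eager check amounts to:

```go
// replica and replicateQueue are illustrative stand-ins for kvserver types.
type replica interface {
	// SatisfiesLeasePreferences reports whether the current leaseholder
	// conforms to the configured lease preferences.
	SatisfiesLeasePreferences() bool
}

type replicateQueue interface {
	// MaybeAddAsync enqueues the replica for processing; failed transfers
	// land in purgatory and are retried soon after.
	MaybeAddAsync(r replica)
}

// onLeaseAcquired runs right after a replica acquires a lease and eagerly
// enqueues it for transfer if the new lease violates the preferences.
func onLeaseAcquired(r replica, q replicateQueue) {
	if !r.SatisfiesLeasePreferences() {
		q.MaybeAddAsync(r)
	}
}
```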

Additionally, two roachtests are added for lease preference conformance timing. The first test, `.../partial-first-preference-down`, takes down one of the preferred locality nodes (1/2). The second test, `.../full-first-preference-down`, takes down both the preferred locality nodes (2/2).

Resolves #106100.
Epic: none

Release note (bug fix): When losing a leaseholder and using lease preferences, the lease can be acquired by any other replica (regardless of lease preferences) in order to restore availability as soon as possible. The new leaseholder will now immediately check if it violates the lease preferences, and attempt to transfer the lease to a replica that satisfies the preferences if possible.


107512: sql: fix crdb_internal.encode_key in some contexts r=yuzefovich a=yuzefovich

Previously, we forgot to set `CatalogBuiltins` for the internal planner, which is used by `crdb_internal.encode_key` (which, in turn, is used to power `SHOW RANGE ... FOR ROW`), so evaluating it would lead to an error. For example, the internal planner is used in the backfill. This is now fixed.

Fixes: #106397.

Release note (bug fix): CockroachDB would previously return an error when using `SHOW RANGE ... FOR ROW ...` in `CREATE TABLE ... AS ...` construct, and this is now fixed.

107537: workload/ycsb: default --read-modify-write-in-txn to false r=erikgrinaker a=nvanbenschoten

The `read-modify-write-in-txn` flag was added to ycsb in #103117. This commit changes the default of the flag from true to false. This makes the default configuration of ycsb more closely mirror the official ycsb implementation, to avoid confusion when external users run the workload and compare.

We saw in #103117 and in #107517 that this improves throughput substantially. We expect to see the same in roachperf once this change is merged. After merging the PR, I will add a roachperf annotation.

Epic: None
Release note: None

107549: persistedsqlstats: add AOST clause to statement statistics size check r=maryliag a=THardy98

Epic: None

This change adds an AOST clause to the statement statistics size check.
This helps mitigate contention on the `system.statement_statistics`
table, which has caused the SQL stats compaction job to fail.

Release note: None

Co-authored-by: Raphael 'kena' Poss <[email protected]>
Co-authored-by: Marcus Gartner <[email protected]>
Co-authored-by: Yahor Yuzefovich <[email protected]>
Co-authored-by: Austen McClernon <[email protected]>
Co-authored-by: Nathan VanBenschoten <[email protected]>
Co-authored-by: Thomas Hardy <[email protected]>