Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

spanconfig: add mechanism to alter the buffer limit used by the KVSubscriber #77539

Open
Tracked by #81009 ...
arulajmani opened this issue Mar 9, 2022 · 4 comments
Open
Tracked by #81009 ...
Labels
C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) T-kv KV Team

Comments

@arulajmani
Copy link
Collaborator

arulajmani commented Mar 9, 2022

Is your feature request related to a problem? Please describe.

The KVSubscriber uses a rangefeedcache.Watcher over system.span_configurations to construct and incrementally maintain in-memory span configuration state. We limit this watcher's buffer size statically, here:

spanConfig.subscriber = spanconfigkvsubscriber.New(
clock,
rangeFeedFactory,
keys.SpanConfigurationsTableID,
1<<20, /* 1 MB */
fallbackConf,
spanConfigKnobs,
)
}

We coarsely estimate each row in system.span_configurations to be 5KB. Given this, limiting the buffer size to 1MB seems sane for incremental updates. However, this limit is also enforced when performing an initial scan of the table as well, which is no good. We'll have a separate issue to fix that, but independent of it, it still seems prudent to have a mechanism to be able to alter the 1MB limit for the KVSubscriber's buffer in the wild as an escape hatch.

Describe the solution you'd like

Let's introduce an env var that dictates the buffer size to use when initializing the KVSubscriber. We could introduce a cluster setting as well, though that might be a bit more involved.

cc @ajwerner @irfansharif

Jira issue: CRDB-13643

@arulajmani arulajmani added C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) branch-master Failures and bugs on the master branch. release-blocker Indicates a release-blocker. Use with branch-release-2x.x label to denote which branch is blocked. labels Mar 9, 2022
@blathers-crl blathers-crl bot added the T-kv KV Team label Mar 9, 2022
@irfansharif irfansharif self-assigned this Mar 9, 2022
@irfansharif
Copy link
Contributor

irfansharif commented Mar 9, 2022

We use the underlying rangefeedcache library at various places now, not just the KVSubscriber. Do we want knobs for each usage? Or are the other initial table scans not as worrying?

Let's introduce an env var that dictates the buffer size to use when initializing the KVSubscriber. We could introduce a cluster setting as well, though that might be a bit more involved.

Given the KVSubscriber is a per-store thing, probably we want an the cluster setting. Changing out the env-var cluster wide would require a full cluster restart -- not the best during incidents that want such a thing.

@irfansharif irfansharif changed the title spanconfig: add env var to alter the buffer limit used by the KVSubscriber spanconfig: add mechanism to alter the buffer limit used by the KVSubscriber Mar 9, 2022
@ajwerner
Copy link
Contributor

ajwerner commented Mar 9, 2022

Let's set out to make the buffer size dynamic in the rangefeedcache library.

@irfansharif
Copy link
Contributor

irfansharif commented Mar 21, 2022

I'm still planning to make this limit dynamic and backport, but with #77687 this issue is no longer a release blocker. Concretely: running into buffer limits during the initial scan is unrecoverable, which #78148 fixes. Running into buffer limits during the incremental phase is entirely recoverable with a retry, which this issue would help reduce the occurrence of in practice.

@irfansharif irfansharif removed the release-blocker Indicates a release-blocker. Use with branch-release-2x.x label to denote which branch is blocked. label Mar 21, 2022
@irfansharif irfansharif removed their assignment Apr 20, 2022
@irfansharif irfansharif removed the branch-master Failures and bugs on the master branch. label Apr 20, 2022
Copy link

We have marked this issue as stale because it has been inactive for
18 months. If this issue is still relevant, removing the stale label
or adding a comment will keep it active. Otherwise, we'll close it in
10 days to keep the issue queue tidy. Thank you for your contribution
to CockroachDB!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) T-kv KV Team
Projects
None yet
Development

No branches or pull requests

4 participants