changefeedccl: changefeed concurrency and frontier observability #67206 #67268
Conversation
@HonoreDB I pushed the PR you already looked at to another branch. Can you take another look?
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @HonoreDB and @miretskiy)
pkg/ccl/changefeedccl/kvfeed/scanner.go, line 285 at r1 (raw file):
// This is all hand-wavy: 3 per node used to be the default for a very long time.
// However, this could get out of hand if the clusters are large.
// So cap the max to an arbitrary value of 100.
I like the addition of the cap (although perhaps moving it to a constant might be nice).
One question I have is why we want to scale by the node count at this point at all. We will naturally be scaling the number of scans that happen by the node count because change aggregators for different ranges are distributed across all of the nodes, so why account for it again here?
In a conversation about this, one item that came up is that in the case of core changefeeds, this makes a bit more sense.
To keep it backportable I think it makes sense to leave it as you have it here, but perhaps this is something we could follow-up on.
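For concreteness, here is a minimal sketch of the behavior under discussion: 3 concurrent scan requests per node, capped, with an explicit setting taking precedence. The names `maxConcurrentScanRequests` and `defaultScanRequestConcurrency` are hypothetical and not the actual scanner.go code.

```go
package main

import "fmt"

// Hypothetical cap; the discussion above suggests hoisting the literal 100
// into a named constant along these lines.
const maxConcurrentScanRequests = 100

// defaultScanRequestConcurrency is an illustrative helper, not the actual
// CockroachDB code: an explicit setting wins; otherwise use 3 per node,
// capped so very large clusters do not drive concurrency arbitrarily high.
func defaultScanRequestConcurrency(nodeCount, settingOverride int) int {
	if settingOverride > 0 {
		return settingOverride
	}
	n := 3 * nodeCount
	if n > maxConcurrentScanRequests {
		n = maxConcurrentScanRequests
	}
	if n < 1 {
		n = 1
	}
	return n
}

func main() {
	fmt.Println(defaultScanRequestConcurrency(5, 0))   // 15
	fmt.Println(defaultScanRequestConcurrency(200, 0)) // capped at 100
	fmt.Println(defaultScanRequestConcurrency(5, 8))   // explicit setting: 8
}
```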
Reviewed 3 of 3 files at r1, 1 of 2 files at r2.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @miretskiy and @stevendanna)
pkg/ccl/changefeedccl/changefeed_processors.go, line 1291 at r2 (raw file):
}()
cf.metrics.Flushes.Inc(1)
Is Flushes the right metric to be incrementing here?
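The kind of separation being asked about could look roughly like the sketch below: count frontier updates with a dedicated counter rather than reusing Flushes. The `metrics` struct and `onFrontierAdvanced` method are illustrative stand-ins, not the actual changefeed_processors.go code, which uses the metric package's counters rather than raw atomics.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// metrics is an illustrative stand-in for the changefeed metrics struct.
type metrics struct {
	Flushes         int64
	FrontierUpdates int64
}

// onFrontierAdvanced sketches the point raised here: when the change
// frontier advances, bump a dedicated FrontierUpdates counter instead of
// reusing Flushes.
func (m *metrics) onFrontierAdvanced() {
	atomic.AddInt64(&m.FrontierUpdates, 1)
}

func main() {
	var m metrics
	m.onFrontierAdvanced()
	fmt.Println(atomic.LoadInt64(&m.FrontierUpdates)) // 1
}
```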
pkg/ccl/changefeedccl/kvfeed/scanner.go, line 285 at r1 (raw file):
Previously, stevendanna (Steven Danna) wrote…
I like the addition of the cap (although perhaps moving it to a constant might be nice).
One question I have is why we want to scale by the node count at this point at all. We will naturally be scaling the number of scans that happen by the node count because change aggregators for different ranges are distributed across all of the nodes, so why account for it again here?
In a conversation about this, one item that came up is that in the case of core changefeeds, this makes a bit more sense.
To keep it backportable I think it makes sense to leave it as you have it here, but perhaps this is something we could follow-up on.
I think the rationale is that it's different kinds of nodes that can in theory be scaled independently? But fair point, as-is this doesn't really help with that, it just implicitly assumes that our ability to serve ExportRequests is O(number of changefeed aggregator nodes * total number of nodes =~ number of nodes squared), if I understand correctly, which grows faster than our actual capacity.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @HonoreDB, @miretskiy, and @stevendanna)
pkg/ccl/changefeedccl/kvfeed/scanner.go, line 285 at r1 (raw file):
Previously, HonoreDB (Aaron Zinger) wrote…
I think the rationale is that it's different kinds of nodes that can in theory be scaled independently? But fair point, as-is this doesn't really help with that, it just implicitly assumes that our ability to serve ExportRequests is O(number of changefeed aggregator nodes * total number of nodes =~ number of nodes squared), if I understand correctly, which grows faster than our actual capacity.
I would like to keep things as they are so that this change can be backported.
We should pass an argument to kvfeed indicating whether or not the changefeed is a core-style changefeed.
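As a rough illustration of that suggestion, passing such a flag could let the scanner pick its concurrency policy per changefeed type. The `kvFeedConfig` type, `IsCoreChangefeed` field, and `scanConcurrency` helper are hypothetical names, not the real kvfeed API.

```go
package kvfeedsketch

// kvFeedConfig is a hypothetical configuration illustrating the idea of
// telling the kvfeed whether this is a core (sinkless) changefeed.
type kvFeedConfig struct {
	IsCoreChangefeed bool
	NodeCount        int
}

// scanConcurrency sketches one way such a flag could be used: a core
// changefeed runs its single aggregator on one node, so scaling by cluster
// size is more defensible there, while distributed changefeeds already fan
// out their scans across nodes.
func scanConcurrency(cfg kvFeedConfig) int {
	if cfg.IsCoreChangefeed {
		return 3 * cfg.NodeCount
	}
	return 3
}
```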
Stop relying on ExportRequestLimit to determine the number of concurrent export requests, and introduce a dedicated ScanRequestLimit setting. If the setting is specified, that setting is used; otherwise, the default value is computed as 3 * (number of nodes in the cluster), which is the old behavior, but we cap this number so that concurrency does not get out of hand if running in a very large cluster. Fixes cockroachdb#67190.
Release Notes: Provide better configurability of scan request concurrency. Scan requests are issued by changefeeds during the backfill.
Add a metric to keep track of the number of frontier updates in the changefeed. Add logging when job progress updates take an excessive amount of time. Fixes cockroachdb#67192.
Release Notes: None
Force-pushed from fefdf03 to e5a133e
Reviewed 3 of 3 files at r3.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @HonoreDB, @miretskiy, and @stevendanna)
tftr
Build succeeded
Stop relying on ExportRequestLimit to determine the number of concurrent
export requests, and introduce a dedicated ScanRequestLimit setting.
If the setting is specified, that setting is used; otherwise, the default
value is computed as 3 * (number of nodes in the cluster), which is the
old behavior, but we cap this number so that concurrency does not get
out of hand if running in a very large cluster.
Fixes #67190
Improve observability of change frontier updates.
Add a metric to keep track of the number of frontier updates in the
changefeed. Add logging when job progress updates take an excessive amount
of time.
Fixes #67192
Release Notes: Provide better configurability of scan request
concurrency. Scan requests are issued by changefeeds during the
backfill.
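As a rough sketch of the "log slow job progress updates" behavior mentioned in the second commit message above: time the update and warn when it exceeds a threshold. The threshold value, function name, and log wording here are illustrative assumptions, not the actual implementation.

```go
package main

import (
	"log"
	"time"
)

// slowProgressUpdateThreshold is an illustrative threshold; the real value
// and logging mechanism in the PR may differ.
const slowProgressUpdateThreshold = 30 * time.Second

// updateJobProgress wraps a progress update and warns when it takes an
// excessive amount of time.
func updateJobProgress(doUpdate func() error) error {
	start := time.Now()
	err := doUpdate()
	if d := time.Since(start); d > slowProgressUpdateThreshold {
		log.Printf("changefeed job progress update took %s (threshold %s)",
			d, slowProgressUpdateThreshold)
	}
	return err
}

func main() {
	_ = updateJobProgress(func() error {
		time.Sleep(10 * time.Millisecond) // stand-in for the real update call
		return nil
	})
}
```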