-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
kvstreamer: figure out the story with the cost model in serverless #82167
Comments
Ran some experiments with all TPCH queries that have index and/or lookup joins, with the streamer OFF and the streamer ON. The prediction of the cost model in both scenarios are very similar to each other:
|
The actual CPU usage is also very similar (each query was run 3 times first with the streamer OFF, then 3 times with the streamer ON with somewhat non-deterministic interval in-between runs (me writing down the metrics numbers)): Overall, looks like that thanks to the recent work from @andy-kimball of updating the cost model we don't need to do anything special for the streamer. We can actually make the setting that controls whether the streamer is used TenantWritable (which I'll do shortly). |
83265: kvserver/gc: remove range tombstones during GC r=erikgrinaker a=aliher1911 First commit adds support for range tombstones when removing point keys. Range tombstone will have the same effect as a point tombstone within the mvcc key history. Second commit adds support for removal of range tombstones below GC threshold. 84493: sql: add parsing support for SHOW CREATE FUNCTION statement r=mgartner a=mgartner This commit adds a `SHOW CREATE FUNCTION` statement to the SQL grammar. This statement is not yet implemented and executing it results in an error. Release note: None 84504: execinfra: allow tenants to disable the streamer r=yuzefovich a=yuzefovich Previously, we marked the setting that controls whether the streamer is used as `TenantReadOnly` since we were not sure whether the streamer fit well into the tenant cost model. Recently we revamped the cost model so that it can now correctly predict the usage of the hardware resources by the streamer, so at this point it seems safe to mark the setting `TenantWritable`. Informs: #82167 Release note: None 84515: ui: fix sorting of explain plans r=maryliag a=maryliag Previously, the sorting on the plans on the Explain Plans tab on Statement Details wasn't working. This commit adds the missing code required to sort that table. Fixes #84079 https://www.loom.com/share/0f0ed0e1a8d04fc88def3b2460d617e6 Release note (bug fix): Sorting on the plans table inside the Statement Details page is now properly working. 84519: clusterversion: remove some older versions r=yuzefovich a=yuzefovich Release note: None 84601: rowexec: use OutOfOrder mode of streamer for lookup joins with ordering r=yuzefovich a=yuzefovich Currently, the join reader always restores the required order for lookup joins on its own since all looked up rows are buffered before any output row is emitted. This observation allows us to use the OutOfOrder mode of the streamer in such scenarios, so this commit makes such a change. Previously, we would effectively maintain the order twice - both in the streamer and in the join reader, and the former is redundant. This will change in the future, but for now we can use the more-efficient mode. ``` name old time/op new time/op delta LookupJoinEqColsAreKeyOrdering/Cockroach-24 6.64ms ± 1% 6.48ms ± 1% -2.34% (p=0.000 n=10+10) LookupJoinEqColsAreKeyOrdering/MultinodeCockroach-24 7.89ms ± 1% 7.75ms ± 1% -1.80% (p=0.000 n=10+10) LookupJoinOrdering/Cockroach-24 9.01ms ± 3% 8.88ms ± 4% ~ (p=0.218 n=10+10) LookupJoinOrdering/MultinodeCockroach-24 12.1ms ± 4% 12.0ms ± 3% ~ (p=0.393 n=10+10) name old alloc/op new alloc/op delta LookupJoinEqColsAreKeyOrdering/Cockroach-24 1.68MB ± 1% 1.60MB ± 1% -4.93% (p=0.000 n=10+10) LookupJoinEqColsAreKeyOrdering/MultinodeCockroach-24 2.37MB ± 2% 2.29MB ± 2% -3.11% (p=0.000 n=10+10) LookupJoinOrdering/Cockroach-24 1.75MB ± 1% 1.66MB ± 1% -5.01% (p=0.000 n=10+9) LookupJoinOrdering/MultinodeCockroach-24 2.36MB ± 1% 2.25MB ± 1% -4.68% (p=0.000 n=8+10) name old allocs/op new allocs/op delta LookupJoinEqColsAreKeyOrdering/Cockroach-24 10.0k ± 1% 10.0k ± 1% ~ (p=0.278 n=10+9) LookupJoinEqColsAreKeyOrdering/MultinodeCockroach-24 14.3k ± 1% 14.3k ± 1% ~ (p=0.470 n=10+10) LookupJoinOrdering/Cockroach-24 12.4k ± 1% 12.5k ± 1% ~ (p=0.780 n=10+10) LookupJoinOrdering/MultinodeCockroach-24 17.1k ± 1% 17.0k ± 1% ~ (p=0.494 n=10+10) ``` Addresses: #82159. Release note: None 84607: bench: add a benchmark of index join with ordering r=yuzefovich a=yuzefovich Release note: None Co-authored-by: Oleg Afanasyev <[email protected]> Co-authored-by: Marcus Gartner <[email protected]> Co-authored-by: Yahor Yuzefovich <[email protected]> Co-authored-by: Marylia Gutierrez <[email protected]>
The fact that the streamer issues BatchRequests dynamically, based on the available memory and the footprint of the responses, breaks the assumptions of the current cost model in the serverless setup. We need to figure out how to incorporate the streamer into the model. Possibly, we could disable the "streaming" nature of the streamer to a certain degree - for example, we would only issue new BatchRequests once the whole budget is opened up (i.e. wait until all received responses have been processed and released by the streamer's use).
Jira issue: CRDB-16465
The text was updated successfully, but these errors were encountered: