Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kvstreamer: figure out the story with the cost model in serverless #82167

Closed
Tracked by #54680
yuzefovich opened this issue May 31, 2022 · 2 comments
Closed
Tracked by #54680

kvstreamer: figure out the story with the cost model in serverless #82167

yuzefovich opened this issue May 31, 2022 · 2 comments
Assignees
Labels
C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. sync-me sync-me-5 T-sql-queries SQL Queries Team

Comments

@yuzefovich
Copy link
Member

yuzefovich commented May 31, 2022

The fact that the streamer issues BatchRequests dynamically, based on the available memory and the footprint of the responses, breaks the assumptions of the current cost model in the serverless setup. We need to figure out how to incorporate the streamer into the model. Possibly, we could disable the "streaming" nature of the streamer to a certain degree - for example, we would only issue new BatchRequests once the whole budget is opened up (i.e. wait until all received responses have been processed and released by the streamer's use).

Jira issue: CRDB-16465

@yuzefovich yuzefovich added the C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. label May 31, 2022
@yuzefovich yuzefovich self-assigned this May 31, 2022
@blathers-crl blathers-crl bot added the T-sql-queries SQL Queries Team label May 31, 2022
@mari-crl mari-crl added sync-me and removed sync-me labels Jun 2, 2022
@yuzefovich
Copy link
Member Author

Ran some experiments with all TPCH queries that have index and/or lookup joins, with the streamer OFF and the streamer ON. The prediction of the cost model in both scenarios are very similar to each other:

TPC-H Q2 (non-streamer)		0.45
TPC-H Q2 (streamer)		0.45
TPC-H Q3 (non-streamer)		7.81
TPC-H Q3 (streamer)		7.87
TPC-H Q4 (non-streamer)		3.65
TPC-H Q4 (streamer)		3.65
TPC-H Q5 (non-streamer)		8.13
TPC-H Q5 (streamer)		8.13
TPC-H Q6 (non-streamer)		26.70
TPC-H Q6 (streamer)		26.69
TPC-H Q7 (non-streamer)		18.34
TPC-H Q7 (streamer)		18.35
TPC-H Q8 (non-streamer)		2.73
TPC-H Q8 (streamer)		2.73
TPC-H Q9 (non-streamer)		0.51
TPC-H Q9 (streamer)		0.51
TPC-H Q10 (non-streamer)	4.01
TPC-H Q10 (streamer)		4.01
TPC-H Q11 (non-streamer)	1.93
TPC-H Q11 (streamer)		1.95
TPC-H Q12 (non-streamer)	27.58
TPC-H Q12 (streamer)		27.56
TPC-H Q14 (non-streamer)	2.55
TPC-H Q14 (streamer)		2.55
TPC-H Q15 (non-streamer)	13.38
TPC-H Q15 (streamer)		13.37
TPC-H Q16 (non-streamer)	1.46
TPC-H Q16 (streamer)		1.46
TPC-H Q17 (non-streamer)	1.26
TPC-H Q17 (streamer)		1.26
TPC-H Q20 (non-streamer)	43.92
TPC-H Q20 (streamer)		43.91
TPC-H Q21 (non-streamer)	17.17
TPC-H Q21 (streamer)		17.65
TPC-H Q22 (non-streamer)	1.32
TPC-H Q22 (streamer)		1.28

@yuzefovich
Copy link
Member Author

The actual CPU usage is also very similar (each query was run 3 times first with the streamer OFF, then 3 times with the streamer ON with somewhat non-deterministic interval in-between runs (me writing down the metrics numbers)):
Screen Shot 2022-07-15 at 12 29 58 PM

Overall, looks like that thanks to the recent work from @andy-kimball of updating the cost model we don't need to do anything special for the streamer. We can actually make the setting that controls whether the streamer is used TenantWritable (which I'll do shortly).

craig bot pushed a commit that referenced this issue Jul 18, 2022
83265: kvserver/gc: remove range tombstones during GC r=erikgrinaker a=aliher1911


First commit adds support for range tombstones when removing point keys. Range tombstone will have the same effect as a point tombstone within the mvcc key history.

Second commit adds support for removal of range tombstones below GC threshold.


84493: sql: add parsing support for SHOW CREATE FUNCTION statement r=mgartner a=mgartner

This commit adds a `SHOW CREATE FUNCTION` statement to the SQL grammar.
This statement is not yet implemented and executing it results in an
error.

Release note: None

84504: execinfra: allow tenants to disable the streamer r=yuzefovich a=yuzefovich

Previously, we marked the setting that controls whether the streamer is
used as `TenantReadOnly` since we were not sure whether the streamer fit
well into the tenant cost model. Recently we revamped the cost model so
that it can now correctly predict the usage of the hardware resources by
the streamer, so at this point it seems safe to mark the setting
`TenantWritable`.

Informs: #82167

Release note: None

84515: ui: fix sorting of explain plans r=maryliag a=maryliag

Previously, the sorting on the plans on the
Explain Plans tab on Statement Details wasn't working.
This commit adds the missing code required to sort
that table.

Fixes #84079

https://www.loom.com/share/0f0ed0e1a8d04fc88def3b2460d617e6

Release note (bug fix): Sorting on the plans table inside the
Statement Details page is now properly working.

84519: clusterversion: remove some older versions r=yuzefovich a=yuzefovich

Release note: None

84601: rowexec: use OutOfOrder mode of streamer for lookup joins with ordering r=yuzefovich a=yuzefovich

Currently, the join reader always restores the required order for lookup
joins on its own since all looked up rows are buffered before any output
row is emitted. This observation allows us to use the OutOfOrder mode of
the streamer in such scenarios, so this commit makes such a change.
Previously, we would effectively maintain the order twice - both in the
streamer and in the join reader, and the former is redundant. This will
change in the future, but for now we can use the more-efficient mode.

```
name                                                  old time/op    new time/op    delta
LookupJoinEqColsAreKeyOrdering/Cockroach-24             6.64ms ± 1%    6.48ms ± 1%  -2.34%  (p=0.000 n=10+10)
LookupJoinEqColsAreKeyOrdering/MultinodeCockroach-24    7.89ms ± 1%    7.75ms ± 1%  -1.80%  (p=0.000 n=10+10)
LookupJoinOrdering/Cockroach-24                         9.01ms ± 3%    8.88ms ± 4%    ~     (p=0.218 n=10+10)
LookupJoinOrdering/MultinodeCockroach-24                12.1ms ± 4%    12.0ms ± 3%    ~     (p=0.393 n=10+10)

name                                                  old alloc/op   new alloc/op   delta
LookupJoinEqColsAreKeyOrdering/Cockroach-24             1.68MB ± 1%    1.60MB ± 1%  -4.93%  (p=0.000 n=10+10)
LookupJoinEqColsAreKeyOrdering/MultinodeCockroach-24    2.37MB ± 2%    2.29MB ± 2%  -3.11%  (p=0.000 n=10+10)
LookupJoinOrdering/Cockroach-24                         1.75MB ± 1%    1.66MB ± 1%  -5.01%  (p=0.000 n=10+9)
LookupJoinOrdering/MultinodeCockroach-24                2.36MB ± 1%    2.25MB ± 1%  -4.68%  (p=0.000 n=8+10)

name                                                  old allocs/op  new allocs/op  delta
LookupJoinEqColsAreKeyOrdering/Cockroach-24              10.0k ± 1%     10.0k ± 1%    ~     (p=0.278 n=10+9)
LookupJoinEqColsAreKeyOrdering/MultinodeCockroach-24     14.3k ± 1%     14.3k ± 1%    ~     (p=0.470 n=10+10)
LookupJoinOrdering/Cockroach-24                          12.4k ± 1%     12.5k ± 1%    ~     (p=0.780 n=10+10)
LookupJoinOrdering/MultinodeCockroach-24                 17.1k ± 1%     17.0k ± 1%    ~     (p=0.494 n=10+10)
```

Addresses: #82159.

Release note: None

84607: bench: add a benchmark of index join with ordering r=yuzefovich a=yuzefovich

Release note: None

Co-authored-by: Oleg Afanasyev <[email protected]>
Co-authored-by: Marcus Gartner <[email protected]>
Co-authored-by: Yahor Yuzefovich <[email protected]>
Co-authored-by: Marylia Gutierrez <[email protected]>
@mgartner mgartner moved this to Done in SQL Queries Jul 24, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. sync-me sync-me-5 T-sql-queries SQL Queries Team
Projects
Archived in project
Development

No branches or pull requests

3 participants