-
Notifications
You must be signed in to change notification settings - Fork 468
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
UNSPLIT AT statement page and diagrams #5360
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall direction LGTM, I think we have a few things to address re: terminology ("remove split enforcement" vs. "unsplit") and interaction of this with range merge settings which are not mentioned at all here or in SPLIT AT (yet). See below for more details.
@@ -0,0 +1,25 @@ | |||
<div><svg width="631" height="81"> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This file does not appear to be included from any pages. Unless I'm missing it?
v19.2/alter-index.md
Outdated
@@ -17,3 +17,4 @@ Subcommand | Description | |||
[`CONFIGURE ZONE`](configure-zone.html) | [Configure replication zones](configure-replication-zones.html) for an index. | |||
[`RENAME`](rename-index.html) | Change the name of an index. | |||
[`SPLIT AT`](split-at.html) | Force a key-value layer range split at the specified row in the index. | |||
[`UNSPLIT AT`](unsplit-at.html) | Remove a key-value layer range split enforcement at a specified row in the index. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should add the "new in 19.2" flag to all of these new links to UNSPLIT AT, the same as we've done on the 'SQL Statements' page.
v19.2/alter-table.md
Outdated
@@ -35,6 +35,7 @@ Subcommand | Description | Can combine with other subcommands? | |||
[`RENAME CONSTRAINT`](rename-constraint.html) | Change constraints columns. | Yes | |||
[`RENAME TABLE`](rename-table.html) | Change the names of tables. | No | |||
[`SPLIT AT`](split-at.html) | Force a key-value layer range split at the specified row in the table. | No | |||
[`UNSPLIT AT`](unsplit-at.html) | Remove a key-value layer range split enforcement at a specified row in the table. | No |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment as previously re: "new in 19.2" flag.
v19.2/split-at.md
Outdated
@@ -252,3 +252,4 @@ SHOW RANGES FROM TABLE t; | |||
- [Distribution Layer](architecture/distribution-layer.html) | |||
- [Replication Layer](architecture/replication-layer.html) | |||
- [`SHOW JOBS`](show-jobs.html) | |||
- [`UNSPLIT AT`](unsplit-at.html) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment as previously re: "new in 19.2" flag. Also, I think the SPLIT AT page should be updated to link to the Range Merges doc and to note that you have to explicitly turn off Range Merges if you want to use SPLIT AT. Perhaps that should be a separate issue? UPDATE: I did a search and found that #4960 has been filed to address this. I've added it to the 19.2 milestone as a P-1.
v19.2/unsplit-at.md
Outdated
@@ -0,0 +1,192 @@ | |||
--- | |||
title: UNSPLIT AT | |||
summary: The UNSPLIT AT statement reverts a manual key-value layer range split at a specified row in a table or index. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This summary could be updated to use the same "split enforcement" terminology as the doc. I must admit I find "reverts a range split" more understandable than "removes a split enforcement", but I guess the latter is a more correct description of what happens?
v19.2/unsplit-at.md
Outdated
(4 rows) | ||
~~~ | ||
|
||
The `split_enforced_until` column is now `NULL` for all ranges in the `users` table. The table is still split into ranges at `'chicago'`, `'new york'`, and `'seattle'`, but the split is no longer enforced. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be good to mention that "now the system will be free to re-merge the ranges at its convenience once you re-enable the kv.range_merge.queue_enabled
setting to true
. At least according to the Range Merges doc you will need to have turned off the range merge queue to even be able to SPLIT AT - if I am reading that correctly? I think we should at least make mention of that in the overview (I'll edit the comment on the overview as well).
v19.2/unsplit-at.md
Outdated
~~~ | ||
~~~ | ||
range_id | start_key | start_pretty | end_key | end_pretty | database_name | table_name | index_name | replicas | learner_replicas | split_enforced_until | lease_holder | ||
+----------+-------------------+------------------+-------------------+------------------+---------------+------------+-------------+----------+------------------+----------------------------------+--------------+ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that we have more updates to do re: split_enforced_until
incoming via cockroachdb/cockroach#38004 (namely a WITH EXPIRATION option to SPLIT AT) but that will happen via another ticket.
v19.2/unsplit-at.md
Outdated
(5 rows) | ||
~~~ | ||
|
||
Now unsplit the index: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After looking at this even longer I think we should use the "remove split enforcement, let the range merge queue run AKA unsplit" explanation in the intro and then just say "unsplit" everywhere else down-page.
v19.2/unsplit-at.md
Outdated
(5 rows) | ||
~~~ | ||
|
||
The `split_enforced_until` column is now `NULL` for all ranges in the `rides` table. The table is still split into ranges at `25.00`, `50.00`, and `75.00`, but the split is no longer enforced. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment as above re: giving a little more color about when/how the system merges the ranges back together.
|
||
|
||
## See also | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should also link to the Range Merges doc here.
ac0dc66
to
ea3fad7
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @andreimatei, @nvanbenschoten, and @rmloveland)
_includes/v19.2/sql/diagrams/unsplit.html, line 1 at r1 (raw file):
Previously, rmloveland (Rich Loveland) wrote…
This file does not appear to be included from any pages. Unless I'm missing it?
No you didn't miss anything... it's not included anywhere. I'll delete it.
v19.2/alter-index.md, line 20 at r1 (raw file):
Previously, rmloveland (Rich Loveland) wrote…
I think we should add the "new in 19.2" flag to all of these new links to UNSPLIT AT, the same as we've done on the 'SQL Statements' page.
Done.
v19.2/alter-table.md, line 38 at r1 (raw file):
Previously, rmloveland (Rich Loveland) wrote…
Same comment as previously re: "new in 19.2" flag.
Done.
v19.2/split-at.md, line 255 at r1 (raw file):
Done.
I never ran into any errors or warnings when I tried to split tables manually and I never changed any settings... In #4960 @andreimatei writes:
In 19.2 I believe the above limitation goes away and the new with expiration option should be documented. Introduced in cockroachdb/cockroach#38004 by @jeffrey-xiao.
So I think this requirement has been lifted (consistent with what @andreimatei wrote in #4960). Also, I didn't run into any warnings or any setting to prevent me from splitting any ranges, and I never set any settings prior to splitting/unsplitting. I did this both in an insecure local cluster (cockroach start --insecure
) and with cockroach demo
.
v19.2/unsplit-at.md, line 3 at r1 (raw file):
Previously, rmloveland (Rich Loveland) wrote…
This summary could be updated to use the same "split enforcement" terminology as the doc. I must admit I find "reverts a range split" more understandable than "removes a split enforcement", but I guess the latter is a more correct description of what happens?
Done.
v19.2/unsplit-at.md, line 7 at r1 (raw file):
make "range split" the link text
Done.
describes what happens when the "split enforcement" is removed
Done.
is there some cluster setting that would control how slowly or quickly this merging will happen?
Not that I am aware of... @nvanbenschoten Can you control range merges in any way (other than preventing them with a manual split) with cluster settings or something else?
a callout that links to the Range Merges doc and mentions that the range merge cluster setting.
I think this requirement has been lifted (consistent with what @andreimatei wrote in #4960). Also, I didn't run into any warnings or any setting to prevent me from splitting any ranges, and I never set any settings prior to splitting/unsplitting. I did this both in an insecure local cluster (cockroach start --insecure
) and with cockroach demo
.
v19.2/unsplit-at.md, line 17 at r1 (raw file):
Previously, rmloveland (Rich Loveland) wrote…
These diagrams look great!
:)
v19.2/unsplit-at.md, line 27 at r1 (raw file):
Previously, rmloveland (Rich Loveland) wrote…
Same comment as above re: standardizing on whether we are saying "unsplit this" vs. "removing split enforcements". If we stick with "remove split enforcement" I think we should describe how that works generally in the overview as noted in previous comment.
Done.
v19.2/unsplit-at.md, line 28 at r1 (raw file):
Previously, rmloveland (Rich Loveland) wrote…
Same as above re: "unsplit" vs. "remove split enforcement".
Done.
v19.2/unsplit-at.md, line 29 at r1 (raw file):
Previously, rmloveland (Rich Loveland) wrote…
Same as above re: "unsplit" vs. "remove split enforcement". Another option is to clarify in the intro that "unsplit" actually means remove split enforcement, but hereafter we will just say unsplit", which IMO would be a fine way to address this while simplifying the verbiage.
I just explained a little more what we mean by "unsplit", but also opt for "remove split enforcement" where it is clarifying elsewhere.
v19.2/unsplit-at.md, line 37 at r1 (raw file):
Previously, rmloveland (Rich Loveland) wrote…
Could the text of
--nodes
and--demo-locality
be made into links to the specific flags? This can be accomplished by adding an anchor<a name="nodes-flag"></a>
inside the table on thecockroach demo
page. Also, this says "the --nodes ... tags" but I think you meant "the --nodes ... flags" ??? Unless "tags" is an alternative name like "soda" vs. "pop" which I was not aware of. But I would still vote "flag" for consistency with our other docs. Or "options" could work.
Oh no you are totally right that is my bad. I changed the "tags" to "flags" and then added some refs to the "General" flags part of the cockroack demo
page. Unfortunately we don't have anchors for those specific arguments at this point (although we could add some).
v19.2/unsplit-at.md, line 41 at r1 (raw file):
Previously, rmloveland (Rich Loveland) wrote…
I would reword the last clause slightly to "At this point, the data in the
users
table can be stored in just one range". The table is the more important "thing" to a SQL user while the range is merely there in service of it.
Done.
v19.2/unsplit-at.md, line 114 at r1 (raw file):
Previously, rmloveland (Rich Loveland) wrote…
Might be good to mention that "now the system will be free to re-merge the ranges at its convenience once you re-enable the
kv.range_merge.queue_enabled
setting totrue
. At least according to the Range Merges doc you will need to have turned off the range merge queue to even be able to SPLIT AT - if I am reading that correctly? I think we should at least make mention of that in the overview (I'll edit the comment on the overview as well).
I think this requirement has been lifted (consistent with what @andreimatei wrote in #4960). Also, I didn't run into any warnings or any setting to prevent me from splitting any ranges, and I never set any settings prior to splitting/unsplitting. I did this both in an insecure local cluster (cockroach start --insecure
) and with cockroach demo
.
v19.2/unsplit-at.md, line 144 at r1 (raw file):
Previously, rmloveland (Rich Loveland) wrote…
Note that we have more updates to do re:
split_enforced_until
incoming via cockroachdb/cockroach#38004 (namely a WITH EXPIRATION option to SPLIT AT) but that will happen via another ticket.
OK, great to know.
v19.2/unsplit-at.md, line 153 at r1 (raw file):
Previously, rmloveland (Rich Loveland) wrote…
After looking at this even longer I think we should use the "remove split enforcement, let the range merge queue run AKA unsplit" explanation in the intro and then just say "unsplit" everywhere else down-page.
OK!
v19.2/unsplit-at.md, line 183 at r1 (raw file):
Previously, rmloveland (Rich Loveland) wrote…
Same comment as above re: giving a little more color about when/how the system merges the ranges back together.
Done.
v19.2/unsplit-at.md, line 187 at r1 (raw file):
Previously, rmloveland (Rich Loveland) wrote…
I think we should also link to the Range Merges doc here.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @andreimatei, @ericharmeling, and @nvanbenschoten)
_includes/v19.2/sql/diagrams/unsplit.html, line 1 at r1 (raw file):
Previously, ericharmeling (Eric Harmeling) wrote…
No you didn't miss anything... it's not included anywhere. I'll delete it.
Thanks!
v19.2/split-at.md, line 255 at r1 (raw file):
Previously, ericharmeling (Eric Harmeling) wrote…
Done.
I never ran into any errors or warnings when I tried to split tables manually and I never changed any settings... In #4960 @andreimatei writes:In 19.2 I believe the above limitation goes away and the new with expiration option should be documented. Introduced in cockroachdb/cockroach#38004 by @jeffrey-xiao.
So I think this requirement has been lifted (consistent with what @andreimatei wrote in #4960). Also, I didn't run into any warnings or any setting to prevent me from splitting any ranges, and I never set any settings prior to splitting/unsplitting. I did this both in an insecure local cluster (
cockroach start --insecure
) and withcockroach demo
.
Ah! thanks for explaining
v19.2/unsplit-at.md, line 41 at r1 (raw file):
Previously, ericharmeling (Eric Harmeling) wrote…
Done.
It's def not a blocker but I think you forgot to push this change? Or at least I'm not seeing it, sorry.
v19.2/unsplit-at.md, line 114 at r1 (raw file):
Previously, ericharmeling (Eric Harmeling) wrote…
I think this requirement has been lifted (consistent with what @andreimatei wrote in #4960). Also, I didn't run into any warnings or any setting to prevent me from splitting any ranges, and I never set any settings prior to splitting/unsplitting. I did this both in an insecure local cluster (
cockroach start --insecure
) and withcockroach demo
.
Ah, thanks for reading that more closely than I did. :-}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @andreimatei, @nvanbenschoten, and @rmloveland)
v19.2/unsplit-at.md, line 41 at r1 (raw file):
Previously, rmloveland (Rich Loveland) wrote…
It's def not a blocker but I think you forgot to push this change? Or at least I'm not seeing it, sorry.
Ah, I changed this, then in the process of making another change, changed the wording again because the data can be stored in one or three or 5 ranges at this point, whatever CRDB decides, correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @andreimatei, @ericharmeling, @nvanbenschoten, and @rmloveland)
v19.2/unsplit-at.md, line 41 at r1 (raw file):
the data can be stored in one or three or 5 ranges at this point, whatever CRDB decides, correct?
AIUI, yes. Sorry for the noise on this one. Gosh I hate having to think about ranges up here in SQL-land.
Sincerely, a dumb user
@jseldess Added you for a quick pass before merging. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this looks fine to me. I can't give it a more in-depth review, nor am I a particular authority on this functionality. I've just commented with some immediate nits.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @andreimatei, @ericharmeling, @jseldess, @nvanbenschoten, and @rmloveland)
v19.2/unsplit-at.md, line 3 at r2 (raw file):
--- title: UNSPLIT AT summary: The UNSPLIT AT statement removes a key-value layer range split enforcement at a specified row in the table or index..
nit: unless we use this "key-value layer" terminology extensively, I wouldn't use it here or on the SPLIT AT
page. I'd just say "range split".
On the split at
page we say "The key-value layer of CockroachDB is broken into sections of contiguous key...". I'd just say "data is split into ranges".
v19.2/unsplit-at.md, line 18 at r2 (raw file):
<div> {% include {{ page.version.version }}/sql/diagrams/unsplit_index_at.html %}
I have an pet peeve with this diagram, as well as the analogous one on the split at
page. And I think it applies to other pages as well.
The thing just says select_stmt
and that takes you to some monster. The thing I'm always looking for is the values (...)
syntax, which is eventually presented on these pages, but it's pretty buried.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 4 of 9 files at r1, 8 of 8 files at r2.
Reviewable status: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @andreimatei, @ericharmeling, @jseldess, and @rmloveland)
v19.2/unsplit-at.md, line 3 at r2 (raw file):
--- title: UNSPLIT AT summary: The UNSPLIT AT statement removes a key-value layer range split enforcement at a specified row in the table or index..
nit: two periods.
v19.2/unsplit-at.md, line 35 at r2 (raw file):
## Viewing schema changes {% include {{ page.version.version }}/misc/schema-change-view-job.md %}
I don't think this is true of SPLIT AT
or UNSPLIT AT
.
v19.2/unsplit-at.md, line 47 at r2 (raw file):
{% include copy-clipboard.html %} ~~~ sql > SELECT * FROM crdb_internal.ranges WHERE table_name='users';
nit: Consider refining the projection here to just the columns that users might find interesting (e.g. range_id
, start_pretty
, end_pretty
, split_enforced_until
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @andreimatei, @jseldess, and @rmloveland)
v19.2/unsplit-at.md, line 41 at r1 (raw file):
a dumb user
No such thing!!
v19.2/unsplit-at.md, line 3 at r2 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
nit: unless we use this "key-value layer" terminology extensively, I wouldn't use it here or on the
SPLIT AT
page. I'd just say "range split".On the
split at
page we say "The key-value layer of CockroachDB is broken into sections of contiguous key...". I'd just say "data is split into ranges".
Done (x2)
v19.2/unsplit-at.md, line 3 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
nit: two periods.
Done.
v19.2/unsplit-at.md, line 18 at r2 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
I have an pet peeve with this diagram, as well as the analogous one on the
split at
page. And I think it applies to other pages as well.
The thing just saysselect_stmt
and that takes you to some monster. The thing I'm always looking for is thevalues (...)
syntax, which is eventually presented on these pages, but it's pretty buried.
That's a good point. On some SQL statement pages (e.g. CREATE TABLE), we have a "basic" diagram view, that is compact and high-level, and then an "expanded" diagram view, where we show diagrams for some of those pieces that contain a lot more information buried down (e.g. your select_stmt
's). I'm going to open a separate issue to create that "expand" view for all SQL statements that need it. There are other issues open that deal with the same issue, for other statements (e.g. the INSERT DIAGRAM).
v19.2/unsplit-at.md, line 35 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
I don't think this is true of
SPLIT AT
orUNSPLIT AT
.
OK. Removing this from both.
v19.2/unsplit-at.md, line 47 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
nit: Consider refining the projection here to just the columns that users might find interesting (e.g.
range_id
,start_pretty
,end_pretty
,split_enforced_until
).
Done.
ea3fad7
to
23196ce
Compare
@nvanbenschoten @andreimatei Thank you for the feedback! I do have a final question... When I create an index, it appears that CRDB performs a manual split, but with a much shorter expiration time on the enforcement.
What's going on here? When we create indexes, does this temporary split always happen? |
I wasn't aware that it ever happened. @andreimatei do you know about this? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That must be a schema change thing. I don't know what the deal is. cc @dt
Reviewable status: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @andreimatei, @jseldess, @nvanbenschoten, and @rmloveland)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @andreimatei, @ericharmeling, @nvanbenschoten, and @rmloveland)
v19.2/alter-index.md, line 20 at r3 (raw file):
[`RENAME`](rename-index.html) | Change the name of an index. [`SPLIT AT`](split-at.html) | Force a key-value layer range split at the specified row in the index. [`UNSPLIT AT`](unsplit-at.html) | <span class="version-tag">New in v19.2:</span> Remove a key-value layer range split enforcement at a specified row in the index.
nit: If you're changing key-value layer range split
to just range split
elsewhere, might as well do that here, too. Bot split at and unsplit at.
v19.2/alter-table.md, line 38 at r3 (raw file):
[`RENAME TABLE`](rename-table.html) | Change the names of tables. | No [`SPLIT AT`](split-at.html) | Force a key-value layer range split at the specified row in the table. | No [`UNSPLIT AT`](unsplit-at.html) | <span class="version-tag">New in v19.2:</span> Remove a key-value layer range split enforcement at a specified row in the table. | No
Same as above.
v19.2/unsplit-at.md, line 18 at r2 (raw file):
Previously, ericharmeling (Eric Harmeling) wrote…
That's a good point. On some SQL statement pages (e.g. CREATE TABLE), we have a "basic" diagram view, that is compact and high-level, and then an "expanded" diagram view, where we show diagrams for some of those pieces that contain a lot more information buried down (e.g. your
select_stmt
's). I'm going to open a separate issue to create that "expand" view for all SQL statements that need it. There are other issues open that deal with the same issue, for other statements (e.g. the INSERT DIAGRAM).
Yep, I think this is the same basic topic as #5396.
v19.2/unsplit-at.md, line 35 at r3 (raw file):
## Examples {% include {{page.version.version}}/sql/movr-statements-nodes.md %}
@ericharmling, I'd like to test these examples, but it looks like this branch is behind master. Can you rebase this against an up-to-date master and repush and then let me know?
23196ce
to
b395e57
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @andreimatei, @jseldess, @nvanbenschoten, and @rmloveland)
v19.2/alter-index.md, line 20 at r3 (raw file):
Previously, jseldess (Jesse Seldess) wrote…
nit: If you're changing
key-value layer range split
to justrange split
elsewhere, might as well do that here, too. Bot split at and unsplit at.
Done.
v19.2/alter-table.md, line 38 at r3 (raw file):
Previously, jseldess (Jesse Seldess) wrote…
Same as above.
Done.
v19.2/unsplit-at.md, line 35 at r3 (raw file):
Previously, jseldess (Jesse Seldess) wrote…
@ericharmling, I'd like to test these examples, but it looks like this branch is behind master. Can you rebase this against an up-to-date master and repush and then let me know?
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, Eric. with a few final nits.
Reviewable status: complete! 1 of 0 LGTMs obtained (and 2 stale) (waiting on @andreimatei, @ericharmeling, @nvanbenschoten, and @rmloveland)
v19.2/unsplit-at.md, line 114 at r1 (raw file):
Previously, rmloveland (Rich Loveland) wrote…
Ah, thanks for reading that more closely than I did. :-}
Can you make merge the data
a link to the page on range merges?
v19.2/unsplit-at.md, line 181 at r4 (raw file):
the
split_enforced_until
column is nowNULL
for all ranges in the table
That doesn't seem to be true. One range still has a split_enforced_until
date. Do you need to change the unsplit command somehow to make all of them null?
Also, can you make merge the data
a link to the page on range merges?
@jseldess Hate to do this again, but can you re-review? I changed all of the examples for |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 1 of 0 LGTMs obtained (and 2 stale) (waiting on @andreimatei, @ericharmeling, @jseldess, @nvanbenschoten, and @rmloveland)
v19.2/split-at.md, line 61 at r5 (raw file):
## Examples {% include {{page.version.version}}/sql/movr-statements-partitioning.md %}
How about putting this under a ### Setup
heading? Also, why not use the same include that you use for unsplit
?
{% include {{page.version.version}}/sql/movr-statements-nodes.md %}
v19.2/split-at.md, line 106 at r5 (raw file):
~~~ ### Split a table with a composite primary key
This is a nice addition, but why use a new table rather than one of the built-in movr
tables, many of which already have compound primary keys?
Btw, I think you want to use the term compound
instead of composite
: https://en.wikipedia.org/wiki/Compound_key. That's the term we use in the partitioning docs, I think.
v19.2/split-at.md, line 110 at r5 (raw file):
You may want to split a table with a composite primary key (e.g., when working with [partitions](partitioning.html#partition-using-primary-key)). Suppose that you want MovR to offer ride-sharing services, in addition to vehicle-sharing services. Some users need sign up to be drivers, so you need a `drivers` table to store driver information.
Some users need sign up...
> Some users need to sign up...
v19.2/split-at.md, line 146 at r5 (raw file):
~~~ Now you can split the table based on the composite primary key. Note that you don't have to specify the entire value for the primary key, just the prefix.
Since you emphasize only needing the specify the prefix, why don't you do that in the example? That would just be the city
values, right?
v19.2/unsplit-at.md, line 114 at r1 (raw file):
Previously, jseldess (Jesse Seldess) wrote…
Can you make
merge the data
a link to the page on range merges?
Still think we should add a link to the range merges doc.
v19.2/unsplit-at.md, line 181 at r4 (raw file):
Previously, jseldess (Jesse Seldess) wrote…
the
split_enforced_until
column is nowNULL
for all ranges in the tableThat doesn't seem to be true. One range still has a
split_enforced_until
date. Do you need to change the unsplit command somehow to make all of them null?Also, can you make
merge the data
a link to the page on range merges?
Still think we should add a link to the range merges doc.
v19.2/unsplit-at.md, line 52 at r5 (raw file):
~~~ Now split the `users` table ranges based on primary key values:
Maybe make split
a link to split-at.html
?
v19.2/unsplit-at.md, line 116 at r5 (raw file):
### Unsplit an index Add a new secondary [index](indexes.html) to the `rides` table, on the `revenue` column, and then split the table ranges by secondary index values:
Maybe make split
a link to split-at.html
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jseldess See my responses to two questions. We can discuss more offline as well.
Reviewable status: complete! 1 of 0 LGTMs obtained (and 2 stale) (waiting on @andreimatei, @jseldess, @nvanbenschoten, and @rmloveland)
v19.2/split-at.md, line 61 at r5 (raw file):
Previously, jseldess (Jesse Seldess) wrote…
How about putting this under a
### Setup
heading? Also, why not use the same include that you use forunsplit
?{% include {{page.version.version}}/sql/movr-statements-nodes.md %}
Done.
So my reasoning behind using a different number of nodes/localities for SPLIT AT
and UNSPLIT AT
is a little convoluted... I'd like to limit the amount of documentation exposure that we give to crdb_internal
, especially in SQL statement documentation. For UNSPLIT
, the crdb_internal
output is kind of important because it shows you whether or not the range splits are enforced. For SPLIT AT
, you just need to see the different ranges pre- and post-split. So for the SPLIT AT
examples, I'm using SHOW RANGES
, which lists out the replicas and the lease holders and the localities. All of those attributes are really interesting if you have a slightly more complex deployment. Also, we mention partitioning when splitting a table with a compound primary key, and partitioning doesn't make much sense with just 3 nodes. Thoughts?
v19.2/split-at.md, line 106 at r5 (raw file):
Previously, jseldess (Jesse Seldess) wrote…
This is a nice addition, but why use a new table rather than one of the built-in
movr
tables, many of which already have compound primary keys?Btw, I think you want to use the term
compound
instead ofcomposite
: https://en.wikipedia.org/wiki/Compound_key. That's the term we use in the partitioning docs, I think.
It's true that most, if not all, movr
tables have compound keys. Those compound keys are on city
(simple and intuitive string) and id
(random UUID). A city
is an easy value to split at, since there are discrete values that you can map to actual cities. id
, however, is a UUID, so it's a much less intuitive value to split at. You need to first print out all of the ID's, sorted alphabetically or numerically, then you need to figure out what kind of distribution you have on them per city, so you can split each city into separate ranges based on those groups of id
s. Also, (and probably most importantly) you can't use a prefix when you specify the split value for a UUID because the value type in your SPLIT ... AT <value>
needs to match the value type of the primary key. Anyways, it would be better to pick a different, more intuitive and categorizable value that can be prefixed to split at, but since you can't change the primary index of an existing table using an ALTER TABLE
statement, you need to create a new table. So since we've already used the drivers
table, I figured we'd make a new table with the primary key on city
and something else a little more manageable like dl
, which is unique and randomly generated in our example, but at least we know that it's a string (can be prefixed) and is at most 8 characters in length.
Apologies for such a lengthy explanation! We can also discuss more offline.
v19.2/split-at.md, line 110 at r5 (raw file):
Previously, jseldess (Jesse Seldess) wrote…
Some users need sign up...
>Some users need to sign up...
Done.
v19.2/split-at.md, line 146 at r5 (raw file):
Previously, jseldess (Jesse Seldess) wrote…
Since you emphasize only needing the specify the prefix, why don't you do that in the example? That would just be the
city
values, right?
We do that for the driver's license. Which is coincidentally why I chose to create the new table (to make a primary key that is a string that can be prefixed, unlike the UUID id
)
v19.2/unsplit-at.md, line 114 at r1 (raw file):
Previously, jseldess (Jesse Seldess) wrote…
Still think we should add a link to the range merges doc.
Done. Sorry - I don't know why I skipped over this comment!
v19.2/unsplit-at.md, line 181 at r4 (raw file):
Previously, jseldess (Jesse Seldess) wrote…
Still think we should add a link to the range merges doc.
Done.
v19.2/unsplit-at.md, line 52 at r5 (raw file):
Previously, jseldess (Jesse Seldess) wrote…
Maybe make
split
a link tosplit-at.html
?
Done.
v19.2/unsplit-at.md, line 116 at r5 (raw file):
Previously, jseldess (Jesse Seldess) wrote…
Maybe make
split
a link tosplit-at.html
.
Done.
aeccba8
to
e6b5498
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 3 stale) (waiting on @andreimatei, @jseldess, @nvanbenschoten, and @rmloveland)
v19.2/split-at.md, line 106 at r5 (raw file):
Previously, ericharmeling (Eric Harmeling) wrote…
It's true that most, if not all,
movr
tables have compound keys. Those compound keys are oncity
(simple and intuitive string) andid
(random UUID). Acity
is an easy value to split at, since there are discrete values that you can map to actual cities.id
, however, is a UUID, so it's a much less intuitive value to split at. You need to first print out all of the ID's, sorted alphabetically or numerically, then you need to figure out what kind of distribution you have on them per city, so you can split each city into separate ranges based on those groups ofid
s. Also, (and probably most importantly) you can't use a prefix when you specify the split value for a UUID because the value type in yourSPLIT ... AT <value>
needs to match the value type of the primary key. Anyways, it would be better to pick a different, more intuitive and categorizable value that can be prefixed to split at, but since you can't change the primary index of an existing table using anALTER TABLE
statement, you need to create a new table. So since we've already used thedrivers
table, I figured we'd make a new table with the primary key oncity
and something else a little more manageable likedl
, which is unique and randomly generated in our example, but at least we know that it's a string (can be prefixed) and is at most 8 characters in length.Apologies for such a lengthy explanation! We can also discuss more offline.
Is there a reason why we even have this section? Jesse and I were wondering when you would even want to manually split a table at all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 3 stale) (waiting on @andreimatei, @jseldess, @nvanbenschoten, and @rmloveland)
v19.2/split-at.md, line 106 at r5 (raw file):
Previously, ericharmeling (Eric Harmeling) wrote…
Is there a reason why we even have this section? Jesse and I were wondering when you would even want to manually split a table at all.
You're asking why we have manual splits at all? Isn't that discussed in the "Why manually split a range?" section? Manually splitting is certainly much less important these days than it used to be, now that we have load based splitting to automatically split hotspots based on load, but it's still an option we provide. Maybe I'm missing your question though.
With respect to the UUID discussion: UUIDs are actually pretty easy to split because we know their distribution - it's exactly uniformly distributed. So splitting at the midway point is something like 80000000-00000000-00000000-00000000
.
v19.2/split-at.md, line 106 at r5 (raw file): Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Based on my conversation with Eric, I think the question is really: Why should we show users how to split based on the second value in a compound primary key. With partitioning, we partition (auto-split) based on the first value, typically the geo-column. But I don't understand when a user would ever want, or need, to split on the second value. Do you have insights? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 3 stale) (waiting on @andreimatei, @jseldess, @nvanbenschoten, and @rmloveland)
v19.2/split-at.md, line 106 at r5 (raw file):
Previously, jseldess (Jesse Seldess) wrote…
Based on my conversation with Eric, I think the question is really: Why should we show users how to split based on the second value in a compound primary key. With partitioning, we partition (auto-split) based on the first value, typically the geo-column. But I don't understand when a user would ever want, or need, to split on the second value. Do you have insights?
Well if you only have a few possible values for the first column in the primary key then you need to start splitting on the second part. For instance, imagine the first column in the primary key of a table was a BOOL
. How would I split the table 10 ways to avoid contention?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 3 stale) (waiting on @andreimatei, @ericharmeling, @jseldess, @nvanbenschoten, and @rmloveland)
v19.2/split-at.md, line 106 at r5 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Well if you only have a few possible values for the first column in the primary key then you need to start splitting on the second part. For instance, imagine the first column in the primary key of a table was a
BOOL
. How would I split the table 10 ways to avoid contention?
Ah, I see. Thanks for the explanation.
@ericharmling, I think this is fine.
Thanks @nvanbenschoten !
I meant "in the context of partitioning", in the section "Split a table with a composite primary key" - totally left that out of my question though. To resolve that, I'm just going to remove "(e.g., when working with partitions)" from that section. |
- WITH EXPIRATION added to SPLIT AT (new diagrams, new parameter, new example)
e6b5498
to
8bcf770
Compare
UNSPLIT AT
statement.UNSPLIT AT
statement.SPLIT AT
statement for WITH EXPIRATION.UNSPLIT AT
examples to usemovr
database.Fixes #4895.
Fixes #5008.
Fixes #5007.