rfc: sql interleaved tables #7465

danhhz · 2016-06-24T15:16:23Z

Support interleaving the data for sql tables such that the rows alternate in the
kv map. This allows a user to optimize reads on tables that are frequently
joined.

Rendered preview: Preview at https://github.com/paperstreet/cockroach/blob/rfc_interleaved/docs/RFCS/sql_interleaved_tables.md

This change is

petermattis · 2016-06-24T17:51:17Z

Review status: 0 of 1 files reviewed at latest revision, 4 unresolved discussions, all commit checks successful.

docs/RFCS/sql_interleaved_tables.md, line 62 [r1] (raw file):

    /customers/1/<customerID>/<familyID>/1
    # Order data
    /customers/1/<customerID>/<interleavedTableSentinal>/orders/<orderID>/0

s/sentinal/sentinel/g

docs/RFCS/sql_interleaved_tables.md, line 77 [r1] (raw file):

supports arbitrary levels of nesting: grandparent and great-grandparent
interleaves.

I think you need to add some additional bits here about how interleaved keys affect various SQL operations. In particular, deleting multiple contiguous rows from either the parent or the child interleaved table is more problematic as we can't use a single DelRange operation. Same applies to dropping an interleaved table.

docs/RFCS/sql_interleaved_tables.md, line 92 [r1] (raw file):

- When the interleaved data will be small, should the user be able to force all
rows of an interleaved table to stay with their parent rows?

It would be good to try and avoid splitting in the middle of the interleaved data. That would require being able to examine a key and determine if it is for an interleaved table without any additional information. I'm not sure if that is possible with the proposed encoding.

docs/RFCS/sql_interleaved_tables.md, line 96 [r1] (raw file):

- Should a foreign key constraint be automatically created between the shared
prefix of the interleaved and parent tables? (Probably not. It's not necessary
for the mechanics of interleaving tables and the user can do this if they like.)

Agreed, creating the constraint automatically doesn't seem necessary, though perhaps there might be a shorthand to create such a constraint.

Comments from Reviewable

tbg · 2016-06-24T18:16:04Z

Reviewed 1 of 1 files at r1.
Review status: all files reviewed at latest revision, 4 unresolved discussions, all commit checks successful.

docs/RFCS/sql_interleaved_tables.md, line 92 [r1] (raw file):

Previously, petermattis (Peter Mattis) wrote…

It would be good to try and avoid splitting in the middle of the interleaved data. That would require being able to examine a key and determine if it is for an interleaved table without any additional information. I'm not sure if that is possible with the proposed encoding.

It could also be bad if users interleave large amounts of data (i.e. 32mb of orders for a customer). I'm not clear that it is a good idea to prevent that split from happening if it wants to.

Comments from Reviewable

petermattis · 2016-06-24T18:46:24Z

Review status: all files reviewed at latest revision, 4 unresolved discussions, all commit checks successful.

docs/RFCS/sql_interleaved_tables.md, line 92 [r1] (raw file):

Previously, tschottdorf (Tobias Schottdorf) wrote…

It could also be bad if users interleave large amounts of data (i.e. 32mb of orders for a customer). I'm not clear that it is a good idea to prevent that split from happening if it wants to.

Right. I was suggesting that we prefer to avoid splitting in the middle of interleaved data, not prevent it entirely. This would involve making the logic in `MVCCFindSplitKey` more complex.

Comments from Reviewable

danhhz · 2016-06-24T20:08:36Z

Review status: all files reviewed at latest revision, 4 unresolved discussions, all commit checks successful.

docs/RFCS/sql_interleaved_tables.md, line 92 [r1] (raw file):

Previously, petermattis (Peter Mattis) wrote…

Right. I was suggesting that we prefer to avoid splitting in the middle of interleaved data, not prevent it entirely. This would involve making the logic in MVCCFindSplitKey more complex.

So this rfc currently suggests that

each row in the parent and child are unsplittable, but it can split between any row boundaries

and at the other extreme there is

each child row cannot be split away from its corresponding parent row (and thus both cannot be split from the other children rows corresponding to that parent as well as cannot be split from its grandparent)

Peter, it sounds like you're suggesting

each set of children rows in a table associated with one parent row can be split from the parent but not from each other

which is interesting because it solves the issue with (2) of a row being forced with its grandparent (and so on) and thus with all its grandparent's grandchildren. Which quickly gets out of control.

However, I feel like that's not the right default because it imposes an effective size limit (maximum reasonable range size vs all the rows that share the same parent) on every interleaved table that's not present on non-interleaved tables.

Also, I disagree that the proposed encoding is incompatible with (3). If the suffix is constructed such that it counts back to a prefix of /customers/1/<customerID>/<interleavedTableSentinal>/orders for determining splits, it should do what you're suggesting (assuming I've understood your suggestion).

Comments from Reviewable

danhhz · 2016-06-24T20:11:50Z

Review status: all files reviewed at latest revision, 4 unresolved discussions, all commit checks successful.

docs/RFCS/sql_interleaved_tables.md, line 92 [r1] (raw file):

Previously, paperstreet (Daniel Harrison) wrote…

So this rfc currently suggests that

each row in the parent and child are unsplittable, but it can split between any row boundaries

and at the other extreme there is

each child row cannot be split away from its corresponding parent row (and thus both cannot be split from the other children rows corresponding to that parent as well as cannot be split from its grandparent)

Peter, it sounds like you're suggesting

each set of children rows in a table associated with one parent row can be split from the parent but not from each other

which is interesting because it solves the issue with (2) of a row being forced with its grandparent (and so on) and thus with all its grandparent's grandchildren. Which quickly gets out of control.

However, I feel like that's not the right default because it imposes an effective size limit (maximum reasonable range size vs all the rows that share the same parent) on every interleaved table that's not present on non-interleaved tables.

Also, I disagree that the proposed encoding is incompatible with (3). If the suffix is constructed such that it counts back to a prefix of /customers/1/<customerID>/<interleavedTableSentinal>/orders for determining splits, it should do what you're suggesting (assuming I've understood your suggestion).

And on further re-read, it seems like you may also/instead be talking about adding a distinction between "never-split" vs "prefer-not-to-split" to MVCCFindSplitKey?

Comments from Reviewable

petermattis · 2016-06-25T01:25:33Z

Review status: all files reviewed at latest revision, 4 unresolved discussions, all commit checks successful.

docs/RFCS/sql_interleaved_tables.md, line 92 [r1] (raw file):

Previously, paperstreet (Daniel Harrison) wrote…

And on further re-read, it seems like you may also/instead be talking about adding a distinction between "never-split" vs "prefer-not-to-split" to MVCCFindSplitKey?

Yes, I'm suggesting the "prefer-not-to-split" as a middle ground. If we never allow a split between related parent and child rows we run the danger of over-large ranges. If we always allow such splits we might be making bad split decisions that could be avoided with a little more intelligence.

Comments from Reviewable

RaduBerinde · 2016-06-25T02:48:17Z

Nice!

Review status: all files reviewed at latest revision, 5 unresolved discussions, all commit checks successful.

docs/RFCS/sql_interleaved_tables.md, line 36 [r1] (raw file):

Interleaved tables will be surfaced to users by a syntax addition: `CREATE TABLE
orders (customer_id INT, order_id INT, cost DECIMAL, PRIMARY KEY (customer_id,
order_id)) INTERLEAVE IN PARENT customers`. It is required that the field types

I am wondering how schema changes will be possible if we require this at all times. A statement to change the schema of either wouldn't be possible without breaking this constraint, but then there is no way to chnage both at the same time either..

Related - will it be possible to "break" an interleave relationship? And will it be possible to create an interleave relation on an existing table?

I think it's reasonable if the answer to these questions is "no", at least for the time being (but should be spelled out).

docs/RFCS/sql_interleaved_tables.md, line 77 [r1] (raw file):

Previously, petermattis (Peter Mattis) wrote…

I think you need to add some additional bits here about how interleaved keys affect various SQL operations. In particular, deleting multiple contiguous rows from either the parent or the child interleaved table is more problematic as we can't use a single DelRange operation. Same applies to dropping an interleaved table.

Related to this - will we support something like "ON DELETE CASCADE", where a deletion in the parent table also deletes all the relevant rows of the child table?

There should be two modes - ON DELETE CASCADE where child rows are deleted, and ON DELETE NO ACTION where child rows are not deleted but if deleting a parent row leaves behind child rows, the operation fails.

Comments from Reviewable

RaduBerinde · 2016-06-25T02:54:43Z

Another question to think about - will we support interleaving a secondary index? Starting from your example, imagine the orders table uses an order ID as the primary key (and not the customer ID); but we want to have an index on (customerID, orderID) and interleave that index with the customer table.

Review status: all files reviewed at latest revision, 5 unresolved discussions, all commit checks successful.

Comments from Reviewable

tamird · 2016-06-27T18:27:27Z

Reviewed 1 of 1 files at r1.
Review status: all files reviewed at latest revision, 5 unresolved discussions, all commit checks successful.

docs/RFCS/sql_interleaved_tables.md, line 36 [r1] (raw file):

Previously, RaduBerinde wrote…

I am wondering how schema changes will be possible if we require this at all times. A statement to change the schema of either wouldn't be possible without breaking this constraint, but then there is no way to chnage both at the same time either..

Related - will it be possible to "break" an interleave relationship? And will it be possible to create an interleave relation on an existing table?

I think it's reasonable if the answer to these questions is "no", at least for the time being (but should be spelled out).

Do we allow mutating the primary key? If the answer is no (which I think it is) then I guess these questions are moot, though I'd be in favour of calling this out in the RFC.

Comments from Reviewable

petermattis · 2016-06-27T18:31:34Z

Review status: all files reviewed at latest revision, 5 unresolved discussions, all commit checks successful.

docs/RFCS/sql_interleaved_tables.md, line 36 [r1] (raw file):

Previously, tamird (Tamir Duberstein) wrote…

Do we allow mutating the primary key? If the answer is no (which I think it is) then I guess these questions are moot, though I'd be in favour of calling this out in the RFC.

The columns that compose the primary key cannot be changed after the table has been created. Given that existing constraint, disallowing "breaking" of an interleave relationship sounds reasonable.

Comments from Reviewable

danhhz · 2016-06-28T16:29:36Z

Thanks for the reviews, everyone! RFAL

Secondary index interleaving is interesting, but I wonder if the additional complexity is worth it.

Review status: 0 of 1 files reviewed at latest revision, 5 unresolved discussions, some commit checks pending.

docs/RFCS/sql_interleaved_tables.md, line 36 [r1] (raw file):

Previously, petermattis (Peter Mattis) wrote…

The columns that compose the primary key cannot be changed after the table has been created. Given that existing constraint, disallowing "breaking" of an interleave relationship sounds reasonable.

Yeah, it was my intention that because primary indexes can't change, this can't either. I should have spelled that out originally. Done.

docs/RFCS/sql_interleaved_tables.md, line 62 [r1] (raw file):

Previously, petermattis (Peter Mattis) wrote…

s/sentinal/sentinel/g

I really can't seem to spell this word correctly : - /

docs/RFCS/sql_interleaved_tables.md, line 77 [r1] (raw file):

Previously, RaduBerinde wrote…

Related to this - will we support something like "ON DELETE CASCADE", where a deletion in the parent table also deletes all the relevant rows of the child table?

There should be two modes - ON DELETE CASCADE where child rows are deleted, and ON DELETE NO ACTION where child rows are not deleted but if deleting a parent row leaves behind child rows, the operation fails.

Good call to both. Done.

docs/RFCS/sql_interleaved_tables.md, line 92 [r1] (raw file):

Previously, petermattis (Peter Mattis) wrote…

Yes, I'm suggesting the "prefer-not-to-split" as a middle ground. If we never allow a split between related parent and child rows we run the danger of over-large ranges. If we always allow such splits we might be making bad split decisions that could be avoided with a little more intelligence.

In practice, I expect the interleaved data to fall into the two extremes: a little data in between each row or a lot of data in between each row. In the former case, if we split by row, it'll mostly be right since the rows will be much smaller than ranges. In the latter, we'll want to split it anyway, so it's not terrible if it gets it a little wrong. I agree that "prefer-not-to-split" is superior, but I think our existing mechanism is going to get us most of the way there in practice and, if possible, I'd like to see this cause issues in practice before we add the MVCC split complications.

Luckily, as we discussed offline, the way to add those complications is probably to plumb down TableDescriptors and make the split point selection code understand our keys. In which case, the proposed encoding will be fine if we later decide to do this.

docs/RFCS/sql_interleaved_tables.md, line 96 [r1] (raw file):

Previously, petermattis (Peter Mattis) wrote…

Agreed, creating the constraint automatically doesn't seem necessary, though perhaps there might be a shorthand to create such a constraint.

Done.

Comments from Reviewable

tbg · 2016-06-28T16:53:34Z

Reviewed 1 of 1 files at r2.
Review status: all files reviewed at latest revision, 5 unresolved discussions, some commit checks failed.

docs/RFCS/sql_interleaved_tables.md, line 36 [r1] (raw file):

Previously, paperstreet (Daniel Harrison) wrote…

Yeah, it was my intention that because primary indexes can't change, this can't either. I should have spelled that out originally. Done.

Is there anything else to be said for schema changes? Will any work need to be done after the above limitations?

Comments from Reviewable

tamird · 2016-06-28T17:19:26Z

Reviewed 1 of 1 files at r2.
Review status: all files reviewed at latest revision, 5 unresolved discussions, some commit checks failed.

Comments from Reviewable

danhhz · 2016-06-28T18:10:20Z

I've been convinced that we should look at secondary index interleaving. I'll add it in

Review status: all files reviewed at latest revision, 5 unresolved discussions, some commit checks failed.

Comments from Reviewable

danhhz · 2016-06-28T19:04:18Z

Updated with secondary index interleaving @RaduBerinde

Also swapped DO NOTHING for RESTRICT, which is what we're already using for this in foreign keys

Review status: 0 of 1 files reviewed at latest revision, 5 unresolved discussions, some commit checks pending.

Comments from Reviewable

tbg · 2016-06-30T17:54:44Z

Reviewed 1 of 1 files at r3.
Review status: all files reviewed at latest revision, 8 unresolved discussions, all commit checks successful.

docs/RFCS/sql_interleaved_tables.md, line 57 [r3] (raw file):

RESTRICT` will allow removal of parent rows with no interleaved rows, but will
error if interleaved rows would be orphaned. A missing `ON DELETE` clause will
fall back to table-level (or no) contraints.

s/train/strain/

docs/RFCS/sql_interleaved_tables.md, line 88 [r3] (raw file):

Where `interleavedTableSentinel` is 109, which is selected as the largest value
encodable in a single byte with `util/encoding.EncodeUVarintAscending`. This
effectively limits the number of families in a table to 108.

Couldn't more families exist, just with a gap at 108, if we were willing to go the extra mile?

docs/RFCS/sql_interleaved_tables.md, line 110 [r3] (raw file):

using kv `Seek`s or by scanning every key and discarding the irrelvant ones,
each with different performance tradeoffs. This logic will initially be
encapsulated in the sql `RowFetcher`, but can be pushed down if necessary for

I would expect all of this logic to be pretty complicated. If you make no assumptions about the size of the interleaved data, you really cannot afford to just scan all of it (scanning 10 users shouldn't send all of these users' orders back). I have the impression that some rudimentary SQL awareness of KV scans could really go a long way here.

Is it really better to implement all of this on the current KV store, when it is kind of clear that KV needs to learn about SQL keys for this to work performantly? Or am I overestimating the likelihood of that happening? Discussing how some operations would have to be carried out on vanilla-kv vs. aware-kv would be helpful to understand this better.

Comments from Reviewable

RaduBerinde · 2016-06-30T18:31:53Z

Review status: all files reviewed at latest revision, 8 unresolved discussions, all commit checks successful.

docs/RFCS/sql_interleaved_tables.md, line 110 [r3] (raw file):

Previously, tschottdorf (Tobias Schottdorf) wrote…

I would expect all of this logic to be pretty complicated. If you make no assumptions about the size of the interleaved data, you really cannot afford to just scan all of it (scanning 10 users shouldn't send all of these users' orders back). I have the impression that some rudimentary SQL awareness of KV scans could really go a long way here.

Is it really better to implement all of this on the current KV store, when it is kind of clear that KV needs to learn about SQL keys for this to work performantly? Or am I overestimating the likelihood of that happening? Discussing how some operations would have to be carried out on vanilla-kv vs. aware-kv would be helpful to understand this better.

This is an interesting discussion (CC @andreimatei)

Assuming KV had some SQL awareness, how would it implement this logic more effectively? Are there any storage (RocksDB) primitives that can be useful (and which we can't taken advantage of through the KV API)?

I understand the difference on not shipping the useless data around, but with distsql the RowFetcher logic would normally run on the range leader so it would still be a local operation.

Comments from Reviewable

RaduBerinde · 2016-06-30T18:37:25Z

Review status: all files reviewed at latest revision, 8 unresolved discussions, all commit checks successful.

docs/RFCS/sql_interleaved_tables.md, line 88 [r3] (raw file):

Previously, tschottdorf (Tobias Schottdorf) wrote…

Couldn't more families exist, just with a gap at 108, if we were willing to go the extra mile?

I think it's important that all KVs for a parent table row precede any interleaved table KVs. Would be very hard to skip reading the interleaved rows if we don't guarantee that.

Comments from Reviewable

danhhz · 2016-06-30T18:52:46Z

Review status: all files reviewed at latest revision, 8 unresolved discussions, all commit checks successful.

docs/RFCS/sql_interleaved_tables.md, line 88 [r3] (raw file):

Previously, RaduBerinde wrote…

I think it's important that all KVs for a parent table row precede any interleaved table KVs. Would be very hard to skip reading the interleaved rows if we don't guarantee that.

I'll update this to point more explicitly at the drawbacks below, where I note that this is possible, but has perf implications.

docs/RFCS/sql_interleaved_tables.md, line 110 [r3] (raw file):

Previously, RaduBerinde wrote…

This is an interesting discussion (CC @andreimatei)

Assuming KV had some SQL awareness, how would it implement this logic more effectively? Are there any storage (RocksDB) primitives that can be useful (and which we can't taken advantage of through the KV API)?

I understand the difference on not shipping the useless data around, but with distsql the RowFetcher logic would normally run on the range leader so it would still be a local operation.

This is hand-wavey, I'll think it through more and add it to the doc.

While you're searching for the beginning of the data for some interleaved table, if you get a row from the parent, you can construct the key to scan for the child by replacing the familyID with 109.

Then scanning a table, if you hit a row from some interleaved child , compute the key that comes after that (set family to 110) and, if it's not already sitting in the Batch, Scan to it and continue from there.

As @RaduBerinde mentions, I feel like doing it this way is fine for a first impl because of distsql. My big concern is if anything about the key encoding described will bite us later if we end up needing to make this more of a kv primitive, but I haven't seen anything yet to indicate that.

Comments from Reviewable

tbg · 2016-06-30T19:00:18Z

Review status: all files reviewed at latest revision, 8 unresolved discussions, all commit checks successful.

docs/RFCS/sql_interleaved_tables.md, line 110 [r3] (raw file):

Previously, paperstreet (Daniel Harrison) wrote…

This is hand-wavey, I'll think it through more and add it to the doc.

While you're searching for the beginning of the data for some interleaved table, if you get a row from the parent, you can construct the key to scan for the child by replacing the familyID with 109.

Then scanning a table, if you hit a row from some interleaved child , compute the key that comes after that (set family to 110) and, if it's not already sitting in the Batch, Scan to it and continue from there.

As @RaduBerinde mentions, I feel like doing it this way is fine for a first impl because of distsql. My big concern is if anything about the key encoding described will bite us later if we end up needing to make this more of a kv primitive, but I haven't seen anything yet to indicate that.

Shipping the data around was my main concern (and the associated GC pressure), but even the many more round-trips are awkward. It's not so much an issue for scans because these are served locally, but deletions which want to use `DeleteRange` (say you delete users from `asd` to `bsd` would have to pay for number-of-rows-to-delete Raft roundtrips, which is really expensive). Might not be the most prominent use case, so let's focus on `Scan`:

We use iterators everywhere, so it seems conceivable that we could skip over most of the interleaved tables (as opposed to pulling all keys out, looking at them, discarding them) which should show in benchmarks with LIMIT selects.

That said, I don't want to push for going this KV route because I think it's "sleeker", there's certainly a lot of complexity associated to it as well. I just wanted to make sure that the "naive" SQL implementation doesn't wind up introducing a lot of complexity there and that then performance considerations force us to push all of that complexity into KV instead.

Scanning 100 rows with large interleaved tables and deleting 100 rows with large interleaved tables are going to be interesting examples. For the latter, I don't see how we can get away without 100 Raft roundtrips (=dead).

Comments from Reviewable

danhhz · 2016-06-30T19:12:31Z

Review status: all files reviewed at latest revision, 8 unresolved discussions, all commit checks successful.

docs/RFCS/sql_interleaved_tables.md, line 110 [r3] (raw file):

Previously, tschottdorf (Tobias Schottdorf) wrote…

Shipping the data around was my main concern (and the associated GC pressure), but even the many more round-trips are awkward. It's not so much an issue for scans because these are served locally, but deletions which want to use DeleteRange (say you delete users from asd to bsd would have to pay for number-of-rows-to-delete Raft roundtrips, which is really expensive). Might not be the most prominent use case, so let's focus on Scan:

We use iterators everywhere, so it seems conceivable that we could skip over most of the interleaved tables (as opposed to pulling all keys out, looking at them, discarding them) which should show in benchmarks with LIMIT selects.

That said, I don't want to push for going this KV route because I think it's "sleeker", there's certainly a lot of complexity associated to it as well. I just wanted to make sure that the "naive" SQL implementation doesn't wind up introducing a lot of complexity there and that then performance considerations force us to push all of that complexity into KV instead.

Scanning 100 rows with large interleaved tables and deleting 100 rows with large interleaved tables are going to be interesting examples. For the latter, I don't see how we can get away without 100 Raft roundtrips (=dead).

(Great stuff @tschottdorf, keep it coming.)

I could be doing a better job of explaining my thoughts on scans, but before that, I'd like to step back for a second. If we interleave row by row, there's no way to get around 100 raft operations to delete 100 consecutive rows, correct?

Is there some other way this feature could work without interleaving row by row?

Comments from Reviewable

RaduBerinde · 2016-06-30T19:25:16Z

Review status: all files reviewed at latest revision, 9 unresolved discussions, all commit checks successful.

docs/RFCS/sql_interleaved_tables.md, line 50 [r3] (raw file):

secondary index and then dropping the old one.

It is frequently desirable to enforce that there is never a row in an

Is there a third option where we are allowed to remove rows from parent table? If yes, it's something that we can't explicitly choose with the ON DELETE syntax? If no, change "it is frequently desirable" to "we enforce"

docs/RFCS/sql_interleaved_tables.md, line 110 [r3] (raw file):

Previously, paperstreet (Daniel Harrison) wrote…

(Great stuff @tschottdorf, keep it coming.)

I could be doing a better job of explaining my thoughts on scans, but before that, I'd like to step back for a second. If we interleave row by row, there's no way to get around 100 raft operations to delete 100 consecutive rows, correct?

Is there some other way this feature could work without interleaving row by row?

I don't think that kind of deletion is an interesting operation - I'm not even sure it is possible (see the other question, not sure if we support anything other than CASCADE and RESTRICT on delete).

Comments from Reviewable

danhhz · 2016-06-30T19:29:38Z

Review status: all files reviewed at latest revision, 9 unresolved discussions, all commit checks successful.

docs/RFCS/sql_interleaved_tables.md, line 50 [r3] (raw file):

Previously, RaduBerinde wrote…

Is there a third option where we are allowed to remove rows from parent table? If yes, it's something that we can't explicitly choose with the ON DELETE syntax? If no, change "it is frequently desirable" to "we enforce"

Yes, you omit ON DELETE entirely. I guess it wasn't clear but I tried to cover this with "A missing `ON DELETE` clause will fall back to table-level (or no) contraints."

docs/RFCS/sql_interleaved_tables.md, line 110 [r3] (raw file):

Previously, RaduBerinde wrote…

I don't think that kind of deletion is an interesting operation - I'm not even sure it is possible (see the other question, not sure if we support anything other than CASCADE and RESTRICT on delete).

`DELETE FROM foo WHERE id < 100;` foo can be either a parent or interleaved child, both get the bad behavior here. Previously, this would be "fast pathed" and would be one DelRange.

Comments from Reviewable

RaduBerinde · 2016-06-30T20:01:14Z

Review status: all files reviewed at latest revision, 9 unresolved discussions, all commit checks successful.

docs/RFCS/sql_interleaved_tables.md, line 110 [r3] (raw file):

Previously, paperstreet (Daniel Harrison) wrote…

DELETE FROM foo WHERE id < 100; foo can be either a parent or interleaved child, both get the bad behavior here. Previously, this would be "fast pathed" and would be one DelRange.

Sure, I was thinking that deleting just the parent row doesn't make a lot of sense in practice when interleaving tables (I would expect ON DELETE CASCADE to be the norm). But even so you have a point when deleting child rows (though I'm not convinced it would be a common operation).

I guess if we wanted to optimize this case, we would need a primitive to delete ranges of keys while skipping certain keys (where "certain" might require some SQL awareness or some kind of generic string matching).

Comments from Reviewable

danhhz · 2016-07-01T21:38:19Z

Review status: 0 of 1 files reviewed at latest revision, 8 unresolved discussions, some commit checks pending.

docs/RFCS/sql_interleaved_tables.md, line 36 [r1] (raw file):

Previously, tschottdorf (Tobias Schottdorf) wrote…

Is there anything else to be said for schema changes? Will any work need to be done after the above limitations?

Done.

docs/RFCS/sql_interleaved_tables.md, line 57 [r3] (raw file):

Previously, tschottdorf (Tobias Schottdorf) wrote…

s/train/strain/

Done.

docs/RFCS/sql_interleaved_tables.md, line 110 [r3] (raw file):

Previously, RaduBerinde wrote…

Sure, I was thinking that deleting just the parent row doesn't make a lot of sense in practice when interleaving tables (I would expect ON DELETE CASCADE to be the norm). But even so you have a point when deleting child rows (though I'm not convinced it would be a common operation).

I guess if we wanted to optimize this case, we would need a primitive to delete ranges of keys while skipping certain keys (where "certain" might require some SQL awareness or some kind of generic string matching).

Rewrote this section a bit.

Comments from Reviewable

bdarnell · 2016-07-04T22:50:49Z

Review status: 0 of 1 files reviewed at latest revision, 12 unresolved discussions, all commit checks successful.

docs/RFCS/sql_interleaved_tables.md, line 27 [r4] (raw file):

# Motivation

A join of two tables accesses different parts of the kv map that are usually on

Interleaving is useful even without joins: consider a table with a composite PK and one or more secondary indexes that share a prefix with the PK:
CREATE TABLE orders (customer_id int, order_id int, ts timestamp, PRIMARY KEY (customer_id, order_id), INDEX (customer_id, timestamp). If this table is interleaved on customer_id, the secondary index rows are close to the primary data (whether there is a "parent" table or not).

docs/RFCS/sql_interleaved_tables.md, line 38 [r4] (raw file):

order_id)) INTERLEAVE IN PARENT customers`.

It is required that the field types of the primary key of the target (or

It feels error-prone to me to allow interleaving whenever the types match without actually specifying which columns are involved. I think I'd rather use syntax more like foreign key references to specify which columns are to be used, and return an error if the columns don't make up (a prefix of) the primary key.

docs/RFCS/sql_interleaved_tables.md, line 42 [r4] (raw file):

created (or "interleaved") table. Alternately, a secondary index could be
interleaved into the parent table instead. The syntax for this is `CREATE INDEX
name (fields) INTERLEAVE INTO parent`. Note that each of a table's indexes

Is there syntax for doing this in the CREATE TABLE statement, or is it always a separate CREATE INDEX?

docs/RFCS/sql_interleaved_tables.md, line 57 [r4] (raw file):

RESTRICT` will allow removal of parent rows with no interleaved rows, but will
error if interleaved rows would be orphaned. A missing `ON DELETE` clause will
fall back to table-level (or no) constraints.

Will an ON DELETE constraint defined by an INTERLEAVE clause be modifyable with ALTER TABLE even though the interleave clause itself is not? What happens if both INTERLEAVE and a FOREIGN KEY specify different ON DELETE behaviors? I think I'd rather not introduce this redundant shorthand for now.

Comments from Reviewable

petermattis · 2016-07-05T14:42:15Z

Review status: 0 of 1 files reviewed at latest revision, 13 unresolved discussions, all commit checks successful.

docs/RFCS/sql_interleaved_tables.md, line 86 [r4] (raw file):

    /customers/1/<customerID>/<interleavedTableSentinel>/orders/<indexID>/<orderID>/<familyID>/suffix

Where `interleavedTableSentinel` is 109, which is selected as the largest value

109 is the largest integer that can be encoded in a single byte, but is there a reason to restrict ourselves to an integer here. Seems like you could encoded 0xff (EncodeNullDescending) to guarantee a "familyID" for the interleaved table sentinel that is larger than any normal familyID. Is this not feasible for some reason?

Comments from Reviewable

danhhz · 2016-07-05T14:46:02Z

Review status: 0 of 1 files reviewed at latest revision, 13 unresolved discussions, all commit checks successful.

docs/RFCS/sql_interleaved_tables.md, line 86 [r4] (raw file):

Previously, petermattis (Peter Mattis) wrote…

109 is the largest integer that can be encoded in a single byte, but is there a reason to restrict ourselves to an integer here. Seems like you could encoded 0xff (EncodeNullDescending) to guarantee a "familyID" for the interleaved table sentinel that is larger than any normal familyID. Is this not feasible for some reason?

Super smart. Will update

Comments from Reviewable

danhhz · 2016-07-05T19:52:20Z

Okay, this is starting to settle down and I'd like to start on the impl. Is it ready to merge?

Review status: 0 of 1 files reviewed at latest revision, 13 unresolved discussions, some commit checks pending.

docs/RFCS/sql_interleaved_tables.md, line 27 [r4] (raw file):

Previously, bdarnell (Ben Darnell) wrote…

Interleaving is useful even without joins: consider a table with a composite PK and one or more secondary indexes that share a prefix with the PK:
CREATE TABLE orders (customer_id int, order_id int, ts timestamp, PRIMARY KEY (customer_id, order_id), INDEX (customer_id, timestamp). If this table is interleaved on customer_id, the secondary index rows are close to the primary data (whether there is a "parent" table or not).

Good call. This proposal was originally just for primary index data. When I updated it to support interleaving secondary indexes, I missed this. Done.

docs/RFCS/sql_interleaved_tables.md, line 38 [r4] (raw file):

Previously, bdarnell (Ben Darnell) wrote…

It feels error-prone to me to allow interleaving whenever the types match without actually specifying which columns are involved. I think I'd rather use syntax more like foreign key references to specify which columns are to be used, and return an error if the columns don't make up (a prefix of) the primary key.

My intuition definitely leans in the direction of the simpler. How about I support both and let the user decide which to use? I can see users wanting the explicit version as well as the simple one.

docs/RFCS/sql_interleaved_tables.md, line 42 [r4] (raw file):

Previously, bdarnell (Ben Darnell) wrote…

Is there syntax for doing this in the CREATE TABLE statement, or is it always a separate CREATE INDEX?

Yeah, I just didn't have it in here. Done

docs/RFCS/sql_interleaved_tables.md, line 57 [r4] (raw file):

Previously, bdarnell (Ben Darnell) wrote…

Will an ON DELETE constraint defined by an INTERLEAVE clause be modifyable with ALTER TABLE even though the interleave clause itself is not? What happens if both INTERLEAVE and a FOREIGN KEY specify different ON DELETE behaviors? I think I'd rather not introduce this redundant shorthand for now.

This will just be a shorthand for table-level foreign key constraints, so I'm not concerned about any ALTER TABLE or conflicting definition issues. They'll have to be done by the fk code anyway. As of right now, the implementation of foreign key constraints hasn't gotten to the table-level syntax, so the work for this shorthand will probably wait on that, but I'd still like to keep it in the rfc.

docs/RFCS/sql_interleaved_tables.md, line 86 [r4] (raw file):

Previously, paperstreet (Daniel Harrison) wrote…

Super smart. Will update

Done.

Comments from Reviewable

tbg · 2016-07-05T20:18:58Z

Reviewed 1 of 1 files at r5.
Review status: all files reviewed at latest revision, 12 unresolved discussions, all commit checks successful.

Comments from Reviewable

tamird · 2016-07-05T20:47:34Z

Review status: all files reviewed at latest revision, 12 unresolved discussions, all commit checks successful.

docs/RFCS/sql_interleaved_tables.md, line 38 [r4] (raw file):

Previously, paperstreet (Daniel Harrison) wrote…

My intuition definitely leans in the direction of the simpler. How about I support both and let the user decide which to use? I can see users wanting the explicit version as well as the simple one.

+1 to explicit, FWIW

Comments from Reviewable

petermattis · 2016-07-05T21:21:08Z

Review status: all files reviewed at latest revision, 12 unresolved discussions, all commit checks successful.

docs/RFCS/sql_interleaved_tables.md, line 134 [r5] (raw file):

will be slower for any table interleaved with another. This can be mitigated
somewhat by pushing knowledge of the sql key structure down into kv, but this
still won't be quite as fast and will add additionally complexity to kv.

s/additionally/additional/g

Comments from Reviewable

Support interleaving the data for sql tables such that the rows alternate in the kv map. This allows a user to optimize reads on tables that are frequently joined. For cockroachdb#2972.

danhhz · 2016-07-05T21:28:15Z

Review status: 0 of 1 files reviewed at latest revision, 12 unresolved discussions, some commit checks pending.

docs/RFCS/sql_interleaved_tables.md, line 38 [r4] (raw file):

Previously, tamird (Tamir Duberstein) wrote…

+1 to explicit, FWIW

Looks like I'm in the minority. Switched to only explicit.

docs/RFCS/sql_interleaved_tables.md, line 134 [r5] (raw file):

Previously, petermattis (Peter Mattis) wrote…

s/additionally/additional/g

Done.

Comments from Reviewable

danhhz assigned petermattis Jun 24, 2016

danhhz force-pushed the rfc_interleaved branch 2 times, most recently from 522fd5a to cb71bee Compare June 24, 2016 16:06

danhhz force-pushed the rfc_interleaved branch from cb71bee to d8d3ad0 Compare June 28, 2016 16:28

danhhz force-pushed the rfc_interleaved branch from d8d3ad0 to 375b2d4 Compare June 28, 2016 19:03

danhhz force-pushed the rfc_interleaved branch from 375b2d4 to 65f27ea Compare July 1, 2016 21:37

danhhz force-pushed the rfc_interleaved branch from 65f27ea to 4a6eb31 Compare July 5, 2016 19:50

rfc: sql interleaved tables

f8172b3

Support interleaving the data for sql tables such that the rows alternate in the kv map. This allows a user to optimize reads on tables that are frequently joined. For cockroachdb#2972.

danhhz force-pushed the rfc_interleaved branch from 4a6eb31 to f8172b3 Compare July 5, 2016 21:27

danhhz merged commit 77a2afc into cockroachdb:master Jul 5, 2016

danhhz deleted the rfc_interleaved branch July 5, 2016 22:09

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

rfc: sql interleaved tables #7465

rfc: sql interleaved tables #7465

danhhz commented Jun 24, 2016 •

edited by petermattis

Loading

petermattis commented Jun 24, 2016

tbg commented Jun 24, 2016

petermattis commented Jun 24, 2016

danhhz commented Jun 24, 2016

danhhz commented Jun 24, 2016

petermattis commented Jun 25, 2016

RaduBerinde commented Jun 25, 2016

RaduBerinde commented Jun 25, 2016

tamird commented Jun 27, 2016

petermattis commented Jun 27, 2016

danhhz commented Jun 28, 2016

tbg commented Jun 28, 2016

tamird commented Jun 28, 2016

danhhz commented Jun 28, 2016

danhhz commented Jun 28, 2016

tbg commented Jun 30, 2016

RaduBerinde commented Jun 30, 2016

RaduBerinde commented Jun 30, 2016

danhhz commented Jun 30, 2016

tbg commented Jun 30, 2016

danhhz commented Jun 30, 2016

RaduBerinde commented Jun 30, 2016

danhhz commented Jun 30, 2016

RaduBerinde commented Jun 30, 2016

danhhz commented Jul 1, 2016

bdarnell commented Jul 4, 2016

petermattis commented Jul 5, 2016

danhhz commented Jul 5, 2016

danhhz commented Jul 5, 2016

tbg commented Jul 5, 2016

tamird commented Jul 5, 2016

petermattis commented Jul 5, 2016

danhhz commented Jul 5, 2016

rfc: sql interleaved tables #7465

rfc: sql interleaved tables #7465

Conversation

danhhz commented Jun 24, 2016 • edited by petermattis Loading

petermattis commented Jun 24, 2016

tbg commented Jun 24, 2016

petermattis commented Jun 24, 2016

danhhz commented Jun 24, 2016

danhhz commented Jun 24, 2016

petermattis commented Jun 25, 2016

RaduBerinde commented Jun 25, 2016

RaduBerinde commented Jun 25, 2016

tamird commented Jun 27, 2016

petermattis commented Jun 27, 2016

danhhz commented Jun 28, 2016

tbg commented Jun 28, 2016

tamird commented Jun 28, 2016

danhhz commented Jun 28, 2016

danhhz commented Jun 28, 2016

tbg commented Jun 30, 2016

RaduBerinde commented Jun 30, 2016

RaduBerinde commented Jun 30, 2016

danhhz commented Jun 30, 2016

tbg commented Jun 30, 2016

danhhz commented Jun 30, 2016

RaduBerinde commented Jun 30, 2016

danhhz commented Jun 30, 2016

RaduBerinde commented Jun 30, 2016

danhhz commented Jul 1, 2016

bdarnell commented Jul 4, 2016

petermattis commented Jul 5, 2016

danhhz commented Jul 5, 2016

danhhz commented Jul 5, 2016

tbg commented Jul 5, 2016

tamird commented Jul 5, 2016

petermattis commented Jul 5, 2016

danhhz commented Jul 5, 2016

danhhz commented Jun 24, 2016 •

edited by petermattis

Loading