
sql: primary keys cannot be changed #19141

Closed
benesch opened this issue Oct 9, 2017 · 24 comments
Assignees
Labels
A-partitioning A-schema-changes A-sql-pgcompat Semantic compatibility with PostgreSQL C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) X-anchored-telemetry The issue number is anchored by telemetry references.

Comments

@benesch
Contributor

benesch commented Oct 9, 2017

It is not currently possible to change a table's primary key, e.g. via

ALTER TABLE foo DROP CONSTRAINT old_pkey;
ALTER TABLE foo ADD PRIMARY KEY (new, pkey, cols)

This becomes more important in 1.2 because the upcoming partitioning feature (#18683) will only allow partitioning by prefixes of the primary key. For example, suppose you want to partition this table by country:

CREATE TABLE users (
  id INT PRIMARY KEY,
  name STRING,
  country STRING,
  ...
)

The table needs to have a composite primary key that lists country first:

CREATE TABLE users (
  id INT,
  name STRING,
  country STRING,
  ...,
  PRIMARY KEY (country, id)
) PARTITION BY LIST (country)...

This essentially requires that the schema be designed with partitioning in mind from the get-go until we can alter primary keys.

@benesch benesch added this to the Later milestone Oct 9, 2017
@benesch
Contributor Author

benesch commented Oct 9, 2017

/cc @awoods187 for triage

@a6802739
Contributor

a6802739 commented Oct 31, 2017

@benesch, Do you mean we should support SQL syntax like:

ALTER TABLE foo ADD PRIMARY KEY (new, pkey, cols)

Or just change the primary key of table users from id to (id, country) when we try to add a partition for this table but the partition column is not in the primary key?

We would have to change the data organization when changing the primary key of the specified table, so this operation's overhead is quite expensive:

  1. re-insert the same data within the replica, but with the key changed from /TableID/IndexID/ColumnID/id to /TableID/IndexID/ColumnID/(id, country) (the value changes in the same way), and clear the original key-value pairs.
  2. change the TableDescriptor.
  3. backfill.
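The rewrite in step 1 can be sketched with a toy in-memory KV map. The slash-separated key layout and all function names below are illustrative only, not CockroachDB's actual key encoding:

```python
# Toy sketch of re-encoding rows from the old primary key (id) to the
# new composite key (id, country). Key layout and names are made up.

def encode_key(table_id, index_id, *pk_values):
    """Encode a primary-key tuple into a slash-separated key string."""
    return "/" + "/".join(str(p) for p in (table_id, index_id, *pk_values))

def rewrite_primary_index(kv, table_id, old_index_id, new_index_id, new_pk_cols):
    """Re-insert each row under the new composite key and clear the
    original key-value pair, as in step 1 above."""
    old_prefix = f"/{table_id}/{old_index_id}/"
    for old_key in [k for k in kv if k.startswith(old_prefix)]:
        row = kv.pop(old_key)                    # clear the original KV
        pk = tuple(row[c] for c in new_pk_cols)  # e.g. (1, "uk")
        kv[encode_key(table_id, new_index_id, *pk)] = row

kv = {encode_key(50, 1, 1): {"id": 1, "name": "ada", "country": "uk"}}
rewrite_primary_index(kv, 50, 1, 2, ("id", "country"))
print(kv)  # {'/50/2/1/uk': {'id': 1, 'name': 'ada', 'country': 'uk'}}
```

Steps 2 and 3 would then publish the new index via the TableDescriptor and backfill any rows written before the switch.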

@benesch
Contributor Author

benesch commented Nov 1, 2017

@a6802739 I mean we should support arbitrary primary key changes like

ALTER TABLE foo ADD PRIMARY KEY (new, pkey, cols)

Since changing a table's primary key is such an expensive operation, we definitely don't want to do it implicitly if the partitioning scheme specifies columns that are not in the primary key. We'd much rather return an error in that case, and let the user alter the pkey only if they so choose.

Note that traditionally changing the pkey on a table with an existing pkey has required two steps:

ALTER TABLE foo DROP CONSTRAINT foo_pkey;
ALTER TABLE foo ADD PRIMARY KEY (new, pkey, cols);

That's going to be needlessly expensive in Cockroach. Dropping the primary key requires creating a hidden rowid column to use as the implicit pkey—this already happens when tables without a primary key are created from scratch—and rewriting all the data in terms of rowid, just to rewrite it all in terms of (new, pkey, cols) when the new primary key is added. We should do one or more of the following:

  1. Introduce new syntax to alter a primary key in one step:

    ALTER TABLE foo ALTER PRIMARY KEY TO (new, pkey, cols)
    
  2. Coalesce the operations when they occur in the same statement:

    ALTER TABLE foo DROP CONSTRAINT foo_pkey, ADD PRIMARY KEY (new, pkey, cols);
    
  3. Coalesce the operations when they appear in ALTER TABLEs within the same txn with no intervening writes:

    BEGIN;
    ALTER TABLE foo DROP CONSTRAINT foo_pkey;
    ALTER TABLE foo ADD PRIMARY KEY (new, pkey, cols);
    COMMIT;
    

My preference would be to eventually support all three, but start by supporting just option one. Options two and three are better for compatibility, but changing a table's primary key probably isn't something you want to lean on your ORM for.

Option three has some pathological edge cases FWIW:

CREATE TABLE foo (a INT PRIMARY KEY, b STRING);
INSERT INTO foo (a, b) VALUES (1, 'a');
BEGIN;
ALTER TABLE foo DROP CONSTRAINT foo_pkey;
INSERT INTO foo (a, b) VALUES (1, 'b');
ALTER TABLE foo ADD PRIMARY KEY (b);
COMMIT;
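A toy model of the pathology: after the DROP, the intervening INSERT legally duplicates the old key a = 1, so the two ALTERs cannot be coalesced into a simple in-place rewrite keyed on a; the new primary index has to be built with its own uniqueness check. This Python sketch uses hypothetical names:

```python
# Toy model of the edge case above: after DROP CONSTRAINT, duplicate
# values in the old key column `a` are legal, so the new primary index
# on `b` must be built from scratch with its own uniqueness check.
# All names here are illustrative.

def build_primary_index(rows, pk_cols):
    """Build an index keyed on pk_cols, rejecting duplicate keys."""
    index = {}
    for row in rows:
        key = tuple(row[c] for c in pk_cols)
        if key in index:
            raise ValueError(f"duplicate primary key {key}")
        index[key] = row
    return index

# State of foo at COMMIT time: a=1 appears twice, which the old PK
# would have forbidden.
rows = [{"a": 1, "b": "a"}, {"a": 1, "b": "b"}]
print(sorted(build_primary_index(rows, ("b",))))  # [('a',), ('b',)]

# Rebuilding on the old key now fails, so naive coalescing is unsound:
try:
    build_primary_index(rows, ("a",))
except ValueError as e:
    print(e)  # duplicate primary key (1,)
```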

@jordanlewis @vivekmenezes @danhhz @knz @cockroachdb/sql-async does this sound reasonable?

@knz
Contributor

knz commented Nov 2, 2017

Yes it's reasonable but @a6802739 please do not underestimate the complexity of the task! As much as possible, please split the work in small steps that can be easily reviewed. It might be beneficial that you study the problem and write down your implementation plan before you start (and perhaps share it with us).

@knz
Contributor

knz commented Nov 2, 2017

I meant to write "reasonable" in my previous comment.

@a6802739
Contributor

a6802739 commented Nov 2, 2017

@knz, I will first try to understand what I should do for this task, and then write down how to implement it. I may run into some confusion in the process, and hope I can get some help then.

Thank you very much.

@cuongdo
Contributor

cuongdo commented Nov 2, 2017

@a6802739 Thank you, as always, for your help. I've discussed this project with the team, and because other high-priority projects with upcoming deadlines rely on this feature, we'll need to develop this feature internally.

@cuongdo cuongdo assigned danhhz and unassigned a6802739 Nov 2, 2017
@cuongdo
Contributor

cuongdo commented Nov 2, 2017

assigned to @danhhz to make sure this finds a good home

@awoods187 awoods187 modified the milestones: 1.2, 1.3 Nov 9, 2017
@petermattis petermattis assigned maddyblue and bobvawter and unassigned danhhz Mar 13, 2018
@petermattis
Collaborator

Has there been any thought on how this would be implemented? Changing the primary key is difficult because it is stored in indexes. One thought is that we make a copy of all of the secondary indexes and backfill both them and the new primary index and afterwards delete the old secondary indexes and primary index. Seems like that would conceptually work, though it would be slow.

@mjibson, @danhhz, @vivekmenezes Thoughts?

@jordanlewis
Member

It depends on whether there's a requirement for this to happen online.

If it needs to be online, which is what @benesch's first suggestion implies (ALTER TABLE ALTER PRIMARY KEY), then I agree with your suggestion. The phases in that case would be the same as adding an index, more or less:

  1. The new indexes and table descriptor would be created
  2. The old table descriptor would be updated so that writes go to both the old and new copies. If the table had n secondary indexes, the write path would then end up writing 2 * (n + 1) rows per update, guaranteeing that all writes from then on are reflected in the new copies.
  3. A backfill process begins for the table and all of its secondary indexes, at some time t before the write path began writing to the new copies.
  4. Once the backfill is finished, it's safe to switch reads to come from the new copies, and begin deleting the old copies.

If it doesn't have to be online, then perhaps that first step of "dropping the primary key" could effectively just put the table into read only mode until a new primary key is added, at which point we could perform the backfills above without having to do the table descriptor double write path dance.
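The four online phases above can be modeled as a double-write-then-backfill sketch (toy Python; the class and field names are hypothetical, and only the primary copy is shown, not the secondary indexes):

```python
# Toy model of the online phases: create the new copy, double-write,
# backfill pre-existing rows, then switch reads and drop the old copy.

class Table:
    def __init__(self):
        self.old_primary = {}      # id -> row
        self.new_primary = None    # (country, id) -> row, created in phase 1
        self.reads_from_new = False

    def begin_pk_change(self):
        # Phases 1-2: create the new copy; writes now go to both.
        self.new_primary = {}

    def write(self, row):
        self.old_primary[row["id"]] = row
        if self.new_primary is not None:  # double-write phase
            self.new_primary[(row["country"], row["id"])] = row

    def backfill(self):
        # Phase 3: copy rows that predate the double-write phase;
        # rows already double-written are left alone.
        for row in self.old_primary.values():
            self.new_primary.setdefault((row["country"], row["id"]), row)

    def finish(self):
        # Phase 4: switch reads to the new copy, delete the old one.
        self.reads_from_new = True
        self.old_primary = {}

t = Table()
t.write({"id": 1, "country": "uk"})   # exists before the schema change
t.begin_pk_change()
t.write({"id": 2, "country": "us"})   # captured by the double-write
t.backfill()                          # picks up the pre-existing row
t.finish()
print(sorted(t.new_primary))          # [('uk', 1), ('us', 2)]
```

The key invariant is that the backfill only needs to cover rows that existed before the double-write phase began; everything written afterwards is captured by the write path itself.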

@benesch
Contributor Author

benesch commented Mar 13, 2018

It depends on whether there's a requirement for this to happen online.

I'd be sad if changing a table's primary key took the table offline. That said, it would still be a big improvement over our current "just dump it to CSV and reimport it!" recommendation.

@petermattis
Collaborator

I've been assuming that any schema change needs to be asynchronous. Let's not lower this bar yet.

@dianasaur323
Contributor

We talked about this yesterday briefly, but just one data point from a customer who wanted to change column types generally (not just primary keys): they were okay with this locking up a table while the change is going on. I don't fully understand the term backfilling, but are we also using this term to describe the faster process we've been talking about re doing something with sstables (vague because I only get 20% of it)? If not, I wonder if someone would want a fast but offline primary key swap versus a slow but online primary key swap.

@petermattis
Collaborator

okay with this locking up a table while the change is going on

I'm not sure I believe this. Well, I believe the customer has stated that locking the table is ok, but we've also heard many times that async schema changes massively reduce operational burden.

@dianasaur323
Contributor

I think you're right - it's definitely 100x better to have async schema changes. That being said, if one takes way way longer than the other, perhaps it changes our decision? Either way, this is obviously an engineering decision, but just wanted to provide that single relevant data point here.

@vivekmenezes
Contributor

vivekmenezes commented Mar 15, 2018

I think this is a low impact high effort issue. This change is valuable to a customer who is

  1. running in production
  2. running globally
  3. running with partitioning
  4. wants to change the primary key

I think number 4 in particular is rare.

I'll also bring up that when we do this we need to first refactor the codebase to accommodate this change (it should be clear that the code can be modifying two tables). It's very easy to make this change and end up with what we had with the sql executor, but this time on the SQL DML code paths.

I'm just saying that we better think about the weight of this change on the stability of the system and the priority of it for our customers for 2.1.

@bdarnell
Contributor

I agree that it's high effort, but I think you're understating the impact. We have at least two features where the primary key is crucial (partitioning and interleaving), and even without that people sometimes want to change their primary keys (existing databases all have this feature). It's unrealistic to expect people to be able to choose the correct PK when first creating the table.

okay with this locking up a table while the change is going on

I'm not sure I believe this. Well, I believe the customer has stated that locking the table is ok, but
we've also heard many times that async schema changes massively reduce operational burden.

If taking the table offline is acceptable, then an export/edit/import process would work. I think if we're going to do this we should do the work to make it async/online.

We talked about this yesterday briefly, but just one data point from a customer who wanted to change column types generally (not just primary keys)

Changing types of non-PK columns is a separate matter. It's much easier, and I think it's higher priority than changing PKs.


@Freeaqingme

This item only shows examples where the old primary key is removed first. As outlined in #40771 there's also a scenario possible where a table was defined without a primary key in the first place, and as such nothing was deleted.

@jordanlewis I'm perfectly fine with closing the issue I mentioned above. Should this issue also be given the label for postgres compatibility, or would that be counterproductive?

@jordanlewis jordanlewis added the A-sql-pgcompat Semantic compatibility with PostgreSQL label Sep 26, 2019
@jordanlewis
Member

In CockroachDB a table always has a primary key behind the scenes, so this issue does cover your case as well. Thanks for noticing the missing compat label - I've added it.

@fire

fire commented Mar 15, 2020

Can this issue be closed once https://www.cockroachlabs.com/docs/v20.1/alter-primary-key.html is stable?

@awoods187
Contributor

Yes! In fact, I'm going to go ahead and close this since it will be addressed in 20.1.

rafiss added a commit to rafiss/cockroach that referenced this issue Mar 26, 2020
Recently, cockroachdb#35879 and cockroachdb#19141 were closed, so new tests began passing

Release justification: non-production change
Release note: None
craig bot pushed a commit that referenced this issue Mar 30, 2020
46625: kv/kvserver: don't bump ReadTimestamp on EndTxn batch with bounded read latches r=nvanbenschoten a=nvanbenschoten

This commit prevents an error-prone situation where requests with an
EndTxn request that acquire read-latches at bounded timestamps were
allowed to move their ReadTimestamp to their WriteTimestamp without
re-acquiring new latches. This could allow the request to read at an
unprotected timestamp and forgo proper synchronization with writes with
respect to its lock-table scan and timestamp cache update.

This relates to 11bffb2, which made a similar fix to server-side
refreshes, but not to this new (as of this release) pre-emptive
ReadTimestamp update mechanism.

This bug was triggering the following assertion when stressing
`TestKVNemesisSingleNode`:

```
F200326 15:56:11.350743 1199 kv/kvserver/concurrency/concurrency_manager.go:261  [n1,s1,r33/1:/{Table/50/"60…-Max}] caller violated contract: discovered non-conflicting lock
```

The batch that hit this issue looked like:
```
Get [/Table/50/"a1036fd0",/Min),
Get [/Table/50/"e54a51a8",/Min),
QueryIntent [/Table/50/"aafccb40",/Min),
QueryIntent [/Table/50/"be7f37c9",/Min),
EndTxn(commit:true tsflex:true)
```

This commit adds two subtests to `TestTxnCoordSenderRetries` that
create this scenario.

Release note (bug fix): A rare bug causing an assertion failure was
fixed. The assertion error message was "caller violated contract:
discovered non-conflicting lock".

Release justification: fixes a serious bug that could crash a server.
Additionally, the bug could have theoretically allowed isolation
violations between transactions without hitting the assertion, though
this was never observed in practice.

cc. @danhhz 

46636: roachtest: update blacklists after new fixes r=rafiss a=rafiss

Recently, #35879 and #19141 were closed, so new tests began passing

Release justification: non-production change
Release note: None

46732: build: disable required release justifications r=jordanlewis a=otan

release-20.1 is cut.

Release note: None

46733: licenses: Update BSL change date for master/20.2 r=bdarnell a=otan

Release note: None

Release justification: N/A

Co-authored-by: Nathan VanBenschoten <[email protected]>
Co-authored-by: Rafi Shamim <[email protected]>
Co-authored-by: Oliver Tan <[email protected]>