Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: creating a table and adding an index within a single transaction produces problems #39667

Closed
BramGruneir opened this issue Aug 14, 2019 · 6 comments
Assignees
Labels
A-schema-changes C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.

Comments

@BramGruneir
Copy link
Member

The following has a number of issues and affects 19.1.x as well as master.

BEGIN;
CREATE TABLE test (
    id                 UUID DEFAULT gen_random_uuid() PRIMARY KEY,
    segment_key        UUID
);
CREATE INDEX idx ON test (segment_key);
COMMIT;

There are a number of issues here:

  1. if you delay a bit between issuing the create table and create index, the transaction will fail and require a retry.
root@:26257/defaultdb  OPEN> commit;
pq: restart transaction: TransactionRetryWithProtoRefreshError: TransactionRetryError: retry txn (RETRY_SERIALIZABLE): id=3754c52d key=/Table/SystemConfigSpan/Start rw=true pri=0.00095111 stat=PENDING epo=0 ts=1565796278.946017000,1 orig=1565796271.134457000,0 min=1565796271.134457000,0 max=1565796271.134457000,0 wto=false seq=17
  1. If the full transaction is committed, it will leave a job in a pending state and never complete it.

Screen Shot 2019-08-14 at 11 27 39

  1. Subsequent schema changes to the table, may not take effect, even if they show up as passing.
    This is the problem that flyway migrations keep experiencing. Columns not being added is the most common.
@BramGruneir BramGruneir added C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. A-schema-changes labels Aug 14, 2019
@BramGruneir
Copy link
Member Author

Assigning @dt for triage.
cc @awoods187

@awoods187
Copy link
Contributor

This impacted our support for Flyway. We were able to enable Flyway to work for a customer if they separated schema changes into separate files as a temporary workaround.

@BramGruneir
Copy link
Member Author

That being said, flyway itself initializes itself and does exactly this:

Screen Shot 2019-08-14 at 15 04 46

So it will always be in some sort of error state.

@RoachietheSupportRoach
Copy link
Collaborator

Zendesk ticket #3516 has been linked to this issue.

@adityamaru27
Copy link
Contributor

adityamaru27 commented Aug 22, 2019

Following are my findings after debugging (and discussing with @dt ):
• The "pending" jobs showing up for schema changes in the same txn as the one in which the table was created, is a problem which I'll send a fix out for tomorrow. In terms of subsequent schema changes failing it was a red herring.

• The problem lies in the fact that the flyway scripts attempt to update a column in the same txn as it was added. This is a fundamental restriction in the way schema changes are setup. I traced down the error column * does not exist to the optimizer, when it is attempting to plan the UPDATE statement. The reason for this error is that the new column has not been added to the table desc (it is still queued up in the descs mutation list), as the schema changer only runs after the txn has committed. So the planning of UPDATE fails. Similar behavior can be seen in the logic test - https://github.com/cockroachdb/cockroach/blob/master/pkg/sql/logictest/testdata/logic_test/update#L462
and so while not ideal, this is the correct/expected behavior.

• I haven't debugged the txn retrying issue mentioned above yet.

adityamaru27 added a commit to adityamaru27/cockroach that referenced this issue Aug 22, 2019
Previously, we would unconditionally create a job for a schema change
issued in the same txn as the one in which its table was created. This
is contrary to how we would handle these schema changes. In such
scenarios schema changes would be applied synchronously without using
the schema changer. Thus, these jobs would indefinitely remain in a pending
state.

This change adds a condition to check whether job creation is necessary.
It fixes one symptom of cockroachdb#39667.

Release note: None
craig bot pushed a commit that referenced this issue Aug 22, 2019
39819: sql: Skip creation/update of job for schema change in same txn as CREATE r=adityamaru27 a=adityamaru27

Previously, we would unconditionally create a job for a schema change
issued in the same txn as the one in which its table was created. This
is contrary to how we would handle these schema changes. In such
scenarios schema changes would be applied synchronously without using
the schema changer. Thus, these jobs would indefinitely remain in a pending
state.

This change adds a condition to check whether job creation is necessary.
It fixes one symptom of #39667.

Release note: None

Co-authored-by: Aditya Maru <[email protected]>
@dt
Copy link
Member

dt commented Sep 10, 2019

I think the bug here was fixed in #39819. What is left is a known-limitation of the SQL txn/schema-change interaction so I think this particular ticket can be closed now.

@dt dt closed this as completed Sep 10, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-schema-changes C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.
Projects
None yet
Development

No branches or pull requests

5 participants