Query in transaction may return rows with same unique index column value #24195

vivid392845427 · 2021-04-21T16:21:29Z

Bug Report

Please answer these questions before submitting your issue. Thanks!

1. Minimal reproduce step (Required)

drop table if exists t;
create table t (a int, b int, primary key (a, b) /*T![clustered_index] nonclustered */);
insert into t values (1, 10);
begin optimistic;
insert into t values (1, 10);
select * from t;
commit;

2. What did you expect to see? (Required)

Only one row is returned by the select statement.

3. What did you see instead (Required)

mysql> /* t */ select * from t; -- (1, 10), (1, 10)
+---+----+
| a | b  |
+---+----+
| 1 | 10 |
| 1 | 10 |
+---+----+
2 rows in set (0.00 sec)

mysql> /* t */ commit;
ERROR 1062 (23000): Duplicate entry '1-10' for key 'PRIMARY'

4. What is your TiDB version? (Required)

Release Version: v5.0.0-43-g41871e0c8
Edition: Community
Git Commit Hash: 41871e0
Git Branch: release-5.0
UTC Build Time: 2021-04-20 09:24:52
GoVersion: go1.13
Race Enabled: false
TiKV Min Version: v3.0.0-60965b006877ca7234adaced7890d7b029ed1306
Check Table Before Drop: false

cfzjywxk · 2021-04-22T02:25:07Z

The difference in behaviour between int-handle and non-int-handle tables arises because the unionScan operator merges or excludes rows based on the handle value. When the primary key is not an int handle, the generated handle _tidb_rowid may differ for the two rows, so both rows are visible in the optimistic transaction.
This difference exists in v4.0, v5.0 and master; the expected behaviour needs to be defined first.

vivid392845427 · 2021-04-23T10:46:11Z

Scenario 2: Pessimistic transaction

/* s1 */ drop table if exists t;
/* s1 */ create table t (c1 varchar(10), c2 int, c3 char(20), primary key (c1, c2));
/* s1 */ insert into t values ('tag', 10, 't'), ('cat', 20, 'c');
/* s2 */ begin;
/* s2 */ update t set c1=reverse(c1) where c1='tag';
/* s4 */ begin;
/* s4 */ insert into t values('dress',40,'d'),('tag', 10, 't');
/* s2 */ commit;
/* s4 */ select * from t use index(primary) order by c1,c2;

When the primary key is a clustered index, the query returns:
mysql> /* s4 */ select * from t use index(primary) order by c1,c2;
+-------+----+------+
| c1    | c2 | c3   |
+-------+----+------+
| cat   | 20 | c    |
| dress | 40 | d    |
| tag   | 10 | t    |
+-------+----+------+
3 rows in set (0.00 sec)

When the primary key is a nonclustered index, the query returns:
mysql> /* s4 */ select * from t use index(primary) order by c1,c2;
+-------+----+------+
| c1    | c2 | c3   |
+-------+----+------+
| cat   | 20 | c    |
| dress | 40 | d    |
| tag   | 10 | t    |
| tag   | 10 | t    |
+-------+----+------+
4 rows in set (0.00 sec)

MySQL returns:
mysql> /* s4 */ select * from t use index(primary) order by c1,c2;
+-------+----+------+
| c1    | c2 | c3   |
+-------+----+------+
| cat   | 20 | c    |
| dress | 40 | d    |
| gat   | 10 | t    |
| tag   | 10 | t    |
+-------+----+------+
4 rows in set (0.00 sec)

cfzjywxk · 2021-04-23T11:06:25Z

As described above, the difference in behaviour between int handle and non-int handle may sometimes confuse users, because two rows with the same unique key values are returned by a snapshot read in a transaction.
Both optimistic and pessimistic transactions may encounter this phenomenon.

cfzjywxk · 2021-04-29T06:20:59Z

This issue needs to be fixed. The way to do so is to change the unionScan operator, which merges the in-memory contents with the snapshot results from KV storage, so that unique indexes are taken into account during the merge.

vivid392845427 · 2021-04-29T09:54:15Z

Scenario 3: Insert some records into the table. Union scan cannot merge the rows when an update on the clustered index sets the key of the first record to a value larger than that of the second record.

/* s1 */ drop table if exists test1;
/* s1 */ create table test1(a int primary key clustered, b int);
/* s1 */ insert into test1 values(1,1),(5,5),(10,10);
/* s1 */ begin;
/* s1 */ update test1 set a=8 where a=1;
/* s2 */ begin;
/* s2 */ update test1 set a=a+1;
/* s1 */ commit;
/* s2 */ select * from test1;
Actual result:
mysql> /* s2 */ select * from test1;
+----+------+
| a  | b    |
+----+------+
|  1 |    1 |
|  6 |    5 |
|  9 |    1 |
| 11 |   10 |
+----+------+
4 rows in set (0.00 sec)

Expected result:
mysql> /* s2 */ select * from test1;
+----+------+
| a  | b    |
+----+------+
|  6 |    5 |
|  9 |    1 |
| 11 |   10 |
+----+------+
3 rows in set (0.01 sec)

cfzjywxk · 2021-04-29T10:18:02Z

@vivid392845427
I think this case is a bit different. Pessimistic transactions in TiDB use a current read when doing an update, so the rows actually updated in s2 are 5, 8 and 10; the value 1 is not touched. After the update test1 set a=a+1; statement in s2, the new rows are inserted into the memory buffer. The snapshot read in select * from test1 in s2 then sees the value 1 as well as the newly inserted rows 6, 9 and 11, so the final result is 1, 6, 9, 11.
I think this is expected, as the snapshot read in pessimistic mode is different from that in optimistic mode.

However, the behaviour is still incompatible between clustered and nonclustered indexes in this case, because with a nonclustered index the handle value, which is _tidb_rowid, does not change.

So the problem remains that clustered and nonclustered index tables behave incompatibly.

you06 · 2021-08-24T13:13:39Z

One thing confuses me: optimistic transactions do not perform the unique key check for insert statements before commit. This behavior is consistent across all versions of TiDB. However, it raises the problem of how to union membuffer data with snapshot data. Further, which data should be chosen is hard to answer.

tidb/executor/union_scan.go

Lines 223 to 228 in 5516d7d

checkKey := tablecodec.EncodeRecordKey(us.table.RecordPrefix(), snapshotHandle)
if _, err := us.memBufSnap.Get(context.TODO(), checkKey); err == nil {
	// If src handle appears in added rows, it means there is conflict and the transaction will fail to
	// commit, but for simplicity, we don't handle it here.
	continue
}

From the comment, we can see the developer's reasoning: the conflicting transaction fails at commit anyway, so for simplicity the read issue is not handled (in some theories, anomalies in aborted transactions can be dismissed).

IMO, there are 2 ways to solve this issue.

1. Handle the read issue for transactions that will not be able to commit, or that do not check the constraint while executing write statements. This can be done by checking whether any of the unique index keys is in the membuffer. However, some behavior may still not make sense.

/* s1 */ drop table if exists t;
/* s1 */ create table t (a int, b int, primary key (a));
/* s1 */ insert into t values (1, 10);
/* s1 */ begin optimistic;
/* s1 */ insert into t values (1, 11); -- Success in optimistic transaction 
/* s1 */ select * from t; -- What result set do we expect? If membuffer data takes priority, we'll get (1, 11)
/* s1 */ commit; -- Fail in optimistic transaction

2. Check the constraints for optimistic transactions as well. This makes the issue clearer, and we no longer face the question of which data to choose.

I prefer the second solution, @cfzjywxk, what's your idea?

cfzjywxk · 2021-08-24T13:25:04Z

@you06
To keep compatibility, I think it's more reasonable to handle this conflict in the union scan operator and return the memory buffer content when merging; it's weird to return identical rows when reading a table with unique index keys.
For this solution we may need to verify whether giving the memory buffer priority introduces other odd behaviours or compatibility issues. If it does not, I think it's an improvement for optimistic transactions.

you06 · 2021-08-24T14:06:17Z

If I understand correctly, we can fix it with the first solution and handle it in union scan. The second solution may be a long-term improvement, for which we should consider whether our users would suffer unexpected failures introduced by the change. @cfzjywxk is that what you mean?

cfzjywxk · 2021-08-25T02:02:53Z

@you06
Yes, for now we could first fix some of the unexpected or unreasonable behaviours without breaking any compatibility.

you06 · 2021-08-27T02:45:07Z

Scenario 2: Pessimistic transaction

The odd MySQL result comes from its lazy snapshot strategy. If s4 runs a select right after its begin statement (which makes MySQL take the transaction's snapshot), we get the following result, which is also what I would expect.

/* s1 */ drop table if exists t;
/* s1 */ create table t (c1 varchar(10), c2 int, c3 char(20), primary key (c1, c2));
/* s1 */ insert into t values ('tag', 10, 't'), ('cat', 20, 'c');

/* s2 */ begin;
/* s2 */ update t set c1=reverse(c1) where c1='tag';

/* s4 */ begin;
/* s4 */ select * from t use index(primary) order by c1,c2;
/* s4 */ insert into t values('dress',40,'d'),('tag', 10, 't');

/* s2 */ commit;
/* s4 */ select * from t use index(primary) order by c1,c2;
-- s4 >> +-------+----+----+
-- s4    |  C1   | C2 | C3 |
-- s4    +-------+----+----+
-- s4    | cat   | 20 | c  |
-- s4    | dress | 40 | d  |
-- s4    | tag   | 10 | t  |
-- s4    +-------+----+----+
/* s4 */ commit;

/* s2 */ select * from t use index(primary) order by c1,c2;
-- s2 >> +-------+----+----+
-- s2    |  C1   | C2 | C3 |
-- s2    +-------+----+----+
-- s2    | cat   | 20 | c  |
-- s2    | dress | 40 | d  |
-- s2    | gat   | 10 | t  |
-- s2    | tag   | 10 | t  |
-- s2    +-------+----+----+

cfzjywxk · 2021-11-22T08:42:24Z

@you06 @vivid392845427 @zyguan
As there's compatibility risk for these related issues and they are not easy to fix, how do you think we should proceed?

zyguan · 2021-11-23T15:02:15Z

Maybe we can document it as a known issue for now. IMO, it's actually not a conventional usage: since an optimistic transaction is not guaranteed to be committed, one should handle the intermediate results within the transaction carefully.

you06 · 2021-11-24T07:01:16Z

This is similar to the delete-your-writes optimization. Since #28806 is blocked by some complex problems, I think it's better to put this issue on hold for now.

IMO, it's actually not a conventional usage

I agree with this point, maybe we can downgrade the severity of this issue.

vivid392845427 added the type/bug The issue is confirmed as a bug. label Apr 21, 2021

cfzjywxk added sig/transaction SIG:Transaction status/discussion labels Apr 22, 2021

ichn-hu mentioned this issue Apr 22, 2021

Welcome to contribute #20804

Closed

cfzjywxk changed the title ~~Optimistic transactions delay checking，the newly inserted record is unable to merge~~ Query in transaction may return rows with same unique index column value Apr 25, 2021

cfzjywxk added severity/major help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. and removed status/discussion labels Apr 29, 2021

cfzjywxk assigned you06 Aug 23, 2021

you06 linked a pull request Aug 27, 2021 that will close this issue

executor: union scan checks unique indexes #27630

Open

12 tasks

tangenta mentioned this issue Aug 30, 2021

Duplicate rows occur in primary/unique key in optimistic transaction #22730

Closed

vivid392845427 mentioned this issue Dec 6, 2021

TiDB fails to insert data when using transaction #30412

Closed

cfzjywxk mentioned this issue Dec 17, 2021

clustered index + unionscan，query using index return the same 2 records #30823

Closed

vivid392845427 mentioned this issue Dec 31, 2021

Incorrect select result after point update nothing #28011

Closed

jebter added affects-5.0 This bug affects 5.0.x versions. affects-5.1 This bug affects 5.1.x versions. labels Jan 11, 2022

jebter added affects-5.2 This bug affects 5.2.x versions. affects-5.3 This bug affects 5.3.x versions. affects-5.4 This bug affects the 5.4.x(LTS) versions. labels Jan 11, 2022

VelocityLight added the affects-6.0 label Mar 17, 2022

cfzjywxk mentioned this issue Mar 23, 2022

Weird SELECT when table has the primary key #33315

Closed

VelocityLight added the affects-6.1 This bug affects the 6.1.x(LTS) versions. label May 20, 2022

VelocityLight added the affects-6.2 label Jul 20, 2022

cfzjywxk mentioned this issue Aug 4, 2022

doc: lazy constraint check in pessimistic txn #36889

Merged

4 tasks

VelocityLight added the affects-6.3 label Sep 20, 2022

VelocityLight added the affects-6.4 label Nov 4, 2022

VelocityLight added the affects-6.5 This bug affects the 6.5.x(LTS) versions. label Dec 2, 2022

VelocityLight added the affects-6.6 label Feb 9, 2023

VelocityLight added the affects-7.0 label Mar 20, 2023

VelocityLight added the affects-7.1 This bug affects the 7.1.x(LTS) versions. label Apr 20, 2023

This was referenced May 26, 2023

Query in transaction may return rows with same primary key column value #44200

Open

Query in transaction behavior is inconsistent with MySQL #44303

Open

cfzjywxk removed the severity/major label Aug 16, 2023

zyguan mentioned this issue Dec 1, 2023

Phantom rows caused by update statements which changes value of the primary key #48960

Closed

jebter added the severity/moderate label Aug 8, 2024
