Frequently updating non-index column causing too many Lock records in the unique index key #25659
Since #21229, the unique index key is pessimistically locked. This change makes the lock behaviour more reasonable, but it increases the chance of hitting the performance issue described below: too many `Lock` records accumulating in the write CF on the unique index key.
@cfzjywxk To solve exactly the problem introduced by #21229, I think we can just put the index record again instead of leaving a `Lock` record. The index record should be small, so putting it again does not bring too much overhead. With a clustered index the index record can be larger, but typically it won't be.
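The effect of re-putting the index record can be sketched with a toy model. This is illustrative Python, not TiKV code; the record kinds and the scan loop are simplified assumptions. It shows why writing a `Put` for the index record instead of a `Lock` lets a read stop at the very first write-CF record:

```python
# Toy model (NOT TiKV code): the write CF for one unique index key,
# newest record first. A read must step over LOCK records until it
# reaches a PUT or DELETE; each step is one `next` operation.

def next_ops_to_read(write_cf):
    """Count `next` operations needed to find the latest valid record."""
    ops = 0
    for kind, _value in write_cf:
        ops += 1
        if kind in ("PUT", "DELETE"):
            return ops
    return ops

# Today: each update leaves a LOCK record in front of the original PUT.
locked = [("LOCK", None)] * 1000 + [("PUT", b"row-handle")]

# After a re-put, the newest record is already the valid PUT.
re_put = [("PUT", b"row-handle")] + locked

print(next_ops_to_read(locked), next_ops_to_read(re_put))  # 1001 1
```

The scan cost grows linearly with the number of accumulated `Lock` records, while the re-put variant stays constant regardless of update frequency.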
Agreed, this seems to be the plan with the least risk of change. The TODOs are:
@sticnarf @MyonKeminta BTW, I'm working on the hotfix patch as there are already some customers encountering this issue.
We can include this in the next patch.
Currently, I don't have a very clear idea of how to achieve it in detail. I have come up with several approaches, but each of them has its own drawbacks.

**Approach 1**

Read until the latest valid record during prewrite. But prewrite currently does not read the latest valid record except when CDC is enabled, and reading it on every write introduces overhead on the hot path.

**Approach 2**

To optimize approach 1, we can record in the latest record how many contiguous `Lock` records need to be skipped. But it is a problem to decide where to store this number. If it is added as a separate field, old versions cannot parse it; to keep backward compatibility, it must be put in an existing part of the record.

**Approach 3**

A more conservative way is to let TiDB decide the re-put. TiKV only provides hints of how many useless records were skipped. This approach does not modify the data format or introduce much overhead, but it seems complex and has limited use cases. We only re-put if we did …
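As a rough illustration of Approach 2, here is a toy Python model; the `skip_hint` field and the record layout are hypothetical, not an existing TiKV format. A hint stored in the newest `Lock` record lets a reader seek past the whole `Lock` chain instead of iterating over it:

```python
# Toy model (NOT TiKV code, hypothetical record format): the write CF
# for one key, newest record first. The newest LOCK record carries a
# `skip_hint` telling the reader where the first valid record is.

def read_naive(write_cf):
    """Scan record by record; one `next` operation per skipped LOCK."""
    ops = 0
    for rec in write_cf:
        ops += 1
        if rec["type"] in ("PUT", "DELETE"):
            return rec, ops
    return None, ops

def read_with_hint(write_cf):
    """Seek directly using the hint in the newest record (one seek)."""
    skip = write_cf[0].get("skip_hint", 0)
    return write_cf[skip]

write_cf = (
    [{"type": "LOCK", "skip_hint": 1000}]
    + [{"type": "LOCK"}] * 999
    + [{"type": "PUT", "value": b"row-handle"}]
)

_, naive_ops = read_naive(write_cf)
print(naive_ops, read_with_hint(write_cf)["type"])  # 1001 PUT
```

This also makes the backward-compatibility concern concrete: an old reader that does not know about `skip_hint` still works via the naive scan, but only if the hint is stored somewhere old versions can safely ignore.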
@MyonKeminta @sticnarf
Fixed in #25730 and related cherry-picks. |
Please check whether the issue should be labeled with 'affects-x.y' or 'backport-x.y.z'.
Bug Report
Please answer these questions before submitting your issue. Thanks!
1. Minimal reproduce step (Required)
This problem occurs if a table has:
- a unique index (for example, a non-clustered primary key), and
- a non-index column that is updated frequently under pessimistic transactions.
Example table definition (without a clustered index, so that the primary key is a separate unique index):
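The actual definition was not captured above; a minimal hypothetical TiDB table matching the description (a non-clustered primary key plus a frequently updated non-index column, with illustrative names) might look like:

```sql
-- Hypothetical example; the original table definition is not shown.
-- NONCLUSTERED makes the primary key a separate unique index.
CREATE TABLE t (
    id BIGINT NOT NULL,
    counter BIGINT NOT NULL,
    PRIMARY KEY (id) NONCLUSTERED
);
```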
And there are frequent update operations like:
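The exact statement is omitted above; a hypothetical update matching the description (touching only a non-index column, so each pessimistic transaction locks the unique index key without changing it) could be:

```sql
-- Hypothetical workload; each such update leaves a WriteType::Lock
-- record in the write CF on the unique index key of `id`.
UPDATE t SET counter = counter + 1 WHERE id = 1;
```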
2. What did you expect to see? (Required)
Everything goes well
3. What did you see instead (Required)
The performance decreases, and it can be seen that the number of `next` operations is very high for some requests like `acquire_pessimistic_lock` and `kv_get`.

Since each update locks the unique index key, it produces a `WriteType::Lock` record in the write CF on the unique index key. When reading the value of such a key, TiKV needs to iterate over all these `Lock` records to find the latest `Put` or `Delete` record. Every time it iterates over a `Lock` record, it performs a `next` operation. In some bad cases, reading a single key may produce over 100k `next` operations.

4. What is your TiDB version? (Required)
4.0.10+, 5.x, master