*: Check index number in txn #29453

Closed
wants to merge 14 commits into from


ekexium
Contributor

@ekexium ekexium commented Nov 4, 2021

What problem does this PR solve?

Issue Number: close #xxx

Problem Summary:

This check should catch some cases where an index insertion is unexpectedly missing.

What is changed and how it works?

It asserts that, for each table, the number of index insertions in a transaction is a multiple of the number of row insertions.
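
A minimal sketch of the assertion described above, for illustration only. The map names `rowInsertionCount` and `indexInsertionCount` follow the diff excerpt quoted later in this thread; the function `checkIndexCounts` and its error message are hypothetical and are not the PR's actual code.

```go
package main

import "fmt"

// checkIndexCounts is a hypothetical helper. For each table it verifies that
// the number of index insertions is a multiple of the number of row
// insertions. Tables with zero row insertions are skipped, which also avoids
// a modulo-by-zero panic.
func checkIndexCounts(rowInsertionCount, indexInsertionCount map[int64]int) error {
	for tableID, count := range indexInsertionCount {
		rows := rowInsertionCount[tableID]
		if rows > 0 && count%rows != 0 {
			return fmt.Errorf("table %d: %d index insertions is not a multiple of %d row insertions",
				tableID, count, rows)
		}
	}
	return nil
}

func main() {
	rows := map[int64]int{1: 2}
	// 2 rows with 2 index entries each: the check passes.
	fmt.Println(checkIndexCounts(rows, map[int64]int{1: 4}))
	// 2 rows but 5 index entries: the check reports an error.
	fmt.Println(checkIndexCounts(rows, map[int64]int{1: 5}))
	// No row insertions for the table: the check is skipped.
	fmt.Println(checkIndexCounts(map[int64]int{}, map[int64]int{1: 3}))
}
```

The `rows > 0` guard is what lets transactions that only touch index keys pass without being checked.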

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No code

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Documentation

  • Affects user behaviors
  • Contains syntax changes
  • Contains variable changes
  • Contains experimental features
  • Changes MySQL compatibility

Release note

None

@ti-chi-bot
Member

ti-chi-bot commented Nov 4, 2021

[REVIEW NOTIFICATION]

This pull request has been approved by:

  • cfzjywxk

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

The full list of commands accepted by this bot can be found here.

Reviewer can indicate their review by submitting an approval review.
Reviewer can cancel approval by submitting a request changes review.

@ti-chi-bot ti-chi-bot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. release-note-none Denotes a PR that doesn't merit a release note. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Nov 4, 2021
@ekexium ekexium changed the title *: Check in txn *: Check index number in txn Nov 4, 2021
@ekexium
Contributor Author

ekexium commented Nov 10, 2021

/run-integration-tests

@ekexium ekexium marked this pull request as ready for review November 11, 2021 08:26
@ti-chi-bot ti-chi-bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Nov 11, 2021
@ekexium
Contributor Author

ekexium commented Nov 11, 2021

I'm not quite sure whether DDL or tools may use transactions that don't satisfy the constraint.

@cfzjywxk
Contributor

I'm not quite sure whether DDL or tools may use transactions that don't satisfy the constraint.

Backfill transactions may lock the row keys and insert the needed index keys; the lock operation puts zero-value KV pairs into the memdb. If so, the backfill transaction does not satisfy the rule, so we may need to verify the current code or run a test for it.

Signed-off-by: ekexium <[email protected]>
Signed-off-by: ekexium <[email protected]>
@ti-chi-bot ti-chi-bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Dec 1, 2021
@ekexium
Contributor Author

ekexium commented Dec 6, 2021

I'm not quite sure whether DDL or tools may use transactions that don't satisfy the constraint.

Backfill transactions may lock the row keys and insert the needed index keys; the lock operation puts zero-value KV pairs into the memdb. If so, the backfill transaction does not satisfy the rule, so we may need to verify the current code or run a test for it.

In such a situation, #(row insertion) = 0, so we don't check the index counts.

@ti-chi-bot ti-chi-bot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Dec 6, 2021
@ti-chi-bot ti-chi-bot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Dec 7, 2021
@ekexium
Contributor Author

ekexium commented Dec 8, 2021

Theoretically, I cannot find a case that doesn't satisfy the rule at present.
I'd like to merge it and run regression tests with it, since it's somewhat risky. @cfzjywxk @MyonKeminta

@cfzjywxk
Contributor

cfzjywxk commented Dec 8, 2021

Theoretically, I cannot find a case that doesn't satisfy the rule at present. I'd like to merge it and run regression tests with it, since it's somewhat risky. @cfzjywxk @MyonKeminta

We could limit this check to user transactions and skip internal ones for now.

@ekexium
Copy link
Contributor Author

ekexium commented Dec 8, 2021

Good idea.
I didn't find a clear indicator of a "user sql" so I used the condition !internal && (isInsert || isUpdate || isDelete). Please point it out if there is a better choice.

Update: That's nonsense. The check is performed in a commit statement.

@cfzjywxk
Contributor

cfzjywxk commented Dec 9, 2021

Good idea. I didn't find a clear indicator of a "user sql" so I used the condition !internal && (isInsert || isUpdate || isDelete). Please point it out if there is a better choice.

@ekexium
We could use the connection ID to check whether a transaction is a user one; an internal session always has a zero connection ID.
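
A rough sketch of that suggestion, assuming only what the comment states: internal sessions carry a connection ID of zero. The type `sessionInfo` and the function `isUserSession` are hypothetical names made up for this illustration, not TiDB APIs.

```go
package main

import "fmt"

// sessionInfo is a hypothetical stand-in for whatever session state carries
// the connection ID.
type sessionInfo struct {
	connID uint64
}

// isUserSession reports whether the session belongs to a user connection,
// relying on the assumption that internal sessions have connection ID 0.
func isUserSession(s sessionInfo) bool {
	return s.connID != 0
}

func main() {
	fmt.Println(isUserSession(sessionInfo{connID: 0}))  // internal session
	fmt.Println(isUserSession(sessionInfo{connID: 42})) // user session
}
```

With such a predicate, the index-count check could be gated on user sessions alone and skipped for internal ones.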

return errors.Trace(err)
}
for tableID, count := range indexInsertionCount {
if rowInsertionCount[tableID] > 0 && count%rowInsertionCount[tableID] != 0 {
Contributor

We could add some comments to explain the proof or invariant here.

@ti-chi-bot ti-chi-bot added the status/LGT1 Indicates that a PR has LGTM 1. label Dec 13, 2021
@cfzjywxk
Contributor

@MyonKeminta PTAL

Contributor

@MyonKeminta MyonKeminta left a comment

I'm actually still not very sure if this invariant is always true...

session/session.go Outdated
Signed-off-by: ekexium <[email protected]>
@ekexium
Contributor Author

ekexium commented Dec 15, 2021

I'm actually still not very sure if this invariant is always true...

Me too, so I would like to see if it can pass regression tests. Do the comments help convince you?

@MyonKeminta
Contributor

I'm actually still not very sure if this invariant is always true...

Me too, so I would like to see if it can pass regression tests. Do the comments help convince you?

At least I didn't come up with any counterexample..

// v no duplicates (index does not exist): PR + 1, PI + N
// v duplicated (0 puts, M dels): PR + 1, DI - M, PI + N
// note: duplicated index mutations can only come from deletions on unique indices
// h is del
Contributor

Do "h does not exist" and "h is del" indicate whether it exists in the current transaction's mem buffer?

Contributor Author

Yep. h is del means it exists in the membuf and is a del.

}
return nil
}
if err := kv.WalkMemBuffer(memBuffer, f); err != nil {
Contributor

Does it process entries that are not modified but are pessimistically locked? Can it work with this change?

Contributor

@cfzjywxk cfzjywxk Dec 15, 2021

Yes, it will process keys that are locked but not changed, because each unique index key used to fetch a row key is turned into a PUT record. The two seem incompatible. Consider this:

create table t1(c1 int key, c2 int, c3 int, unique key uk(c2), key k1(c3));
insert into t1 values(1, 1, 1),(2, 2, 2),(3, 3, 3);
update t1 set c3 = c3 + 1 where c1 in (2, 3); -- Uses the batch point get executor.
Here, the keys to be committed are:
PUT uk 1->1 uk 2->2 uk 3->3
PUT sk 3_2 -> 2
PUT sk 4_3 -> 3
DEL sk 3_3 -> 3
DEL sk 2_2 -> 2
PUT rowkey 3 -> 3, 3, 4
PUT rowkey 2 -> 2, 2, 3

So the number of PUTs on row keys is 2, while the number of PUTs on index keys is 5, which is not a multiple of 2?
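
The arithmetic behind this counterexample can be replayed with a small sketch. The `mutation` type and both helper functions are hypothetical simplifications made up for illustration; the list of mutations is copied as given in the comment above and is not derived from TiDB itself.

```go
package main

import "fmt"

// mutation is a hypothetical, simplified record of one key to be committed.
type mutation struct {
	op      string // "PUT" or "DEL"
	isIndex bool   // true for index keys (uk, sk), false for row keys
}

// exampleMutations lists the keys from the comment above, in the same order.
func exampleMutations() []mutation {
	return []mutation{
		{"PUT", true}, {"PUT", true}, {"PUT", true}, // uk 1->1, 2->2, 3->3
		{"PUT", true}, {"PUT", true}, // sk 3_2, sk 4_3
		{"DEL", true}, {"DEL", true}, // sk 3_3, sk 2_2
		{"PUT", false}, {"PUT", false}, // rowkey 3, rowkey 2
	}
}

// countPuts tallies PUT mutations on row keys and on index keys separately.
func countPuts(muts []mutation) (rowPuts, indexPuts int) {
	for _, m := range muts {
		if m.op != "PUT" {
			continue
		}
		if m.isIndex {
			indexPuts++
		} else {
			rowPuts++
		}
	}
	return rowPuts, indexPuts
}

func main() {
	rowPuts, indexPuts := countPuts(exampleMutations())
	violated := rowPuts > 0 && indexPuts%rowPuts != 0
	fmt.Println(rowPuts, indexPuts, violated) // prints "2 5 true"
}
```

The three unique-index PUTs that stand in for locks are what push the index count to 5, so the multiple-of-rows check would reject this otherwise valid transaction.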

Contributor

@ekexium
Seems @MyonKeminta has found a key point.

Contributor Author

@ekexium ekexium Dec 15, 2021

Right.. it's an intentional special case. It seems to me we can't simply tell whether it comes from the optimization. A workaround might be adding a flag to indicate the special usage so they can be ignored in the check.

Contributor

The change is actually not expected; it's a temporary solution to work around the performance issues caused by many LOCK records in the write CF. Maybe we could put this check back after solving the LOCK record issue, so that these tricky optimizations or transformations can be removed.

Contributor Author

I'm wondering whether there is a way to distinguish this usage that won't block removing it later.

Contributor

If we have to add some new flags to the membuffer to handle this, I doubt whether it's worth it...

@ti-chi-bot
Member

@ekexium: PR needs rebase.


@ti-chi-bot ti-chi-bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Dec 17, 2021
@ti-chi-bot ti-chi-bot deleted the branch pingcap:ft-data-inconsistency February 11, 2022 05:51
@ti-chi-bot ti-chi-bot closed this Feb 11, 2022
Labels
needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. release-note-none Denotes a PR that doesn't merit a release note. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. status/LGT1 Indicates that a PR has LGTM 1.

4 participants