Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[3.5] etcdserver: backport check scheduledCompactKeyName and finishedCompac… #16068

Merged
merged 1 commit into from
Jul 15, 2023

Conversation

CaojiamingAlan
Copy link
Contributor

@CaojiamingAlan CaojiamingAlan commented Jun 12, 2023

…tKeyName before writing hash to release-3.5.

Follow up of #15985

Fix #15919.
Check ScheduledCompactKeyName and FinishedCompactKeyName before writing hash to hashstore.
If they do not match, then it means this compaction has once been interrupted and its hash value is invalid. In such cases, we won't write the hash values to the hashstore, and avoids the incorrect corruption alarm.

Please read https://github.com/etcd-io/etcd/blob/main/CONTRIBUTING.md#contribution-flow.

// checkPrevCompactionCompleted checks whether the previous scheduled compaction is completed.
func (s *store) checkPrevCompactionCompleted() bool {
tx := s.b.ReadTx()
tx.Lock()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just curious why not use tx.RLock() but understand the PR is a backport of #15985

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lock() for ReadTx() should be by default a Rlock().

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For my understanding, do you mean the following RWMutex Lock is equal to RLock to protect the read buffer?

func (rt *readTx) Lock() { rt.mu.Lock() }
func (rt *readTx) Unlock() { rt.mu.Unlock() }
func (rt *readTx) RLock() { rt.mu.RLock() }
func (rt *readTx) RUnlock() { rt.mu.RUnlock() }

Copy link
Contributor Author

@CaojiamingAlan CaojiamingAlan Jun 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After investigating a little bit, I think you are right, we should use RLock() instead. Lock() should only be used when the read buffer is updated by the writes committed by the BatchTxs. However, there are several positions where use read-only operations but Lock() is used. I think it is good to open a separate PR to correct them all.

Please correct me if I'm wrong.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep. That aligns with my understanding.

I also don't think it blocks this PR to be merged. This can be corrected separately.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Old implementation uses stricter lock, which is sub-optimal, but is not a bug.

stricter lock is true, it's also wrong lock. It's a bug to me.

It should also have impact on performance for Non-K8s use case when auth is enabled. [Of course, it will be better if we have some performance comparison, but I will NOT force contributor to do it given it's a straightforward change]

To ensure release stability, we need to assume that the list you proposed can have a mistake.

That's why we need careful review. I wouldn't be afraid backporting it give it's a simple change and if we have confidence.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note I prefer to backporting it, but I am not insist on backporting it if there is strong objection from other maintainers.

Copy link
Member

@serathius serathius Jul 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

stricter lock is true, it's also wrong lock. It's a bug to me.

Without the change system behaves correctly, with the change only thing you get is performance improvement. How this is a bug?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would say it's harmless (on functionality) bug, obviously the lock isn't correctly be used. Let alone it has impact on performance.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bugs is when locks don't protect the critical section, relaxing locks is performance improvement. Nothing is harmless until proven, let alone concurrency change.

Copy link
Member

@jmhbnz jmhbnz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@ahrtr ahrtr changed the title etcdserver: backport check scheduledCompactKeyName and finishedCompac… [3.5] etcdserver: backport check scheduledCompactKeyName and finishedCompac… Jul 14, 2023
Comment on lines +235 to +246
_, scheduledCompactBytes := tx.UnsafeRange(buckets.Meta, scheduledCompactKeyName, nil, 0)
scheduledCompact := int64(0)
if len(scheduledCompactBytes) != 0 {
scheduledCompact = bytesToRev(scheduledCompactBytes[0]).main
}

_, finishedCompactBytes := tx.UnsafeRange(buckets.Meta, finishedCompactKeyName, nil, 0)
finishedCompact := int64(0)
if len(finishedCompactBytes) != 0 {
finishedCompact = bytesToRev(finishedCompactBytes[0]).main

}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a UnsafeReadScheduledCompact and UnsafeReadFinishedCompact in server/mvcc/util.go for 3.5.

The two functions should be reused in multiple places, (Please search scheduledCompactKeyName and finishedCompactKeyName in the code base)

It's accepted to fix it in a separate PR. Please also raise a followup ticket.

@ahrtr
Copy link
Member

ahrtr commented Jul 14, 2023

Please raise two followup ticket as mentioned in #16068 (comment) and #16068 (comment).

Pls also rebase this PR.

…tKeyName before writing hash to release-3.5.

Fix etcd-io#15919.
Check ScheduledCompactKeyName and FinishedCompactKeyName
before writing hash to hashstore.
If they do not match, then it means this compaction has once been interrupted and its hash value is invalid. In such cases, we won't write the hash values to the hashstore, and avoids the incorrect corruption alarm.

Signed-off-by: caojiamingalan <[email protected]>
CaojiamingAlan added a commit to CaojiamingAlan/etcd that referenced this pull request Jul 15, 2023
Replace unnecessary Lock()/Unlock()s with RLock()/RUnlock()s

Signed-off-by: caojiamingalan <[email protected]>
@CaojiamingAlan
Copy link
Contributor Author

Just raised #16247. I will solve #16068 (comment) real quick, so no need for another PR.

Copy link
Member

@ahrtr ahrtr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

thx @CaojiamingAlan

@ahrtr ahrtr merged commit 8f4b6c9 into etcd-io:release-3.5 Jul 15, 2023
@CaojiamingAlan CaojiamingAlan deleted the release-3.5 branch July 17, 2023 18:47
CaojiamingAlan added a commit to CaojiamingAlan/etcd that referenced this pull request Jul 17, 2023
 Add a UnsafeReadScheduledCompact and UnsafeReadFinishedCompact

Signed-off-by: caojiamingalan <[email protected]>
CaojiamingAlan added a commit to CaojiamingAlan/etcd that referenced this pull request Jul 17, 2023
 Add a UnsafeReadScheduledCompact and UnsafeReadFinishedCompact

Signed-off-by: caojiamingalan <[email protected]>
CaojiamingAlan added a commit to CaojiamingAlan/etcd that referenced this pull request Jul 18, 2023
 Add a UnsafeReadScheduledCompact and UnsafeReadFinishedCompact

Signed-off-by: caojiamingalan <[email protected]>
@serathius serathius mentioned this pull request Oct 12, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

5 participants