Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

lint: add new deferunlockcheck pass #107577

Merged
merged 3 commits into from
Sep 12, 2023
Merged

Conversation

Santamaura
Copy link
Contributor

This commit adds a new pass which checks for
...Unlock() expressions without defer which
could result in a lock being held indefinitely.

Resolves #105366

Release note: None

@Santamaura Santamaura requested a review from a team as a code owner July 25, 2023 20:37
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@Santamaura Santamaura requested review from a team and ericharmeling and removed request for a team July 25, 2023 20:38
Copy link
Contributor

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could you also arrange for the check to be disabled if there's a special // nolint:XXX comment next to an Unlock call to say to disable the linter?

See the other linters - we have // nolint:errwrap, // nolint:errcmp etc.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ericharmeling)

Copy link
Contributor Author

@Santamaura Santamaura left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The check does account for this 😄 the exclusion rule is // nolint:deferunlock

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ericharmeling)

Copy link
Collaborator

@rickystewart rickystewart left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like CI is failing, have a look.

Copy link
Collaborator

@rickystewart rickystewart left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, also, add //pkg/testutils/lint/passes/deferunlockcheck along with the other lint checks in the list in top-level BUILD.bazel as part of nogo.

@rickystewart
Copy link
Collaborator

Looks like you will have to go into build/bazelutil/nogo_config.json and add a stanza for deferunlockcheck so that we only check first-party code. You can copy one of the existing blocks. Let me know if you have any questions.

Copy link
Contributor Author

@Santamaura Santamaura left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added

"deferunlockcheck": {

"only\_files": {

"cockroach/pkg/.\*$": "first-party code",

"cockroach/bazel-out/.\*/bin/pkg/.\*$": "first-party code"

}

},

to the nogo_config.json but build still seems to fail on external code..anything additional needed?

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ericharmeling)

@rickystewart
Copy link
Collaborator

What's going on with the backslash?

"only\_files":

Copy link
Contributor Author

@Santamaura Santamaura left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Odd it wasn't a typo, it's inserted when copy-pasting
Screenshot 2023-07-26 at 12.32.33 PM.png

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ericharmeling)

@rickystewart
Copy link
Collaborator

You spelled the name of your linter wrong. It's called deferunlockcall, not deferunlockcheck.

This diff works. Should you copy-paste it, be very careful you do not insert backslashes.

diff --git a/build/bazelutil/nogo_config.json b/build/bazelutil/nogo_config.json
index cddfbc90d5e..b5e9b80b09e 100644
--- a/build/bazelutil/nogo_config.json
+++ b/build/bazelutil/nogo_config.json
@@ -28,6 +28,12 @@
             "_test\\.go$": "tests"
         }
     },
+    "deferunlockcall": {
+        "only_files": {
+            "cockroach/pkg/.*$": "first-party code",
+            "cockroach/bazel-out/.*/bin/pkg/.*$": "first-party code"
+        }
+    },
     "errcheck": {
         "exclude_files": {
             "pkg/.*\\.eg\\.go$": "generated code",

@rickystewart
Copy link
Collaborator

It looks like you use deferunlockcheck in most of the places besides the actual name of the Analyzer. Is that intentional? You may find it easier just to update the Analyzer name to deferunlockcheck if that's what you want it to be called.

@rickystewart
Copy link
Collaborator

Oh, and the PR calls it deferunlocktest. :P It will probably be much less confusing for everyone if we select one name.

@Santamaura Santamaura changed the title lint: add new deferunlocktest pass lint: add new deferunlockcheck pass Jul 26, 2023
Copy link
Contributor Author

@Santamaura Santamaura left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that was the problem thanks! In this scenario should I nolint the current offenders to pass CI?

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ericharmeling)

@rickystewart
Copy link
Collaborator

In this scenario should I nolint the current offenders to pass CI?

If you do, at least file bugs so we can keep track of the current offenders. Otherwise they'll just get lost.

@Santamaura
Copy link
Contributor Author

There is a group effort internally right now to address unsafe locks so I will hold off on merging this until that work progresses since it will resolve the problems this linter is picking up anyways.

@Santamaura
Copy link
Contributor Author

@rickystewart I have PRs that are addressing #107946 but there will still be many manual Unlocks which this linter will catch, is there a way to set the linter such that it would only capture new cases or is the only way to nolint all the remaining safe cases?

@rickystewart
Copy link
Collaborator

@rickystewart I have PRs that are addressing #107946 but there will still be many manual Unlocks which this linter will catch, is there a way to set the linter such that it would only capture new cases or is the only way to nolint all the remaining safe cases?

You want to nolint all the remaining cases. Otherwise it becomes impossible to lint unless you perform a diff between the commit in question and the commit before this one, which is very confusing -- we don't want lints depending on git history in some way.

@Santamaura Santamaura force-pushed the deferunlocktest branch 3 times, most recently from 0dc79ce to 3904acc Compare August 11, 2023 14:39
@Santamaura
Copy link
Contributor Author

Made the "smarter" linter but still doing some testing. Need to update the nogo_config once done with testing.

@Santamaura Santamaura force-pushed the deferunlocktest branch 5 times, most recently from fb21941 to 70f1b24 Compare August 28, 2023 20:32
@Santamaura
Copy link
Contributor Author

OK this should be RFAL

@Santamaura Santamaura requested a review from renatolabs August 28, 2023 21:31
@Santamaura
Copy link
Contributor Author

Should also mention my team wants me to timebox any more time spent on this linter since it's not really obs inf domain so if this iteration ain't it then I'll probably need to hand it off to dev inf.

Copy link
Collaborator

@stevendanna stevendanna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for taking time to address the concerns around this linter.

For me, the inclusion of the "close enough" exception makes this something worth trying.

I'm not convinced we need to report an error on for/range statements. but can live with it if y'all feel it is important.

I briefly read over the linter code, but it should probably have some review from someone more familiar with the relevant APIs.

@Santamaura
Copy link
Contributor Author

Santamaura commented Aug 31, 2023

Not too sure who to get a review from maybe @renatolabs since you're familiar with the analyzer apis?

// shouldReportNonLinearControlFlow checks if locks has a *LockInfo without a matching unlock
// and if the node has a nolint comment near it. If it finds both it returns true.
func (lt *LockTracker) shouldReportNonLinearControlFlow(node ast.Node) bool {
hasNoLintComment := passesutil.HasNolintComment(lt.pass, node, noLintName)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like you can immediately if hasNoLintComment { return false }, right? We will always return false in this case with the way this function is currently written.

}

var (
noLintName = "deferunlock"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: noLintName and maxLineDistance should be const, not var.

if lockDistance > maxLineDistance && !passesutil.HasNolintComment(lt.pass, node, noLintName) {
lt.issues = append(lt.issues, analysis.Diagnostic{
Pos: unlockPos,
Message: fmt.Sprintf("{RW}Mutex {R}Unlock is >%d lines away from matching {R}Lock", maxLineDistance),
Copy link
Collaborator

@rickystewart rickystewart Aug 31, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error message here (and on line 311) is not great. Note that this is the message that will be displayed to devs when they violate the lint rule. In particular it's unclear what someone is supposed to do to solve the problem. e.g. "Oh, I can't have more than 5 lines between the lock and unlock, so I need to start combining multiple lines of code into one line with semicolons" or something like that.

The error message should succinctly describe what the actual problem is and suggest a fix like "move the Unlock() call to a defer statement immediately after the Lock() call". If you produce something the developer can directly copy-paste (like defer x.y.whatever.Unlock() -- I think you actually already have the pieces you need to make this work), even better.

The error message on line 311 has an additional problem because what is "non-linear control flow"? Why is it a problem if I have non-linear control flow, and how do I fix it?

This commit adds a new pass which checks for
`...Unlock()` expressions without `defer` which
could result in a lock being held indefinitely.

Resolves cockroachdb#105366

Release note: None
for linter to pass

This commit adds a number of files to the nogo config for the linter
to ignore. These files have manual unlocks that are catagorized as
potentially problematic since there are unlocks >10 lines away from
the lock. The commit also applies the nolint rule to any remaining
manual unlock cases that are <=10 lines away from the lock.

Release note: None
This commit udpates the lock linter to work by
keeping track of locks and if we find any
if conditions, loops, or function calls before
finding a matching unlock it will report. It will
also report if the matching unlock is >5 lines away
from the lock. It will ignore cases where a
`nolint:deferunlock` is present.

Release note: None
Copy link
Contributor Author

@Santamaura Santamaura left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ericharmeling, @renatolabs, and @rickystewart)


pkg/testutils/lint/passes/deferunlockcheck/deferunlockcheck.go line 40 at r13 (raw file):

Previously, rickystewart (Ricky Stewart) wrote…

Nit: noLintName and maxLineDistance should be const, not var.

Done


pkg/testutils/lint/passes/deferunlockcheck/deferunlockcheck.go line 286 at r13 (raw file):

Previously, rickystewart (Ricky Stewart) wrote…

The error message here (and on line 311) is not great. Note that this is the message that will be displayed to devs when they violate the lint rule. In particular it's unclear what someone is supposed to do to solve the problem. e.g. "Oh, I can't have more than 5 lines between the lock and unlock, so I need to start combining multiple lines of code into one line with semicolons" or something like that.

The error message should succinctly describe what the actual problem is and suggest a fix like "move the Unlock() call to a defer statement immediately after the Lock() call". If you produce something the developer can directly copy-paste (like defer x.y.whatever.Unlock() -- I think you actually already have the pieces you need to make this work), even better.

The error message on line 311 has an additional problem because what is "non-linear control flow"? Why is it a problem if I have non-linear control flow, and how do I fix it?

Thanks for the feedback, I made the error messages a bit simpler and added what the dev can do to fix it.


pkg/testutils/lint/passes/deferunlockcheck/deferunlockcheck.go line 298 at r13 (raw file):

Previously, rickystewart (Ricky Stewart) wrote…

Seems like you can immediately if hasNoLintComment { return false }, right? We will always return false in this case with the way this function is currently written.

Agree, done.

@Santamaura
Copy link
Contributor Author

TFTRs!

bors r+

@craig
Copy link
Contributor

craig bot commented Sep 12, 2023

Build succeeded:

@craig craig bot merged commit b99bcd9 into cockroachdb:master Sep 12, 2023
@Santamaura Santamaura deleted the deferunlocktest branch September 12, 2023 15:35
andrewbaptist added a commit to andrewbaptist/cockroach that referenced this pull request May 14, 2024
As part of cockroachdb#107577 an exclusion was added to replica_raft.go. After this
PR this exclusion is no longer necessary.

Part of: cockroachdb#105366

Release note: None
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

lint: add check to ensure mutex Unlock() calls are deferred
10 participants