Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Multiline detection in LineCheckBase is buggy #278

Closed
francoisvn opened this issue Aug 19, 2016 · 4 comments · Fixed by #309
Closed

Multiline detection in LineCheckBase is buggy #278

francoisvn opened this issue Aug 19, 2016 · 4 comments · Fixed by #309
Labels

Comments

@francoisvn
Copy link
Contributor

Multiline detection in LineCheckBase is prone to bugs. Including a quote in a comment will result in the rest of the file being ignored, which is clearly not correct.

Test that should pass, but fails: 543846f

Using regular expressions to detect and ignore some checks inside strings seems like a challenge not worth trying to fight.

@Gama11
Copy link
Member

Gama11 commented Nov 6, 2016

I just ran into this... ' is extremely common in Flixel's doc comments, so the indentation character check is broken basically everywhere.

Using regular expressions to detect and ignore some checks inside strings seems like a challenge not worth trying to fight.

I don't know, that would make using multiline strings together with checkstyle a huge pain... And checkstyle uses them all over the place in its tests.

@AlexHaxe
Copy link
Member

AlexHaxe commented Nov 6, 2016

I think there are two ways to fix the problem:

  1. pattern matching: add detection for comments and escaped quotation marks
  2. go hybrid: use token tree to locate comments and strings

@seraku24
Copy link
Contributor

seraku24 commented Nov 7, 2016

Doing a quick code review, the logic within the (curiously misspelled) function isMultineString() appears to me to simply be incorrect on detecting the start and end of strings. For instance, the presence of any escaped quotes in a line would prevent the proper closing of a multi-line string, which doesn't make sense. In fact, the regex for finding a quote should include the escaping logic within it: (?<!\\)['"], for example. This would match either a single or double quote that is not preceded by a backslash. Of course, there still needs to be added logic to ensure this quoted string is not within the scope of a line or block comment.

Additionally, IndentationCharacterCheck is not even using the function properly. In order for the base class to maintain its tracking of multi-line strings, it needs to be invoked for every line regardless of what the derived check does with the lines. As it stands, a line suppression would result in the state becoming invalid due to skipping over lines that might contain the start or end of a string. I have only done a spot check on a few other derived classes, and most seem to be calling the base function in the correct location, but it would be worth reviewing the other checks to ensure proper usage of the function.

If I get some time between projects, I will fork this project and try my hand at updating the multi-line string logic. Hopefully, this project is easy to build and test. (:

@adireddy
Copy link
Member

adireddy commented Nov 7, 2016

Like @AlexHaxe said, I think using token tree will be a better & reliable solution.

@seraku24 if you can spare time, please go ahead and review/modify the current logic.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants