-
-
Notifications
You must be signed in to change notification settings - Fork 352
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: escape sequences in text-blocks are kept #4409
Conversation
- fix assertEquals arguments
- change testTextBlockEscapes() to expect the correct result and mark it as disabled
The failure in the "Extra Checks" step seems to be a technical problem and not related to the changes: |
* @return source code representation of the literal | ||
*/ | ||
public static String getTextBlockToken(CtTextBlock literal) { | ||
String token = "\"\"\"\n" | ||
+ literal.getValue() | ||
+ literal.getValue().replaceAll("\\\\", "\\\\\\\\") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does this work? From reading the code, I get no clue why.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It escapes each single backslash by replacing it with a double backslash. It just looks weird because each backslash has to be escaped twice, once for the string literal and then again in the regular expression.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It escapes each single backslash by replacing it with a double backslash.
Would it not be sufficient to write .replaceAll("\", "\\")
? This is what I can infer from your explanation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@algomaster99 no, that's a compile time error. The backslash inside the quotes has to be escaped. Then you have a regex consisting of only a backslash. That is an error because the backslash has to be escaped in the regex. So you need another backslash escaping that one, which has to be escaped inside the quotes. That's why you need four backslashes to represent a single backslash in the search regex.
The same then applies for the replacement, so this makes it four backslashes in the search string and eight backslashes in the replacement string.
But you have a point in that we don't really need a regex here. So we can use String.replace(CharSequence, CharSequence)
instead (note the different method name), which does not use regular expressions. This will be both faster and require only half the escaping. I have updated the PR.
…aceAll(String, String)
// test text block value | ||
CtTextBlock l1 = (CtTextBlock) stmt5.getValueByRole(CtRole.ASSIGNMENT); | ||
assertEquals("no-break space: \\00a0\n" + | ||
"newline: \\n\n" + |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using replace("\\", "\\\\")
would only escape the backslashes and not \n
because the latter is treated as a single character and not a combination of \
+ n
. For example, if you run replace("\\", "\\\\")
on \\\n
, it would return \\\\\n
. Notice that \n
is not escaped. I think it would be better if we write a generic function to escape all characters or use one which already exists. I have something like this in mind:
public static String escape(String s){
return s.replace("\\", "\\\\")
.replace("\t", "\\t")
.replace("\b", "\\b")
.replace("\n", "\\n")
.replace("\r", "\\r")
.replace("\f", "\\f")
.replace("\'", "\\'")
.replace("\"", "\\\"");
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As long as in the source code it's written as "\n" it works because at that time it's still two characters. It is not supposed to escape the newline at the end of a line in a textblock. The unit test for this passes and the original problem I had with spoon is also solved by this patch, so I am quote sure it works.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As long as in the source code it's written as "\n" it works because at that time it's still two characters.
Okay. So do we just want to escape backslashes when we put the literal inside a text block?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, "normal" String literals are handled in LiteralHelper.getLiteralToken(), which in turn calls LiteralHelper.getStringLiteral(). But if we use that one, all newlines, tabs etc. are replaced by their respective escape sequences and all textblocks would be printed on a single line.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay. Your patch looks good to me in that case.
Thanks a lot @xzel23 for the patch and @algomaster99 for the review. |
Fixes #4408
Besides adding the (currently disabled) test case, I changed the Junit imports and fixed the parameter order of the assertEquals calls.