-
Notifications
You must be signed in to change notification settings - Fork 3.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
More string format removals #15316
More string format removals #15316
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
core/trino-parser/src/main/java/io/trino/sql/ExpressionFormatter.java
Outdated
Show resolved
Hide resolved
if (root == null) { | ||
throw new NullPointerException(optimizer.getClass().getName() + " returned a null plan"); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I doubt this is significant enough to warrant making the code more verbose and harder to read.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While I'm sure that the performance cost isn't massive, this is called inside of the loop for each optimizer. On option would be to switch from requireNonNull(root, format("%s ..."))
to requireNonNull(root, () -> format("%s ..."))
but that would still be a capturing lambda creation per loop iteration even if it avoids the String.format
overhead, which is why I changed it over to an explicit null check.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True, but there are only a handful of optimizers. Also, they do way more complicated things, so this is going to be a drop in the bucket.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does lambda capture matter here? I suspect this would be optimized away by inlining.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It won't matter as long as the JIT does actually inline the requireNonNull
and the lambda, but if either one of those doesn't happen then you'll get an extra allocation inside of the loop (which would still be better than calling String.format
inside of each loop iteration, but is avoidable by using an explicit null check).
"@%s: %s", | ||
Integer.toHexString(identityHashCode(node)), | ||
node); | ||
return "@" + Integer.toHexString(identityHashCode(node)) + ": " + node; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar comment here. This shouldn't matter from a performance perspective.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this toString()
implementation seems to only be present for more helpful debugging, I'm fine with changing it back here and will rebase without this change.
However, at a high level- I think the general philosophy should be to avoid using String.format
in normal (ie: non-exceptional, not just for debugging) code paths and to prefer string concatenation instead- especially for simple things like format("%s=%s", ...)
. JDK 9+ javac
implementations generate more efficient, specialized to the call-site concatenations even compared to StringBuilder
, and even small inefficiencies can add up to a non-trivial overall performance overhead.
99a850b
to
444d601
Compare
@@ -58,7 +57,7 @@ public BytecodeNode generateExpression(BytecodeGeneratorContext generator) | |||
RowExpression term = terms.get(i); | |||
block.append(generator.generate(term)); | |||
|
|||
IfStatement ifWasNull = new IfStatement(format("if term %s wasNull...", i)) | |||
IfStatement ifWasNull = new IfStatement("if term " + i + " wasNull...") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These make the code less readable, IMO. Also, the code generation is not such a frequent operation that performance of concatenation vs String.format is a concern.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I agree about concatenation being less readable, but that's a matter of taste so for the sake of argument, let's assume that format is slightly more readable. Even so, I don't agree with the idea that we should prefer the less efficient implementation (in non-exceptional, non-debugging code paths) by default. I'm arguing the opposite: even when the codepath is not especially "hot", we should prefer the more efficient of the two approaches when the readability difference is small to try to avoid accumulating thousands of small hits to performance overall.
If you still disagree, then I'm happy to remove either/both of the two places you left PR feedback comments about (and any others you want to flag)- I just wanted to make the case that we should reconsider the practice of adding these avoidable performance costs and prefer concatenation over formatting in the general.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm sympathetic to the "accumulating thousands of small hits to performance" argument, but the question is if this has any measurable impact. Our general philosophy is to prefer readability, unless the performance impact matters. This is a slippery slope.
In this specific case, I'm not sure it's less readable. But this one seems worse:
- .setDescription(format("cursor.get%s(%d)", type, field))
+ .setDescription("cursor.get" + type + "(" + field + ")")
Hopefully String Templates will allow both readability and performance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@electrum - Fair point on the other change, I've pushed a new revision that removes that particular change in CursorProcessorCompiler
.
if (root == null) { | ||
throw new NullPointerException(optimizer.getClass().getName() + " returned a null plan"); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True, but there are only a handful of optimizers. Also, they do way more complicated things, so this is going to be a drop in the bucket.
444d601
to
e9fbf00
Compare
e9fbf00
to
126213f
Compare
4656dec
to
5f3af9c
Compare
@martint, I've rebased the revisions and the tests are still green. I think there's still the open comment item from your earlier review- can you take a look again to close this PR out? |
5f3af9c
to
b1bc63d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm fine with this, but will let @martint take another look
if (root == null) { | ||
throw new NullPointerException(optimizer.getClass().getName() + " returned a null plan"); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does lambda capture matter here? I suspect this would be optimized away by inlining.
@@ -58,7 +57,7 @@ public BytecodeNode generateExpression(BytecodeGeneratorContext generator) | |||
RowExpression term = terms.get(i); | |||
block.append(generator.generate(term)); | |||
|
|||
IfStatement ifWasNull = new IfStatement(format("if term %s wasNull...", i)) | |||
IfStatement ifWasNull = new IfStatement("if term " + i + " wasNull...") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm sympathetic to the "accumulating thousands of small hits to performance" argument, but the question is if this has any measurable impact. Our general philosophy is to prefer readability, unless the performance impact matters. This is a slippery slope.
In this specific case, I'm not sure it's less readable. But this one seems worse:
- .setDescription(format("cursor.get%s(%d)", type, field))
+ .setDescription("cursor.get" + type + "(" + field + ")")
Hopefully String Templates will allow both readability and performance.
b1bc63d
to
e171697
Compare
Test failure is unrelated and caused by #16406 |
Avoids an eager and unnecessary String.format call by letting checkArgument perform the required formatting only when the check fails.
Also replaces unnecessary usages of Guava's Joiner in favor of Collectors.joining where appropriate.
Replaces simple String.format usages in non-exceptional code paths with simple string concatenations where applicable.
e171697
to
6f269f4
Compare
Description
Follow up from #15258 to replace other usages of
String.format
in non-exceptional code paths where simple string concatenations are equally clear (but more performant). These changes are not expected to be nearly as significant in terms of performance as the linked previous PR.Additional context and related issues
No additional context should be necessary.
Release notes
(x) This is not user-visible or docs only and no release notes are required.
( ) Release notes are required, please propose a release note for me.
( ) Release notes are required, with the following suggested text: