[UNDERTOW-2210] Improved write ASCII path on ServletPrintWriter #1424
Still WIP; I'll run some micro and end-to-end benchmarks soon with the final implementation.
@fl4via @stuartwdouglas
I haven't yet spent time profiling (with perf) to see whether bounds-check elimination has been effective, and I still need to run a full end-to-end test with SpecJ, but code-wise this should be in a decent state for review.
👋 Hi there, @franz1981 :-) Out of curiosity, how do the benchmarks above compare to some variation of
For smaller strings it will very likely perform better than what I have done, but I'm not fully sure; let me try it, I am curious too.
And I haven't tried a VarHandle byte buffer view yet (which would likely save the byte reversal needed because of endianness).
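For readers unfamiliar with the byte buffer view idea mentioned above, here is a minimal sketch of it, not the code from this PR. The class and method names (`LongViewSketch`, `putLongLE`, `putLongReversed`) are hypothetical; only `MethodHandles.byteBufferViewVarHandle`, `VarHandle.set`, `ByteBuffer.putLong` and `Long.reverseBytes` are real JDK APIs. It shows that a little-endian view writes the same bytes as an explicit `Long.reverseBytes` followed by a big-endian write.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper, for illustration only.
final class LongViewSketch {
    // View of a ByteBuffer as little-endian longs, addressed by byte index.
    private static final VarHandle LONG_LE =
            MethodHandles.byteBufferViewVarHandle(long[].class, ByteOrder.LITTLE_ENDIAN);

    // Writes 8 bytes at the given byte index, least significant byte first.
    static void putLongLE(ByteBuffer buffer, int index, long value) {
        LONG_LE.set(buffer, index, value);
    }

    // Same byte layout without the view: swap the bytes, then write big-endian.
    // Note: this sets the buffer's byte order to BIG_ENDIAN as a side effect.
    static void putLongReversed(ByteBuffer buffer, int index, long value) {
        buffer.order(ByteOrder.BIG_ENDIAN).putLong(index, Long.reverseBytes(value));
    }

    public static void main(String[] args) {
        ByteBuffer viaView = ByteBuffer.allocate(8);
        ByteBuffer viaSwap = ByteBuffer.allocate(8);
        putLongLE(viaView, 0, 0x0807060504030201L);
        putLongReversed(viaSwap, 0, 0x0807060504030201L);
        System.out.println(java.util.Arrays.equals(viaView.array(), viaSwap.array()));
    }
}
```

Whether the view is actually faster than the explicit swap depends on the JDK version and on what the JIT does with it, so it needs to be measured rather than assumed.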
Results from SpecJ are very good: as expected, it delivers a ~2X reduction in overhead.
Interestingly, JDK 19 (note: the previous results were from JDK 11) seems to disagree:
and reports the first commit of this PR as the winner by some margin. Still on JDK 19, using a VarHandle to write little-endian long data into the buffer instead makes
For reference,
performing even worse than the
I believe this can be made even faster by reading chars as long via VarHandle/MethodHandle; I will experiment with it next week.
servlet/src/main/java/io/undertow/servlet/spec/ServletPrintWriter.java
if (((batch1 | batch2) & 0xff80_ff80_ff80_ff80L) != 0) {
    return i << 3;
}
final long batch = (batch1 << 8) | batch2;
@theRealAph I've removed
final long maskedBatch1 = (batch1 & 0x007f_007f_007f_007fL) << 8;
final long maskedBatch2 = batch2 & 0x007f_007f_007f_007fL;
final long batch = maskedBatch1 | maskedBatch2;
because the previous check already guarantees that every char is in [0, 127], so the masks are redundant.
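To illustrate why the masks are redundant, here is a small self-contained sketch. It is not the PR's code: the class `AsciiBatchSketch`, the helper `pack4`, and the assumption that each long holds four chars in 16-bit lanes (lowest lane first) are all hypothetical. Once `(batch1 | batch2) & 0xff80_ff80_ff80_ff80L` is zero, every lane has only its low 7 bits set, so masking with `0x007f_007f_007f_007fL` changes nothing.

```java
// Hypothetical helper, for illustration only.
final class AsciiBatchSketch {
    // Bits that must be zero in every 16-bit lane for a char to be ASCII.
    private static final long NON_ASCII_MASK = 0xff80_ff80_ff80_ff80L;
    private static final long ASCII_MASK = 0x007f_007f_007f_007fL;

    // Assumed layout: one char per 16-bit lane, c0 in the lowest bits.
    static long pack4(char c0, char c1, char c2, char c3) {
        return ((long) c3 << 48) | ((long) c2 << 32) | ((long) c1 << 16) | c0;
    }

    // True when all eight chars held in the two batches are in [0, 127].
    static boolean allAscii(long batch1, long batch2) {
        return ((batch1 | batch2) & NON_ASCII_MASK) == 0;
    }

    // Unmasked merge: moves batch1's bytes into the high byte of each lane.
    // Only valid after allAscii(batch1, batch2) returned true.
    static long merge(long batch1, long batch2) {
        return (batch1 << 8) | batch2;
    }

    // The merge with the masks that the comment above says were removed.
    static long mergeMasked(long batch1, long batch2) {
        return ((batch1 & ASCII_MASK) << 8) | (batch2 & ASCII_MASK);
    }

    public static void main(String[] args) {
        long b1 = pack4('a', 'c', 'e', 'g');
        long b2 = pack4('b', 'd', 'f', 'h');
        System.out.println(allAscii(b1, b2));
        System.out.println(Long.toHexString(merge(b1, b2)));
        System.out.println(merge(b1, b2) == mergeMasked(b1, b2));
    }
}
```

For inputs that fail the check the two merges can differ, which is why the early return has to come first.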
@fl4via this is now ready for review; it delivers a great speedup (up to 2X) across different JDK versions (11->19)
benchmarks/src/main/java/io/undertow/benchmarks/AsciiEncodingBenchmark.java
Thanks @franz1981 !
Thanks @ropalka !!!
Placing this comment here as a reminder: https://github.com/AdoptOpenJDK/openjdk-jdk11/blob/master/src/java.base/share/classes/java/lang/StringLatin1.java#L52 shows that any single-byte char is fine, meaning the range 0-255 works too!
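As a sketch of what that observation could mean for a batched check (an assumption on my side, not something this PR states it implemented): a char fits in a single byte when its upper 8 bits are zero, so a four-char batch could be tested with `0xff00_ff00_ff00_ff00L` instead of the ASCII mask. The class `Latin1BatchSketch` and its methods are hypothetical names. Copying bytes 0x80-0xFF one-to-one is only correct when the target encoding is ISO-8859-1; in UTF-8 those chars take two bytes.

```java
// Hypothetical helper, for illustration only.
final class Latin1BatchSketch {
    // Bits that must be zero in every 16-bit lane for a char to fit in one byte.
    private static final long NON_LATIN1_MASK = 0xff00_ff00_ff00_ff00L;

    // Assumed layout: one char per 16-bit lane, c0 in the lowest bits.
    static long pack4(char c0, char c1, char c2, char c3) {
        return ((long) c3 << 48) | ((long) c2 << 32) | ((long) c1 << 16) | c0;
    }

    // Single-char check: the value is in [0, 255].
    static boolean isLatin1(char c) {
        return (c >>> 8) == 0;
    }

    // Batched check: all four chars in the long are in [0, 255].
    static boolean allLatin1(long batch) {
        return (batch & NON_LATIN1_MASK) == 0;
    }

    public static void main(String[] args) {
        System.out.println(allLatin1(pack4('a', '\u00e9', '\u00ff', 'z')));
        System.out.println(allLatin1(pack4('a', '\u20ac', 'b', 'c')));
    }
}
```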