Fix regression in do_write(s) causing significant performance issues when using large (>10meg) writes #706
Conversation
Force-pushed from 2dee91d to d114c5e (compare)
Could you share the benchmark? I'm not sure how this change can be a massive performance improvement, since this is not supposed to happen.
Force-pushed from d114c5e to 0140ba8 (compare)
huh, what's wild is that it was right in my test harness, but I managed to lose it somewhere in committing - I lost it while addressing the exception issue with a bad copy/paste, and then it didn't hit when I was testing locally... clearly, I need to be more careful with tests! (I've fixed this in the PR now)
…On Sat, Jan 13, 2024 at 2:25 AM Kazuki Yamaguchi wrote:

> In lib/openssl/buffering.rb (#706 (comment)):
>
> ```diff
> @@ -345,13 +345,18 @@ def do_write(s)
>        @wbuffer << s
>        @wbuffer.force_encoding(Encoding::BINARY)
>        @sync ||= false
> -      if @sync or @wbuffer.size > BLOCK_SIZE
> -        until @wbuffer.empty?
> -          begin
> -            nwrote = syswrite(@wbuffer)
> -          rescue Errno::EAGAIN
> -            retry
> +      buffer_size = @wbuffer.size
> +      if @sync or buffer_size > BLOCK_SIZE
> +        nwrote = 0
> +        begin
> +          while nwrote < buffer_size do
> +            begin
> +              nwrote += syswrite(@wbuffer)
> ```
>
> ⬇️ Suggested change:
> ```diff
> -              nwrote += syswrite(@wbuffer)
> +              nwrote += syswrite(@wbuffer[nwrote, buffer_size - nwrote])
> ```
>
> @wbuffer is no longer updated after each syswrite so this change is required.
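To make the reviewer's point concrete, here is a standalone sketch (not the library code itself; `ChunkyIO` and `drain` are made-up stand-ins for illustration) of why the slice is required once `@wbuffer` is no longer truncated after each `syswrite`: a write may accept fewer bytes than offered, so the loop must resend only the unwritten tail.

```ruby
# Hypothetical IO that, like a real socket, accepts only part of each
# write (here, at most 4 bytes per call).
class ChunkyIO
  attr_reader :written

  def initialize
    @written = +""
  end

  def syswrite(s)
    chunk = s.byteslice(0, 4)  # pretend the kernel buffer fits only 4 bytes
    @written << chunk
    chunk.bytesize             # syswrite returns the number of bytes written
  end
end

# Drain `buffer` into `io`, advancing an offset instead of mutating the buffer.
def drain(buffer, io)
  buffer_size = buffer.bytesize
  nwrote = 0
  while nwrote < buffer_size
    begin
      # Without the slice, every call would resend the whole buffer from byte 0.
      nwrote += io.syswrite(buffer[nwrote, buffer_size - nwrote])
    rescue Errno::EAGAIN
      retry
    end
  end
  nwrote
end

io = ChunkyIO.new
drain("hello world!", io)  # => 12
io.written                 # => "hello world!"
```

Without the slice, `ChunkyIO` would keep re-receiving the first 4 bytes and `drain` would report 12 bytes written while the data on the wire was garbage.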
So my original test was serving a large file via chef-zero, but I managed to make a small reproduction case, since it makes sense that someone other than us needs to be able to reproduce it. This matches the behaviour of that exactly, and is fast in ruby 2.7, but very very slow in ruby 3.0:
```
require 'webrick'
require 'webrick/https'

class FileReader < WEBrick::HTTPServlet::AbstractServlet
  def do_GET(request, response)
    response.content_type = "binary/octet-stream"
    response.body = IO.read(File.expand_path("~/webroot/test.tgz"))
  end
end

root = File.expand_path '~/webroot'
# WEBrick generates a self-signed certificate from this distinguished name;
# it expects an array of [field, value] pairs.
cert_name = [%w[CN localhost]]
server = WEBrick::HTTPServer.new :Port => 8443, :DocumentRoot => root,
                                 :SSLEnable => true, :SSLCertName => cert_name
server.mount "/", FileReader
trap 'INT' do server.shutdown end
server.start
```
With this patch, it's as fast in ruby 3.0 as well. While I realise that using IO.read is not the most efficient way to do things in ruby, it still shouldn't regress to 1% of its former performance between releases.
EDIT: the file I'm using for this is an 80 MB tar.gz file - I test by ensuring I can curl it and pipe that into tar, so that I know it's returning valid data.
…On Sat, Jan 13, 2024 at 2:32 AM Kazuki Yamaguchi wrote:

> Could you share the benchmark? I'm not sure how this change can be a massive performance improvement, since this is not supposed to happen.

(The reason was that removing the head of the buffer did a full copy on every single write block, which turned out to be 16k for us, so every 16k it copied the ~77 MB of data that sat in front of it.)
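The quadratic behaviour described here is easy to demonstrate outside of OpenSSL entirely. A minimal sketch (sizes scaled down from the 77 MB report so it runs quickly) comparing head removal against offset tracking:

```ruby
require 'benchmark'

data  = "x" * (8 * 1024 * 1024)  # 8 MB buffer (the report used ~77 MB)
block = 16 * 1024                # 16 KB write blocks, as in the report

# Old approach: delete the head of the buffer after every block.
# Each deletion copies the entire remaining tail, so the total work
# is quadratic in the buffer size.
head_removal = Benchmark.realtime do
  buf = data.dup
  buf.slice!(0, block) until buf.empty?
end

# New approach: leave the buffer intact and track an offset, slicing
# out only the next block to send. Total work is linear.
offset_based = Benchmark.realtime do
  buf = data.dup
  nwrote = 0
  nwrote += buf[nwrote, block].bytesize while nwrote < buf.bytesize
end

puts format("head removal: %.3fs  offset tracking: %.4fs",
            head_removal, offset_based)
```

On CRuby the head-removal loop is dramatically slower, and the gap widens quadratically with the buffer size, which matches the 100x regression reported above.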
Force-pushed from d3b3d14 to 5b008f8 (compare)
Apart from the main topic, the code block in the comment above is not displayed properly on the page. The code block syntax may not work when replying via email.
Apologies, it does not. The reproduction script above is what I was going for with the test case code.
Similar issue: https://blog.mattstuchlik.com/2024/01/31/sneaky-one-liner.html cc @s7nfo
Very cool, this looks exactly like what I ran into. I suggest running `ltrace -e memcpy` against a reproducer like I did in the blog post; the issue becomes very apparent.
Ruby Strings for IO are surprisingly hard to get correct / "efficient" and there are a bunch of land mines we need to avoid... |
I reproduced it with the webrick example using an older Ruby version. This is a regression introduced in commit acc8079, which went into ruby/openssl v2.1.3 and Ruby 2.6. #484 also reported this, but I didn't realize it was this bad. It occurs in Ruby <= 3.1, too. I didn't merge this right away because it costs one extra string object allocation per write, but we should fix it since it affects Ruby 3.1, which has more than a year until EOL.
Agreed...
I think this should be applied to the stable branches too. @jaymzjulian could you rebase this on top of the maint-3.0 branch (the oldest branch where we fix non-security bugs) and fix the commit message formatting?
lib/openssl/buffering.rb (outdated)

```
begin
  while nwrote < buffer_size do
    begin
      nwrote += syswrite(@wbuffer[nwrote..buffer_size])
```
⬇️ Suggested change:

```diff
-      nwrote += syswrite(@wbuffer[nwrote..buffer_size])
+      nwrote += syswrite(@wbuffer[nwrote, buffer_size - nwrote])
```

This can save one Range object allocation in the loop.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh hey, this is totally better - updated!
Force-pushed from 5b008f8 to 0768730 (compare)
This causes significant performance issues when using large (>10 MB) writes. Fix by adjusting the buffer write function to clear the buffer once, rather than piece by piece, avoiding a case where a large write (in our case, around 70 MB) consumes 100% of CPU. This takes a webrick GET request via SSL from around 200 KB/sec at 100% of a core to line speed on gigabit ethernet at 6% CPU utilization.
Force-pushed from 0768730 to d4389b4 (compare)
This is now rebased to target maint-3.0, with better formatting in the commit message. Thanks!
@rhenium are you good to merge this?
(Sorry for the long delay) Yes, it looks good to me. Thank you!
Adjust the buffer write function to clear the buffer once, rather than piece by piece. This avoids a case where a large write (in our case, around 70 MB) will consume 100% of CPU. This takes a webrick GET request via SSL from around 200 KB/sec at 100% of a core to line speed on gigabit ethernet at 6% CPU utilization.
(The reason was that removing the head of the buffer did a full copy on every single write block, which turned out to be 16k for us, so every 16k it copied the 77 MB of data in front of it.)
Passes tests, and has been tested on our systems with chef-zero.