
upload file - slow performance #382

Closed
SofianJabal opened this issue Apr 2, 2020 · 5 comments
Comments

@SofianJabal

Hi, I found that copying a ~200 MB file with the rpyc module started taking a lot longer when I moved from 4.1.2 to 4.1.4:
it used to take less than 2 minutes, but now it takes ~20 minutes.
I'm using the upload_file function.
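
For reference, the call in question looks roughly like this (a sketch only: the host name and paths are placeholders, and it assumes the classic-mode upload_file helper from rpyc.utils.classic):

import rpyc
from rpyc.utils.classic import upload_file

# classic-mode connection to an rpyc_classic server on the remote host
conn = rpyc.classic.connect("remote-host")

# copy a local ~200 MB file to the remote machine over the connection
upload_file(conn, "/tmp/big_file.bin", "/tmp/big_file.bin")

conn.close()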

After debugging the issue, I found the problematic commit
It is 020f9fd ...

I suggest reverting it and returning the old code, then thinking about a better way to handle it.

Thanks,
Sofian.

@comrumino self-assigned this Apr 3, 2020
@comrumino added the "To Start" label (Description reviewed and a maintainer needs "to start" triage) Apr 3, 2020
@comrumino
Collaborator

comrumino commented Apr 3, 2020

@SofianJabal I think I may have already fixed this in 7a216864789df6f3d9de61223c52a30c947554a1, which I mentioned in a comment on an IPython issue; that commit isn't included in 4.1.4. If you have the time, could you confirm that the issue is resolved using master? If it is resolved on master, I can distribute the fix for you.

Thanks :)

@comrumino added the "Triage" label (Investigation by a maintainer has started) and removed the "To Start" label Apr 3, 2020
@SofianJabal
Author

I used your code from 7a21686 and the issue still reproduces. You should change self.stream.write(...) to write once rather than three times as you implemented; see channel.py lines 82, 83 and 84.

Or you can restore the original code of the send function in channel.py so it looks like this (this is what I did on my side :) ):

def send(self, data):
    """Sends the given string of data as a packet over the underlying
    stream. Blocks until the packet has been sent.

    :param data: the byte string to send as a packet
    """
    if self.compress and len(data) > self.COMPRESSION_THRESHOLD:
        compressed = 1
        data = zlib.compress(data, self.COMPRESSION_LEVEL)
    else:
        compressed = 0
    data_size = len(data)
    header = self.FRAME_HEADER.pack(data_size, compressed)
    buf = header + data + self.FLUSHER
    self.stream.write(buf)

@llchan

llchan commented Apr 22, 2020

+1. The fix in master helps a lot, but it's still degraded relative to 4.1.2. In our application it was 1.53x slower (vs 40x in 4.1.4), but that includes a lot of other non-RPyC stuff so the RPyC-specific degradation is likely higher.
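
The RPyC-specific cost can be isolated by timing just the upload call under each installed version, roughly like this (a sketch; host and path are placeholders):

import time

import rpyc
from rpyc.utils.classic import upload_file

conn = rpyc.classic.connect("remote-host")

start = time.perf_counter()
upload_file(conn, "/tmp/big_file.bin", "/tmp/big_file.bin")
elapsed = time.perf_counter() - start

# run once per rpyc version (e.g. 4.1.2, 4.1.4, master) and compare
print("rpyc %s: upload took %.1f s" % (rpyc.__version__, elapsed))

conn.close()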

@comrumino
Collaborator

comrumino commented Apr 25, 2020

Thanks for the response!

I released 4.1.5, which should perform about 1.5x slower. Of course, slower performance for your use case isn't ideal. At the same time, there is an issue with the copy operation when the object being transmitted is, say, 2 GB: the overhead of that copy is detrimental to some users. This makes the changes between 4.1.2 and 4.1.5 a bit tricky to perfect.

The current implementation considers the small and the very large. To optimize for objects in between, you could increase MAX_IO_CHUNK to get better throughput at the cost of responsiveness (see the sketch after the diff below). The open question is how to give users a self-documenting and usable way to tune the core channel's performance. Here is the diff for reference.

diff --git a/rpyc/core/channel.py b/rpyc/core/channel.py
index 446e89d..7eaa7fd 100644
--- a/rpyc/core/channel.py
+++ b/rpyc/core/channel.py
@@ -70,6 +70,15 @@ class Channel(object):
             data = zlib.compress(data, self.COMPRESSION_LEVEL)
         else:
             compressed = 0
-        header = self.FRAME_HEADER.pack(len(data), compressed)
-        buf = header + data + self.FLUSHER
-        self.stream.write(buf)
+        data_size = len(data)
+        header = self.FRAME_HEADER.pack(data_size, compressed)
+        flush_size = len(self.FLUSHER)
+        if self.FRAME_HEADER.size + data_size + flush_size <= self.stream.MAX_IO_CHUNK:
+            # avoid overhead from socket writes requiring GIL to be held
+            self.stream.write(header + data + self.FLUSHER)
+        else:
+            # Data larger than 64KB, the extra writes are negligible
+            part1 = self.stream.MAX_IO_CHUNK - self.FRAME_HEADER.size
+            self.stream.write(header + data[:part1])
+            self.stream.write(data[part1:])
+            self.stream.write(self.FLUSHER)
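
As mentioned above, raising MAX_IO_CHUNK trades responsiveness for throughput. A rough sketch of that tuning (the value is only a placeholder, and the attribute's default and exact location may differ between rpyc versions):

import rpyc
from rpyc.core.stream import SocketStream

# MAX_IO_CHUNK is a class attribute, so raising it here affects socket-based
# streams that don't override it: fewer writes per large frame (faster bulk
# transfers), but the channel is less responsive while a frame is in flight.
SocketStream.MAX_IO_CHUNK = 10 * 64 * 1024  # placeholder value

conn = rpyc.classic.connect("remote-host")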

@comrumino
Collaborator

Since the fix helps significantly, and other code changes could also have affected performance, I referenced this issue in #326; further improvement is out of this issue's scope.

Community involvement is always welcomed 💯
