Performance slow with streams compared to old version #264
Looking into this further, I've instrumented my code to hack PDFKit's compression stream to detect when its write state is over the high-water mark, to see if that would speed things up.
As the table above shows, respecting the high-water mark on the zlib compression stream increases speed with compression enabled. Here is what my hack looks like:
Without the hack, it would look like this:
I'll probably try to come up with some PDFKit-specific test cases now that I'm starting to get to the bottom of this. One unfortunate thing is that I feel PDFKit might need a big refactor to enable asynchronous code so these callbacks can happen. I might also look at the old, high-speed implementation to see whether it can expose the readable-stream interface more easily than the current version does.
I've updated my report on Node to include a test suite that exercises just the zlib compression stream. It seems to have the same issue. You can view it here: nodejs/node-v0.x-archive#8048 (comment). If someone could run this test and verify they are seeing the same issues, that would be much appreciated. Thank you.
Here is a proposed workaround for the Node streams bug mentioned above: https://gist.github.com/ashelley/6cbad3f489499996a95e Basically, we have to queue writes to the zlib stream when compression is turned on and defer compressing the chunks until we can do them all at once. Here are some numbers to show the improvement:
Note that this doesn't fix overfilling the piped stream mentioned here: Also note that I think the true fix is that PDFKit should wait for the 'drain' event when needed on the zlib stream, but that patch is non-trivial.
I just ran into a memory-leak situation without the above patch: nodejs/node-v0.x-archive#6623 (comment). Adding the patch lets my program finish running. No minimal test case yet.
I took a look at the machine's performance and noticed that the memory and CPU used don't get released after rendering the PDF.
@devongovett any chance this may be fixed sometime? @ashelley did great work here analyzing the problem.
Hello @floledermann, I did some more recent work on this, discussed in this comment: Also note that the underlying streams issue was recently discussed here, but there is still no fix in stable (I don't think).
Possibly related: bpampuch/pdfmake#280 The above also discusses a workaround by hacking Readable.
Hi. Is this issue fixed in any updated version of PDFKit? Or is a fix coming anytime soon?
Some performance-related work was done in recent releases. Please test again and reopen if this is still a problem.
Okay,
So I've really been battling performance with pdfkit recently. You can read a bit about my problems here:
#258
I decided to download a known-fast version of PDFKit that I've worked with before and knew was high-performance. I picked an arbitrary version from around the time I last used PDFKit on a project. This means versions after this one are probably fast too, but I didn't want to dig deep; I just wanted to document the difference.
https://github.com/devongovett/pdfkit/tree/933aee7d6062c4f5c5a147c15ca3d999f224f0b9
I have a series of pdf documents that I create for my own testing purposes. I am generating these pdfs 3 times:
Note that I had to change my code slightly to work with the old version of PDFKit. Also note that this is the only change I'm making to my program before running the tests.
The test documents produced are visually identical in all scenarios.
Example 1:
Example 2:
Example 3:
Example 4:
Example 5:
Example 6:
The problem seems to compound with large documents, to the point where generation becomes unusable:
Example 7:
Example 8:
One thing to note about the numbers above is that "write ms" actually means the time from calling doc.end() until we get the stream's 'finish' event. We can see from these numbers that when compression is turned on, the document seems to be output in some different way, so the problems caused by enabling compression are probably somewhere in that code.
The main issue I think with pdfkit as it stands right now is possibly this line:
from https://github.com/devongovett/pdfkit/blob/master/lib/document.coffee
I'm not sure if there is any other way to do this in the code base yet, but because every .write operation appears to create a new buffer, you can't really write a big document with PDFKit without turning off compression, and even with compression off it's still a lot slower than it used to be.
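To illustrate why a new buffer per write is costly (this is not PDFKit's actual code, just a model of the pattern): re-concatenating on every write copies everything written so far, which is quadratic overall, while collecting chunks and concatenating once at the end is linear.

```javascript
// Rebuild the buffer on every write: copies the whole accumulated
// document each time, so n writes cost O(n^2) bytes copied.
function concatPerWrite(chunks) {
  let buf = Buffer.alloc(0);
  for (const c of chunks) {
    buf = Buffer.concat([buf, c]); // copies everything written so far
  }
  return buf;
}

// Collect chunks and concatenate once: one allocation, one copy.
function concatOnce(chunks) {
  return Buffer.concat(chunks);
}
```

Both produce identical output; only the amount of copying differs, which is why the slowdown grows so sharply with document size.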
In summary, I think there are two separate issues...
Obviously any insight into this definitely appreciated.