-
-
Notifications
You must be signed in to change notification settings - Fork 905
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
perf: use default memory management within libxml2 #2734
Conversation
instead of ruby's memory management functions, which has been the case since 2009. HTML4 parsing is 14-19% faster: alloc html4 parse 656: 22371.3 i/s BASE_ html4 parse 656: 19589.4 i/s - 1.14x (± 0.03) slower alloc html4 parse 70095: 461.4 i/s BASE_ html4 parse 70095: 385.5 i/s - 1.19x (± 0.11) slower alloc html4 parse 1929522: 43.7 i/s BASE_ html4 parse 1929522: 37.2 i/s - 1.18x (± 0.05) slower HTML5 parsing is 5-8% faster: alloc html5 parse 656: 17909.1 i/s BASE_ html5 parse 656: 16597.7 i/s - 1.08x (± 0.04) slower alloc html5 parse 70095: 277.3 i/s BASE_ html5 parse 70095: 263.4 i/s - same-ish: difference falls within error alloc html5 parse 1929522: 16.6 i/s BASE_ html5 parse 1929522: 15.8 i/s - 1.05x (± 0.03) slower HTML4 serialization is 10-14% faster: alloc html4 srlze 656: 69707.3 i/s BASE_ html4 srlze 656: 63537.4 i/s - 1.10x (± 0.02) slower alloc html4 srlze 70095: 1504.5 i/s BASE_ html4 srlze 70095: 1331.9 i/s - 1.13x (± 0.02) slower alloc html4 srlze 1929522: 145.5 i/s BASE_ html4 srlze 1929522: 128.0 i/s - 1.14x (± 0.02) slower HTML5 serialization is not very impacted, as expected. alloc html5 srlze 656: 42255.7 i/s BASE_ html5 srlze 656: 41118.8 i/s - 1.03x (± 0.02) slower alloc html5 srlze 70095: 1050.6 i/s BASE_ html5 srlze 70095: 1033.8 i/s - same-ish: difference falls within error alloc html5 srlze 1929522: 121.5 i/s BASE_ html5 srlze 1929522: 118.3 i/s - 1.03x (± 0.02) slower
The performance wins seem nice. IIRC the reason we used custom allocators was so that the GC had more accurate book keeping WRT to actual memory usage (as well as an opportunity to actually run the GC). I'd expect this change to cause average memory usage of any particular app to increase since we won't GC as frequently. That said, it's probably fine to do this. I just wanted to add some context! 😃 |
@byroot had also asked to re-run the benchmarks with GC disabled, I'll get around to that. |
With GC disabled ... HTML4 parsing is 8-25% faster:
HTML5 parsing is 0-6% faster:
HTML4 serialization is 10-18% faster:
|
Interesting. I would have expected most of the perf difference to be caused by 👍 |
Not triggering the GC can have a quite impressive memory bloat in some situations. In pg.gem we had this issue for years, so that Sam Saffron wrote a blog post about it. To that time I solved this by adding an estimator for the amount of memory that is used behind the scenes and this value is sent to the ruby GC per Maybe it's possible to use the native allocation functions in libxml2 and instruct the ruby GC about it per |
OK, I'm going to hold off on trying to land this in the upcoming 1.14.0 release, so we can consider a few other solutions. |
Based on the discovery in #2807, I'm now in support of this change. The reasoning is:
|
@casperisfine For whatever it's worth, I think there are ways around the issue we discovered in #2807, I've created #2822 so I don't forget to circle back. That doesn't need to be the forcing function here (though it may still be a good idea anyway). |
Closing in favor of #2843 which keeps the current default but allows opting into using system |
What problem is this PR intended to solve?
Related to #2722, performance.
use default memory management within libxml2 instead of ruby's memory management functions (which has been used since 2009).
HTML4 parsing is 14-19% faster:
HTML5 parsing is 5-8% faster:
HTML4 serialization is 10-14% faster:
HTML5 serialization is not very impacted, as expected.
Have you included adequate test coverage?
N/A
Does this change affect the behavior of either the C or the Java implementations?
N/A