-
-
Notifications
You must be signed in to change notification settings - Fork 905
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reduce allocations from hack in document_fragment #2087
Reduce allocations from hack in document_fragment #2087
Conversation
Code Climate has analyzed commit 6f81566 and detected 0 issues on this pull request. The test coverage on the diff in this pull request is 100.0% (80% is the threshold). This pull request will bring the total coverage in the repository to 94.3% (0.0% change). View more on Code Climate. |
✅ Build nokogiri 1.0.684 completed (commit d8675fbf93 by @ashmaroli) |
@ashmaroli Hi! Thanks for submitting this. Do you have any metrics around the reduced number of allocations? I'm curious how you performed your analysis. |
I'll need to look into why the jruby gem tests failed -- I don't think it's related to this PR. |
@@ -33,7 +33,7 @@ def initialize document, tags = nil, ctx = nil | |||
self.errors = document.errors - preexisting_errors | |||
else | |||
# This is a horrible hack, but I don't care | |||
if tags.strip =~ /^<body/i | |||
if /^\s*?<body/i.match?(tags) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you tell me why you've inserted \s*?
at the front of this? That's a functional change that I wouldn't expect based on your description; and there aren't any tests provided demonstrating why this is introduced.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The original code tags.strip =~ /^<body/i
removes whitespace around string tags
and tests if tags
starts with <body
.
So I thought instead of allocating a sanitized version of tags
and testing, it'd be better if test was to check if <body
substring exists preceded by an optional amount of whitespace from the start of string.
In other words, to do away with tags.strip
...
I used |
The reason behind this is |
6f81566
to
be2be40
Compare
What problem is this PR intended to solve?
Reduce object allocations from a hack in the constructor of
Nokogiri::HTML::DocumentFragment
.Nokogiri 1.11 requires at least Ruby 2.4.0. Therefore, we can safely use
Regexp#match?
instead ofString#=~
.The advantages are:
Regexp#match?
doesn't allocateMatchData
objects.String#strip
which duplicates the given string.Have you included adequate test coverage?
This change should be covered by existing tests.
Does this change affect the C or the Java implementations?
No, it doesn't.