introduce html4 namespace #2278

Merged
merged 6 commits into from
Jun 21, 2021

@flavorjones flavorjones commented Jun 21, 2021

What problem is this PR intended to solve?

As the Nokogumbo merger progresses (see #2204), we now have an HTML5 module and namespace, but the previous libxml2-based (and nekohtml-based) functionality is parked under the ambiguous HTML module and namespace.

I'd like to disambiguate, and also create an opportunity for us to use HTML more generally in the future (e.g., perhaps detecting the HTML document format and choosing the right DOM parser).

This PR moves everything currently under HTML to HTML4, and makes HTML an alias for HTML4. It updates doc strings and class names.

Some changes in behavior that I want to note:

  • objects will report a class of Nokogiri::HTML4::XXX where they previously reported Nokogiri::HTML::XXX
  • some of the exported symbols have been renamed (e.g., mNokogiriHTML is now mNokogiriHTML4) which might impact anyone writing C code and linking against Nokogiri's dylib
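
To illustrate the first point, here is a minimal sketch of how a constant alias behaves in Ruby. The `Demo` namespace and its `Document` class are hypothetical stand-ins, not Nokogiri code; the sketch only assumes the alias is a plain constant assignment, as described above.

```ruby
# Hypothetical stand-in for the real namespace; no gem required.
module Demo
  module HTML4
    class Document; end
  end

  # Constant assignment does not copy the module: both names now
  # refer to the same module object.
  HTML = HTML4
end

doc = Demo::HTML::Document.new

Demo::HTML.equal?(Demo::HTML4)    # same object under two names
doc.is_a?(Demo::HTML4::Document)  # class checks work with either name
doc.class.name                    # reports the HTML4 name, not the alias
```

Code that checks classes with `is_a?` or `===` should keep working under either name, while code that compares class-name strings would see the new `HTML4` spelling.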

Have you included adequate test coverage?

I've left the tests alone (except for the addition of some "HTML/HTML4 equivalence" tests) to demonstrate there's no behavioral breakage.

Does this change affect the behavior of either the C or the Java implementations?

Notably, I've updated the Java files to rename classes and variables, and to use the proper module and class names, so that the Java implementation stays in sync with CRuby despite not having an HTML5 module/namespace.

also make private HTML5::Node#add_child_node_and_reparent_attrs
and make HTML an alias to HTML4.

- renamed files, C and Java variables, and Java class names
- updated gemspec
- updated doc strings
- updated usage in lib/ and ext/ to specify HTML4
- added a test asserting on equality of identity of the modules

Notably, left the tests alone to ensure this isn't a breaking change.
@flavorjones flavorjones merged commit b022660 into main Jun 21, 2021
@flavorjones flavorjones deleted the flavorjones-introduce-html4-namespace branch June 21, 2021 20:13
flavorjones added a commit that referenced this pull request Nov 28, 2023
**What problem is this PR intended to solve?**

Before a minor release, I generally review deprecations and look for
things we can remove.

* Removed `Nokogiri::HTML5.get` which was deprecated in v1.12.0. [#2278]
(@flavorjones)
* Removed the CSS-to-XPath utility modules
`XPathVisitorAlwaysUseBuiltins` and `XPathVisitorOptimallyUseBuiltins`,
which were deprecated in v1.13.0 in favor of `XPathVisitor` constructor
args. [#2403] (@flavorjones)
* Removed `XML::Reader#attribute_nodes` which was deprecated in v1.13.8
in favor of `#attribute_hash`. [#2598, #2599] (@flavorjones)

Also, we're now specifying version numbers in the remaining deprecation
warnings.

**Have you included adequate test coverage?**

Tests have been removed; otherwise, no new coverage is needed.

**Does this change affect the behavior of either the C or the Java
implementations?**

As documented above.