refactor: Use the parser's DOM without modifications #1559

Merged on Dec 14, 2020 (1 commit)

fb55 (Member) commented Dec 11, 2020

BREAKING: This removes the `root` property from top-level nodes. Instead, the root reference is kept in the `parent` property, which was previously `null`. Cheerio now uses the root object generated by either of the parsers.

Also bumps `css-select`, as the new version includes a fix for `parent` references that aren't elements.
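
To make the breaking change concrete, here is a minimal sketch built from plain objects rather than real parser output. The node shape (`type`, `name`, `children`, `parent`) only loosely mirrors an htmlparser2-style DOM, and `makeNode` and `getRoot` are hypothetical helpers for illustration, not cheerio APIs. It shows a top-level node that carries no `root` property and reaches the root through `parent` instead.

```js
// Hypothetical stand-in for the DOM a parser produces; not real cheerio output.
function makeNode(type, name, children = []) {
  const node = { type, name, children, parent: null };
  // Wire each child back to its parent, as a parser would.
  for (const child of children) child.parent = node;
  return node;
}

const div = makeNode('tag', 'div');
const span = makeNode('tag', 'span');
// The parser-generated root: top-level nodes point at it via `parent`.
const root = makeNode('root', null, [div, span]);

// Hypothetical helper: follow `parent` links until a node has none.
function getRoot(node) {
  let current = node;
  while (current.parent) current = current.parent;
  return current;
}
```

Code that previously read a `root` property from a top-level node would, under this assumption, follow `parent` instead, for example `getRoot(div)`.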
@@ -182,7 +182,6 @@ The options in the `xml` object are taken directly from [htmlparser2](https://gi

```js
{
  withDomLvl1: true,
```

fb55 (Member, Author) commented:

This option no longer exists in htmlparser2.

@@ -33,7 +33,7 @@ describe('cheerio', function () {
var $ = cheerio.load('<div>a div</div><span>a span</span>');
var $collection = $('div').add($.root()).add('span');
var expected =
'<span>a span</span><html><head></head><body><div>a div</div><span>a span</span></body></html><div>a div</div>';
fb55 (Member, Author) commented:

These elements used to be disjoint, because `uniqueSort` did not resolve the root reference. Now, `uniqueSort` puts the document first and the span last (as it should).
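
The ordering described in this comment can be illustrated with a small sketch. It is not the library's actual `uniqueSort` implementation; `makeNode`, `pathFromRoot`, `compareDocumentOrder`, and `uniqueSortSketch` are hypothetical names, and the tree is built from plain objects. The idea is that once every top-level node reaches the document through `parent`, all nodes share one tree, so they can be compared by their path from the root.

```js
// Hypothetical plain-object tree; not real parser output.
function makeNode(name, children = []) {
  const node = { name, children, parent: null };
  for (const child of children) child.parent = node;
  return node;
}

// Child indices leading from the root down to `node` (empty for the root).
function pathFromRoot(node) {
  const path = [];
  for (let cur = node; cur.parent; cur = cur.parent) {
    path.unshift(cur.parent.children.indexOf(cur));
  }
  return path;
}

// Document order: earlier siblings first, ancestors before descendants.
function compareDocumentOrder(a, b) {
  const pa = pathFromRoot(a);
  const pb = pathFromRoot(b);
  const shared = Math.min(pa.length, pb.length);
  for (let i = 0; i < shared; i++) {
    if (pa[i] !== pb[i]) return pa[i] - pb[i];
  }
  return pa.length - pb.length;
}

// Drop duplicate nodes, then sort the rest into document order.
function uniqueSortSketch(nodes) {
  return [...new Set(nodes)].sort(compareDocumentOrder);
}

const div = makeNode('div');
const span = makeNode('span');
const body = makeNode('body', [div, span]);
const html = makeNode('html', [makeNode('head'), body]);
const doc = makeNode('document', [html]);

// Same shape as the test: div, then the root, then span, in arbitrary order.
const sortedNames = uniqueSortSketch([span, div, doc, div]).map((n) => n.name);
console.log(sortedNames); // [ 'document', 'div', 'span' ]
```

If the `parent` of `html` were `null`, as it was before this change, the document would have no path relationship to `div` or `span`, which is one way to picture why the elements were previously treated as disjoint.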

5saviahv (Contributor) commented:

If the root element is removed, does the parse5 htmlparser2 tree adapter also need to change, so that it won't generate a root node?

fb55 (Member, Author) commented Dec 11, 2020

Both parsers produce root nodes. This change prevents cheerio from taking away the existing root nodes and replacing them with a custom one.

Updating the description above as I can see where the confusion comes from.

fb55 linked an issue on Dec 12, 2020 that may be closed by this pull request
fb55 merged commit 2e75c2a into v1.0.0 on Dec 14, 2020
fb55 deleted the feat/parser-dom branch on December 14, 2020 18:55
Development

Successfully merging this pull request may close these issues.

html isn't the root of the document