Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replaces the old xml parser with a new, faster one #861

Merged
merged 3 commits into from
Nov 5, 2020

Conversation

ebozduman
Copy link
Contributor

@ebozduman ebozduman commented Aug 6, 2020

Fixes #859

This PR replaces the old xml parser package, xml2js, with the much faster, fast-xml-parser. package.

@ebozduman ebozduman force-pushed the update-xml-parser branch 2 times, most recently from f468bc5 to 907bba8 Compare August 31, 2020 11:46
@ebozduman ebozduman changed the title WIP: Replaces the old xml parser with a new, faster one Replaces the old xml parser with a new, faster one Aug 31, 2020
@ebozduman
Copy link
Contributor Author

@kanagarajkm, @kaankabalak,

Reminder: PTAL and review this PR
Thanks

@ebozduman ebozduman force-pushed the update-xml-parser branch 6 times, most recently from 0c6b4bf to fa82a87 Compare September 29, 2020 08:49
@ebozduman
Copy link
Contributor Author

ebozduman commented Sep 29, 2020

@kanagarajkm, @kaankabalak,

"arrayMode: true" option works only for node attributes. There are also other options to define what those attr and text fields are going to be named as. So, it doesn't work for us.
What we need is a new option to tell fxp the name of those nodes that will always be parsed as an array.
Since fxp did not have this kind of an option by design, I found many issues and prs created for this purpose:

Issue #23
Issue #43
Issue #68
PR #160
PR #135
PR #181
Issue #166
Issue #190
Issue #185

I checked all of them and they are at some point closed either because it is very complicated or the request is rejected because the fix impacts fxp performance adversely.

fxp package's default behavior is to return a single json element, if the node appears once, that is; the node name is not repeated more than once.
If the xml node is repeated more than once with different values, then these nodes, with the same name, are returned/parsed in an array.
So, the simple solution is to also return an array regardless of how many times the node name appears; once, twice or more.

Note: The extra commits you see in this PR had to be created to debug and diagnose a build issue, which I could not reproduce in my local environment.

@ebozduman
Copy link
Contributor Author

@kanagarajkm, @kaankabalak,
Reminder => PTAL again...

@ebozduman
Copy link
Contributor Author

@kaankabalak , @kanagarajkm ,
ping!
Any other review comments?

@harshavardhana harshavardhana merged commit fdbc801 into minio:master Nov 5, 2020
@ebozduman ebozduman deleted the update-xml-parser branch November 11, 2020 08:52
@PacoDu PacoDu mentioned this pull request Dec 16, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Upgrading XML Parser
4 participants