Add support for docx xml containing a BOM #27

Closed
wants to merge 5 commits into from

studiochris
Contributor

XML files within the DOCX file may begin with a byte order mark (BOM). Node doesn't strip it automatically when reading UTF-8 data as a string, even when the UTF-8 encoding is set explicitly (nodejs/node-v0.x-archive#1918 and nodejs/node-v0.x-archive#4039). When such a string is passed to sax, parsing fails with "Non-whitespace before first tag." because the string does not begin with the XML declaration or an element.
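A minimal sketch of the behaviour described above, using only Node's built-in `Buffer`. The sample XML and the variable names are made up for illustration and are not taken from the pull request; it only shows that the BOM survives decoding, not how sax reacts to it.

```javascript
// The UTF-8 encoding of U+FEFF (the BOM) is the three bytes EF BB BF.
const bomBytes = Buffer.from([0xef, 0xbb, 0xbf]);
const xmlBytes = Buffer.from('<?xml version="1.0" encoding="UTF-8"?><root/>', 'utf8');

// Decoding with an explicit encoding still leaves the BOM in the string.
const decoded = Buffer.concat([bomBytes, xmlBytes]).toString('utf8');

const startsWithBom = decoded.charCodeAt(0) === 0xfeff;
const startsWithDeclaration = decoded.startsWith('<?xml');

console.log(startsWithBom, startsWithDeclaration); // logs: true false
```

Because the first character is U+FEFF rather than `<`, a strict parser sees content before the first tag.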

studiochris and others added 3 commits August 18, 2014 22:26
XML files within the DOCX file may begin with a byte order mark. Node doesn't strip this out automatically, and XML strings passed to sax that start with the BOM sequence will cause read errors because the XML string will not begin with the XML declaration.

References:
nodejs/node-v0.x-archive#1918
nodejs/node-v0.x-archive#4039
@mwilliamson
Owner

Thanks for the pull request! I'm wondering whether it's worth pulling in a library rather than implementing `stripBom` directly, seeing as it's a one-liner (`string.replace(/^\uFEFF/g, '')`).

Also, if you could amend the commit to avoid the whitespace noise in the diff, that would be great.
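The one-liner suggested above could be wrapped roughly as follows. This is a sketch rather than the code that was merged: the name `stripBom` comes from the discussion, while the function body, parameter name and sample strings are assumptions. The `g` flag is left out here because the `^` anchor already limits the match to a single leading BOM.

```javascript
// Hypothetical helper: removes one leading BOM (U+FEFF) from an already-decoded string.
function stripBom(xmlString) {
  // Without the m flag, ^ matches only at the very start of the input.
  return xmlString.replace(/^\uFEFF/, '');
}

const withBom = '\uFEFF<?xml version="1.0"?><root/>';
const withoutBom = '<?xml version="1.0"?><root/>';

console.log(stripBom(withBom) === withoutBom);    // logs: true
console.log(stripBom(withoutBom) === withoutBom); // logs: true
```

Strings without a BOM pass through unchanged, so the helper can be applied to every XML part before parsing.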

studiochris and others added 2 commits August 19, 2014 13:54
Replace with simple string.replace
@studiochris
Contributor Author

I wondered about that myself. This does seem much simpler, since the input is already a string at this point in the code.

@mwilliamson
Owner

Merged and published as 0.3.10. Thanks again!
