space prior to xml preamble causes nokogiri to lose child nodes, on jruby #790

Closed
coffeejunk opened this issue Nov 15, 2012 · 5 comments

@coffeejunk

When creating a new Nokogiri::XML document from a string that has one or more space characters before the XML preamble, all following child nodes are lost on JRuby:

jruby-1.7.0 :001 > require 'nokogiri'
 => true 
jruby-1.7.0 :002 > doc1 = Nokogiri::XML(" <?xml version='1.0' encoding='utf-8' ?><first \>")
 => #<Nokogiri::XML::Document:0x7fa name="document"> 
jruby-1.7.0 :003 > doc2 = Nokogiri::XML("<?xml version='1.0' encoding='utf-8' ?><first \>")
 => #<Nokogiri::XML::Document:0x7fe name="document" children=[#<Nokogiri::XML::Element:0x7fc name="first">]>

This works perfectly fine on Ruby 1.8, 1.9, rbx 1.8 and rbx 1.9:

1.9.3-p286 :001 > require 'nokogiri'
 => true 
1.9.3-p286 :002 > doc1 = Nokogiri::XML(" <?xml version='1.0' encoding='utf-8' ?><first \>")
 => #<Nokogiri::XML::Document:0x3fd36c8e30bc name="document" children=[#<Nokogiri::XML::ProcessingInstruction:0x3fd36c8e2b80 name="xml">, #<Nokogiri::XML::Element:0x3fd36c8e2964 name="first">]> 
1.9.3-p286 :003 > doc2 = Nokogiri::XML("<?xml version='1.0' encoding='utf-8' ?><first \>")
 => #<Nokogiri::XML::Document:0x3fd36c8d7f28 name="document" children=[#<Nokogiri::XML::Element:0x3fd36c8d776c name="first">]>

Oddly, all of the listed Ruby implementations behave as expected when the first element is not the XML preamble:

jruby-1.7.0 :004 > doc1 = Nokogiri::XML("<bar><foo /></bar>")
 => #<Nokogiri::XML::Document:0x804 name="document" children=[#<Nokogiri::XML::Element:0x802 name="bar" children=[#<Nokogiri::XML::Element:0x800 name="foo">]>]> 
jruby-1.7.0 :005 > doc2 = Nokogiri::XML(" <bar><foo /></bar>")
 => #<Nokogiri::XML::Document:0x80a name="document" children=[#<Nokogiri::XML::Element:0x808 name="bar" children=[#<Nokogiri::XML::Element:0x806 name="foo">]>]> 
1.9.3-p286 :004 > doc1 = Nokogiri::XML("<bar><foo /></bar>")
 => #<Nokogiri::XML::Document:0x3fd36c8cef90 name="document" children=[#<Nokogiri::XML::Element:0x3fd36c8ceb30 name="bar" children=[#<Nokogiri::XML::Element:0x3fd36c8ce8d8 name="foo">]>]> 
1.9.3-p286 :005 > doc2 = Nokogiri::XML(" <bar><foo /></bar>")
 => #<Nokogiri::XML::Document:0x3fd36c8c8848 name="document" children=[#<Nokogiri::XML::Element:0x3fd36c8c8280 name="bar" children=[#<Nokogiri::XML::Element:0x3fd36c8c7b50 name="foo">]>]>
@yokolet
Member

yokolet commented Nov 23, 2012

This behavior comes from the Apache Xerces parser. We can't fix this.

Did you use some other gem that depends on Nokogiri?

@jvshahid
Member

Actually, this is fixed on master. I just tried it and it produced the same output as the C Nokogiri. I also verified that the previous version of Nokogiri suffered from this problem. @yokolet can you please double-check?

@coffeejunk
Author

@yokolet I found this bug while using mbklein/equivalent-xml#9, but the output you see above came solely from Nokogiri in irb.

@jvshahid great news! :)

@yokolet
Member

yokolet commented Nov 25, 2012

@jvshahid hmm, it is fixed. Which commit fixed this?

Anyway, this issue should be closed. The fix will be in the next release.

@coffeejunk This bug was reported once before, when Nokogiri was used from fog (I think). But the original reporter hasn't used fog since then, so the bug remained open. This bug usually pops up when another gem generates an XML document that starts with a space.
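
For anyone stuck on a release without the fix, one possible workaround is to drop the whitespace in front of the preamble before the string reaches the parser. This is only a minimal sketch: `strip_leading_whitespace` is a hypothetical helper name, not part of Nokogiri, and it assumes the leading characters are plain whitespace (spaces, tabs, newlines).

```ruby
# Hypothetical helper (not part of Nokogiri): remove whitespace that precedes
# the XML preamble so the preamble is the first thing the parser sees.
def strip_leading_whitespace(xml)
  # \A anchors at the start of the string; \s+ matches spaces, tabs and newlines.
  # Strings without leading whitespace are returned unchanged.
  xml.sub(/\A\s+/, '')
end

padded  = " <?xml version='1.0' encoding='utf-8' ?><first />"
cleaned = strip_leading_whitespace(padded)
puts cleaned
```

The cleaned string can then be handed to `Nokogiri::XML(cleaned)` as in the sessions above.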

@yokolet yokolet closed this as completed Nov 25, 2012
@jvshahid
Member

It looks like it was fixed by commit 00bd8d5. I think it's the 'continue after fatal errors' feature.
