Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WstxSAXParser error handling when used with JAXB validation #196

Closed
winfriedgerlach opened this issue Feb 20, 2024 · 11 comments
Closed

WstxSAXParser error handling when used with JAXB validation #196

winfriedgerlach opened this issue Feb 20, 2024 · 11 comments

Comments

@winfriedgerlach
Copy link
Contributor

Use case: JAXB unmarshalling with schema validation enabled, unmarshalling an invalid XML document
Environment: WstxSAXParser 6.6.0 for parsing, JDK 17 Xerces for validation, jaxb-runtime 4.0.4
Expected outcome: org.xml.sax.SAXParseException with a useful message (e.g., "Content of element x incomplete, y expected")
Actual outcome: java.lang.AssertionError without any useful message or cause

We are using the Woodstox SAX parser with JAXB as a drop-in replacement for JDK's default SAX parser due to better performance. We have enabled XML Schema validation for JAXB by calling Unmarshaller.setSchema(). As Woodstox does not provide a javax.xml.validation.SchemaFactory, JAXB automatically falls back to JDK's built-in Xerces for XML Schema validation.

We can indeed confirm that this setup works, XML messages are parsed and validated. Unfortunately, due to the error handling in WstxSAXParser, the helpful SAXParseException generated by Xerces during XML validation is not thrown, but instead jaxb-runtime throws an AssertionError in UnmarshallingContext.endDocument():

https://github.com/eclipse-ee4j/jaxb-ri/blob/13402b7f1f19c840206936b56ad73e693eff2463/jaxb-ri/runtime/impl/src/main/java/org/glassfish/jaxb/runtime/v2/runtime/unmarshaller/UnmarshallingContext.java#L584-L585

When looking at the relevant code in Woodstox, two things come to mind:

} catch (IOException io) {
throwSaxException(io);
} catch (XMLStreamException strex) {
throwSaxException(strex);
} finally {
if (mContentHandler != null) {
mContentHandler.endDocument();
}

  1. the SAXParseException from schema validation is not handled in the catch (might be OK?)
  2. in finally, endDocument() is called, which leads to the AssertionError in UnmarshallingContext.endDocument()

If I understand JDK's JavaDoc correctly, it seems to be wrong to call endDocument() in case of an "fatal error", which may explain the diffferent behavior when using the Xerces SAX parser:

https://github.com/openjdk/jdk/blob/b419e9517361ed9d28f8ab2f5beacf5adfe3db91/src/java.xml/share/classes/org/xml/sax/ContentHandler.java#L144-L151

Could the .endDocument() call be moved away from the finally without breaking anything else? Looking forward to your opinion!

@cowtowncoder
Copy link
Member

cowtowncoder commented Feb 21, 2024

@winfriedgerlach From quick read your analysis sounds correct: I cannot remember why I put content handler call in finally block as that does, indeed, seem wrong.

But I'll have to think about this a bit since I usually have some reason for doing things, even if I do not immediately remember why :)
(my former self proves often surprisingly clever when I try changing things without careful thought)
I do wish that I had left a comment for everyone's benefit there tho.

@cowtowncoder
Copy link
Member

@winfriedgerlach Created #197 for change.

Passes CI, but to be honest, SAX parser test coverage is rather low (compared to Stax side). Still, I think change makes sense and could make it into 7.0.0 release.

@winfriedgerlach
Copy link
Contributor Author

winfriedgerlach commented Feb 21, 2024

@cowtowncoder thank you so much for the super quick reply!
This fix works for me - the behavior is now the same as with Xerces SAX, a jakarta.xml.bind.UnmarshalException being thrown with linked exception SAXParseException containing the useful detail message.

I am still not 100% sure whether it would make sense to call the error handler in case of a SAXParseException, as you do for the other exceptions (via throwSaxException())?

catch (SAXParseException sex) {
    if (mErrorHandler != null) {
        mErrorHandler.fatalError(sex);
    }
}

@cowtowncoder
Copy link
Member

@winfriedgerlach Good point, I'll have a look later tonight.

@cowtowncoder
Copy link
Member

cowtowncoder commented Feb 23, 2024

@winfriedgerlach Ok, so, Woodstox parser itself does not throw SAXParseExceptions which is probably why no such catch block exists. But could such thing be thrown by registered validator?

I'll update PR with what I think makes sense here.

@cowtowncoder
Copy link
Member

@winfriedgerlach Ok, I think I am done -- LMK if there are remaining concerns.

@winfriedgerlach
Copy link
Contributor Author

winfriedgerlach commented Feb 23, 2024

@cowtowncoder LGTM, thank you so much!

@ThisIsMyWay
Copy link

Hello @cowtowncoder ,
is there any plan to release it?

Thanks in advance for your reply :)

@cowtowncoder
Copy link
Member

Hmmh. Not too many things yet queued, but it has been a while.

So I'll consider releasing 7.0.0 in near future to get the fix out.

@ThisIsMyWay
Copy link

Great to hear. Thanks for your reply.

@cowtowncoder
Copy link
Member

Versions 6.7.0 and 7.0.0 have been released, see: https://repo1.maven.org/maven2/com/fasterxml/woodstox/woodstox-core/

cjmamo added a commit to smooks/smooks that referenced this issue Oct 29, 2024
cjmamo added a commit to smooks/smooks that referenced this issue Oct 29, 2024
…7.1.0 (#850)

* build(deps): bump com.fasterxml.woodstox:woodstox-core

Bumps [com.fasterxml.woodstox:woodstox-core](https://github.com/FasterXML/woodstox) from 6.6.2 to 7.1.0.
- [Commits](FasterXML/woodstox@woodstox-core-6.6.2...woodstox-core-7.1.0)

---
updated-dependencies:
- dependency-name: com.fasterxml.woodstox:woodstox-core
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <[email protected]>

* test: fix failing tests caused by FasterXML/woodstox#196

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: cjmamo <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants