Jackson message converters and codecs do not respect character encoding in canRead/canWrite #25076

Closed
saimonsez opened this issue May 14, 2020 · 9 comments
Labels: in: web, status: backported, type: bug

Comments

@saimonsez

Affects: org.springframework:spring-web:5.1.15.RELEASE


XML web services seem to be limited to Unicode encodings. Using jackson-dataformat-xml 2.9.8 for a web service that produces "text/xml;charset=ISO-8859-1" results in UTF-8 output.

The problem might be located in AbstractJackson2HttpMessageConverter:

...
protected void writeInternal(Object object, @Nullable Type type, HttpOutputMessage outputMessage) throws IOException, HttpMessageNotWritableException {
    MediaType contentType = outputMessage.getHeaders().getContentType();
    JsonEncoding encoding = getJsonEncoding(contentType);
    JsonGenerator generator = this.objectMapper.getFactory().createGenerator(outputMessage.getBody(), encoding);
...

JsonEncoding only provides Unicode encodings. A workaround would be to override the whole method, but it is quite lengthy.

I made a very small reproducer: https://github.com/saimonsez/spring-jackson-xml-encoding-problem
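
To make the symptom concrete, here is a minimal JDK-only sketch (not taken from the reproducer; the class and method names are hypothetical) of what a client sees when the body is written as UTF-8 but the Content-Type header announces ISO-8859-1:

```java
import java.nio.charset.StandardCharsets;

final class CharsetMismatchDemo {

    // Encodes the text as UTF-8 (what the converter actually writes), then
    // decodes it as ISO-8859-1 (what a client trusting the header would do).
    static String decodeAsLatin1(String text) {
        byte[] utf8Bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String original = "\u00e9"; // a single e-acute character
        String seenByClient = decodeAsLatin1(original);
        // The one character arrives as two: U+00C3 followed by U+00A9.
        System.out.println(seenByClient.length());
        System.out.println(Integer.toHexString(seenByClient.charAt(0)));
        System.out.println(Integer.toHexString(seenByClient.charAt(1)));
    }
}
```

Pure ASCII content would look fine under both charsets, which is probably why this kind of mismatch can go unnoticed for a while.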

@spring-projects-issues spring-projects-issues added the status: waiting-for-triage An issue we've not yet triaged or decided on label May 14, 2020
@poutsma poutsma self-assigned this Jun 3, 2020
@poutsma poutsma added the in: web Issues in web modules (web, webmvc, webflux, websocket) label Jun 3, 2020
@poutsma poutsma added this to the 5.2.7 milestone Jun 4, 2020
@poutsma poutsma added type: bug A general bug and removed status: waiting-for-triage An issue we've not yet triaged or decided on labels Jun 4, 2020
@poutsma
Contributor

poutsma commented Jun 4, 2020

The underlying issue seems to be that AbstractJackson2HttpMessageConverter and its subclasses do not check the charset of the media type when asked whether they can read or write it. As a result, the converter reports that it can write (for instance) "application/json;charset=ISO-8859-1", but in practice writes the default charset (UTF-8).

As you alluded to, Jackson does not support non-Unicode encodings and we cannot change that, but we can fix the issue above. This way, you can use a different XML converter (such as the Jaxb2RootElementHttpMessageConverter, which does support non-Unicode charsets).
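
The kind of check described above could look roughly like the following JDK-only sketch. It is not the actual Spring change: the class name is hypothetical, and the set of charset names is an assumption meant to mirror the Unicode encodings that Jackson's JsonEncoding offers.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

final class EncodingCheckSketch {

    // Assumption: the charset names matching Jackson's JsonEncoding values.
    private static final Set<String> SUPPORTED = new HashSet<>(Arrays.asList(
            "UTF-8", "UTF-16BE", "UTF-16LE", "UTF-32BE", "UTF-32LE"));

    // A media type without a charset parameter is accepted; one with a
    // charset is accepted only if the charset is in the supported set.
    static boolean supportsCharset(Charset charset) {
        return charset == null || SUPPORTED.contains(charset.name());
    }

    public static void main(String[] args) {
        System.out.println(supportsCharset(null));                        // true
        System.out.println(supportsCharset(StandardCharsets.UTF_8));      // true
        System.out.println(supportsCharset(StandardCharsets.ISO_8859_1)); // false
    }
}
```

With a check like this wired into canRead/canWrite, a converter would decline "charset=ISO-8859-1" instead of claiming it and silently writing UTF-8.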

@spring-projects-issues spring-projects-issues added status: backported An issue that has been backported to maintenance branches and removed for: backport-to-5.1.x labels Jun 4, 2020
@poutsma poutsma changed the title from "Encoding issue with XML webservice" to "AbstractJackson2HttpMessageConverter does not respect character encoding in canRead/canWrite" Jun 4, 2020
@saimonsez
Author

Regarding Jackson's support for non-Unicode encodings, please also see FasterXML/jackson-dataformat-xml#315.

@poutsma poutsma changed the title from "AbstractJackson2HttpMessageConverter does not respect character encoding in canRead/canWrite" to "Jackson message converters and codecs do not respect character encoding in canRead/canWrite" Jun 5, 2020
@poutsma
Contributor

poutsma commented Jun 5, 2020

Regarding Jackson's support for non-Unicode encodings, please also see FasterXML/jackson-dataformat-xml#315.

Thank you, I will keep an eye on that.

For now, my recommendation would be to use a different XML message converter for non-Unicode charsets, such as the aforementioned Jaxb2RootElementHttpMessageConverter, which does support other encodings. To do so, you would have to override configureMessageConverters in your WebMvcConfigurationSupport configuration class, to make sure that the JAXB message converter comes before the Jackson XML converter.

When the fix for this has been released, it will no longer be necessary to override configureMessageConverters, because the Jackson converters will no longer claim that they support non-Unicode encodings. You would still have to override extendMessageConverters in your configuration class to add the JAXB converter, though, because it is not added by default.
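
Why the converter order matters before the fix, and stops mattering after it, can be shown with a toy model. This is a sketch only: the Converter type and the three converter instances are hypothetical stand-ins, the "UTF-" prefix test is a simplification, and real converter selection also takes the Java type and the media type into account.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

final class ConverterOrderSketch {

    /** Hypothetical stand-in for a message converter; only the charset check is modeled. */
    static final class Converter {
        final String name;
        final Predicate<String> canWrite;

        Converter(String name, Predicate<String> canWrite) {
            this.name = name;
            this.canWrite = canWrite;
        }
    }

    // Before the fix: claims every charset (and then writes UTF-8 anyway).
    static final Converter JACKSON_BEFORE_FIX =
            new Converter("jackson", charset -> true);
    // After the fix: only claims Unicode charsets (simplified prefix check).
    static final Converter JACKSON_AFTER_FIX =
            new Converter("jackson", charset -> charset.startsWith("UTF-"));
    // Assumption for this model: the JAXB converter accepts any charset.
    static final Converter JAXB =
            new Converter("jaxb", charset -> true);

    /** Returns the name of the first converter that claims the charset, or "none". */
    static String select(List<Converter> converters, String charset) {
        for (Converter converter : converters) {
            if (converter.canWrite.test(charset)) {
                return converter.name;
            }
        }
        return "none";
    }

    public static void main(String[] args) {
        // Before the fix, Jackson first: Jackson wins and the charset is ignored.
        System.out.println(select(Arrays.asList(JACKSON_BEFORE_FIX, JAXB), "ISO-8859-1")); // jackson
        // Before the fix, JAXB moved to the front: JAXB handles the request.
        System.out.println(select(Arrays.asList(JAXB, JACKSON_BEFORE_FIX), "ISO-8859-1")); // jaxb
        // After the fix, order no longer matters for ISO-8859-1.
        System.out.println(select(Arrays.asList(JACKSON_AFTER_FIX, JAXB), "ISO-8859-1"));  // jaxb
        System.out.println(select(Arrays.asList(JACKSON_AFTER_FIX, JAXB), "UTF-8"));       // jackson
    }
}
```

In the model, appending the JAXB converter to the end of the list is enough once Jackson declines non-Unicode charsets, which matches the difference between configureMessageConverters and extendMessageConverters described above.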

@poutsma
Contributor

poutsma commented Jun 5, 2020

This issue also occurs in the Jackson codecs (i.e. Jackson2CodecSupport and subclasses), so I will fix it there as well.

poutsma added a commit that referenced this issue Jun 5, 2020
Before this commit, AbstractJackson2HttpMessageConverter and subclasses
did not check media type encoding in the canRead and canWrite
methods. As a result, the converter reported that it can write
(for instance) "application/json;charset=ISO-8859-1", but in practice
wrote the default charset (UTF-8).

This commit fixes that bug.

See: gh-25076
poutsma added a commit that referenced this issue Jun 5, 2020
Before this commit, Jackson2CodecSupport and subclasses
did not check media type encoding in the supportsMimeType
method (called from canEncode/canDecode).
As a result, the encoder reported that it can write
(for instance) "application/json;charset=ISO-8859-1", but in practice
wrote the default charset (UTF-8).

This commit fixes that bug.

Closes: gh-25076
@poutsma poutsma closed this as completed in 5f1326f Jun 5, 2020
@poutsma
Contributor

poutsma commented Jun 5, 2020

Fixed. Once Jackson's ToXmlGenerator is no longer hardcoded to use Unicode (FasterXML/jackson-dataformat-xml#315), we can consider using an OutputStreamWriter instead of the current OutputStream/JsonEncoding combination when invoking the Jackson mapper.
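
For illustration, this is how an OutputStreamWriter moves the charset decision to the writer rather than the generator. It is a JDK-only sketch with hypothetical names and no Jackson involved; whether the mapper can be driven this way depends on the upstream issue mentioned above.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class WriterEncodingSketch {

    // Writes the text through an OutputStreamWriter configured with the
    // requested charset and returns the resulting body bytes.
    static byte[] write(String text, Charset charset) {
        ByteArrayOutputStream body = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(body, charset)) {
            writer.write(text);
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return body.toByteArray();
    }

    public static void main(String[] args) {
        String text = "\u00e9"; // a single e-acute character
        System.out.println(write(text, StandardCharsets.ISO_8859_1).length); // 1
        System.out.println(write(text, StandardCharsets.UTF_8).length);      // 2
    }
}
```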

@nikomiranda

nikomiranda commented Jul 29, 2020

Is there a way to go back to the behavior before 5.2.7? I have a client that sends ;charset=ISO-8859-1, but I want to read/write in UTF-8. Can the charset somehow be ignored?

@poutsma
Contributor

poutsma commented Aug 4, 2020

Is there a way to go back to the behavior before 5.2.7? I have a client that sends ;charset=ISO-8859-1, but I want to read/write in UTF-8. Can the charset somehow be ignored?

The only way I can think of is to write an MVC interceptor or HTTP servlet filter that changes the request character set for the URL and HTTP method the client uses.
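
One possible shape for that, as a sketch under assumptions: a servlet filter could wrap the request in an HttpServletRequestWrapper whose Content-Type accessors return a rewritten value. The servlet wiring is left out here; only the header rewriting is shown, using a hypothetical helper that does not handle quoted parameter values containing semicolons.

```java
import java.util.Locale;

final class ContentTypeCharsetSketch {

    // Returns the content type with any existing charset parameter dropped
    // and the given charset appended; other parameters are kept in order.
    static String withCharset(String contentType, String charset) {
        if (contentType == null) {
            return null;
        }
        String[] parts = contentType.split(";");
        StringBuilder result = new StringBuilder(parts[0].trim());
        for (int i = 1; i < parts.length; i++) {
            String param = parts[i].trim();
            boolean isCharset = param.toLowerCase(Locale.ROOT).startsWith("charset=");
            if (!param.isEmpty() && !isCharset) {
                result.append(';').append(param);
            }
        }
        return result.append(";charset=").append(charset).toString();
    }

    public static void main(String[] args) {
        // prints application/json;charset=UTF-8
        System.out.println(withCharset("application/json;charset=ISO-8859-1", "UTF-8"));
    }
}
```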

@nikomiranda

Is there a way to go back to the behavior before 5.2.7? I have a client that sends ;charset=ISO-8859-1, but I want to read/write in UTF-8. Can the charset somehow be ignored?

The only way I can think of is to write an MVC interceptor or HTTP servlet filter that changes the request character set for the URL and HTTP method the client uses.

I am not sure how to modify a header of HttpServletRequest using the MVC interceptor or a Filter without wrapping the request. Is that possible? The private method readJavaType reads the charset from the Content-Type header of the HTTP request.
