JSON renderer uses invalid Content-Type #1611

Closed
dstufft opened this issue Mar 17, 2015 · 5 comments

Comments

@dstufft
Contributor

dstufft commented Mar 17, 2015

When using the JSON renderer, Pyramid adds this header: Content-Type: application/json; charset=UTF-8. However, according to the IANA, the application/json media type does not actually support a charset parameter. It is always in UTF-8; there is no other valid encoding.

@mmerickel
Member

Note: No "charset" parameter is defined for this registration.
Adding one really has no effect on compliant recipients.

I guess I don't see what the issue is. The JSON RFC defines the content as UTF-8 by default but says that other encodings are also valid. We are just being cautious, and according to your link still compliant, no?

@dstufft
Contributor Author

dstufft commented Mar 17, 2015

It's not a major issue, but it's not compliant, no. Any compliant parser will ignore the value because the application/json media type does not define any parameters (required or optional). It's as meaningless as putting application/json; frob=lob.
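To make the "ignored parameter" point concrete, here is a minimal sketch of how a recipient might split a Content-Type value into its media type and parameters, then decide on the media type alone. It uses only the standard library's email.message module; the helper names parse_content_type and is_json are made up for illustration and are not part of Pyramid or WebOb.

```python
from email.message import Message


def parse_content_type(header_value):
    """Split a Content-Type value into (media_type, params_dict)."""
    msg = Message()
    msg["Content-Type"] = header_value
    # get_params() returns [(media_type, ''), (name, value), ...];
    # the first entry is the media type itself, so skip it.
    params = dict(msg.get_params()[1:])
    return msg.get_content_type(), params


def is_json(header_value):
    """Hypothetical recipient check: only the media type matters."""
    media_type, _ignored_params = parse_content_type(header_value)
    return media_type == "application/json"


if __name__ == "__main__":
    for value in (
        "application/json",
        "application/json; charset=UTF-8",
        "application/json; frob=lob",
    ):
        print(value, "->", parse_content_type(value), is_json(value))
```

Under this sketch all three header values are treated the same way, which is the sense in which charset=UTF-8 carries no more meaning than frob=lob for this media type.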

The only valid encodings for JSON are UTF-8, UTF-16, and UTF-32. The way a compliant JSON parser determines the encoding is by looking at the first four octets:

Encoding

JSON text SHALL be encoded in Unicode. The default encoding is
UTF-8.

Since the first two characters of a JSON text will always be ASCII
characters [RFC0020], it is possible to determine whether an octet
stream is UTF-8, UTF-16 (BE or LE), or UTF-32 (BE or LE) by looking
at the pattern of nulls in the first four octets.

      00 00 00 xx  UTF-32BE
      00 xx 00 xx  UTF-16BE
      xx 00 00 00  UTF-32LE
      xx 00 xx 00  UTF-16LE
      xx xx xx xx  UTF-8

and

IANA Considerations

The MIME media type for JSON text is application/json.

Type name: application

Subtype name: json

Required parameters: n/a

Optional parameters: n/a

Encoding considerations: 8bit if UTF-8; binary if UTF-16 or UTF-32

JSON may be represented using UTF-8, UTF-16, or UTF-32. When JSON
is written in UTF-8, JSON is 8bit compatible. When JSON is
written in UTF-16 or UTF-32, the binary content-transfer-encoding
must be used.
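The null-pattern table quoted above can be sketched in a few lines of Python. This is only an illustration of the RFC text, not code from Pyramid or WebOb; the function name detect_json_encoding is made up, and falling back to UTF-8 for inputs shorter than four octets is an assumption, since the quoted table does not cover that case.

```python
import json


def detect_json_encoding(data: bytes) -> str:
    """Guess the Unicode encoding of a JSON octet stream from the
    pattern of null bytes in its first four octets."""
    if len(data) < 4:
        # Assumption: the quoted table needs four octets, so treat
        # anything shorter as the default encoding.
        return "utf-8"
    nulls = tuple(octet == 0 for octet in data[:4])
    patterns = {
        (True, True, True, False): "utf-32-be",   # 00 00 00 xx
        (True, False, True, False): "utf-16-be",  # 00 xx 00 xx
        (False, True, True, True): "utf-32-le",   # xx 00 00 00
        (False, True, False, True): "utf-16-le",  # xx 00 xx 00
    }
    # Anything else (xx xx xx xx) is taken to be UTF-8.
    return patterns.get(nulls, "utf-8")


def loads_bytes(data: bytes):
    """Decode using the detected encoding, then parse as JSON."""
    return json.loads(data.decode(detect_json_encoding(data)))


if __name__ == "__main__":
    text = '{"a": 1}'
    for codec in ("utf-8", "utf-16-le", "utf-16-be", "utf-32-le", "utf-32-be"):
        raw = text.encode(codec)
        print(codec, "->", detect_json_encoding(raw), loads_bytes(raw))
```

Note that nothing here looks at the Content-Type header: the encoding is recovered from the body itself, which is why a charset parameter adds no information for a parser that follows this rule.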

FWIW I'm OK if this is wontfixed too. I just noticed that it was generating a meaningless (to a compliant parser) value and figured I'd open an issue in case y'all cared.

@mmerickel
Member

Someone disagrees with you. The relevant code[1] is in webob (where this issue belongs). Webob is explicitly adding a charset to application/json. A simple fix if webob stays the same is to delete the charset after setting the content type.

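from webob import Response

# Option 1: set the content type in the constructor, then drop the charset.
resp = Response(content_type='application/json')
del resp.charset

# Option 2: set the content type on an existing response, then drop the charset.
resp = Response()
resp.content_type = 'application/json'
del resp.charset

# Option 3: write the header directly. It looks like you can trick webob
# this way, since it doesn't monitor headers for changes.
resp = Response()
resp.headers['Content-Type'] = 'application/json'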

[1] https://github.com/Pylons/webob/blob/master/webob/response.py#L124

@nnja
Contributor

nnja commented Apr 14, 2015

punt to webob

@mcdonc
Member

mcdonc commented Jun 6, 2015

Closing in this tracker, as it's now being tracked in WebOb.
