Fix: content_type/charset handling #261

digitalresistor · 2016-07-09T23:10:21Z

- Response.content_type removes the charset unless the new content_type is a text-like type that accepts a charset parameter.
- Reverted the change made in Improve charset defaults in Response #253 whereby Response.charset would not allow the user to set a charset if the content type was JSON. I am not a big fan of trying to stop the user from shooting themselves in the foot, and would rather let them add/remove as needed.
- json.dumps/loads are now always UTF-8. Python returns a string to us, and we can encode it as we see fit.
- Response.__init__ has been cleaned up to remove a lot of extraneous branch conditions that were only complicating the logic.
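As a rough illustration of the JSON point above, here is a minimal standard-library sketch. The helper names `dump_json_body` and `load_json_body` are hypothetical and are not part of WebOb; the sketch only shows the idea that `json.dumps` returns a string and the bytes are always produced and read as UTF-8.

```python
import json


def dump_json_body(value):
    """Serialize to JSON text, then encode the text as UTF-8 bytes."""
    # json.dumps returns a str; the encoding is a separate, explicit step.
    return json.dumps(value, separators=(",", ":")).encode("utf-8")


def load_json_body(body):
    """Decode the bytes as UTF-8, ignoring any charset on the header."""
    return json.loads(body.decode("utf-8"))


body = dump_json_body({"a": 1})
print(body)
print(load_json_body(body))
```

Because the encoding is fixed, a JSON response does not need a charset parameter on its Content-Type for the body to round-trip.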

digitalresistor · 2016-07-09T23:13:53Z

Closes #130 and #237

digitalresistor · 2016-07-09T23:59:31Z

Closes #236

digitalresistor · 2016-07-10T01:26:25Z

This change breaks Pyramid because it treats JSON as text, and attempts to get/set .text on the Response object.

cdent · 2016-07-11T16:58:15Z

I reckon a common way in which this change may surprise people is if they have tests which were written to expect (the incorrect) application/json; charset=UTF-8. With these changes they'll now get (the correct) application/json. Since such tests are wrong™ I reckon that's okay.
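A test that compares the whole header string is what breaks here. As a hedged sketch (using only the standard library, not WebOb itself), a test could instead parse the header and assert on the media type and the charset separately. The helper name `parse_content_type` is made up for this example.

```python
from email.message import Message


def parse_content_type(header_value):
    """Split a Content-Type header into (media type, charset or None)."""
    msg = Message()
    msg["Content-Type"] = header_value
    return msg.get_content_type(), msg.get_param("charset")


old = parse_content_type("application/json; charset=UTF-8")
new = parse_content_type("application/json")
print(old)
print(new)
```

Both values report the same media type, so an assertion on the first element alone would pass before and after the change.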

digitalresistor · 2016-07-11T19:47:27Z

Any change made in this area was going to have this issue. I'd argue that I'd rather have WebOb do the right thing than continue sending an invalid Content-Type.

digitalresistor · 2016-07-12T17:22:37Z

@cdent Wanted to add a little more of a note: in WebOb 1.6 there's already been some work to remove charset=UTF-8 from certain responses that accidentally contained it, so I don't think it will be that big of a surprise to people.

digitalresistor · 2016-07-17T01:05:58Z

CHANGES.txt

+
+     # Will raise
+     try:
+        print(res.text)


mmerickel · 2016-07-17T02:13:22Z

Are there any planned deprecation warnings for this feature? Or are we just ripping the bandaid off?

digitalresistor · 2016-07-17T02:18:34Z

@mmerickel What part of it? The content-type/charset stuff is fixing the remainder of the items that should have been fixed in 1.6 but were inadequately completed.

The default_encoding is something I am not happy with, and is a change I am no longer planning to make.

Addressing #236

Refactored the logic in Response.__init__ to handle default charset more consistently. Added logic in the charset setter that ignores attempts to set it on JSON content types. Removed explicit charset specification from exceptions since this is handled correctly for the text types within Response. Fixed some affected tests, and added assertions for content types in exceptions. Addresses #237

This way the next time I come across this I don't have to spend 20 minutes trying to figure out why status_code wasn't being used.

If the user passes no content_type, the headerlist contains no Content-Type, and there is no default content type, we may end up with no content_type at all. In that case we want to make sure we don't set anything in the header dictionary, since None is not a valid header value to return in a Response.
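The guard described above might look roughly like the following. This is a simplified sketch with a plain dict standing in for the header storage; `set_content_type_header` and its parameters are hypothetical names, not WebOb's actual code.

```python
def set_content_type_header(headers, passed=None, default=None):
    """Set Content-Type only when a usable value exists."""
    if "Content-Type" in headers:
        # A Content-Type already present in the headers wins.
        return headers
    chosen = passed if passed is not None else default
    if chosen is not None:
        # Never store None: it is not a valid header value.
        headers["Content-Type"] = chosen
    return headers


print(set_content_type_header({}))
print(set_content_type_header({}, default="text/html"))
```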

Overhaul a lot of the logic in Response.__init__ to be smaller, and remove a lot of extra logic checking that didn't belong in __init__ but instead in the properties themselves.

We no longer want to specialise application/json, instead attempt to do the right thing as best as possible all the time.

Don't try to stop the user from shooting themselves in the foot. If the user explicitly asks for a charset, even on a content type that may not support it as a parameter, who are we to tell them they can't do that?

Makes pytest output more sane

Just encode the json.dumps as UTF-8, then raise an error if the user passes a text type. This basically undoes a bunch of changes where we were attempting to stop the user from shooting their own foot off. Let them shoot.

The docs fairy came to town!

digitalresistor · 2016-07-17T02:34:18Z

webob/response.py

+
+                if 'charset' in params:
+                    if not _content_type_has_charset(value):
+                        warn_deprecation(


This should maybe be turned into a RuntimeWarning instead.
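For reference, emitting a RuntimeWarning instead of a DeprecationWarning can be done with the standard `warnings` module. The sketch below is one possible reading of the snippet, not the code from webob/response.py: `accepts_charset` is a hypothetical stand-in for `_content_type_has_charset` (whose exact rule may differ), and the warning text is invented.

```python
import warnings


def accepts_charset(content_type):
    """Hypothetical rule for 'text-like' types; WebOb's rule may differ."""
    mimetype = content_type.partition(";")[0].strip().lower()
    return (
        mimetype.startswith("text/")
        or mimetype.endswith("+xml")
        or mimetype == "application/xml"
    )


def drop_unsupported_charset(content_type, params):
    """Warn and drop 'charset' when the new type does not take one."""
    if "charset" in params and not accepts_charset(content_type):
        warnings.warn(
            "charset removed: %s does not accept one" % content_type,
            RuntimeWarning,
            stacklevel=2,
        )
        params = {k: v for k, v in params.items() if k != "charset"}
    return params
```

Unlike DeprecationWarning, which Python's default filters hide in most contexts, RuntimeWarning is shown by default, so users are more likely to see it.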

digitalresistor · 2016-07-17T02:34:21Z

The only backwards incompatible change is https://github.com/Pylons/webob/pull/261/files#diff-84ba6a58e29e169ddf6578bceca65dc9R768. If you currently have a content-type with a charset, and you then replace it with one that doesn't take a charset, the charset is explicitly removed, but the rest of the parameters stick around.

I don't see a way to gracefully deprecate the removal of charset, since that is the crux of the issue. For the other parameters it can continue as is for now, and full removal of all of that can be done at a later point in time.
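To make the described behaviour concrete, here is a minimal sketch of replacing a Content-Type while carrying over the old parameters. It is an illustration under assumptions, not WebOb's implementation: `replace_content_type` and `accepts_charset` are hypothetical names, and the rule for which types accept a charset is a guess.

```python
def accepts_charset(content_type):
    """Hypothetical rule for 'text-like' types; WebOb's rule may differ."""
    mimetype = content_type.partition(";")[0].strip().lower()
    return (
        mimetype.startswith("text/")
        or mimetype.endswith("+xml")
        or mimetype == "application/xml"
    )


def replace_content_type(existing_header, new_type):
    """Return the new header value, keeping old parameters where valid."""
    if ";" in new_type:
        # The caller supplied parameters explicitly; use them as given.
        return new_type
    _, _, old_params = (existing_header or "").partition(";")
    kept = []
    for param in old_params.split(";"):
        param = param.strip()
        if not param:
            continue
        name = param.partition("=")[0].strip().lower()
        if name == "charset" and not accepts_charset(new_type):
            continue  # charset is dropped, other parameters stay
        kept.append(param)
    return "; ".join([new_type] + kept)


print(replace_content_type("text/html; charset=UTF-8; foo=bar", "application/json"))
# application/json; foo=bar
```

With no existing header the function simply returns the new type, so a Content-Type is always set.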

- make sub-classing notes an HTML list
- fix grammar and punctuation
- rewrap as needed

digitalresistor · 2016-07-18T16:37:27Z

Thanks @stevepiercy. I need to make a couple minor changes to this, and then this will be ready to go.

I forgot to document app_iter for the constructor so I need to do that.

When the new Content-Type value did not contain any semicolons we attempted to save any existing parameters from the existing Content-Type headers. If that header was empty, it would do nothing, and no Content-Type would be set at all. The new branch explicitly sets the Content-Type to the value if we don't find any parameters to save.

We already state that if the headerlist contains a Content-Type, then we won't accept the passed-in value, nor will we set it to a default content type. There is no reason to pull the value out of the header, because if it exists we won't set the content_type anyway. We also have all of these fantastic properties, so why change the underlying headers list directly when we can just use the property and have it do its appropriate magic?

If the caller provides the headerlist, we shouldn't try to be too smart and add in a default Content-Type. This is used by Request.get_response() for example to create a Response object that matches the Response received from a WSGI application. We do still add a default charset even if there is none provided on Content-Types that are known to allow for a charset (basically texty responses). This way Response.text will function without additional work.

The charset provided to the constructor should not be used if a charset is already set on the Content-Type.
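The last two notes can be sketched together: a charset already present on the Content-Type is left alone, and otherwise a charset is only appended to text-like types. This is a simplified illustration, not WebOb's constructor; `apply_charset` and `accepts_charset` are hypothetical helpers and the text-like rule is an assumption.

```python
def accepts_charset(content_type):
    """Hypothetical rule for 'text-like' types; WebOb's rule may differ."""
    mimetype = content_type.partition(";")[0].strip().lower()
    return (
        mimetype.startswith("text/")
        or mimetype.endswith("+xml")
        or mimetype == "application/xml"
    )


def apply_charset(content_type, charset="UTF-8"):
    """Append a charset only when none is set and the type accepts one."""
    if content_type is None:
        return None
    if "charset=" in content_type.lower():
        # A charset already on the Content-Type takes precedence.
        return content_type
    if charset and accepts_charset(content_type):
        return "%s; charset=%s" % (content_type, charset)
    return content_type


print(apply_charset("text/html"))
# text/html; charset=UTF-8
print(apply_charset("text/html; charset=latin-1"))
# text/html; charset=latin-1
```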


digitalresistor added this to the 1.7.0 milestone Jul 9, 2016

This was referenced Jul 9, 2016

Improve charset defaults in Response #253

Closed

default_content_type does not match documented value #213

Closed

This was referenced Jul 9, 2016

Response object adding default_content_type is perhaps bad magic #205

Closed

Oddities due to having a default_content_type/charset #238

Closed

digitalresistor mentioned this pull request Jul 10, 2016

JSON is not text and has no charset (upcoming WebOb changes) Pylons/pyramid#2691

Closed

digitalresistor reviewed Jul 17, 2016

CHANGES.txt

     # Will raise
     try:
        print(res.text)


Typo.

rylz and others added 15 commits July 16, 2016 20:23

whatsnew-1.6: more detail in note on JSON charset

ff5bb96

Addressing #236

Make PEP8 happier

76c89bd

Document why we are doing what we are doing

d3fba26

This way the next time I come across this I don't have to spend 20 minutes trying to figure out why status_code wasn't being used.

Remove nocovers

cd944b6

Remove unnecessary statement

f576afc

Remove todo

2655b08

Clean up Response.__init__

7ea15ed

Overhaul a lot of the logic in Response.__init__ to be smaller, and remove a lot of extra logic checking that didn't belong in __init__ but instead in the properties themselves.

Remove _is_json and add _content_type_has_charset

f4201f3

We no longer want to specialise application/json, instead attempt to do the right thing as best as possible all the time.

Allow charset on all content_types

188a417

Don't try to stop the user from shooting themselves in the foot. If the user explicitly asks for a charset, even on a content type that may not support it as a parameter, who are we to tell them they can't do that?

Flip assert around

840f3ae

Makes pytest output more sane

Verify that content_type doesn't modify parameters if passed

c564b71

json.dumps/loads is now always UTF-8

f9bcd6a

Instead of del, call prop function directly

3e7622b

digitalresistor added 5 commits July 16, 2016 20:23

Fixup tests

6b38c52

Make charset a first-class keyword argument

dabc50c

Add documentation for why lines exist

2afaa3b

Remove body_encoding

04082d7

Just encode the json.dumps as UTF-8, then raise an error if the user passes a text type. This basically undoes a bunch of changes where we were attempting to stop the user from shooting their own foot off. Let them shoot.

Documentation for Response()

141d14e

The docs fairy came to town!

digitalresistor force-pushed the fix/charset_handling branch from 774148a to 141d14e Compare July 17, 2016 02:25

digitalresistor reviewed Jul 17, 2016

digitalresistor and others added 2 commits July 16, 2016 20:45

Clarify headerlist

5115765

improve presentation of class variables

2dfee0f

- make sub-classing notes an HTML list
- fix grammar and punctuation
- rewrap as needed

digitalresistor added 12 commits July 29, 2016 21:10

Turn DeprecationWarning into RuntimeWarning

165a6ae

charset is not *that* special

69a512e

Update more documentation

892fe1e

Revert doc changes

d814273

Remove :py from Sphinx declarations

b8f11bf

Add new tests

160326a

Don't set charset if already set

451bf7b

The charset provided to the constructor should not be used if a charset is already set on the Content-Type.

Merge branch 'master' of github.com:Pylons/webob into fix/charset_han…

27f8e2f

…dling

Use full word, not shorthand

d6521b6

digitalresistor merged commit e632139 into master Jul 30, 2016

digitalresistor deleted the fix/charset_handling branch July 30, 2016 08:09

digitalresistor mentioned this pull request Jul 31, 2016

Response wrongly keeps charset in Content-Type header #130

Closed

digitalresistor mentioned this pull request Dec 10, 2016

Feature: remove ctype params #301

Merged

jensens mentioned this pull request Dec 28, 2016

Using WebOb 1.7.0 tests are failing plone/diazo#67

Closed

pyup-bot mentioned this pull request Nov 3, 2017

Update webob to 1.7.3 tracim/tracim#434

Closed

pyup-bot mentioned this pull request Jan 26, 2018

Update webob to 1.7.4 DemocracyClub/yournextrepresentative#407

Closed
