Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sanitize request body for all HTTP verbs #15

Merged
merged 1 commit into from
Jul 7, 2014
Merged

Sanitize request body for all HTTP verbs #15

merged 1 commit into from
Jul 7, 2014

Conversation

ntalbott
Copy link
Contributor

@ntalbott ntalbott commented Jul 7, 2014

The HTTP spec allows a GET body, and Rails violates the intent of the
spec by parsing that body (and I wouldn't be surprised if other
frameworks do as well). Unfortunately this means that a poorly behaved
client (looking at you, EasuoSpider) can submit bad UTF-8 in a GET and
cause upstream problems.

This fixes the issue by sanitizing rack.input - the request body - for
all request methods.

Closes #14.

@bf4
Copy link
Collaborator

bf4 commented Jul 7, 2014

Hi @ntalbott Funny, I've been getting that same spider myself, as well. I'm ok with this since the code guards against an empty rack.input anyway. Also implicated in another rack bug rack/rack#337 (comment)

@@ -33,10 +33,7 @@ def call(env)
PUT = 'PUT'
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's remove the above two lines PUT = 'PUT' since we're not using them anymore

The HTTP spec allows a GET body, and Rails violates the intent of the
spec by parsing that body (and I wouldn't be surprised if other
frameworks do as well). Unfortunately this means that a poorly behaved
client (looking at you, EasuoSpider) can submit bad UTF-8 in a GET and
cause upstream problems.

This fixes the issue by sanitizing `rack.input` - the request body - for
all request methods.

Closes #14.
@ntalbott
Copy link
Contributor Author

ntalbott commented Jul 7, 2014

@bf4 removed those lines.

bf4 added a commit that referenced this pull request Jul 7, 2014
Sanitize request body for all HTTP verbs
@bf4 bf4 merged commit 1b09926 into whitequark:master Jul 7, 2014
@ntalbott
Copy link
Contributor Author

ntalbott commented Jul 8, 2014

Awesome - thanks for the quick merge. Putting this into production today to replace my temporary hack.

@j15e
Copy link

j15e commented Jul 9, 2014

Thanks, EasouSpider is hell on earth.

@henrik
Copy link

henrik commented Jul 10, 2014

I actually just got an "invalid %-encoding" even after adding this lib (including the fix above) to our Rails app.

Full backlog: https://gist.github.com/henrik/9d1c3882a9a34ed924f8

Summarized:

ArgumentError (invalid %-encoding (jI�bw\�]X��z��|/�O�������`���[�==C��|�0%|^���B3��74���
����7~o�w������6\�7rA�[]9�-8��w�1��A�5}sL�
�FB�����-��<0�)��4���S����s������ ��#�?��sH
z�
  gTlXh��EnTvI��YORTm_���)L   h�#��=v���D�oQ`�g��d
                                                    }�}e���N0U#0��z�)):
  /opt/app_ruby/lib/ruby/2.1.0/uri/common.rb:901:in `decode_www_form_component'
  rack (1.4.5) lib/rack/utils.rb:41:in `unescape'
  rack (1.4.5) lib/rack/utils.rb:94:in `block (2 levels) in parse_nested_query'
…
  actionpack (3.2.19) lib/action_dispatch/http/request.rb:275:in `parse_query'
  rack (1.4.5) lib/rack/request.rb:209:in `POST'
…
  rack-cache (1.2) lib/rack/cache/context.rb:51:in `call'
  /data/auctionet/shared/bundled_gems/ruby/2.1.0/bundler/gems/rack-utf8_sanitizer-053665e30a14/lib/rack/utf8_sanitizer.rb:15:in `call'
  honeybadger (1.15.3) lib/honeybadger/rack/error_notifier.rb:43:in `call'
…
  /data/auctionet/shared/bundled_gems/ruby/2.1.0/bin/unicorn:23:in `<main>'

@j15e
Copy link

j15e commented Jul 10, 2014

^ +1 still receiving errors from the damn EasouSpider 😢

@ntalbott
Copy link
Contributor Author

@henrik @j15e I'd need to see the headers and preferably the whole Rack env to figure out what's still getting through.

We haven't seen the issue on our side since deploying this, so I don't have anything to go off of yet.

@henrik
Copy link

henrik commented Jul 10, 2014

@ntalbott Thank you. Don't think we get that in any logs currently, but I added some code to hopefully store it next time (via request.env in Rails controllers).

@bf4
Copy link
Collaborator

bf4 commented Jul 11, 2014

Hey, so this encoding issue that @henrik posted is, depending on your opinion, a bug in Ruby, a bug in Rack, or the job of your web server.

tell-tale signs for this diagnosis:

  1. "invalid %-encoding"
  2. opt/app_ruby/lib/ruby/2.1.0/uri/common.rb:901:in 'decode_www_form_component'
    rack (1.4.5) lib/rack/utils.rb:41:in 'unescape'

It is outside the scope of this middleware, per @raggi there and as agreed by the passenger team, 2

See rack/rack#337 (comment) for discussion. There, I link to a middleware I wrote that triggers and handles this issue, which is how I chose to deal with it, in absence of a change to the web server.

I did briefly looking into putting hooks for before/after filters to this middleware, but I think it got too bloated and complicated and too out-of-scope too quickly. Also, this is @whitequark 's project, and an important part of any app that uses it, so I am being particularly conservative :)

@henrik
Copy link

henrik commented Jul 11, 2014

Had some more errors but the debug code didn't work as I hoped.

@bf4 Thank you. Adding your middleware and hoping for the best!

@henrik
Copy link

henrik commented Jul 12, 2014

So now I'm using latest rack-utf8_sanitizer plus the middleware by @bf4 and I still got this error tonight: https://gist.github.com/henrik/ed1ce44d1244196c8ea5

app/middleware/handle_invalid_percent_encoding.rb is the middleware by @bf4.

If I understand it correctly, what's happening is that the percent encoding middleware runs first, triggers an exception that it's not designed to handle (namely, the situation that rack-utf8_sanitizer is designed to handle), and just raises it.

So I switched their order and will see if that works better. In my config/application.rb:

    # NOTE: These must be in this order relative to each other.
    # HandleInvalidPercentEncoding just raises for encoding errors it doesn't cover
    # when it should pass them on to Rack::UTF8Sanitizer.
    config.middleware.insert 0, HandleInvalidPercentEncoding
    config.middleware.insert 0, Rack::UTF8Sanitizer  # from a gem

@bf4
Copy link
Collaborator

bf4 commented Jul 13, 2014

fwiw, I'm running it without errors as

  config.middleware.insert_before "Rack::Runtime", 'ExceptionApp'
  config.middleware.insert_before "Rack::Runtime", Rack::UTF8Sanitizer

(haven't changed the 'Rack::Runtime' to 0 yet)

I added the 'Rails' version of the 'Exception App' middleware in a comment https://gist.github.com/bf4/d26259acfa29f3b9882b#comment-1262675 I don't think it should make a difference, though.

@henrik
Copy link

henrik commented Jul 13, 2014

Have had this switched-order setup (with the middleware inserted the way I pasted it, and with "the Rack version" rather than "the Rails version") deployed to my Rails app for maybe 6 hours now – not a single error so far. Promising!

@henrik
Copy link

henrik commented Jul 13, 2014

This seems like a simpler solution than this gem + the middleware if it works: http://stackoverflow.com/a/24637719/6962

@whitequark
Copy link
Owner

Do you really want to catch all ArgumentErrors? E.g. "foo".index() will be eaten by it...

@henrik
Copy link

henrik commented Jul 13, 2014

@whitequark That's a very good point.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Bad UTF-8 in GET body
5 participants