Improve CSRF Support for Safety, REST, and Decoupling #1573

dstufft · 2015-02-10T18:23:06Z

The current CSRF support in Pyramid suffers from a few problems:

It's needlessly coupled to sessions. If you wish to use a CSRF token without a session, you have to make a fake ISession whose session methods raise exceptions.
Since it uses the predicate system, failing a CSRF check is signaled back to the user as a 404 instead of something more appropriate like a 403 error. This makes users feel like they got the wrong URL, not the wrong CSRF token.
Views which use the CSRF token need a Vary: header based on the kind of session they're using (Vary: Cookie, for example, or Vary: Authorization); however, the current CSRF system doesn't do anything about this.
The CSRF protection is opt-in, which means that it's extremely easy for developers to accidentally leave it off. If they leave it off, their views will be vulnerable to CSRF, but like many security-sensitive problems this is non-obvious unless you're looking for it, since everything will appear to work.

I extracted some CSRF stuff from an application I'm writing in Pyramid and sketched out some basic ideas for how I think CSRF should be implemented. That's available (without tests or documentation) at https://github.com/dstufft/pyramid_csrf.

The basic idea there is:

Instead of coupling CSRF to ISession, it adds its own interface ICSRF (and ICSRFFactory). The ICSRF instance is made available as request.csrf.
- Splitting this off from the session means that CSRF can be implemented without requiring a session to be around. My pyramid_csrf app includes three implementations: CookieCSRF, which stores the tokens it generates in a dedicated cookie; SessionCSRF, which stores the tokens it generates in the session; and LegacySessionCSRF, which doesn't store or generate tokens itself at all and instead just dispatches to the legacy methods on ISession.
Instead of using the predicate system, it relies on being part of the request -> response cycle of calling a view. This means that instead of getting a 404 error, a failed check raises a 403 error while attempting to call the view.
Since (again) this is part of the request -> response cycle, it's able to smartly add Vary headers to the response to ensure that caches don't serve a page with a CSRF token cached inside of it.
It adds code in the default response path that ensures CSRF checking is done. In pyramid_csrf this is done in a hacky way, by creating a dynamic subclass of the current view mapper that wraps every view with a decorator. With support in Pyramid for some sort of on-by-default mechanism, or with CSRF baked directly into it, this hack wouldn't need to exist.
You can disable the CSRF system completely by calling config.registry.unregisterUtility(provider=ICSRFFactory).
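
As a rough illustration of the decoupling described above, the sketch below puts a cookie-backed and a session-backed implementation behind one small interface. It is not the code from pyramid_csrf: the `CSRF` protocol is a hypothetical stand-in for ICSRF, plain mappings stand in for the request's cookies and session, and the cookie name and session key are assumptions.

```python
import secrets
from typing import MutableMapping, Protocol


class CSRF(Protocol):
    """Hypothetical stand-in for the ICSRF interface."""

    def get_token(self) -> str: ...


class CookieCSRF:
    """Stores the token in a dedicated cookie, so no session is needed."""

    cookie_name = "csrf_token"  # assumed name, not taken from pyramid_csrf

    def __init__(self, cookies: MutableMapping[str, str]) -> None:
        # A plain mapping stands in for the request/response cookies.
        self.cookies = cookies

    def get_token(self) -> str:
        token = self.cookies.get(self.cookie_name)
        if token is None:
            token = secrets.token_urlsafe(32)
            self.cookies[self.cookie_name] = token
        return token


class SessionCSRF:
    """Stores the token inside whatever session mapping is in use."""

    session_key = "_csrf_token"  # assumed key

    def __init__(self, session: MutableMapping[str, str]) -> None:
        self.session = session

    def get_token(self) -> str:
        token = self.session.get(self.session_key)
        if token is None:
            token = secrets.token_urlsafe(32)
            self.session[self.session_key] = token
        return token
```

Code that only calls `get_token()` does not need to know which backend is behind it, which is the point of splitting the interface away from ISession.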

A key part in being able to make this on by default is that any particular view can exist in 3 states:

i. No CSRF is configured for this view. In this case, attempting to use an "unsafe" action (POST, PUT, etc.) will raise an HTTPMethodNotAllowed without any Vary headers.
ii. The view is configured to be exempt. In this case the CSRF system is a no-op for this view: no CSRF checking is done and no Vary headers are added. This is also what happens for every view if the CSRF system is disabled.
iii. The view is configured to be CSRF protected. In this case the CSRF system will kick in, check the CSRF token, and either allow the view code to be executed if things check out or raise a 403 error otherwise. In either case a Vary header is added based on the type of CSRF backend being used.
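
The three states can be sketched as a small decision function. This is only an illustration under stated assumptions: the `CSRFState` enum and `check_csrf` function are hypothetical names, the set of "safe" methods is assumed, and the sketch assumes a protected view skips the token comparison for safe methods, which the description above does not specify.

```python
import hmac
from enum import Enum

# Assumed set of "safe" methods; everything else is treated as unsafe.
SAFE_METHODS = frozenset({"GET", "HEAD", "OPTIONS", "TRACE"})


class CSRFState(Enum):
    UNCONFIGURED = "unconfigured"
    EXEMPT = "exempt"
    PROTECTED = "protected"


def check_csrf(state, method, supplied_token, expected_token):
    """Return (status, add_vary).

    status is None when the view may run, otherwise the HTTP error code.
    add_vary says whether a Vary header should be added to the response.
    """
    if state is CSRFState.EXEMPT:
        return None, False
    if state is CSRFState.UNCONFIGURED:
        if method.upper() in SAFE_METHODS:
            return None, False
        return 405, False  # HTTPMethodNotAllowed, no Vary header
    # PROTECTED: a Vary header is added whether or not the check passes.
    if method.upper() in SAFE_METHODS:
        return None, True  # assumption: safe methods need no token
    if supplied_token is not None and hmac.compare_digest(
        supplied_token.encode("utf-8"), expected_token.encode("utf-8")
    ):
        return None, True
    return 403, True
```

The comparison uses `hmac.compare_digest` so that the time taken does not depend on how many leading characters of the token match.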

The example code I wrote isn't meant to be drop-in ready for Pyramid, as it has a number of problems that I know of offhand:

It assumes that all CSRF tokens will Vary based on Cookie, but that should be dependent on the backend.
Things are written to prioritize readability over performance so things aren't always written how Pyramid would want them.
It hardcodes the name of the CSRF header and POST variable but this should be configurable in the real system.
It's possible that it'd prefer something like config.unset_csrf_factory() instead of config.registry.unregisterUtility(provider=ICSRFFactory).
It uses decorators to implement marking a view as exempt or protected, but probably this would be better suited to be part of @view_config or something like that.
It just defaults to CookieCSRF, but Pyramid would probably want a dynamic default where it looks to see if an ISessionFactory has been registered: if one has, default to LegacySessionCSRF, and if not, default to CookieCSRF. Maybe it'd also be possible to make those legacy CSRF methods on ISession optional, so that if a session factory is registered that doesn't have them, it'd use SessionCSRF instead.
It uses a super hacky dynamic subclass of whatever default view mapper is configured in order to implement "on by default" behavior.

pyramid_csrf also contains some improvements to the CSRF system that are unrelated to the mechanism itself but would be good to get into Pyramid either way:

It introduces the concept of "scoped" CSRF tokens. These are tokens which are valid only for a particular "scope", and you can tell pyramid_csrf that a particular view is using a particular scope (a scope is just a string). If a view is scoped, then only a CSRF token that matches that scope will be accepted. This limits the damage that a leaked CSRF token can cause, since it's only valid for that one action. You can implement this without needing any additional storage by doing HMAC_sha512(unscoped_token, scope), and even the unscoped token could be implemented that way, just with a scope of "".
It adds origin verification when running under HTTPS. This will attempt to look at the Origin header and, failing that, will fall back to the Referer header. This check ensures that when your CSRF tokens depend on the value of some cookie (whether a session cookie or just a CSRF cookie), a subdomain or a root domain cannot submit a CSRF request, since their origin wouldn't match. This particular check has been in Django for some time, and there is a research paper showing that on same-domain HTTPS sites the Referer header is only missing in 0.2% of cases (see https://github.com/django/django/blob/a3473454ada67c0a16efeabcb78950641d4ac93c/django/middleware/csrf.py#L134-L162 and https://sparrow.ece.cmu.edu/group/731-s11/readings/csrf.pdf).
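
Both ideas can be sketched with the standard library alone. This is a minimal illustration, not the pyramid_csrf code: the function names are hypothetical, the sketch assumes HMAC_sha512(unscoped_token, scope) means the unscoped token is the HMAC key and the scope is the message, and the origin check compares scheme, host, and port, rejecting the request when neither header is present. A caller would apply the origin check only to HTTPS requests.

```python
import hashlib
import hmac
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def scoped_token(unscoped_token, scope=""):
    """Derive a scoped token as HMAC-SHA512 keyed by the unscoped token."""
    return hmac.new(
        unscoped_token.encode("utf-8"), scope.encode("utf-8"), hashlib.sha512
    ).hexdigest()


def check_scoped_token(supplied, unscoped_token, scope=""):
    """Constant-time comparison against the token expected for this scope."""
    expected = scoped_token(unscoped_token, scope)
    return hmac.compare_digest(supplied.encode("utf-8"), expected.encode("utf-8"))


def _origin(url):
    """Reduce a URL to (scheme, host, port), filling in the default port."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    return parts.scheme, parts.hostname, port


def same_origin(request_url, origin=None, referer=None):
    """Prefer the Origin header, fall back to Referer, fail if both are missing."""
    source = origin if origin is not None else referer
    if source is None:
        return False
    return _origin(source) == _origin(request_url)
```

Because the scoped token is derived rather than stored, a view scoped to "accounts.login" rejects a token that was issued for any other scope, while the backend still only has to remember the single unscoped token.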


dstufft · 2015-02-10T18:26:54Z

Oh, and I forgot to mention: I don't expect the code in pyramid_csrf to be taken wholesale. It's just a small proof of concept, and with this issue I hope to get some feedback on the high-level idea and whether it's something Pyramid would want to do, before I dig into the best way to integrate it into Pyramid itself and make a PR. If people think that the high-level idea has merit, then I'll see about making a PR that implements it for real inside of Pyramid.

dstufft · 2015-02-10T18:34:14Z

Oh, and using this inside of a template is similar to how you use things today. An example in Jinja2:

<input name="csrf_token" type="hidden" value="{{ request.csrf.get_scoped_token('accounts.login') }}">

Other enhancements that could make that better:

Since the view would be configured with the POST variable that the CSRF system expects to see, that could be added to the request, and request.csrf could add a method to retrieve the expected form field name.
Since the view would be configured with the scope that the CSRF system expects to see, that could also be added to the request, and request.csrf.get_token() could simply always return a scoped token, either scoped to "" if the view was not configured with a particular scope, or scoped to whatever the view was configured with.

mmerickel · 2016-07-17T06:52:28Z

I think every point here was addressed by the CSRF improvements in 1.7 except for 2. We don't allow for scoped tokens and we are still coupled to the session.

I see no point in scoped tokens; however, decoupling the CSRF token from the session is something I'm happy to consider if we can do it in a backward-compatible way (for example, request.get_csrf_token() would need to fall back to request.session.get_csrf_token() in the case that an ICSRFTokenPolicy is not defined).

Anyway, just some thoughts in case someone wants to work on this.
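
The fallback described above could look roughly like the sketch below. It is an illustration only: the `policy` argument is a hypothetical stand-in for looking up an ICSRFTokenPolicy in the registry, and the stub classes exist just to make the example runnable outside Pyramid.

```python
from types import SimpleNamespace


def get_csrf_token(request, policy=None):
    """Use the configured policy when there is one, else the session."""
    if policy is not None:
        return policy.get_csrf_token(request)
    # Backward-compatible path: defer to the legacy session method.
    return request.session.get_csrf_token()


class StubSession:
    """Stand-in for a session that implements the legacy CSRF method."""

    def get_csrf_token(self):
        return "token-from-session"


class StubPolicy:
    """Stand-in for a registered, session-independent token policy."""

    def get_csrf_token(self, request):
        return "token-from-policy"


request = SimpleNamespace(session=StubSession())
legacy_token = get_csrf_token(request)                # no policy: session is used
policy_token = get_csrf_token(request, StubPolicy())  # policy takes precedence
```

Existing applications that never register a policy would keep getting their token from the session, so their behavior would not change.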

mmerickel · 2017-04-30T23:40:51Z

Issues here should be fixed via #2854.

rmoorman · 2017-05-11T13:27:35Z

@mmerickel Are the Vary headers set correctly too when CSRF protection is enabled or is this something that can/ought to be done manually?

mmerickel · 2017-05-11T14:25:45Z

@rmoorman Good catch, the vary headers are not set automatically by the default policies. I would be happy to evaluate solutions to that problem but they would vary (pun intended) between the different storage mechanisms. For example for the session-based policies it is up to the session to do the right thing whereas the cookie-based policy may want to do its own thing if it makes sense.

In general you should expect to do it yourself right now.

rmoorman · 2017-05-11T14:54:46Z

@mmerickel I wouldn't have a good idea where to start in order to let the CSRF policies apply the correct Vary headers to the requests. I presume the implementation would be somewhere colocated with ICSRFStoragePolicy and the implementing policies, but adding to the policy interface just like that wouldn't be appropriate as the term "storage" wouldn't fit that well anymore i.m.o.

mmerickel · 2017-05-11T15:45:01Z

Well the "storage" policy is dictating how the CSRF token is tracked between requests as well. Usually via some form of cookie. If you look at the CookieCSRFStoragePolicy it should be clear how you might add a Vary header to the response but it would need to be done every time get_csrf_token is called as well as the other methods in order to properly annotate responses that may have been affected by the token cookie.
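
A minimal sketch of that idea follows, written against stand-in objects rather than Pyramid itself. `FakeRequest`, `VaryingCookiePolicy`, and `merge_vary` are hypothetical names, the cookie name is assumed, and a plain dict stands in for the response headers. For brevity the token is written straight into the request's cookie mapping, whereas a real policy would set the cookie on the response.

```python
import secrets


def merge_vary(existing, name):
    """Return a Vary header value that includes `name` exactly once."""
    names = [part.strip() for part in (existing or "").split(",") if part.strip()]
    if "*" in names:
        return "*"  # "Vary: *" already covers every header
    if name.lower() not in (n.lower() for n in names):
        names.append(name)
    return ", ".join(names)


class FakeRequest:
    """Stand-in request: a cookie mapping plus deferred response callbacks."""

    def __init__(self, cookies=None):
        self.cookies = dict(cookies or {})
        self.callbacks = []

    def add_response_callback(self, callback):
        self.callbacks.append(callback)


class VaryingCookiePolicy:
    """Cookie-backed token storage that annotates every affected response."""

    cookie_name = "csrf_token"  # assumed name

    def _mark_vary(self, request):
        def callback(request, response_headers):
            response_headers["Vary"] = merge_vary(
                response_headers.get("Vary"), "Cookie"
            )

        request.add_response_callback(callback)

    def new_csrf_token(self, request):
        self._mark_vary(request)
        token = secrets.token_hex(32)
        request.cookies[self.cookie_name] = token
        return token

    def get_csrf_token(self, request):
        self._mark_vary(request)
        token = request.cookies.get(self.cookie_name)
        return token if token is not None else self.new_csrf_token(request)
```

Registering the callback from every method that reads or writes the token means a response is marked even when the view only fetched the token for a template, and `merge_vary` keeps repeated callbacks from duplicating the header value.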

rmoorman · 2017-05-13T21:51:11Z

@mmerickel I see. Django, for example, does roughly the same (annotating the request object when the token is retrieved through its csrf.middleware.get_token function). Which other methods do you think are necessary? Isn't the get_csrf_token method the only one involved while fetching the token to use inside the templates?

domenkozar added feature-request security labels Apr 13, 2015

awwright mentioned this issue Aug 11, 2015

Generalize route matching for status codes #1873

Closed

mmerickel mentioned this issue Nov 19, 2016

Pyramid 2.0 possible feature list #2362

Closed

MatthewWilkes mentioned this issue Dec 7, 2016

Decouple CSRF protection from the session machinery #2854

Merged

mmerickel added this to the 1.9 milestone Apr 29, 2017

mmerickel closed this as completed Apr 30, 2017
