Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Compression Fairing #781

Closed
wants to merge 9 commits into from
Closed

Compression Fairing #781

wants to merge 9 commits into from

Conversation

Martin1887
Copy link
Contributor

Clean Compression Fairing Pull Request with only a commit with all changes done about the compression fairing in a single commit.

Copy link
Collaborator

@jebrosen jebrosen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment about Content-Encoding headers is the most important one. Aside from minor documentation and formatting changes this looks pretty good, and I'll do tests this week to be sure. @SergioBenitez, any other major comments?

This may be worth adding to the guide, but I'm not sure what section it belongs in.

contrib/lib/src/compression.rs Outdated Show resolved Hide resolved
contrib/lib/src/compression.rs Outdated Show resolved Hide resolved
contrib/lib/src/compression.rs Outdated Show resolved Hide resolved
@jebrosen
Copy link
Collaborator

Sorry for the late response time this go around. I did make formatting, style, and documentation fixes (no significant code changes) but have not yet gotten to the "real-world" tests I planned to do. It's on my list for the weekend, though!

@Martin1887
Copy link
Contributor Author

@jebrosen, are your tests running successfully? Any comments or news about this feature?

Thanks for your work!

@jebrosen
Copy link
Collaborator

Whoops! I have been waiting for a reply to a comment I forgot to post:

Functionality wise, this is good. I have cleaned up the documentation and code style to closely match other things in contrib, and can merge whenever.

I only have one real issue, which is that "all responses are compressed except images or with the identity opt-out" compresses large responses and already-compressed data such as zip files. I would not feel bad releasing this with the disclaimer that it performs badly with those types of responses, but I would also like to add those restrictions in the future without feeling too bad about "breaking" anything. I also don't want to go to the other extreme of "compress only text/* responses."

@SergioBenitez, how would you feel about providing this functionality as-is with the possibility of restricting it or making it more configurable in the future?

@SergioBenitez
Copy link
Member

SergioBenitez commented Nov 4, 2018

This has come a long way. Nice job, @jebrosen and @Martin1887! There are quite a few style issues, but I figure you've already fixed those locally, @jebrosen.

I think this needs two things:

1) Configuration. At minimum, we should allow the user to say which media types not to compress, and we should provide a default list that works for most folks.

[global.compress]
exclude = ["application/gzip", "application/zip", "image/*"]

I think opt-out is the right approach here. The alternative is to ask the user to white-list instead, but that feels significantly more cumbersome and prone to forgetting something.

2) A Responder. We should allow users to compress individual responses without using a fairing. Presumable the responder wouldn't be beholden to the exclude list, which means you can selectively override the list as well.

This also needs to be rebased and reorganized so that it's behind a compression or compress module.

@Martin1887
Copy link
Contributor Author

I could work on these things, but firstly the last style commits should be pushed.

Not only code but also tests should be written to accomplish these tasks. What do you think @jebrosen?

@jebrosen
Copy link
Collaborator

jebrosen commented Nov 4, 2018

This is my "version" of the whole PR: https://paste.rs/BDF.diff - the bulk of the difference is renaming and reorganizing code and documentation to match the layout of other features in rocket_contrib, but notice a few other minor changes I made. It might not be fully up-to-date, but I'll do that comparison again when we're closer to merging.

I think both configuration and a Responder (as described) are a good idea. Don't hesitate to ping me here or on IRC if you have any questions.

@Martin1887
Copy link
Contributor Author

I have rebased the @SergioBenitez master branch and now everything compiles successfully and all tests pass.

@jebrosen, I saw in your diff that you had move the extern crate brotli and extern crate flate2 lines to compression.rs, but this does not compiles for me, so I have moved them again to lib.rs.

Tomorrow I will begin with the proposed changes.

Regards.

@Martin1887
Copy link
Contributor Author

I'm working on the exceptions read from the Rocket configuration and I have almost finished it.

I have added a field exceptions of type Vec<String> and a on_attach function. The problem that I have found is that the Compression struct cannot be modified inside the on_attach function because it is not mutable at this point.

I have seen that the templates fairing uses ContextManager to change the state of the Rocket instance inside the on_attach function, but this seems too cumbersome to me to only change one parameter. Is there any other alternatives to change the exceptions field of the Compression struct?

Another question that has arisen is about the responder: a GzipResponder and another BrotliResponder should be created or it is better having only one CompressionResponder? Maybe both options?

Thanks.

@SergioBenitez
Copy link
Member

I have added a field exceptions of type Vec and a on_attach function. The problem that I have found is that the Compression struct cannot be modified inside the on_attach function because it is not mutable at this point.

I have seen that the templates fairing uses ContextManager to change the state of the Rocket instance inside the on_attach function, but this seems too cumbersome to me to only change one parameter. Is there any other alternatives to change the exceptions field of the Compression struct?

Note that you shouldn't change Rocket's configuration for this. All of this should occur outside of Rocket. You can either 1) use a RwLock, or 2) put the configuration in managed state and retrieve it later.

Another question that has arisen is about the responder: a GzipResponder and another BrotliResponder should be created or it is better having only one CompressionResponder? Maybe both options?

I think we should just have a single Compressed Responder. We can add ways to configure it in the future, but for now, it suffices if it does what the fairing does minus checking exceptions.

@Martin1887
Copy link
Contributor Author

Martin1887 commented Nov 10, 2018

Regarding the ContentTypes I have noticed that ContentType::WASM should also be excluded for compression, since it is already compressed. This may apply also to ContentType::Binary. Do you think so? Any other ContentTypes to be excluded?

@Martin1887
Copy link
Contributor Author

I have implemented the compress.exclude configuration parameter in the Compression trait. The module has also been reorganized inside a folder.

The Content-Type comparison is slightly ad-hoc, I would like receiving your advice to improve it:

fn skip_encoding(
        &self,
        content_type: &Option<rocket::http::ContentType>,
        content_type_top: &Option<&UncasedStr>,
        cm: &rocket::State<ContextManager>,
    ) -> bool {
        let context = cm.context();
        let exceptions = context.exceptions();
        exceptions
            .iter()
            .filter(|c| match content_type {
                Some(ref orig_content_type) => match MediaType::parse_flexible(c) {
                    Some(exc_media_type) => {
                        if exc_media_type.sub() == "*" {
                            Some(exc_media_type.top()) == *content_type_top
                        } else {
                            exc_media_type == *orig_content_type.media_type()
                        }
                    }
                    None => {
                        if c.contains("/") {
                            let split: Vec<&str> = c.split("/").collect();

                            let exc_media_type =
                                MediaType::new(String::from(split[0]), String::from(split[1]));

                            if split[1] == "*" {
                                Some(exc_media_type.top()) == *content_type_top
                            } else {
                                exc_media_type == *orig_content_type.media_type()
                            }
                        } else {
                            false
                        }
                    }
                },
                None => false,
            })
            .count()
            > 0
    }

Meanwhile, I will begin to implement the responder.

@Martin1887
Copy link
Contributor Author

I have just implemented the Compressed Responder and its tests.

The code also has been reorganized inside the compression module to share the common functions.

Only documentation and improvements should be missing now.

I expect your comments and reviews.

Regards.

Copy link
Collaborator

@jebrosen jebrosen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome, I think I'm happy with the functionality (for real this time!)

My biggest concern now is that the fairing and the responder duplicate a lot of code. I think it's reasonable to extract out a single function, along the lines of compress_response(response: &mut Response, respect_excludes: bool).

I left some other comments below, but I will do another more careful review after the deduplication.

contrib/lib/Cargo.toml Outdated Show resolved Hide resolved
contrib/lib/src/compression/context.rs Outdated Show resolved Hide resolved
contrib/lib/src/compression/context.rs Outdated Show resolved Hide resolved
contrib/lib/src/compression/responder.rs Outdated Show resolved Hide resolved
contrib/lib/src/lib.rs Outdated Show resolved Hide resolved
@Martin1887
Copy link
Contributor Author

I have completed the suggested changes and added docstrings.

I have also unified the exclusions name instead of a mix of exclusions and exceptions.

I think that the below function and maybe documentation and style are the only things that could need improvements.

I have implemented the compress.exclude configuration parameter in the Compression trait. The module has also been reorganized inside a folder.

The Content-Type comparison is slightly ad-hoc, I would like receiving your advice to improve it:

fn skip_encoding(
        &self,
        content_type: &Option<rocket::http::ContentType>,
        content_type_top: &Option<&UncasedStr>,
        cm: &rocket::State<ContextManager>,
    ) -> bool {
        let context = cm.context();
        let exceptions = context.exceptions();
        exceptions
            .iter()
            .filter(|c| match content_type {
                Some(ref orig_content_type) => match MediaType::parse_flexible(c) {
                    Some(exc_media_type) => {
                        if exc_media_type.sub() == "*" {
                            Some(exc_media_type.top()) == *content_type_top
                        } else {
                            exc_media_type == *orig_content_type.media_type()
                        }
                    }
                    None => {
                        if c.contains("/") {
                            let split: Vec<&str> = c.split("/").collect();

                            let exc_media_type =
                                MediaType::new(String::from(split[0]), String::from(split[1]));

                            if split[1] == "*" {
                                Some(exc_media_type.top()) == *content_type_top
                            } else {
                                exc_media_type == *orig_content_type.media_type()
                            }
                        } else {
                            false
                        }
                    }
                },
                None => false,
            })
            .count()
            > 0
    }

Meanwhile, I will begin to implement the responder.

Copy link
Collaborator

@jebrosen jebrosen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright, I gave this another near-full pass. Pretty much all of these comments are style/organization changes. Excellent work!

contrib/lib/src/compression/fairing.rs Outdated Show resolved Hide resolved
contrib/lib/src/compression/fairing.rs Show resolved Hide resolved
contrib/lib/src/compression/fairing.rs Outdated Show resolved Hide resolved
contrib/lib/src/compression/fairing.rs Show resolved Hide resolved
match rocket.config().get_slice("compress.exclude") {
Ok(excls) => {
let mut error = false;
let mut exclusions_vec = Vec::with_capacity(excls.len());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can probably be written a bit more succinctly, along the lines of:

let exclusions: Result<Vec<String>, ConfigError> = excls.map(|e| match e {
    Value::String(s) => Ok(s.clone()),
    _ => Err(ConfigError::BadType(...)),
}).collect();

// Handle Ok, Err variants for exclusions...

It might be relatively easy to combine the error handling as well:

let exclusions = rocket.config().get_slice(...).and_then(|excls| ...);
let ctxt = match exclusions {
    Ok(e) => Context::with_exclusions(e),
    Err(ConfigError::Missing(_)) => { /* ignore missing */ },
    Err(e) => { e.pretty_print(); warn!(...) },
};

These should only be taken as rough ideas, I know they don't work directly as-is and it might be trickier than this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have not been able to do this, mutability error in excls.map inside the match branch.

contrib/lib/src/compression/fairing.rs Outdated Show resolved Hide resolved
contrib/lib/src/compression/mod.rs Outdated Show resolved Hide resolved
contrib/lib/src/compression/mod.rs Outdated Show resolved Hide resolved
pub mod compression;

#[cfg(feature = "brotli_compression")]
extern crate brotli;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This extern crate really should be in compression.rs for consistency with the rest of contrib. I'll try to get it to work again before this is merged.

contrib/lib/tests/compressed.rs Outdated Show resolved Hide resolved
@Martin1887
Copy link
Contributor Author

I have pushed the last changes, sorry for the delay, these have been busy weeks.

I have marked as resolved the implemented changes and I have commented still open reviews.

Copy link
Collaborator

@jebrosen jebrosen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My apologies for being so slow on these. I know the the fairing functions basically as expected in production, and the Responder and configuration should be trivial to production-test when I have a bit more time. @SergioBenitez, were you planning on doing another review of this soon?

contrib/lib/src/compression/context.rs Outdated Show resolved Hide resolved
contrib/lib/src/compression/fairing.rs Show resolved Hide resolved
contrib/lib/src/compression/mod.rs Outdated Show resolved Hide resolved
@Martin1887
Copy link
Contributor Author

@jebrosen, @SergioBenitez, are you reviewing and testing the last changes?

For any question or proposal don't hesitate on doing it.

@Martin1887
Copy link
Contributor Author

Any news in this pull request?

@jebrosen
Copy link
Collaborator

I'd been waiting for @SergioBenitez to come back so I could run a few relatively minor implementation/API changes by the both of you:

  • Renaming the Compressed Responder to Compress -- Compressed sounded wrong because the inner type is not already compressed, it's something that will be compressed at response time.
  • Invalid exclusions. Rather than use the default exclusions if anything is wrong, I've changed it to warn for each invalid MIME type but continue with whichever ones are valid.

I did reorganize some documentation to keep up with the rest of contrib and made a few other minor style changes and fixes, but the actual implementation should be otherwise unchanged.

The only thing missing is an example -- is this worthy of a full example itself or is including it in another example somewhere enough?

@richard-uk1
Copy link

It's worth noting that compression can compromise security in certain circumstances (see BREACH). In order to be vulnerable, you need all of these things to be true for a response:

  1. It is compressed
  2. The attacker can control some part of the response (it reflects user input, for instance)
  3. The response contains a secret

I'm generally weary of compressing API data, and only compress static assets. Not sure how this applies here - maybe just a note in the docs, or maybe things like application/json should not be compressed by default. Up to you.


// Compression is done when the request accepts brotli or gzip encoding
// and the corresponding feature is enabled
if cfg!(feature = "brotli_compression") && CompressionUtils::accepts_encoding(request, "br")
Copy link

@meven meven Apr 26, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could be nice to check gzip_compression and brotli_compression feature upon the fairing's on_attach and save in a variable which compression to use.
A user that did not read the doc might not see what he is missing, or think his responses are compressed but are not.
Plus it would save some instructions on every single requests.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only checks are 1) "is brotli enabled" and 2) "did this request say it supports brotli". 1 is done at compile time with the cfg!, and 2 must be checked again for every request.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I forgot cfg! was compile time only,

Great Pr btw

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still half my point was that the cfg! check should also be done in on_attach so that if none of the required features are set the user can get an error instead of discovering at runtime that something is amiss, given the user has not read properly the documentation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In lib.rs there is these lines, so if none of compression features are enabled a compile error is raised:

#[cfg(any(feature = "brotli_compression", feature = "gzip_compression"))]
pub mod compression;

Regards.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok didn't know about this.

@Martin1887
Copy link
Contributor Author

This pull request have been opened too much time. Any news @jebrosen?

It seems that the project is a bit paralyzed, it's a pity because it is a great project.

@jebrosen
Copy link
Collaborator

This pull request have been opened too much time. Any news jebrosen?

You're right, and this fell off my todo list and got buried a while back. I'm working on merging it right now.

@jebrosen
Copy link
Collaborator

This is at the point (and has been for a while, sorry!) where going back and forth with GitHub reviews isn't so productive. So I've merged this in 6a55aa7, with my refactoring in 0a3960b to keep up with some changes made to the way contrib was organized and to rename the Responder.

There are a few things that are not done, but should be easy to modify or add:

  • (docs) Guide-level documentation
  • (docs) Mention of the security implications raised by @derekdreery (Compression Fairing #781 (comment))
  • (docs) Example configuration of exclusions
  • (tests) Tests that are run when only one of brotli or gzip is enabled - currently, the tests are only run with both enabled.
  • "Real life" stress testing

I will put these on my personal todo list (or open an issue if I won't get to some of them), but anyone interested is welcome to work on them as well.

@jebrosen jebrosen closed this Apr 27, 2019
@jebrosen jebrosen added the pr: merged This pull request was merged manually. label Apr 27, 2019
@jebrosen
Copy link
Collaborator

Thanks @Martin1887 for persisting, and sorry this has taken so long!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
pr: merged This pull request was merged manually.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants