Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support ETag generation on ResourceHttpRequestHandler #29031

Closed
skjelmo opened this issue Aug 26, 2022 · 7 comments
Closed

Support ETag generation on ResourceHttpRequestHandler #29031

skjelmo opened this issue Aug 26, 2022 · 7 comments
Assignees
Labels
in: web Issues in web modules (web, webmvc, webflux, websocket) type: enhancement A general enhancement
Milestone

Comments

@skjelmo
Copy link

skjelmo commented Aug 26, 2022

Affects: 5.3.21
Stack overflow question

Expected behaviour:

If-None-Match has precedence when If-None-Match is used in combination with If-Modified-Since.

The function checkNotModified in org.springframework.web.context.request.ServletWebRequest references the expected order of precedence with the following comment:

// Evaluate conditions in order of precedence.
// See https://tools.ietf.org/html/rfc7232#section-6
checkNotModified methods
	@Override
	public boolean checkNotModified(long lastModifiedTimestamp) {
		return checkNotModified(null, lastModifiedTimestamp);
	}

	@Override
	public boolean checkNotModified(String etag) {
		return checkNotModified(etag, -1);
	}

	@Override
	public boolean checkNotModified(@Nullable String etag, long lastModifiedTimestamp) {
		HttpServletResponse response = getResponse();
		if (this.notModified || (response != null && HttpStatus.OK.value() != response.getStatus())) {
			return this.notModified;
		}

		// Evaluate conditions in order of precedence.
		// See https://tools.ietf.org/html/rfc7232#section-6

		if (validateIfUnmodifiedSince(lastModifiedTimestamp)) {
			if (this.notModified && response != null) {
				response.setStatus(HttpStatus.PRECONDITION_FAILED.value());
			}
			return this.notModified;
		}

		boolean validated = validateIfNoneMatch(etag);
		if (!validated) {
			validateIfModifiedSince(lastModifiedTimestamp);
		}

		// Update response
		if (response != null) {
			boolean isHttpGetOrHead = SAFE_METHODS.contains(getRequest().getMethod());
			if (this.notModified) {
				response.setStatus(isHttpGetOrHead ?
						HttpStatus.NOT_MODIFIED.value() : HttpStatus.PRECONDITION_FAILED.value());
			}
			if (isHttpGetOrHead) {
				if (lastModifiedTimestamp > 0 && parseDateValue(response.getHeader(HttpHeaders.LAST_MODIFIED)) == -1) {
					response.setDateHeader(HttpHeaders.LAST_MODIFIED, lastModifiedTimestamp);
				}
				if (StringUtils.hasLength(etag) && response.getHeader(HttpHeaders.ETAG) == null) {
					response.setHeader(HttpHeaders.ETAG, padEtagIfNecessary(etag));
				}
			}
		}

		return this.notModified;
	}

Observed behaviour

The server returns 304 Not Modified for an invalid etag when both the If-None-Match and If-Modified-Since-header is set.

The response is updated with status NOT MODIFIED when checkNotModified is called from handleRequest in org.springframework.web.servlet.resource.ResourceHttpRequestHandler and lastModifiedTimestamp indicates that the resources is not modified.

The response is not updated again when checkNotModified is called from updateResponse in org.springframework.web.filter.ShallowEtagHeaderFilter with an etag value indicating that the resource is modified.

Edit: I'm honestly quite confused. I tried to write a test to see if the response code is updated correctly in ShallowEtagHeaderFilter. Looking on the test, it seems that it handles the situation correctly. When debugging locally, it looks like the response status code is still 304 Not Modified. Furthermore, it should not matter if it handles the status code correctly when ShallowEtagHeaderFilter is executed before ResourceHttpRequestHandler which updates the response status code to 304 Not Modified anyway.

Test coverage

checkNotModified is covered by the test IfNoneMatchAndIfNotModifiedSinceShouldNotMatchWhenDifferentETag. However, this test covers the case where checkNotModified is called with both eTag and lastModifiedTimestamp. In the case described, checkNotModified is called sequentially by overloaded methods. The overloaded method call with lastModifiedTimestamp alters the response status to 304 Not Modified, and it seems that it makes isEligibleForEtag in ShallowEtagHeaderFilter return false if the status code is 304, and the call to checkNotModified from updateResponse in shallowEtagHeaderFilter does not update the response from 304 to 200 OK if the etag has changed.

I currently see no test in ShallowEtagHeaderFilterTests asserting that the response code is 200 OK if a response has an invalid etag and response code 304 Not Modified.

Problem

After rollback to an earlier deployment, clients will send a lastModifiedTimestamp newer than the content on the server and an invalid etag. Since If-None-Match doesn't have precedence as expected, the client recieves a 304 NOT MODIFIED when a 200 OK was expected.

Work-around

Problem can be mitigated by not using If-Modified-Since. E.g. utilizing setUseLastModified(false) on a resource handler.

Reproduction

Send a GET-request with an invalid etag and lastModifiedTimestamp in the future.

@spring-projects-issues spring-projects-issues added the status: waiting-for-triage An issue we've not yet triaged or decided on label Aug 26, 2022
@bclozel bclozel self-assigned this Aug 26, 2022
@bclozel bclozel added the in: web Issues in web modules (web, webmvc, webflux, websocket) label Aug 26, 2022
@skjelmo
Copy link
Author

skjelmo commented Aug 26, 2022

When debugging locally, it seems that ResourceHttpRequestHandler which updates the response status to 304 is called after ShallowEtagHeaderFilter.

If this is the problem, could this simply be the result of a bad configuration where the ResourceHttpRequestHandler is called after the ShallowEtagHeaderFilter? I'm unsure how we have managed to get the ResourceHttpRequestHandler to execute after the ShallowEtagHeaderFilter if that is not the default behaviour.

@bclozel
Copy link
Member

bclozel commented Aug 27, 2022

You've obviously spent a lot of time analyzing this issue, thanks for that.

Unfortunately it's still hard to understand the core problem.
Can you provide a simple application created with https://start.spring.io showing the problem and share it on GitHub? From what I understand this would involve a Spring MVC app, a static resource and the shallow ETag filter contributed as a bean. With that, the only piece missing would be a curl command, the expected HTTP response and what you're getting instead.

The problem might be in multiple places, a documentation issue or both. Often, a simple reproducer is unbeatable!

Thanks!

@bclozel bclozel added the status: waiting-for-feedback We need additional information before we can continue label Aug 27, 2022
@skjelmo
Copy link
Author

skjelmo commented Aug 27, 2022

Thanks for the feedback @bclozel.

I've added a simple reproducer here: https://github.com/skjelmo/spring-unexpected-304
It's a clean Spring starter with Spring Web, a HTML file and the shallow etag filter enabled.

Opening http://localhost:8080/index.html should provide necessary values from ETag and Last-Modified in the response headers.

Sending a curl request with the ETag value set as If-None-Match, and Last-Modified set as If-Modified-Since returns 304 Not Modified as expected.

If you modify If-Modified-Since to be back in time, you receive a 200 OK as expected.

However, if you only modify the ETag value, meaning that the ETag of the cached content differs from the ETag calculated on the server, you expect to get 200 OK. But, you get a 304 Not Modified which is unexpected since If-None-Match has precedence when If-None-Match is used in combination with If-Modified-Since. The precedence is documented in https://tools.ietf.org/html/rfc7232#section-6 referenced in checkNotModified and seemingly obsoleted by https://www.rfc-editor.org/rfc/rfc9110#section-13.2.2.

(Remember to replace header values)

curl --location --request GET 'http://localhost:8080/index.html' \
--header 'If-None-Match: "0c01d44fcb5bdfc18313ea8cb309217af"' \
--header 'If-Modified-Since: Sat, 27 Aug 2022 23:26:55 GMT'

The core problem seems to be that checkNotModified in org.springframework.web.filter.ShallowEtagHeaderFilter is called before the response is updated with status NOT MODIFIED when checkNotModified is called from handleRequest in org.springframework.web.servlet.resource.ResourceHttpRequestHandler.

The existing test IfNoneMatchAndIfNotModifiedSinceShouldNotMatchWhenDifferentETag only handles the case when checkNotModified is called with both ETag and lastModifiedTimestamp, and not when checkNotModified is called sequentially by overloaded methods.

@spring-projects-issues spring-projects-issues added status: feedback-provided Feedback has been provided and removed status: waiting-for-feedback We need additional information before we can continue labels Aug 27, 2022
@skjelmo skjelmo changed the title Unexpected 304 Not Modifiedwhen combining If-Modified-Since and If-None-Match Unexpected 304 Not Modified when combining If-Modified-Since and If-None-Match Aug 27, 2022
@bclozel
Copy link
Member

bclozel commented Aug 29, 2022

Thanks for the analysis and the sample application, I think I understand now what's going on. The core problem here is that you expect the resource handling (ResourceHttpRequestHandler) and ShallowEtagHeaderFilter to operate as a single entity, in one pass. This is not the case.

ShallowEtagHeaderFilter, as described in its javadoc and reference doc, is a Servlet filter that is called once the entire HTTP response has been written to the response wrapper. This is an easy strategy for generating ETag headers for any type of response sent by the application. This does not save any CPU or server side resource, as the entire response needs to be generated. This only saves network bandwidth. This filter only considers ETags and operates completely independently of Spring MVC - this Servlet filter can be applied on any Servlet-based application.

Resource handling is a separate feature provided by Spring MVC. This is a complete solution for serving static resources from a Spring MVC application. This includes managing conditional and caching headers.

What happens here is that the resource handling only considers the caching aspects using the information it's got: the last modified resource metadata and the request headers. The handler is called first, as the filter at this point only wrapped the response. The handler here is in a position to skip entirely the writing of the resource to the response, saving CPU and memory. At this point, there is no ETag information available.

Once that's done, the ShallowEtagHeaderFilter filter sees an outgoing response marked as 304 Not Modified and can only assume that something decided that the response should not be produced. There is no response body to operate on, so the filter cannot calculate a MD5 hash of the content and compare it with the incoming ETag.

In order to get the expected behavior here, we need to be in a position to have both the last modified information and a calculated ETag for the resource. Of course, calculating that hash is an opinion on its own. In a Spring MVC application, this can be provided at the controller level (calling checkNotModifed from a controller method directly using business domain information), a fixed version (the application version) or the content of the file as a hash. The last two alternatives are available through the resource chain documented here and here.

In your case, this index.html file be generated by a JavaScript framework or could be versioned with the app itself. In this case, a "fixed version" strategy would work. If you provide more details about your static resources setup - how they're generated, the caching strategy, how resources are linked to - we could find a concrete solution or improve the current infrastructure in Spring.

In general, I think that the ShallowEtagHeaderFilter is a low-level, crude solution. If you'd like a more evolved caching approach, I'd suggest excluding it and looking into the asset pipeline infrastructure.

@bclozel bclozel added status: waiting-for-feedback We need additional information before we can continue and removed status: feedback-provided Feedback has been provided labels Aug 29, 2022
@skjelmo
Copy link
Author

skjelmo commented Sep 1, 2022

Thank you for the thorough response @bclozel!

You are correct in that we expected the resource handling and ShallowEtagHeaderFilter to operate as a single entity.

We have not yet been able to handle both If-Modified-Since and If-None-Match with the expected behaviour.

  • Our attempt of checking the etag on the controller level seems to move the problem, since the controller is called before ResourceHttpRequestHandler which still rewrites the respons based solely on the lastModifiedTimestamp.
  • Using VersionResourceResolver seems to be dependant on the URL, so that we cannot enable it for the root URL?
  • Disabling If-Modified-since solves the problem with rollback for us.
      registry.addResourceHandler("/path_to/index.html")
            .setUseLastModified(false)
            .addResourceLocations("/path_to/index.html")
    Combining setUseLastModified(false) with a controller checking the etag for the html-file in question seems superflous since ShallowEtagHeaderFilter works with setUseLastModified(false).

However, we are curious how a FixedVersionStrategy can be implemented?

If you provide more details about your static resources setup - how they're generated, the caching strategy, how resources are linked to - we could find a concrete solution or improve the current infrastructure in Spring.

CSS- and JavaScript-files are generated by webpack with a content-hash in their filename. They are included by Webpack in an index.html-file which is generated without a content hash. The index.html references cache-busting filenames, but the file itself depends on the server handling both If-None-Match and If-Modified-Since correctly. We are running Spring only, not Spring boot. All the generated files are placed in a resources-folder, and we have set up a ResourceHandler and a ViewController with Java-config as follows:

@Configuration
public class WebConfig extends WebMvcConfigurationSupport {

    @Override
    public void addResourceHandlers(ResourceHandlerRegistry registry) {
        registry.addResourceHandler("/path_to_resources/**")
            .addResourceLocations("/path_to_resources/");
    }

    @Override
    protected void addViewControllers(ViewControllerRegistry registry) {
        registry.addViewController("/").setViewName("forward:/path_to_resources/html/index.html");
    }
}

Is it perhaps possible to configure a handler that handles both If-Modified-Since and If-None-Match? Seems that we need to find a way to prevent ResourceHttpRequestHandler.handleRequest() from being called, since it updates the response based on If-Modified-Since only.

@spring-projects-issues spring-projects-issues added status: feedback-provided Feedback has been provided and removed status: waiting-for-feedback We need additional information before we can continue labels Sep 1, 2022
@bclozel
Copy link
Member

bclozel commented Jan 30, 2023

Sorry for the late reply.

I think we could consider an enhancement where we could configure on the ResourceHttpRequestHandler a way to optionally generate an ETag value based on a custom strategy and/or a default MD5 hash. The ResourceHttpRequestHandler already supports the last modified header through the Resource last modification date, we could extend this with a Function<Resource, String> strategy that would allow us to use ServletWebRequest#checkNotModified within its implementation.

In the meantime, I think the easiest way to handle the index page is a code snippet similar to this:

@GetMapping("/")
@ResponseBody
public Resource index(WebRequest request) throws IOException {
	ClassPathResource indexPage = new ClassPathResource("static/index.html");
	String eTag = DigestUtils.md5DigestAsHex(indexPage.getInputStream());
	long lastModified = indexPage.lastModified();
	if (request.checkNotModified(eTag, lastModified)) {
		return null;
	}
	return indexPage;
}

I'm repurposing this issue as an enhancement.

@bclozel bclozel added type: enhancement A general enhancement and removed status: waiting-for-triage An issue we've not yet triaged or decided on status: feedback-provided Feedback has been provided labels Jan 30, 2023
@bclozel bclozel added this to the 6.1.x milestone Jan 30, 2023
@bclozel bclozel changed the title Unexpected 304 Not Modified when combining If-Modified-Since and If-None-Match Support ETag generation on ResourceHttpRequestHandler Jan 30, 2023
@bclozel bclozel removed their assignment Jan 30, 2023
@bclozel bclozel self-assigned this Oct 24, 2023
@bclozel bclozel modified the milestones: 6.1.x, 6.1.0-RC2 Oct 25, 2023
@skjelmo
Copy link
Author

skjelmo commented Oct 25, 2023

Thanks for implementing the enhancement @bclozel 🙌 🥳!

bclozel added a commit that referenced this issue Oct 25, 2023
This commit replicates the ETag generation option now available on
`ResourceHttpRequestHandler` but for its WebFlux counterpart.

See gh-29031
bclozel added a commit that referenced this issue Oct 25, 2023
This commit surfaces the ETag generation feature for both
`ResourceHttpRequestHandler` and `ResourceWebHandler` on their
respective `ResourceHandlerRegistration` for easier configuration.

See gh-29031
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
in: web Issues in web modules (web, webmvc, webflux, websocket) type: enhancement A general enhancement
Projects
None yet
Development

No branches or pull requests

3 participants