You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Short summary: Conditional requests allow browsers to efficiently refresh their locally cached versions of files by making a request like: curl -H "If-Modified-Since: <date>" https://some-site.pgs.sh/file.html. If pgs supported conditional requests, it would return a 304 Not Modified response with an empty body, saving everyone a lot of bandwidth and time.
Here's the current situation: Souin supports conditional requests if the file requested is cached and the TTL hasn't expired. But browsers will basically never benefit from this, because this is what happens:
Browser sends an unconditional GET file.html request
pgs serves the file with Etag and Last-Modified headers (from minio), and Cache-Control: max-age=600. Souin saves the file with a 600 second ttl.
The browser saves file.html in a local cache with a 600 second ttl.
601 seconds later, the user requests file.html again.
The browser's cache has expired, but because the cached version had Etag and Last-Modified headers, it sends a conditional GET file.html request with If-Modified-Since and If-None-Match headers.
(Any image files served by imgproxy currently do not include Etag or Last-Modified headers, so browsers don't even attempt conditional requests for images).
Souin's cached version has expired, so it forwards the conditional request to pgs.
(The redis storage provider has more advanced management of "stale" cache entries via max-stale, but Otter doesn't AFAIK)
pgs doesn't support conditional requests, so it returns 200 OK with the entire contents of file.html even though it hasn't changed in the last 601 seconds.
I think we can fix this with two small-ish changes:
1 - Use checkPreconditions from the Go standard library
EDIT: I originally planned to useServeContent but that didn't end up working because responses returned from imgproxy are not seekable. This section has been edited to describe the replacement solution I ended up using.
The Golang standard library has a utility called http.ServeContent which handles conditional requests. But we can't use that directly since it requires seekable files (files returned from pobj and imgproxy are not seekable).
Instead, we can just use the helper function checkPreconditions directly. The downside is that we still don't support Range requests, which would definitely require seekable files.
checkPreconditions supports all the RFCs so we don't have to worry about it.
PS: It's not clear in the docs, but checkPreconditions compares the etag/mod time from the If-None-Match/If-Modified-Since headers on the *Request to the Etag/Last-Modified headers in the ResponseWriter, so you have to set those headers manually before calling it.
PPS: the pobj storage interface we use could accept the Etag and Last-Modified headers as parameters so that it can instantly return after the StatObject check, avoiding the need to read the whole file. But then we'd have to move the checkPreconditions check into pobj. The extra disk read operation for re-validation is not a big deal IMO.
2 - Add etag and last-modified headers from imgproxy
Imgproxy can already provide etag and last-modified headers, we just need to enable that and forward them in the response.
If we do both of those things, then I think that's all that is required to support conditional requests for pgs. Analytics would still be counted like normal (would require visitHost to count 304s as a "visit" but actually checkPreconditions sets the status code after we log the visit as a 200 anyway, so no change needed).
Hey @neurosnap , has #178 and #179 been deployed yet? No rush, just wanted to make sure it wasn't bugged since I'm not seeing the correct etags in responses yet
It has not been deployed yet. I'm working on moving pgs out of the pico repo in the hope that it will be easier to self host and collaborate. I will update in thread when it is deployed
I did some looking into conditional requests now that we have caching in place and files pulled from minio include Etag and last-modified headers.
Short summary: Conditional requests allow browsers to efficiently refresh their locally cached versions of files by making a request like:
curl -H "If-Modified-Since: <date>" https://some-site.pgs.sh/file.html
. If pgs supported conditional requests, it would return a304 Not Modified
response with an empty body, saving everyone a lot of bandwidth and time.Here's the current situation: Souin supports conditional requests if the file requested is cached and the TTL hasn't expired. But browsers will basically never benefit from this, because this is what happens:
GET file.html
requestEtag
andLast-Modified
headers (from minio), andCache-Control: max-age=600
. Souin saves the file with a 600 second ttl.file.html
in a local cache with a 600 second ttl.file.html
again.GET file.html
request withIf-Modified-Since
andIf-None-Match
headers.max-stale
, but Otter doesn't AFAIK)200 OK
with the entire contents offile.html
even though it hasn't changed in the last 601 seconds.I think we can fix this with two small-ish changes:
1 - Use
checkPreconditions
from the Go standard libraryEDIT: I originally planned to use
ServeContent
but that didn't end up working because responses returned from imgproxy are not seekable. This section has been edited to describe the replacement solution I ended up using.The Golang standard library has a utility called
http.ServeContent
which handles conditional requests. But we can't use that directly since it requires seekable files (files returned frompobj
andimgproxy
are not seekable).Instead, we can just use the helper function
checkPreconditions
directly. The downside is that we still don't supportRange
requests, which would definitely require seekable files.checkPreconditions
supports all the RFCs so we don't have to worry about it.PS: It's not clear in the docs, but
checkPreconditions
compares the etag/mod time from theIf-None-Match/If-Modified-Since
headers on the*Request
to theEtag/Last-Modified
headers in the ResponseWriter, so you have to set those headers manually before calling it.PPS: the
pobj
storage interface we use could accept the Etag and Last-Modified headers as parameters so that it can instantly return after theStatObject
check, avoiding the need to read the whole file. But then we'd have to move thecheckPreconditions
check intopobj
. The extra disk read operation for re-validation is not a big deal IMO.Implemented in #178
2 - Add etag and last-modified headers from imgproxy
Imgproxy can already provide etag and last-modified headers, we just need to enable that and forward them in the response.
If we do both of those things, then I think that's all that is required to support conditional requests for pgs. Analytics would still be counted like normal (would require
visitHost
to count 304s as a "visit" but actuallycheckPreconditions
sets the status code after we log the visit as a 200 anyway, so no change needed).Implemented in #179.
Thoughts?
The text was updated successfully, but these errors were encountered: