-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
installer/pkg/config/libvirt: Caching for libvirt pulls #280
installer/pkg/config/libvirt: Caching for libvirt pulls #280
Conversation
The errors are because we aren't using Go 1.11 yet, so we get complaints about |
I also haven't bothered vendoring httpcache until folks have time to look this over and review the general direction. I don't think our CI will green out without the vendor bump, but just in case, I'll drop a... /hold |
9f90bb1
to
04d2575
Compare
installer/pkg/config/parser.go
Outdated
@@ -29,6 +29,7 @@ func ParseConfig(data []byte) (*Cluster, error) { | |||
return nil, err | |||
} | |||
cluster.PullSecret = string(data) | |||
cluster.PullSecretPath = "" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this supposed to be in here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this supposed to be in here?
Yeah. We used to just copy the input config over. Now that I'm saving the updates config (because it might contain a new libvirt image URI), I need to clear PullSecretPath
(after reading the content into PullSecret
) to avoid hitting this when we reload the config for the install
step.
04d2575
to
5f705cf
Compare
Rebased around #273 with 04d2575 -> 5f705cf. |
Checking our RHCOS source against ETag [1] / If-None-Match [2] and Last-Modified [3] / If-Modified-Since [4], it seems to support In-None-Match well, but only supports If-Modified-Since for exact matches: $ URL=http://aos-ostree.rhev-ci-vms.eng.rdu2.redhat.com/rhcos/images/cloud/latest/rhcos-qemu.qcow2.gz $ curl -I "${URL}" HTTP/1.1 200 OK Server: nginx/1.8.0 Date: Wed, 19 Sep 2018 04:32:19 GMT Content-Type: application/octet-stream Content-Length: 684934062 Last-Modified: Tue, 18 Sep 2018 20:05:24 GMT Connection: keep-alive ETag: "5ba15a84-28d343ae" Accept-Ranges: bytes $ curl -sIH 'If-None-Match: "5ba15a84-28d343ae"' "${URL}" | head -n1 HTTP/1.1 304 Not Modified $ curl -sIH 'If-Modified-Since: Tue, 18 Sep 2018 20:05:24 GMT' "${URL}" | head -n1 HTTP/1.1 304 Not Modified $ curl -sIH 'If-Modified-Since: Tue, 18 Sep 2015 20:05:24 GMT' "${URL}" | head -n1 HTTP/1.1 200 OK $ curl -sIH 'If-Modified-Since: Tue, 18 Sep 2018 20:05:25 GMT' "${URL}" | grep 'HTTP\|Last-Modified' HTTP/1.1 200 OK Last-Modified: Tue, 18 Sep 2018 20:05:24 GMT That last entry should have 304ed, although the spec has [4]: When used for cache updates, a cache will typically use the value of the cached message's Last-Modified field to generate the field value of If-Modified-Since. This behavior is most interoperable for cases where clocks are poorly synchronized or when the server has chosen to only honor exact timestamp matches (due to a problem with Last-Modified dates that appear to go "back in time" when the origin server's clock is corrected or a representation is restored from an archived backup). However, caches occasionally generate the field value based on other data, such as the Date header field of the cached message or the local clock time that the message was received, particularly when the cached message does not contain a Last-Modified field. ... When used for cache updates, a cache will typically use the value of the cached message's Last-Modified field to generate the field value of If-Modified-Since. This behavior is most interoperable for cases where clocks are poorly synchronized or when the server has chosen to only honor exact timestamp matches (due to a problem with Last-Modified dates that appear to go "back in time" when the origin server's clock is corrected or a representation is restored from an archived backup). However, caches occasionally generate the field value based on other data, such as the Date header field of the cached message or the local clock time that the message was received, particularly when the cached message does not contain a Last-Modified field. So the server is violating the SHOULD by not 304ing dates greater than Last-Modified, but it's not violating a MUST-level requirement. The server requirements around If-None-Match are MUST-level [2], so using it should be more portable. The RFC also seems to prefer clients use If(-None)-Match [4,5]. I'm using gregjones/httpcache for the caching, since that implemenation seems reasonably popular and the repo's been around for a few years. That library uses the belt-and-suspenders approach of setting both If-None-Match (to the cached ETag) and If-Modified-Since (to the cached Last-Modified) [6], so we should be fine. UserCacheDir requires Go 1.11 [7,8,9]. [1]: https://tools.ietf.org/html/rfc7232#section-2.3 [2]: https://tools.ietf.org/html/rfc7232#section-3.2 [3]: https://tools.ietf.org/html/rfc7232#section-2.2 [4]: https://tools.ietf.org/html/rfc7232#section-3.3 [5]: https://tools.ietf.org/html/rfc7232#section-2.4 [6]: https://github.com/gregjones/httpcache/blob/9cad4c3443a7200dd6400aef47183728de563a38/httpcache.go#L169-L181 [7]: golang/go#22536 [8]: golang/go@816154b [9]: golang/go@50bd1c4
Assume the caller has HOME set and XDG_CACHE_HOME not set, because that's easy. The ~/.cache default is from [1]. Once we bump to Go 1.11, we can revert this commit and get the more-robust stdlib implementation, which is why I haven't bothered with a robust implementation here. [1]: https://standards.freedesktop.org/basedir-spec/basedir-spec-0.7.html
Passing the compressed images to libvirt was giving me "no bootable device" errors with my: $ rpm -q libvirt qemu-kvm-rhev libvirt-3.9.0-14.el7_5.7.x86_64 qemu-kvm-rhev-2.9.0-16.el7_4.13.x86_64 Ideally, the HTTP headers would tell us that the image was compressed (via Content-Type [1] or Content-Encoding [2]), but aos-ostree.rhev-ci-vms.eng.rdu2.redhat.com isn't configured to do that at the moment: $ curl -I http://aos-ostree.rhev-ci-vms.eng.rdu2.redhat.com/rhcos/images/cloud/latest/rhcos-qemu.qcow2.gz HTTP/1.1 200 OK Server: nginx/1.8.0 Date: Sat, 22 Sep 2018 04:59:53 GMT Content-Type: application/octet-stream Content-Length: 684751099 Last-Modified: Fri, 21 Sep 2018 13:56:07 GMT Connection: keep-alive ETag: "5ba4f877-28d078fb" Accept-Ranges: bytes $ curl -I http://aos-ostree.rhev-ci-vms.eng.rdu2.redhat.com/rhcos/images/cloud/latest/rhcos-qemu.qcow2 HTTP/1.1 404 Not Found Server: nginx/1.8.0 Date: Sat, 22 Sep 2018 04:59:56 GMT Content-Type: text/html Content-Length: 168 Connection: keep-alive I've opened [3] about getting Content-Encoding support via gzip_static [4], but in the meantime, just assume that anything with a ".gz" suffix is gzipped. Unzipping is also somewhat expensive (although not as expensive as network requests). It would be nice for callers to be able to configure whether we cache the compressed images (less disk consumption) or the uncompressed images (less CPU load when launching clusters). For now, we're just caching the network response. [1]: https://tools.ietf.org/html/rfc7231#section-3.1.1.5 [2]: https://tools.ietf.org/html/rfc7231#section-3.1.2.2 [3]: https://projects.engineering.redhat.com/browse/COREOS-593 [4]: http://nginx.org/en/docs/http/ngx_http_gzip_static_module.html
Generated with: $ bazel run //:gazelle using: $ bazel version Build label: 0.17.2- (@non-git) Build target: bazel-out/k8-opt/bin/src/main/java/com/google/devtools/build/lib/bazel/BazelServer_deploy.jar Build time: Fri Sep 21 15:04:25 2018 (1537542265) Build timestamp: 1537542265 Build timestamp as int: 1537542265
5f705cf
to
85286c5
Compare
Updated to unzip |
/lgtm |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: crawford, wking The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/hold cancel |
The new pipeline handles Content-Encoding gzip: $ curl -LI --compressed https://releases-rhcos.svc.ci.openshift.org/storage/releases/maipo/47.73/redhat-coreos-maipo-47.73-qemu.qcow2 HTTP/1.1 302 Moved Temporarily Server: nginx/1.12.1 Date: Tue, 06 Nov 2018 20:55:06 GMT Content-Type: text/html Content-Length: 161 Location: https://d26v6vn1y7q7fv.cloudfront.net/releases/maipo/47.73/redhat-coreos-maipo-47.73-qemu.qcow2 Set-Cookie: ...; path=/; HttpOnly; Secure HTTP/1.1 200 OK Content-Type: binary/octet-stream Content-Length: 726878219 Connection: keep-alive Date: Tue, 06 Nov 2018 20:23:37 GMT Last-Modified: Tue, 06 Nov 2018 17:56:36 GMT ETag: "021080ef3b515d2443be3749ebbb0b08-87" Content-Encoding: gzip Accept-Ranges: bytes Server: AmazonS3 Age: 1890 X-Cache: Hit from cloudfront Via: 1.1 d85d7507ed6501757cfe600c02a26c7d.cloudfront.net (CloudFront) X-Amz-Cf-Id: ... so we can rely on Go's default compression handling. See DisableCompression, which defaults to false, in [1]. To confirm the default handling, try to retrieve from a local server: $ export OPENSHIFT_INSTALL_LIBVIRT_IMAGE=http://localhost:8080/example.qcow $ openshift-install --dir=wking create cluster INFO Fetching OS image... FATAL Error executing openshift-install: failed to fetch Terraform Variables: failed to generate asset "Terraform Variables": failed to get Tfvars: failed to use cached libvirt image: Get http://localhost:8080/example.qcow: EOF In another terminal, I was using Ncat as a dummy server to capture request headers: $ nc -l -p 8080 </dev/null GET /example.qcow HTTP/1.1 Host: localhost:8080 User-Agent: Go-http-client/1.1 Accept-Encoding: gzip This commit drops the now-unnecessary suffix handling from 7abe94d (config/libvirt/cache: Decompress when URI has ".gz" suffix, 2018-09-21, openshift#280). [1]: https://golang.org/pkg/net/http/#Transport
Checking our RHCOS source against
ETag
/If-None-Match
andLast-Modified
/If-Modified-Since
, it seems to supportIn-None-Match
well, but only supportsIf-Modified-Since
for exact matches:That last entry should have 304ed, although the spec has:
So the server is violating the SHOULD by not 304ing dates greater than
Last-Modified
, but it's not violating a MUST-level requirement. The server requirements aroundIf-None-Match
are MUST-level, so using it should be more portable. The RFC also seems to prefer clients useIf(-None)-Match
.I'm using gregjones/httpcache for the caching, since that implemenation seems reasonably popular and the repo's been around for a few years. That library uses the belt-and-suspenders approach of setting both
If-None-Match
(to the cachedETag
) andIf-Modified-Since
(to the cachedLast-Modified
), so we should be fine.UserCacheDir
requires Go 1.11.No unit tests, but you can excercise the logic with: