
Follow redirect (HTTP 302) just once for objects/ #1541

Closed

cgwalters opened this issue Apr 17, 2018 · 65 comments

@cgwalters
Member

Fedora recently started redirecting:

# curl -L --head https://dl.fedoraproject.org/atomic/repo/objects/
HTTP/1.1 302 Found
Date: Tue, 17 Apr 2018 19:51:47 GMT
Server: Apache/2.4.6 (Red Hat Enterprise Linux) OpenSSL/1.0.2k-fips
Location: https://d2os45suu8yck8.cloudfront.net/
Content-Type: text/html; charset=iso-8859-1

HTTP/2 200 
content-type: text/html;charset=ISO-8859-1
content-length: 0
date: Mon, 16 Apr 2018 15:47:31 GMT
server: Apache/2.4.33 (Fedora)
strict-transport-security: max-age=31536000; includeSubDomains; preload
x-frame-options: SAMEORIGIN
x-xss-protection: 1; mode=block
x-content-type-options: nosniff
referrer-policy: same-origin
strict-transport-security: max-age=31536000; includeSubDomains; preload
apptime: D=193972
appserver: proxy10.phx2.fedoraproject.org
x-varnish: 85516815
via: 1.1 varnish (Varnish/5.1), 1.1 af24f02bfe857ae430e1bfd9eef550ba.cloudfront.net (CloudFront)
accept-ranges: bytes
age: 101055
x-cache: Hit from cloudfront
x-amz-cf-id: rexTQTixl0NvnQPpKUOouUvDK08uDOIHzYSnE8KNWrX-IYUi0WwnOA==

But things would be dramatically faster if, when we hit a redirect for a commit object, we transparently used that as the base URL for future requests.

(Also we should re-enable http2 in fedora)

@cgwalters
Member Author

cgwalters commented May 1, 2018

Although I believe this would just work if the content in the CDN were prefixed with objects/:

[remote "fedora-27"]
url=https://kojipkgs.fedoraproject.org/atomic/repo/
contenturl=https://d2os45suu8yck8.cloudfront.net/

Right now:

# curl -L --head https://dl.fedoraproject.org/atomic/repo/objects/fd/8b0bd029d4e2926b4f35853e1e99a1b62f1a813370f4d491ab0bde5ab526f0.filez
HTTP/1.1 302 Found
Date: Tue, 01 May 2018 14:51:21 GMT
Server: Apache/2.4.6 (Red Hat Enterprise Linux) OpenSSL/1.0.2k-fips
Location: https://d2os45suu8yck8.cloudfront.net/fd/8b0bd029d4e2926b4f35853e1e99a1b62f1a813370f4d491ab0bde5ab526f0.filez
Content-Type: text/html; charset=iso-8859-1

HTTP/2 200 
...

But libostree will generate a URL with objects/ in the target too:
> GET /objects/fd/8b0bd029d4e2926b4f35853e1e99a1b62f1a813370f4d491ab0bde5ab526f0.filez HTTP/1.1

which is a 404.
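
To see the mismatch directly, one can request the same object against the CDN with and without the objects/ prefix (using the object path from above); per the responses above, the first returns 200 and the second 404s:

$ curl -I https://d2os45suu8yck8.cloudfront.net/fd/8b0bd029d4e2926b4f35853e1e99a1b62f1a813370f4d491ab0bde5ab526f0.filez
$ curl -I https://d2os45suu8yck8.cloudfront.net/objects/fd/8b0bd029d4e2926b4f35853e1e99a1b62f1a813370f4d491ab0bde5ab526f0.filez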

@dustymabe
Contributor

I'll honestly have to ask patrick in most of these cases, but let me try to comment on a few things:

But things would be dramatically faster if when we hit a redirect for a commit object, we transparently used that as the base URL for future requests.

That's an RFE for ostree, right?

(Also we should re-enable http2 in fedora)

Fedora infra has re-enabled it, but we still have it disabled in ostree. We (ostree) are waiting for the curl bug to get fixed and propagate down to Fedora (and into ostree), I think.

contenturl=https://d2os45suu8yck8.cloudfront.net/

Are you suggesting we use contenturl in our ostree remote configs? Doesn't it work transparently already?

But libostree will generate a URL with objects/ in the target too

I see... with contenturl it expects the URL to have objects/ prepended. Rather than have everyone update their remotes, could we just implement the RFE to have ostree transparently use that as the base URL for future requests?

@cgwalters
Member Author

Are you suggesting we use contenturl in our ostree remote configs?

Maybe. If we tweaked the server side as above, then it would be easy for people to opt-in to testing things. Also this model is exactly why the contenturl bit was implemented, so I could use it as a reference example.

I filed this issue as I think it'd make sense to change libostree to handle it by default too, but if it's not too hard to change Fedora infrastructure I'd like to do that in addition.

@lucab
Member

lucab commented Jun 15, 2018

Does contenturl only affect fetching remote entries under objects/? My current understanding is that it also affects delta parts living under deltas/, but for the fedora-atomic repo those are not served by cloudfront:

$ curl -L --head https://dl.fedoraproject.org/atomic/repo/deltas/0p/xFSSJsqKUIRjYM8vcUn74VtKudHJHQ0q_xhYs6sfM-3+JOXUlewW++LWHmtJTe8rEZMB9lZbi27NVC15sC34k/0

HTTP/1.1 302 Found
[...]
Location: https://kojipkgs.fedoraproject.org/atomic/repo/deltas/0p/xFSSJsqKUIRjYM8vcUn74VtKudHJHQ0q_xhYs6sfM-3+JOXUlewW++LWHmtJTe8rEZMB9lZbi27NVC15sC34k/0

@dustymabe
Contributor

but for the fedora-atomic repo those are not served by cloudfront:

we probably should be serving those over cloudfront. I'll ask @puiterwijk if he can add that.

@nirik

nirik commented Jun 15, 2018

The deltas should be cached in cloudfront now.

Also, curl is fixed in f27+ updates for that http/2 bug and we have re-enabled http/2, so you might test that and see if it can be re-enabled in ostree.

@lucab
Member

lucab commented Jun 15, 2018

@nirik @puiterwijk Would it be possible to keep the relative hierarchy intact and mirror the whole objects/ and deltas/ paths? IMHO that would be a better mirror layout, as it reflects the repo structure and unlocks the kind of client optimization described here. @cgwalters anything else that is expected to be served from a contenturl endpoint that we may be missing?

@cgwalters
Member Author

cgwalters commented Jun 15, 2018

I was playing with the below patch locally. --set=contenturl-bareobjects=1 is a repo option to tell the fetcher to drop objects/ so it works with the existing FAH cloudfront content.

Here's a script I wrote which clones a partially-cloned repo (so we're just doing content fetches):

# cat bench.sh 
#!/bin/bash
set -xeuo pipefail
bareobjects=$1
rm repo3 -rf
cp -a repo2 repo3
ostree --repo=repo3 init --mode=archive
ostree --repo=repo3 remote delete fedora
ostree --repo=repo3 remote add fedora --set=gpg-verify=false \
       --set=contenturl=https://d2os45suu8yck8.cloudfront.net/ \
       --set=contenturl-bareobjects=${bareobjects} \
       https://kojipkgs.fedoraproject.org/atomic/repo/
time ostree --repo=repo3 pull --mirror --http-trusted fedora:fedora/28/x86_64/atomic-host

One thing that was immediately apparent is that the non-contenturl path is a lot faster (3MB/s vs 800k/s) - presumably Cloudfront is penalizing small requests? Or maybe it's allocating bandwidth in a different way.

I think in the end, probably trying to optimize the current archive path isn't too useful. Ensuring more people hit deltas is a far bigger win. And rojig gives us chunking on a higher level which is just more CDN friendly.

From 76f2aab9c71176c558c894f064b7438964b1080b Mon Sep 17 00:00:00 2001
From: Colin Walters <[email protected]>
Date: Tue, 22 May 2018 19:44:30 +0000
Subject: [PATCH] wip

---
 src/libostree/ostree-fetcher-curl.c | 28 +++++++++++++++++++++++++++-
 src/libostree/ostree-repo-pull.c    | 15 +++++++++++++++
 2 files changed, 42 insertions(+), 1 deletion(-)

diff --git a/src/libostree/ostree-fetcher-curl.c b/src/libostree/ostree-fetcher-curl.c
index 2e090cfa..942c2cb0 100644
--- a/src/libostree/ostree-fetcher-curl.c
+++ b/src/libostree/ostree-fetcher-curl.c
@@ -341,7 +341,7 @@ check_multi_info (OstreeFetcher *fetcher)
 
               if (req->idx + 1 == req->mirrorlist->len)
                 {
-                  g_autofree char *msg = g_strdup_printf ("Server returned HTTP %lu", response);
+                  g_autofree char *msg = g_strdup_printf ("Server returned HTTP %lu for %s", response, eff_url);
                   g_task_return_new_error (task, G_IO_ERROR, giocode,
                                            "%s", msg);
                   if (req->fetcher->remote_name &&
@@ -909,6 +909,32 @@ _ostree_fetcher_request_to_tmpfile_finish (OstreeFetcher *self,
   return TRUE;
 }
 
+/* Like _ostree_fetcher_request_to_tmpfile_finish(), but
+ * @out_redirection may store a new base URL if we got an HTTP 302 redirect.
+ */
+static gboolean
+_ostree_fetcher_request_to_tmpfile_ext_finish (OstreeFetcher *self,
+                                               GAsyncResult  *result,
+                                               char         **out_redirection,
+                                               GLnxTmpfile   *out_tmpf,
+                                               GError       **error)
+{
+  g_return_val_if_fail (g_task_is_valid (result, self), FALSE);
+  g_return_val_if_fail (g_async_result_is_tagged (result, _ostree_fetcher_request_async), FALSE);
+
+  GTask *task = (GTask*)result;
+  FetcherRequest *req = g_task_get_task_data (task);
+
+  if (!g_task_propagate_boolean (task, error))
+    return FALSE;
+
+  g_assert (!req->is_membuf);
+  *out_tmpf = req->tmpf;
+  req->tmpf.initialized = FALSE; /* Transfer ownership */
+
+  return TRUE;
+}
+
 void
 _ostree_fetcher_request_to_membuf (OstreeFetcher         *self,
                                    GPtrArray             *mirrorlist,
diff --git a/src/libostree/ostree-repo-pull.c b/src/libostree/ostree-repo-pull.c
index 9553272e..b1e2f34c 100644
--- a/src/libostree/ostree-repo-pull.c
+++ b/src/libostree/ostree-repo-pull.c
@@ -83,6 +83,7 @@ typedef struct {
 
   GPtrArray     *meta_mirrorlist;    /* List of base URIs for fetching metadata */
   GPtrArray     *content_mirrorlist; /* List of base URIs for fetching content */
+  gboolean       content_mirrorlist_bare_objects;
   OstreeRepo   *remote_repo_local;
   GPtrArray    *localcache_repos; /* Array<OstreeRepo> */
 
@@ -2184,6 +2185,12 @@ start_fetch (OtPullData *pull_data,
     {
       obj_subpath = _ostree_get_relative_object_path (expected_checksum, objtype, TRUE);
       mirrorlist = pull_data->content_mirrorlist;
+      if (pull_data->content_mirrorlist_bare_objects)
+        {
+          g_assert (g_str_has_prefix (obj_subpath, "objects/"));
+          g_autofree char *old_subpath = obj_subpath;
+          obj_subpath = g_strdup (obj_subpath + strlen ("objects/"));
+        }
     }
 
   /* We may have determined maximum sizes from the summary file content; if so,
@@ -3832,6 +3839,14 @@ ostree_repo_pull_with_options (OstreeRepo             *self,
 
             pull_data->content_mirrorlist =
               g_ptr_array_new_with_free_func ((GDestroyNotify) _ostree_fetcher_uri_free);
+
+            g_autofree char *contenturl_bareobjects = NULL;
+            if (!ostree_repo_get_remote_option (self, remote_name_or_baseurl,
+                                                "contenturl-bareobjects", NULL,
+                                                &contenturl_bareobjects, error))
+              goto out;
+            if (contenturl_bareobjects)
+              pull_data->content_mirrorlist_bare_objects = TRUE;
             g_ptr_array_add (pull_data->content_mirrorlist,
                              g_steal_pointer (&contenturi));
           }
-- 
2.17.1

@cgwalters
Member Author

That said, we should still tweak this, and people who don't have a good connection to the existing kojipkgs server may have a different experience than I did.

@dustymabe
Contributor

Sorry, it's not immediately obvious to me what the outcome of your statement is. I'll try to ask clarifying questions:

One thing that was immediately apparent is the non-contenturl path is a lot faster (3MB/s vs 800k/s) - presumably Cloudfront is penalizing small requests? Or is maybe allocating bandwidth in a different way.

So you're saying the non --set contenturl= path is faster? Meaning if we "store a new base URL if we got an HTTP 302 redirect", then the empirical performance seems better?

I think in the end, probably trying to optimize the current archive path isn't too useful. Ensuring more people hit deltas is a far bigger win.

Yeah, we just enabled deltas via CDN above; I thought that was already done before. Do you think the "store a new base URL if we got an HTTP 302 redirect" approach gives enough gains to submit the patch you wrote?

@cgwalters
Member Author

So you're saying the non --set contenturl= path is faster? Meaning if we "store a new base URL if we got an HTTP 302 redirect", then the empirical performance seems better?

The confusing thing here is that (AIUI) what kojipkgs today is doing is using Varnish to act as a cache itself, but redirecting to Cloudfront for objects it doesn't have.

It's apparently faster (for me) to access objects that are in kojipkgs vs Cloudfront if they're cached. Setting contenturl always hits Cloudfront.

But again, this could be different for someone with a different connection to kojipkgs.

the "store a new base URL if we got an HTTP 302 redirect" approach gives enough gains to submit the patch you wrote?

My patch is brute-forcing this; it's not handling redirects, as that'd be messier in the code.

If Fedora moved the URL layout, then it'd be a lot easier for everyone today to try setting contenturl= and see if it was faster for them.

But the Varnish caching behavior is going to make performance testing unpredictable.

(Really, in the end I think ostree should probably learn a chunked format - we could even consider pushing rojig-like semantics down into ostree itself. I suspect it'd be very useful to have some of the stuff that lives in rpm-ostree work for debs etc. as well, although the complexity of making it generic is high.)

@dustymabe
Contributor

The confusing thing here is that (AIUI) what kojipkgs today is doing is using Varnish to act as a cache itself, but redirecting to Cloudfront for objects it doesn't have.

That's not exactly my understanding. My understanding is that dl.fp.o/atomic/repo/objects/aa/bbccdd redirects to https://d2os45suu8yck8.cloudfront.net/aa/bbccdd. If the cloudfront endpoint has the object it will return it. If it doesn't, it will grab it from kojipkgs.fp.o/atomic/repo/objects/aa/bbccdd (which I guess then uses varnish?), store it, and then return it. So kojipkgs itself will never redirect to cloudfront, but dl.fp.o will.
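
That chain is easy to trace by hand with curl (aa/bbccdd is the illustrative, non-existent object path from above):

$ curl -I https://dl.fedoraproject.org/atomic/repo/objects/aa/bbccdd   # 302 -> cloudfront
$ curl -I https://d2os45suu8yck8.cloudfront.net/aa/bbccdd              # served from cache, or filled from kojipkgs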

If you run the test multiple times (so we know the objects should be in cloudfront now), do the results change?

@lucab
Member

lucab commented Jun 18, 2018

I would personally avoid introducing a new niche contenturl-bareobjects config option just for this specific usecase. Instead we could:

  • adjust the CDN layout so that it preserves objects/ and deltas/ path fragment
  • probe for redirect with a common $base for objects/ and deltas/ (see the sketch after this list)
  • auto-enable contenturl=$base if probing was successful (manual setting still takes precedence over this)
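
A minimal shell sketch of the probe step (assuming the CDN layout change from the first bullet; curl's %{redirect_url} write-out exposes the Location header):

#!/bin/bash
set -euo pipefail
base=https://dl.fedoraproject.org/atomic/repo
obj_redir=$(curl -s -o /dev/null -w '%{redirect_url}' "$base/objects/")
delta_redir=$(curl -s -o /dev/null -w '%{redirect_url}' "$base/deltas/")
# If both paths redirect under a common base, that base is a contenturl candidate.
if [ -n "$obj_redir" ] && [ "${obj_redir%objects/}" = "${delta_redir%deltas/}" ]; then
  echo "contenturl candidate: ${obj_redir%objects/}"
fi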

@puiterwijk

puiterwijk commented Jul 9, 2018

So, I'm fine with making objects/ and deltas/ work from the same cloudfront if I can make sure that refs/ won't be served from it (I think I can make that happen, but need to check).
Also, I do not think it's ideal to set contenturl= in our configurations, since the cloudfront fronting was meant as a temporary workaround to see if it helped, and considered an implementation detail.

@cgwalters
Member Author

probe for redirect with a common $base for objects/ and deltas/

Yeah, I'm fine to do that, though once we do the first part, anyone can test out performance etc. easily by setting contenturl manually.

@sinnykumari
Collaborator

For testing purposes, we set up a CloudFront CDN for the entire ostree repo (https://kojipkgs.fedoraproject.org/atomic/repo/) with the URL https://d1dgksnh07m2j5.cloudfront.net/ .

For testing, I used the above CloudFront URL as contenturl in the ostree remote config.

Result:

  • On a local VM, I see a faster upgrade compared to not using the above CloudFront URL as contenturl. Results vary depending on which CloudFront CDN location gets picked.
  • Launching an instance from the AMI (ami-01e537269ce6bd151) and upgrading works super fast on each run with contenturl in use:
    $ time sudo rpm-ostree upgrade
real    0m21.069s
user    0m0.055s
sys     0m0.021s

Finished quite fast!

We also added redirects in our Fedora-infra config with the testing URL https://dl.fedoraproject.org/cdn-testing/atomic/repo/objects/ ; using this as the url in the ostree repo config doesn't really help much. I believe the CDN works more effectively when the CloudFront CDN URL is used directly as the contenturl in the ostree repo config.

Did I miss something to try out?

@cgwalters
Member Author

Probably better to test with a direct ostree pull fedora-atomic:fedora/29/x86_64/atomic-host; otherwise one is also benchmarking e.g. I/O performance.

Also very important when discussing this is whether it's a delta pull or not.

Here's a way to easily reproduce "clean" pulls (I'm starting from Version: 29.20181210.0 (2018-12-10T00:50:21Z) which corresponds to commit 0232d115fa5f1d4609dcf2eb90de228ccb08dbad1134b30a555d92ae676bb92d):

# ostree reset fedora-atomic:fedora/29/x86_64/atomic-host 0232d115fa5f1d4609dcf2eb90de228ccb08dbad1134b30a555d92ae676bb92d && ostree admin cleanup && ostree pull fedora-atomic:fedora/29/x86_64/atomic-host

You can also add --disable-static-deltas to the pull command to test that, and also it's easy to add env OSTREE_DEBUG_HTTP=1 ostree pull ... to see what's going on.
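
Putting those together, a worst-case (no-deltas) pull with full HTTP debugging looks like:

# env OSTREE_DEBUG_HTTP=1 ostree pull --disable-static-deltas fedora-atomic:fedora/29/x86_64/atomic-host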

@cgwalters
Member Author

One other random thing I noticed while playing with this: at some point I was testing with a local build of ostree and had been noticing some hangs in the curl code while fetching. I think that boils down to not having --disable-http2 in my dev build. Not sure whether it's a libcurl bug or a libostree bug.

@cgwalters
Member Author

OK yeah, http2+cloudfront contenturl is quite a dramatic improvement over http1+constant redirects. With a primed CloudFront cache (i.e. this is my 2nd pull) and --disable-static-deltas (since that's the worst case we're trying to improve), I get:

1377 metadata, 5949 content objects fetched; 273111 KiB transferred in 47 seconds

@sinnykumari
Collaborator

OK yeah, http2+cloudfront contenturl is quite a dramatic improvement over http1+constant redirects. With a primed CloudFront cache (i.e. this is my 2nd pull) and --disable-static-deltas (since that's the worst case we're trying to improve), I get:

Indeed. http2+cloudfront contenturl works much faster.
I was even playing around with upgrading F29 Silverblue from release day to latest using contenturl (https://d1dgksnh07m2j5.cloudfront.net/) and it was comparatively fast: it finished in around 17 minutes, which usually takes much longer considering the total number of objects fetched was 17972.

1377 metadata, 5949 content objects fetched; 273111 KiB transferred in 47 seconds

@cgwalters
Member Author

So this leaves the question of how we enable this. We definitely don't want people hardcoding the CF URL.

The previous suggestion here was #1541 (comment)

I have a new idea: we could support a special x-ostree.persistredirects=1 HTTP header or so to enable persistent redirects - i.e. if we get a 302 for the .commit object, we then apply that redirect to all objects. This would maximize flexibility because it'd be under the control of the server side. I imagine it'd be pretty easy to add config to Fedora infra's Varnish to add a custom header like this? Something like this?
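
If a server opted in, a client-side check for the proposed header could be as simple as (the header name is hypothetical, from the idea above):

$ curl -sI https://dl.fedoraproject.org/atomic/repo/objects/ | grep -i 'x-ostree.persistredirects'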

@dustymabe
Contributor

@sinnykumari I'm trying to understand exactly what settings you had when you were doing tests. Was this your setup?

$ cat /etc/ostree/remotes.d/fedora.conf 
[remote "fedora"]
url=https://dl.fedoraproject.org/cdn-testing/atomic/repo/objects/
gpg-verify=true
gpgkeypath=/etc/pki/rpm-gpg/RPM-GPG-KEY-fedora-29-primary
contenturl=https://d1dgksnh07m2j5.cloudfront.net/

I think the above setup would appropriately test things because we would pay the redirect penalty (the same as our prod repo setup today) on initial contact, but would subsequently go directly to the CDN for objects (different from our prod repo setup today).

@cgwalters
I have a new idea: we could support a special x-ostree.persistredirects=1 HTTP header or so to enable persistent redirects - i.e. if we get a 302 for the .commit object, we then apply that redirect to all objects.

Yeah, I don't see that being too hard to add to the server side, but is there a great benefit to adding the special header rather than supporting the "probe+assume" method as previously suggested (#1541 (comment), #1541 (comment))?

@sinnykumari
Collaborator

@sinnykumari I'm trying to understand exactly what settings you had when you were doing tests. Was this your setup?

$ cat /etc/ostree/remotes.d/fedora.conf 
[remote "fedora"]
url=https://dl.fedoraproject.org/cdn-testing/atomic/repo/objects/
gpg-verify=true
gpgkeypath=/etc/pki/rpm-gpg/RPM-GPG-KEY-fedora-29-primary
contenturl=https://d1dgksnh07m2j5.cloudfront.net/

I think the above setup would appropriately test things because we would pay the redirect penalty (the same as our prod repo setup today) on initial contact, but would subsequently go directly to the CDN for objects (different from our prod repo setup today).

My fedora-atomic remote config is:

$ cat /etc/ostree/remotes.d/fedora-atomic.conf
[remote "fedora-atomic"]
url=https://kojipkgs.fedoraproject.org/atomic/repo/
contenturl=https://d1dgksnh07m2j5.cloudfront.net/
gpg-verify=true
gpgkeypath=/etc/pki/rpm-gpg/RPM-GPG-KEY-fedora-29-primary

I have also tried different url values: https://dl.fedoraproject.org/atomic/repo/ and https://dl.fedoraproject.org/cdn-testing/atomic/repo/ (the CDN testing URL we set up). The url value doesn't matter much because when contenturl points to a valid ostree repo, content is fetched from there. But I think using https://kojipkgs.fedoraproject.org/atomic/repo/ is preferable because that's the source of truth for the latest content, from which missing objects should be fetched when they are not yet in the CDN.

@cgwalters
Member Author

but is there a great benefit to adding the special header rather than supporting the "probe+assume" method as previously suggested (#1541 (comment), #1541 (comment))?

It's a bit about conservatism - we could theoretically be breaking someone's setup where e.g. they have the .commit objects on one server but redirect for other objects? Not that I can think of a good reason to do that...

Maybe we add an opt-out config option at least temporarily? Dunno.

@dustymabe
Contributor

but is there a great benefit to adding the special header rather than supporting the "probe+assume" method as previously suggested (#1541 (comment), #1541 (comment))?

It's a bit about conservatism - we could theoretically be breaking someone's setup where e.g. they have the .commit objects on one server but redirect for other objects? Not that I can think of a good reason to do that...

It's definitely reasonable to be conservative. For some reason I thought this was a relatively safe thing to do.

Maybe we add an opt-out config option at least temporarily? Dunno.

Yeah, that could be something to do. If you think an x-ostree.persistredirects=1 header is the safest way to go then I'm +1 for that.

@dustymabe
Contributor

dustymabe commented Jan 8, 2019

@sinnykumari I think the two setups we need to compare numbers for are:

  • The existing set up today for atomic host machines:
$ cat /etc/ostree/remotes.d/fedora.conf 
[remote "fedora"]
url=https://dl.fedoraproject.org/atomic/repo/
gpg-verify=true
gpgkeypath=/etc/pki/rpm-gpg/RPM-GPG-KEY-fedora-29-primary
  • the new proposed setup (in the future contenturl won't be needed because we will write code that makes the ostree pull smarter)
$ cat /etc/ostree/remotes.d/fedora.conf 
[remote "fedora"]
url=https://dl.fedoraproject.org/cdn-testing/atomic/repo/
gpg-verify=true
gpgkeypath=/etc/pki/rpm-gpg/RPM-GPG-KEY-fedora-29-primary
contenturl=https://d1dgksnh07m2j5.cloudfront.net/

In the first case we are paying a redirect penalty for all requests. An initial request for a file under objects/ will go client -> dl.fp.o -> cloudfront -> kojipkgs -> cloudfront -> client. For subsequent requests of objects that have already been cached in cloudfront it would be: client -> dl.fp.o -> cloudfront -> client

In the second case we are paying a redirect penalty only for requests about the status of the repo (i.e. "what commit do I update to?"), but not for objects. For all object requests, an initial request would be: client -> cloudfront -> kojipkgs -> cloudfront -> client. Requests for cached content would be: client -> cloudfront -> client.

I think these two setups are the ones that represent the smallest delta from what we currently have to where we want to go.

@dustymabe
Contributor

dustymabe commented Jan 9, 2019

@sinnykumari

  • With first case: real 13m40.663s
  • With second case: real 0m19.557s

wow that looks very promising

@ramcq
if you really feel the need to only have CDN used for deltas/ and objects/ (this is a mistake IMO)

This is mostly a requirement from our infra team. They want to make sure the source of truth always comes from Fedora itself and are being conservative here, which isn't bad. The CDN is an implementation detail that could be changed at any time.

@dustymabe
Contributor

so @cgwalters, @jlebon - I think there are three options from our discussions above:

  • probe redirect and optimize automatically by default
  • probe redirect and optimize automatically by default, allow config option to disable
  • require x-ostree.persistredirects=1 http header in order to enable optimization

any opinion?

@ramcq
Contributor

ramcq commented Jan 10, 2019

@dustymabe I liked Colin's idea of allowing a mirrorlist to be specified for the url or contenturl entry, and have that cached/reused sensibly. Then only the main server's URL is included inside the client config, but that server can send you to whatever CDN or mirror they want, either statically or with some geo-magic. (That could be with or without some way of flagging a content only server in the mirrorlist, so the whole url= is just one mirrorlist which has both types of servers in.)

@ramcq
Contributor

ramcq commented Jan 10, 2019

(I dislike any form of redirect probing. Seems deeply magic and relies on more complicated server configuration, because you can't just have your CDN proxy your origin without also having your origin, or another proxy, separate client requests from CDN requests to decide if they should be redirected.)

@dustymabe
Contributor

@ramcq so you are saying there is a 4th option:

  • probe redirect and optimize automatically by default
  • probe redirect and optimize automatically by default, allow config option to disable
  • require x-ostree.persistredirects=1 http header in order to enable optimization
  • configure a mirrorlist with a CDN entry inside of it; the mirrorlist URL is a Fedora URL, and the mirrorlist contains the CDN URL

I'd have to get input from fedora infra on that option. @puiterwijk, any thoughts?

@cgwalters
Member Author

Yes, the 4th or "mirrorlist" option is described in #1541 (comment)

@jlebon
Member

jlebon commented Jan 10, 2019

Your mention of mirrorlist reminded me that another approach (that would require changing the client-side config files, but that's something we can figure out) is:

[remote "fedora"]
url=mirrorlist=https://something.fedoraproject.org/ostree-mirrorlist.txt

You mean contenturl=mirrorlist= here right?

One hybrid method to avoid having to update people's remote configs is a x-ostree.contenturl header (which could in turn have the value mirrorlist=...). Then when fetching the summary/repo config, we can use the contenturl for the rest of the pull operation (and maybe allow hardcoded contenturl= config options to override that).

@cgwalters
Member Author

cgwalters commented Jan 10, 2019

You mean contenturl=mirrorlist= here right?

Yep thanks 👍 - I edited my comment above.

One hybrid method to avoid having to update people's remote configs is a x-ostree.contenturl header (which could in turn have the value mirrorlist=...). Then when fetching the summary/repo config, we can use the contenturl for the rest of the pull operation (and maybe allow hardcoded contenturl= config options to override that).

Not opposed, but in the end I think it'd be clearer to edit the client config files; less magic (once the migration is done), ostree code already exists, and it's probably useful to break the ice for "fix up client configs".

@ramcq
Contributor

ramcq commented Jan 10, 2019

One hybrid method to avoid having to update people's remote configs is a x-ostree.contenturl header (which could in turn have the value mirrorlist=...). Then when fetching the summary/repo config, we can use the contenturl for the rest of the pull operation (and maybe allow hardcoded contenturl= config options to override that).

Not opposed, but in the end I think it'd be clearer to edit the client config files; less magic (once the migration is done), ostree code already exists, and it's probably useful to break the ice for "fix up client configs".

Yeah, Flatpak actually has some of these metadata keys defined so that they push new keys into the repo config, such as updated URLs or collection IDs, so there is precedent for that. The repo config editing API and code in Flatpak isn't stellar, but we could split out the story a bit and do a) support for this URL type (likely unblocks FCOS subject to infra team approval), b) clear up repo config editing, c) define a metadata key (maybe a few) to push ostree config updates.

@sinnykumari
Collaborator

I was exploring the 4th option mentioned in #1541 (comment), with details from #1541 (comment), which is about using a mirrorlist in the client's ostree repo config.

I created a mirrorlist hosted in my personal fedora space (this should be replaced with an official link for production):

# cat /etc/ostree/remotes.d/fedora-atomic.conf 
[remote "fedora-atomic"]
url=https://dl.fedoraproject.org/cdn-testing/atomic/repo/
gpg-verify=true
gpgkeypath=/etc/pki/rpm-gpg/RPM-GPG-KEY-fedora-29-primary
contenturl=mirrorlist=https://sinnykumari.fedorapeople.org/ostree-mirrorlist.txt

In the mirrorlist, I have added our testing CloudFront ostree repo URL (https://d1dgksnh07m2j5.cloudfront.net/).
Running env OSTREE_DEBUG_HTTP=1 ostree pull fedora-atomic:fedora/29/x86_64/atomic-host (with and without --disable-static-deltas) on ostree version 29.20181210 works as expected.
It first fetches GET /ostree-mirrorlist.txt HTTP/1.1 from host sinnykumari.fedorapeople.org, then fetches the config and summary files from dl.fp.o, and then the rest of the objects/ and deltas/ requests go to the CloudFront URL listed in the ostree-mirrorlist.txt file.
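
For reference, a mirrorlist is just newline-separated base URLs, so the ostree-mirrorlist.txt used here presumably contains a single line (an assumption, based on the CDN entry mentioned above):

https://d1dgksnh07m2j5.cloudfront.net/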

Using metalink instead of mirrorlist
We have also had metalink support in ostree for a long time. I looked into using metalink as well, considering its advantages over mirrorlist.

In this case, client's ostree config looks like:

# cat /etc/ostree/remotes.d/fedora-atomic.conf 
[remote "fedora-atomic"]
#url=https://dl.fedoraproject.org/atomic/repo/
metalink=https://sinnykumari.fedorapeople.org/ostree-metalink.xml
gpg-verify=true
gpgkeypath=/etc/pki/rpm-gpg/RPM-GPG-KEY-fedora-29-primary

The ostree-metalink.xml file which I created contains values based on today's summary file from https://kojipkgs.fedoraproject.org/atomic/repo/summary . In production, ostree-metalink.xml can be generated on the Fedora server and updated alongside the summary file.

Running env OSTREE_DEBUG_HTTP=1 ostree pull fedora-atomic:fedora/29/x86_64/atomic-host (with and without --disable-static-deltas), currently on ostree version 29.20181210.0, with this config works as expected. It first fetches the /ostree-metalink.xml file from host sinnykumari.fedorapeople.org, and all further requests (fetching config, summary, deltas/ or objects/) go to the CloudFront URL mentioned in ostree-metalink.xml.

Questions:

Personally, I like having metalink in the client's ostree remote config. Also, for the short-term future we are looking at having only the CloudFront CDN; for the long term we may have more CDNs and local mirrors. So we don't need to worry about having MirrorManager-style GeoIP, at least for the short term.

CC @dustymabe @jlebon @cgwalters

@cgwalters
Member Author

In production, ostree-metalink.xml can be generated on the Fedora server and updated alongside the summary file

I think that's probably the main stumbling block for metalink: it adds two things which need to be kept tightly in sync.

Since we're fetching either the summary or refs/ from the fedora infra which provides the checksum...I don't see the value in having another thing with a checksum. Right?

@dustymabe
Contributor

Thanks @sinnykumari for the explanation. Let me understand option #4 a bit better. We basically add contenturl=mirrorlist=https://someurl.fedoraproject.org/mirrorlist.txt and we can ship people off to the CDN that way. Using this approach allows us to not expose the CDN URL in client configs and also allows us to hit the CDN directly. IOW we don't need any changes to the ostree pull code but we still get the benefits? If all of that is true then it would seem like a good answer to me.

I still would want @puiterwijk or @nirik to weigh in here before we decide on optimal solutions.

@sinnykumari
Collaborator

In production, ostree-metalink.xml can be generated on the Fedora server and updated alongside the summary file

I think that's probably the main stumbling block for metalink: it adds two things which need to be kept tightly in sync.

That's true

Since we're fetching either the summary or refs/ from the fedora infra which provides the checksum...I don't see the value in having another thing with a checksum. Right?

agree

My main motivation for preferring metalink over mirrorlist was metalink's advantages, and the fact that our regular (RPM-based) Fedora systems use metalink for updates (though I don't know the implementation details).

If we don't have any additional advantage (like better security) from using metalink over mirrorlist here, then I am +1 for keeping things simple and using mirrorlist.

@sinnykumari
Collaborator

Thanks @sinnykumari for the explanation. Let me understand option #4 a bit better. We basically add contenturl=mirrorlist=https://someurl.fedoraproject.org/mirrorlist.txt and we can ship people off to the CDN that way. Using this approach allows us to not expose the CDN URL in client configs and also allows us to hit the CDN directly. IOW we don't need any changes to the ostree pull code but we still get the benefits?

That's correct.

@dustymabe
Contributor

That's correct.

In that case let's hold off any progress on this pending a discussion with Fedora infra.

@cgwalters
Member Author

One thing I'd also add to this is that if refs/summary are fetched from Fedora infra, we can also do TLS pinning in addition to GPG. (Although this would require a libostree change so that the TLS pins only apply to url and not contenturl.)
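
Sketching what that might look like in a remote config (tls-ca-path is an existing remote option; scoping it to url only is the hypothetical libostree change):

[remote "fedora"]
url=https://ostree.fedoraproject.org
contenturl=mirrorlist=https://ostree.fedoraproject.org/mirrorlist
gpg-verify=true
# hypothetical: this pin would apply to url (refs/summary) but not contenturl
tls-ca-path=/etc/pki/fedora-infra-pinned.pem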

@sinnykumari
Collaborator

Thanks @sinnykumari for the explanation. Let me understand option #4 a bit better. We basically add contenturl=mirrorlist=https://someurl.fedoraproject.org/mirrorlist.txt and we can ship people off to the CDN that way. Using this approach allows us to not expose the CDN URL in client configs and also allows us to hit the CDN directly. IOW we don't need any changes to the ostree pull code but we still get the benefits? If all of that is true then it would seem like a good answer to me.

I still would want @puiterwijk or @nirik to weigh in here before we decide on optimal solutions.

@puiterwijk has set up a new CloudFront distribution caching the entire ostree repo. We are serving the summary, config and mirrorlist files at https://ostree.fedoraproject.org/ . Here, mirrorlist and config are static files and the summary gets synced every 15 minutes from https://kojipkgs.fedoraproject.org/atomic/repo/ .

So, ostree remote config on client will be something like:

[remote "fedora"]
url=https://ostree.fedoraproject.org
gpg-verify=true
gpgkeypath=/etc/pki/rpm-gpg/RPM-GPG-KEY-fedora-29-primary
contenturl=mirrorlist=https://ostree.fedoraproject.org/mirrorlist

We will use https://ostree.fedoraproject.org as the main url so that in the future, if https://kojipkgs.fedoraproject.org/atomic/repo/ is down or unavailable, we can serve content from another place without affecting clients' ostree remote config.

@sinnykumari
Collaborator

Can we get this tested by a couple of folks so that we know the new config works fine?
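
For anyone testing, the steps would be roughly (remote and ref names as used earlier in this thread):

# put the config shown above in /etc/ostree/remotes.d/fedora.conf, then:
# env OSTREE_DEBUG_HTTP=1 ostree pull fedora:fedora/29/x86_64/atomic-host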

@jlebon
Member

jlebon commented Jan 21, 2019

Hmm, looks like GPG verification is failing because of:

$ curl -I https://ostree.fedoraproject.org/objects/b1/5eaaa5d007cb4307cbe9e7bcb868292609f5483fabfbac342ccd842ee2fe50.commitmeta
HTTP/2 404
date: Mon, 21 Jan 2019 17:44:55 GMT
server: Apache/2.4.37 (Fedora)
strict-transport-security: max-age=31536000; includeSubDomains; preload
x-frame-options: SAMEORIGIN
x-xss-protection: 1; mode=block
x-content-type-options: nosniff
referrer-policy: same-origin
content-type: text/html; charset=iso-8859-1

(Note that OSTree still fetches summaries and signatures from url).

Otherwise, testing with GPG verification off works great! I think there's a bug somewhere that wipes out the download stats when it's at the bottom of the terminal, but I did see it peak at around 2.5MB/s, which is definitely faster than any speeds I've had before.

@sinnykumari
Collaborator

GPG verification should work fine now; a redirect rule got added for .commitmeta files: https://infrastructure.fedoraproject.org/cgit/ansible.git/tree/roles/fedora-web/ostree/files/ostree.conf#n5

$ curl -I https://ostree.fedoraproject.org/objects/b1/5eaaa5d007cb4307cbe9e7bcb868292609f5483fabfbac342ccd842ee2fe50.commitmeta
HTTP/2 302 
date: Tue, 22 Jan 2019 12:28:17 GMT
server: Apache/2.4.37 (Fedora)
strict-transport-security: max-age=31536000; includeSubDomains; preload
x-frame-options: SAMEORIGIN
x-xss-protection: 1; mode=block
x-content-type-options: nosniff
referrer-policy: same-origin
location: https://d1gglb5celp6et.cloudfront.net/objects/b1/5eaaa5d007cb4307cbe9e7bcb868292609f5483fabfbac342ccd842ee2fe50.commitmeta
content-type: text/html; charset=iso-8859-1

@sinnykumari
Collaborator

sinnykumari commented Jan 30, 2019

I got a chance to upgrade a Silverblue system on my home network (Bangalore, India) with our new config set-up, and I found it super fast!

Upgrade from ostree version 29.1.2 to 29.20190130.0

With existing config:

# cat /etc/ostree/remotes.d/fedora-workstation.conf
[remote "fedora-workstation"]
url=https://dl.fedoraproject.org/atomic/repo/
gpg-verify=true
gpgkeypath=/etc/pki/rpm-gpg/RPM-GPG-KEY-fedora-29-primary

# time env OSTREE_DEBUG_HTTP=1 ostree pull fedora-workstation:fedora/29/x86_64/silverblue
real    41m38.873s
user    5m29.563s
sys     0m59.102s

With new config

# cat /etc/ostree/remotes.d/fedora-workstation.conf 
[remote "fedora-workstation"]
url=https://ostree.fedoraproject.org
gpg-verify=true
gpgkeypath=/etc/pki/rpm-gpg/RPM-GPG-KEY-fedora-29-primary
contenturl=mirrorlist=https://ostree.fedoraproject.org/mirrorlist

# time env OSTREE_DEBUG_HTTP=1 ostree pull fedora-workstation:fedora/29/x86_64/silverblue
real    2m14.775s
user    0m14.979s
sys     0m12.822s

This is like 20 times faster, can't expect more!

@ramcq
Contributor

ramcq commented Jan 30, 2019

Superb news! Code was there all along... :D

@miabbott
Collaborator

I didn't see the 20x gains that @sinnykumari saw, but a definite improvement:

Old config:

$ cat repo/config 
[core]
repo_version=1
mode=archive-z2

[remote "fedora"]
url=https://dl.fedoraproject.org/atomic/repo/
gpg-verify=true
gpgkeypath=/etc/pki/rpm-gpg/RPM-GPG-KEY-fedora-29-primary

$ time ostree --repo=repo pull --mirror --depth=1 fedora:fedora/29/x86_64/atomic-host
...
real    33m54.054s
user    13m38.848s
sys     1m0.169s

New config:

$ cat repo/config
[core]                                                                            
repo_version=1                                                 
mode=archive-z2                                                                                         
                                                      
[remote "fedora"]                                                                     
url=https://ostree.fedoraproject.org            
gpg-verify=true                                                                   
gpgkeypath=/etc/pki/rpm-gpg/RPM-GPG-KEY-fedora-29-primary      
contenturl=mirrorlist=https://ostree.fedoraproject.org/mirrorlist

$ time ostree --repo=repo pull --mirror --depth=1 fedora:fedora/29/x86_64/atomic-host                                                                                  
...                                                                                                                            

real    10m37.881s
user    2m27.230s
sys     0m27.189s

@sinnykumari
Collaborator

Thanks @miabbott for running updates with the new config. The results are nice!
There's a chance that some of the objects you fetched while pulling the ref fedora/29/x86_64/atomic-host were accessed for the first time from CloudFront (since it's a new set-up) and hence may not have been cached yet. Trying again might give even better results.

PS: The results I pasted were for the 2nd attempt (to ensure that what I was fetching had already been cached in CloudFront).

@ramcq
Contributor

ramcq commented Jan 31, 2019

I think as long as we confirm that ostree isn't doing anything "silly" (the mirrorlist is being cached during a run, HTTP connections are being reused, etc.) then the benchmarks will be dominated by CDN performance, which is dependent on many factors outside of ostree. We should verify that ostree is following the new configuration as desired, and then we can resolve this ticket.
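
A rough way to eyeball that from the debug output (assuming OSTREE_DEBUG_HTTP prints each request, as in earlier comments):

$ env OSTREE_DEBUG_HTTP=1 ostree pull fedora:fedora/29/x86_64/atomic-host 2>&1 | grep -c mirrorlist
# expect a small constant count: the mirrorlist should be fetched once per pull, not per object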

@miabbott
Collaborator

There's a chance that some of the objects you fetched while pulling the ref fedora/29/x86_64/atomic-host were accessed for the first time from CloudFront (since it's a new set-up) and hence may not have been cached yet. Trying again might give even better results.

Of course, this makes sense. I repeated the test using the mirrorlist, this time pulling twice, and found excellent improvements.

Check out that throughput!!!

Receiving objects: 68% (21475/31206) 3.3 MB/s 391.0 MB
Receiving objects: 74% (23168/31206) 5.3 MB/s 451.5 MB

First time (even better than the original test):

$ time ostree --repo=repo pull --mirror --depth=1 fedora:fedora/29/x86_64/atomic-host


GPG: Verification enabled, found 1 signature:

  Signature made Sun 20 Jan 2019 07:31:11 PM EST using RSA key ID A20AA56B429476B4
  Good signature from "Fedora 29 <[email protected]>"


GPG: Verification enabled, found 1 signature:

  Signature made Sat 19 Jan 2019 07:32:41 PM EST using RSA key ID A20AA56B429476B4
  Good signature from "Fedora 29 <[email protected]>"
3740 metadata, 27466 content objects fetched; 647307 KiB transferred in 177 seconds                                                                                                                                                                             

real    2m57.379s
user    2m18.408s
sys     0m21.883s

Second pull immediately afterwards...so fast, much wow!

$ time ostree --repo=repo pull --mirror --depth=1 fedora:fedora/29/x86_64/atomic-host


GPG: Verification enabled, found 1 signature:

  Signature made Sun 20 Jan 2019 07:31:11 PM EST using RSA key ID A20AA56B429476B4
  Good signature from "Fedora 29 <[email protected]>"
Receiving metadata objects: 2/(estimating) 21.0 kB/s 21.0 kB                                                                                                                                                                                                    

GPG: Verification enabled, found 1 signature:

  Signature made Sat 19 Jan 2019 07:32:41 PM EST using RSA key ID A20AA56B429476B4
  Good signature from "Fedora 29 <[email protected]>"
3740 metadata, 27466 content objects fetched; 647307 KiB transferred in 128 seconds                                                                                                                                                                             

real    2m8.794s
user    2m17.034s
sys     0m19.019s

@sinnykumari
Collaborator

@cgwalters Should we close this issue now?

@cgwalters
Member Author

@cgwalters Should we close this issue now?

Yep, I think so. Thanks all!
