Follow redirect (HTTP 302) just once for objects/ #1541
Although, I believe this would just work if the content in the CDN were prefixed with the same path as the origin repo. Right now it isn't, so libostree will generate a URL that returns a 404.
I'll honestly have to ask Patrick in most of these cases, but let me try to comment on a few things:
That's an RFE for ostree, right?
Fedora infra has re-enabled it, but we still have it disabled in ostree. We (ostree) are waiting for the curl bug to get fixed and propagate down to Fedora (and into ostree), I think.
Are you suggesting we use
I see..
Maybe. If we tweaked the server side as above, then it would be easy for people to opt in to testing things. I filed this issue because I think it'd make sense to change libostree to handle it by default too, but if it's not too hard to change Fedora infrastructure, I'd like to do that in addition.
Does the CDN also cover the deltas?
We probably should be serving those over CloudFront. I'll ask @puiterwijk if he can add that.
The deltas should be cached in CloudFront now. Also, curl is fixed in f27+ updates for that http/2 bug and we have re-enabled http/2 on our side, so it might be worth testing whether it can be re-enabled in ostree.
@nirik @puiterwijk Would it be possible to keep the relative hierarchy intact and mirror the whole repo?
I was playing with a patch for this locally. Here's the idea of a script that pulls into a partially-cloned repo (so we're just doing content fetches):
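A minimal sketch of that setup, assuming a scratch repo; the repo path, remote URL, and ref are illustrative, not the original script:

```bash
#!/bin/bash
# Sketch: first fetch only commit metadata, so the repo is "partially
# cloned"; then time the second pull, which fetches just the content.
set -euo pipefail
rm -rf bench-repo
ostree --repo=bench-repo init --mode=bare-user
ostree --repo=bench-repo remote add --no-gpg-verify fedora \
  https://kojipkgs.fedoraproject.org/atomic/repo/
ostree --repo=bench-repo pull --commit-metadata-only fedora:fedora/27/x86_64/atomic-host
time ostree --repo=bench-repo pull fedora:fedora/27/x86_64/atomic-host
```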
One thing that was immediately apparent is that the non-contenturl path is a lot faster (3MB/s vs 800k/s); presumably CloudFront is penalizing small requests, or maybe allocating bandwidth in a different way. I think in the end, trying to optimize the current archive path isn't too useful. Ensuring more people hit deltas is a far bigger win. And rojig gives us chunking at a higher level, which is just more CDN-friendly.
That said, we should still tweak this; people who don't have a good connection to the existing kojipkgs server should benefit.
Sorry it's not immediately obvious to me what the outcome of your statement is. I'll try to ask clarifying questions:
So you're saying the non-contenturl path is faster?
Yeah, we just enabled deltas via the CDN above; I thought that had already been done before. Do you think that changes the numbers?
The confusing thing here is that (AIUI) what kojipkgs today is doing is using Varnish to act as a cache itself, but redirecting to CloudFront for objects it doesn't have. It's apparently faster (for me) to access objects from kojipkgs than from CloudFront when they're cached. Setting contenturl bypasses that cache and goes straight to the CDN. But again, this could be different for someone with a different connection to kojipkgs.
My patch is brute-forcing this; it's not handling redirections, as that'd be messier in the code. If Fedora moved the URL layout, then it'd be a lot easier for everyone today to try setting contenturl. But the Varnish caching behavior is going to make performance testing unpredictable. (Really, in the end I think ostree should probably learn a chunked format; we could even consider pushing rojig-like semantics down into ostree itself. I suspect it'd be very useful to have some of the stuff that lives in rpm-ostree work for debs etc. too, although the complexity of making it generic is high.)
That's not exactly my understanding. If you run the test multiple times (so we know the objects should be in CloudFront now), do the results change?
I would personally avoid introducing a new niche config option.
So, I'm fine with making that change.
Yeah, I'm fine to do that, though once we do the first part, anyone can test out performance etc. easily by setting contenturl.
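For example, opting in would just mean adding one line to the remote's config; a sketch, assuming the remote lives in /etc/ostree/remotes.d/fedora-atomic.conf and using a placeholder CDN URL:

```bash
# Hypothetical opt-in: append a contenturl line to an existing remote config.
# Metadata (summary, refs) still comes from url; objects come from contenturl.
echo 'contenturl=https://example-cdn.cloudfront.net/' | \
  sudo tee -a /etc/ostree/remotes.d/fedora-atomic.conf
```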
For testing purposes, we set up a CloudFront CDN for the entire ostree repo (https://kojipkgs.fedoraproject.org/atomic/repo/) with the URL https://d1dgksnh07m2j5.cloudfront.net/ . For testing, I used:
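A sketch of that kind of test; the ref, repo path, and flag choices are my guesses, not the exact command:

```bash
# Illustrative: time a pull that fetches content via the CloudFront URL.
ostree --repo=test-repo init --mode=bare-user
ostree --repo=test-repo remote add --no-gpg-verify cdn-test \
  --set=contenturl=https://d1dgksnh07m2j5.cloudfront.net/ \
  https://kojipkgs.fedoraproject.org/atomic/repo/
time ostree --repo=test-repo pull cdn-test:fedora/29/x86_64/silverblue
```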
Result: it finished quite fast! We also added redirects in our Fedora-infra config with the testing url https://dl.fedoraproject.org/cdn-testing/atomic/repo/objects/ , using this as the redirect source. Did I miss something to try out?
Probably better to test with a direct contenturl rather than going through the redirects. Also very important when discussing this is whether it's a delta pull or not. Here's a way to easily reproduce "clean" pulls, and you can also add a flag to force non-delta fetches for comparison:
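Something along these lines; the ref is illustrative, and using --disable-static-deltas to force plain object fetches is my assumption about the flag being suggested:

```bash
# Illustrative helper: a throwaway repo each run, so every pull is "clean".
fresh_pull() {
  rm -rf clean-repo
  ostree --repo=clean-repo init --mode=bare-user
  ostree --repo=clean-repo remote add --no-gpg-verify t \
    https://kojipkgs.fedoraproject.org/atomic/repo/
  time ostree --repo=clean-repo pull "$@" t:fedora/29/x86_64/silverblue
}
fresh_pull                          # delta pull (if a from-scratch delta exists)
fresh_pull --disable-static-deltas  # plain object fetches
```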
One other random thing I noticed while playing with this: at some point I was testing with a local build of ostree and had been noticing some hangs in the curl code while fetching. I think that boils down to a missing fix in that local build.
OK yeah, http2+cloudfront contenturl is quite a dramatic improvement over http1+constant redirects. With a primed CloudFront cache (i.e. this is my 2nd pull) and http2 enabled, the pull is dramatically faster.
Indeed. http2+cloudfront contenturl works much faster.
So this leaves the question of how we enable this. We definitely don't want people hardcoding the CF URL. The previous suggestion here was #1541 (comment). I have a new idea: we could support a special header the server sends to point clients at the content location.
@sinnykumari I'm trying to understand exactly what settings you had when you were doing tests. Was this your setup?
I think the above setup would appropriately test things, because we would pay the redirect penalty on initial contact (same as our prod repo setup today), but would subsequently go directly to the CDN for objects (different from our prod repo setup today).
Yeah, I don't see that being too hard to add to the server side, but is there a great benefit to adding the special header rather than supporting the "probe+assume" method as previously suggested (#1541 (comment), #1541 (comment))?
My fedora-atomic remote config is:
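Presumably along these lines; this is my reconstruction from the URLs discussed in this thread, not the exact file:

```
[remote "fedora-atomic"]
url=https://kojipkgs.fedoraproject.org/atomic/repo/
contenturl=https://d1dgksnh07m2j5.cloudfront.net/
gpg-verify=true
```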
I have also tried different url values: https://dl.fedoraproject.org/atomic/repo/ and https://dl.fedoraproject.org/cdn-testing/atomic/repo/ (the CDN testing URL we set up). The url value doesn't matter much, because when contenturl points at a valid ostree repo, content is fetched from there. But I think using https://kojipkgs.fedoraproject.org/atomic/repo/ is preferable because that's the source of truth for the latest content, from which missing objects should be fetched when they're not yet in the CDN.
It's a bit about conservatism: we could theoretically be breaking someone's setup where, e.g., they have the server configured in some unusual way. Maybe we add an opt-out config option, at least temporarily? Dunno.
It's definitely reasonable to be conservative. For some reason I thought this was a relatively safe thing to do.
Yeah, that could be something to do, if you think a config option is warranted.
@sinnykumari, I think the two setups we need to compare numbers for are the redirect-everything setup and the contenturl setup, as sketched below.

In the first case we are paying a redirect penalty for all requests: an initial request for a file under objects/ goes to the main URL and then gets redirected to the CDN. In the second case we are paying a redirect penalty only for requests about the status of the repo (i.e. "what commit do I update to?"), but not for objects; object requests go directly to the CDN. I think these two setups are the ones that represent the smallest delta from what we currently have to where we want to go.
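A sketch of the two configs, with URLs taken from the testing setup above:

```
# Setup 1: one url; the server 302s everything, objects included, to the CDN
[remote "fedora-atomic"]
url=https://dl.fedoraproject.org/cdn-testing/atomic/repo/

# Setup 2: url only for repo metadata; objects go straight to the CDN
[remote "fedora-atomic"]
url=https://kojipkgs.fedoraproject.org/atomic/repo/
contenturl=https://d1dgksnh07m2j5.cloudfront.net/
```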
wow that looks very promising
This is mostly a requirement by our infra team. They want to make sure the source of truth always comes from Fedora itself and are being conservative here, which isn't bad. The CDN is an implementation detail that could be changed at any time.
So @cgwalters, @jlebon, I think there are three options from our discussions above: (1) teach libostree to follow the objects/ redirect once and reuse it as the base URL ("probe+assume"), (2) have the server send a special header pointing clients at the content location, or (3) update client remote configs to set contenturl directly. Any opinion?
@dustymabe I liked Colin's idea of allowing a mirrorlist to be specified for the url or contenturl entry, and have that cached/reused sensibly. Then only the main server's URL is included inside the client config, but that server can send you to whatever CDN or mirror they want, either statically or with some geo-magic. (That could be with or without some way of flagging a content only server in the mirrorlist, so the whole url= is just one mirrorlist which has both types of servers in.) |
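For reference, ostree already understands a mirrorlist= prefix on url/contenturl; a sketch of how that could look, with placeholder URLs:

```
# Client remote config: point at a mirrorlist instead of a single repo URL
[remote "fedora"]
url=mirrorlist=https://ostree.example.fedoraproject.org/mirrorlist
gpg-verify=true
```

The mirrorlist itself would just be plain text, one base URL per line, in preference order:

```
https://example-cdn.cloudfront.net/
https://kojipkgs.fedoraproject.org/atomic/repo/
```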
(I dislike any form of redirect probing. Seems deeply magic and relies on more complicated server configuration, because you can't just have your CDN proxy your origin without also having your origin, or another proxy, separate client requests from CDN requests to decide if they should be redirected.) |
@ramcq, so you are saying there is a 4th option: a mirrorlist.
I'd have to get input from fedora infra on that option. @puiterwijk, any thoughts? |
Yes, the 4th or "mirrorlist" option is described in #1541 (comment) |
You mean the existing mirrorlist support? One hybrid method to avoid having to update people's remote configs is a server-side redirect from the current url to the mirrorlist.
Yep thanks 👍 - I edited my comment above.
Not opposed, but in the end I think it'd be clearer to edit the client config files; less magic (once the migration is done), ostree code already exists, and it's probably useful to break the ice for "fix up client configs". |
Yeah, Flatpak actually has some of these metadata keys defined so that they push new keys into the repo config, such as updated URLs or collection IDs, so there is precedent for that. The repo config editing API and code in Flatpak isn't stellar, but we could split out the story a bit and do a) support for this URL type (likely unblocks FCOS subject to infra team approval), b) clear up repo config editing, c) define a metadata key (maybe a few) to push ostree config updates.
I was exploring the 4th option mentioned in #1541 (comment), with details from #1541 (comment), which is about using a mirrorlist. I created a mirrorlist hosted in my personal Fedora space (to be replaced with an official link for production).
In the mirrorlist, I added our testing CloudFront ostree repo URL (https://d1dgksnh07m2j5.cloudfront.net/). I also tried using metalink instead of mirrorlist; in that case, the client's ostree config looks like:
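Presumably something like this; my reconstruction, with an illustrative host and file name (ostree accepts a metalink= prefix the same way it accepts mirrorlist=):

```
[remote "fedora-atomic"]
url=metalink=https://sinnykumari.fedorapeople.org/ostree-metalink.xml
gpg-verify=true
```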
The ostree-metalink.xml file I created contains values based on today's summary file from https://kojipkgs.fedoraproject.org/atomic/repo/summary . In production, the metalink would be generated on the Fedora server and updated whenever the summary file is. Running an upgrade with this setup works. A few questions follow from this.
Personally, I like having metalink in the client's ostree remote config. For the short term we are looking at having the CloudFront CDN; longer term we may have more CDNs and local mirrors. So we don't need to worry about MirrorManager-style GeoIP, at least in the short term.
I think that's probably the main stumbling block for metalink: it adds two things which need to be kept tightly in sync, since the metalink embeds checksums of the summary file we're fetching.
Thanks @sinnykumari for the explanation; that helps me understand the option. I still would want @puiterwijk or @nirik to weigh in here before we decide on optimal solutions.
That's true
Agreed. My main motivation to prefer metalink over mirrorlist was the extra verification metalink provides, and that regular Fedora systems (rpm-based updates) use metalink (I don't know the implementation details though). If we don't get any additional advantage (like better security) from metalink over mirrorlist here, then I am +1 for keeping things simple and using mirrorlist.
That's correct.
In that case let's hold off any progress on this pending a discussion with Fedora infra.
One thing I'd also add to this is that if refs/summary are fetched from Fedora infra, we can also do TLS pinning in addition to GPG. (Although this would require a libostree change to say the TLS pins only apply to url and not contenturl.)
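For reference, the pinning knob that exists today applies to the whole remote; a sketch, with illustrative paths and URLs:

```
[remote "fedora"]
url=https://kojipkgs.fedoraproject.org/atomic/repo/
contenturl=https://d1dgksnh07m2j5.cloudfront.net/
# tls-ca-path currently pins trust for all fetches, contenturl included,
# which is why the libostree change mentioned above would be needed:
tls-ca-path=/etc/pki/fedora-ostree/pinned-ca.pem
```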
@puiterwijk has set up a new CloudFront distribution caching the entire ostree/repo/. We are serving the summary, config, and mirrorlist files at https://ostree.fedoraproject.org/ . Here, mirrorlist and config are static files, and the summary gets synced every 15 minutes from https://kojipkgs.fedoraproject.org/atomic/repo/ . So, the ostree remote config on the client will be something like:
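That is, presumably something like this (my reconstruction):

```
[remote "fedora"]
url=https://ostree.fedoraproject.org
contenturl=mirrorlist=https://ostree.fedoraproject.org/mirrorlist
gpg-verify=true
```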
We will use https://ostree.fedoraproject.org as the main url so that in the future, if https://kojipkgs.fedoraproject.org/atomic/repo/ is down or unavailable, we can serve content from another place without affecting clients' ostree remote config.
Can we get it tested by a couple of folks so that we know the new config works fine?
Hmm, looks like GPG verification is failing when fetching the detached commit metadata.
(Note that OSTree still fetches summaries and signatures from the main url, not the contenturl.) Otherwise, testing with GPG verification off works great! I think there's a bug somewhere that wipes out the download stats when they're at the bottom of the terminal, but I did see it peak at around 2.5MB/s, which is definitely faster than any speeds I've had before.
GPG verification should work fine now; a redirect rule got added for the .commitmeta file: https://infrastructure.fedoraproject.org/cgit/ansible.git/tree/roles/fedora-web/ostree/files/ostree.conf#n5
I got a chance to upgrade a Silverblue system on my home network (Bangalore, India) with our new config setup, and I find it super fast! The upgrade was from ostree version 29.1.2 to 29.20190130.0. Comparing the existing config with the new config, this is like 20 times faster; can't expect more!
Superb news! Code was there all along... :D |
I didn't see the 20x gains that @sinnykumari saw, but it was a definite improvement from the old config to the new config.
Thanks @miabbott for running updates with the new config; the results are nice! PS: the results I pasted were from a 2nd attempt (to ensure that what I was fetching had already been cached in CloudFront).
I think as long as we confirm that ostree isn't doing anything "silly" (the mirrorlist is cached during a run, HTTP connections are reused, etc.), the benchmarks will be dominated by CDN performance, which depends on many factors outside of ostree. We should verify that ostree is following the new configuration as desired, and then we can resolve this ticket.
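One quick way to eyeball that, assuming your build honors the OSTREE_DEBUG_HTTP environment variable (repo path and ref are illustrative):

```bash
# Dump the HTTP traffic for a pull: the mirrorlist should be fetched once
# up front, object requests should go straight to the CDN, and connections
# should be reused rather than re-established per object.
OSTREE_DEBUG_HTTP=1 ostree --repo=repo pull fedora:fedora/29/x86_64/silverblue 2>&1 | less
```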
Of course, this makes sense. I repeated the test using the mirrorlist, this time pulling twice, and found excellent improvements. Check out that throughput! The first pull was even better than the original test, and the second pull immediately afterwards was so fast, much wow!
@cgwalters Should we close this issue now? |
Yep, I think so. Thanks all! |
Fedora recently started redirecting requests for objects/ to a CDN (HTTP 302). But things would be dramatically faster if, when we hit a redirect for a commit object, we transparently used the redirect target as the base URL for future requests. (Also, we should re-enable http2 in Fedora.)
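Roughly the idea, illustrated with curl; the object path here is a placeholder, not a real object:

```bash
# Probe one object, note where the 302 points, and reuse that as the base
# URL for subsequent object fetches.
base=https://kojipkgs.fedoraproject.org/atomic/repo
obj=objects/ab/cdef0123456789abcdef0123456789abcdef0123456789abcdef01234567.commit
redirect=$(curl -s -o /dev/null -w '%{redirect_url}' "$base/$obj")
# Strip the object suffix; what remains is the CDN base to remember:
echo "future requests could use: ${redirect%/$obj}"
```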