Allow the NavigationController to manage resources on first load #73
Sounds like you're looking for a means to block page load until the 'controller' is up and running on first load. Some of us had talked about an option like that in the registration process at some point. I think it was dropped mostly as a matter of reducing scope for the sake of clarity, more than a fundamental problem with it. At the time of those discussions we had envisioned a header-based registration mechanism such that the body of the initial page itself was re-requested through the controller once it was up and running.
One option is something like this, which is slightly underspecified right now:
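The snippet referenced here did not survive in this copy of the thread. Below is a hedged reconstruction of the register-then-reload pattern discussed throughout the thread; `registerServiceWorker` and the `"/*"` scope are taken from later comments, and the browser APIs are stubbed so the control flow can be followed (and run) outside a browser:

```javascript
// Sketch of "register the controller, then re-request the page through it".
// In a browser this would be roughly:
//   navigator.registerServiceWorker("/*", "controller.js")
//     .then(function () { window.location.reload(); });
// Stub standing in for the browser's registration API:
const navigatorStub = {
  registerServiceWorker: function (scope, scriptUrl) {
    // Pretend the controller installs successfully.
    return Promise.resolve({ scope: scope, scriptUrl: scriptUrl });
  },
};

function registerThenReload(nav, reload) {
  return nav.registerServiceWorker("/*", "controller.js").then(function (reg) {
    reload(); // the reload is re-requested through the now-active controller
    return reg.scope;
  });
}
```

The point of the pattern is that only the *second* load of the page is routed through the controller; the first load proceeds unblocked.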
@alecf with that mechanism, wouldn't that risk flashing some of the content that gets loaded before the controller is done loading? The document reload could be avoided if the script was blocking. Or if somehow the browser was aware that a controller was going to load, and it could block the load of the next resource until the controller is finished.
The way alec pointed out is a pretty close approximation, and it would result in the main page load also being routed through the controller for the reload. Being browser developers, we're understandably reluctant to introduce things that involve "blocking page loads" :)
We're waging a war in the webperf community to get rid of "blocking resources" whenever and wherever possible... I would upgrade "reluctant to introduce blocking resources" to something much, much stronger. First, we're stuck on a controller download, then on parsing and eval, and then on potentially slow routing calls for each request -- ouch x3.
Copied my comment from discussion on https://groups.google.com/a/chromium.org/forum/#!topic/blink-dev/Du9lhfui1Mo So, while there are performance implications of giving developers full control of resource loading, I really think it's the best solution going forward. One thing we all have to stop and realize is that when developers want to control resources, they manage to do it - just in non-optimal ways that also have big performance implications. One solution developers have and use is proxying pages and making modifications to resources before hitting the client, which can have big security implications, and does not have the advantage of making decisions based on device conditions. Another option developers have and use is writing alternates to … So when thinking about giving the Navigation Controller the power to control resources on first load, it's not a matter of blocking vs no blocking, it's a matter of blocking vs the 3 options previously listed.
Hi guys, I'm a colleague of Shawn's at Mobify, and I thought I'd butt in here because this API is really intriguing to me: not providing an API that allows full control will just lead people to using the … This API is already potentially quite "performance dangerous" (not to mention "basic functionality dangerous") in the sense of providing a very deep hook into resource scheduling, far in excess of what's previously been available, but the most likely application is in fact performance improvement, e.g. the application-specific caching rules as presented in the groups thread linked above, or choosing device-appropriate resources for various devices, and making those decisions (and starting those downloads) as early as possible. I haven't dug too deeply into the API itself yet, but would it be hypothetically possible to throw a "bootstrap" controller into a page inline to overcome the "additional blocking resource" objection Ilya brought up?
Would it be a terrible idea to have some sort of …
@noahadams - perhaps flip this around - I'm not sure reload is "at least as bad as blocking" - from my perspective, blocking is the worst possible option, because it effectively prevents ALL resource loads and introduces complicated behavior. Since you can virtually emulate the blocking behavior with interstitial content while the page loads, I can't see a good reason to introduce blocking. From mobify's own blog about 'm.' sites and redirects:
This situation is worse, because you actually have to start reading and parsing html and multiple resources before the pageload can continue. A few examples... What happens here:
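The example markup itself was lost from this copy of the thread; based on the filenames referenced in the questions that follow, it was presumably something along these lines (the exact shape of the registration call is an assumption):

```html
<img src="foo.png">
<script>
  navigator.registerServiceWorker("/*", "controller.js");
</script>
<img src="bar.png">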
Do we block loading of "bar.png"? is foo.png visible on the screen? what about this:
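Again the markup was lost; given the question about script ordering that follows, the second example presumably looked roughly like this (the `app.js` filename is invented for illustration):

```html
<script>
  navigator.registerServiceWorker("/*", "controller.js");
</script>
<script src="app.js"></script>
```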
Is that script loaded before or after controller.js? When is it evaluated? What if it takes 2 seconds to get controller.js? To me these examples demonstrate that no web platform API will ever support a method that blocks the main thread, especially one dependent on a network request. document.write() was bad enough; this is far worse. Further, a properly designed website could immediately put up an interstitial message, "Loading resources..." or what have you, if your site truly is non-functional without the controller.
@alecf the Navigation Controller doesn't have to block the rendering thread in all cases, it just has to block resources from loading. Say for example, you had a document like this:
I would imagine in this case, Foo and Bar would render regardless of whether or not the controller was finished, and only the images would be delayed from loading until the controller was finished downloading. When the controller is finished loading and we have the instructions, the images could then start downloading. Now, if you had an external script tag in the head placed after the controller, like this...
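Both example documents were lost here; the following are hedged reconstructions from the prose ("Foo" and "Bar" text with images, then an external script in the head placed after the controller — the image filenames are invented):

```html
<!-- Case 1: text renders immediately; only the image fetches wait on the controller -->
<script>
  navigator.registerServiceWorker("/*", "controller.js");
</script>
<p>Foo</p>
<img src="photo1.jpg">
<p>Bar</p>
<img src="photo2.jpg">

<!-- Case 2: an external script in the head, placed after the controller -->
<head>
  <script>
    navigator.registerServiceWorker("/*", "controller.js");
  </script>
  <script src="jquery.js"></script>
</head>
```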
...then yes, I would envision that the main rendering thread would be blocked, because loading jQuery would be delayed waiting for the controller to finish loading, and well, external scripts block rendering. But scripts in the head block rendering anyways - and we all know the best practice is to throw scripts at the end of body. Therefore if developers follow that best practice, there would be no blocking of the main rendering thread even if the Navigation Controller behaved as I'm suggesting. The real performance loss here is that the preparser/preloader would be delayed until the controller is finished loading. As for "what if it takes 2 seconds to download controller.js" - based on the spec, it seems as though the controller wouldn't get large enough to take that long to download... Of course, it's possible. Once again, I just want to emphasize that in order to solve the responsive image problem, people already are blocking resources - just in different ways. Some are using proxies to rewrite resources, some are changing src attributes and loading images at the end of rendering - neither of these is optimal. Aside from giving users full control over resource loading, I can't think of a better alternative to solve the responsive image problem.
Once you have a hammer, everything looks like a nail. We don't need Navigation Controller to solve the responsive images problem. The responsive images problem needs to be solved via appropriate APIs - srcset, picture, client-hints, etc. The argument that NC "doesn't have to" block rendering is not practical: almost every page out there (unfortunately) has blocking CSS and JavaScript, so the end result will be that we block ourselves not only from rendering, but also from fetching those resources ahead of time. In Chrome, just the lookahead parser gives us ~20% improvement [1]. Further optimizations with aggressive pre-connects, pre-fetch, etc., will help us hide more of the mobile latency. [1] https://plus.google.com/+IlyaGrigorik/posts/8AwRUE7wqAE Also, since we've already had a lengthy discussion on this before, as a reference: I completely understand your (Mobify) motivation to have NC override all browser behavior -- you've built a business on rewriting poorly implemented sites into something more mobile-friendly. But let's face it, the actual answer is: you shouldn't need a proxy layer here, the site should be rebuilt to begin with. Adding more proxies can make this type of work easier, but it won't give us the performance we want (yes, it means the laggards have to manually update their sites). tl;dr: let's keep responsive images out of this discussion.
First I just want to say that I hope it's clear that I really appreciate the fact that we are all trying to come up with great ideas to benefit the web, and I think that it's pretty awesome that we can do it in a collaborative and open forum like this :) I think if the overall goal is to do everything that we can to improve the performance of the web, then I don't think we should be limited to hoping that laggards will manually update their sites. Automated tools are a very scalable way of achieving the goal of making the web faster. Google's own PageSpeed Service is a great example of this - it's not an optimal solution since pages must be routed and proxied through Google's servers, but it can definitely significantly improve the performance of most websites. I liked something you said in one of our earlier discussions on G+: "My only nit-pick is the "we will all benefit from another demonstrably effective technique to consider". If we qualify that with a bunch of caveats, like "on some sites and in some cases, your mileage will vary, and still slower than what the platform could deliver, assuming it implemented the right features".. then we're good! That's all. :-)" Even if we couldn't figure out a way to give developers the ability to have full control of resource loading without incurring a penalty, I still think it's a worthwhile feature that could be very useful for creating automated tools to help speed up the web without needing to educate every single developer on the rules of performance. Then like you said, as long as we indicate that "your mileage will vary, and (your site is) still slower than what the platform could deliver", and as long as we can slowly educate them afterwards on how to take advantage of the platform, then we are good :) And one note on responsive images: I have a few gripes about picture and srcset, but I won't list them here.
@alecf I suppose you're right about the reload() workaround approach to blocking behaviour being at least less complicated (though I have my own UX gripes about loading interstitials in pages and in browsers, I won't raise them here). My one concern about using it would be the case of a stateful transition between origins (that is to say, a cross-domain POST), though I'll admit that that's an uncommon edge case. I think there's an argument to be made that a blocking version of this would have blocking semantics similar to a blocking … What about the potential for bootstrapping a controller inline with enough logic to "correctly" load the current page and using the "upgrade dance" later to install something more full-featured?
I think there's certainly something interesting in the notion of inline controllers for bootstrap... it sounds like we should file a new issue for that suggestion. I'd be interested in hearing this thought out in particular (file a new issue, discuss these things there...)
@jansepar we may be getting off topic here, but we shouldn't be putting in large new features which will guarantee significantly degraded performance -- blocking all resource downloads is a big red flag, and then there is still the question of overhead per lookup. Besides, you basically do this already with your implementation, so it's not clear what you would win here.
@igrigorik I think the potential for an inline controller could be a good compromise that gives resource control on first load without degrading performance! Looking forward to seeing what comes out of that discussion, which should be opening in a separate ticket. Regarding my implementation vs doing it with NC - there would definitely be some big performance wins from controlling resources via an API (NC), rather than capturing the entire document.
@igrigorik @noahadams is planning on creating an issue for being able to bootstrap the controller inline. |
I think some applications may want to have some kind of always-on service worker. I see a way to allow that without hurting the page performance too much: via an HTTP header.
This enables no-performance-loss installation:
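No concrete syntax was given for the proposed header; a purely hypothetical shape (the header name and parameters here are invented for illustration, not from any spec):

```http
HTTP/1.1 200 OK
Content-Type: text/html
Service-Worker-Registration: /service-worker.js; scope="/*"
```

The idea being that the browser could begin fetching the worker script as soon as the response headers arrive, before any of the HTML is parsed.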
The header is an interesting idea, but it's always on once you register it.. in fact, since there is no pattern here, I don't see a way to register it in the general case.. you'd at least need to change the header to …
But I'm still not convinced that this helps enough to justify a whole new header. Just to be clear: the use case that is covered by this is one where the user loads a page, and then goes "offline" before that page is reloaded.. and you don't want to then reload offline, right? All the other ways of using this (responsive images on first load, etc.) are beyond the scope of Service Worker design, even if people try to use Service Workers to solve them.
@FremyCompany "this enables no-performance-loss installation" is not true. Stuffing the controller reference into an HTTP header may speed things up just a tad - by definition, headers will arrive first, allowing the browser to start fetching the controller - but it still does not address the problem of having to block dispatch of all other resources until the controller is loaded. @alecf agreed, don't think the header adds much here.
@igrigorik The advantage of headers is that you don't have to wait to parse the page, and also that the header is only sent once over HTTP 2.0 because of header compression. You don't pay the cost of inlining multiple times. Regarding the blocking resource issue, this is a developer issue. If developers need something, they will achieve it anyway; for example by putting all the HTML in an HTML comment, waiting for the ServiceWorker to be loaded to reload the page, then extracting the HTML from the comment on DOMContentLoaded. That will do the same thing, only slower. Also, do not forget that we are not forced to apply the service worker to all URLs; we can restrict it to some elements only, which may still leave the page usable in the meantime.
Yep, that is true.
Everything is a developer issue if you get the API wrong. Perhaps the header is a reasonable solution, but this point alone is not sufficient as an argument for it.
That's true, but practically speaking, if you actually want to take your app offline, that's not the case, is it? As opposed to just using NavController to intercept a few different requests and rewrite them... As such, I would fully expect most people to just claim "/*".
I think you are right about /* but to be honest I'm still hoping that some "critical" resources can be put into an improved appcache instead, allowing those resources to be kept offline longer and to bypass the service worker. The amount of such resources, being limited and rarely changed, should be sufficiently low to be managed by hand. That's the hope at least...
There is absolutely no way we're combining AppCache and ServiceWorker - if anything I expect that using them together will result in several developers feeling so bad about themselves for trying that they give up on web development entirely, and write native apps as a penance for their sins. I think we need to get back to the issue at hand, which is the attempt to "go offline" during the initial load of the document, the first time it's ever seen by the browser. This is only the very first time - registration is persistent across future pageloads and even browser restarts! We're jumping through hoops to avoid this:

```js
navigator.registerServiceWorker("/*", "service-worker.js").then(function() {
  window.location.reload();
});
```

or alternatively
Or something similar. I just can't see introducing anything that would block all resources from loading the first time a user visits a page. The browser would just sit there with a blank white page spinning/progressing until the service worker is downloaded and started. A developer who did that would essentially be saying "I want my web page to suck for 10-30 seconds for all first-time visitors" - if your site is really that heavily dependent on the service worker, you WANT some kind of "installer" or progress meter to give feedback to your users, so they don't just hit the "back" button and never visit your site again. (like the god-awful but unfortunately necessary progress bar that gmail has)
wait, think about what you're really asking though:
Putting aside appcache for a moment: if they are available in the system cache and are fresh, then you don't need a service worker present to be aware of them. If the service worker is registered and requests its cache be populated, then that mechanism is really a function of the browser implementation of the SW cache - if the implementation is written such that it can just refer to the existing, fresh data in the system cache from the SW cache implementation, then it won't have to re-download those resources when the SW is instantiated. I don't really see what this has to do with having SW loaded in the first invocation of the page - it sounds like you're more concerned about the transition from a non-SW-controlled page to a SW-controlled page, but trying to solve it by avoiding non-SW pages altogether.
Hum, now that I think about it, maybe this content request would also be done by …
@alecf Why would it be hard to have both a SW and an appcache? I seriously don't get it... I could totally use an appcache for my "/style", "/resources" and "/scripts" folders while requiring a ServiceWorker for the "/data" or "/api" folders. Then the website can load perfectly fine without SW if needed, because the essential content will rely on the appcache, while still providing case-by-case caching functionality for more variable and user-dependent content, and it is not a critical issue if that happens after a very small latency because the core content can be ready independently. By the way, it is totally false that the page will display blank while the SW is loaded, because a developer with a minimum of logic will make sure not to have blocking content until it has enough stuff going on to display its progress bar; or will make sure the SW is fast, or more likely both. This 10-30s analogy is hyperbolic and totally misses the point. The SW may allow huge wins on the wire by allowing finer content negotiation, diff files and request prioritization that in the end may make the page appear to load faster even on first run.
"Si quieres viajar alrededor del mundo y ser invitado a hablar en un monton |
@abarth @igrigorik ah, I see what you mean. What if … Also - I wonder if we should move this discussion elsewhere (although the outcome of the …
Heh, @PhUU and I were talking about this on Friday and came to the same conclusion regarding the preloader. I don't think inline workers or onfetch are feasible because of this. It becomes a worse version of … The only way I can see it working is if the worker registration was declarative (response header, or attribute) and optionally required the worker to load, install & activate before handling page requests. However, this is ugly for performance.
@abarth fwiw, independent of the onfetch discussion here, we'll have to join against ServiceWorker... that's a feature. |
@igrigorik ServiceWorker isn't main thread, preloader can call onfetch in the active worker or (if absolutely need be) not run for urls with an active serviceworker. |
@jakearchibald yep, I'm just pointing out that it won't be a straight shot from parser thread to preloader. Re "if absolutely need be": so, SW does not guarantee that, if it's installed, all requests must be routed through it?
Ah, no, sorry, that was really ambiguous. I mean the preloader can disable itself for urls with an active serviceworker, since it knows that upfront (which it wouldn't with the inline worker/onfetch thing). Obviously having the preloader call the worker's onfetch would be better.
Disabling the preload scanner will significantly impact performance. |
@abarth Right. For serviceworker pages the preloader should call onfetch so it can work out which caches to fetch from. |
I don't see a problem running preloads through the serviceworker - it would be even better if there was a hint in the request indicating that the request originates from a preload - the SW might be able to get involved more:
@alecf I don't think we gain anything by (1) and (2)... as far as SW is concerned, preload requests are no different from parser-initiated requests, and should be treated as such. I don't see why we need to special-case them. For (3), no, Chrome does not mark preload requests with any special headers/flags -- I believe IE does though, not sure about others.
@abarth I don't think the …
If the …
@jansepar: The preload scanner runs before & ahead of JavaScript execution. So:

```js
var fetchStr = 'fetch';
window['on' + fetchStr] = function() { ... };
```

The above has to be parsed & executed before the preloader can continue its work.

```html
<script src="1.js"></script>
<script src="2.js"></script>
<script src="3.js"></script>
```

The above would have to download one after the other, because each may contain a fetch listener that'd change how the next script is fetched.
@jakearchibald that's not entirely true. The preload scanner is invoked when the parser is blocked - namely, when a blocking script is encountered. Further, as of today an inline script actually blocks both the parser and the preload scanner - the latter part is considered a bug and will/should be addressed in the future. Long story short, today inlining the fetch script would work. This points to a larger question: how does the UA know to block requests on SW? Say I have my SW declaration at the top of the page, and a bunch of scripts below it... How will the UA distinguish between the cases of: (a) I've registered this controller before, therefore route these requests through me, vs (b) this is a new controller, preloader/parser please go ahead...
I don't believe that's true in Firefox, possibly other UAs too. Besides, having it work in inline scripts but not external scripts sounds like bad magic.
Controller registration is entirely async and will have no say over the resource loading of this page unless the registered controller calls … When a fetch happens, the UA will look to see if there's an active worker that applies to the page URL. If not, things continue as normal. Otherwise, the preparser will trigger fetch events in the worker for requests it wants to make. Say you have 3 scripts at the bottom of the page: it's possible that the first 2 will be requested normally, then the worker install completes, it calls …
It does hold for FF, IE, and others -- with the exception of IE 8/9. This is something we recently ran into on the PageSpeed side; I can dig up the WPT runs if needed. That said, I'm not suggesting this is the right way to solve the problem. As I mentioned earlier, this behavior is considered a bug... even if it's consistent between popular UAs.
What does "active worker" actually mean? Say I visited awesome-widgets.com yesterday for the first time and it installed a SW instance. A week later, and a few browser reboots later, I come back to that site: what is the UA logic here? Is it going to check some local URL registry and see if it has the controller script in cache? Then block/wait to spin up the controller and forward requests to it? Ultimately, what I'm curious about is: if I can guarantee that the controller is in cache, what can we say about when the controller will be executed / how the requests are routed through it?
Yep!
When a browser tab closes:
When the browser fetches a page, or a request is made from a page:
The intention is to avoid v1 and v2 of a worker running at the same time. Because if v2 migrates data and deletes caches, it leaves v1 unusable or, worse, silently saving data to a location that's no longer being used.
@jakearchibald I don't think this is how …
Totally agreed :). But I don't think that would be at all necessary. The preloader doesn't kick off until a blocking script is encountered - therefore if the script for registering the …
This is exactly what you get with serviceworker currently. This thread appears to be about guaranteeing the preloader won't run until the fetch listener is in place.
Right, and this is behaviour that will change in the future.
Yeah, I said that a few comments ago (#73 (comment)). The only way I currently see this working is by doing worker registration via a response header, along with another indicating that the worker is required either for the current request or subsequent requests. We wanted to avoid this due to performance.
With lots of help from @slightlyoff and @jakearchibald (thanks guys), I think I finally have a handle on the general shape of this problem... Below is my attempt at the summary of the discussion so far, and a proposal for how to move forward. First, let's start with the basics:
On a more tactical side, possible ways to minimize the impact of said race condition:
Everything above is implementable with the current spec as is. On the "inlining the SW controller" discussion:
On the "onfetch" proposal/idea:
Finally, some thoughts on moving forward with this discussion:
Phew. Hopefully that makes sense.
Great summary @igrigorik! I only have one question - in order for developers to use SW (initially), is Server Push a requirement? Or are we saying that SW can still be installed using the markup pattern defined in the spec, but if you want to get SW installed ASAP to (possibly) control loading on the initial page load, then the advice would be "use server push"? |
@jansepar there are no hard dependencies between server push and SW. That said, if you want to accelerate the instantiation of the SW controller, then you can leverage push to avoid the extra round trips (just as you described). Further, the logic here is that, to start, push allows us to experiment with this feature and see how well it works (or not) in practice. Once we have some experience with it, we can revisit the discussion to see if we need more controls and whether inlining, etc., is worth the effort.
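As a concrete sketch of "leverage push to avoid the extra round trips": with HTTP/2 the server can push, or at least hint at, the controller script alongside the page, so its fetch starts before the registration markup is ever parsed. The header below uses today's `Link` preload convention purely as an illustration; applying it to controller scripts is an assumption, not something the thread specifies:

```http
HTTP/1.1 200 OK
Content-Type: text/html
Link: </controller.js>; rel=preload; as=script
```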
Yup, sounds perfectly reasonable! Thanks for that last bit of clarification
Looks like we have rough consensus. It seems plausible that we'll need some sort of HTTP header or other control to allow pages to disable/control the preload scanner, which would allow us to reduce the size of the race further. That's something we should probably propose elsewhere (and as @igrigorik rightly points out, wait on impl experience to understand the need for). Closing the issue for now. Great conversation, all! This thread will be a valuable reference for us in the future if/when we revisit the topic.
Just a small note (I'm glad it was resolved that the best option would be to introduce an HTTP header; this is what I always felt was the right option) but, in some cases, we may want to allow a SW to have access to the requests that were made during the page load before it was installed on the page. In such a case, the worker can let the preloader do its job and still put into one of its specialized caches the content that was fetched independently by the browser before it could handle the requests. This is another way to resolve the race-condition-vs-performance trade-off worth looking at.
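A sketch of the pattern described above — letting the worker adopt resources the browser already fetched before the worker controlled the page. The cache and event objects are stubbed so the flow is runnable outside a browser; the URL list and cache name are purely illustrative:

```javascript
// On install, copy resources the preloader already fetched into the
// worker's own cache, so the browser's early work isn't thrown away.
function makeInstallHandler(caches, alreadyFetchedUrls) {
  return function onInstall(event) {
    event.waitUntil(
      caches.open("page-cache-v1").then(function (cache) {
        return cache.addAll(alreadyFetchedUrls);
      })
    );
  };
}

// Minimal stand-ins for the browser's CacheStorage, for demonstration only.
const backingStore = new Map();
const cachesStub = {
  open: function () {
    return Promise.resolve({
      addAll: function (urls) {
        urls.forEach(function (u) { backingStore.set(u, "cached"); });
        return Promise.resolve();
      },
    });
  },
};

const onInstall = makeInstallHandler(cachesStub, ["/index.html", "/logo.png"]);
onInstall({ waitUntil: function (p) { return p; } });
```

In a real worker, `caches` would be the global `CacheStorage` and the handler would be attached via `self.addEventListener('install', ...)`.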
Copied from my post on the discussion on chromium: https://groups.google.com/a/chromium.org/forum/#!topic/blink-dev/Du9lhfui1Mo
Just found this, and it seems extremely interesting and has lots of potential to be a very useful addition to browsers. I was disappointed to read this bit though:
"The first time http://videos.example.com/index.html is loaded, all the resources it requests will come from the network. That means that even if the browser runs the install snippet for ctrl.js, fetches it, and finishes installing it before it begins fetching logo.png, the new controller script won't be consulted about loading logo.png. This is down to the first rule of Navigation Controllers"
I think there is a lot of value that can come from giving developers the power to have full control over resource loading, even on the first load. For example, having the ability to swap image URLs before they are kicked off by the preloader would be a big win for responsive images. I am the author of the Capturing API (https://hacks.mozilla.org/2013/03/capturing-improving-performance-of-the-adaptive-web/) which provides this exact functionality in a non-optimal way. In order to control resource loading with Capturing, we must first buffer the entire document before being able to manipulate resources, which is a bummer, but its ability to control resources on the page is very, very useful. If the Navigation Controller worked on first page load, the need for Capturing would be eliminated.
It does not seem like total control of resource loading is the goal of the Navigation Controller, but the API is very close to being able to provide exactly that without much change at all. I would love to have a conversation about whether or not adding this functionality is feasible!