diff --git a/src/static/css/2019.css b/src/static/css/2019.css index 2a5fb00d7ac..c2e9dab8aef 100644 --- a/src/static/css/2019.css +++ b/src/static/css/2019.css @@ -15,6 +15,7 @@ body { h1, h2, h3, h4, h5, h6, .subtitle { font-family: 'Poppins', sans-serif; + line-height: 1.2em; } b { @@ -216,7 +217,6 @@ header .btn + .language-switcher { .main a, .main a:visited { color: #0b1423; - word-break: break-all; } h2, h3, h4 { @@ -374,9 +374,9 @@ p.copyright { } } -.visually-hidden { +.visually-hidden { position: absolute !important; - height: 1px; + height: 1px; width: 1px; overflow: hidden; clip: rect(1px 1px 1px 1px); /* IE6, IE7 */ diff --git a/src/static/css/methodology.css b/src/static/css/methodology.css index ee7f063e7a9..0e251e0631b 100644 --- a/src/static/css/methodology.css +++ b/src/static/css/methodology.css @@ -8,7 +8,7 @@ #methodology p, #methodology li { - font-size: 18px; + font-size: 17px; line-height: 30px; } .decorative-line { diff --git a/src/static/css/page.css b/src/static/css/page.css index 53b60328bc7..88018a75f7f 100644 --- a/src/static/css/page.css +++ b/src/static/css/page.css @@ -4,16 +4,22 @@ grid-template-areas: 'index content'; grid-template-columns: 300px auto; } -.table-wrap, +.table-wrap-container, .floating-card { border-radius: 16px; box-shadow: 0 0 16px 0 rgba(78, 85, 100, 0.2); } .table-wrap { - display: inline-block; - margin: 10px 0; + display: flex; + margin: 0 auto; + padding: 16px; max-width: 100%; + justify-content: center; +} +.table-wrap-container { overflow: auto; + display: inline-block; + margin: 0 -16px; } .code-block { @@ -107,7 +113,6 @@ margin-top: 0; padding-top: 0; min-width: 320px; - font-size: 18px; } .content > section { margin-bottom: 64px; @@ -173,7 +178,7 @@ } .authors .tagline{ - font-size: 18px; + font-size: 16px; } .authors .avatar{ @@ -271,14 +276,13 @@ figure iframe { margin: 0 auto; } figcaption { - margin-top: 20px; + margin-top: 8px; text-align: center; } table { margin: 0 auto; max-width: 100%; border-collapse: collapse; - display: block; } thead { font-family: 'Poppins', sans-serif; @@ -354,7 +358,9 @@ figure .big-number { max-height: 100%; transition: max-height 0.25s ease-in; } - + .table-wrap { + justify-content: left; + } table { font-size: .8em; } diff --git a/src/templates/en/2019/chapters/accessibility.html b/src/templates/en/2019/chapters/accessibility.html index e83d1584ff8..d241d8f9615 100644 --- a/src/templates/en/2019/chapters/accessibility.html +++ b/src/templates/en/2019/chapters/accessibility.html @@ -39,9 +39,9 @@ "headline": "{{ metadata.get('title') }}", "image": { "@type": "ImageObject", - "url": "https://almanac.httparchive.org/static/images/{{ year }}/{{ get_chapter_image_dir(metadata) }}/hero_sm.jpg", - "height": 163, - "width": 326 + "url": "https://almanac.httparchive.org/static/images/{{ year }}/{{ get_chapter_image_dir(metadata) }}/hero_lg.jpg", + "height": 433, + "width": 866 }, "publisher": { "@type": "Organization", @@ -265,7 +265,7 @@
There are many cases in which website visitors may not be able to see a website perfectly. Visitors may be colorblind and unable to distinguish between the font and background color (1 in every 12 men and 1 in 200 women of European descent). Perhaps they’re simply reading while the sun is out and creating tons of glare on their screen—significantly impairing their vision. Or maybe they’ve just grown older, and their eyes can't distinguish colors as well as they used to.
To allow your users to be able to read your website under these conditions, be sure that your text has sufficient color contrast with its background.
- +Figure 1. Example of what text with insufficient color contrast looks like. Courtesy of LookZook
So how well did the sites we analyzed do? Only 22.04% of sites gave all of their text sufficient color contrast. Or in other words: 4 out of every 5 sites have text that easily blends into the background, making it unreadable.
@@ -365,7 +365,7 @@
A skip link is a link placed at the top of a page which allows screen readers or keyboard-only users to jump straight to the main content. It effectively "skips" over all navigational links and menus at the top of the page. Skip links are especially useful to keyboard users who don't use a screen reader, as these users don’t usually have access to other modes of quick navigation (like landmarks and headings). 14.19% of the pages in our sample were found to have skip links.
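A minimal sketch of a skip link (the id and class names here are illustrative, not taken from any measured site) is simply an in-page anchor placed as the first focusable element, typically hidden until it receives keyboard focus:
<a href="#main-content" class="skip-link">Skip to main content</a>
...
<main id="main-content">...</main>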
If you’d like to see a skip link in action for yourself, you can! Just do a quick Google search and hit "tab" as soon as you land on the search result pages. You’ll be greeted with a previously hidden link just like the one in Figure 7.
- +Figure 7. What a skip link looks like on google.com
Note: It’s hard to accurately determine what a skip link is when analyzing sites. For this analysis, if we found an anchor link (href=#heading1) within the first 3 links on the page, we defined this as a page with a skip link. So, this 14.19% we reported is an upper bound and could be far worse. diff --git a/src/templates/en/2019/chapters/caching.html b/src/templates/en/2019/chapters/caching.html index 89cfa1e870d..468d4d56869 100644 --- a/src/templates/en/2019/chapters/caching.html +++ b/src/templates/en/2019/chapters/caching.html @@ -39,9 +39,9 @@ "headline": "{{ metadata.get('title') }}", "image": { "@type": "ImageObject", - "url": "https://almanac.httparchive.org/static/images/{{ year }}/{{ get_chapter_image_dir(metadata) }}/hero_sm.jpg", - "height": 163, - "width": 326 + "url": "https://almanac.httparchive.org/static/images/{{ year }}/{{ get_chapter_image_dir(metadata) }}/hero_lg.jpg", + "height": 433, + "width": 866 }, "publisher": { "@type": "Organization", @@ -222,10 +222,10 @@
The tool RedBot.org allows you to input a URL and see a detailed explanation of how the response would be cached based on these headers. For example, a test for the URL above would output the following:
- +If no caching headers are present in a response, then the client is permitted to heuristically cache the response. Most clients implement a variation of the RFC’s suggested heuristic, which is 10% of the time since the resource was last modified. However, some may cache the response indefinitely, so it is important to set explicit caching rules to ensure that you remain in control of cacheability.
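As an illustrative, hypothetical example of heuristic caching, consider a response that carries only a Last-Modified header and no explicit freshness information:
Last-Modified: Mon, 01 Jul 2019 12:00:00 GMT
Date: Thu, 11 Jul 2019 12:00:00 GMT
The resource was last modified 10 days before it was served, so a client following the suggested heuristic would consider it fresh for roughly 10% of that interval, or about 1 day - behavior you only control by sending explicit Cache-Control or Expires headers.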
According to the HTTP Archive, in July 2019 72% of responses were served with a Cache-Control header, and 56% of responses were served with an Expires header. However 27% of responses did not use either header, and are subject to heuristic caching! This was consistent across both desktop and mobile sites.
- +A cacheable resource is stored by the client for a period of time and available for reuse on a subsequent request. Across all HTTP requests, 80% of responses are considered cacheable, meaning that a cache is permitted to store them. Out of these,
The remaining responses are not permitted to be stored in browser caches.
- +The table below details the cache TTL values for desktop requests by type. Most content types are being cached; however, CSS resources appear to be consistently cached at high TTLs.
While most of the median TTLs are high, the lower percentiles highlight some of the missed caching opportunities. For example, the median TTL for images is 28 hours, however the 25th percentile is just 1-2 hours and the 10th percentile indicates that 10% of cacheable image content is cached for less than 1 hour.
- | Desktop Cache TTL Percentiles (Hours) | -||||
type | -p10 | -p25 | -p50 | -p75 | -p90 | -
audio | -12 |
- 24 |
- 720 |
- 8760 |
- 8760 |
-
css | -720 |
- 8760 |
- 8760 |
- 8760 |
- 8760 |
-
font | -< 1 |
- 3 |
- 336 |
- 8760 |
- 87600 |
-
html | -< 1 |
- 168 |
- 720 |
- 8760 |
- 8766 |
-
image | -< 1 |
- 1 |
- 28 |
- 48 |
- 8760 |
-
other | -< 1 |
- 2 |
- 336 |
- 8760 |
- 8760 |
-
script | -< 1 |
- < 1 |
- 1 |
- 6 |
- 720 |
-
text | -21 |
- 336 |
- 7902 |
- 8357 |
- 8740 |
-
video | -< 1 |
- 4 |
- 24 |
- 24 |
- 336 |
-
xml | -< 1 |
- < 1 |
- < 1 |
- < 1 |
- < 1 |
-
+ | Desktop Cache TTL Percentiles (Hours) | +||||
type | +p10 | +p25 | +p50 | +p75 | +p90 | +
audio | +12 |
+ 24 |
+ 720 |
+ 8760 |
+ 8760 |
+
css | +720 |
+ 8760 |
+ 8760 |
+ 8760 |
+ 8760 |
+
font | +< 1 |
+ 3 |
+ 336 |
+ 8760 |
+ 87600 |
+
html | +< 1 |
+ 168 |
+ 720 |
+ 8760 |
+ 8766 |
+
image | +< 1 |
+ 1 |
+ 28 |
+ 48 |
+ 8760 |
+
other | +< 1 |
+ 2 |
+ 336 |
+ 8760 |
+ 8760 |
+
script | +< 1 |
+ < 1 |
+ 1 |
+ 6 |
+ 720 |
+
text | +21 |
+ 336 |
+ 7902 |
+ 8357 |
+ 8740 |
+
video | +< 1 |
+ 4 |
+ 24 |
+ 24 |
+ 336 |
+
xml | +< 1 |
+ < 1 |
+ < 1 |
+ < 1 |
+ < 1 |
+
By exploring the cacheability by content type in more detail, we can see that approximately half of all HTML responses are considered non-cacheable. Additionally, 16% of images and scripts are non-cacheable.
- +The same data for mobile is shown below. The cacheability of content types is consistent between desktop and mobile.
- +In HTTP/1.0, the Expires header was used to indicate the date/time after which the response is considered stale. Its value is an HTTP-date timestamp, such as:
Expires: Thu, 01 Dec 1994 16:00:00 GMT
53% of HTTP responses include a Cache-Control header with the max-age directive, and 54% include the Expires header. However, 41% of these responses use both headers, which means that 13% of responses are caching solely based on the Expires header.
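To make the overlap concrete, here is a hypothetical response carrying both forms of freshness information; when both are present, the max-age directive takes precedence over the Expires header:
Cache-Control: max-age=600
Expires: Sun, 13 Oct 2019 20:00:00 GMT
The 13% of responses relying solely on Expires would omit the first line, leaving freshness entirely dependent on the absolute (and clock-sensitive) timestamp.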
- +The HTTP/1.1 specification includes multiple directives that can be used in the Cache-Control response header and are detailed below. Note that multiple directives can be used in a single response.
Directive | -Description | -
max-age | -Indicates the number of seconds that a resource can be cached for | -
public | -Any cache may store the response. | -
no-cache | -A cached entry must be revalidated prior to it's use | -
must-revalidate | -A stale cached entry must be revalidated prior to its use | -
no-store | -Indicates that a response is not cacheable | -
private | -The response is intended for a specific user and should not be stored by shared caches. | -
no-transform | -No transformations or conversions should be made to this resource | -
proxy-revalidate | -Same as must-revalidate, but applies to shared caches. | -
s-maxage | -Same as max age, but applies to shared caches only | -
immutable | -Indicates that the cached entry will never change, and that revalidation is not necessary. | -
stale-while-revalidate | -Indicates that the client is willing to accept a stale response while asynchronously checking in the background for a fresh one. | -
stale-if-error | -Indicates that the client is willing to accept a stale response if the check for one fails. | -
Directive | +Description | +
max-age | +Indicates the number of seconds that a resource can be cached for | +
public | +Any cache may store the response. | +
no-cache | +A cached entry must be revalidated prior to its use | +
must-revalidate | +A stale cached entry must be revalidated prior to its use | +
no-store | +Indicates that a response is not cacheable | +
private | +The response is intended for a specific user and should not be stored by shared caches. | +
no-transform | +No transformations or conversions should be made to this resource | +
proxy-revalidate | +Same as must-revalidate, but applies to shared caches. | +
s-maxage | +Same as max-age, but applies to shared caches only | +
immutable | +Indicates that the cached entry will never change, and that revalidation is not necessary. | +
stale-while-revalidate | +Indicates that the client is willing to accept a stale response while asynchronously checking in the background for a fresh one. | +
stale-if-error | +Indicates that the client is willing to accept a stale response if the check for one fails. | +
For example, the below header indicates that a cached entry should be stored for 43200 seconds and it can be stored by all caches.
cache-control: public, max-age=43200
The graph below illustrates the top 15 Cache-Control directives in use.
- +There are a few interesting observations about the popularity of these cache directives:
So far we’ve talked about how web servers tell a client what is cacheable, and for how long it should be cached. When designing cache rules, it’s also important to understand how old the content you are serving is.
When you are selecting a cache TTL, ask yourself: “How often are you updating these assets?” and “What is their content sensitivity?”. For example, if a hero image is going to be modified infrequently, then cache it with a very long TTL. If you expect a JavaScript resource to change frequently, then version it and cache it with a long TTL, or cache it with a shorter TTL.
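As a sketch of the versioning approach (the file name and TTL below are hypothetical, not taken from the dataset), a fingerprinted script can safely be given a year-long, immutable cache lifetime because any change to its contents produces a new URL:
GET /static/app.8f3b2c.js HTTP/1.1

HTTP/1.1 200 OK
Cache-Control: public, max-age=31536000, immutable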
The graph below illustrates the relative age of resources by content type, and you can read a more detailed analysis here. HTML tends to be the content type with the shortest age, and a very large % of traditionally cacheable resources (scripts, CSS, and fonts) are older than 1 year!
- +By comparing a resource’s cacheability to its age, we can determine if the TTL used is appropriate or too low. For example, the resource served by this response was last modified on 25 Aug 2019, which means that it was 49 days old at the time of delivery. The Cache-Control header says that we can cache it for 43,200 seconds, which is 12 hours. It is definitely old enough to merit investigating whether a longer TTL would be appropriate.
< HTTP/2 200
< date: Sun, 13 Oct 2019 19:36:57 GMT
@@ -453,32 +457,34 @@ How Do Cache TTLs Compare to
Overall, 59% of resources served on the web have a cache TTL that is too short compared to their content age. Furthermore, the median delta between the TTL and age is 25 days.
When we break this out by first vs third party, we can also see that 70% of first party resources can benefit from a longer TTL. This clearly highlights a need to pay extra attention to what is cacheable, and then to ensure caching is configured correctly.
-
-
-
-
- % of Requests with Short TTLs
-
-
- client
- 1st Party
- 3rd Party
- Overall
-
-
- desktop
- 70.7%
- 47.9%
- 59.2%
-
-
- mobile
- 71.4%
- 46.8%
- 59.6%
-
-
-
+
+
+
+
+
+ % of Requests with Short TTLs
+
+
+ client
+ 1st Party
+ 3rd Party
+ Overall
+
+
+ desktop
+ 70.7%
+ 47.9%
+ 59.2%
+
+
+ mobile
+ 71.4%
+ 46.8%
+ 59.6%
+
+
+
+
Validating Freshness
The HTTP response headers used for validating the responses stored within a cache are Last-Modified and Etag. The Last-Modified header provides the time that the object was last modified, and the Etag header provides a unique identifier for the content.
@@ -509,7 +515,7 @@ Validating Freshness
< etag: "1566748830.0-3052-3932359948"
< accept-ranges: bytes
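To illustrate how these validators are used (a hypothetical sketch reusing the Etag shown above; the timestamp is illustrative), a client holding a cached copy can revalidate it with a conditional request, and a server whose copy is unchanged replies with a 304 and no body:
GET /image.jpg HTTP/1.1
If-None-Match: "1566748830.0-3052-3932359948"
If-Modified-Since: Sun, 25 Aug 2019 16:00:30 GMT

HTTP/1.1 304 Not Modified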
Overall, 65% of responses are served with a Last-Modified header, 42% are served with an Etag, and 38% use both. However, 30% of responses include neither a Last-Modified nor an Etag header.
- +There are a few HTTP headers used to convey timestamps, and the format for these is very important. The Date response header indicates when the resource was served to a client. The Last-Modified response header indicates when a resource was last changed on the server. And the Expires header is used to indicate how long a resource is cacheable (unless a Cache-Control header is present).
All 3 of these headers use an HTTP-date formatted string to represent timestamps.
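Tying these together, a hypothetical response carrying all three timestamps in the required HTTP-date format would look like:
Date: Sun, 13 Oct 2019 19:36:57 GMT
Last-Modified: Sun, 25 Aug 2019 16:00:30 GMT
Expires: Mon, 14 Oct 2019 07:36:57 GMT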
@@ -532,7 +538,7 @@< etag: "1566748830.0-3052-3932359948"
Most clients will ignore invalid date strings, which renders them ineffective for the response they are served on. This can have consequences for cacheability, since an erroneous Last-Modified header will result in the response being cached without a valid Last-Modified timestamp, making conditional requests against it impossible.
The Date HTTP response header is usually generated by the web server or CDN serving the response to a client. Because the header is generated server side, it tends to be less prone to error, which is reflected by the very low percentage of invalid Date headers. Last-Modified headers were very similar, with only 0.67% of them being invalid. What was very surprising to see, though, was that 3.64% of Expires headers used an invalid date format!
- +Examples of some of the invalid uses of the Expires header are:
In general, you should only vary the cache if you are serving alternate content to clients based on that header.
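As a sketch of appropriate use, a compressed response that differs only by encoding would declare just that request header as the cache key modifier:
Content-Encoding: gzip
Vary: Accept-Encoding
Varying on headers with many possible values (User-Agent, for example) fragments the cache into many copies of the same content, which is why Vary should be reserved for headers that genuinely select different response bodies.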
The Vary header is used on 39% of HTTP responses, and 45% of responses that include a Cache-Control header.
The graph below details the popularity for the top 10 Vary header values. Accept-Encoding accounts for 90% of Vary’s use, with User-Agent (11%), Origin (??%), and Accept (??%) making up much of the rest.
- +When a response is cached, its entire set of headers is stored in the cache as well. This is why you can see the response headers when inspecting a cached response via DevTools.
- +But what happens if you have a Set-Cookie on a response? According to RFC 7234 Section 8, the presence of a Set-Cookie response header does not inhibit caching. This means that a cached entry might contain a Set-Cookie if it was cached with one. The RFC goes on to recommend that you should configure appropriate Cache-Control headers to control how responses are cached.
One of the risks of caching responses with Set-Cookie is that the cookie values can be stored and served to subsequent requests. Depending on the cookie’s purpose, this could have worrying results. For example, if a login cookie or a session cookie is present in a shared cache, then that cookie might be reused by another client. One way to avoid this is to use the Cache-Control “private” directive, which only permits the response to be cached by the client browser.
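A minimal sketch of that mitigation (the cookie name and value are hypothetical) keeps the personalized response out of shared caches while still allowing the browser to cache it:
Set-Cookie: sessionid=abc123; HttpOnly; Secure
Cache-Control: private, max-age=300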
According to the HTTP Archive, 3% of cacheable responses contain a Set-Cookie header. Of those responses, only 18% use the private directive. The remaining 82% (5.3 million HTTP responses) include a Set-Cookie that can be cached by public and private cache servers.
- +The Application Cache is a feature of HTML5 that allows developers to specify resources the browser should cache and make available to offline users. This feature was deprecated and removed from web standards, and browser support has been diminishing. In fact, when its use is detected, Firefox v44+ recommends that developers use service workers instead. Chrome 69+ restricts the Application Cache to secure contexts only. The industry has moved more towards implementing this type of functionality with Service Workers, and browser support for them has been growing rapidly.
In fact, one of the HTTP Archive trend reports shows the adoption of Service Workers. Adoption is still below 1% of websites, but it has been steadily increasing since January 2017.
- +In the table below, you can see a summary of AppCache vs ServiceWorker usage. 32,292 websites have implemented a Service Worker, while 1,867 sites are still utilizing the deprecated AppCache feature.
- | Does Not Use Server Worker | -Uses Service Worker | -Total | -
Does Not Use AppCache | -5,045,337 |
- 32,241 |
- 5,077,578 |
-
Uses AppCache | -1,816 |
- 51 |
- 1,867 |
-
Total | -5,047,153 |
- 32,292 |
- 5,079,445 |
-
+ | Does Not Use Service Worker | +Uses Service Worker | +Total | +
Does Not Use AppCache | +5,045,337 |
+ 32,241 |
+ 5,077,578 |
+
Uses AppCache | +1,816 |
+ 51 |
+ 1,867 |
+
Total | +5,047,153 |
+ 32,292 |
+ 5,079,445 |
+
If we break this out by HTTP vs HTTPS, then this gets even more interesting. 581 of the AppCache enabled sites are served over HTTP, which means that Chrome is likely disabling the feature. HTTPS is a requirement for using Service Workers, but 907 of the sites using them are served over HTTP.
- | - | Does Not Use Service Worker | -Uses Service Worker | -
HTTP | -Does Not Use AppCache | -1,968,736 |
- 907 |
-
Uses AppCache | -580 |
- 1 |
- |
HTTPS | -Does Not Use AppCache | -3,076,601 |
- 31,334 |
-
Uses AppCache | -1,236 |
- 50 |
-
+ | + | Does Not Use Service Worker | +Uses Service Worker | +
HTTP | +Does Not Use AppCache | +1,968,736 |
+ 907 |
+
Uses AppCache | +580 |
+ 1 |
+ |
HTTPS | +Does Not Use AppCache | +3,076,601 |
+ 31,334 |
+
Uses AppCache | +1,236 |
+ 50 |
+
Google’s Lighthouse tool enables users to run a series of audits against web pages, and one of them evaluates whether a site can benefit from additional caching. It does this by comparing the content age (via the Last-Modified header) to the Cache TTL and estimating the probability that the resource would be served from cache. Depending on the score, you may see a caching recommendation in the results, with a list of specific resources that could be cached.
Lighthouse computes a score for each audit, ranging from 0% to 100%, and those scores are then factored into the overall scores. The caching score is based on potential byte savings. When we examine the HTTP Archive Lighthouse data, we can get a perspective on how many sites are doing well with their cache policies. Only 3.4% of sites scored a 100%, meaning that most sites can benefit from some cache optimizations. A vast majority of sites score below 40%, with 38% scoring less than 10%. Based on this, there is a significant amount of caching opportunity on the web.
- +Lighthouse also indicates how many bytes could be saved on repeat views by enabling a longer cache policy. Of the sites that could benefit from additional caching, 82% of them can reduce their page weight by up to 1 MB!
- +Caching is an incredibly powerful feature that allows browsers, proxies and other intermediaries (such as CDNs) to store web content and serve it to end users. The performance benefits of this are significant, since it reduces round trip times and minimizes costly network requests.
Caching is also a very complex topic. There are numerous HTTP response headers that can convey freshness as well as validate cached entries, and Cache-Control directives provide a tremendous amount of flexibility and control. However, developers should be cautious about the additional opportunities for mistakes that come with it. Regularly auditing your site to ensure that cacheable resources are cached appropriately is recommended, and tools like Lighthouse and REDbot do an excellent job of helping to simplify the analysis.
diff --git a/src/templates/en/2019/chapters/compression.html b/src/templates/en/2019/chapters/compression.html index 0ba3174e5bf..0690b3bac05 100644 --- a/src/templates/en/2019/chapters/compression.html +++ b/src/templates/en/2019/chapters/compression.html @@ -39,9 +39,9 @@ "headline": "{{ metadata.get('title') }}", "image": { "@type": "ImageObject", - "url": "https://almanac.httparchive.org/static/images/{{ year }}/{{ get_chapter_image_dir(metadata) }}/hero_sm.jpg", - "height": 163, - "width": 326 + "url": "https://almanac.httparchive.org/static/images/{{ year }}/{{ get_chapter_image_dir(metadata) }}/hero_lg.jpg", + "height": 433, + "width": 866 }, "publisher": { "@type": "Organization", @@ -168,88 +168,90 @@Approximately 38% of HTTP responses are delivered with text based compression. This may seem like a surprising statistic, but keep in mind that it is based on all HTTP requests in the archive. Some content, such as images, will not benefit from these compression algorithms. The table below summarizes the percentage of requests served with each content encoding.
- | % of Requests | -Requests | -||
Content Encoding | -desktop | -mobile | -desktop | -mobile | -
none | -62.87% |
- 61.47% |
- 260245106 |
- 285158644 |
-
gzip | -29.66% |
- 30.95% |
- 122789094 |
- 143549122 |
-
br | -7.43% |
- 7.55% |
- 30750681 |
- 35012368 |
-
deflate | -0.02% |
- 0.02% |
- 68802 |
- 70679 |
-
Other / Invalid | -0.02% |
- 0.01% |
- 67527 |
- 68352 |
-
identity | -0.000709% |
- 0.000563% |
- 2935 |
- 2611 |
-
x-gzip | -0.000193% |
- 0.000179% |
- 800 |
- 829 |
-
compress | -0.000008% |
- 0.000007% |
- 33 |
- 32 |
-
x-compress | -0.000002% |
- 0.000006% |
- 8 |
- 29 |
-
+ | % of Requests | +Requests | +||
Content Encoding | +desktop | +mobile | +desktop | +mobile | +
none | +62.87% |
+ 61.47% |
+ 260245106 |
+ 285158644 |
+
gzip | +29.66% |
+ 30.95% |
+ 122789094 |
+ 143549122 |
+
br | +7.43% |
+ 7.55% |
+ 30750681 |
+ 35012368 |
+
deflate | +0.02% |
+ 0.02% |
+ 68802 |
+ 70679 |
+
Other / Invalid | +0.02% |
+ 0.01% |
+ 67527 |
+ 68352 |
+
identity | +0.000709% |
+ 0.000563% |
+ 2935 |
+ 2611 |
+
x-gzip | +0.000193% |
+ 0.000179% |
+ 800 |
+ 829 |
+
compress | +0.000008% |
+ 0.000007% |
+ 33 |
+ 32 |
+
x-compress | +0.000002% |
+ 0.000006% |
+ 8 |
+ 29 |
+
Of the resources that are served compressed, the majority are using either gzip (80%) or brotli (20%). The other compression algorithms are infrequently used.
- +Additionally, there are 67K requests that return an invalid Content-Encoding, such as “none”, “UTF-8”, “base64”, “text”, etc. These resources are likely served uncompressed.
We can’t determine the compression levels from any of the diagnostics collected by the HTTP Archive, but the best practice for compressing content will be:
Most text based resources (such as HTML, CSS and JavaScript) can benefit from gzip and brotli compression. However, it’s often not necessary to use these compression techniques on binary resources, such as images, video and some web fonts because their file formats are already compressed.
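To make the negotiation concrete, here is a hypothetical exchange: the browser advertises the encodings it understands, and the server picks one and labels the compressed body accordingly:
GET /styles.css HTTP/1.1
Accept-Encoding: gzip, deflate, br

HTTP/1.1 200 OK
Content-Type: text/css
Content-Encoding: br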
In the graph below, the top 25 content types are displayed with box sizes representing the relative amount of requests. The color of each box represents how many of these resources were served compressed. Most of the media content is shaded red, which is expected since gzip and brotli would have little to no benefit for them. Most of the text content is shaded green to indicate that they are being compressed. However, the light green shading for some content types indicate that they are not compressed as consistently as the others.
Filtering out the 6 most popular content types allows us to see the rest of these content types more clearly. The application/json and image/svg+xml content types are compressed less than 65% of the time.
Most of the custom web fonts are served without compression, since they are already in a compressed format. However, font/ttf is compressible, yet only 84% of TTF font requests are being served with compression.
The graphs below illustrate the breakdown of compression techniques used for each content type. Looking at the top 3 content types, we can see that across both Desktop and Mobile there are major gaps in compressing some of the most frequently requested content types. 56% of text/html as well as 18% of application/javascript and text/css resources are not being compressed. This presents a significant performance opportunity.
-+
The content types with the lowest compression rates include application/json, text/xml and text/plain. These resources are commonly used for XHR requests to provide data that web applications can use to create rich experiences. Compressing them will likely improve user experience. Vector graphics such as image/svg+xml, and image/x-icon are not often thought of as text based, but they are and sites that use them would benefit from compression.
-+
Across all content types, gzip is the most popular compression algorithm. Brotli compression is used less frequently, and the content types where it appears most are application/javascript, text/css and application/x-javascript. This is likely due to CDNs that automatically apply brotli compression for traffic that passes through them.
In Chapter 5, we learned about third parties and their impact on performance. When we compare compression techniques between first and third parties, we can see that third party content tends to be compressed more than first party content.
Additionally, the percentage of Brotli compression is higher for third party content. This is likely due to the number of resources served from third parties that support Brotli, such as Google and Facebook.
Content Encoding | -FirstParty | -ThirdParty | -FirstParty | -ThirdParty | -
No Text Compression | -66.23% |
- 59.28% |
- 64.54% |
- 58.26% |
-
gzip | -29.33% |
- 30.20% |
- 30.87% |
- 31.22% |
-
br | -4.41% |
- 10.49% |
- 4.56% |
- 10.49% |
-
deflate | -0.02% |
- 0.01% |
- 0.02% |
- 0.01% |
-
other / invalid | -0.01% |
- 0.02% |
- 0.01% |
- 0.02% |
-
Content Encoding | +FirstParty | +ThirdParty | +FirstParty | +ThirdParty | +
No Text Compression | +66.23% |
+ 59.28% |
+ 64.54% |
+ 58.26% |
+
gzip | +29.33% |
+ 30.20% |
+ 30.87% |
+ 31.22% |
+
br | +4.41% |
+ 10.49% |
+ 4.56% |
+ 10.49% |
+
deflate | +0.02% |
+ 0.01% |
+ 0.02% |
+ 0.01% |
+
other / invalid | +0.01% |
+ 0.02% |
+ 0.01% |
+ 0.02% |
+
Google’s Lighthouse tool enables users to run a series of audits against web pages, and one of them evaluates whether a site can benefit from additional text based compression. It does this by attempting to compress resources and evaluate whether an object’s size can be reduced by at least 10% and 1400 bytes. Depending on the score, you may see a compression recommendation in the results, with a list of specific resources that could be compressed.
- +Because the HTTP Archive runs Lighthouse audits for each mobile page, we can aggregate the scores across all sites to learn how much opportunity there is to compress more content. Overall, 62% of websites are passing this audit and almost 23% of websites have scored below a 40. This means that over 1.2 million websites could benefit from enabling additional text based compression.
- +Lighthouse also indicates how many bytes could be saved by enabling text compression. Of the sites that could benefit from text compression, 82% of them can reduce their page weight by up to 1 MB!
- +HTTP Compression is a widely used and highly valuable feature for reducing the size of web content. Both gzip and brotli compression are the dominant algorithms used, and the amount of compressed content varies by content type. Tools like Google Lighthouse can help uncover opportunities to compress content.
At a minimum, websites should use gzip compression for all text based resources, since it is widely supported, easily implemented and has a low processing overhead. Additional savings can be found with brotli compression, although compression levels should be chosen carefully based on whether a resource can be precompressed.
diff --git a/src/templates/en/2019/chapters/css.html b/src/templates/en/2019/chapters/css.html index 4a8320466c7..b22edb30068 100644 --- a/src/templates/en/2019/chapters/css.html +++ b/src/templates/en/2019/chapters/css.html @@ -39,9 +39,9 @@ "headline": "{{ metadata.get('title') }}", "image": { "@type": "ImageObject", - "url": "https://almanac.httparchive.org/static/images/{{ year }}/{{ get_chapter_image_dir(metadata) }}/hero_sm.jpg", - "height": 163, - "width": 326 + "url": "https://almanac.httparchive.org/static/images/{{ year }}/{{ get_chapter_image_dir(metadata) }}/hero_lg.jpg", + "height": 433, + "width": 866 }, "publisher": { "@type": "Organization", diff --git a/src/templates/en/2019/chapters/ecommerce.html b/src/templates/en/2019/chapters/ecommerce.html index 842efb48194..f47fed9e288 100644 --- a/src/templates/en/2019/chapters/ecommerce.html +++ b/src/templates/en/2019/chapters/ecommerce.html @@ -39,9 +39,9 @@ "headline": "{{ metadata.get('title') }}", "image": { "@type": "ImageObject", - "url": "https://almanac.httparchive.org/static/images/{{ year }}/{{ get_chapter_image_dir(metadata) }}/hero_sm.jpg", - "height": 163, - "width": 326 + "url": "https://almanac.httparchive.org/static/images/{{ year }}/{{ get_chapter_image_dir(metadata) }}/hero_lg.jpg", + "height": 433, + "width": 866 }, "publisher": { "@type": "Organization", @@ -365,57 +365,59 @@- | Number of requests | -Payload (KB) | -||
Percentile | -Mobile | -Desktop | -Mobile | -Desktop | -
10 | -4 | -4 | -33.05 | -41.51 | -
25 | -8 | -9 | -118.44 | -129.39 | -
50 | -17 | -19 | -292.61 | -320.17 | -
75 | -32 | -34 | -613.46 | -651.19 | -
90 | -50 | -54 | -1016.45 | -1071.3 | -
+ | Number of requests | +Payload (KB) | +||
Percentile | +Mobile | +Desktop | +Mobile | +Desktop | +
10 | +4 | +4 | +33.05 | +41.51 | +
25 | +8 | +9 | +118.44 | +129.39 | +
50 | +17 | +19 | +292.61 | +320.17 | +
75 | +32 | +34 | +613.46 | +651.19 | +
90 | +50 | +54 | +1016.45 | +1071.3 | +
This whistle-stop tour of HTTP/2 gives the main history and concepts of the newish protocol. As should be apparent from this explanation, the main benefit of HTTP/2 is to address performance limitations of the HTTP/1.1 protocol. There were also security improvements - perhaps most importantly in being able to address the performance issues of using HTTPS, since HTTP/2, even over HTTPS, is often much faster than plain HTTP. Other than the web browser packing the HTTP messages into the new binary format, and the web server unpacking them at the other side, the core basics of HTTP itself stayed roughly the same. This means web applications do not need to make any changes to support HTTP/2 as the browser and server take care of this. Turning it on should be a free performance boost, and because of this adoption should be relatively easy. Of course, there are ways web developers can optimize for HTTP/2 to take full advantage of how it differs.
As mentioned above, Internet protocols are often difficult to adopt since they are ingrained into so much of the infrastructure that makes up the internet. This makes introducing any changes slow and difficult. IPv6 for example has been around for 20 years but has struggled to be adopted. HTTP/2, however, was different as it was effectively hidden in HTTPS (at least for the browser use cases), removing barriers to adoption as long as both the browser and server supported it. Browser support has been very strong for some time and the advent of auto updating evergreen browsers has meant that an estimated 95% of global users support HTTP/2 now. For this Web Almanac we use HTTP Archive which runs a Chrome web crawler on the approximately 5 million top websites (on both Desktop and Mobile with a slightly different set for each). This shows that HTTP/2 is now the majority protocol - an impressive feat just 4 short years after formal standardization:
- +Figure 1 - HTTP/2 usage by request
Looking at the breakdown of all HTTP versions by request we see the following:
Protocol | -Desktop | -Mobile | -Both | -
---|---|---|---|
- | 5.60% | -0.57% | -2.97% | -
HTTP/0.9 | -0.00% | -0.00% | -0.00% | -
HTTP/1.0 | -0.08% | -0.05% | -0.06% | -
HTTP/1.1 | -40.36% | -45.01% | -42.79% | -
HTTP/2 | -53.96% | -54.37% | -54.18% | -
Protocol | +Desktop | +Mobile | +Both | +
---|---|---|---|
+ | 5.60% | +0.57% | +2.97% | +
HTTP/0.9 | +0.00% | +0.00% | +0.00% | +
HTTP/1.0 | +0.08% | +0.05% | +0.06% | +
HTTP/1.1 | +40.36% | +45.01% | +42.79% | +
HTTP/2 | +53.96% | +54.37% | +54.18% | +
Figure 2 - HTTP version usage by request
This shows that HTTP/1.1 and HTTP/2 are the versions used by the vast majority of requests as expected. There are only a very small number of requests on the older HTTP/1.0 and HTTP/0.9 protocols. Annoyingly there is a larger percentage where the protocol was not correctly tracked by the HTTP Archive crawl, particularly on desktop. Digging into this has shown various reasons, some of which I can explain and some of which I can't. Based on spot checks they mostly appear to be HTTP/1.1 requests and, assuming they are, desktop and mobile usage is similar. Despite there being a little larger percentage of noise than I'd like, it doesn't alter the overall message being conveyed here. Other than that, the mobile/desktop similarity is not unexpected - the HTTP Archive crawls using Chrome which supports HTTP/2 for both desktop and mobile. Real world usage may have slightly different stats with some older usage of browsers on both but even then support is widespread so I would not expect a large variation between desktop and mobile.
@@ -222,295 +224,305 @@Looking at the number of requests will skew the results somewhat due to popular requests. For example, many sites load Google Analytics, which does support HTTP/2, and so would show as an HTTP/2 request even if the embedding site itself does not support HTTP/2. On the other hand, popular websites (that tend to support HTTP/2) are also underrepresented in the above stats as they are only measured once (e.g. google.com and obscuresite.com are given equal weighting). There are lies, damn lies and statistics. However, looking at other sources (for example the Mozilla telemetry which looks at real-world usage through the Firefox browser) shows similar statistics.
It is still interesting to look at home pages only to get a rough figure on the number of sites that support HTTP/2 (at least on their home page). Figure 3 shows less support than overall requests, as expected, at around 36%:
Protocol | -Desktop | -Mobile | -Both | -
---|---|---|---|
- | 0.09% | -0.08% | -0.08% | -
HTTP/1.0 | -0.09% | -0.08% | -0.09% | -
HTTP/1.1 | -62.36% | -63.92% | -63.22% | -
HTTP/2 | -37.46% | -35.92% | -36.61% | -
Protocol | +Desktop | +Mobile | +Both | +
---|---|---|---|
+ | 0.09% | +0.08% | +0.08% | +
HTTP/1.0 | +0.09% | +0.08% | +0.09% | +
HTTP/1.1 | +62.36% | +63.92% | +63.22% | +
HTTP/2 | +37.46% | +35.92% | +36.61% | +
Figure 3 - HTTP version usage for home pages
HTTP/2 is only supported by browsers over HTTPS, even though officially HTTP/2 can be used over HTTPS or over unencrypted non-HTTPS connections. As mentioned previously, hiding the new protocol in encrypted HTTPS connections prevents networking appliances which do not understand this new protocol from interfering with (or rejecting!) its usage. Additionally, the HTTPS handshake allows an easy method of the client and server agreeing to use HTTP/2. The web is moving to HTTPS and HTTP/2 turns the traditional argument of HTTPS being bad for performance almost completely on its head. Not every site has made the transition to HTTPS, so HTTP/2 will not even be available to those that have not. Looking at just those sites that use HTTPS, we do see a higher percentage support HTTP/2 at around 55% - similar to the first all requests statistic we started with:
Protocol | -Desktop | -Mobile | -Both | -
---|---|---|---|
- | 0.09% | -0.10% | -0.09% | -
HTTP/1.0 | -0.06% | -0.06% | -0.06% | -
HTTP/1.1 | -45.81% | -44.31% | -45.01% | -
HTTP/2 | -54.04% | -55.53% | -54.83% | -
Protocol | +Desktop | +Mobile | +Both | +
---|---|---|---|
+ | 0.09% | +0.10% | +0.09% | +
HTTP/1.0 | +0.06% | +0.06% | +0.06% | +
HTTP/1.1 | +45.81% | +44.31% | +45.01% | +
HTTP/2 | +54.04% | +55.53% | +54.83% | +
Figure 4 - HTTP version usage for HTTPS home pages
We have shown that browser support is strong, and there is a safe road to adoption, so why does every site (or at least every HTTPS site) not support HTTP/2? Well here we come to the final item for support we have not measured yet: server support. This is more problematic than browser support as, unlike modern browsers, servers often do not automatically upgrade to the latest version. Even when the server is regularly maintained and patched that will often just apply security patches rather than new features like HTTP/2. Let us look first at the server HTTP header for those sites that do support HTTP/2:
Server | -Desktop | -Mobile | -Both | -
---|---|---|---|
nginx | -34.04% | -32.48% | -33.19% | -
cloudflare | -23.76% | -22.29% | -22.97% | -
Apache | -17.31% | -19.11% | -18.28% | -
- | 4.56% | -5.13% | -4.87% | -
LiteSpeed | -4.11% | -4.97% | -4.57% | -
GSE | -2.16% | -3.73% | -3.01% | -
Microsoft-IIS | -3.09% | -2.66% | -2.86% | -
openresty | -2.15% | -2.01% | -2.07% | -
… | -… | -… | -… | -
Server | +Desktop | +Mobile | +Both | +
---|---|---|---|
nginx | +34.04% | +32.48% | +33.19% | +
cloudflare | +23.76% | +22.29% | +22.97% | +
Apache | +17.31% | +19.11% | +18.28% | +
+ | 4.56% | +5.13% | +4.87% | +
LiteSpeed | +4.11% | +4.97% | +4.57% | +
GSE | +2.16% | +3.73% | +3.01% | +
Microsoft-IIS | +3.09% | +2.66% | +2.86% | +
openresty | +2.15% | +2.01% | +2.07% | +
… | +… | +… | +… | +
Figure 5 - Servers used for HTTP/2
Nginx provides package repos that make it easy to install or upgrade to the latest version, so it is no surprise to see it leading the way here. Cloudflare is the most popular CDN and enables HTTP/2 by default, so again it is not surprising to see this as a large percentage of HTTP/2 sites. Incidentally, Cloudflare uses a heavily customised version of nginx as their web server. After this we see Apache at around 20% of usage, followed by some servers who choose to hide what they are, and then the smaller players (LiteSpeed, IIS, Google Servlet Engine and openresty - which is nginx based).
What is more interesting is those sites that do not support HTTP/2:
Server | -Desktop | -Mobile | -Both | -
---|---|---|---|
Apache | -46.76% | -46.84% | -46.80% | -
nginx | -21.12% | -21.33% | -21.24% | -
Microsoft-IIS | -11.30% | -9.60% | -10.36% | -
- | 7.96% | -7.59% | -7.75% | -
GSE | -1.90% | -3.84% | -2.98% | -
cloudflare | -2.44% | -2.48% | -2.46% | -
LiteSpeed | -1.02% | -1.63% | -1.36% | -
openresty | -1.22% | -1.36% | -1.30% | -
… | -… | -… | -… | -
Server | +Desktop | +Mobile | +Both | +
---|---|---|---|
Apache | +46.76% | +46.84% | +46.80% | +
nginx | +21.12% | +21.33% | +21.24% | +
Microsoft-IIS | +11.30% | +9.60% | +10.36% | +
+ | 7.96% | +7.59% | +7.75% | +
GSE | +1.90% | +3.84% | +2.98% | +
cloudflare | +2.44% | +2.48% | +2.46% | +
LiteSpeed | +1.02% | +1.63% | +1.36% | +
openresty | +1.22% | +1.36% | +1.30% | +
… | +… | +… | +… | +
Figure 6 - Servers used for HTTP/1.1 or lower
Some of this will be non-HTTPS traffic that would use HTTP/1.1 even if the server supported HTTP/2, but a bigger issue is those that do not support HTTP/2. In these stats we see a much greater share for Apache and IIS which are likely running older versions. For Apache in particular it is often not easy to add HTTP/2 support to an existing installation as Apache does not provide an official repository to install this from. This often means resorting to compiling from source or trusting a third-party repo - neither of which is particularly appealing to many administrators. Only the latest versions of Linux distributions (RHEL and CentOS 8, Ubuntu 18 and Debian 9) come with a version of Apache which supports HTTP/2 and many servers are not running those yet. On the Microsoft side only Windows Server 2016 and above supports HTTP/2 so again those running older versions cannot support this in IIS. Merging these two stats together we can see the percentage of installs, of each server, that uses HTTP/2:
Server | -Desktop | -Mobile | -
---|---|---|
cloudflare | -85.40% | -83.46% | -
LiteSpeed | -70.80% | -63.08% | -
openresty | -51.41% | -45.24% | -
nginx | -49.23% | -46.19% | -
GSE | -40.54% | -35.25% | -
- | 25.57% | -27.49% | -
Apache | -18.09% | -18.56% | -
Microsoft-IIS | -14.10% | -13.47% | -
… | -… | -… | -
Server | +Desktop | +Mobile | +
---|---|---|
cloudflare | +85.40% | +83.46% | +
LiteSpeed | +70.80% | +63.08% | +
openresty | +51.41% | +45.24% | +
nginx | +49.23% | +46.19% | +
GSE | +40.54% | +35.25% | +
+ | 25.57% | +27.49% | +
Apache | +18.09% | +18.56% | +
Microsoft-IIS | +14.10% | +13.47% | +
… | +… | +… | +
Figure 7 - percentage installs of each server used to provide HTTP/2
It's clear Apache and IIS fall way behind with 18% and 14% of their installed base supporting HTTP/2, and this has to be, at least in part, a consequence of it being more difficult to upgrade them. A full operating system upgrade is often required for many to get this support easily. Hopefully this will get easier as new versions of operating systems become the norm. None of this is a comment on the HTTP/2 implementations here (I happen to think Apache has one of the best implementations), but more on the ease of enabling HTTP/2 in each of these servers - or lack thereof.
The impact of HTTP/2 is much more difficult to measure, especially using the HTTP Archive methodology. Ideally sites should be crawled with both HTTP/1.1 and HTTP/2 and the difference measured, but that is not possible with the statistics we are investigating here. Additionally, measuring whether the average HTTP/2 site is faster than the average HTTP/1.1 site introduces too many other variables that I feel require a more exhaustive study than we can cover here.
One impact that can be measured is in the changing use of HTTP now we are in an HTTP/2 world. Multiple connections were a work around with HTTP/1.1 to allow a limited form of parallelization, but this is in fact the opposite of what usually works best with HTTP/2. A single connection reduces the overhead of TCP setup, TCP slow start, HTTPS negotiation and also allows the potential of cross-request prioritization. The HTTP Archive measures the number of TCP connections per page and that is dropping steadily as more sites support HTTP/2 and use its single connection instead of 6 separate connections:
- +Figure 8 - TCP connections per page
Bundling assets to obtain fewer requests was another HTTP/1.1 workaround that went by many names: bundling, concatenation, packaging, spriting, … etc. It is less necessary when using HTTP/2 as there is less overhead with requests but it should be noted that requests are not free in HTTP/2 and those that experimented with removing bundling completely have noticed a loss in performance. Looking at the number of requests loaded by page over time, we do see a slight decrease in requests, rather than the expected increase:
- +Figure 9 - Total Requests per page
@@ -521,56 +533,60 @@There has also been very little evidence to date that push, even when implemented correctly, results in the performance increase it promised. This is an area that again the HTTP Archive is not best placed to answer, due to the nature of how it runs (a month crawl of popular sites using Chrome in one state) so we won't delve into it too much here, but suffice to say that the performance gains are far from clear cut and the potential problems are real.
Putting that aside let's look at the usage of HTTP/2 push:
Client | -Sites Using HTTP/2 Push | -Sites Using HTTP/2 Push (%) | -
---|---|---|
Desktop | -22,581 | -0.52% | -
Mobile | -31,452 | -0.59% | -
Client | +Sites Using HTTP/2 Push | +Sites Using HTTP/2 Push (%) | +
---|---|---|
Desktop | +22,581 | +0.52% | +
Mobile | +31,452 | +0.59% | +
Figure 10 - Sites using HTTP/2 push
These stats show that the uptake of HTTP/2 push is very low - most likely because of the issues described previously. However, when sites do use push, they tend to use it a lot rather than for one or two assets, as shown in Figure 11:
Client | -Avg Pushed Requests | -Avg KB Pushed | -
---|---|---|
Desktop | -7.86 | -162.38 | -
Mobile | -6.35 | -122.78 | -
Client | +Avg Pushed Requests | +Avg KB Pushed | +
---|---|---|
Desktop | +7.86 | +162.38 | +
Mobile | +6.35 | +122.78 | +
Figure 11 - How much is pushed when it is used
This is a concern as previous advice has been to be conservative with push and to "push just enough resources to fill idle network time, and no more". The above statistics suggest that many resources, of a significant combined size, are pushed. Looking at what is pushed, we see the data in Figure 12:
- +Figure 12 - What asset types is push used for?
JavaScript and then CSS are the overwhelming majority of pushed items, both by volume and by bytes. After this there is a rag tag assortment of images, fonts, data, …etc. At the tail end we see around 100 sites pushing video - which may be intentional or may be a sign of over-pushing the wrong types of assets!
One concern raised by some is that HTTP/2 implementations have repurposed the preload HTTP link header as a signal to push. One of the most popular uses of the preload resource hint is to inform the browser of late-discovered resources like fonts and images, which the browser will not see until, for example, the CSS has been requested, downloaded and parsed. If these are now pushed based on that header, there was a concern that reusing this may result in a lot of unintended pushes. However, the relatively low usage of fonts and images may mean that risk is not being seen as much as was feared. <link rel="preload" ... >
tags are often used in the HTML rather than HTTP link headers, and the meta tags are not a signal to push. Statistics in the resource hints chapter show that less than 1% of sites use the preload HTTP link header, and about the same amount use preconnect which has no meaning in HTTP/2, so this would suggest this is not so much of an issue. Though there are a number of fonts and other assets being pushed, which may be a signal of this. As a counter argument to those complaints, if an asset is important enough to preload, then it could be argued these assets should be pushed if possible, as browsers treat preload hints as very high priority requests anyway. Any performance concern is therefore (again arguably) with the overuse of preload, rather than the resulting HTTP/2 push that happens because of this.
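For reference, this is the kind of HTTP link header being discussed (the asset paths are hypothetical); many HTTP/2 servers will treat it as a push signal unless told otherwise, and some support a nopush parameter to opt a resource out of pushing while keeping the preload hint:
Link: </fonts/brand.woff2>; rel=preload; as=font; crossorigin
Link: </css/site.css>; rel=preload; as=style; nopush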
HTTP/2 is mostly a seamless upgrade that, once your server supports it, you can switch on with no need to change your website or application. Of course, you can optimize for HTTP/2 or stop using HTTP/1.1 workarounds as much, but in general a site will usually work without needing any changes - but just be faster. There are a couple of gotchas to be aware of however that can impact any upgrade and some sites have found these out the hard way.
One cause of issues in HTTP/2 is the poor support of HTTP/2 prioritization. This feature allows multiple requests in progress to make the appropriate use of the connection. This is especially important since HTTP/2 has massively increased the number of requests that can be running on the same connection. Limits of 100 or 128 parallel requests are common in server implementations. Previously the browser had a max of 6 connections per domain and so used its skill and judgement to decide how best to use those connections. Now it rarely needs to queue and can send all requests as soon as it knows about them. This can then lead to the bandwidth being "wasted" on lower priority requests while critical requests are delayed (and incidentally can also lead to swamping your backend server with more requests than it is used to!). HTTP/2 has a complex prioritization model (too complex, many say - hence why it is being reconsidered for HTTP/3!) but few servers honor it properly. This can be because their HTTP/2 implementations are not up to scratch, or because of so-called bufferbloat, where the responses are already en route before the server realizes there is a higher priority request. Due to the varying nature of servers, TCP stacks and locations, it is difficult to measure this for most sites, but with CDNs this should be more consistent. Patrick Meenan created an example test page which deliberately tries to download a load of low-priority, off-screen images before requesting some high-priority on-screen images. A good HTTP/2 server should be able to recognize this and send the high-priority images shortly after they are requested, at the expense of the lower-priority images. A poor HTTP/2 server will just respond in request order and ignore any priority signals. Andy Davies has a page tracking the status of various CDNs for Patrick's test. The HTTP Archive identifies when a CDN is used as part of its crawl, and merging these two datasets gives us the results shown in Figure 13:
CDN | -Prioritizes Correctly? | -Desktop | -Mobile | -Both | -
---|---|---|---|---|
Not using CDN | -Unknown | -57.81% | -60.41% | -59.21% | -
Cloudflare | -Pass | -23.15% | -21.77% | -22.40% | -
Fail | -6.67% | -7.11% | -6.90% | -|
Amazon CloudFront | -Fail | -2.83% | -2.38% | -2.59% | -
Fastly | -Pass | -2.40% | -1.77% | -2.06% | -
Akamai | -Pass | -1.79% | -1.50% | -1.64% | -
- | Unknown | -1.32% | -1.58% | -1.46% | -
WordPress | -Pass | -1.12% | -0.99% | -1.05% | -
Sucuri Firewall | -Fail | -0.88% | -0.75% | -0.81% | -
Incapsula | -Fail | -0.39% | -0.34% | -0.36% | -
Netlify | -Fail | -0.23% | -0.15% | -0.19% | -
OVH CDN | -Unknown | -0.19% | -0.18% | -0.18% | -
CDN | +Prioritizes Correctly? | +Desktop | +Mobile | +Both | +
---|---|---|---|---|
Not using CDN | +Unknown | +57.81% | +60.41% | +59.21% | +
Cloudflare | +Pass | +23.15% | +21.77% | +22.40% | +
Fail | +6.67% | +7.11% | +6.90% | +|
Amazon CloudFront | +Fail | +2.83% | +2.38% | +2.59% | +
Fastly | +Pass | +2.40% | +1.77% | +2.06% | +
Akamai | +Pass | +1.79% | +1.50% | +1.64% | +
+ | Unknown | +1.32% | +1.58% | +1.46% | +
WordPress | +Pass | +1.12% | +0.99% | +1.05% | +
Sucuri Firewall | +Fail | +0.88% | +0.75% | +0.81% | +
Incapsula | +Fail | +0.39% | +0.34% | +0.36% | +
Netlify | +Fail | +0.23% | +0.15% | +0.19% | +
OVH CDN | +Unknown | +0.19% | +0.18% | +0.18% | +
Figure 13 - HTTP/2 prioritization support in common CDNs
This shows that a not insignificant portion of traffic is subject to the identified issue. How much of a problem this is depends on exactly how your page loads and whether high priority resources are discovered late or not for your site, but it does show another complexity to take into consideration.
diff --git a/src/templates/en/2019/chapters/javascript.html b/src/templates/en/2019/chapters/javascript.html index d76bfb399d0..9c7f77df9f8 100644 --- a/src/templates/en/2019/chapters/javascript.html +++ b/src/templates/en/2019/chapters/javascript.html @@ -39,9 +39,9 @@ "headline": "{{ metadata.get('title') }}", "image": { "@type": "ImageObject", - "url": "https://almanac.httparchive.org/static/images/{{ year }}/{{ get_chapter_image_dir(metadata) }}/hero_sm.jpg", - "height": 163, - "width": 326 + "url": "https://almanac.httparchive.org/static/images/{{ year }}/{{ get_chapter_image_dir(metadata) }}/hero_lg.jpg", + "height": 433, + "width": 866 }, "publisher": { "@type": "Organization", @@ -191,7 +191,7 @@<insert graphic of metric 01_07>
At every percentile, processing times are longer for mobile web pages than on desktop. The median total main thread time on desktop is 849 ms, while mobile is at a larger number - 2436ms.
Although this data shows how much longer it can take for a mobile device to process JavaScript than a more powerful desktop machine, mobile devices also vary in terms of computing power. The following chart shows how processing times on a single web page can vary significantly depending on the mobile device class.
- +From "The cost of JavaScript in 2019"
diff --git a/src/templates/en/2019/chapters/markup.html b/src/templates/en/2019/chapters/markup.html index 5615c0740e8..da4f86262ce 100644 --- a/src/templates/en/2019/chapters/markup.html +++ b/src/templates/en/2019/chapters/markup.html @@ -39,9 +39,9 @@ "headline": "{{ metadata.get('title') }}", "image": { "@type": "ImageObject", - "url": "https://almanac.httparchive.org/static/images/{{ year }}/{{ get_chapter_image_dir(metadata) }}/hero_sm.jpg", - "height": 163, - "width": 326 + "url": "https://almanac.httparchive.org/static/images/{{ year }}/{{ get_chapter_image_dir(metadata) }}/hero_lg.jpg", + "height": 433, + "width": 866 }, "publisher": { "@type": "Organization", @@ -172,57 +172,59 @@In 2005, Hixie's survey listed the top few most commonly used elements on pages. The top 3 were html
, head
and body
which he noted as interesting because they are optional and created by the parser if omitted. Given that we use the post-parsed DOM, they'll show up universally in our data. Thus, we'll begin with the 4th most used element. Below is a comparison of the data from then to now (I've included the frequency comparison here as well just for fun).
2005 (per site) | -2019 (per site) | -2019 (frequency) | -
---|---|---|
title | -title | -div | -
a | -meta | -a | -
img | -a | -span | -
meta | -div | -li | -
br | -link | -img | -
table | -script | -script | -
td | -img | -p | -
tr | -span | -option | -
2005 (per site) | +2019 (per site) | +2019 (frequency) | +
---|---|---|
title | +title | +div | +
a | +meta | +a | +
img | +a | +span | +
meta | +div | +li | +
br | +link | +img | +
table | +script | +script | +
td | +img | +p | +
tr | +span | +option | +
Comparing the latest data in Figure 3 to that of Hixie's report from 2005 in Figure 2, we can see that the average size of DOM trees has gotten bigger.
We can see that both the average number of types of elements per page has increased, as well as the maximum numbers of unique elements that we encounter.
@@ -275,14 +277,14 @@Additionally, 15% of desktop pages and 16% of mobile pages contain deprecated elements.
Figure 6 above shows the top 10 most frequently used deprecated elements. Most of these can seem like very small numbers, but perspective matters.
In order to discuss numbers about the use of elements (standard, deprecated or custom), we first need to establish some perspective.
In Figure 7 above, the top 150 element names, counting the number of pages where they appear, are shown. Note how quickly use drops off.
@@ -323,7 +325,7 @@It's interesting, then, to see what the distribution of these elements looks like and which ones have more than 1% use.
@@ -380,7 +382,7 @@Let's compare these to a few of the native HTML elements that are below the 5% bar, for perspective.
You could discover interesting insights like these all day long.
@@ -407,7 +409,7 @@Placing these into the same chart as above for perspective looks something like this (again, it varies slightly based on the dataset).
The interesting thing about these results is that they also introduce a few other ways that our tool can come in very handy. If we're interested in exploring the space of the data, a very specific tag name is just one possible measure. It's definitely the strongest indicator if we can find good "slang" developing. However, what if that's not all we're interested in?
diff --git a/src/templates/en/2019/chapters/media.html b/src/templates/en/2019/chapters/media.html index b69170a80d3..77ef6620295 100644 --- a/src/templates/en/2019/chapters/media.html +++ b/src/templates/en/2019/chapters/media.html @@ -39,9 +39,9 @@ "headline": "{{ metadata.get('title') }}", "image": { "@type": "ImageObject", - "url": "https://almanac.httparchive.org/static/images/{{ year }}/{{ get_chapter_image_dir(metadata) }}/hero_sm.jpg", - "height": 163, - "width": 326 + "url": "https://almanac.httparchive.org/static/images/{{ year }}/{{ get_chapter_image_dir(metadata) }}/hero_lg.jpg", + "height": 433, + "width": 866 }, "publisher": { "@type": "Organization", @@ -158,123 +158,127 @@It is rare to find a webpage that does not use images. Over the years, many different file formats have emerged to help present content on the web, each addressing a different problem. Predominantly, there are 4 main universal image formats: JPEG, PNG, GIF, and SVG. In addition, Chrome has enhanced the media pipeline and added support for a fifth image format: WebP. Other browsers have likewise added support for JPEG 2000 (Safari), JPEG XR (IE and Edge), and HEIC (WebView only in Safari).
Each format has its own merits and has ideal uses for the web. A very simplified summary would break down as:
| Format | Highlights | Drawbacks |
|---|---|---|
| JPEG | Ubiquitously supported; ideal for photographic content | There is always quality loss; most decoders cannot handle high bit depth photographs from modern cameras (> 8 bits per channel); no support for transparency |
| PNG | Like JPEG and GIF, enjoys wide support; lossless; supports transparency, animation, and high bit depth | Much bigger files compared to JPEG; not ideal for photographic content |
| GIF | The predecessor to PNG, best known for animations; lossless | Limited to 256 colors, so there is always visual loss from conversion; very large files for animations |
| SVG | A vector-based format that can be resized without increasing file size; based on math rather than pixels, so it creates smooth lines | Not useful for photographic or other raster content |
| WebP | A newer format that can produce lossless images like PNG and lossy images like JPEG; boasts a 30% average file reduction compared to JPEG, while other data suggests a median file reduction of 10-28% based on pixel volume | Unlike JPEG, it is limited to chroma subsampling, which can make some images appear blurry; not universally supported (only the Chrome, Firefox, and Android ecosystems); fragmented feature support across browser versions |
In aggregate, across all pages, we indeed see the prevalence of these formats. JPEG, one of the oldest formats on the web, is by far the most commonly used image format at 60% of image requests and 65% of all image bytes. Interestingly, PNG is the second most commonly used image format, at 28% of image requests and bytes. The ubiquity of support along with the precision of color and creative content are likely explanations for its wide use. In contrast, SVG, GIF, and WebP share nearly the same usage at 4%. (figure image formats by % of total requests)

Of course, web pages are not uniform in their use of image content. Some depend on images more than others. Look no further than the home page of Google.com and you will see very little imagery compared to a typical news website. Indeed, the median website has 13 images, 61 at the 90th percentile, and a whopping 229 at the 99th percentile. (figure image frequency per page)
| Format | p10 | p25 | p50 | p75 | p90 | p99 |
|---|---|---|---|---|---|---|
| jpg | 0 | 3 | 9 | 20 | 39 | 119 |
| png | 0 | 4 | 4 | 10 | 18 | 49 |
| webp | 0 | 0 | 0 | 0 | 0 | 28 |
| svg | 0 | 0 | 0 | 0 | 2 | 19 |
| gif | 0 | 0 | 0 | 1 | 2 | 14 |
While the median page has 9 JPEGs and 4 PNGs, and GIFs appear only on the top 25% of pages, this doesn't tell us the adoption rate. The use and frequency of each format per page doesn't provide insight into the adoption of the more modern formats. Specifically: what percentage of pages include at least one image in each format?
(figure % of pages using at least 1 image)
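To make the adoption question concrete, here is a minimal sketch, using made-up per-page counts rather than Almanac data, of how "percentage of pages with at least one image of a format" differs from the per-page frequency percentiles above:

```python
# Minimal sketch: turn per-page image counts into adoption rates, i.e. the share
# of pages that include at least one image of a given format. The per-page counts
# below are illustrative only, not taken from the Almanac dataset.
def adoption_rate(pages, fmt):
    """pages is a list of dicts mapping format -> image count for one page."""
    with_format = sum(1 for counts in pages if counts.get(fmt, 0) > 0)
    return with_format / len(pages)

if __name__ == "__main__":
    pages = [
        {"jpg": 9, "png": 4},             # a fairly typical page
        {"jpg": 20, "png": 10, "gif": 1},
        {"png": 2, "webp": 3},
    ]
    for fmt in ("jpg", "png", "webp", "gif", "svg"):
        print(f"{fmt}: {adoption_rate(pages, fmt):.0%} of pages")
```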
@@ -282,117 +286,121 @@There are two ways to look at image file sizes: absolute bytes per resource and bytes per pixel. Looking at absolute bytes per resource, we can examine the distribution of file sizes. (figure image format file size)
| Format | p10 | p25 | p50 | p75 | p90 |
|---|---|---|---|---|---|
| jpg | 4 KB | 9 KB | 24 KB | 68 KB | 166 KB |
| png | 2 KB | 4 KB | 11 KB | 43 KB | 152 KB |
| webp | 4 KB | 7 KB | 17 KB | 41 KB | 90 KB |
| gif | 2 KB | 3 KB | 6 KB | 17 KB | 87 KB |
| svg | 0 KB | 0 KB | 1 KB | 2 KB | 8 KB |
From this we can start to get a sense of how large or small a typical resource is on the web. However, this doesn't give us a sense of the volume of pixels represented on screen for these file distributions. To do this, we can divide each resource's bytes by the natural pixel volume of the image. A lower bytes-per-pixel value indicates a more efficient transmission of visual content. (figure Bytes per pixel)
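As a concrete illustration of the metric just described (the sample values below are hypothetical, not taken from the dataset):

```python
# Minimal sketch: bytes per pixel as described above, i.e. resource bytes divided
# by the image's natural pixel volume (width x height). Lower means a more
# efficient transfer of visual content.
def bytes_per_pixel(resource_bytes, natural_width, natural_height):
    pixel_volume = natural_width * natural_height
    return resource_bytes / pixel_volume

if __name__ == "__main__":
    # A 24 KB JPEG at 400x300 pixels (24 KB is the median JPEG size per the table above)
    print(round(bytes_per_pixel(24 * 1024, 400, 300), 4))  # 0.2048 bytes per pixel
```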
Bytes per pixel, by format:

| Format | p10 | p25 | p50 | p75 | p90 |
|---|---|---|---|---|---|
| jpg | 0.1175 | 0.1848 | 0.2997 | 0.5456 | 0.9822 |
| png | 0.1197 | 0.2874 | 0.6918 | 1.4548 | 2.5026 |
| gif | 0.1702 | 0.3641 | 0.7967 | 2.515 | 8.5151 |
| webp | 0.0586 | 0.1025 | 0.183 | 0.3272 | 0.6474 |
| svg | 0.0293 | 0.174 | 0.6766 | 1.9261 | 4.1075 |
While previously it appeared that GIF files were smaller than JPEG, we can now clearly see that the larger JPEG resources are due to their greater pixel volume. It is probably not a surprise that GIF shows a very low pixel density compared to the other formats. Additionally, PNG, while it can handle high bit depth and doesn't suffer from chroma subsampling blurriness, is about twice the size of JPEG or WebP for the same pixel volume.
Of note, the pixel volume used for SVG is the size of the DOM element on screen (in CSS pixels). While considerably smaller in file size, this hints that SVGs are generally used in smaller portions of the layout, which is why their bytes per pixel appear worse than PNG's.
diff --git a/src/templates/en/2019/chapters/mobile-web.html b/src/templates/en/2019/chapters/mobile-web.html index b14ea51d3c8..2354f0febfe 100644 --- a/src/templates/en/2019/chapters/mobile-web.html +++ b/src/templates/en/2019/chapters/mobile-web.html @@ -39,9 +39,9 @@ "headline": "{{ metadata.get('title') }}", "image": { "@type": "ImageObject", - "url": "https://almanac.httparchive.org/static/images/{{ year }}/{{ get_chapter_image_dir(metadata) }}/hero_sm.jpg", - "height": 163, - "width": 326 + "url": "https://almanac.httparchive.org/static/images/{{ year }}/{{ get_chapter_image_dir(metadata) }}/hero_lg.jpg", + "height": 433, + "width": 866 }, "publisher": { "@type": "Organization", @@ -211,26 +211,28 @@Let's start with what phone the typical mobile user has. The average Android phone is ~$250, and one of the most popular phones in that range is a Samsung Galaxy S6. So this is likely the kind of phone they use, which is actually 4x slower than an iPhone 8. This user doesn't have access to a fast 4G connection, but rather a 2G connection (29% of the time) or 3G connection (28% of the time). And this is what it all adds up to:
| Connection type | 2G or 3G |
|---|---|
| Latency | 300-400 ms |
| Bandwidth | 0.4-1.6 Mbps |
| Phone | Galaxy S6 (4x slower than iPhone 8, per Octane V2 score) |
The state of JavaScript on the mobile web is terrifying. According to HTTP Archive's JavaScript report, the median mobile site requires phones to download 375 KB of JavaScript. Assuming a 70% compression ratio, this means that phones have to parse, compile, and execute 1.25 MB of JavaScript at the median.
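The 1.25 MB figure follows directly from the stated assumption; a quick sanity check of the arithmetic:

```python
# Quick check of the arithmetic above: if 375 KB is the compressed transfer size
# and compression saves 70%, the uncompressed payload the phone must parse,
# compile, and execute is 375 KB / (1 - 0.70) = 1,250 KB, i.e. ~1.25 MB.
transferred_kb = 375
compression_ratio = 0.70  # assumed savings, as stated in the text
uncompressed_kb = transferred_kb / (1 - compression_ratio)
print(f"{uncompressed_kb:.0f} KB ~= {uncompressed_kb / 1000:.2f} MB")  # 1250 KB ~= 1.25 MB
```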
Why is this a problem? Because sites loading this much JS take upwards of 10 seconds to become interactive. Or in other words, your page may appear fully loaded, but when a user clicks any of your buttons or menus, nothing happens because the JavaScript hasn't finished executing. Users are forced to keep clicking the button for upwards of 10 seconds, just waiting for that magical moment where something actually happens. Think about how confusing and frustrating that can be.
Let's delve deeper and look at another metric that focuses more on how well each page utilizes JavaScript. For example, does it really need as much JavaScript as it's loading? We call this metric the JavaScript Bloat Score, based on the web bloat score. The idea behind it is this:
@@ -259,7 +261,7 @@One of the most beautiful parts of the web is how web pages load progressively by nature. Browsers download and display content as soon as they are able, so users can engage with your content as soon as possible. However, this can have a detrimental effect if you don't design your site with this in mind. Specifically, content can shift position as resources load and impede the user experience.
To help us mitigate this problem, there are accessibility guidelines we can follow when choosing our text and background colors. So how are we doing in meeting these baselines? Only 22.04% of sites give all their text sufficient color contrast. This value is actually a lower limit, as we could only analyze text with solid backgrounds. Image and gradient backgrounds were unable to be analyzed.
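The "sufficient color contrast" baseline referenced here is the WCAG 2.x contrast ratio, which requires at least 4.5:1 for normal text at the AA level. A minimal sketch of that check follows; this is the standard WCAG formula, not the Almanac's own analysis pipeline, which also has to resolve each text node's effective foreground and background colors:

```python
# Minimal sketch of the WCAG 2.x contrast-ratio check behind "sufficient color contrast".
def relative_luminance(rgb):
    def channel(c):
        c = c / 255
        return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4
    r, g, b = (channel(c) for c in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def contrast_ratio(fg, bg):
    lighter, darker = sorted((relative_luminance(fg), relative_luminance(bg)), reverse=True)
    return (lighter + 0.05) / (darker + 0.05)

if __name__ == "__main__":
    ratio = contrast_ratio((119, 119, 119), (255, 255, 255))  # #777 text on white
    print(round(ratio, 2), "passes AA (>= 4.5:1)?", ratio >= 4.5)
```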
Designing tap targets appropriately to mitigate this issue can be difficult because of how widely fingers vary in size. However, lots of research has now been done and there are safe standards for how large buttons should be and how far apart they need to be separated.
When analyzing sites containing an email input, 56.42% use type="email". Similarly, for phone inputs, type="tel" is used 36.7% of the time. Other new input types have an even lower adoption rate.
| Type | Frequency (pages) |
|---|---|
| phone | 1,917 |
| name | 1,348 |
| textbox | 833 |
How well are we doing catering to mobile users? According to our research, even though 71% of sites make some kind of effort to adjust their site for mobile, they're falling well below the mark. Pages take forever to load and become usable thanks to an abuse of JavaScript, text is often impossible to read, engaging with sites via clicking links or buttons is error-prone and infuriating, and tons of great technologies invented to mitigate these problems (service workers, autocomplete, zooming, new image formats, etc.) are barely being used at all.
The mobile web has now been around long enough for there to be an entire generation of kids where this is the only internet they've ever known. And what kind of experience are we giving them? We're essentially taking them back to the dial-up era. (Good thing I hear AOL still sells those CDs providing 1000 hours of free internet access!)