From 6f8f52c7111975e6bf5aa92d99244e3e473f2565 Mon Sep 17 00:00:00 2001 From: Barry Pollard Date: Wed, 30 Dec 2020 12:12:02 +0000 Subject: [PATCH] Markup chapter 2020 tables (#1857) * English tables * Dutch tables --- src/content/en/2020/markup.md | 1108 ++++++++++++++++++++++++++------- src/content/nl/2020/markup.md | 1108 ++++++++++++++++++++++++++------- 2 files changed, 1754 insertions(+), 462 deletions(-) diff --git a/src/content/en/2020/markup.md b/src/content/en/2020/markup.md index 5dcd781008b..50b9d7c42a6 100644 --- a/src/content/en/2020/markup.md +++ b/src/content/en/2020/markup.md @@ -51,16 +51,44 @@ In this section, we're covering the higher-level aspects of HTML like document t 96.82% of pages declare a [_doctype_](https://developer.mozilla.org/en-US/docs/Glossary/Doctype). HTML documents declaring a doctype is useful for historical reasons, "to avoid triggering quirks mode in browsers" as [Ian Hickson wrote in 2009](https://lists.w3.org/Archives/Public/public-html-comments/2009Jul/0020.html). What are the most popular values? -
-| Doctype | Pages | Percentage | -|---|---|---| -| HTML ("HTML5") | 5,441,815 | 85.73% | -| XHTML 1.0 Transitional | 382,322 | 6.02% | -| XHTML 1.0 Strict | 107,351 | 1.69% | -| HTML 4.01 Transitional | 54,379 | 0.86% | -| HTML 4.01 Transitional ([quirky](https://hsivonen.fi/doctype/#xml)) | 38,504 | 0.61% | - -
{{ figure_link(caption="The 5 most popular doctypes.", sheets_gid="1981441894", sql_file="summary_pages_by_device_and_doctype.sql") }}
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
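| Doctype | Pages | Percentage |
|---|---|---|
| HTML ("HTML5") | 5,441,815 | 85.73% |
| XHTML 1.0 Transitional | 382,322 | 6.02% |
| XHTML 1.0 Strict | 107,351 | 1.69% |
| HTML 4.01 Transitional | 54,379 | 0.86% |
| HTML 4.01 Transitional ([quirky](https://hsivonen.fi/doctype/#xml)) | 38,504 | 0.61% |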
+
{{ figure_link(caption="The 5 most popular doctypes.", sheets_gid="1981441894", sql_file="summary_pages_by_device_and_doctype.sql") }}
You can already tell how the numbers decrease quite a bit after XHTML 1.0, before entering the long tail with a few standard, some esoteric, and also bogus doctypes. @@ -150,6 +178,7 @@ Overall, around 2% of pages contain no scripting at all, not even structured dat At the opposite end of the spectrum, the numbers show that about 97% of pages contain at least one script, either inline or external. {# TODO(analysts): We still have a problem here with the x-axis label (“Containing”). Can someone help out and look at this? #} + {{ figure_markup( image="script-use.png", caption="Usage of the script element.", @@ -240,21 +269,69 @@ Not that much changed [compared to 2019](../2019/markup#fig-3)! In 2019, the Markup chapter of the Web Almanac featured the most frequently used elements in reference to [Ian Hickson's work in 2005](https://web.archive.org/web/20060203031713/http://code.google.com/webstats/2005-12/elements.html). We found this useful and had a look at that data again: -
-| 2005 | 2019 | 2020 | -|---|---|---| -| `title` | `div` | `div` | -| `a` | `a` | `a` | -| `img` | `span` | `span` | -| `meta` | `li` | `li` | -| `br` | `img` | `img` | -| `table` | `script` | `script` | -| `td` | `p` | `p` | -| `tr` | `option` | `link` | -| | | `i` | -| | | `option`| - -
{{ figure_link(caption="The most popular elements in 2005, 2019, and 2020.", sheets_gid="781932961", sql_file="pages_element_count_by_device_and_element_type_frequency.sql") }}
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
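| 2005 | 2019 | 2020 |
|---|---|---|
| `title` | `div` | `div` |
| `a` | `a` | `a` |
| `img` | `span` | `span` |
| `meta` | `li` | `li` |
| `br` | `img` | `img` |
| `table` | `script` | `script` |
| `td` | `p` | `p` |
| `tr` | `option` | `link` |
| | | `i` |
| | | `option` |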
+
{{ figure_link(caption="The most popular elements in 2005, 2019, and 2020.", sheets_gid="781932961", sql_file="pages_element_count_by_device_and_element_type_frequency.sql") }}
Nothing changed in the Top 7, but the `option` element went a little out of favor and dropped from 8 to 10, letting both the `link` and the `i` element pass in popularity. These elements have risen in use, possibly due to an increase in use of [resource hints](./resource-hints) (as with prerendering and prefetching), as well icon solutions like [Font Awesome](https://fontawesome.com/), which _de facto_ misuses `i` elements for the purpose of displaying icons. @@ -293,13 +370,13 @@ Accordingly, we looked at the number of `details` and `summary` elements and it summary - 62,992 - 43,936 + 62,992 + 43,936 details - 56,60 - 36,743 + 56,60 + 36,743 @@ -310,33 +387,86 @@ Accordingly, we looked at the number of `details` and `summary` elements and it Taking another look at element popularity, how likely is it to find a certain element in the DOM of a page? Surely, `html`, `head`, `body` are present on every page (even though [their tags are all optional](https://meiert.com/en/blog/optional-html/)), making them common elements, but what other elements are to be found? -
-| Element | Probability | -|---|---| -| `title` | 99.34% | -| `meta` | 99.00% | -| `div` | 98.42% | -| `a` | 98.32% | -| `link` | 97.79% | -| `script` | 97.73% | -| `img` | 95.83% | -| `span` | 93.98% | -| `p` | 88.71% | -| `ul` | 87.68% | - -
{{ figure_link(caption="High probabilities of finding a given element in pages of the Web Almanac 2020 sample.", sheets_gid="184700688", sql_file="pages_element_count_by_device_and_element_type_present.sql") }}
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
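| Element | Probability |
|---|---|
| `title` | 99.34% |
| `meta` | 99.00% |
| `div` | 98.42% |
| `a` | 98.32% |
| `link` | 97.79% |
| `script` | 97.73% |
| `img` | 95.83% |
| `span` | 93.98% |
| `p` | 88.71% |
| `ul` | 87.68% |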
+
{{ figure_link(caption="High probabilities of finding a given element in pages of the Web Almanac 2020 sample.", sheets_gid="184700688", sql_file="pages_element_count_by_device_and_element_type_present.sql") }}
Standard elements are those that are or were part of the HTML specification. Which ones are you really rarely to find? In our sample, that would bring up the following: -
-| Element | Probability | -|---|---| -| `dir` | 0.0082% | -| `rp` | 0.0087% | -| `basefont` | 0.0092% | - -
{{ figure_link(caption="Low probabilities of finding a given element in pages of the sample.", sheets_gid="184700688", sql_file="pages_element_count_by_device_and_element_type_present.sql") }}
+
+ + + + + + + + + + + + + + + + + + + + + +
| Element | Probability |
|---|---|
| `dir` | 0.0082% |
| `rp` | 0.0087% |
| `basefont` | 0.0092% |
+
{{ figure_link(caption="Low probabilities of finding a given element in pages of the sample.", sheets_gid="184700688", sql_file="pages_element_count_by_device_and_element_type_present.sql") }}
We're including these elements to give an idea what elements may have gone out of favor. But while `dir` and `basefont` were last specified in XHTML 1.0 (2000), the rare use of `rp`, which has been mentioned [as early as 1998](https://www.w3.org/TR/1998/WD-ruby-19981221/#a2-4) but which is also [still part of HTML](https://html.spec.whatwg.org/multipage/text-level-semantics.html#the-rp-element), may just suggest that Ruby markup is not very popular. @@ -347,25 +477,89 @@ The 2019 edition of the Web Almanac handled [custom elements](../2019/markup#cus {# TODO(authors, analysts): Clarify occurrences and percentages _of what_. Pages? Elements? #} -
-| Element | Occurrences | Percentage | -|---|---|---| -| `ym-measure` | 141,156 | 2.22% | -| `wix-image` | 76,969 | 1.21% | -| `rs-module-wrap` | 71,272 | 1.12% | -| `rs-module` | 71,271 | 1.12% | -| `rs-slide` | 70,970 | 1.12% | -| `rs-slides` | 70,993 | 1.12% | -| `rs-sbg-px` | 70,414 | 1.11% | -| `rs-sbg-wrap` | 70,414 | 1.11% | -| `rs-sbg` | 70,413 | 1.11% | -| `rs-progress` | 70,651 | 1.11% | -| `rs-mask-wrap` | 63,871 | 1.01% | -| `rs-loop-wrap` | 63,870 | 1.01% | -| `rs-layer-wrap` | 63,849 | 1.01% | -| `wix-iframe` | 63,590 | 1% | - -
{{ figure_link(caption="The 14 most popular custom elements.", sheets_gid="770933671", sql_file="pages_element_count_by_device_and_custom_dash_elements.sql") }}
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
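| Element | Occurrences | Percentage |
|---|---|---|
| `ym-measure` | 141,156 | 2.22% |
| `wix-image` | 76,969 | 1.21% |
| `rs-module-wrap` | 71,272 | 1.12% |
| `rs-module` | 71,271 | 1.12% |
| `rs-slide` | 70,970 | 1.12% |
| `rs-slides` | 70,993 | 1.12% |
| `rs-sbg-px` | 70,414 | 1.11% |
| `rs-sbg-wrap` | 70,414 | 1.11% |
| `rs-sbg` | 70,413 | 1.11% |
| `rs-progress` | 70,651 | 1.11% |
| `rs-mask-wrap` | 63,871 | 1.01% |
| `rs-loop-wrap` | 63,870 | 1.01% |
| `rs-layer-wrap` | 63,849 | 1.01% |
| `wix-iframe` | 63,590 | 1% |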
+
{{ figure_link(caption="The 14 most popular custom elements.", sheets_gid="770933671", sql_file="pages_element_count_by_device_and_custom_dash_elements.sql") }}
These elements come from three sources: [Yandex Metrica](https://metrica.yandex.com/about) (`ym-`), an analytics solution we've also seen last year; [Slider Revolution](https://www.sliderrevolution.com/) (`rs-`), a WordPress slider, for which there are more elements to be found near the top of the sample; and [Wix](https://www.wix.com/) (`wix-`), a website builder. @@ -380,20 +574,64 @@ There are more questions to ask about the use of HTML, and one may relate to obs In our mobile dataset of 6.3 million pages, around 0.9 million pages (14.01%) contain one or more of these elements. Here are the top 9, which are used more than 10,000 times: -
-| Element | Occurrences | Pages (%) | -|---|---|---| -| `center` | 458,402 | 7.22% | -| `font` | 430,987 | 6.79% | -| `marquee` | 67,781 | 1.07% | -| `nobr` | 31,138 | 0.49% | -| `big` | 27,578 | 0.43% | -| `frame` | 19,363 | 0.31% | -| `frameset` | 19,163 | 0.30% | -| `strike` | 17,438 | 0.27% | -| `noframes` | 15,016 | 0.24% | - -
{{ figure_link(caption="Obsolete elements with more than 10,000 uses.", sheets_gid="1972617631", sql_file="pages_element_count_by_device_and_obsolete_elements.sql") }}
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
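| Element | Occurrences | Pages (%) |
|---|---|---|
| `center` | 458,402 | 7.22% |
| `font` | 430,987 | 6.79% |
| `marquee` | 67,781 | 1.07% |
| `nobr` | 31,138 | 0.49% |
| `big` | 27,578 | 0.43% |
| `frame` | 19,363 | 0.31% |
| `frameset` | 19,163 | 0.30% |
| `strike` | 17,438 | 0.27% |
| `noframes` | 15,016 | 0.24% |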
+
{{ figure_link(caption="Obsolete elements with more than 10,000 uses.", sheets_gid="1972617631", sql_file="pages_element_count_by_device_and_obsolete_elements.sql") }}
Even `spacer` is still being used 1,584 times, and present on every 5,000th page. We know that Google has been using a `center` element on [their homepage](https://www.google.com/) [for 22 years](https://web.archive.org/web/19981202230410/https://www.google.com/) now, but why are there so many imitators? @@ -406,21 +644,58 @@ If you were wondering: The total number of [`isindex`](https://www.w3.org/TR/htm In our set of elements we found some that were neither standard HTML (nor SVG nor MathML) elements, nor custom ones, nor obsolete ones, but somewhat proprietary ones. The top 10 that we identified are the following: -
-| Element | Pages (%) | -|---|---| -| `noindex` | 0.89% | -| `jdiv` | 0.85% | -| `mediaelementwrapper` | 0.49% | -| `ymaps` | 0.26% | -| `yatag` | 0.20% | -| `ss` | 0.11% | -| `include` | 0.08% | -| `olark` | 0.07% | -| `h7` | 0.06% | -| `limespot` | 0.05% | - -
{{ figure_link(caption="Elements of questionable heritage.", sheets_gid="184700688", sql_file="pages_element_count_by_device_and_element_type_present.sql") }}
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
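| Element | Pages (%) |
|---|---|
| `noindex` | 0.89% |
| `jdiv` | 0.85% |
| `mediaelementwrapper` | 0.49% |
| `ymaps` | 0.26% |
| `yatag` | 0.20% |
| `ss` | 0.11% |
| `include` | 0.08% |
| `olark` | 0.07% |
| `h7` | 0.06% |
| `limespot` | 0.05% |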
+
{{ figure_link(caption="Elements of questionable heritage.", sheets_gid="184700688", sql_file="pages_element_count_by_device_and_element_type_present.sql") }}
The source of these elements appears to be mixed, as in some are unknown while others can be traced. The most popular one, `noindex`, is probably due to [Yandex's recommendation](https://yandex.com/support/webmaster/adding-site/indexing-prohibition.html) of it to prohibit page indexing. `jdiv` was noted in [last year's Web Almanac](../2019/markup#products-and-libraries-and-their-custom-markup) and is from JivoChat. `mediaelementwrapper` comes from the MediaElement media player. Both `ymaps` and `yatag` are also from Yandex. The `ss` element could be from ProStores, a former ecommerce product from eBay, and `olark` may be from the Olark chat software. `h7` appears to be a mistake. `limespot` is probably related to the Limespot personalization program for ecommerce. None of these elements are part of a web standard. @@ -429,28 +704,76 @@ The source of these elements appears to be mixed, as in some are unknown while o [Headings](https://html.spec.whatwg.org/multipage/dom.html#heading-content) make for a special category of elements that play an important role in [sectioning](https://html.spec.whatwg.org/multipage/dom.html#sectioning-content-2) and for [accessibility](https://www.w3.org/WAI/tutorials/page-structure/headings/). -
-| Heading | Occurrences | Average per page | -|---|---|---| -| `h1` | 10,524,810 | 1.66 | -| `h2` | 37,312,338 | 5.88 | -| `h3` | 44,135,313 | 6.96 | -| `h4` | 20,473,598 | 3.23 | -| `h5` | 8,594,500 | 1.36 | -| `h6` | 3,527,470 | 0.56 | - -
{{ figure_link(caption="Frequency and average use of standard heading elements.", sheets_gid="277662548", sql_file="pages_wpt_bodies_by_device_and_percentile_and_heading_level.sql") }}
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
| Heading | Occurrences | Average per page |
|---|---|---|
| `h1` | 10,524,810 | 1.66 |
| `h2` | 37,312,338 | 5.88 |
| `h3` | 44,135,313 | 6.96 |
| `h4` | 20,473,598 | 3.23 |
| `h5` | 8,594,500 | 1.36 |
| `h6` | 3,527,470 | 0.56 |
+
{{ figure_link(caption="Frequency and average use of standard heading elements.", sheets_gid="277662548", sql_file="pages_wpt_bodies_by_device_and_percentile_and_heading_level.sql") }}
You might have expected to only see the standard `<h1>` to `<h6>` elements, but some sites actually use more levels: -
-| Heading | Occurrences | Average per page | -|---|---|---| -| `h7` | 30,073 | 0.005 | -| `h8` | 9,266 | 0.0015 | - -
{{ figure_link(caption="Frequency and average use of non-standard heading elements.", sheets_gid="277662548", sql_file="pages_wpt_bodies_by_device_and_percentile_and_heading_level.sql") }}
+
+ + + + + + + + + + + + + + + + + + + + +
| Heading | Occurrences | Average per page |
|---|---|---|
| `h7` | 30,073 | 0.005 |
| `h8` | 9,266 | 0.0015 |
+
{{ figure_link(caption="Frequency and average use of non-standard heading elements.", sheets_gid="277662548", sql_file="pages_wpt_bodies_by_device_and_percentile_and_heading_level.sql") }}
The last two have never been part of HTML, of course, and should not be used. @@ -463,21 +786,69 @@ This section focuses on how attributes are used in documents and explores patter Similar to the section on the most [popular elements](#top-elements), this section delves into the most popular attributes on the web. Given how important the `href` attribute is for the web itself, or the `alt` attribute in order to make information [accessible](./accessibility), would these be most popular attributes? -
-| Attribute | Occurrences | Percentage | -|---|---|---| -| `class` | 2,998,695,114 | 34.23% | -| `href` | 928,704,735 | 10.60% | -| `style` | 523,148,251 | 5.97% | -| `id` | 452,110,137 | 5.16% | -| `src` | 341,604,471 | 3.90% | -| `type` | 282,298,754 | 3.22% | -| `title` | 231,960,356 | 2.65% | -| `alt` | 172,668,703 | 1.97% | -| `rel` | 171,802,460 | 1.96% | -| `value` | 140,666,779 | 1.61% | - -
{{ figure_link(caption="Top 10 attributes by frequency of use.", sheets_gid="1348855449", sql_file="pages_almanac_by_device_and_attribute_name_frequency.sql") }}
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
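| Attribute | Occurrences | Percentage |
|---|---|---|
| `class` | 2,998,695,114 | 34.23% |
| `href` | 928,704,735 | 10.60% |
| `style` | 523,148,251 | 5.97% |
| `id` | 452,110,137 | 5.16% |
| `src` | 341,604,471 | 3.90% |
| `type` | 282,298,754 | 3.22% |
| `title` | 231,960,356 | 2.65% |
| `alt` | 172,668,703 | 1.97% |
| `rel` | 171,802,460 | 1.96% |
| `value` | 140,666,779 | 1.61% |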
+
{{ figure_link(caption="Top 10 attributes by frequency of use.", sheets_gid="1348855449", sql_file="pages_almanac_by_device_and_attribute_name_frequency.sql") }}
The most popular attribute is `class`, with nearly 3 billion occurrences in our dataset and constituting 34% of all attributes in use. `class` is by far the most prevalent attribute. @@ -488,21 +859,58 @@ The `value` attribute, which specifies the value of an `input` element, surprisi Are there attributes that we find in every document? Not quite, but almost: -
-Element | Pages (%) --- | -- -href | 99.21% -src | 99.18% -content | 98.88% -name | 98.61% -type | 98.55% -class | 98.24% -rel | 97.98% -id | 97.46% -style | 95.95% -alt | 90.75% - -
{{ figure_link( +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
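| Attribute | Pages (%) |
|---|---|
| `href` | 99.21% |
| `src` | 99.18% |
| `content` | 98.88% |
| `name` | 98.61% |
| `type` | 98.55% |
| `class` | 98.24% |
| `rel` | 97.98% |
| `id` | 97.46% |
| `style` | 95.95% |
| `alt` | 90.75% |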
+
{{ figure_link( caption="Top 10 attributes by page.", sheets_gid="1185369559", sql_file="pages_almanac_by_device_and_attribute_name_present.sql" @@ -517,19 +925,59 @@ Per the HTML spec, [`data-*` attributes](https://html.spec.whatwg.org/multipage/ The two most popular ones stand out because they are almost twice as popular than each of the attributes that followed (with >1% use): -
-| Attribute | Occurrences | Percentage | -|---|---|---| -| `data-src` | 26,734,560 | 3.30% | -| `data-id` | 26,596,769 | 3.28% | -| `data-toggle` | 12,198,883 | 1.50% | -| `data-slick-index` | 11,775,250 | 1.45% | -| `data-element_type` | 11,263,176 | 1.39% | -| `data-type` | 11,130,662 | 1.37% | -| `data-requiremodule` | 8,303,675 | 1.02% | -| `data-requirecontext` | 8,302,335 | 1.02% | - -
{{ figure_link(caption="The most popular data-* attributes.", sheets_gid="764700773", sql_file="pages_almanac_by_device_and_data_attribute_name_frequency.sql") }}
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
| Attribute | Occurrences | Percentage |
|---|---|---|
| `data-src` | 26,734,560 | 3.30% |
| `data-id` | 26,596,769 | 3.28% |
| `data-toggle` | 12,198,883 | 1.50% |
| `data-slick-index` | 11,775,250 | 1.45% |
| `data-element_type` | 11,263,176 | 1.39% |
| `data-type` | 11,130,662 | 1.37% |
| `data-requiremodule` | 8,303,675 | 1.02% |
| `data-requirecontext` | 8,302,335 | 1.02% |
+
{{ figure_link(caption="The most popular data-* attributes.", sheets_gid="764700773", sql_file="pages_almanac_by_device_and_data_attribute_name_frequency.sql") }}
Attributes like `data-type`, `data-id`, and `data-src` can have multiple generic uses although `data-src` is used a lot with lazy image loading via JavaScript (e.g., Bootstrap 4). [Bootstrap](https://getbootstrap.com/) again explains the presence of `data-toggle`, where it's used as a state styling hook on toggle buttons. The [Slick carousel plugin](https://kenwheeler.github.io/slick/) is the source of `data-slick-index`, whereas `data-element_type` is part of [Elementor's WordPress website builder](https://elementor.com/). Both `data-requiremodule` and `data-requirecontext`, then, are part of [RequireJS](https://requirejs.org/). @@ -549,17 +997,49 @@ Users should be able to zoom and scale the text [up to 500%](https://dequeuniver We had a look at the data and in order to better understand the results, we normalized it by removing spaces, converting everything to lowercase, and sorting by comma values of the `content` attribute. -
-| Content attribute value | Occurrences | Pages (%) | -|---|---|---| -| `initial-scale=1,width=device-width` | 2,728,491 | 42.98% | -| blank | 688,293 | 10,84% | -| `initial-scale=1,maximum-scale=1,width=device-width` | 373,136 | 5.88% | -| `initial-scale=1,maximum-scale=1,user-scalable=no,width=device-width` | 352,972 | 5.56% | -| `initial-scale=1,maximum-scale=1,user-scalable=0,width=device-width` | 249,662 | 3.93% | -| `width=device-width` | 231,668 | 3.65% | - -
{{ figure_link(caption="viewport specifications, and lack thereof.", sheets_gid="1414206386", sql_file="summary_pages_by_device_and_viewport.sql") }}
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
| Content attribute value | Occurrences | Pages (%) |
|---|---|---|
| `initial-scale=1,width=device-width` | 2,728,491 | 42.98% |
| blank | 688,293 | 10.84% |
| `initial-scale=1,maximum-scale=1,width=device-width` | 373,136 | 5.88% |
| `initial-scale=1,maximum-scale=1,user-scalable=no,width=device-width` | 352,972 | 5.56% |
| `initial-scale=1,maximum-scale=1,user-scalable=0,width=device-width` | 249,662 | 3.93% |
| `width=device-width` | 231,668 | 3.65% |
+
{{ figure_link(caption="viewport specifications, and lack thereof.", sheets_gid="1414206386", sql_file="summary_pages_by_device_and_viewport.sql") }}
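The normalization described before the table (removing spaces, lowercasing, and sorting the comma-separated values of the `content` attribute) could look roughly like the sketch below. The function name `normalize_viewport` is hypothetical and this is not the query the analysts actually ran; it only illustrates why differently written viewport declarations end up in the same row.

```python
def normalize_viewport(content: str) -> str:
    """Normalize a viewport `content` value so equivalent declarations compare equal.

    Hypothetical helper for illustration; not the Web Almanac's actual query.
    """
    # Remove all whitespace and lowercase the whole value.
    compact = "".join(content.split()).lower()
    # Split on commas, drop empty parts, and sort the remaining directives.
    parts = sorted(part for part in compact.split(",") if part)
    return ",".join(parts)


if __name__ == "__main__":
    print(normalize_viewport("Width=device-width, Initial-Scale=1"))
```

With this approach, `width=device-width, initial-scale=1` and `initial-scale=1,width=device-width` are counted as the same value.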
The results show that almost half of the pages we analyzed are using the typical viewport `content` value. Still, around 10% of mobile pages are entirely missing a proper `content` value for the viewport meta element, with the rest of them using an improper combination of `maximum-scale`, `minimum-scale`, `user-scalable=no`, or `user-scalable=0`. @@ -574,20 +1054,64 @@ The situation around favicons is fascinating. Favicons work with or without mark When we built our tests we didn't check for the presence of images, but only looked at the markup. That means, when you review the following, note that it's more about _how_ favicons are referenced rather than whether or how often they are used. -
-| Favicon format | Occurrences | Pages (%) | -|---|---|---| -| ICO | 2,245,646 | 35.38% | -| PNG | 1,966,530 | 30.98% | -| No favicon defined | 1,643,136 | 25.88% | -| JPG | 319,935 | 5.04% | -| No extension specified (no format identifiable) | 37,011 | 0.58% | -| GIF | 34,559 | 0.54% | -| WebP | 10,605 | 0.17% | -| … | | | -| SVG | 5,328 | 0.08% | - -
{{ figure_link(caption="Common favicon formats.", sheets_gid="1930085905", sql_file="pages_almanac_by_device_and_favicon_image_type.sql") }}
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
| Favicon format | Occurrences | Pages (%) |
|---|---|---|
| ICO | 2,245,646 | 35.38% |
| PNG | 1,966,530 | 30.98% |
| No favicon defined | 1,643,136 | 25.88% |
| JPG | 319,935 | 5.04% |
| No extension specified (no format identifiable) | 37,011 | 0.58% |
| GIF | 34,559 | 0.54% |
| WebP | 10,605 | 0.17% |
| SVG | 5,328 | 0.08% |
+
{{ figure_link(caption="Common favicon formats.", sheets_gid="1930085905", sql_file="pages_almanac_by_device_and_favicon_image_type.sql") }}
There are a couple of surprises in here: @@ -609,32 +1133,90 @@ There has been a lot of [discussion](https://adrianroselli.com/2016/01/links-but ) }} {# TODO(analysts): Where do these "occurrences" come from? Ideally we have a single sheet to link to with the results used by this table. #} -
-| Button types | Occurrences | Percentage | -|---|---|---| -| `