CMS queries #123

Merged
merged 1 commit into from
Sep 15, 2019
22 changes: 22 additions & 0 deletions sql/2019/14_CMS/14_01.sql
@@ -0,0 +1,22 @@
#standardSQL
# 14_01: Top CMSs
SELECT
  _TABLE_SUFFIX AS client,
  app AS cms,
  COUNT(0) AS freq,
  total,
  ROUND(COUNT(0) * 100 / total, 2) AS pct
FROM
  `httparchive.technologies.2019_07_01_*`
JOIN
  (SELECT _TABLE_SUFFIX, COUNT(0) AS total FROM `httparchive.summary_pages.2019_07_01_*` GROUP BY _TABLE_SUFFIX)
USING
  (_TABLE_SUFFIX)
WHERE
  category = 'CMS'
GROUP BY
  client,
  total,
  cms
ORDER BY
  freq DESC
24 changes: 24 additions & 0 deletions sql/2019/14_CMS/14_02.sql
@@ -0,0 +1,24 @@
#standardSQL
# 14_02: AMP plugin version
SELECT
  client,
  amp_plugin_version,
  COUNT(0) AS freq,
  SUM(COUNT(0)) OVER (PARTITION BY client) AS total,
  COUNT(0) / SUM(COUNT(0)) OVER (PARTITION BY client) AS pct
FROM (
  SELECT
    client,
    url,
    REGEXP_EXTRACT(body, '(?i)<meta[^>]+name=[\'"]?generator[^>]+content=[\'"]?AMP Plugin v(\\d+\\.\\d+[^\'">]*)') AS amp_plugin_version
  FROM
    `httparchive.almanac.summary_response_bodies`
  WHERE
    firstHtml)
JOIN
  (SELECT _TABLE_SUFFIX AS client, url FROM `httparchive.technologies.2019_07_01_*` WHERE app = 'WordPress')
USING
  (client, url)
GROUP BY
  client,
  amp_plugin_version
26 changes: 26 additions & 0 deletions sql/2019/14_CMS/14_03.sql
@@ -0,0 +1,26 @@
#standardSQL
# 14_03: AMP plugin mode
SELECT
  client,
  amp_plugin_mode,
  COUNT(DISTINCT url) AS freq,
  SUM(COUNT(DISTINCT url)) OVER (PARTITION BY client) AS total,
  ROUND(COUNT(DISTINCT url) * 100 / SUM(COUNT(DISTINCT url)) OVER (PARTITION BY client), 2) AS pct
FROM (
  SELECT
    client,
    page AS url,
    SPLIT(REGEXP_EXTRACT(body, '(?i)<meta[^>]+name=[\'"]?generator[^>]+content=[\'"]?AMP Plugin v(\\d+\\.\\d+[^\'">]*)'), ';')[SAFE_OFFSET(1)] AS amp_plugin_mode
  FROM
    `httparchive.almanac.summary_response_bodies`
  WHERE
    firstHtml)
INNER JOIN
  (SELECT _TABLE_SUFFIX AS client, url FROM `httparchive.technologies.2019_07_01_*` WHERE app = 'WordPress')
USING
  (client, url)
GROUP BY
  client,
  amp_plugin_mode
ORDER BY
  freq / total DESC
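
The two AMP plugin queries share one generator-tag pattern, so it may help to see what each expression pulls out of a tag before running them against the full response bodies. The following is a minimal sketch, not part of this PR: the three sample tags are made up for illustration and are an assumption about what the markup looks like, and the inline `UNNEST` array stands in for `httparchive.almanac.summary_response_bodies`.

```sql
#standardSQL
# Sketch only: the sample tags are hypothetical, not rows from the dataset.
SELECT
  body,
  # Same expression as 14_02: captures the text after "AMP Plugin v" up to the next quote or ">".
  REGEXP_EXTRACT(body, '(?i)<meta[^>]+name=[\'"]?generator[^>]+content=[\'"]?AMP Plugin v(\\d+\\.\\d+[^\'">]*)') AS amp_plugin_version,
  # Same expression as 14_03: the second ';'-separated piece of that capture, NULL when there is none.
  SPLIT(REGEXP_EXTRACT(body, '(?i)<meta[^>]+name=[\'"]?generator[^>]+content=[\'"]?AMP Plugin v(\\d+\\.\\d+[^\'">]*)'), ';')[SAFE_OFFSET(1)] AS amp_plugin_mode
FROM
  UNNEST([
    '<meta name="generator" content="AMP Plugin v1.2.0; mode=native">',
    '<meta name="generator" content="AMP Plugin v1.0.2">',
    '<meta name="generator" content="WordPress 5.2.2">'
  ]) AS body
```

If a tag carries a `; mode=...` suffix as in the first sample, the version capture should include that suffix, and the mode value should keep the space that follows the semicolon. A tag without the suffix should give a NULL mode, and a non-AMP generator tag should give NULL in both columns.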