Third Parties 2020 #901

Closed
10 tasks done
foxdavidj opened this issue Jun 27, 2020 · 25 comments
Labels
2020 chapter Tracking issue for a 2020 chapter writing Related to wording and content

Comments

@foxdavidj
Contributor

foxdavidj commented Jun 27, 2020

Part I Chapter 6: Third Parties

Content team

Authors: @simonhearne
Reviewers: @tammyeverts, @jzyang, @exterkamp
Analysts: @max-ostapenko
Draft: Doc
Queries: *.sql
Results: Sheet

Content team lead: @simonhearne

Welcome chapter contributors! You'll be using this issue throughout the chapter lifecycle to coordinate on the content planning, analysis, and writing stages.

The content team is made up of the following contributors:

New contributors: If you're interested in joining the content team for this chapter, just leave a comment below and the content team lead will loop you in.

Note: To ensure that you get notifications when tagged, you must be "watching" this repository.

Milestones

0. Form the content team

  • Jul 6th: Project owners have selected an author to be the content team lead
  • Jul 13th: The content team has at least one author, reviewer, and analyst (minimally viable team formed)

1. Plan content

  • Jul 20th: The content team has completed the chapter outline in the draft doc
  • Jul 27th: Analysts have triaged the feasibility of all proposed metrics

2. Gather data

  • Aug 1 - 31: August crawl
  • Sep 7th: Analysts have queried all metrics and saved the output to the results sheet

3. Validate results

4. Draft content

  • Nov 12th: Authors have completed the first draft in the doc
  • Nov 26th: The content team has prototyped all data visualizations

5. Publication

  • Nov 26th: The content team has reviewed the final draft, converted to markdown, and filed a PR to add it to the 2020 content directory
  • Dec 9th: Target launch date
@foxdavidj foxdavidj added help wanted Extra attention is needed analysis Querying the dataset writing Related to wording and content labels Jun 27, 2020
@foxdavidj foxdavidj added this to the 2020 Content Planning milestone Jun 27, 2020
@rviscomi rviscomi added the 2020 chapter Tracking issue for a 2020 chapter label Jun 27, 2020
@max-ostapenko
Contributor

I'd like to participate as an analyst in this chapter.

@rviscomi
Member

rviscomi commented Jul 1, 2020

@simonhearne thank you for agreeing to be the lead author for the Third Parties chapter! As the lead, you'll be responsible for driving the content planning and writing phases in collaboration with your content team, which will consist of yourself as lead, any coauthors you choose as needed, peer reviewers, and data analysts.

The immediate next steps for this chapter are:

  1. Establish the rest of your content team. The larger the scope of the chapter, the more people you'll want to have on board.
  2. Start sketching out ideas in your draft doc.
  3. Catch up on last year's chapter and the project methodology to get a sense for what's possible.

There's a ton of info in the top comment, so check that out and feel free to ping me or @OBTo with any questions!

To anyone else interested, we'd still love to have you contribute as a peer reviewer, data analyst, or coauthor as needed. Let us know!

@tammyeverts

tammyeverts commented Jul 1, 2020 via email

@rviscomi
Member

rviscomi commented Jul 1, 2020

Great, thank you @tammyeverts, it's great to have you on board again! I'm also super excited because this is the first chapter to have all three author/reviewer/analyst roles filled 🥳

@rviscomi rviscomi added help wanted: reviewers This chapter is looking for reviewers help wanted: analysts This chapter is looking for data analysts and removed help wanted Extra attention is needed labels Jul 2, 2020
@paulcalvano paulcalvano removed the help wanted: analysts This chapter is looking for data analysts label Jul 6, 2020
@foxdavidj
Contributor Author

Hey @simonhearne, just checking in:

  1. How is the chapter coming along? We're trying to have the outline and metrics settled by the end of the week so we have time to configure the Web Crawler to track everything you need.
  2. Can you remind your team to properly add and credit themselves in your chapter's Google Doc?
  3. Anything you need from me to keep things moving forward?

@rockeynebhwani
Copy link
Contributor

@simonhearne - Please see this thread https://discuss.httparchive.org/t/how-many-and-which-resources-have-timing-allow-origin-for-resource-timing/152/10.

I think it would be a good idea to call out the most used third parties without Timing-Allow-Origin headers in this year's third party chapter, in the hope that some of these third parties start to pay attention. What do you think?
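
As a rough illustration of that idea, here is a minimal Python sketch that ranks third-party domains by how many of their responses lack a Timing-Allow-Origin header. The input shape (pairs of domain and response headers), the function name `rank_missing_tao`, and the sample domains are all hypothetical; this is not the HTTP Archive schema or the chapter's actual query.

```python
from collections import Counter


def rank_missing_tao(requests, top_n=10):
    """Rank third-party domains by the number of responses without Timing-Allow-Origin.

    `requests` is an iterable of (third_party_domain, response_headers) pairs.
    This shape is an assumption made for illustration only.
    """
    missing = Counter()
    for domain, headers in requests:
        # HTTP header names are case-insensitive, so normalize before the lookup.
        names = {name.lower() for name in headers}
        if "timing-allow-origin" not in names:
            missing[domain] += 1
    # Highest counts first; ties keep first-seen order.
    return missing.most_common(top_n)


# Made-up sample data.
sample = [
    ("ads.example", {"Content-Type": "text/javascript"}),
    ("ads.example", {"Content-Type": "image/gif"}),
    ("cdn.example", {"Timing-Allow-Origin": "*"}),
    ("tags.example", {"content-type": "text/javascript"}),
]
print(rank_missing_tao(sample))
```

Note that this only checks whether the header is present; it does not check whether the header's value actually allows the requesting origin.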

@jzyang
Contributor

jzyang commented Jul 15, 2020

@simonhearne I'd love to help out with reviewing :)

@simonhearne
Contributor

Hey all, back(-ish) from PTO today.

Thanks to those offering help, could you please add yourself to the relevant line in the Content Team section of the doc:

@tammyeverts - Reviewer
@jzyang - Reviewer
@max-ostapenko - Analyst

I think this gives us enough folks to start preparing the content! I would appreciate feedback on the outline section, comments in the doc preferred.

@simonhearne
Contributor

For ongoing communication, what would you prefer:

👍🏻 use this issue
🚀 comments & chat in google doc
👀 slack channel in the HTTPArchive slack

@max-ostapenko
Contributor

Sent a file access request.

@rockeynebhwani
Contributor

rockeynebhwani commented Jul 24, 2020

@simonhearne - moving this from slack. @rviscomi suggested this chapter for this topic.

It would be good to cover tag manager usage in this chapter. For example:

  • % of sites using more than one tag manager (e.g. GTM / Adobe DTM / Signal on the home page)
  • % of sites using multiple instances of the same tag manager provider (for example, https://www.schuh.co.uk/ has two different instances of Google Tag Manager). I think you are aware why they do this.
  • % of third parties initiated from Tag Managers Vs Directly on sites where TagManager is in use.
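
A minimal Python sketch of how the first two percentages could be computed, assuming a hypothetical input that maps each page URL to the (vendor, container ID) pairs detected on it. The function name, vendor labels, and container IDs are invented for illustration, and the denominator is assumed to be every page passed in.

```python
from collections import Counter


def tag_manager_stats(pages):
    """Compute tag manager usage percentages.

    `pages` maps a page URL to a list of (vendor, container_id) pairs.
    This shape is an assumption, not an HTTP Archive or Wappalyzer format.
    """
    total = len(pages)
    if total == 0:
        return {"pct_multiple_vendors": 0.0, "pct_multiple_instances": 0.0}

    multi_vendor = 0
    multi_instance = 0
    for detections in pages.values():
        # Count distinct containers per vendor; set() drops repeated detections.
        per_vendor = Counter(vendor for vendor, _container in set(detections))
        if len(per_vendor) > 1:
            multi_vendor += 1
        if any(count > 1 for count in per_vendor.values()):
            multi_instance += 1

    return {
        "pct_multiple_vendors": 100 * multi_vendor / total,
        "pct_multiple_instances": 100 * multi_instance / total,
    }


# Made-up sample data.
sample = {
    "https://a.example/": [("gtm", "GTM-AAA"), ("gtm", "GTM-BBB")],
    "https://b.example/": [("gtm", "GTM-CCC"), ("adobe-dtm", "dtm-1")],
    "https://c.example/": [("gtm", "GTM-DDD")],
    "https://d.example/": [],
}
print(tag_manager_stats(sample))
```

Restricting the denominator to pages that use at least one tag manager would give different percentages, so whichever choice is made should be stated alongside the result.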

I see there is a Tag Manager category in Wappalyzer (only 5 tag managers so far, but it can easily be improved).

Also, we will see more tag managers taking a server-side forwarding approach this year, which should improve performance, so that could also be added to the chapter as something to look forward to (https://twitter.com/simoahava/status/1222459714614841346?lang=en).

Thoughts?

@tunetheweb
Member

Also the size of tag manager JS. Containers can quickly get out of control and become massive when old tags (that often aren't ever even fired!) continue to hang around, clogging up the JS.

@foxdavidj
Contributor Author

@simonhearne @max-ostapenko for the two milestones overdue on July 27 could you check the boxes if:

  • the outline has been reviewed and all feasible metrics have been identified
  • any necessary custom metrics have been created and you've created a draft PR to track which feasible metrics have had their queries implemented (we've updated the milestone description to clarify this)

Keeping the milestone checklist up to date helps us to see at a glance how all of the chapters are progressing. Thanks for helping us to stay on schedule!

@max-ostapenko
Contributor

Thanks for the heads-up.
Will go and update the state.

@exterkamp
Contributor

Is this section still in need of reviewers? I'd be happy to help.

@rviscomi
Member

Thanks @exterkamp! @simonhearne can you help onboard Shane?

@foxdavidj
Contributor Author

I've updated the chapter metadata at the top of this issue to link to the public spreadsheet that will be used for this chapter's query results. The sheet serves 3 purposes:

  1. Enable authors/reviewers to analyze the results for each metric without running the queries themselves
  2. Generate data visualizations to be embedded in the chapter
  3. Serve as a public audit trail of this chapter's data collection/analysis, linked from the chapter footer

@rviscomi
Member

rviscomi commented Sep 2, 2020

I learned about some really interesting work @patrickhulce has been doing on correlating third parties to Core Web Vitals performance. I suggested that could be an interesting area of exploration for this chapter and he's open to joining the team as a reviewer. @simonhearne is that something you'd be interested in?

@patrickhulce
Contributor

Happy to help but I also understand if we don't want too many (repeat) cooks in the kitchen :) let me know what would be most helpful here!

@max-ostapenko
Contributor

@simonhearne first data and charts are ready for review.

@max-ostapenko
Contributor

max-ostapenko commented Oct 16, 2020

Happy to help but I also understand if we don't want too many (repeat) cooks in the kitchen :) let me know what would be most helpful here!

@patrickhulce did you have any particular query in mind providing correlation analysis?

@patrickhulce
Contributor

patrickhulce commented Oct 16, 2020

did you have any particular query in mind providing correlation analysis?

The correlation analysis @rviscomi was referring to in #901 (comment) was separate from HTTP Archive and involved blocking, so I don't have any specific query suggestions for correlation.

The only suggestion I have on the current queries is that several of them focus on a metric that is completely normalized by frequency, so they end up yielding mostly obscure, really uncommon third parties that might not be the most interesting to analyze (and that have "unknown" categories as a result too). For example, it might be more useful for most readers to look at "most popular 100 sorted by median body size" or something similar instead of "top 100 by median body size".
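
To make the difference between the two rankings concrete, here is a small Python sketch. It assumes a hypothetical mapping from third-party name to a list of response body sizes (one entry per request); the function name, parameters, and sample values are invented for illustration and are not the chapter's actual queries or data.

```python
from statistics import median


def rank_by_median_size(body_sizes, popular_n=None, top_n=3):
    """Rank third parties by median response body size.

    `body_sizes` maps a third-party name to a list of body sizes in bytes,
    one per request (a hypothetical shape). If `popular_n` is given, only the
    `popular_n` most frequently requested third parties are ranked.
    """
    candidates = list(body_sizes)
    if popular_n is not None:
        # Pre-filter by popularity, measured here as request count.
        candidates = sorted(
            candidates, key=lambda name: len(body_sizes[name]), reverse=True
        )[:popular_n]
    ranked = sorted(
        candidates, key=lambda name: median(body_sizes[name]), reverse=True
    )
    return [(name, median(body_sizes[name])) for name in ranked[:top_n]]


# Made-up sample data: one rarely seen but heavy third party, three common ones.
sample = {
    "obscure-widget": [900_000],
    "popular-analytics": [20_000] * 51,
    "popular-video": [400_000] * 41,
    "popular-font": [30_000] * 31,
}

# "Top N by median body size": the obscure third party leads the list.
print(rank_by_median_size(sample, top_n=2))
# "Most popular N sorted by median body size": only common third parties appear.
print(rank_by_median_size(sample, popular_n=3, top_n=2))
```

With the popularity filter applied first, a third party seen only once can no longer dominate the ranking.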

If there are big ones missing from categorization, we can also try to plug that gap. We had ~98% coverage by request count last I checked, but coverage as a % of all possible third parties is much, much lower.

Great work here everyone!

@foxdavidj
Contributor Author

@simonhearne in case you missed it, we've adjusted the milestones to push the launch date back from November 9 to December 9. This gives all chapters exactly 7 weeks from now to wrap up the analysis, write a draft, get it reviewed, and submit it for publication. So the next milestone will be to complete the first draft by November 12.

However if you're still on schedule to be done by the original November 9 launch date we want you to know that this change doesn't mean your hard work was wasted, and that you'll get the privilege of being part of our "Early Access" launch.

Please see the link above for more info and reach out to @rviscomi or me if you have any questions or concerns about the timeline. We hope this change gives you a bit more breathing room to finish the chapter comfortably and we're excited to see it go live!

@rviscomi rviscomi removed the help wanted: reviewers This chapter is looking for reviewers label Nov 15, 2020
@rviscomi
Member

rviscomi commented Nov 15, 2020

I've added @exterkamp as a reviewer, per his offer to help in the Slack channel. Shane, can you create a PR to add your info to the 2020.json config file? The first draft is coming along but not done yet, so there are still opportunities to help. Thanks!

@rviscomi rviscomi added ASAP This issue is blocking progress ⚠️ LAUNCH RISK ⚠️ and removed analysis Querying the dataset labels Nov 30, 2020
@rviscomi
Member

rviscomi commented Dec 7, 2020

@simonhearne please have your markdown submitted by EOD today to be included in Wednesday's launch.
