Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix various bugs with the GA4 pageview tracker #3626

Merged
merged 5 commits into from
Sep 22, 2023
Merged

Conversation

AshGDS
Copy link
Contributor

@AshGDS AshGDS commented Sep 21, 2023

What

Fixes various bugs with the GA4 pageview tracker. These are:

  1. In the query_string parameter, I was redacting PII before stripping _ga and _gl. This breaks the _ga/_gl stripping code, as the PIIRemover detects the random characters in these values as [date]. And therefore the regex for stripping the values would break, as correctly it doesn't count [ and ] as part of the _ga/_gl values.
  2. Fixes html lang='en' never being grabbed in certain scenarios. Previously, if the #content element existed, it would either grab the language value from it, or return this.nullValue straight away. This isn't correct, as we want to check the HTML language value as well before returning this.nullValue. Ironically, we couldn't figure out how to test the <html> element, and if we could we would've spotted this sooner!
  3. The highest level of PII redaction was removed by accident for getLocation in Redact GA params from pageview data #3568 and getTitle in Add personally identifiable information (PII) remover to GTM #2842. The default stripPII function only does extra redaction if certain meta tags exist. We usually use stripPIIWithOverride so that we can enable date and postcode redaction ourselves.
    This is a consequence of porting the PIIRemover directly from UA, and not really knowing what we wanted to redact at the time. Eventually we ended up needing to always use the highest level of redaction, so we created stripPIIWithOverride to allow this, whereas UA would use stripPII which would only redact emails, state, password tokens and unlock tokens. stripPII would only be stricter and redact dates/postcodes if certain meta tags existed on the page. Fortunately on finder-frontend, one of the meta tags existed <meta name="govuk:static-analytics:strip-postcodes" content="true"> so postcodes weren't being tracked in searches. Dates however are coming through so this will fix that. This change will also ensure the data/postcode redaction is site-wide.

Why

  1. Will fix a bug with this card https://trello.com/c/Rb45fH1B/633-search-enhancement-new-attribute-query-string
  2. Will fix a bug with this card https://trello.com/c/ljF64JLm/377-page-view-enhancement-content-language-not-consistently-tracked
  3. Will prevent PII coming through/keep our PII consistent

Visual Changes

None.

This fixes a bug where the PIIRemove replaces the _ga/_gl values with [date], and then the regex to strip the values to become _ga=[id]/_gl=[id] does not work properly
If the #content element existed, but had no language on it, it would return this.nullValue instead of jumping out to check the html language value instead. This has been fixed, so that it wont return null when content doesnt have a language on it. It will now only return null if both #content and html do not have a language on them
@govuk-ci govuk-ci temporarily deployed to components-gem-pr-3626 September 21, 2023 10:16 Inactive
@govuk-ci govuk-ci temporarily deployed to components-gem-pr-3626 September 21, 2023 10:26 Inactive
@AshGDS AshGDS changed the title Ga4 pageview fixes Fix various bugs with the GA4 pageview tracker Sep 21, 2023
@AshGDS AshGDS self-assigned this Sep 21, 2023
@govuk-ci govuk-ci temporarily deployed to components-gem-pr-3626 September 21, 2023 10:38 Inactive
@AshGDS AshGDS marked this pull request as ready for review September 21, 2023 10:38
PII for `getLocation` was accidentally removed in #3568, and missed in #2842 for `getTitle`
@govuk-ci govuk-ci temporarily deployed to components-gem-pr-3626 September 21, 2023 10:44 Inactive
@AshGDS AshGDS requested a review from andysellick September 21, 2023 10:56
@@ -148,7 +148,9 @@ window.GOVUK.analyticsGa4.analyticsModules = window.GOVUK.analyticsGa4.analytics
var content = document.getElementById('content')
var html = document.querySelector('html')
if (content) {
return content.getAttribute('lang') || this.nullValue
if (content.getAttribute('lang')) {
return content.getAttribute('lang')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since you're calling this twice, you should probably store it in a variable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whoops, good spot 👍

@govuk-ci govuk-ci temporarily deployed to components-gem-pr-3626 September 21, 2023 14:46 Inactive
@AshGDS
Copy link
Contributor Author

AshGDS commented Sep 21, 2023

Thanks @andysellick - should be ready for a re-review 👍

@AshGDS AshGDS merged commit 2430a72 into main Sep 22, 2023
@AshGDS AshGDS deleted the ga4-pageview-fixes branch September 22, 2023 09:08
AshGDS added a commit that referenced this pull request Sep 22, 2023
Fix various bugs with the GA4 pageview tracker
@andysellick andysellick mentioned this pull request Sep 26, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants