Caching 2020 SQL updates #1580

Merged: @tunetheweb merged 2 commits into main from caching-2020-modifications-2 on Nov 29, 2020

@raghuramakrishnan71 raghuramakrishnan71 commented Nov 26, 2020

Progress on #917

@raghuramakrishnan71 raghuramakrishnan71 added the labels analysis (Querying the dataset) and ASAP (This issue is blocking progress) Nov 26, 2020
@raghuramakrishnan71 raghuramakrishnan71 self-assigned this Nov 26, 2020
@raghuramakrishnan71 commented:

@rviscomi
Added resource_age_party_and_type_wise_groups.sql, an improved variant of the earlier query (in line with the chapter):

  • resource_age_party_and_type_wise.sql generates content age percentiles by party and resource type.
  • resource_age_party_and_type_wise_groups.sql generates percentages for specific age buckets, by party and resource type.

With this, the queries, output sheet, and chapter are now in sync.

@tunetheweb tunetheweb left a comment

Some suggested changes, @raghuramakrishnan71.

Also, I don't see the queries that you mentioned above:

  • resource_age_party_and_type_wise.sql generates content age percentiles by party and resource type.
  • resource_age_party_and_type_wise_groups.sql generates percentages for specific age buckets, by party and resource type.

Are they supposed to be part of this PR?

client,
party,
resource_type,
COUNTIF(age_weeks IS NOT NULL) AS requests_total,

This isn't the total number of requests, because it only counts rows where age_weeks is not null, so should this be:

Suggested change
- COUNTIF(age_weeks IS NOT NULL) AS requests_total,
+ COUNT(0) AS requests,
+ COUNTIF(age_weeks IS NOT NULL) AS requests_with_last_modified,
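
To make the distinction concrete, here is a minimal BigQuery Standard SQL sketch. The `sample_requests` rows and the values in them are hypothetical stand-ins for the real request data, not taken from the PR's query; only the two aggregate expressions come from the suggestion above.

```sql
-- Hypothetical inline rows; the real query reads from the crawl dataset instead.
WITH sample_requests AS (
  SELECT * FROM UNNEST([
    STRUCT('desktop' AS client, 'first' AS party, 'script' AS resource_type, 3 AS age_weeks),
    STRUCT('desktop', 'first', 'script', CAST(NULL AS INT64)),  -- no usable age
    STRUCT('desktop', 'first', 'script', 60)
  ])
)
SELECT
  client,
  party,
  resource_type,
  COUNT(0) AS requests,                                          -- every row in the group
  COUNTIF(age_weeks IS NOT NULL) AS requests_with_last_modified  -- only rows with an age
FROM sample_requests
GROUP BY client, party, resource_type
```

With these three sample rows, `requests` is 3 and `requests_with_last_modified` is 2, which is why a name like `requests_total` would be misleading for the second column.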

Comment on lines 18 to 23
COUNTIF(age_weeks < 0) AS age_neg,
COUNTIF(age_weeks = 0) AS age_0wk,
COUNTIF(age_weeks >= 1 AND age_weeks <= 7) AS age_1_to_7wk,
COUNTIF(age_weeks >= 8 AND age_weeks <= 52) AS age_8_to_52wk,
COUNTIF(age_weeks >= 53 AND age_weeks <= 104) AS age_gt_1y,
COUNTIF(age_weeks >= 105) AS age_gt_2y

Should we express these as percentages to allow easier year-on-year comparison?

Suggested change
- COUNTIF(age_weeks < 0) AS age_neg,
- COUNTIF(age_weeks = 0) AS age_0wk,
- COUNTIF(age_weeks >= 1 AND age_weeks <= 7) AS age_1_to_7wk,
- COUNTIF(age_weeks >= 8 AND age_weeks <= 52) AS age_8_to_52wk,
- COUNTIF(age_weeks >= 53 AND age_weeks <= 104) AS age_gt_1y,
- COUNTIF(age_weeks >= 105) AS age_gt_2y
+ SAFE_DIVIDE(COUNTIF(age_weeks < 0), COUNTIF(age_weeks IS NOT NULL)) AS age_neg_pct,
+ SAFE_DIVIDE(COUNTIF(age_weeks = 0), COUNTIF(age_weeks IS NOT NULL)) AS age_0wk_pct,
+ SAFE_DIVIDE(COUNTIF(age_weeks >= 1 AND age_weeks <= 7), COUNTIF(age_weeks IS NOT NULL)) AS age_1_to_7wk_pct,
+ SAFE_DIVIDE(COUNTIF(age_weeks >= 8 AND age_weeks <= 52), COUNTIF(age_weeks IS NOT NULL)) AS age_8_to_52wk_pct,
+ SAFE_DIVIDE(COUNTIF(age_weeks >= 53 AND age_weeks <= 104), COUNTIF(age_weeks IS NOT NULL)) AS age_gt_1y_pct,
+ SAFE_DIVIDE(COUNTIF(age_weeks >= 105), COUNTIF(age_weeks IS NOT NULL)) AS age_gt_2y_pct
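
As a rough illustration of how one of these ratio columns behaves, the sketch below computes a single bucket as both a count and a ratio. The `sample_requests` rows are hypothetical; the bucket boundaries and the `requests_with_age` name follow this thread.

```sql
-- Hypothetical inline rows; 'font' has no usable age, so its denominator is 0.
WITH sample_requests AS (
  SELECT * FROM UNNEST([
    STRUCT('script' AS resource_type, 0 AS age_weeks),
    STRUCT('script', 5),
    STRUCT('script', 5),
    STRUCT('script', 120),
    STRUCT('font', CAST(NULL AS INT64))
  ])
)
SELECT
  resource_type,
  COUNTIF(age_weeks IS NOT NULL) AS requests_with_age,
  COUNTIF(age_weeks >= 1 AND age_weeks <= 7) AS age_1_to_7wk,
  -- SAFE_DIVIDE returns NULL instead of raising an error when the denominator is 0.
  SAFE_DIVIDE(
    COUNTIF(age_weeks >= 1 AND age_weeks <= 7),
    COUNTIF(age_weeks IS NOT NULL)
  ) AS age_1_to_7wk_pct
FROM sample_requests
GROUP BY resource_type
ORDER BY resource_type
```

For the sample data, `script` yields 2 of 4 requests in the bucket, so `age_1_to_7wk_pct` is 0.5, while `font` yields NULL rather than a division error. Note that the `_pct` columns hold fractions between 0 and 1, so they would need multiplying by 100 if a literal percentage is wanted.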

@raghuramakrishnan71 raghuramakrishnan71 Nov 29, 2020

Incorporated the review feedback. Kept both the counts and the percentages (the graphs use a 100% stacked bar chart, which calculates its own percentages automatically). Renamed requests_total to requests_with_age.

  • resource_age_party_and_type_wise_groups.sql is the SQL file you reviewed and commented on.
  • resource_age_party_and_type_wise.sql is already in the main branch.

@rviscomi rviscomi added this to the 2020 Analysis milestone Nov 28, 2020
@rviscomi rviscomi changed the title from "Any changes required during chapter finalization." to "Caching 2020 SQL updates" Nov 28, 2020
@tunetheweb tunetheweb merged commit fba563a into main Nov 29, 2020
@tunetheweb tunetheweb deleted the caching-2020-modifications-2 branch November 29, 2020 23:06