Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Data Mart : Part C before and after analysis #14

Open
praveen555 opened this issue Aug 11, 2023 · 5 comments
Open

Data Mart : Part C before and after analysis #14

praveen555 opened this issue Aug 11, 2023 · 5 comments

Comments

@praveen555
Copy link

Using the week_number to filter out the dates would result in incorrect analysis. A simple query will show that week_number shift in 2020.

image

Hence it is better to use date_add, date_sub and interval to filter out the correct dates.

@Lesego-Serobatse
Copy link

Lesego-Serobatse commented Aug 11, 2023

Well that is quite thoughtful of you to do that. And I'm a little bit worried about your part C analysis, I have noticed you included the date 2020-06-15, and the question really asks for 4 weeks before and after. Now the 'after' part cannot include the date 2020-06-15 reported sales, because those are accumulated(or aggregated rather) sales from Tuesday week before(2020-06-09). So you ought to count the weeks from the (2020-06-22) and exclude the 2020-06-15 date. Otherwise it will mess up the analysis. I just peeked at your code praveen555, I think Katie made that error as well, but I think Katies error was by default mainly because of the cleaning part which I raised.

@praveen555
Copy link
Author

As per the case study

"We would include all week_date values for 2020-06-15 as the start of the period after the change and the previous week_date values would be before"

note the word- include
what i understood..

  1. week_date != week_number
  2. week_date values for 2020-06-15 should be included in after analysis (as per question) and the days (2020-06-14 and before should be for before analysis.
  3. My logic is not based on week_number but the actual dates +/- 4 weeks from 2020-06-15

@Lesego-Serobatse
Copy link

Oh yeah, thank you very much, I am actually the one that missed that part. Thanks Praveen, I appreciate your input in all of this.

@praveen555
Copy link
Author

yes now that i think about it more in depth date_add('2020-06-15', interval 4 week) should be used. I used 3 weeks influenced by it. Because all dates between 2020-06-15 and 2020-07-12 should be included for after analysis. That would give the correct results as per questions.

@Lesego-Serobatse
Copy link

Nah Praveen, I think you were right when you said we ought to include the week date values for 2020-06-15, because of this statement from Danny:
'Taking the week_date value of 2020-06-15 as the baseline week where the Data Mart sustainable packaging changes came into effect.' That means the operational changes really set in at the beginning of that week, of which the exact date is
2020-06-09, now if that is so, then we'll have to include the baseline week and that means
DATE_ADD('2020-06-15', interval 3 week). You were correct. It was a bit confusing for me though, but thanks again for lifting the veil.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants