Create a more stable solution for when the computation of statistics fails #44

Open
telezhnaya opened this issue Aug 25, 2022 · 0 comments
telezhnaya commented Aug 25, 2022

Currently, the cron job runs once a day.
It tries several times to collect all the data, and it can still fail.
If it fails, I get a notification and have to investigate manually.
Our SRE team silently revoked my rights some time ago by mistake, so I haven't received any notifications for a while (maybe half a year).
Some of our tables fully depend on the consistency of the data in other tables. If that data is missing, the newly computed data is wrong.

So, I'm no longer sure if the data is correct.

Right now, if something goes wrong and no one investigates it, the next day we simply compute that day's statistics, and so we fill the tables with potentially broken data.

I want to create a separate table that stores the last successful datetime for each analytics table we have.
Each day, we should first read these values and try to fill in the missing parts. We also should not even try to fill the dependent tables if we can't compute all the data they need.
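
To make the idea more concrete, here is a rough sketch of how the progress table, the backfill of missing days, and the dependency check could fit together. It uses Python with SQLite purely for illustration. The table names, the `last_successful_run` table, the `DEPENDENCIES` map, and the `compute` callback are all hypothetical and not taken from the actual code base. It also tracks dates rather than full datetimes to keep the example short.

```python
import sqlite3
from datetime import date, timedelta

# Hypothetical dependency map: table -> tables it reads from.
# Listed so that every table appears after its dependencies.
DEPENDENCIES = {
    "daily_transactions": [],
    "daily_active_accounts": [],
    "weekly_active_accounts": ["daily_active_accounts"],
}


def init_progress_table(conn):
    # One row per analytics table, holding the last day computed successfully.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS last_successful_run ("
        "table_name TEXT PRIMARY KEY, last_success_date TEXT NOT NULL)"
    )


def missing_days(conn, table, today, start):
    """Days after the last recorded success (or from `start`) up to `today`."""
    row = conn.execute(
        "SELECT last_success_date FROM last_successful_run WHERE table_name = ?",
        (table,),
    ).fetchone()
    first = date.fromisoformat(row[0]) + timedelta(days=1) if row else start
    return [first + timedelta(days=i) for i in range((today - first).days + 1)]


def mark_success(conn, table, day):
    conn.execute(
        "INSERT OR REPLACE INTO last_successful_run "
        "(table_name, last_success_date) VALUES (?, ?)",
        (table, day.isoformat()),
    )
    conn.commit()


def run_daily(conn, today, start, compute):
    """`compute(table, day)` is supplied by the caller and raises on failure."""
    up_to_date = set()
    for table, deps in DEPENDENCIES.items():
        if any(dep not in up_to_date for dep in deps):
            continue  # a dependency is behind, so do not touch this table
        try:
            for day in missing_days(conn, table, today, start):
                compute(table, day)
                mark_success(conn, table, day)
        except Exception:
            continue  # stop at the first failed day; the next run resumes there
        up_to_date.add(table)
    return up_to_date
```

Because progress is recorded per day, a failed run leaves the pointer on the last good day, and the next run picks up from there instead of silently skipping ahead.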

After that, we should clear all the data we have and recompute it to be sure that everything is consistent.
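
One possible way to do the clear-and-recompute step safely is to wrap it in a single transaction, so a failure midway leaves the old rows in place rather than an empty or half-filled table. This is only a sketch, again using Python with SQLite. The `rebuild_table` helper and the `(day, value)` schema are assumptions made for illustration.

```python
import sqlite3


def rebuild_table(conn, table, rows):
    """Replace the contents of `table` with `rows` in a single transaction.

    `table` is interpolated into the SQL, so it must come from a fixed,
    trusted list of table names, never from user input.
    """
    # `with conn` commits on success and rolls back if any statement raises.
    with conn:
        conn.execute(f"DELETE FROM {table}")
        conn.executemany(
            f"INSERT INTO {table} (day, value) VALUES (?, ?)", rows
        )
```

If the recomputation fails for any row, the delete is rolled back as well, so the table stays as it was before the attempt.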
