Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug 1656934 - Scan pending pings directories after dealing with upload status #1205

Merged
merged 2 commits into from
Sep 11, 2020

Conversation

brizental
Copy link
Contributor

The bug that was causing that intermittent, was due to scanning the pings directories before running the on_upload_disabled function that deleted non-deletion request pings.

I fix that in this PR by moving the scanning to after function the on_upload_disabled function.

As I mentioned in the bug, I was able to reproduce the intermittent reliably in a specific branch of mine, with the fix I don't get the errors anymore on that branch either.

// If upload is disabled, we delete all pending pings files
// and we need to do that **before** scanning the pending pings folder
// to ensure we don't enqueue pings before their files are deleted.
let _scanning_thread = glean.upload_manager.scan_pending_pings_directories();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Eugh. So this was an actual bug and not a test error. We should add a changelog entry. Do we have any way to estimate how frequently we sent pings out?

I'd suspect that's not super frequent, but it's important to better understand. We could also draft a PSA email, if we find the impact being big.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have any way to estimate how frequently we sent pings out?

Hm, I need to think about this. When we have this bug it is usually followed by an error on deleting the ping file. Because we send the request to the caller -> it uploads succesfully (usually) -> and then it tries to delete an already deleted file, which logs an error. But that is all we have, from what I remember on the top of my head.

I'll think a bit further, if you have any ideas, let me know.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have an idea to validate this: check if any pings are sent from a client_id after this client_id sent a deletion-request ping. Will come back to this thread with results when I have them.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BONUS: add CHANGEOG.md entry
upload_manager.set_rate_limiter(
/* seconds per interval */ 60, /* max tasks per interval */ 15,
);

// We only scan the pending ping sdirectories when calling this from a subprocess,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having this makes me question the name of this function. If you have better ideas I'd be up for changing it.

@brizental brizental merged commit 523a358 into mozilla:main Sep 11, 2020
@brizental brizental deleted the 1656934-non-deletion-intermittents branch September 11, 2020 12:33
upload_manager.set_rate_limiter(
/* seconds per interval */ 60, /* max tasks per interval */ 15,
);

// We only scan the pending ping sdirectories when calling this from a subprocess,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops: pending ping sdirectories

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants