Use simpler fs based locking for multiproc prometheus #419

Open
bloodearnest opened this issue Mar 5, 2019 · 0 comments

bloodearnest commented Mar 5, 2019

Talisker currently uses a multiprocessing.Lock to synchronize metrics cleanup. This is non-trivial and probably unnecessary. We could instead use simpler filesystem-based locking, along these lines:

master - writes a monotonic counter to a file whenever it finishes aggregating metrics
worker - reads the value of the file before collecting metrics, collects, then before rendering checks that the value hasn't changed. If it has, either abort or maybe retry once.

This should protect the worker against reading an inconsistent state on disk. We can only get away with it because metrics are non-transactional data, so we can fail a collection if needed.
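
A rough sketch of how the counter check might look, to make the proposal concrete. This is not Talisker code: `read_generation`, `bump_generation`, `collect_consistent`, `CollectionAborted` and the counter file path are all hypothetical names chosen for illustration, and the `collect` callable stands in for whatever the worker does to gather metrics.

```python
import os
import tempfile


class CollectionAborted(Exception):
    """Raised when the counter changed during every collection attempt."""


def read_generation(path):
    """Return the counter stored at path, or 0 if the file does not exist yet."""
    try:
        with open(path) as f:
            return int(f.read().strip() or 0)
    except FileNotFoundError:
        return 0


def bump_generation(path):
    """Master side (hypothetical): call after aggregation has finished."""
    value = read_generation(path) + 1
    # Write to a temp file in the same directory and rename it into place,
    # so a worker never reads a partially written counter.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        f.write(str(value))
    os.replace(tmp, path)
    return value


def collect_consistent(path, collect, retries=1):
    """Worker side (hypothetical): collect, then verify the counter is unchanged.

    Retries up to `retries` times, then gives up with CollectionAborted.
    """
    for _ in range(retries + 1):
        before = read_generation(path)
        result = collect()
        if read_generation(path) == before:
            return result
    raise CollectionAborted("metrics were aggregated during collection")
```

The master would call `bump_generation(path)` after each aggregation, and the worker would wrap its collection in `collect_consistent(path, collect)`, treating `CollectionAborted` as a failed scrape. Note that, as in the description above, a single counter written at the end of aggregation only detects an aggregation that finishes while the worker is collecting.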
