Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add mostrecent aggregation to Gauge #967

Merged
merged 1 commit into from
Oct 24, 2023

Conversation

draftcode
Copy link
Contributor

@draftcode draftcode commented Oct 17, 2023

In the multiprocess mode, the process that exposes the metrics needs to aggregate the samples from other processes. Gauge metric allows users to choose the aggregation mode. This implements 'mostrecent' (and 'livemostrecent') mode where the last observed value is exposed.

In order to support this, the file format is expanded to store the timestamps in addition to the values. The stored timestamps are read by the reader process and it's used to find the latest value.

Closes #847

Consideration on the atomicity:

Previously, mmap_dict.py had a comment saying "We assume that reading from an 8 byte aligned value is atomic". With this change, the value write becomes a 16 bytes 8-byte aligned write. The code author (draftcode) tried to find a basis on the original assumption, but couldn't find any. According to write(2), if a file descriptor is shared, the write becomes atomic. However, we do not share the file descriptors in the current architecture.

Considering that Ruby implementation also does the same and hadn't seen an issue with it, this write atomicity problem might be practically not an issue.

See also:

@draftcode draftcode force-pushed the most_recent_aggregation branch 2 times, most recently from 95820f7 to cc84ce2 Compare October 17, 2023 02:52
Copy link
Member

@csmarchbanks csmarchbanks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! Generally this is looking great!

I left a few initial comments, but need to spend a bit more time testing/reviewing in addition to those.

prometheus_client/metrics.py Outdated Show resolved Hide resolved
prometheus_client/multiprocess.py Outdated Show resolved Hide resolved
prometheus_client/values.py Outdated Show resolved Hide resolved
@draftcode draftcode force-pushed the most_recent_aggregation branch 2 times, most recently from 172733f to 27ca983 Compare October 19, 2023 18:02
@draftcode
Copy link
Contributor Author

OK. Addressed the comments. Please take a look again.

Copy link
Member

@csmarchbanks csmarchbanks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally this is looking good to me, just one small comment. Thanks!

prometheus_client/metrics.py Outdated Show resolved Hide resolved
@draftcode draftcode force-pushed the most_recent_aggregation branch from 27ca983 to 831a5b6 Compare October 23, 2023 23:04
@draftcode draftcode force-pushed the most_recent_aggregation branch from 831a5b6 to 5d946a5 Compare October 24, 2023 18:38
Copy link
Member

@csmarchbanks csmarchbanks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code looks good, thanks! I realized we should also update the "Metrics tuning (Gauge):" section in the README before merging, basically just adding a bullet point for 'recent'.

In the multiprocess mode, the process that exposes the metrics needs to
aggregate the samples from other processes. Gauge metric allows users to
choose the aggregation mode. This implements 'mostrecent' (and
'livemostrecent') mode where the last observed value is exposed.

In order to support this, the file format is expanded to store the
timestamps in addition to the values. The stored timestamps are read by
the reader process and it's used to find the latest value.

Closes prometheus#847

Consideration on the atomicity:

Previously, mmap_dict.py had a comment saying "We assume that reading
from an 8 byte aligned value is atomic". With this change, the value
write becomes a 16 bytes 8-byte aligned write. The code author tried to
find a basis on the original assumption, but couldn't find any.
According to write(2), **if a file descriptor is shared**, the write
becomes atomic. However, we do not share the file descriptors in the
current architecture.

Considering that Ruby implementation also does the same and hadn't seen
an issue with it, this write atomicity problem might be practically not
an issue.

See also:

* prometheus/client_ruby#172

  The approach and naming are taken from client_ruby.

* https://github.com/prometheus/client_golang/blob/v1.17.0/prometheus/metric.go#L149-L161

  client_golang has an API for setting timestamp already. It explains
  the use case for the timestamp beyond the client-local aggregation. In
  order to support the same use case in Python, further changes are
  needed.

Signed-off-by: Masaya Suzuki <[email protected]>
@draftcode draftcode force-pushed the most_recent_aggregation branch from 5d946a5 to 20ade3c Compare October 24, 2023 21:42
@draftcode
Copy link
Contributor Author

The code looks good, thanks! I realized we should also update the "Metrics tuning (Gauge):" section in the README before merging, basically just adding a bullet point for 'recent'.

Done.

Copy link
Member

@csmarchbanks csmarchbanks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@csmarchbanks csmarchbanks merged commit 546599b into prometheus:master Oct 24, 2023
1 check passed
jvansanten added a commit to AmpelProject/Ampel-core that referenced this pull request Dec 21, 2023
prometheus/client_python#967 jiggers with
the format of multiprocess metrics in a way that breaks our custom
aggregation. Ignore for now.
jvansanten added a commit to AmpelProject/Ampel-core that referenced this pull request Jan 8, 2024
prometheus/client_python#967 jiggers with
the format of multiprocess metrics in a way that broke our custom
aggregation.
jvansanten added a commit to AmpelProject/Ampel-core that referenced this pull request Jan 8, 2024
prometheus/client_python#967 jiggers with
the format of multiprocess metrics in a way that broke our custom
aggregation.
jvansanten added a commit to AmpelProject/Ampel-core that referenced this pull request Jan 8, 2024
* fix(deps): update minor updates

* fix: support timestamps in prometheus 0.18

prometheus/client_python#967 jiggers with
the format of multiprocess metrics in a way that broke our custom
aggregation.

---------

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
Co-authored-by: Jakob van Santen <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

"last" aggregation in Multiprocess_mode (Gauge)
2 participants