
Make histogram aggregator checkpoint consistent #438

Merged · 15 commits · Mar 11, 2020
Conversation

@paivagustavo (Member) commented Jan 23, 2020

I based this algorithm on the implementation of the Prometheus histogram linked in #437. The algorithm is documented in the code.

I've created a separate struct so we can reuse it for other aggregators that have similar problems.

I've also added tests ensuring that whenever a Checkpoint() happens, the state of the histogram is valid, i.e., the overall count equals the sum of the counts of all buckets. These tests did fail with the initial implementation at high concurrency, but they pass now.

@jmacd (Contributor) commented Jan 23, 2020

The algorithm in Prometheus will block the collection pass using runtime.Gosched() to spin-wait for the counters to match. Without that wait, it's likely that an exporter could still see inconsistent counts. I'd approach this by starting with a stress test -- start a few hundred writers and have them write to random buckets using one fixed value per bucket (so you know the bucket average). At the same time have a reader checkpointing the histogram and verifying that bucket counts / sums are consistent. I am confident you'll be able to write a test that reproduces a problem. (See sdk/metric/stress_test for inspiration.)

But wait: I'd like to abstract this concurrency technique into a self-contained package. The idea is that we can use one of these devices for the Histogram aggregator and another for the MinMaxSumCount aggregator. I think this means creating a struct with just the countAndHotIdx field as seen here. This type would support an API with Enter(), Exit(), and Checkpoint() methods; Checkpoint() returns the index of the cold data set after swapping the two. This device should be even easier to test on its own, using a stress test like the one above. For that test, construct a data structure with several atomic fields, have the write method increment each of them under the device's protection, and ensure a reader always sees consistent counts.

c.current.sum.AddNumberAtomic(kind, number)
current := c.current()
current.count.AddUint64Atomic(1)
current.sum.AddNumberAtomic(kind, number)

for i, boundary := range c.boundaries {
Contributor:

Sort of off-topic, but I'd like to not see a linear search here. I think this would work:

  idx := sort.Search(len(c.boundaries), func(i int) bool {
    // Find the first boundary the value falls below.
    return number.CompareNumber(kind, c.boundaries[i]) < 0
  })
  current.buckets.Counts[idx].AddUint64Atomic(1)

... but there's a memory allocation there (the closure), so it's not perfect. Linear search is probably faster than even a hand-coded, allocation-free binary search for a small number of buckets; it would take some benchmarking to know whether this matters.

Note that in the OpenMetrics exposition format, the Histogram type supports several different bucket configurations. There are "linear" buckets, which use evenly spaced boundaries, and "logarithmic" buckets, which use log-linear spacing between boundaries. If either of these special cases is used, we can do better than linear or binary search, typically using a simple division (linear case) or a single logarithm plus a division (log-linear case). These two special cases are probably more important than the arbitrary-boundary case, which is where you begin looking at binary search.

So, I'm not sure what's best. A hand-coded, allocation-free binary search is still not the best outcome if the buckets happen to be linear. If the buckets are going to be linear or log-linear, it calls for new constructors that pass in a different sort of bucket-izer interface. If there's a "bucket configuration" interface that supports (a) listing the boundaries and (b) performing the necessary lookup, then this complexity can be isolated outside of the histogram aggregator itself.

Contributor:

So a LinearBucket(min, max, number) would return a bucket configuration with number evenly spaced buckets between min and max, for example; its search function uses a division.

When using absolute measures in particular, a LogarithmicBucket(minValue, growthFactor) would return a bucket configuration where the first bucket is (-Inf, minValue) and each boundary after that increases exponentially by the growth factor; its search function computes a math.Log() to find the bucket, etc.

Lastly, there's an ArbitraryBucket() configuration that's like what we have now; it uses binary search or linear search, depending on how many buckets there are.

Member Author:

These are all totally valid considerations that we should address in the future; I think it's worth creating some issues to track them :)

Contributor:

+1 on addressing this, and doing so in a future issue. I like the suggestion of a common bucket-configuration implementation.

@paivagustavo paivagustavo added exporters area:metrics Part of OpenTelemetry Metrics labels Feb 11, 2020
@jmacd (Contributor) commented Feb 20, 2020

This looks exciting. 🥇

@jmacd (Contributor) commented Mar 3, 2020

Let us know when you'd like us to review this. 😁

@paivagustavo paivagustavo marked this pull request as ready for review March 7, 2020 01:36
@paivagustavo (Member Author) commented:

Sorry for taking so long with this; it has been some stormy weeks. I think it is ready for review now.

Naming this was very hard and I'm not totally satisfied, I'm open to suggestions.

sdk/internal/state_locker.go (review thread, resolved)
l := StateLocker{}
l.SwapActiveState(func() {})
}
}
Contributor:
So good!

Contributor:

Seconded! This is great 😄

Member Author:

Happy that you both liked it! :)

"go.opentelemetry.io/otel/sdk/metric/aggregator/histogram"
)

func TestStressInt64Histogram(t *testing.T) {
Contributor:

(Sweet)

@MrAlias (Contributor) left a comment:

This looks awesome, thanks for taking the time to dig into this!

Almost all of my comments are minor nits except for the question about merging Histogram buckets. Other than that one, this looks good to go.

sdk/internal/state_locker.go (5 review threads, resolved)
sdk/metric/aggregator/histogram/histogram.go (2 review threads, resolved)
@paivagustavo (Member Author) commented:

@jmacd I've added the link you asked for. @MrAlias I've fixed the grammar and answered your question.
Thanks for reviewing it.

@MrAlias (Contributor) left a comment:

Awesome work. 🚀

@jmacd jmacd merged commit 288821c into open-telemetry:master Mar 11, 2020
@paivagustavo paivagustavo deleted the lock-free-consistent-aggregator branch March 11, 2020 19:41
MikeGoldsmith pushed a commit to MikeGoldsmith/opentelemetry-go that referenced this pull request Mar 13, 2020
* change the histogram aggregator to have a consistent but blocking Checkpoint()

* docs

* wrapping docs

* remove currentIdx from the 8bit alignment check

* stress test

* add export and move lockfreewrite algorithm to an external struct.

* move state locker to another package.

* add todos

* minimal tests

* renaming and docs

* change to context.Background()

* add link to algorithm and grammars

Co-authored-by: Joshua MacDonald <[email protected]>
@pellared pellared added this to the untracked milestone Nov 8, 2024