Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add new NH statistic class CircllhistStatistic and SinkableCircllhist… #391

Merged
merged 20 commits into from
Jul 14, 2020

Conversation

qqustc
Copy link
Contributor

@qqustc qqustc commented Jul 1, 2020

Add new NH statistic class CircllhistStatistic and SinkableCircllhistStatistic.

CircllhistStatistic uses Circllhist under the hood to compute statistics where Circllhist is used in the implementation of Envoy Histograms. Compared to HdrHistogram Circllhist trades precision for fast performance in merge and insertion according to #115.

SinkableCircllhistStatistic wraps the Envoy::Stats::Histogram interface and is used to flush histogram value to downstream Envoy stats Sinks.

Linked Issues: #344

Testing: unit tests

@qqustc qqustc requested a review from oschaaf July 1, 2020 02:52
@qqustc qqustc added the waiting-for-review A PR waiting for a review. label Jul 1, 2020
@qqustc
Copy link
Contributor Author

qqustc commented Jul 1, 2020

/retest

@repokitteh-read-only
Copy link

🙀 failed invoking rebuild of ci/circleci: tsan: 403 Forbidden
🙀 failed invoking rebuild of ci/circleci: build: 403 Forbidden
🙀 failed invoking rebuild of ci/circleci: asan: 403 Forbidden
🙀 failed invoking rebuild of ci/circleci: clang_tidy: 403 Forbidden
🙀 failed invoking rebuild of ci/circleci: benchmark: 403 Forbidden
🙀 failed invoking rebuild of ci/circleci: docker: 403 Forbidden
🙀 failed invoking rebuild of ci/circleci: coverage: 403 Forbidden

🐱

Caused by: a #391 (comment) was created by @qqustc.

see: more, trace.

@qqustc
Copy link
Contributor Author

qqustc commented Jul 1, 2020

@oschaaf hi Otto, it seems all CI checks failed again due to same error as last time. I will try to trigger it to rerun, meanwhile please go ahead review the change. Thanks!

@oschaaf
Copy link
Member

oschaaf commented Jul 1, 2020

/retest

@repokitteh-read-only
Copy link

🙀 failed invoking rebuild of ci/circleci: tsan: 403 Forbidden
🙀 failed invoking rebuild of ci/circleci: build: 403 Forbidden
🙀 failed invoking rebuild of ci/circleci: asan: 403 Forbidden
🙀 failed invoking rebuild of ci/circleci: clang_tidy: 403 Forbidden
🙀 failed invoking rebuild of ci/circleci: benchmark: 403 Forbidden
🙀 failed invoking rebuild of ci/circleci: docker: 403 Forbidden
🙀 failed invoking rebuild of ci/circleci: coverage: 403 Forbidden
🙀 failed invoking rebuild of ci/circleci: test: 403 Forbidden

🐱

Caused by: a #391 (comment) was created by @oschaaf.

see: more, trace.

Copy link
Member

@oschaaf oschaaf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is awesome, and once we merge this it should be trivial to make all the histograms in NH use circllhist as a statistics backend.

I haven't been able to absorb the whole PR at high detail yet, so I'd love to have another pass at it a little bit later; but flushing out my first comments to get started.

source/common/statistic_impl.h Show resolved Hide resolved
}
StatisticPtr combine(const Statistic& statistic) const override;
uint64_t significantDigits() const override { return 1; }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems a low guarantee; I wonder, can we improve? (HdrStatistic manages 4 significant digits in the way it we set it up).

Copy link
Contributor Author

@qqustc qqustc Jul 7, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Discussed offline. Confirmed that circllhist only has 2 digit decimal precision as a result of base 10 algorithm.
#115 (comment)

Since this is out of current work scope, decided to leave the discussion of potential impacts for future work when we try to switch to circllhist

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once we move the implementation into the .cc file, we should add a comment explaining the single digit precision.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment may have been missed. Even if we end up not moving this to the .cc file, we should add a comment explaining the choice of "1" as the precision.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you mum4k@.
Added a comment here.

@oschaaf
Copy link
Member

oschaaf commented Jul 1, 2020

(I have asked for help with CI in the Slack nighthawk-maintainers channel; I'm still clueless as to what is causing the CI trouble here, and me creating mirror PRs and closing them once we merge isn't sustainable)

Copy link
Member

@oschaaf oschaaf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks great to me overall, posting my comments after having a look at this in more detail.

source/common/statistic_impl.cc Outdated Show resolved Hide resolved
test/statistic_test.cc Outdated Show resolved Hide resolved
source/common/statistic_impl.h Outdated Show resolved Hide resolved
source/common/statistic_impl.h Outdated Show resolved Hide resolved
source/common/statistic_impl.h Outdated Show resolved Hide resolved
source/common/statistic_impl.h Outdated Show resolved Hide resolved
test/statistic_test.cc Show resolved Hide resolved
test/statistic_test.cc Outdated Show resolved Hide resolved
test/statistic_test.cc Show resolved Hide resolved
test/test_data/circllhist_proto_json.gold Show resolved Hide resolved
@oschaaf oschaaf added waiting-for-changes A PR waiting for comments to be resolved and changes to be applied. and removed waiting-for-review A PR waiting for a review. labels Jul 1, 2020
@oschaaf
Copy link
Member

oschaaf commented Jul 2, 2020

I did the magic trick again, creating a PR from my own repo with the same commits to trigger CI. Let's see what happens.

@qqustc
Copy link
Contributor Author

qqustc commented Jul 3, 2020 via email

qqustc and others added 8 commits July 6, 2020 18:43
…Statistic

Signed-off-by: Qin Qin <[email protected]>

Signed-off-by: qqustc <[email protected]>
Signed-off-by: Qin Qin <[email protected]>
Signed-off-by: qqustc <[email protected]>
Signed-off-by: Qin Qin <[email protected]>
Signed-off-by: Qin Qin <[email protected]>
@qqustc qqustc added waiting-for-review A PR waiting for a review. and removed waiting-for-changes A PR waiting for comments to be resolved and changes to be applied. labels Jul 7, 2020
Signed-off-by: Qin Qin <[email protected]>

Signed-off-by: qqustc <[email protected]>
Signed-off-by: Qin Qin <[email protected]>
@qqustc qqustc added waiting-for-review A PR waiting for a review. and removed waiting-for-changes A PR waiting for comments to be resolved and changes to be applied. labels Jul 8, 2020
oschaaf
oschaaf previously approved these changes Jul 8, 2020
@@ -43,7 +43,7 @@ COVERAGE_VALUE=${COVERAGE_VALUE%?}

if [ "$VALIDATE_COVERAGE" == "true" ]
then
COVERAGE_THRESHOLD=98.6
COVERAGE_THRESHOLD=98.4
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we forced to lower the threshold? Which lines / behaviors we can't cover in this PR?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://14282-180498819-gh.circle-artifacts.com/0/coverage/source/common/statistic_impl.h.gcov.html

The lines not covered are std::string tagExtractedName() const override { NOT_IMPLEMENTED_GCOVR_EXCL_LINE; }, where I added test EXPECT_DEATH(stat.tagExtractedName(), ".*"); but it does not count into coverage. So after talked to Otto, we decided to lower the coverage.

util.loadFromJson(Envoy::Filesystem::fileSystemForTest().fileReadToEnd(
TestEnvironment::runfilesPath("test/test_data/circllhist_proto_json.gold")),
parsed_json_proto, Envoy::ProtobufMessage::getStrictValidationVisitor());
// Instead of comparing proto's, we perform a string-based comparison, because that emits a
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A more idiomatic approach would be to compare the proto and output a textual diff only if the comparison fails. That way we will be comparing via the binary representations that are the public API. The string are only for humans to read.

Note that EXPECT_EQ takes in an optional stream which can contain a custom error message - this is where we can inject the readable diff:

https://github.com/google/googletest/blob/master/googletest/docs/primer.md#assertions

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed to compare the proto. PTAL.

class CircllhistStatistic : public StatisticImpl {
public:
CircllhistStatistic() {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we place the implementation of the constructor in the .cc file? It helps to keep the header files more readable and focused on the API.

Same comment applies to the destructor and the other methods and class.

Copy link
Contributor Author

@qqustc qqustc Jul 10, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Moved most of the implementation to .cc file and only leave some of the 1 line function in .h file. PTAL

}
StatisticPtr combine(const Statistic& statistic) const override;
uint64_t significantDigits() const override { return 1; }
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once we move the implementation into the .cc file, we should add a comment explaining the single digit precision.


private:
histogram_t* histogram_;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(optional) Using a raw pointer here opens us to memory leaks and all the usual problems of raw pointers.

Would it be possible to move this to a smart pointer (std::unique_ptr) with a custom deleter?

https://www.bfilipek.com/2016/04/custom-deleters-for-c-smart-pointers.html

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Jakub for the reference. It seems the hist_free(histogram_t *hist) is quite complicated, so I will not move it to a smart pointer for now

class SinkableCircllhistStatistic : public SinkableStatistic, public CircllhistStatistic {
public:
SinkableCircllhistStatistic(Envoy::Stats::Scope& scope,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we document the constructor?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

StatisticPtr CircllhistStatistic::combine(const Statistic& statistic) const {
auto combined = std::make_unique<CircllhistStatistic>();
const auto& b = dynamic_cast<const CircllhistStatistic&>(statistic);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's find a better (more expressive) variable name than just "b".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed it to "stat".

auto combined = std::make_unique<CircllhistStatistic>();
const auto& b = dynamic_cast<const CircllhistStatistic&>(statistic);
hist_accumulate(combined->histogram_, &this->histogram_, 1);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Without seeing the API of the histogram, it is hard to figure out what does the literal 1 stand for. Is there an useful name we can give it by either making it a constant or placing an inline comment?

// Either:
const int good_name = 1;
hist_accumulate(combined->histogram_, &this->histogram_, good_name);

// Or:
hist_accumulate(combined->histogram_, &this->histogram_, /* good_name = */ 1);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Jakub for the advice. Added an inline comment.

}

std::vector<double> quantiles{0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.55, 0.6,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add a comment indicating how did we select these quantile values?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

}

std::vector<double> quantiles{0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.55, 0.6,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(optional) If this and the next vector can be const, marking them so helps the reader understand that they aren't going to be modified further down.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@mum4k mum4k added waiting-for-changes A PR waiting for comments to be resolved and changes to be applied. and removed waiting-for-review A PR waiting for a review. labels Jul 9, 2020
Signed-off-by: Qin Qin <[email protected]>
@qqustc qqustc added waiting-for-review A PR waiting for a review. and removed waiting-for-changes A PR waiting for comments to be resolved and changes to be applied. labels Jul 10, 2020
Copy link
Collaborator

@mum4k mum4k left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @qqustc, just last two nits.

}
StatisticPtr combine(const Statistic& statistic) const override;
uint64_t significantDigits() const override { return 1; }
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment may have been missed. Even if we end up not moving this to the .cc file, we should add a comment explaining the choice of "1" as the precision.

const absl::optional<int> worker_id() { return worker_id_; }

protected:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I apologize, I made a typo in the comment. I meant to say "rather than private". We should always prefer private methods and fields as they minimize the public API and allow us to execute faster (internal changes).

Can this be private instead?

@mum4k mum4k added waiting-for-changes A PR waiting for comments to be resolved and changes to be applied. and removed waiting-for-review A PR waiting for a review. labels Jul 13, 2020
}

// List of quantiles is based on hdr_proto_json.gold which lists the quantiles provided by
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: I think HdrHistogram will emit more percentiles as the number of (differently-valued) data points added grows. So it's not a fixed set with HdrHistogram.
I am not sure if there is an equivalent way to iterate the available buckets with libcircllhist to get more detail out, and I am also not sure this is really a problem, but I thought it would be good to call out the difference here.

(The only direct implication I figure it would have today is that the Fortio plots would be drawn with less data points)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Otto for the information! Updated the comment to just List of quantiles is based on hdr_proto_json.gold.
From the APIs in circllhist.h, there is no easy way to iterate the quantiles. If the listed data points turn out to be not enough in the future, we can always add more here.

Signed-off-by: Qin Qin <[email protected]>
@qqustc
Copy link
Contributor Author

qqustc commented Jul 14, 2020

Thank you @qqustc, just last two nits.

Thank you @mum4k for the review.
Made scope_ private, exposed it through a public function method scope() for child class SinkableHdrStatistic to use.
Previously member scope_ is protected in SinkableStatistic, it is because scope_ is used in the child class SinkableHdrStatistic.

@qqustc qqustc added waiting-for-review A PR waiting for a review. and removed waiting-for-changes A PR waiting for comments to be resolved and changes to be applied. labels Jul 14, 2020
@qqustc
Copy link
Contributor Author

qqustc commented Jul 14, 2020

/retest

@repokitteh-read-only
Copy link

🙀 failed invoking rebuild of ci/circleci: coverage: 403 Forbidden

🐱

Caused by: a #391 (comment) was created by @qqustc.

see: more, trace.

qqustc added 2 commits July 13, 2020 22:00
Signed-off-by: Qin Qin <[email protected]>
Signed-off-by: Qin Qin <[email protected]>
@mum4k mum4k merged commit b4cebe4 into envoyproxy:master Jul 14, 2020
@qqustc qqustc deleted the pr1 branch July 14, 2020 23:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
waiting-for-review A PR waiting for a review.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants