proposal: Add tracking latencies and flamegraphs in CI #11266

Open
oschaaf opened this issue May 19, 2020 · 23 comments
@oschaaf
Member

oschaaf commented May 19, 2020

Filing this issue to get a feel for interest in this.

Goal:

Add a means to track and persist latency numbers and perf visualizations like flamegraphs in CI. This would let us see how we're doing over time, and have perf information at hand when a latency regression is observed.

Description:

Nighthawk uses a lightweight python-based framework for integration testing.
This framework serves as a basis for writing NH's own benchmarks.

With a small amount of modification, this could be extended to:

  • make consumption very low-friction in foreign code bases (like Envoy)
  • allow it to inject proxies between the client and the test server (for example, Envoy at a certain SHA)
  • scavenge tests from external locations

More details, and some concrete scripts that give an idea of what this would look like, can be found here.
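
A rough sketch of what such a low-friction benchmark test could look like, in the style of NH's pytest-based integration framework (the fixture and method names below are illustrative, not the framework's actual API):

```python
# Hypothetical sketch only: the fixture and its methods are illustrative.
def test_http_small_request_latency(http_test_server_fixture):
    # Drive a fixed-rate load against the fixture's test server and
    # parse the client's JSON output.
    parsed_json, _ = http_test_server_fixture.runNighthawkClient([
        "--rps", "100",
        "--duration", "30",
        http_test_server_fixture.getTestServerRootUri(),
    ])
    counters = http_test_server_fixture.getNighthawkCounterMapFromJson(parsed_json)
    # A real benchmark would persist the latency histogram here and
    # assert on percentiles; this just checks that the run succeeded.
    assert counters["benchmark.http_2xx"] > 0
```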

/cc @danzh2010 @htuch

@htuch
Member

htuch commented May 19, 2020

I think even redline QPS would be an amazing contribution here, everything else proposed seems like gravy. +1000.

@mattklein123
Member

See #961. I desperately want this. This will require a lot of thought in terms of how to structure repeatable tests, but yes, we really need to do this.

@antoniovicente
Contributor

I don't know if this exists yet, but it would also be good to have a few relatively small benchmark scenarios that can be used for A/B comparison of performance after changes to data plane components, especially in cases where we expect some performance impact. Tracking performance data for the small benchmarks over time in a calibrated environment would be great.
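
A minimal sketch of what an A/B comparison over such benchmark output might look like, assuming each run writes a JSON file with latency percentiles (the result layout and file names here are hypothetical):

```python
# Assumes each run wrote a JSON file shaped like
# {"latency_percentiles": {"p50": 0.00021, "p90": 0.0004, "p99": 0.0011}}
# (values in seconds); that layout is hypothetical.
import json


def load_percentiles(path):
    with open(path) as f:
        return json.load(f)["latency_percentiles"]


def find_regressions(baseline_path, candidate_path, allowed_regression=0.05):
    """Return the percentiles where the candidate run is more than
    allowed_regression (5% by default) slower than the baseline."""
    baseline = load_percentiles(baseline_path)
    candidate = load_percentiles(candidate_path)
    return {
        percentile: (candidate[percentile] - value) / value
        for percentile, value in baseline.items()
        if (candidate[percentile] - value) / value > allowed_regression
    }


if __name__ == "__main__":
    regressions = find_regressions("baseline.json", "candidate.json")
    for percentile, delta in sorted(regressions.items()):
        print(f"{percentile}: +{delta:.1%} over baseline")
```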

@oschaaf
Member Author

oschaaf commented May 25, 2020

I have started exploring this. Tracking progress here.
@antoniovicente It might be good to take a look at test_benchmarks.py to see if that allows enough flexibility. The idea is that consumers can specify their own locations where the suite should scavenge tests, which in turn can supply custom fixtures with custom Envoy configurations.
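
For illustration, a consumer-owned test scavenged from a custom location might look roughly like this; the fixture that injects Envoy (at a pinned SHA) between the Nighthawk client and the test server, and its helper methods, are hypothetical:

```python
# Illustrative only: the fixture and its helpers are hypothetical, and a
# real version would come with a consumer-provided Envoy configuration.
def test_latency_through_envoy(inject_envoy_proxy_fixture):
    fixture = inject_envoy_proxy_fixture
    parsed_json, _ = fixture.runNighthawkClient([
        "--rps", "500",
        "--duration", "60",
        fixture.getEnvoyProxyRootUri(),  # hypothetical helper
    ])
    assert parsed_json is not None
```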

@mattklein123
Member

cc @marcomagdy who is also interested in helping with this effort.

@snowp
Contributor

snowp commented Jun 4, 2020

We'd also be interested in this, so let me know how I can help move this forward.

@oschaaf
Member Author

oschaaf commented Jun 9, 2020

Update: a good part of this is in review over at envoyproxy/nighthawk#337.

Nighthawk is eating its own dogfood via a new CI task, and is dropping simple visualizations per test (example).

CPU profiles are also collected, but flamegraphing needs more work: to get sensible output, we need to account for the binaries and libraries involved in generating the profile.
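
For reference, a sketch of the kind of symbol-aware pipeline this implies, assuming a Linux `perf` profile and Brendan Gregg's FlameGraph scripts (stackcollapse-perf.pl, flamegraph.pl) on the PATH; the paths used are illustrative:

```python
import subprocess


def flamegraph(perf_data, symfs_dir, output_svg):
    # --symfs points perf at the binaries and libraries that actually
    # produced the profile, which is the step that needs care to get
    # sensible output.
    script = subprocess.run(
        ["perf", "script", "-i", perf_data, f"--symfs={symfs_dir}"],
        check=True, capture_output=True, text=True)
    folded = subprocess.run(
        ["stackcollapse-perf.pl"], input=script.stdout,
        check=True, capture_output=True, text=True)
    svg = subprocess.run(
        ["flamegraph.pl"], input=folded.stdout,
        check=True, capture_output=True, text=True)
    with open(output_svg, "w") as f:
        f.write(svg.stdout)


if __name__ == "__main__":
    flamegraph("perf.data", "/path/to/run/binaries", "profile.svg")
```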

@antoniovicente
Contributor

Any updates?

@oschaaf
Member Author

oschaaf commented Aug 7, 2020

Well, I got sidetracked for a bit, but this has been happily test-driving in NH's own CI. So far so good.

For example see the .html files in the artefacts of a recent PR.
We could consider wiring up the current state in Envoy's CI as an MVP based on the docker-based flow (Nighthawk's CI runs with its locally produced binaries). This should be pretty doable, but I would appreciate help/guidance there; a rough sketch follows at the end of this comment.

Some important improvements that others have expressed interest in tackling are:

  • The current UI is limited to a directory listing of artefacts as offered by the CI env (CircleCI in NH).
  • There's no regression analysis / detection.

For more detailed status, see https://github.com/envoyproxy/nighthawk/tree/master/benchmarks#todos
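
And a very rough placeholder for the docker-based wiring mentioned above, running the suite against prebuilt images instead of locally produced binaries (the image tags, environment-variable names, and bazel target are placeholders, not the actual interface):

```python
import os
import subprocess

# All names below are placeholders for whatever the real wiring uses.
env = dict(
    os.environ,
    NH_DOCKER_IMAGE="envoyproxy/nighthawk-dev:latest",         # placeholder
    ENVOY_DOCKER_IMAGE_TO_TEST="envoyproxy/envoy:<some-sha>",  # placeholder
)
subprocess.run(["bazel", "test", "//benchmarks/..."], env=env, check=True)
```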

@abaptiste
Contributor

Hello Folks. We have a design doc for a framework that we'd like your comments on:

https://docs.google.com/document/d/14Iz8j--Mvb06QFB8RurtYlwmy657YbAVfqDr-jKgtaQ/edit#heading=h.grkfe6onmtgv

@htuch
Member

htuch commented Sep 8, 2020

@abaptiste thanks. My super high-level comment is that as a developer and performance engineer (user story), I'd like to be able to have control over the benchmark execution environment. So, any framework should be capable of running 100% locally. It's fine to make it also available as a SaaS via buckets or e-mail, but I think we're limiting applicability if those are the only options.

@mattklein123
Member

> @abaptiste thanks. My super high-level comment is that as a developer and performance engineer (user story), I'd like to be able to have control over the benchmark execution environment. So, any framework should be capable of running 100% locally. It's fine to make it also available as a SaaS via buckets or e-mail, but I think we're limiting applicability if those are the only options.

+1 I left a bunch of comments around this. I also want to make sure we have a clear post-MVP path for CI integration as IMO this is the thing we really want to unlock ASAP. Thank you for working on this!

@abaptiste
Contributor

Thank you for the comments. These are the major themes I've captured:

  • Define all JSON schemas using proto3 messages (this will be done as part of the MVP)
  • We need a better authentication mechanism
  • CI integration so that builds run nightly or upon master check-in
  • Long term storage of results so that we can chart the performance of prior builds
  • Ability to do performance runs complementing local development

If there are additional items I may have inadvertently missed or misunderstood, please let me know.

@mattklein123
Member

@abaptiste that list LGTM and also similar to our offline conversation. Thanks for working on this! This will be awesome.

@abaptiste
Contributor

I posted a separate doc based on the feedback from the initial review. Please feel free to take a look and comment.

@htuch
Member

htuch commented Sep 29, 2020

@abaptiste the new doc LGTM, tagging @oschaaf @mattklein123 @antoniovicente @mum4k @pamorgan @snowp for comments/sign-off.

@mattklein123
Member

I looked at the doc and at a high level it looks great to me. Very excited for this work!

@oschaaf
Member Author

oschaaf commented Sep 29, 2020

Looks good to me!

htuch pushed a commit to envoyproxy/envoy-perf that referenced this issue Nov 9, 2020
This is the initial 'official' commit for the Salvo tool. This aims to abstract the execution of nighthawk to benchmark a given envoy version.

See this issue for some background. The two design docs for this project are referenced here.

In this commit, salvo is placed into a separate directory of the envoy-perf repository, and is referenced from the main README.md.

Testing: unit tests included; addressed as many pylint3 issues as feasible.
[#Issue] envoyproxy/envoy#11266

Signed-off-by: Alvin Baptiste <[email protected]>
@gyohuangxin
Member

Any updates? We are interested in the integration; is there any help I can offer?

@mum4k
Contributor

mum4k commented Jan 12, 2022

Hi @gyohuangxin, we will gladly accept help. We expect to be able to staff this work in about 6 months, but I would gladly work with you in the meantime if you have the cycles. If you are able to help, it would be good to get in touch and discuss priorities and direction. Are you on the Envoy Slack by any chance?

@gyohuangxin
Member

gyohuangxin commented Jan 12, 2022

@mum4k Thank you! Yes, let's discuss on Slack.

@keithmattix
Contributor

What's the latest on this effort? This would be extremely beneficial.

@mum4k
Contributor

mum4k commented Aug 15, 2024

This effort has been de-staffed temporarily. If anyone wants to pick it up in the meantime, I will gladly transfer the latest state and/or provide guidance and code review as desired.
