Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[salvo] Initial commit for benchmark abstraction framework #73

Merged
merged 35 commits into from
Nov 9, 2020

Conversation

abaptiste
Copy link
Contributor

This is the initial 'official' commit for the Salvo tool. This aims to abstract the execution of nighthawk to benchmark a given envoy version.

See this issue for some background. The two design docs for this project are referenced here.

In this commit, salvo is placed into a separate directory of the envoy-perf repository, and is referenced from the main README.md.

cc: @oschaaf, @htuch, @landesherr

Signed-off-by: Alvin Baptiste [email protected]

Commit Message: [salvo] Initial commit for benchmark abstraction framework
Testing: Unit tests included, Address as many pylint3 issues as feasible
[#Issue] envoyproxy/envoy#11266

@mattklein123
Copy link
Member

@jmarantz @oschaaf would you be willing to do a lightweight review of this?

@oschaaf
Copy link
Member

oschaaf commented Oct 13, 2020

Sure; the first questions that pop up for me are:

  • Do we want to establish/modify checks in CI for Python conventions? (And maybe proto3 & bazel files too?). We hypothetically could copy over some things from Nighthawk and/or Envoy to create a starting point.
  • There are build tasks in .circleci/config.yml, can we leverage that in this PR? E.g it seems unit tests are provided in here, we could execute those in CI?

Copy link
Member

@oschaaf oschaaf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

awesome; flushing out a first round of comments; the PR is pretty large so I may need some more takes to get through it all, but overall I have a feeling it looks great. Before going into more depth, I think that establishing automated code-level conventions would be good as that helps smooth the review process by automating away part of it.

salvo/README.md Outdated Show resolved Hide resolved
salvo/README.md Outdated Show resolved Hide resolved
salvo/README.md Show resolved Hide resolved
salvo/README.md Show resolved Hide resolved
salvo/README.md Outdated Show resolved Hide resolved
salvo/salvo.py Outdated Show resolved Hide resolved
salvo/src/lib/api/control.proto Outdated Show resolved Hide resolved
@abaptiste abaptiste force-pushed the master branch 2 times, most recently from 7dd9c93 to 2401dda Compare October 13, 2020 19:36
Signed-off-by: Alvin Baptiste <[email protected]>
@abaptiste abaptiste force-pushed the master branch 2 times, most recently from a062d76 to 2820352 Compare October 15, 2020 20:20
@abaptiste
Copy link
Contributor Author

abaptiste commented Oct 19, 2020

I will work on making this change set a bit smaller.

@abaptiste abaptiste requested review from oschaaf and htuch November 3, 2020 16:59
@mum4k
Copy link
Contributor

mum4k commented Nov 3, 2020

Hi @abaptiste, my name is Jakub and I am one of the maintainers of Nighthawk. Recently I have been working on collecting requirements for a benchmarking framework of Envoy and I was very excited to hear that you are ahead and have something ready for review. My hope is that we can cooperate towards our common goals.

It will be my pleasure to start reviewing this PR and I am sure we can gain some traction going forward. Sorry about the delay you have encountered here so far, I was otherwise occupied and couldn't get here sooner. I will start reading the code here today and expect to ask some questions later today or tomorrow. I expect to start the review at a high level and will need you help to work towards the details.

Meanwhile it would help me if you could read the linked requirements and highlight any areas of concern or questions you might have. Looking forward to working with you on this.

Signed-off-by: Alvin Baptiste <[email protected]>
Signed-off-by: Alvin Baptiste <[email protected]>
Signed-off-by: Alvin Baptiste <[email protected]>
Copy link
Contributor

@mum4k mum4k left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for iterating on this, the BUILD system setup looks much easier to follow. I have a few more improvement suggestions on that front. I have also reviewed the documentation, most of the helper scripts and the proto API in this round.

I will start reviewing the Python code in the next round. Thank you for your patience, this is mostly taking this many rounds because the scope of this PR is fairly large. We can try to address that in future PRs by making them smaller. Happy to discuss the best approach, one option is to send separate PRs per each api / module / package. Reviewing and addressing comments on large PRs can sometimes feel like running a marathon ;)

ci/do_ci.sh Outdated Show resolved Hide resolved
salvo/README.md Outdated Show resolved Hide resolved
salvo/BUILD Outdated Show resolved Hide resolved
salvo/src/lib/BUILD Outdated Show resolved Hide resolved
salvo/src/lib/BUILD Outdated Show resolved Hide resolved
salvo/src/lib/api/source.proto Outdated Show resolved Hide resolved
salvo/src/lib/api/source.proto Outdated Show resolved Hide resolved
salvo/src/lib/api/source.proto Outdated Show resolved Hide resolved
salvo/src/lib/api/source.proto Outdated Show resolved Hide resolved
# E126 Continuation line over-indented for hanging indent

# We ignore false positives because of what look like pytest peculiarities
# F401 Module imported but unused
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We didn't get to reviewing Python code yet, but these two feel like they may be result of how we are importing code / setting up the build system. It may be valuable for us to try and address them before we silence them which can result in accumulation of technical debt. Can you help me out by pointing to the locations where these warnings come up? I might be able to suggest a simple fix.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I copied these scripts from nighthawk and they are largely unmodified.

Can we circle back to this because there are a few games we have to play with adding directories to the path because of the source layout and 'bazel'? We can certainly tighten up the style and formatting checks.

Signed-off-by: Alvin Baptiste <[email protected]>
Signed-off-by: Alvin Baptiste <[email protected]>
Signed-off-by: Alvin Baptiste <[email protected]>
Signed-off-by: Alvin Baptiste <[email protected]>
Signed-off-by: Alvin Baptiste <[email protected]>
Signed-off-by: Alvin Baptiste <[email protected]>
Copy link
Contributor

@mum4k mum4k left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for cleaning up the build system setup. Looks like this PR still has a number of comments that weren't addressed yet (they are marked as "Outdated" due to the file moves, but their content still applies). Can you please look at them in the conmversation history and address them before we move forward?

Most of the comments that are still in flight are related to the API. We have already finished reviewing the documentation and helper scripts (format tools, download dependencies, repo setup). I have a proposal that may help us decrease the scope of this PR while moving forward. How would you feel about removing all the Python code and tests from this PR and merging what we have reviewed so far (after addressing the existing comments)? We could then immediately follow up with a set of smaller PRs, one per each Python library and its tests. The benefit would be that each of the smaller PRs would take less time in review and would make it easier for us to keep track of the conversation and changes.

Please let me know what you think.

ci/do_ci.sh Outdated
function build_salvo() {
echo "Building Salvo"
pushd salvo
bazel build //:salvo
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(optional) If the intention here is to build all of our code, we should also change this to:

bazel build //src/...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack

salvo/README.md Outdated

## Example Control Documents

The control document defines the data needed to excute a benchmark. At the moment, the fully dockerized benchmark is the only one supported. This benchmark discoveres user supplied tests for execution and uses docker images to run the tests. In the exampple below, the user supplied tests files are located in `/home/ubuntu/nighthawk_tests` and are mapped to a volume in the docker container.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo: execute

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

"remote": false,
"dockerizedBenchmark": true,
"images": {
"reuseNhImages": true,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: we may need to update these examples after we perform some of the suggested proto field renames.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. Certainly doable.

Signed-off-by: Alvin Baptiste <[email protected]>
Signed-off-by: Alvin Baptiste <[email protected]>
Copy link
Contributor

@mum4k mum4k left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the offline discussion @abaptiste. As we agreed, we will keep the Python code in this PR and achieve greater velocity by instead switching the focus of this review to architecture. This will help us move faster while giving us the opportunity to iterate on the code in future changes.

With that said, this review round represents the last set of comments I have at that level. Thank you for your patience with this process and looking forward to the next PRs. I will tag one of the maintainers here for merging after these comments are addressed.

ci/do_ci.sh Outdated
pushd salvo
./install_deps.sh

bazel test //src/lib/...
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(nit) This can now also be bazel test //....

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated

salvo/BUILD Outdated

py_binary(
name = "salvo",
srcs = ["src/salvo.py"],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed offline, the move of salvo.py into the src/ directory was likely caused by a mistake I made in one of my comments. If we can, let's move it back to the salvo/ directory.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had a merge conflict and I didn't catch this. I'll fix and update


class DockerVolume():

@staticmethod
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of marking this method static, can we define it as a module level function (i.e. remove the class DockerVolume).

We should prefer standalone functions unless having a class is really necessary. We can test for that by identifying if the class either has interesting state that changes during its lifetime or if the code needs to maintain multiple instances of the class. If we don't have good reasons for either of these two, having a standalone method will result in simpler code.

The style guide also recommends avoiding staticmethod:
https://github.com/google/styleguide/blob/gh-pages/pyguide.md#217-function-and-method-decorators

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated.


# Run in the actual bazel directory so that the sys.path
# is setup correctly
if os.path.islink(sys.argv[0]):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you help me understand this path manipulation and the site.addsitedir("src") call below?

I wouldn't expect this to be necessary in a bazel environment. We should be able to delete all these manipulations from here and the other files and just go with an import rooted at the beginning of the WORKSPACE:

from src.lib.job_control_loader import load_control_doc

I have tested this locally by removing all the manipulation and executing the salvo target which seems to have worked at least to a point that it imported all libraries correctly. Do we need to keep this or can we do without it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an artifact of us getting salvo to run in a cloud service. We noticed that bazel would symlink files to the original location so, the current working directory would be the local source location and not bazel-bin where we expect. This was true for the python generated code for the api. We needed these files packaged in a library and were not able to accomplish that without the working directory manipulation.

I can remove it here, but there are other instance where we need it to remain.

Signed-off-by: Alvin Baptiste <[email protected]>
@abaptiste
Copy link
Contributor Author

@mum4k I pushed the last set of changes. Thanks for your reviews!

mum4k
mum4k previously approved these changes Nov 6, 2020
Copy link
Contributor

@mum4k mum4k left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for addressing the comments @abaptiste.

@htuch this PR is good to go in, please take a look.

Copy link
Member

@htuch htuch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since @mum4k is good with this, I think it would be great to ship and land. I had a quick scan and noticed a few nits that would be good to sort out, happy to merge when they are fixed.


// Define whether the mount point is read-write or read-only
// eg: 'rw' or 'ro'
string mode = 2;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably should be an enum, but don't feel super strong.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Enum it is.

oneof envoy_ip_test_versions {
bool v4only = 1;
bool v6only = 2;
bool all = 3;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably should an enum.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated

// Specify whether this source location is Envoy or NightHawk
oneof identity {
bool envoy = 1;
bool nighthawk = 2;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Enum please.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated

# Setup the test directory
if test_dir:
properties = VolumeProperties()
properties.bind = '/usr/local/bin/benchmarks/benchmarks.runfiles/nighthawk/benchmarks/external_tests/'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this hardcoded path?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the location in the container that the external tests are mapped to. The path is used from this example.

contents = None

# Try loading the contents based on the file extension
if re.match(r'.*\.json', filename):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

filename.endswith('.json') etc. below

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Certainly. Will update.


env = ['key1=val1', 'key2=val2']
cmd = ['uname', '-r']
image_name = 'oschaaf/benchmark-dev:latest'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How come Otto's image is hardcoded?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was available in dockerhub and I used it to validate that the docker module could actually pull an image and execute a command. The actual image here is moot and could be any valid docker image.

assert source.source_path == "/home/ubuntu/envoy"
assert not source.source_url
assert source.branch == "master"
assert source.commit_hash == "e744a103756e9242342662442ddb308382e26c8b"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the plan to deal with these hardcoded hashes? Should they be in some configuration file?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are just string literals with no meaning. This could be "foo" and work the same.

Signed-off-by: Alvin Baptiste <[email protected]>
@abaptiste
Copy link
Contributor Author

@htuch This is ready for you again. Thanks!

@htuch
Copy link
Member

htuch commented Nov 6, 2020

@abaptiste looks good, but what about the enums?

Signed-off-by: Alvin Baptiste <[email protected]>
@abaptiste
Copy link
Contributor Author

abaptiste commented Nov 6, 2020

@htuch Sorry about the enums. No excuses. Updated the PR.

Copy link
Member

@htuch htuch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, excited to see this land!

@htuch htuch merged commit 5c434c8 into envoyproxy:master Nov 9, 2020
@oschaaf
Copy link
Member

oschaaf commented Nov 9, 2020

Exciting indeed, awesome!

@abaptiste
Copy link
Contributor Author

Thanks folks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants