Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Capture team and track revisions in metrics metadata #725

Merged
merged 7 commits into from
Aug 22, 2019

Conversation

ebadyano
Copy link
Contributor

Still missing track revision info.

Closes #664

@ebadyano ebadyano added :Metrics How metrics are stored, calculated or aggregated enhancement Improves the status quo labels Jul 11, 2019
@ebadyano ebadyano added this to the 1.x milestone Jul 11, 2019
@ebadyano ebadyano changed the title Capture team revision in metrics metadata Capture team and track revisions in metrics metadata Jul 11, 2019
@ebadyano ebadyano marked this pull request as ready for review July 11, 2019 16:50
Copy link
Contributor

@dliappis dliappis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR. I left some comments mostly about tests and abstracting things.

@@ -106,6 +106,7 @@ def __init__(self):
self.start_sender = None
self.mechanic = None
self.main_driver = None
self.track_revision= None
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PEP8: whitespace missing before =


name = self.cfg.opts("race", "pipeline")
if not (name == "benchmark-only" or track.is_simple_track_mode(self.cfg)):
self.track_revision = git.head_revision(track.track_path(self.cfg))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMHO invoking git.head_revision i.e. an FS+external command operation against a local directory seems out of place for this file.

As we are anyway supporting the --track-revision command line argument, we could always populate it for the cfg object with the exact value, if unspecified, in a separate module. In loader.py/GitTrackRepository and SimpleTrackRepository we are doing a bunch of low-level stuff, so this could be a starting point?

Note that we are passing the cfg object gets passed to the actor system so the information should be accessible by actors.

name = self.cfg.opts("race", "pipeline")
team_path = self.cfg.opts("mechanic", "team.path", mandatory=False)
if not (name == "benchmark-only" or team_path):
self.team_revision = git.head_revision(team.team_path(self.cfg))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to my comment below about track-parameters, could we have a cfg argument for team_revision -- which could even be specified via cli arguments -- and initialize it, if unset, e.g. in team_path? The information could then be retrieved from the cfg object here.

@@ -1325,6 +1328,8 @@ def to_result_dicts(self):
if plugins:
result_template["plugins"] = list(plugins)

if self.track_revision:
result_template["track-revision"] = self.track_revision
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'll need to adjust tests in metrics_test.py to check this too, right?

@dliappis
Copy link
Contributor

re: abstracting the git operations I read #664 that this PR references and observed there is some guidance there about where each value (esp. for teams) could be retrieved:

For teams, the because is checked out on the target nodes where Elasticsearch is deployed but I think we should derive this info in MechanicActor#receiveMsg_StartEngine() when we are about to load the team. We then need to communicate this information back to the benchmark coordinator actor (BenchmarkActor) in MechanicActor#on_cluster_started() as an additional property in the EngineStarted message.

For tracks, we load the track in BenchmarkActor#setup() and can derive the revision there.

@ebadyano ebadyano requested a review from dliappis July 18, 2019 17:56
Copy link
Contributor

@dliappis dliappis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for iterating! I think we are getting closer, I left some comments about doing the git operations in the right place and not breaking encapsulation, as well as a question about where do we store team revisions?

@@ -144,6 +144,8 @@ def team_path(cfg):
current_team_repo.checkout(repo_revision)
else:
current_team_repo.update(distribution_version)
team_revision = git.head_revision(current_team_repo.repo_dir)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While this works fine, it breaks OOP encapsulation. The class RallyRepository gets instantiated in various places, like here a few lines above in line 142 and should be the "container" holding the information about this git repo. You are already using current_team_repo.repo_dir, for example, so it'd be cleaner and more appropriate to store the information about the git revision in an instance variable defined in the constructor and updated, as needed, when the update or checkout methods get called.

That way you won't need to import git in this module and just access the instance attribute as you are accessing the repo_dir.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense, thank you for the detailed explanation.

@@ -24,7 +24,7 @@
import urllib

from esrally import actor, config, exceptions, track, driver, mechanic, reporter, metrics, time, DOC_LINK, PROGRAM_NAME
from esrally.utils import console, convert, opts
from esrally.utils import console, convert, opts, git
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

git is not needed anymore, right?

@@ -204,6 +211,8 @@ def __init__(self, cfg, fetch, update, repo_class=repo.RallyRepository):
self.repo.checkout(repo_revision)
else:
self.repo.update(distribution_version)
track_revision = git.head_revision(self.track_dir(self.track_name))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment here as earlier, if you move this inside RallyRepository you won't have to directly use git here, but just get the info from the instance attribute.

@@ -1203,7 +1203,7 @@ def format_dict(d):
console.println("No recent races found.")


def create_race(cfg, track, challenge):
def create_race(cfg, track, challenge, track_revision=None):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about the team_revision? I didn't see it getting stored in metrics.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's stored as part of ClusterMetaInfo.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did a test run with the latest commit in this PR against an Elasticsearch metrics store and didn't see the team revision recorded.
e.g.:

image

Btw the particular class ClusterMetaInfo has a comment before indicating it's used internally.

@@ -830,6 +830,7 @@ def test_store_race(self):
pipeline="from-sources", user_tags={"os": "Linux"}, track=t, track_params={"shard-count": 3},
challenge=t.default_challenge, car="defaults", car_params={"heap_size": "512mb"}, plugin_params=None,
total_laps=12,
track_revision="abc1",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should also support and test team_revision, right?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As mentioned in the above comment I think we need to test for team_revision to ensure things work as they are supposed?

@ebadyano ebadyano requested a review from drawlerr July 29, 2019 17:15
# Conflicts:
#	esrally/metrics.py
#	tests/metrics_test.py
@dliappis dliappis self-requested a review August 21, 2019 09:42
Copy link
Contributor

@dliappis dliappis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @ebadyano for the latest commit 4b56274 conducted a few tests and things look great.

LGTM

@ebadyano ebadyano merged commit 966891a into elastic:master Aug 22, 2019
@ebadyano
Copy link
Contributor Author

@dliappis Thank you for all the reviews!

@danielmitterdorfer danielmitterdorfer modified the milestones: 1.x, 1.3.0 Aug 26, 2019
novosibman pushed a commit to novosibman/rally that referenced this pull request Oct 2, 2019
@ebadyano ebadyano deleted the master-metrics branch December 16, 2022 15:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Improves the status quo :Metrics How metrics are stored, calculated or aggregated
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Capture team and track revision in metrics metadata
3 participants