Allow to benchmark multi-machine clusters #71

Closed
danielmitterdorfer opened this issue Mar 31, 2016 · 6 comments
Labels
  • :Benchmark Candidate Management (Anything affecting how Rally sets up Elasticsearch)
  • enhancement (Improves the status quo)
  • highlight (A substantial improvement that is worth mentioning separately in release notes)
  • :Telemetry (Telemetry Devices that gather additional metrics)

@danielmitterdorfer
Member

danielmitterdorfer commented Mar 31, 2016

Currently, it is only possible to benchmark on a single machine. The one exception is running with --pipeline=benchmark-only, which (a) puts the burden of provisioning on the user and (b) does not gather system metrics (like CPU usage or index size). Rally should also be able to run benchmarks on clusters that span multiple machines.

In this ticket we want to collect the high-level ideas around making this possible. Work should be done in smaller, more focused tickets. At least these areas need to change:

Note: Distribution of the load generator is handled separately in #257.

@siennathesane

What's the status of this issue?

@danielmitterdorfer
Member Author

It's in the early concept stage. We use GitHub's milestone feature and it's planned for 0.5.0 (no release date).

The current "workaround" is to use the "benchmark-only" pipeline, as I mentioned in the description of this ticket, provided you are able to apply enough load with a single Rally instance. I'd expect that you can apply a bit more load once #108, which I am currently working on, is ready.

@siennathesane
Copy link

Gotcha, thanks!

danielmitterdorfer added a commit that referenced this issue Aug 12, 2016
With this commit we change the client model to one process per
client instead of one thread per client. We also allow queries
to be run by more than one client.

Clients communicate internally via an actor system so we are
already preparing Rally for truly distributed benchmarks which
will be implemented in #71.

Closes #58
Closes #64
Closes #108
@danielmitterdorfer danielmitterdorfer added the blocked This item cannot be finished because of a dependency label Nov 21, 2016
@danielmitterdorfer
Member Author

We'll implement this gradually: support for single-machine clusters comes first in #184, and then we can extend Rally to multi-machine clusters. This reduces the risk of making too many changes at once, since multi-machine support also requires a way to define the cluster topology.

@danielmitterdorfer danielmitterdorfer modified the milestones: 0.5.1, 0.5.0 Nov 21, 2016
@danielmitterdorfer danielmitterdorfer added :Benchmark Candidate Management Anything affecting how Rally sets up Elasticsearch :Telemetry Telemetry Devices that gather additional metrics enhancement Improves the status quo and removed meta A high-level issue of a larger topic which requires more fine-grained issues / PRs labels Dec 23, 2016
@danielmitterdorfer danielmitterdorfer modified the milestones: 0.5.2, 0.5.1 Dec 23, 2016
@danielmitterdorfer danielmitterdorfer modified the milestones: 0.5.2, 0.5.3 Mar 3, 2017
@danielmitterdorfer danielmitterdorfer modified the milestones: 0.5.3, 0.5.2 Mar 3, 2017
@danielmitterdorfer danielmitterdorfer removed the blocked This item cannot be finished because of a dependency label Mar 12, 2017
@danielmitterdorfer danielmitterdorfer modified the milestones: 0.5.x, 0.5.3 Apr 5, 2017
danielmitterdorfer added a commit that referenced this issue Apr 5, 2017
@danielmitterdorfer danielmitterdorfer added the highlight A substantial improvement that is worth mentioning separately in release notes label Jul 21, 2017
@danielmitterdorfer danielmitterdorfer self-assigned this Aug 1, 2017
@danielmitterdorfer
Member Author

For the first implementation we'll assume that every node has the same configuration. That means it will not be possible to define dedicated node roles (master, data, coordinator, ...) or different configurations per node. We will allow multiple nodes per host though.

danielmitterdorfer added a commit that referenced this issue Aug 7, 2017
@danielmitterdorfer danielmitterdorfer modified the milestones: 0.6.3, 0.6.x Aug 8, 2017
@danielmitterdorfer danielmitterdorfer removed their assignment Aug 8, 2017
@danielmitterdorfer
Member Author

Rally can now set up multi-node clusters: specify the respective nodes with --target-hosts. To run multiple nodes on one host, specify the same IP/port pair twice.

Examples:

  • --target-hosts=127.0.0.1:39200 will set up a single ES node on the local machine which runs on port 39200.
  • --target-hosts=127.0.0.1:39200,127.0.0.1:39200 will set up two ES nodes on the local machine.
  • --target-hosts=192.168.2.2:9200,192.168.2.3:9200,192.168.2.4:9200 will set up three ES nodes running on the machines with the IPs 192.168.2.2, 192.168.2.3 and 192.168.2.4 on port 9200. Note that in order for this to work, the Rally daemon process must be started on each machine (see docs).
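
To make the examples above concrete, here is a minimal Python sketch of how such a --target-hosts value could be read as "nodes per host". It is only an illustration and not Rally's actual implementation: the function names `parse_target_hosts` and `nodes_per_host` are hypothetical, and falling back to port 9200 when no port is given is an assumption.

```python
from collections import Counter


def parse_target_hosts(spec, default_port=9200):
    """Split a comma-separated host:port list into (host, port) tuples.

    Hypothetical helper; the default port is an assumption.
    """
    targets = []
    for entry in spec.split(","):
        entry = entry.strip()
        if not entry:
            continue  # tolerate stray commas
        host, sep, port = entry.partition(":")
        targets.append((host, int(port) if sep else default_port))
    return targets


def nodes_per_host(targets):
    """Count requested nodes per host.

    A repeated host:port pair counts as an additional node on that host.
    """
    return Counter(host for host, _ in targets)


if __name__ == "__main__":
    for spec in (
        "127.0.0.1:39200",
        "127.0.0.1:39200,127.0.0.1:39200",
        "192.168.2.2:9200,192.168.2.3:9200,192.168.2.4:9200",
    ):
        print(spec, "->", dict(nodes_per_host(parse_target_hosts(spec))))
```

Under this reading, the second example yields two nodes on 127.0.0.1 and the third yields one node on each of the three machines.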


2 participants