Allow to benchmark plugins from sources
With this commit we allow to build plugins from source code. This works
for Elasticsearch core plugins (that are bundled along with the
Elasticsearch repository) and also externally maintained plugins.

Closes #309
danielmitterdorfer committed Sep 28, 2017
1 parent 320fdf2 commit 2b99e6d
Showing 15 changed files with 456 additions and 104 deletions.
9 changes: 9 additions & 0 deletions docs/command_line_reference.rst
@@ -186,6 +186,15 @@ You can specify the revision in different formats:

Supported date format: If you specify a date, it has to be ISO-8601 conformant and must start with an ``@`` sign to make it easier for Rally to determine that you actually mean a date.

If you want to create source builds of Elasticsearch plugins, you need to specify the revision for Elasticsearch and all relevant plugins separately. Revisions for Elasticsearch and each plugin need to be comma-separated (``,``). Each revision is prefixed either by ``elasticsearch`` or by the plugin name and separated by a colon (``:``). As core plugins are contained in the Elasticsearch repo, there is no need to specify a revision for them (any such revision would in fact be ignored).

Examples:

* Build latest Elasticsearch and plugin "my-plugin": ``--revision="elasticsearch:latest,my-plugin:latest"``
* Build Elasticsearch tag ``v5.6.1`` and revision ``abc123`` of plugin "my-plugin": ``--revision="elasticsearch:v5.6.1,my-plugin:abc123"``

Note that it is still required to provide the parameter ``--elasticsearch-plugins``. Specifying a plugin with ``--revision`` just tells Rally which revision to use for building the artifact. See the documentation on :doc:`Elasticsearch plugins </elasticsearch_plugins>` for more details.
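
The prefixed revision format described above can be illustrated with a minimal sketch. The helper ``parse_revisions`` is hypothetical and is not Rally's own parser; it only handles the ``name:revision`` form and assumes that everything after the first colon belongs to the revision, so that ISO-8601 dates (which contain colons themselves) survive.

```python
def parse_revisions(spec):
    """Split a --revision value into a mapping of component name to revision.

    Hypothetical helper for illustration only; Rally's real parsing may differ.
    """
    revisions = {}
    for part in spec.split(","):
        # Split on the first colon only: a date such as "@2017-09-28T10:00:00Z"
        # contains further colons that belong to the revision itself.
        name, sep, revision = part.strip().partition(":")
        if not sep or not name or not revision:
            raise ValueError("expected '<name>:<revision>' but got %r" % part)
        revisions[name] = revision
    return revisions


print(parse_revisions("elasticsearch:v5.6.1,my-plugin:abc123"))
# {'elasticsearch': 'v5.6.1', 'my-plugin': 'abc123'}
```

Returning a plain dictionary keeps the lookup per component simple; a caller could then check that every plugin passed via ``--elasticsearch-plugins`` either has an entry or is a core plugin.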

``distribution-version``
~~~~~~~~~~~~~~~~~~~~~~~~

4 changes: 2 additions & 2 deletions docs/configuration.rst
@@ -75,7 +75,7 @@ After the initial detection, Rally will try to autodetect your Elasticsearch pro
Otherwise, Rally will choose a default directory and ask you for confirmation::

* Setting up benchmark data directory in [/Users/dm/.rally/benchmarks] (needs several GB).
Enter your Elasticsearch project directory: [default: '/Users/dm/.rally/benchmarks/src']:
Enter your Elasticsearch project directory: [default: '/Users/dm/.rally/benchmarks/src/elasticsearch']:

If you are ok with this default, just press "Enter" and Rally will take care of the rest. Otherwise, provide your Elasticsearch project directory here. Please keep in mind that Rally will run builds with Gradle in this directory if you start a benchmark.

@@ -124,7 +124,7 @@ Rally will ask you a few more things in the advanced setup:

* Benchmark data directory: Rally stores all benchmark related data in this directory which can take up to several tens of GB. If you want to use a dedicated partition, you can specify a different data directory here.
* Elasticsearch project directory: This is the directory where the Elasticsearch sources are located. If you don't actively develop on Elasticsearch you can just leave the default but if you want to benchmark local changes you should point Rally to your project directory. Note that Rally will run builds with Gradle in this directory (it runs ``gradle clean`` and ``gradle :distribution:tar:assemble``).
* JDK 8 root directory: Rally will only ask this if it could not autodetect the JDK 8 home by itself. Just enter the root directory of the JDK you want to use.
* JDK root directory: Rally will only ask this if it could not autodetect the JDK home by itself. Just enter the root directory of the JDK you want to use. By default, Rally will choose Java 9 if available and fall back to Java 8.
* Name for this benchmark environment: You can use the same metrics store for multiple environments (e.g. local, continuous integration etc.) so you can separate metrics from different environments by choosing a different name.
* metrics store settings: Provide the connection details to the Elasticsearch metrics store. This should be an instance that you use just for Rally but it can be a rather small one. A single node cluster with default setting should do it. There is currently no support for choosing the in-memory metrics store when you run the advanced configuration. If you really need it, please raise an issue on Github.
* whether or not Rally should keep the Elasticsearch benchmark candidate installation including all data by default. This will use lots of disk space so you should wipe ``~/.rally/benchmarks/races`` regularly.
70 changes: 56 additions & 14 deletions docs/elasticsearch_plugins.rst
@@ -4,8 +4,7 @@ Using Elasticsearch Plugins
You can have Rally set up an Elasticsearch cluster with plugins for you. However, there are a couple of restrictions:

* This feature is only supported from Elasticsearch 5.0.0 onwards
* You cannot benchmark source-builds of plugins
* Whereas Rally caches downloaded Elasticsearch distributions, plugins will always be installed via the Internet and thus each machine where an Elasticsearch node will be installed requires an active Internet connection.
* Whereas Rally caches downloaded Elasticsearch distributions, plugins will always be installed via the Internet and thus each machine where an Elasticsearch node will be installed, requires an active Internet connection.

Listing plugins
---------------
@@ -14,14 +13,18 @@ To see which plugins are available, run ``esrally list elasticsearch-plugins``::

Available Elasticsearch plugins:

Name Configuration
------------------ ----------------
Name Configuration
----------------------- ----------------
analysis-icu
analysis-kuromoji
analysis-phonetic
analysis-smartcn
analysis-stempel
analysis-ukrainian
discovery-azure-classic
discovery-ec2
discovery-file
discovery-gce
ingest-attachment
ingest-geoip
ingest-user-agent
@@ -30,9 +33,13 @@ To see which plugins are available, run ``esrally list elasticsearch-plugins``::
mapper-attachments
mapper-murmur3
mapper-size
repository-azure
repository-gcs
repository-hdfs
repository-s3
store-smb
x-pack monitoring-local
x-pack security
x-pack monitoring-local
x-pack security

Rally supports plugins only for Elasticsearch 5.0 or later. As the availability of plugins may change from release to release, we recommend that you include the ``--distribution-version`` parameter when listing plugins. By default, Rally assumes that you want to benchmark the latest master version of Elasticsearch.

@@ -67,6 +74,33 @@ As mentioned above, Rally also allows you to specify a plugin configuration and

If you are behind a proxy, please set the environment variable ``ES_JAVA_OPTS`` accordingly on each target machine as described in the `Elasticsearch plugin documentation <https://www.elastic.co/guide/en/elasticsearch/plugins/current/_other_command_line_parameters.html#_proxy_settings>`_.

Building plugins from sources
-----------------------------

Plugin authors may want to benchmark source builds of their plugins. To make this work, you need to manually edit Rally's configuration file in ``~/.rally/rally.ini``. Suppose you want to benchmark the plugin "my-plugin". Then you need to add the following entries to the ``source`` section::

plugin.my-plugin.remote.repo.url = [email protected]:example-org/my-plugin.git
plugin.my-plugin.src.subdir = elasticsearch-extra/my-plugin
plugin.my-plugin.build.task = :my-plugin:plugin:assemble
plugin.my-plugin.build.artifact.subdir = plugin/build/distributions

Let's discuss these properties one by one:

* ``plugin.my-plugin.remote.repo.url``: This is needed to let Rally check out the source code of the plugin. If this is a private repo, credentials need to be set up properly.
* ``plugin.my-plugin.src.subdir``: This is the directory to which the plugin will be checked out, relative to ``src.root.dir``. To allow building the plugin alongside Elasticsearch, the plugin needs to reside in a subdirectory of ``elasticsearch-extra`` (see also the `Elasticsearch testing documentation <https://github.com/elastic/elasticsearch/blob/master/TESTING.asciidoc#building-with-extra-plugins>`_).
* ``plugin.my-plugin.build.task``: The Gradle task to run in order to build the plugin artifact. Note that this command is run from the Elasticsearch source directory as Rally assumes that you want to build your plugin alongside Elasticsearch. Mixing released Elasticsearch distributions with plugin source builds is not supported (nor is there an intention to support it).
* ``plugin.my-plugin.build.artifact.subdir``: This is the subdirectory relative to ``plugin.my-plugin.src.subdir`` in which the final plugin artifact is located.
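
How the relative directories above combine into absolute paths can be sketched as follows. This is an illustration under assumptions, not Rally's implementation: the function ``plugin_paths`` is hypothetical, the sample value for ``src.root.dir`` is made up, and ``src.root.dir`` is assumed to live in the ``node`` section as in the ``config.py`` changes of this commit.

```python
import configparser
import os

# Hypothetical sample values; only the key names are taken from the text above.
SAMPLE_INI = """
[node]
src.root.dir = /home/user/.rally/benchmarks/src

[source]
plugin.my-plugin.src.subdir = elasticsearch-extra/my-plugin
plugin.my-plugin.build.task = :my-plugin:plugin:assemble
plugin.my-plugin.build.artifact.subdir = plugin/build/distributions
"""


def plugin_paths(config, plugin_name):
    """Return (checkout directory, artifact directory) for a plugin."""
    source = config["source"]
    prefix = "plugin.%s." % plugin_name
    # src.subdir is relative to src.root.dir ...
    checkout_dir = os.path.join(config["node"]["src.root.dir"], source[prefix + "src.subdir"])
    # ... and build.artifact.subdir is relative to the plugin checkout.
    artifact_dir = os.path.join(checkout_dir, source[prefix + "build.artifact.subdir"])
    return checkout_dir, artifact_dir


# Only "=" is used as delimiter because the Gradle task name contains colons.
config = configparser.ConfigParser(interpolation=None, delimiters=("=",))
config.read_string(SAMPLE_INI)
checkout_dir, artifact_dir = plugin_paths(config, "my-plugin")
print(checkout_dir)
print(artifact_dir)
```

With the sample values, the checkout lands below ``elasticsearch-extra`` inside the source root, and the artifact directory is nested inside that checkout.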

In order to run a benchmark with ``my-plugin``, you'd invoke Rally as follows: ``esrally --revision="elasticsearch:some-elasticsearch-revision,my-plugin:some-plugin-revision" --elasticsearch-plugins="my-plugin"`` where you need to replace ``some-elasticsearch-revision`` and ``some-plugin-revision`` with the appropriate :ref:`git revisions <clr_revision>`. Adjust other command line parameters (like track or car) accordingly. In order for this to work, you need to ensure that:

* All prerequisites for source builds are installed.
* The Elasticsearch source revision is compatible with the chosen plugin revision. Note that you do not need to know the revision hash to build against an already released version and can use git tags instead. E.g. if you want to benchmark against Elasticsearch 5.6.1, you can specify ``--revision="elasticsearch:v5.6.1,my-plugin:some-plugin-revision"`` (see e.g. the `Elasticsearch tags on GitHub <https://github.com/elastic/elasticsearch/tags>`_ or use ``git tag`` in the Elasticsearch source directory on the console).
* If your plugin needs to be configured, make sure to create a proper plugin specification (see below).

.. note::
Rally can build all `Elasticsearch core plugins <https://github.com/elastic/elasticsearch/tree/master/plugins>`_ out of the box without any further configuration.


Anatomy of a plugin specification
---------------------------------

@@ -100,7 +134,7 @@ In ``$TEAM_REPO_ROOT`` create the directory structure for the plugin and its con
That's it. Later, Rally will just copy all files in ``myplugin/default`` to the home directory of the Elasticsearch node that it configures. First, Rally will always apply the car's configuration and then plugins can add their configuration on top. This also explains why we have created a ``config/elasticsearch.yml``. Rally will just copy this file and replace template variables on the way.

.. note::
If you create a new customization for a plugin, ensure that the plugin name in the team repository matches the official plugin name. Note that hyphens need to be replaced by underscores (e.g. "x-pack" becomes "x_pack"). The reason is that Rally allows to write custom install hooks and the plugin name will become the root package name of the install hook. However, hyphens are not supported in Python which is why we use underscores instead.
If you create a new customization for a plugin, ensure that the plugin name in the team repository matches the core plugin name. Note that hyphens need to be replaced by underscores (e.g. "x-pack" becomes "x_pack"). The reason is that Rally allows to write custom install hooks and the plugin name will become the root package name of the install hook. However, hyphens are not supported in Python which is why we use underscores instead.
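
The naming rule from the note can be sketched in a few lines. The helper ``plugin_package_name`` is hypothetical and only illustrates the hyphen-to-underscore mapping; Rally's own handling may differ.

```python
def plugin_package_name(plugin_name):
    """Map a plugin name to a name that is usable as a Python package name."""
    package = plugin_name.replace("-", "_")
    # Hyphens are not allowed in Python identifiers, underscores are.
    if not package.isidentifier():
        raise ValueError("%r cannot be mapped to a Python package name" % plugin_name)
    return package


print(plugin_package_name("x-pack"))
# x_pack
```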


The next step is now to create our two plugin configurations where we will set the variables for our config base "default". Create a file ``simple.ini`` in the ``myplugin`` directory::
@@ -125,14 +159,18 @@ Rally will now know about ``myplugin`` and its two configurations. Let's check t

Available Elasticsearch plugins:

Name Configuration
------------------ ----------------
Name Configuration
----------------------- ----------------
analysis-icu
analysis-kuromoji
analysis-phonetic
analysis-smartcn
analysis-stempel
analysis-ukrainian
discovery-azure-classic
discovery-ec2
discovery-file
discovery-gce
ingest-attachment
ingest-geoip
ingest-user-agent
@@ -141,13 +179,17 @@ Rally will now know about ``myplugin`` and its two configurations. Let's check t
mapper-attachments
mapper-murmur3
mapper-size
myplugin simple
myplugin advanced
repository-azure
repository-gcs
repository-hdfs
repository-s3
store-smb
x-pack monitoring-local
x-pack security
myplugin simple
myplugin advanced
x-pack monitoring-local
x-pack security

As ``myplugin`` is not an official plugin, the Elasticsearch plugin manager does not know from where to install it, so we need to add the download URL to ``~/.rally/rally.ini`` as before::
As ``myplugin`` is not a core plugin, the Elasticsearch plugin manager does not know from where to install it, so we need to add the download URL to ``~/.rally/rally.ini`` as before::

[distributions]
plugin.myplugin.release.url=https://example.org/myplugin/releases/{{VERSION}}/myplugin-{{VERSION}}.zip
2 changes: 1 addition & 1 deletion docs/faq.rst
@@ -6,7 +6,7 @@ A benchmark aborts with ``Couldn't find a tar.gz distribution``. What's the prob

This error occurs when Rally cannot build an Elasticsearch distribution from source code. The most likely cause is that there is some problem in the build setup.

To see what's the problem, try building Elasticsearch yourself. First, find out where the source code is located (run ``grep local.src.dir ~/.rally/rally.ini``). Then change to this directory and run the following commands::
To see what's the problem, try building Elasticsearch yourself. First, find out where the source code is located (run ``grep src ~/.rally/rally.ini``). Then change to the directory (``src.root.dir`` + ``elasticsearch.src.subdir`` which is usually ``~/.rally/benchmarks/src/elasticsearch``) and run the following commands::

gradle clean
gradle :distribution:tar:assemble
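
Instead of combining the two settings by hand, the directory can also be computed with a small script. This is a sketch under assumptions: the function ``elasticsearch_source_dir`` is hypothetical, and it assumes that ``src.root.dir`` is stored in the ``node`` section and ``elasticsearch.src.subdir`` in the ``source`` section, as in the ``config.py`` changes of this commit.

```python
import configparser
import os


def elasticsearch_source_dir(ini_path):
    """Return src.root.dir joined with elasticsearch.src.subdir from a Rally config file."""
    config = configparser.ConfigParser(interpolation=None, delimiters=("=",))
    with open(os.path.expanduser(ini_path)) as f:
        config.read_file(f)
    return os.path.join(config["node"]["src.root.dir"], config["source"]["elasticsearch.src.subdir"])


if __name__ == "__main__":
    ini_path = "~/.rally/rally.ini"
    if os.path.isfile(os.path.expanduser(ini_path)):
        try:
            print(elasticsearch_source_dir(ini_path))
        except KeyError as e:
            # Older or binary-only configurations do not contain these keys.
            print("Setting %s not found in %s" % (e, ini_path))
```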
58 changes: 55 additions & 3 deletions esrally/config.py
@@ -106,7 +106,7 @@ def auto_load_local_config(base_config, additional_sections=None, config_file_cl


class Config:
CURRENT_CONFIG_VERSION = 11
CURRENT_CONFIG_VERSION = 12

"""
Config is the main entry point to retrieve and set benchmark properties. It provides multiple scopes to allow overriding of values on
@@ -350,7 +350,7 @@ def create_config(self, config_file, advanced_config=False, assume_defaults=Fals
self.o("Autodetected Elasticsearch project directory at [%s]." % source_dir)
logger.debug("Autodetected Elasticsearch project directory at [%s]." % source_dir)
else:
default_src_dir = "%s/src" % root_dir
default_src_dir = "%s/src/elasticsearch" % root_dir
logger.debug("Could not autodetect Elasticsearch project directory. Providing [%s] as default." % default_src_dir)
source_dir = io.normalize_path(self._ask_property("Enter your Elasticsearch project directory:",
default_value=default_src_dir))
@@ -389,9 +389,14 @@ def create_config(self, config_file, advanced_config=False, assume_defaults=Fals
config["node"]["root.dir"] = root_dir

if benchmark_from_sources:
# user has provided the Elasticsearch directory but the root for Elasticsearch and related plugins will be one level above
final_source_dir = io.normalize_path(os.path.abspath(os.path.join(source_dir, os.pardir)))
config["node"]["src.root.dir"] = final_source_dir

config["source"] = {}
config["source"]["local.src.dir"] = source_dir
config["source"]["remote.repo.url"] = repo_url
# the Elasticsearch directory is just the last path component (relative to the source root directory)
config["source"]["elasticsearch.src.subdir"] = io.basename(source_dir)

config["build"] = {}
config["build"]["gradle.bin"] = gradle_bin
@@ -650,6 +655,53 @@ def migrate(config_file, current_version, target_version, out=print):
        config["runtime"]["java.home"] = config["runtime"].pop("java8.home")
        current_version = 11
        config["meta"]["config.version"] = str(current_version)
    if current_version == 11 and target_version > current_version:
        # As this is a rather complex migration, we log more than usual to understand potential migration problems better.
        if "source" in config:
            if "local.src.dir" in config["source"]:
                previous_root = config["source"].pop("local.src.dir")
                logger.info("Set [source][local.src.dir] to [%s]." % previous_root)
                # if this directory was Rally's default location, then move it on the file system to allow for checkouts of plugins
                # in the sibling directory.
                if previous_root == os.path.join(config["node"]["root.dir"], "src"):
                    new_root_dir_all_sources = previous_root
                    new_es_sub_dir = "elasticsearch"
                    new_root = os.path.join(new_root_dir_all_sources, new_es_sub_dir)
                    # only attempt to move if the directory exists. It may be possible that users never ran a source benchmark although they
                    # have configured it. In that case the source directory will not yet exist.
                    if io.exists(previous_root):
                        logger.info("Previous source directory was at Rally's default location [%s]. Moving to [%s]."
                                    % (previous_root, new_root))
                        try:
                            # we need to do this in two steps as we need to move the sources to a subdirectory
                            tmp_path = io.normalize_path(os.path.join(new_root_dir_all_sources, os.pardir, "tmp_src_mig"))
                            os.rename(previous_root, tmp_path)
                            io.ensure_dir(new_root)
                            os.rename(tmp_path, new_root)
                        except OSError:
                            logger.exception("Could not move source directory from [%s] to [%s]." % (previous_root, new_root))
                            # A warning is sufficient as Rally should just do a fresh checkout if moving did not work.
                            console.warn("Elasticsearch source directory could not be moved from [%s] to [%s]. Please check the logs."
                                         % (previous_root, new_root))
                    else:
                        logger.info("Source directory is configured at Rally's default location [%s] but does not exist yet."
                                    % previous_root)
                else:
                    logger.info("Previous source directory was the custom directory [%s]." % previous_root)
                    new_root_dir_all_sources = io.normalize_path(os.path.join(previous_root, os.path.pardir))
                    # name of the elasticsearch project directory.
                    new_es_sub_dir = io.basename(previous_root)

                logger.info("Setting [node][src.root.dir] to [%s]." % new_root_dir_all_sources)
                config["node"]["src.root.dir"] = new_root_dir_all_sources
                logger.info("Setting [source][elasticsearch.src.subdir] to [%s]" % new_es_sub_dir)
                config["source"]["elasticsearch.src.subdir"] = new_es_sub_dir
            else:
                logger.info("Key [local.src.dir] not found. Advancing without changes.")
        else:
            logger.info("No section named [source] found in config. Advancing without changes.")
        current_version = 12
        config["meta"]["config.version"] = str(current_version)

    # all migrations done
    config_file.store(config)
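
The two-step move in the migration above (a directory cannot be renamed directly into its own subdirectory) can be tried out in isolation with the following sketch. It uses only the standard library instead of Rally's ``io`` helpers, and the function ``move_into_subdir`` is hypothetical. Unlike the migration code, it recreates only the original directory and lets ``os.rename`` create the final path component, which avoids renaming onto an existing directory.

```python
import os
import tempfile


def move_into_subdir(directory, subdir_name, tmp_name="tmp_src_mig"):
    """Move the contents of `directory` to `directory/subdir_name` in two steps."""
    parent = os.path.abspath(os.path.join(directory, os.pardir))
    tmp_path = os.path.join(parent, tmp_name)
    target = os.path.join(directory, subdir_name)
    # Step 1: move the whole directory out of the way to a temporary sibling.
    os.rename(directory, tmp_path)
    # Step 2: recreate the (now empty) original directory and move the
    # temporary directory into it under its new name.
    os.makedirs(directory)
    os.rename(tmp_path, target)
    return target


with tempfile.TemporaryDirectory() as root:
    src = os.path.join(root, "src")
    os.makedirs(src)
    # Placeholder file that stands in for a source checkout.
    with open(os.path.join(src, "build.gradle"), "w") as f:
        f.write("// placeholder\n")
    new_root = move_into_subdir(src, "elasticsearch")
    print(os.listdir(src))
    print(os.listdir(new_root))
```

After the call, ``src`` contains only the ``elasticsearch`` subdirectory, which in turn holds the files that were previously located directly in ``src``.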