-
Notifications
You must be signed in to change notification settings - Fork 313
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow index pattern for source-index in shrink-index operation #1118
Conversation
This commit adds support for specifying several source indices using the standard Elasticsearch Multi-target syntax for the shrink-index runner. This helps automate shrink operations for multiple indices.
esrally/driver/runner.py
Outdated
await es.indices.shrink(index=source_index, target=final_target_index, body=target_body) | ||
|
||
self.logger.info("Waiting for shrink to finish for index [%s] ...", source_index) | ||
await self._wait_for(es, final_target_index, "shrink for index [{}]".format(final_target_index)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could run things in parallel by moving this to a separate loop later, however, I am not sure whether a high level of concurrent shrink operations will actually be beneficial or cause contention and provide minimal advantage (esp. if only one shrink node has been provided).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, I'd not complicate things unless there is a strong benefit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice convenience improvement. Looks fine for the most part but I left a few comments.
esrally/driver/runner.py
Outdated
# we need to inject additional settings so we better copy the body | ||
target_body = deepcopy(mandatory(params, "target-body", self)) | ||
shrink_node = params.get("shrink-node") | ||
# Choose a random data node if none is specified | ||
node_names = [shrink_node] if shrink_node else [] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this makes the code a bit hard to read as you operate on two variables (node_names
and shrink_node
) in several branches and check shrink_node
multiple times.
How about this instead?
# Choose a random data node if none is specified
if shrink_node:
node_names = [shrink_node]
else:
node_names = []
# choose a random data node
node_info = await es.nodes.info()
for node in node_info["nodes"].values():
if "data" in node["roles"]:
node_names.append(node["name"])
if not node_names:
raise exceptions.RallyAssertionError("Could not choose a suitable shrink-node automatically. Please specify it explicitly.")
This makes it easier to reason because there is only one top-level conditional and then we populate node_names
in each branch separately.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed in a3cab0e (also removed please following https://developers.google.com/style/tone#politeness-and-use-of-please)
f"Add it to your parameter source and try again.") | ||
|
||
|
||
def remove_prefix(string, prefix): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Too bad we don't have Python 3.9 as minimum version requirement as this has been recently added to the standard library. I wonder whether we should put a TODO into the code?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes I noticed that too while coding writing this. Added a TODO in a3cab0e.
esrally/driver/runner.py
Outdated
preserve_existing=True) | ||
|
||
self.logger.info("Waiting for relocation to finish for index [%s] ...", source_index) | ||
await self._wait_for(es, source_index, "shard relocation for index [{}]".format(source_index)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could use an f-string here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done in 52f28a6
esrally/driver/runner.py
Outdated
await es.indices.shrink(index=source_index, target=final_target_index, body=target_body) | ||
|
||
self.logger.info("Waiting for shrink to finish for index [%s] ...", source_index) | ||
await self._wait_for(es, final_target_index, "shrink for index [{}]".format(final_target_index)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could use an f-string here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done in a3cab0e
esrally/driver/runner.py
Outdated
await es.indices.shrink(index=source_index, target=final_target_index, body=target_body) | ||
|
||
self.logger.info("Waiting for shrink to finish for index [%s] ...", source_index) | ||
await self._wait_for(es, final_target_index, "shrink for index [{}]".format(final_target_index)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, I'd not complicate things unless there is a strong benefit.
esrally/driver/runner.py
Outdated
|
||
self.logger.info("Waiting for shrink to finish for index [%s] ...", source_index) | ||
await self._wait_for(es, final_target_index, "shrink for index [{}]".format(final_target_index)) | ||
self.logger.info("Shrinking [%s] to [%s] has finished.", source_index, final_target_index) | ||
# ops_count is not really important for this operation... | ||
return 1, "ops" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we could use the number of shrunk indices here for the operation count?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done in a3cab0e
docs/track.rst
Outdated
* ``source-index`` (mandatory): The name of the index that should be shrinked. | ||
* ``target-index`` (mandatory): The name of the index that contains the shrinked shards. | ||
* ``source-index`` (mandatory): The name of the index that should be shrinked. Multiple indices can be defined using the `Multi-target syntax <https://www.elastic.co/guide/en/elasticsearch/reference/current/multi-index.html>`_. | ||
* ``target-index`` (mandatory): The name of the index that contains the shrinked shards. If multiple indices match ``source-index``, the common prefix will appended to ``target-index``. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't it the other way around? The common prefix will be removed and instead the unique parts of each source index name will be appended to the string specified via target-index
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Of course 🤦 ! Addressed in a3cab0e
@danielmitterdorfer Thanks for the review! I addressed your comments and requested a second review. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! LGTM
@@ -1303,7 +1303,7 @@ shrink-index | |||
With the operation ``shrink-index`` you can execute the `shrink index API <https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-shrink-index.html>`_. Note that this does not correspond directly to the shrink index API call in Elasticsearch but it is a high-level operation that executes all the necessary low-level operations under the hood to shrink an index. It supports the following parameters: | |||
|
|||
* ``source-index`` (mandatory): The name of the index that should be shrinked. Multiple indices can be defined using the `Multi-target syntax <https://www.elastic.co/guide/en/elasticsearch/reference/current/multi-index.html>`_. | |||
* ``target-index`` (mandatory): The name of the index that contains the shrinked shards. If multiple indices match ``source-index``, the common prefix will appended to ``target-index``. | |||
* ``target-index`` (mandatory): The name of the index that contains the shrinked shards. If multiple indices match ``source-index``, one shrink operation will execute for every matching index. Each shrink operation will use a modified ``target-index``: the unique suffix of the source index (derived by removing the common prefix of all matching source indices) will be appended to ``target-index``. See also the example below. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Much clearer, thanks!
This commit adds support for specifying several source indices using the
standard Elasticsearch Multi-target syntax for the shrink-index
runner.
This helps automate shrink operations for multiple indices.