Allow index pattern for source-index in shrink-index operation #1118

Merged: dliappis merged 3 commits into elastic:master from shrink-index-pattern on Nov 20, 2020

Conversation

dliappis (Contributor)

This commit adds support for specifying several source indices using the
standard Elasticsearch Multi-target syntax for the shrink-index
runner.

This helps automate shrink operations for multiple indices.

@dliappis dliappis added enhancement Improves the status quo :Track Management New operations, changes in the track format, track download changes and the like labels Nov 19, 2020
@dliappis dliappis added this to the 2.0.3 milestone Nov 19, 2020
@dliappis dliappis self-assigned this Nov 19, 2020
await es.indices.shrink(index=source_index, target=final_target_index, body=target_body)

self.logger.info("Waiting for shrink to finish for index [%s] ...", source_index)
await self._wait_for(es, final_target_index, "shrink for index [{}]".format(final_target_index))
dliappis (Contributor, Author)

We could run things in parallel by moving this to a separate loop later, however, I am not sure whether a high level of concurrent shrink operations will actually be beneficial or cause contention and provide minimal advantage (esp. if only one shrink node has been provided).

danielmitterdorfer (Member)

Agreed, I'd not complicate things unless there is a strong benefit.
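
For illustration only (the PR deliberately keeps the sequential loop discussed above), a minimal sketch of what a concurrent variant could look like, assuming a hypothetical per-index coroutine shrink_single_index that performs the relocation, shrink, and wait steps for one source index:

import asyncio

async def shrink_all(es, index_pairs):
    # Hypothetical concurrent variant; not what the PR implements.
    # index_pairs: list of (source_index, final_target_index) tuples.
    tasks = [shrink_single_index(es, source, target) for source, target in index_pairs]
    # Running all shrinks at once may just cause contention when they share
    # a single shrink node, which is why the sequential loop was kept.
    await asyncio.gather(*tasks)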

danielmitterdorfer (Member) left a comment

Nice convenience improvement. Looks fine for the most part but I left a few comments.

# we need to inject additional settings so we better copy the body
target_body = deepcopy(mandatory(params, "target-body", self))
shrink_node = params.get("shrink-node")
# Choose a random data node if none is specified
node_names = [shrink_node] if shrink_node else []
danielmitterdorfer (Member)

I think this makes the code a bit hard to read as you operate on two variables (node_names and shrink_node) in several branches and check shrink_node multiple times.

How about this instead?

        # Choose a random data node if none is specified
        if shrink_node:
            node_names = [shrink_node]
        else:
            node_names = []
            # choose a random data node
            node_info = await es.nodes.info()
            for node in node_info["nodes"].values():
                if "data" in node["roles"]:
                    node_names.append(node["name"])
            if not node_names:
                raise exceptions.RallyAssertionError("Could not choose a suitable shrink-node automatically. Please specify it explicitly.")

This makes it easier to reason because there is only one top-level conditional and then we populate node_names in each branch separately.

f"Add it to your parameter source and try again.")


def remove_prefix(string, prefix):
danielmitterdorfer (Member)

Too bad we don't have Python 3.9 as minimum version requirement as this has been recently added to the standard library. I wonder whether we should put a TODO into the code?

dliappis (Contributor, Author)

Yes, I noticed that too while writing this. Added a TODO in a3cab0e.
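
For context, a minimal sketch of what the remove_prefix helper above presumably does (only its signature is visible in the diff, so the body below is an assumption); on Python 3.9+ the built-in str.removeprefix covers the same case, which is what the TODO refers to:

# TODO: replace with str.removeprefix(...) once Python 3.9 is the minimum supported version.
def remove_prefix(string, prefix):
    # Assumed behaviour: strip prefix from the start of string if present, else return string unchanged.
    if string.startswith(prefix):
        return string[len(prefix):]
    return string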

preserve_existing=True)

self.logger.info("Waiting for relocation to finish for index [%s] ...", source_index)
await self._wait_for(es, source_index, "shard relocation for index [{}]".format(source_index))
danielmitterdorfer (Member)

We could use an f-string here?

dliappis (Contributor, Author)

Done in 52f28a6
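
For reference, the str.format call above rewritten as the suggested f-string:

await self._wait_for(es, source_index, f"shard relocation for index [{source_index}]")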

await es.indices.shrink(index=source_index, target=final_target_index, body=target_body)

self.logger.info("Waiting for shrink to finish for index [%s] ...", source_index)
await self._wait_for(es, final_target_index, "shrink for index [{}]".format(final_target_index))
danielmitterdorfer (Member)

We could use an f-string here?

dliappis (Contributor, Author)

Done in a3cab0e


self.logger.info("Waiting for shrink to finish for index [%s] ...", source_index)
await self._wait_for(es, final_target_index, "shrink for index [{}]".format(final_target_index))
self.logger.info("Shrinking [%s] to [%s] has finished.", source_index, final_target_index)
# ops_count is not really important for this operation...
return 1, "ops"
danielmitterdorfer (Member)

I think we could use the number of shrunk indices here for the operation count?

dliappis (Contributor, Author)

Done in a3cab0e
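
A minimal sketch of the change the reviewer asks for, assuming a hypothetical target_indices list that holds one entry per shrunk index (the actual variable name in a3cab0e may differ):

# Report one operation per shrunk index instead of a fixed count of 1.
return len(target_indices), "ops"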

docs/track.rst (outdated)
-* ``source-index`` (mandatory): The name of the index that should be shrinked.
-* ``target-index`` (mandatory): The name of the index that contains the shrinked shards.
+* ``source-index`` (mandatory): The name of the index that should be shrinked. Multiple indices can be defined using the `Multi-target syntax <https://www.elastic.co/guide/en/elasticsearch/reference/current/multi-index.html>`_.
+* ``target-index`` (mandatory): The name of the index that contains the shrinked shards. If multiple indices match ``source-index``, the common prefix will appended to ``target-index``.
danielmitterdorfer (Member)

Isn't it the other way around? The common prefix will be removed and instead the unique parts of each source index name will be appended to the string specified via target-index?

dliappis (Contributor, Author)

Of course 🤦! Addressed in a3cab0e

dliappis (Contributor, Author)

@danielmitterdorfer Thanks for the review! I addressed your comments and requested a second review.

danielmitterdorfer (Member) left a comment

Thanks! LGTM

@@ -1303,7 +1303,7 @@ shrink-index
 With the operation ``shrink-index`` you can execute the `shrink index API <https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-shrink-index.html>`_. Note that this does not correspond directly to the shrink index API call in Elasticsearch but it is a high-level operation that executes all the necessary low-level operations under the hood to shrink an index. It supports the following parameters:
 
 * ``source-index`` (mandatory): The name of the index that should be shrinked. Multiple indices can be defined using the `Multi-target syntax <https://www.elastic.co/guide/en/elasticsearch/reference/current/multi-index.html>`_.
-* ``target-index`` (mandatory): The name of the index that contains the shrinked shards. If multiple indices match ``source-index``, the common prefix will appended to ``target-index``.
+* ``target-index`` (mandatory): The name of the index that contains the shrinked shards. If multiple indices match ``source-index``, one shrink operation will execute for every matching index. Each shrink operation will use a modified ``target-index``: the unique suffix of the source index (derived by removing the common prefix of all matching source indices) will be appended to ``target-index``. See also the example below.
danielmitterdorfer (Member)

Much clearer, thanks!
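
To make the documented naming rule concrete, a hedged sketch (the function name and details are illustrative, not the PR's actual code) of deriving a per-index target name by stripping the common prefix of all matching source indices and appending the unique remainder to target-index:

import os.path

def derive_target_names(source_indices, target_index):
    # Common prefix shared by all matching source indices,
    # e.g. "logs-2020-0" for ["logs-2020-01", "logs-2020-02"].
    common_prefix = os.path.commonprefix(source_indices)
    # Append each source index's unique suffix to the configured target-index,
    # e.g. target_index="shrunk-" yields ["shrunk-1", "shrunk-2"].
    return [target_index + source[len(common_prefix):] for source in source_indices]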

@dliappis dliappis merged commit b8e8e6b into elastic:master Nov 20, 2020
@dliappis dliappis deleted the shrink-index-pattern branch November 20, 2020 09:50