Create and use a unique ES API key for each simulated client #1520

Merged: 19 commits merged on Jun 29, 2022

Conversation

michaelbaamonde
Contributor

@michaelbaamonde michaelbaamonde commented Jun 15, 2022

This PR introduces the create_api_key_per_client client option. If true, the coordinating load driver will create a unique API key per logical client after the benchmark's allocation matrix is created, but before any task execution begins. For any given client, its generated API key will be used for authentication for all of the tasks assigned to it. Upon benchmark completion, the coordinating load driver will delete all API keys that it created initially.

Basic auth credentials are required to create API keys at the start of the benchmark and delete them at the end. We do intend to support using a "global API key" for these administrative operations (see #1067 (comment)) but that will be a follow-up.

Here is an example CLI invocation that you can use to test:

esrally race --distribution-version=8.2.0 --car="defaults,trial-license,x-pack-security" --client-options="use_ssl:true,verify_certs:false,basic_auth_user:'rally',basic_auth_password:'rally-password',create_api_key_per_client:true" --track=geonames --test-mode
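
For readers skimming the description, the lifecycle above can be sketched roughly as follows. This is a minimal illustration and not Rally's actual code: `run_with_per_client_api_keys` and its three callables are hypothetical names, and wrapping the cleanup in `try`/`finally` is an assumption of the sketch.

```python
from typing import Callable, Dict, List


def run_with_per_client_api_keys(
    client_ids: List[int],
    create_key: Callable[[int], Dict[str, str]],
    delete_keys: Callable[[List[str]], None],
    run_tasks: Callable[[int, Dict[str, str]], None],
) -> None:
    """Create one API key per client, run its tasks with that key, then clean up."""
    keys: Dict[int, Dict[str, str]] = {}
    try:
        # One key per logical client, created before any task runs.
        for client_id in client_ids:
            keys[client_id] = create_key(client_id)
        # Every task assigned to a client authenticates with that client's key.
        for client_id in client_ids:
            run_tasks(client_id, keys[client_id])
    finally:
        # Delete only the keys created above; skip the call if none exist.
        if keys:
            delete_keys([key["id"] for key in keys.values()])
```

The callables stand in for whatever actually talks to Elasticsearch, which keeps the ordering (create, run, delete) easy to test in isolation.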

@michaelbaamonde michaelbaamonde force-pushed the api-keys branch 4 times, most recently from 8e5732f to a81189e on June 21, 2022 00:06
Mike Baamonde added 4 commits June 21, 2022 10:32
It will be used multiple times: rest api check, api key creation, api key
deletion.
If the `create_api_key_per_client:true` client option is provided, the
coordinating load driver will create a unique API key per logical client
after the benchmark's allocation matrix is created, but before any task
execution begins. For any given client, its generated API key will be used
for authentication for all of the tasks assigned to it.

Upon benchmark completion, the coordinating load driver will delete all API keys
that it created initially.
@michaelbaamonde michaelbaamonde marked this pull request as ready for review June 21, 2022 14:51
Member

@pquentin pquentin left a comment

This looks really great! I have yet to try it, but expect to only leave nits.

Mike Baamonde added 3 commits June 24, 2022 12:30
This indicates that ES Security isn't enabled, so we inform the user and fail
the benchmark. Since this isn't recoverable, we don't bother retrying.
- Fix a copy/paste error that was calling side_effect on the wrong mock
- Ensure that call counts are correct
- Be more explicit about what arguments we expect calls to contain
Member

@pquentin pquentin left a comment

I'd like to reiterate that I'm a big fan of the amount of care that went into this pull request and its tests.

I have a final issue: I can't get Rally to show me the error message that you carefully crafted. If I leave out basic auth credentials or don't ask for x-pack then I only get a LaunchError in my console and an unprintable RallyError object in the logs. I should see the error messages instead. (This might be unrelated to this pull request. If yes, we can fix it in another one.)

tests/client/factory_test.py
Comment on lines 579 to 592
@pytest.mark.parametrize("version", ["7.9.0", "7.10.0"])
@mock.patch("elasticsearch.Elasticsearch")
def test_raises_exception_when_api_key_deletion_fails(self, es, version):
    es.info.return_value = {"version": {"number": version}}
    ids = ["foo", "bar", "baz"]
    es.security.invalidate_api_key.side_effect = [
        elasticsearch.TransportError(503, "Service Unavailable"),
        elasticsearch.TransportError(401, "Unauthorized"),
        Exception("Whoops!"),
    ]

    with pytest.raises(exceptions.RallyError, match=re.escape(f"Could not delete API keys with the following IDs: {ids}")):
        client.delete_api_keys(es, ids)

Member

nit: Would you agree that this is a subset of the test_legacy_api_key_deletion_reports_only_undeleted_ids_in_exception test?

Contributor Author

This actually raised a more substantial issue in my mind, which I've addressed in 8eec4b1:

It's possible for the 7.10.0+ version of the deletion code to fail silently if we rely just on exceptions to catch errors. This is because it's basically a bulk request, which means that the response can contain both successful and unsuccessful deletions but still have an HTTP 200 status code. That commit handles that scenario, modifies the logic for the "legacy" code, and refactors the relevant tests.
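
To illustrate the silent-failure case, here is a minimal sketch of checking a bulk invalidation response instead of relying on exceptions alone. It is not the code from 8eec4b1: `undeleted_api_key_ids` is a hypothetical helper, and the field names (`invalidated_api_keys`, `previously_invalidated_api_keys`, `error_count`, `error_details`) are assumed from the Elasticsearch invalidate API key documentation.

```python
from typing import Any, Dict, Iterable, List


def undeleted_api_key_ids(requested_ids: Iterable[str], response: Dict[str, Any]) -> List[str]:
    """Return the requested IDs that the response does not confirm as invalidated."""
    confirmed = set(response.get("invalidated_api_keys", []))
    # Keys invalidated by an earlier attempt are gone as well.
    confirmed.update(response.get("previously_invalidated_api_keys", []))
    return [key_id for key_id in requested_ids if key_id not in confirmed]


# An HTTP 200 response body can still report partial failure.
response = {
    "invalidated_api_keys": ["foo"],
    "previously_invalidated_api_keys": ["bar"],
    "error_count": 1,
    "error_details": [{"type": "exception", "reason": "simulated failure"}],
}
remaining = undeleted_api_key_ids(["foo", "bar", "baz"], response)
print(remaining)  # ['baz']
```

Comparing against the requested IDs, rather than only looking at `error_count`, also catches keys that the response never mentions at all.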

@michaelbaamonde
Contributor Author

I can't get Rally to show me the error message that you carefully crafted. If I leave out basic auth credentials or don't ask for x-pack then I only get a LaunchError in my console and an unprintable RallyError object in the logs. I should see the error messages instead. (This might be unrelated to this pull request. If yes, we can fix it in another one.)

@pquentin I think we may not always fail particularly gracefully in general if the combination of client options and cars provided is somehow invalid (implementing #580 would help uncover some of these scenarios). But in this PR's case, here's how I've forced the main failure modes that are specific to API keys that we're trying to catch:

Basic auth missing

Invocation:

esrally race --distribution-version=8.2.0 --car="defaults,trial-license,x-pack-security" --client-options="create_api_key_per_client:true" --track=geonames --test-mode
Output:

    [INFO] Race id is [3a64c6f4-287c-4311-b249-e5d0462c46c4]
    [INFO] Preparing for race ...
    Basic auth credentials are required in order to create API keys.
    Missing basic auth client options are: ['basic_auth_user', 'basic_auth_password']
    Read the documentation at https://esrally.readthedocs.io/en/latest/command_line_reference.html#client-options
    [ERROR] Cannot race. Traceback (most recent call last):
      File "/home/baamonde/code/elastic/rally/esrally/actor.py", line 92, in guard
        return f(self, msg, sender)
      File "/home/baamonde/code/elastic/rally/esrally/driver/driver.py", line 272, in receiveMsg_PrepareBenchmark
        self.coordinator.prepare_benchmark(msg.track)
      File "/home/baamonde/code/elastic/rally/esrally/driver/driver.py", line 677, in prepare_benchmark
        es_clients = self.create_es_clients()
      File "/home/baamonde/code/elastic/rally/esrally/driver/driver.py", line 605, in create_es_clients
        es[cluster_name] = self.es_client_factory(cluster_hosts, cluster_client_options).create()
      File "/home/baamonde/code/elastic/rally/esrally/client/factory.py", line 123, in __init__
        raise exceptions.SystemSetupError(
    esrally.exceptions.SystemSetupError: You must provide the 'basic_auth_user' and
      'basic_auth_password' client options in addition to
      'create_api_key_per_client' in order to create client API keys.

Basic auth incomplete (password missing)

Invocation:

esrally race --distribution-version=8.2.0 --car="defaults,trial-license,x-pack-security" --client-options="create_api_key_per_client:true,basic_auth_user:rally" --track=geonames --test-mode
Output:

[INFO] Race id is [b93464a5-105a-484a-acdb-05b527958cfe]
[INFO] Preparing for race ...
Basic auth credentials are required in order to create API keys.
Missing basic auth client options are: ['basic_auth_password']
Read the documentation at https://esrally.readthedocs.io/en/latest/command_line_reference.html#client-options
[ERROR] Cannot race. Traceback (most recent call last):
  File "/home/baamonde/code/elastic/rally/esrally/actor.py", line 92, in guard
    return f(self, msg, sender)
  File "/home/baamonde/code/elastic/rally/esrally/driver/driver.py", line 272, in receiveMsg_PrepareBenchmark
    self.coordinator.prepare_benchmark(msg.track)
  File "/home/baamonde/code/elastic/rally/esrally/driver/driver.py", line 677, in prepare_benchmark
    es_clients = self.create_es_clients()
  File "/home/baamonde/code/elastic/rally/esrally/driver/driver.py", line 605, in create_es_clients
    es[cluster_name] = self.es_client_factory(cluster_hosts, cluster_client_options).create()
  File "/home/baamonde/code/elastic/rally/esrally/client/factory.py", line 123, in __init__
    raise exceptions.SystemSetupError(
esrally.exceptions.SystemSetupError: You must provide the 'basic_auth_user' and
  'basic_auth_password' client options in addition to
  'create_api_key_per_client' in order to create client API keys.
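
Both basic auth failure modes above come down to a check along these lines. This is a rough sketch rather than the code in esrally/client/factory.py: `SetupError` is a hypothetical stand-in for `esrally.exceptions.SystemSetupError`, the function names are made up for illustration, and the message wording is abbreviated from the console output.

```python
from typing import Any, Dict, List

# Option names taken from the error output above.
REQUIRED_BASIC_AUTH_OPTIONS = ("basic_auth_user", "basic_auth_password")


class SetupError(Exception):
    """Hypothetical stand-in for esrally.exceptions.SystemSetupError."""


def missing_basic_auth_options(client_options: Dict[str, Any]) -> List[str]:
    """List the basic auth options that are absent or empty."""
    return [name for name in REQUIRED_BASIC_AUTH_OPTIONS if not client_options.get(name)]


def check_api_key_prerequisites(client_options: Dict[str, Any]) -> None:
    """Raise if API key creation is requested without full basic auth credentials."""
    if not client_options.get("create_api_key_per_client"):
        return
    missing = missing_basic_auth_options(client_options)
    if missing:
        raise SetupError(
            "Basic auth credentials are required in order to create API keys. "
            f"Missing basic auth client options are: {missing}"
        )
```

Calling `check_api_key_prerequisites({"create_api_key_per_client": True, "basic_auth_user": "rally"})` raises with only `basic_auth_password` reported as missing, which mirrors the second scenario.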

Security not enabled

Invocation:

esrally race --distribution-version=8.2.0 --client-options="create_api_key_per_client:true,basic_auth_user:'rally',basic_auth_password:'rally-password'" --track=geonames --test-mode --kill-running-processes
Output:

[INFO] Race id is [e7cedccf-6508-4767-97a7-1e28c2d076f1]
[INFO] Preparing for race ...
[INFO] Racing on track [geonames], challenge [append-no-conflicts] and car ['defaults'] with version [8.2.0].

[ERROR] Cannot race. Traceback (most recent call last):
  File "/home/baamonde/code/elastic/rally/esrally/client/factory.py", line 287, in create_api_key
    return es.security.create_api_key({"name": f"rally-client-{client_id}"})
  File "/home/baamonde/code/elastic/rally/.venv/lib/python3.8/site-packages/elasticsearch/client/utils.py", line 168, in _wrapped
    return func(*args, params=params, headers=headers, **kwargs)
  File "/home/baamonde/code/elastic/rally/.venv/lib/python3.8/site-packages/elasticsearch/client/security.py", line 117, in create_api_key
    return self.transport.perform_request(
  File "/home/baamonde/code/elastic/rally/.venv/lib/python3.8/site-packages/elasticsearch/transport.py", line 458, in perform_request
    raise e
  File "/home/baamonde/code/elastic/rally/.venv/lib/python3.8/site-packages/elasticsearch/transport.py", line 419, in perform_request
    status, headers_response, data = connection.perform_request(
  File "/home/baamonde/code/elastic/rally/.venv/lib/python3.8/site-packages/elasticsearch/connection/http_urllib3.py", line 277, in perform_request
    self._raise_error(response.status, raw_data)
  File "/home/baamonde/code/elastic/rally/.venv/lib/python3.8/site-packages/elasticsearch/connection/base.py", line 330, in _raise_error
    raise HTTP_EXCEPTIONS.get(status_code, TransportError)(
elasticsearch.exceptions.TransportError: TransportError(405, 'Incorrect HTTP method for uri [/_security/api_key] and method [PUT], allowed: [POST]', 'Incorrect HTTP method for uri [/_security/api_key] and method [PUT], allowed: [POST]')

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/baamonde/code/elastic/rally/esrally/driver/driver.py", line 657, in create_api_key
    api_key = client.create_api_key(es, client_id)
  File "/home/baamonde/code/elastic/rally/esrally/client/factory.py", line 291, in create_api_key
    raise exceptions.SystemSetupError(
esrally.exceptions.SystemSetupError: Got status code 405 when attempting to create API keys. Is Elasticsearch Security enabled?

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/baamonde/code/elastic/rally/esrally/actor.py", line 92, in guard
    return f(self, msg, sender)
  File "/home/baamonde/code/elastic/rally/esrally/driver/driver.py", line 277, in receiveMsg_StartBenchmark
    self.coordinator.start_benchmark()
  File "/home/baamonde/code/elastic/rally/esrally/driver/driver.py", line 750, in start_benchmark
    resp = self.create_api_key(self.default_sync_es_client, client_id)
  File "/home/baamonde/code/elastic/rally/esrally/driver/driver.py", line 664, in create_api_key
    raise exceptions.SystemSetupError(e.message)
esrally.exceptions.SystemSetupError: Got status code 405 when attempting to
create API keys. Is Elasticsearch Security enabled?

What did you try? If there's a scenario that's API-key specific that we can handle better, let's do it in this PR. If it's a more generic issue with invalid client options, a follow-up sounds good.

@michaelbaamonde
Contributor Author

@elasticmachine test this please

Mike Baamonde added 5 commits June 27, 2022 17:22
In 7.10.0+, API keys can be invalidated in bulk. Like bulk indexing requests,
it's possible for some of the keys specified in the request to be deleted
while others fail. In this scenario, we won't actually get an exception, so
we need to parse the response ourselves to make sure that all keys were
actually deleted. If we don't do this, it's possible that we'd silently ignore
errors, leaving API keys behind that we didn't intend to.

This commit implements this error handling and also refactors the "legacy" API
key deletion code to use the same data structures for tracking which API keys
have been deleted and which failed. If there are any un-deleted API keys after
we've exhausted our number of attempts, we report their IDs.
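
The bookkeeping described in this commit message can be sketched as a retry loop that tracks which IDs remain. This is only an illustration under stated assumptions, not the merged implementation: `delete_with_retries`, `TransientError`, and the `invalidate` callable are hypothetical, the attempt count and delay are arbitrary defaults, and retrying only on a designated exception type is a choice made for this sketch.

```python
import time
from typing import Callable, Iterable, List, Set


class TransientError(Exception):
    """Hypothetical marker for failures that are worth retrying."""


def delete_with_retries(
    ids: Iterable[str],
    invalidate: Callable[[List[str]], Set[str]],
    max_attempts: int = 5,
    delay_seconds: float = 1.0,
) -> None:
    """Invalidate keys, retrying the leftovers, and report IDs that never went away."""
    remaining = list(ids)
    for attempt in range(1, max_attempts + 1):
        if not remaining:
            return
        try:
            # `invalidate` returns the set of IDs it confirmed as deleted.
            deleted = invalidate(remaining)
        except TransientError:
            deleted = set()
        remaining = [key_id for key_id in remaining if key_id not in deleted]
        if remaining and attempt < max_attempts:
            time.sleep(delay_seconds)
    if remaining:
        raise RuntimeError(f"Could not delete API keys with the following IDs: {remaining}")
```

Because `remaining` shrinks after every attempt, the final error names only the keys that were never confirmed as deleted.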
Member

@pquentin pquentin left a comment

This works great, thanks! Ship it with or without the change to the except clause. And ignore my other comment. :)

time.sleep(1)
else:
raise_exception(remaining, cause=e)
except Exception as e:
Member

Should we only catch RallyError here? I think for something else like say a KeyError there's no point in retrying.

@michaelbaamonde michaelbaamonde added the labels highlight (A substantial improvement that is worth mentioning separately in release notes), :Load Driver (Changes that affect the core of the load driver such as scheduling, the measurement approach etc.), and enhancement (Improves the status quo) on Jun 29, 2022
@michaelbaamonde michaelbaamonde merged commit 9d2dd33 into elastic:master Jun 29, 2022
@pquentin pquentin added this to the 2.5.1 milestone Jul 4, 2022