-
Notifications
You must be signed in to change notification settings - Fork 313
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Create and use a unique ES API key for each simulated client #1520
Conversation
8e5732f
to
a81189e
Compare
It will be used multiple times: rest api check, api key creation, api key deletion.
If the `create_api_key_per_client:true` client option is provided, the coordinating load driver will create a unique API key per logical client after the benchmark's allocation matrix is created, but before any task execution begins. For any given client, its generated API key will be used for authentication for all of the tasks assigned to it. Upon benchmark completion, the coordinating load driver will delete all API keys that it created initially.
a81189e
to
0c80c1f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks really great! I yet have to try it, but expect to only leave nits.
Co-authored-by: Quentin Pradet <[email protected]>
Co-authored-by: Quentin Pradet <[email protected]>
This indicates that ES Security isn't enabled, so we inform the user and fail the benchmark. Since this isn't recoverable, we don't bother retrying.
- Fix a copy/paste error that was calling side_effect on the wrong mock - Ensure that call counts are correct - Be more explicit about what arguments we expect calls to contain
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to reiterate that I'm a big fan of the amount of care that went into this pull requests and its tests.
I have a final issue: I can't get Rally to show me the error message that you carefully crafted. If I leave out basic auth credentials or don't ask for x-pack then I only get a LaunchError in my console and an unprintable RallyError object in the logs. I should see the error messages instead. (This might be unrelated to this pull request. If yes, we can fix it in another one.)
tests/client/factory_test.py
Outdated
@pytest.mark.parametrize("version", ["7.9.0", "7.10.0"]) | ||
@mock.patch("elasticsearch.Elasticsearch") | ||
def test_raises_exception_when_api_key_deletion_fails(self, es, version): | ||
es.info.return_value = {"version": {"number": version}} | ||
ids = ["foo", "bar", "baz"] | ||
es.security.invalidate_api_key.side_effect = [ | ||
elasticsearch.TransportError(503, "Service Unavailable"), | ||
elasticsearch.TransportError(401, "Unauthorized"), | ||
Exception("Whoops!"), | ||
] | ||
|
||
with pytest.raises(exceptions.RallyError, match=re.escape(f"Could not delete API keys with the following IDs: {ids}")): | ||
client.delete_api_keys(es, ids) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Would you agree that this is subset of the test_legacy_api_key_deletion_reports_only_undeleted_ids_in_exception
test?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This actually raised a more substantial issue in my mind, which I've addressed in 8eec4b1:
It's possible for the 7.10.0+ version of the deletion code to fail silently if we rely just on exceptions to catch errors. This is because it's basically a bulk request, which means that the response can contain both successful and unsuccessful deletions but still have an HTTP 200 status code. That commit handles that scenario, modifies the logic for the "legacy" code, and refactors the relevant tests.
@pquentin I think we may not always fail particularly gracefully in general if the combination of client options and cars provided are somehow invalid (implementing #580 would help uncover some of these scenarios). But in this PR's case, here's how I've forced the main failure modes that are specific to API keys that we're trying to catch: Basic auth missing Invocation: esrally race --distribution-version=8.2.0 --car="defaults,trial-license,x-pack-security" --client-options="create_api_key_per_client:true" --track=geonames --test-mode Output
Basic auth incomplete (password missing) Invocation: esrally race --distribution-version=8.2.0 --car="defaults,trial-license,x-pack-security" --client-options="create_api_key_per_client:true,basic_auth_user:rally" --track=geonames --test-mode Output
Security not enabled Invocation: esrally race --distribution-version=8.2.0 --client-options="create_api_key_per_client:true,basic_auth_user:'rally',basic_auth_password:'rally-password'" --track=geonames --test-mode --kill-running-processes Output
What did you try? If there's a scenario that's API-key specific that we can handle better, let's do it in this PR. If it's a more generic issue with invalid client options, a follow-up sounds good. |
@elasticmachine test this please |
In 7.10.0+, API keys can be invalidated in bulk. Like bulk indexing requests, it's possible for some of the keys specified in the request to be deleted while others fail. In this scenario, we won't actually get an exception, so we need to parse the response ourselves to make sure that all keys were actually deleted. If we don't do this, it's possible that we'd silently ignore errors, leaving API keys behind that we didn't intend to. This commit implements this error handling and also refactors the "legacy" API key deletion code to use the same data structures for tracking which API keys have been deleted and which failed. If there are any un-deleted API keys after we've exhausted our number of attempts, we report their IDs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This works great, thanks! Ship it with or without the change to the except clause. And ignore my other comment. :)
time.sleep(1) | ||
else: | ||
raise_exception(remaining, cause=e) | ||
except Exception as e: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we only catch RallyError
here? I think for something else like say a KeyError
there's no point in retrying.
This PR introduces the
create_api_keys_per_client
client option. Iftrue
, the coordinating load driver will create a unique API key per logical client after the benchmark's allocation matrix is created, but before any task execution begins. For any given client, its generated API key will be used for authentication for all of the tasks assigned to it. Upon benchmark completion, the coordinating load driver will delete all API keys that it created initially.Basic auth credentials are required to create API keys at the start of the benchmark and delete them at the end. We do intend to support using a "global API key" for these administrative operations (see #1067 (comment)) but that will be a follow-up.
Here is an example CLI invocation that you can use to test: