Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Service Accounts - Initial bootstrap plumbing to add essential classes #70391

Merged
merged 8 commits into from
Mar 16, 2021

Conversation

ywangd
Copy link
Member

@ywangd ywangd commented Mar 15, 2021

This PR is the initial effort to add essential classes for service accounts. The classes are wired in places, but not yet been used. I also intentionally left out the actual credential store implementation. The changes are largely based on the PoC code with quite a few renames and rearragements. I think this is a good first commit which does not bring in too many changes.

ToDo:

  • Tests for the new classes

@ywangd ywangd added >enhancement :Security/Authentication Logging in, Usernames/passwords, Realms (Native/LDAP/AD/SAML/PKI/etc) v8.0.0 v7.13.0 labels Mar 15, 2021
@ywangd ywangd requested a review from tvernum March 15, 2021 13:29
@elasticmachine elasticmachine added the Team:Security Meta label for security team label Mar 15, 2021
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-security (Team:Security)

Comment on lines 82 to 105
if (ElasticServiceAccounts.NAMESPACE.equals(token.getAccountId().namespace()) == false) {
logger.debug("only [{}] service accounts are supported, but received [{}]",
ElasticServiceAccounts.NAMESPACE, token.getAccountId().asPrincipal());
listener.onResponse(null);
return;
}

final ServiceAccount account = ACCOUNTS.get(token.getAccountId().serviceName());
if (account == null) {
logger.debug("the [{}] service account does not exist", token.getAccountId().asPrincipal());
listener.onFailure(null);
return;
}

if (serviceAccountsCredentialStore.authenticate(token)) {
listener.onResponse(success(account, token, nodeName));
} else {
final ParameterizedMessage message = new ParameterizedMessage(
"failed to authenticate service account [{}] with token name []",
token.getAccountId().asPrincipal(),
token.getTokenName());
logger.debug(message);
throw new ElasticsearchSecurityException(message.getFormattedMessage());
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This part will be improved when we refactor the existing AuthenticationResult to be either generic or contain an Authentication object. But for now, I am throwing an exception to signal the termination of the authentication chain and null to mean continuation.
I think we can choose to not fallthrough (continue) in any case once a service account is extracted from the header. We touched on this in previous discussion, but I am not entirely sure now what we agreed upon. Fallthrough is the existing default behaviour, but we probably should be less lenient when it comes to service account. Please let me know your preference.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tvernum ^^^

Copy link
Contributor

@tvernum tvernum Mar 16, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think throwing an exception & terminating is the right approach.
The PoC code was originally written to handle API key auth, in which case there was an argument to be slightly more lenient because there was less guarantee that the header was intended to represent a service account.

Here, once we've deserialized a service account token correctly, I think it's safe to say that the client was intending to authenticate with that token and if that authc fails then the request fails too.

Copy link
Contributor

@tvernum tvernum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

null,
null,
null
));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this descriptor is good for now, but we'll need to check it with the Fleet team later on.
It's not quite identical to the fleet_enroll role that that Fleet creates today, and we'll want to work through the differences.

token.getAccountId().asPrincipal(),
token.getTokenName());
logger.debug(message);
throw new ElasticsearchSecurityException(message.getFormattedMessage());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Despite what I say about throwing below, I think this should be onFailure(new ElasticsearchSecurityException(..)) instead.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about that, but it seems we mostly use onFailure for system related issues, e.g. security index is not available etc. Also because I plan to refactor this later to return an updated AuthenticationResult which uses terminate instead of onFailure for credential mismatch. So overall I think throwing is ok here and all throwing will become AuthenticationResult#terminate in future. Does it make sense?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mostly use onFailure for system related issues

That's not intentional. onFailure is for when exceptions happen in async code.

So once serviceAccountsCredentialStore.authenticate becomes asynchronous, this will need to be onFailure because it's too late to throw an exception - control has passed from the original thread to the callback.

In general, it is preferable for code that takes a listener argument to signal all outcomes via onFailure or onResponse and never throw or return

final User user = account.asUser();
final Authentication.RealmRef authenticatedBy = new Authentication.RealmRef(REALM_NAME, REALM_TYPE, nodeName);
return new Authentication(user, authenticatedBy, null, Version.CURRENT, Authentication.AuthenticationType.TOKEN,
Map.of("_token_name", token.getTokenName()));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the leading _?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not strictly necessary since we control service account entirely, but I think it is still better to use a reserved (with the leading _) name. It could be helpful on the client side for clarity.

@ywangd
Copy link
Member Author

ywangd commented Mar 16, 2021

@tvernum Tests are added. Also addressed your comments. For throwing vs onFailure, please see my comment here. Please let me know if you are ok with it. Or we can always come back to change it later.

I'll be merging the PR once tests pass. They are currently failing for unrelated maven repo issue.

@tvernum
Copy link
Contributor

tvernum commented Mar 16, 2021

For throwing vs onFailure, please see my comment here. Please let me know if you are ok with it. Or we can always come back to change it later.

We will need to change it once the code starts doing async work - but we can leave it until then.

@ywangd
Copy link
Member Author

ywangd commented Mar 16, 2021

For throwing vs onFailure, please see my comment here. Please let me know if you are ok with it. Or we can always come back to change it later.

We will need to change it once the code starts doing async work - but we can leave it until then.

This is a good point. I was hoping to use listener.onResponse(terminate) when refactor happens. But the async work is needed earlier than the refactor. So it makes sense to use onFailure first. I updated the code to always use onFailure for errors.

@ywangd
Copy link
Member Author

ywangd commented Mar 16, 2021

@elasticmachine run elasticsearch-ci/1 elasticsearch-ci/2 elasticsearch-ci/eql-correctness elasticsearch-ci/packaging-sample-windows

@ywangd
Copy link
Member Author

ywangd commented Mar 16, 2021

@elasticmachine run elasticsearch-ci/2 run elasticsearch-ci/eql-correctness run elasticsearch-ci/packaging-sample-windows

@ywangd ywangd merged commit 143202a into elastic:master Mar 16, 2021
ywangd added a commit to ywangd/elasticsearch that referenced this pull request Apr 8, 2021
…lastic#70391)

This PR is the initial effort to add essential classes for service accounts to
lay down the foundation of future works.  The classes are wired in places, but
not yet been used. Also intentionally left out the actual credential store
implementation. It is a good first commit which does not bring in too many
changes.
ywangd added a commit that referenced this pull request Apr 9, 2021
* Service Accounts - Initial bootstrap plumbing for essential classes (#70391)

This PR is the initial effort to add essential classes for service accounts to
lay down the foundation of future works.  The classes are wired in places, but
not yet been used. Also intentionally left out the actual credential store
implementation. It is a good first commit which does not bring in too many
changes.

* Service Accounts - New CLI tool for managing file tokens (#70454)

This is the second PR for service accounts. It adds a new CLI tool
elasticsearch-service-tokens to manage file tokens. The file tokens are stored
in the service_tokens file under the config directory. Out of the planned create,
remove and list sub-commands, this PR only implements the create function since
it is the most important one. The other two sub-commands will be handled in
separate PRs.

* Service Accounts - Authentication with file tokens (#70543)

This the 3rd PR for service accounts. It adds support for authentication with
file tokens. It also adds a cache for performance so that expensive pbkdf2
hashing does not have to be performed on every request. Adding a cache comes
with its own housekeeping work around invalidation. This PR ensures that cache
gets invalidated when underlying token file is changed. It does not implement
APIs for active invalidation. It will be handled in a separate PR after the API
token is in place.

* [Test] Service Account - fix test assumption

* [Test] Service Accounts - handle token names with leading hyphen (#70983)

The CLI tool needs an option terminator (--) for another option names that
begin with a hyphen. Otherwise it errors out with message of "not recognized
option". The service account token name can begin with a hyphen. Hence we need
to use -- when it is the case. An example of equivalent command line is
./bin/elasticsearch-service-tokens create elastic/fleet -- -lead-with-hyphen.

* Service Accounts - Fleet integration (#70724)

This PR implements rest of the pieces needed for Fleet integration, including:

* Get service account role descriptor for authorization
* API for creating service account token and storing in the security index
* API for list tokens for a service account
* New named privilege for manage service account
* Mandate HTTP TLS for both service account auth and service account related
  APIs
* Tests for API key related operations using service account

* [Test] Service Accounts - Remove colon from invalid token name generator (#71099)

The colon character is interpreted as the separate between token name and token
secret. So if a token name contains a colon, it is in theory invalid. But the
parser takes only the part before the colon as the token name and thus consider
it as a valid token name. Subsequent authentication will still fail. But for
tests, this generates a different exception and fails the expectation. This PR
removes the colon char from being used to generate invalid token names for
simplicity.

* Fix for 7.x quirks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>enhancement :Security/Authentication Logging in, Usernames/passwords, Realms (Native/LDAP/AD/SAML/PKI/etc) Team:Security Meta label for security team v7.13.0 v8.0.0-alpha1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants