More avoidance for loadAuthorizedIndices #81237
@elasticmachine update branch |
I think delaying evaluating the
TBC I'm generally in favor of this change.
…-authorized-indices
…-authorized-indices
@albertzaharovits I made a rather big change based on the discussion. It is still in line with the agreement we reached last time and is, overall, a simplification. Instead of changing the signatures of existing classes (
I think the change should have a positive impact on performance overall, because Scenario 1 above is very common and the potential gain is significant (especially when the cluster is large). Scenario 3 does carry a certain performance penalty, but it is mitigated by two things: (1) we can advise users to list wildcard patterns first in the requested names; (2) the list of concrete names is unlikely to be very long, so the performance drop should not be significant (compared to loading authorizedIndices).
[1] An edge case of this scenario is when all requested names are backing indices and the user does not have privileges over the data stream. In this case, the index pattern predicate can run twice for each given name (first for the data stream name, then for the backing index name). However, this is still unlikely to be an issue because (1) it still saves the time of loading authorized indices, which is likely much more expensive than running the predicate a couple more times; (2) we do not recommend granting privileges only over the backing indices. If the user does not have privileges over the backing indices either, the request will end up as a 403 failure, and performance is not critical in the failure case.
[2] We can avoid the additional predicate executions by looping through the requested names earlier to find out whether any of them contains a wildcard. However, this loop has its own cost, and if the names are all concrete, it ends up doing useless work. In short, it is still a tradeoff between different scenarios.
Pinging @elastic/es-security (Team:Security) |
for (IndexAbstraction indexAbstraction : lookup.values()) {
    if (indexAbstraction.getType() != IndexAbstraction.Type.DATA_STREAM && predicate.test(indexAbstraction)) {
        indicesAndAliases.add(indexAbstraction.getName());
final Set<String> authorizedIndices = Collections.unmodifiableSet(indicesAndAliases);
The AuthorizedIndicesSet should protect against modification to this, so maybe avoid the wrapping in an unmodifiable set.
Yes, it is not strictly necessary. The Set is passed into timeChecker.accept in the next line. My reasoning was that if we don't know anything about timeChecker and want maximum protection against modification, then we need to wrap it in an unmodifiableSet. I think it was just me being unnecessarily paranoid, so I dropped the wrapping.
@Override
public boolean containsAll(Collection<?> c) {
    return false;
I think we should implement this in some way.
Thanks for catching it. I meant to implement it but somehow missed it. Fixed by just delegating to contains.
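To make the fix above concrete, here is a minimal sketch of a lazily loaded set whose containsAll delegates to contains. The class name LazyNameSet, its fields, and the isLoaded helper are hypothetical and chosen for illustration; this is not the actual class from the PR. It assumes the predicate and the supplier agree on which names are authorized.

```java
import java.util.AbstractSet;
import java.util.Collection;
import java.util.Iterator;
import java.util.Set;
import java.util.function.Predicate;
import java.util.function.Supplier;

// Hypothetical stand-in for the lazily loaded set discussed here; not the Elasticsearch class.
final class LazyNameSet extends AbstractSet<String> {
    private final Supplier<Set<String>> supplier; // expensive: builds the full set
    private final Predicate<String> predicate;    // cheap: answers for one name at a time
    private Set<String> loaded;                    // stays null until the full set is needed

    LazyNameSet(Supplier<Set<String>> supplier, Predicate<String> predicate) {
        this.supplier = supplier;
        this.predicate = predicate;
    }

    // Not thread safe: the supplier may run more than once if called concurrently.
    private Set<String> load() {
        if (loaded == null) {
            loaded = supplier.get();
        }
        return loaded;
    }

    boolean isLoaded() {
        return loaded != null;
    }

    @Override
    public boolean contains(Object o) {
        if (loaded == null) {
            // Answer from the predicate without triggering the expensive load.
            return predicate.test((String) o);
        }
        return loaded.contains(o);
    }

    @Override
    public boolean containsAll(Collection<?> c) {
        // Delegate to contains so that a list of concrete names never forces a load.
        for (Object o : c) {
            if (contains(o) == false) {
                return false;
            }
        }
        return true;
    }

    @Override
    public Iterator<String> iterator() {
        return load().iterator(); // iteration needs the real set
    }

    @Override
    public int size() {
        return load().size(); // so does size
    }
}
```

With this shape, only operations that need the whole set (iteration, size) pay for the load; membership checks on concrete names do not.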
if (authorizedIndices == null) {
    authorizedIndices = supplier.get();
}
return authorizedIndices;
I would slightly prefer that we make this check thread safe, as in the CachingAsyncSupplier.
We're not using it from multiple threads, I don't think we ever would, and even if we did, we would only risk performance problems from building the set multiple times (but no correctness problems).
What do you think?
This Set is wrapped in a CachingAsyncSupplier, i.e. resolvedIndicesAsyncSupplier (not authorizedIndicesSupplier, which I will remove, see below). The resolvedIndicesAsyncSupplier is what gets passed to other places, so this Set is never used directly by anything else. Since resolvedIndicesAsyncSupplier already guarantees thread safety, I think the Set itself can be kept simple. I added the following comment to the class to clarify its usage:
* NOTE that the lazy loading does not guarantee to run only once and is not meant to be used by multi-threads
* because loading multiple times can incur performance penalty (but not correctness).
I don't think this implementation can be considered thread safe. There's no memory barrier here, so there's no guarantee about the state of the authorizedIndices field when viewed from another thread.
The options are:
- simply state that it is not thread-safe and that callers requiring thread safety need to synchronize
- mark authorizedIndices as volatile
You are right. It is indeed not thread safe. I changed the documentation to state that and warned about changing how this class is used.
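For reference, the second option from the review (marking the field volatile) could look roughly like the sketch below. VolatileLazy is a hypothetical helper written for illustration, not code from the PR, which chose the documentation route instead. It assumes the supplier never returns null.

```java
import java.util.function.Supplier;

// Hypothetical helper illustrating the "mark the field volatile" option; not an Elasticsearch class.
final class VolatileLazy<T> {
    private final Supplier<T> supplier;
    // volatile gives the memory barrier the review asks for: a thread that observes
    // a non-null value also observes the fully constructed object behind it.
    private volatile T value;

    VolatileLazy(Supplier<T> supplier) {
        this.supplier = supplier;
    }

    T get() {
        T v = value;            // one volatile read
        if (v == null) {
            v = supplier.get(); // may still run more than once under contention
            value = v;          // volatile write publishes the result
        }
        return v;
    }
}
```

This variant fixes visibility but does not guarantee the supplier runs only once; concurrent callers can each build the value, which matches the "performance penalty but not correctness" trade-off described in the thread.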
})
);
});
final AsyncSupplier<Set<String>> authorizedIndicesSupplier = new CachingAsyncSupplier<>(
Do you think we still need the CachingAsyncSupplier wrapper magic?
I would ditch it; it's hard to follow and useless now.
I agree it is redundant now and should be removed.
LGTM this is looking cool!
Only had small points.
Hi @ywangd, I've created a changelog YAML for you. |
public RBACEngine(
    Settings settings,
    CompositeRolesStore rolesStore,
    LoadAuthorizedIndicesTimeChecker.Factory authzIndicesTimerFactory
Moving TimeChecker here means it is no longer applied to a custom AuthorizationEngine. I think that is OK. The intention was always to measure the performance of RBACEngine.
There seems to be a lack of tests that this change actually achieves anything. The existing tests verify that you haven't broken any existing behaviour, but there's no test that would fail if someone changed the behaviour back so it was no longer lazy.
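One way to write the kind of regression check requested here is to count supplier invocations and assert that the count stays at zero for concrete-name lookups. The sketch below is an assumption-laden illustration: CountingSupplier, LazyAuthorizedNames, and LazinessRegressionCheck are hypothetical names, and the real tests in RBACEngineTests may be structured differently.

```java
import java.util.Set;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Predicate;
import java.util.function.Supplier;

// Hypothetical test helper: wraps a supplier and counts how often it is invoked.
final class CountingSupplier<T> implements Supplier<T> {
    private final Supplier<T> delegate;
    private final AtomicInteger calls = new AtomicInteger();

    CountingSupplier(Supplier<T> delegate) {
        this.delegate = delegate;
    }

    @Override
    public T get() {
        calls.incrementAndGet();
        return delegate.get();
    }

    int calls() {
        return calls.get();
    }
}

// Hypothetical stand-in for the code under test: predicate first, full load only on demand.
final class LazyAuthorizedNames {
    private final Supplier<Set<String>> supplier;
    private final Predicate<String> predicate;
    private Set<String> loaded;

    LazyAuthorizedNames(Supplier<Set<String>> supplier, Predicate<String> predicate) {
        this.supplier = supplier;
        this.predicate = predicate;
    }

    boolean isAuthorized(String name) {
        return loaded != null ? loaded.contains(name) : predicate.test(name);
    }

    Set<String> all() {
        if (loaded == null) {
            loaded = supplier.get();
        }
        return loaded;
    }
}

final class LazinessRegressionCheck {
    public static void main(String[] args) {
        CountingSupplier<Set<String>> supplier = new CountingSupplier<>(() -> Set.of("index-1", "index-2"));
        LazyAuthorizedNames names = new LazyAuthorizedNames(supplier, n -> n.startsWith("index-"));

        // A concrete-name check must not invoke the expensive supplier.
        if (names.isAuthorized("index-1") == false || supplier.calls() != 0) {
            throw new AssertionError("concrete-name check triggered loading");
        }
        // Asking for the full set loads exactly once, and the result is reused afterwards.
        names.all();
        names.all();
        if (supplier.calls() != 1) {
            throw new AssertionError("expected exactly one load, got " + supplier.calls());
        }
    }
}
```

A check like this fails as soon as someone makes the loading eager again, which is the gap the comment above points out.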
@Override
public boolean contains(Object o) {
    if (authorizedIndices == null) {
        return predicate.test((String) o);
Strictly speaking, this violates the guarantees of the Set interface. You should return false if o is not a String (rather than throw CCE).
- return predicate.test((String) o);
+ if (o instanceof String s) {
+     return predicate.test(s);
+ } else {
+     return false;
+ }
I thought about it, but the JDK doc says it can throw ClassCastException, so I didn't do anything. Also, adding an instanceof check can in theory hide subtle usage errors, so I am not sure it helps with anything.
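Both behaviours discussed in this thread exist among the JDK's own Set implementations, which the small demo below illustrates. ContainsTypeCheckDemo and throwsClassCast are hypothetical names used only for this illustration; the demo says nothing about which behaviour the PR's class should pick.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical demo class: probes a Set<String> with an object of the wrong type.
final class ContainsTypeCheckDemo {
    // Returns true if contains(probe) throws ClassCastException, false if it returns normally.
    static boolean throwsClassCast(Set<String> set, Object probe) {
        try {
            set.contains(probe);
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Set<String> hash = new HashSet<>(Set.of("index-1"));
        Set<String> tree = new TreeSet<>(Set.of("index-1"));

        // HashSet compares via hashCode/equals, so a wrong-typed probe is simply absent.
        System.out.println(throwsClassCast(hash, 42)); // false
        // A non-empty TreeSet with natural ordering compares the probe to its elements,
        // and comparing an Integer with a String throws ClassCastException.
        System.out.println(throwsClassCast(tree, 42)); // true
    }
}
```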
I added two assertions to existing tests to ensure that the returned
PS: Benchmark showed that the change helped a great deal for requests with a single concrete name.
…ecurity/authz/RBACEngineTests.java Co-authored-by: Tim Vernum <[email protected]>
…#81237) This PR replaces the concrete list of authorizedIndices with a lazily loaded class so that loading is not triggered unless necessary. For example, a search targeting a concrete name, GET index/_search, no longer triggers the loading. But names with wildcards will still trigger the loading. However, if we organize the indices with data streams and aliases, it is possible to achieve a similar wildcard effect while still avoiding the loading. That is, searches like GET alias/_search or GET data_stream/_search will not trigger the loading even though they target multiple indices behind the single name.
…#88149) * Avoid loadAuthorizedIndices for requests with concrete names (#81237) This PR replaces the concrete list of authorizedIndices with a lazily loaded class so that loading is not triggered unless necessary. For example, a search targeting a concrete name, GET index/_search, no longer triggers the loading. But names with wildcards will still trigger the loading. However, if we organize the indices with data streams and aliases, it is possible to achieve a similar wildcard effect while still avoiding the loading. That is, searches like GET alias/_search or GET data_stream/_search will not trigger the loading even though they target multiple indices behind the single name. * fix compilation Co-authored-by: Elastic Machine <[email protected]>
This PR replaces the concrete list of authorizedIndices with a lazily loaded class
so that loading is not triggered unless necessary.
For example, a search targeting a concrete name,
GET index/_search
no longer triggers the loading. But names with wildcards will still trigger the loading.
However, if we organize the indices with data streams and aliases, it is possible
to achieve a similar wildcard effect while still avoiding the loading. That is,
searches like
GET alias/_search
or GET data_stream/_search
will not trigger the loading even though they target multiple indices behind the single name.