[DISCUSS] ML - Spaces and Kibana Privileges #37709
Pinging @elastic/ml-ui |
Pinging @elastic/kibana-security |
/cc @epixa |
I think there is also an option 3 that is half way between options 1 and 2, and similar in some ways to how index patterns work. Indices live in Elasticsearch and know nothing about Kibana, so Kibana has index patterns that are Kibana saved objects that refer to the Elasticsearch indices. In the same way we could introduce Kibana saved objects associated with spaces that refer to ML jobs in Elasticsearch. The ML UI would only display jobs in a particular space if there was a corresponding Kibana saved object. A rough outline of how this might work is as follows:
(There are probably holes in this, and it should be taken as a starting point for discussion rather than a spec for implementation.) |
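As a minimal sketch of the idea above: assume a hypothetical saved-object shape that holds only a space ID and the ID of the ML job it refers to. Neither `MlJobRef` nor `jobsVisibleInSpace` exists in Kibana or ML; they are illustrative names only. The UI would then show a job only when a matching reference exists in the current space.

```typescript
// Hypothetical shapes for illustration only; not actual Kibana or ML types.
interface MlJobRef {
  spaceId: string; // space the Kibana saved object lives in
  jobId: string;   // ID of the ML job stored in Elasticsearch
}

interface MlJob {
  job_id: string;
  description: string;
}

// Return only the jobs that have a corresponding saved object in the given space.
function jobsVisibleInSpace(jobs: MlJob[], refs: MlJobRef[], spaceId: string): MlJob[] {
  const allowed = new Set(
    refs.filter((ref) => ref.spaceId === spaceId).map((ref) => ref.jobId)
  );
  return jobs.filter((job) => allowed.has(job.job_id));
}
```

A job with no reference in any space would simply not appear in the ML UI under this scheme, much like an index with no index pattern.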
The approach which you've outlined @droberts195 seems reasonable. The biggest issue that I foresee, which isn't necessarily a deal breaker, is that ML jobs will have to be interacted with using ML specific APIs as the existing saved object APIs won't be able to perform the "application level join" with the jobs which are stored in the machine learning specific indices. Similarly, ML jobs won't be able to be exported/imported using the saved object management screens, without augmenting the underlying infrastructure. Another thing to consider is making the necessary changes to the "saved objects service" to be able to query the ES specific ML indices. I'm not too familiar with the structure of the indices to know how feasible this would be. |
I think that's probably for the best. We have an open issue for the ability to import and export ML jobs: elastic/elasticsearch#37987 An ML anomaly detector job could have a huge amount of data associated with it, spread over multiple indices:
Whatever is eventually done for the ML import/export functionality will have to take all of this into account. Therefore I think it would be best if the saved objects associated with ML jobs didn't show up at all in the list of objects to be exported. It would be highly misleading to just export or import the single job config document and lose everything else.
I think that this is also for the best based on my experience of the 6.x -> 7.x upgrade process. The ML job documents may need to be modified during the major version upgrade process. This will need to be done in such a way that jobs can continue to run in the mixed version cluster during a rolling upgrade. To facilitate this we grant no permissions to anyone on the If we get to the stage where we have a saved object twin for every job that is to be visible in the UI then initially these could be used to just define which space the job was in, but in the longer term could potentially replace the I'm sure there are still a lot of details to think through. Maybe we should organise a call to discuss in more depth. |
Agreed, and making this change would allow us to grant users access to ML jobs on a per-space level. Until we remove the user's privileges to query the ML indices directly, we aren't able to do so.
That sounds reasonable to me, let me know if there's anything that I can prepare before the meeting which would aid the discussion. |
During security team roadmap planning, we discussed two alternative approaches for allowing ML to be part of the Kibana "base privileges". Originally we were planning on suggesting Configurable base privileges as the solution. However, we're currently favoring Kibana base privileges opt-in using kibana.yml as the solution. We are hesitant about adding "configurable base privileges" without allowing users time to adopt "feature controls" to determine whether we're solving a persistent and common issue, or whether we're introducing functionality which will cause confusion and increased complexity. Implementing "configurable base privileges" primarily makes sense if this is a feature which we would like to have long-term for Kibana; using it primarily as the solution for the 7.x time-frame to maintain backwards compatibility as we migrate ML to using Kibana privileges doesn't make much sense. Could you all read through the Kibana base privileges opt-in using kibana.yml issue and let us know whether the solution is acceptable or whether you'd prefer to discuss in person? We're hoping to get one of these solutions on our roadmap and ensure it's synchronized with your timeline for migrating the ML privileges model. |
I've given this some more thought in the context of what changes should be made to more fully support "linked" saved-objects, like the ones that ML will be using. Since the initial discussion, we've begun to think of Code's usage of saved-objects as being "linked" to resources on the Kibana server's filesystem, which is similar to ML's jobs being "linked" to Elasticsearch ML Jobs. The following are purely my thoughts, and are in no way prescriptive of how the ML team should implement this. It's very likely I'm grossly over-simplifying matters because of my lack of detailed knowledge about ML, so please take my opinions with a grain of salt. From the end-user's perspective, there are two distinct operations which we intend to support for all saved-objects:
When using a "linked" saved-object, this becomes somewhat more complicated because part of the data is stored in the saved-object itself and some of it in the "linked" resource. However, it's possible to abstract away this complexity from the users. For example, when the user performs a create/update operation, they can be required to provide all of the data that will be stored in the saved-object and the ES ML job, and the Kibana server can create/update both the saved-object and associated ES ML job. This can be done using a SavedObjectsClientWrapper, so that end-users can continue to use the Saved objects APIs to perform the operation. A similar approach could be utilized for get/find operations. Utilizing the same SavedObjectsClientWrapper after retrieving the saved-objects, the Kibana server could retrieve all ES ML jobs and merge the two definitions. This would allow end-users and ML UI engineers to interact with Kibana ML Jobs using similar interfaces, and allow us to implement the ability to "share" ML Jobs in multiple spaces when that time comes. We'll likely want to add the ability for specific saved-object types to opt-out of being exportable, importable and copied to encourage the proper usage of these linked saved-objects given the inherent complexities. |
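To make the create/update and get/find flow above concrete, here is a rough sketch under simplifying assumptions. All names (`JobLink`, `JobConfig`, `LinkStore`, `JobStore`, `LinkedJobClient`) are hypothetical, the stores are synchronous stand-ins for the asynchronous Kibana and Elasticsearch clients, and the real SavedObjectsClientWrapper interface is not used here.

```typescript
// Hypothetical, simplified interfaces; the real Kibana and Elasticsearch
// clients are asynchronous and have richer signatures.
interface JobLink {
  id: string;      // saved-object ID
  spaceId: string; // space the saved object belongs to
  jobId: string;   // ID of the linked ES ML job
}

interface JobConfig {
  job_id: string;
  description: string;
}

interface LinkStore {
  create(link: JobLink): void;
  get(id: string): JobLink | undefined;
}

interface JobStore {
  put(config: JobConfig): void;
  get(jobId: string): JobConfig | undefined;
}

class LinkedJobClient {
  private readonly links: LinkStore;
  private readonly jobs: JobStore;

  constructor(links: LinkStore, jobs: JobStore) {
    this.links = links;
    this.jobs = jobs;
  }

  // The caller supplies all of the data; both halves are written.
  create(id: string, spaceId: string, config: JobConfig): void {
    this.jobs.put(config);
    this.links.create({ id, spaceId, jobId: config.job_id });
  }

  // Fetch the saved object, then the linked job, and merge the two definitions.
  get(id: string): (JobLink & JobConfig) | undefined {
    const link = this.links.get(id);
    if (!link) return undefined;
    const config = this.jobs.get(link.jobId);
    if (!config) return undefined;
    return { ...link, ...config };
  }
}
```

The sketch ignores partial failure (for example, the job is written but the saved object is not), which a real implementation would have to handle.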
FWIW, after discussing with @epixa, he's not on-board with my previous recommendation to use the SavedObjectsClientWrapper to do the "in application joins" to the linked ES ML Jobs and favors using dedicated ML APIs for this. Since import/export and "copy to space" are out-of-scope for the initial implementation, the non-pedantic benefits we get from my prior recommendation are negligible. |
Closing this issue, as it's no longer being used for discussing this integration. |
We've had a number of discussions regarding the plans and requirements to transition Machine Learning to using Spaces and Kibana Privileges. So many, in fact, that I figured it was worth documenting the potential paths forward.
Ability to view all jobs across all spaces
My understanding is that ML's primary concern with adopting Spaces is it makes it hard for users to get a list of all ML jobs across every space, which could potentially lead to duplicate jobs being created. The duplicated jobs are an issue because ML jobs are computationally expensive and have a non-negligible impact on the health of the Elasticsearch cluster.
To address this concern, it was suggested that ML could add a new section to the Management application which would list all ML jobs across all spaces. Currently, the management sections are either for the entire Elasticsearch cluster, global to Kibana, or specific to the current space. It's not immediately obvious which sections are which "scope", so we've started discussing how we could improve this situation here. However, I don't think these changes absolutely have to be made before ML adds a management section to manage all ML jobs across all spaces.
Using spaces only for organization
Prior to migrating to Kibana Privileges, which is described in more detail below, it's possible for ML to use spaces as an organizational feature without changing the authorization model. This would require no changes to the way that ML performs authorization, but instead allow for each ML job to be augmented with a space identifier, and then ML's job endpoints in Kibana could filter out the jobs based on the currently selected space.
If there's enough need from ML's users to be able to categorize their ML jobs, this might be a reasonable place to start as it shouldn't require any change to "core Kibana" and appears on the surface to require minimal development effort from the ML team.
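A minimal sketch of this organizational-only approach, assuming each job carries a hypothetical `space_id` field (not part of the current ML job configuration) and that the endpoint already knows the currently selected space:

```typescript
// Hypothetical job shape: `space_id` is an assumed field added for illustration.
interface SpaceTaggedJob {
  job_id: string;
  space_id: string;
}

// Filter the full job list down to the currently selected space.
// No authorization is involved; this is purely organizational.
function filterJobsBySpace(jobs: SpaceTaggedJob[], currentSpaceId: string): SpaceTaggedJob[] {
  return jobs.filter((job) => job.space_id === currentSpaceId);
}
```

Because the user could still query the ML indices directly, this hides jobs in the UI without restricting access to them.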
Migrating to Kibana Privileges
Kibana Privileges allow us to grant access to individual features and spaces within Kibana. Instead of allowing users to have direct access to the underlying system indices, we grant the Kibana internal server user access to the system indices. We're then able to perform authorization within Kibana using the Kibana application privileges before executing the query against Elasticsearch using the internal server user. The following is a rough sequence diagram of how this works within the context of Saved Objects:
Kibana applications which rely upon "saved objects" get this authorization applied automatically as part of the secured instance of the `SavedObjectsClient`. However, since ML doesn't store its data in the `.kibana` index, it isn't currently possible to use the `SavedObjectsClient` to access the ML jobs. There is an effort underway to allow the `SavedObjectsClient` to work with other indices, but it will require the documents in the Elasticsearch index abide by the same "schema" that we use on the `.kibana` index for this to work properly.
Changing ML to use "saved objects" isn't an absolute necessity, at least immediately. However, to take advantage of the future enhancements to Kibana authorization, it would be advantageous to do so.
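The authorize-then-query flow described in this section can be sketched as follows. This is an illustration under stated assumptions, not Kibana's implementation: the privilege name, `PrivilegeCheck`, `InternalQuery`, and `findWithAuthorization` are all hypothetical, and the callbacks are synchronous for brevity.

```typescript
// Hypothetical types; the privilege string and both callbacks are assumptions.
type PrivilegeCheck = (username: string, spaceId: string, privilege: string) => boolean;
type InternalQuery<T> = (spaceId: string) => T[];

function findWithAuthorization<T>(
  username: string,
  spaceId: string,
  privilege: string,
  hasPrivilege: PrivilegeCheck,
  queryAsInternalUser: InternalQuery<T>
): T[] {
  // 1. Authorize inside Kibana using the application privileges.
  if (!hasPrivilege(username, spaceId, privilege)) {
    throw new Error(`Forbidden: ${username} lacks ${privilege} in space ${spaceId}`);
  }
  // 2. Only then run the query as the internal server user.
  return queryAsInternalUser(spaceId);
}
```

The end user never needs privileges on the underlying index; only the internal user does, and it is reached only after the privilege check passes.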
Regardless of whether or when we decide to change ML to use saved objects, the following change to the way that base privileges will behave will be required to ensure that existing roles which aren't supposed to grant access to ML don't unintentionally get access to ML: #35865
Option 1 - ML implements their own authorization
It's theoretically possible for ML to implement, for ML jobs, a workflow similar to the one performed by the existing secured instance of the `SavedObjectsClient`. This would allow ML to continue using their existing Elasticsearch index structure.
Additionally, this would require no changes to the `SavedObjectsClient` to support the management section for managing ML jobs across all Spaces.
The largest downside to this approach is that we'd end up re-implementing the access patterns which are already performed by the existing `SavedObjectsClient`. Additionally, as we continue to enhance Kibana's authorization, we'd have to continually ensure the ML authorization logic is updated as well, or else we lose feature parity.
Option 2 - ML switches to using the SavedObjectsClient
This option puts us in a much better position from the technical-debt/maintenance perspective. Additionally, ML would automatically be able to take advantage of the new Kibana authorization features.
For ML to switch to use the `SavedObjectsClient` once #35747 is complete, the ML jobs will have to be stored in the same "schema" as existing saved objects within the `.kibana` index. This will likely be the most challenging aspect of this approach.
Additionally, we'll have to augment the existing ML reserved roles to grant additional privileges to read/write the new ML saved object types.
This would require changes to the SavedObjectsClient to allow ML jobs to be queried across all Spaces, which we don't have an issue to track yet.