Encrypted saved objects encryption key gets generated by default #56448
Comments
Pinging @elastic/kibana-alerting-services (Team:Alerting Services) |
Pinging @elastic/kibana-security (Team:Security) |
Encryption keys being invalid can occur for a number of reasons:
I do agree that we should improve the UX when there is an automatically generated encryption key and do a better job of warning the user. However, I don't think we should rely on this warning as the sole solution to invalid encryption keys, as there are other situations where this can occur. Also, I'm not sure I follow the part about "data loss". Granted, alerts will not be able to run during this time period, and it will require user intervention to "re-enable" them and provide a new API key. Is this what you're referring to? For what it's worth, I provided the following commit to @XavierM, which exposes whether or not the encryption key was randomly generated: kobelb@b77c5c9 |
After giving it some more thought and going through @kobelb's feedback, I'm thinking we need to solve a few different scenarios while trying to warn the user as early as possible of potential consequences. Not all of this can or should be done for 7.6, but creating GitHub issues for the agreed-upon solutions will be a start.
Scenario 1: Encryption key is generated and user does a CRUD on an alert or action
This is where we should do as much as we can in UX to avoid the user getting into scenario 2, 3 or 4. Some options, each of which requires some changes to the ESO plugin:
1. Show a warning message in the UI
If they get past it, they can fall into scenario 2, 3 or 4, but at least they have been warned within alerting.
2. Prevent the user from doing a CRUD at the alert API level with some UI / UX
This prevents users getting into scenario 2, 3 or 4 from a generated encryption key. They would either have to change it themselves or not synchronize the keys between their deployments to get to scenario 2, 3 or 4. The UX for this would be to also show a warning from option 1 but prevent the user from continuing.
3. Prevent CRUD on ESO when no encryption key is provided
This has the same notes as option 2 but is implemented at a lower level (ESO plugin instead of alerting plugin).
Scenario 2: Encryption key has changed and user does a CRUD on an alert or action
The problem we currently have at this layer is that some APIs (update, update API key, delete) load the decrypted alert before doing anything else. As soon as the decryption fails, the entire API fails. Some options to solve this:
1. Isolate background activity
The only reason we're loading the decrypted saved object is because we need to invalidate the API key. We could push this within try/catches to prevent the request from failing due to this background cleanup activity. In some scenarios we still need to load the alert saved object before continuing. In this scenario, we would have two loads on the saved object.
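To make the "isolate background activity" idea concrete, here is a minimal sketch. Every name in it (`AlertStore`, `loadDecrypted`, `invalidateApiKey`, `remove`, `deleteAlert`) is a hypothetical stand-in, not the actual alerting or ESO plugin API, and the calls are synchronous only to keep the example short. It assumes decryption throws when the encryption key has changed.

```typescript
// Hypothetical shapes for illustration; not the real Kibana alerting or ESO APIs.
interface DecryptedAlert {
  id: string;
  apiKeyId: string;
}

interface AlertStore {
  // Assumed to throw when the encryption key no longer matches the stored data.
  loadDecrypted(id: string): DecryptedAlert;
  invalidateApiKey(apiKeyId: string): void;
  remove(id: string): void;
}

interface DeleteResult {
  deleted: boolean;
  apiKeyInvalidated: boolean;
  warning: string | undefined;
}

// Delete an alert. Invalidating the old API key is treated as best-effort
// cleanup, so a decryption failure is recorded but does not fail the request.
function deleteAlert(store: AlertStore, id: string): DeleteResult {
  let apiKeyInvalidated = false;
  let warning: string | undefined;
  try {
    const alert = store.loadDecrypted(id);
    store.invalidateApiKey(alert.apiKeyId);
    apiKeyInvalidated = true;
  } catch (err) {
    const reason = err instanceof Error ? err.message : String(err);
    warning = `API key for alert ${id} was not invalidated: ${reason}`;
  }
  // The main operation runs whether or not the cleanup succeeded.
  store.remove(id);
  return { deleted: true, apiKeyInvalidated, warning };
}
```

The trade-off in this sketch is that an API key may be left valid when decryption fails, which is why the result carries a warning rather than silently succeeding.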
Scenario 3: Encryption key has changed and alert is running
Alerts will not be able to run until users manually fix them by calling an alert API to generate a new API key. Some options to solve this:
1. UX and make sure retry logic is solid
The only option I can think of to solve this is to make sure the alerts recover after a user fixes the objects with broken encryption. The UX for this would be within the management screen, having a list of alerts with a status column showing "Error" for alerts that fail to run. The users would then be able to run the alert immediately after update or we could do it for them.
Scenario 4: Encryption key has changed and action is running
Actions are encrypted saved objects with
Some options to solve this:
1. UX and make sure retry logic is solid
Since we can capture the failure of decrypting the action saved object, we could enforce re-attempts so it works when the user fixes the
I'm still not sure how we solve the |
I think data loss is meant generically here @kobelb. For example, if someone was using ESO to encrypt something such as PII data and they ended up with a randomly generated key, everything would work up to the point where they restart, and then they would have to figure out how to re-enter all the data again. We don't use it for that at the moment, just for encrypting the API keys. But I can imagine that since most SIEM data eventually contains comments and information that could be of a PII or sensitive nature, we might end up with a requirement later to encrypt more fields of timeline data, case data, etc. We don't at the moment, though. |
Ya, re: data loss, my read is the same as Frank's. I think the only example we have today (beyond API keys) is action |
Gotcha, thanks for the clarification regarding data-loss @FrankHassanabad and @pmuellr, I forgot that we also had secrets for third-party services in the actions. |
We can try to warn & discourage in the UI, the problem is we'll have a lot of alerting & actions consumers (i.e. many UIs to display warnings in), and UI is not the only way to create alerts. We will have API users too for SREs and devops use cases and likely others. The warnings need to get to all of these places. So I think we:
looks like beats management uses the approach of a default key:
I favor a default encryption key. It is not ideal, but neither is the frustration of being set up to fail because of a missing config. Failing before you create an alert is only slightly less frustrating than failing after a restart of Kibana. For users experimenting in a non-prod environment, a default key allows quick setup that will work across restarts. We'd need to add warnings to our API responses that can be displayed in the UI, as well as to our logs every time the default key is used, and make it obvious and annoying enough that it will be hard to ignore. |
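A rough sketch of what the default-key approach with loud warnings could look like follows. The constant values and the names `resolveEncryptionKey`, `respond`, and `ApiResponse` are assumptions made up for illustration; only the `xpack.encrypted_saved_objects.encryptionKey` setting name comes from this issue.

```typescript
// Hypothetical placeholder value; a real default key would be chosen by the maintainers.
const DEFAULT_ENCRYPTION_KEY = "default-key-for-non-production-use-only";
const DEFAULT_KEY_WARNING =
  "Encrypted saved objects are using the default encryption key. " +
  "Set xpack.encrypted_saved_objects.encryptionKey before relying on this data.";

interface ResolvedKey {
  key: string;
  isDefault: boolean;
}

type Logger = (message: string) => void;

// Fall back to the static default so data survives restarts, but log the fallback.
function resolveEncryptionKey(configured: string | undefined, log: Logger): ResolvedKey {
  if (configured !== undefined) {
    return { key: configured, isDefault: false };
  }
  log(DEFAULT_KEY_WARNING);
  return { key: DEFAULT_ENCRYPTION_KEY, isDefault: true };
}

interface ApiResponse<T> {
  body: T;
  warnings: string[];
}

// Attach the warning to every API response (and log it again) while the default key is in use,
// so both UI consumers and direct API users see it.
function respond<T>(body: T, resolved: ResolvedKey, log: Logger): ApiResponse<T> {
  if (!resolved.isDefault) {
    return { body, warnings: [] };
  }
  log(DEFAULT_KEY_WARNING);
  return { body, warnings: [DEFAULT_KEY_WARNING] };
}
```

Carrying the warning in the response body is one way to reach every alerting and actions consumer without each UI having to detect the condition itself.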
I think we have 2 other problems too:
|
There was an issue opened for that just today: #55380 |
I'll open this as a separate issue, we won't be able to address it here. Going back to the original encrypted saved objects RFC, key rotation was briefly discussed and could be used as a starting point: #33740 (comment)
|
I'm going to close this issue. With the variable Thanks for everyone's input 🙏 |
Definitions
ESO = Encrypted Saved Objects
Problem
Alerting is built on top of ESO, and SIEM uses alerts for its detection engine. We have a blocking issue for 7.6 where the detection engine stops working after Kibana restarts, because the encryption key is reset.
This discuss issue will focus on the problem that alerts lose data if administrators don't set up their installation properly. Generating an encryption key automatically also means losing your data on restart, which users of alerting need to be aware of.
Generating encryption keys doesn't communicate to administrators that the following won't work:
While also not having any warnings in the following scenarios:
xpack.encrypted_saved_objects.encryptionKey is not set
dev mode due to a static encryptionKey being used
Options
We're exploring options to prevent users from creating alerts in such scenarios to avoid losing their data as well as exploring a way to provide SIEM the tools they need to prevent users from setting up the detection engine from the UI. Some of the options we're exploring so far are:
Disable the alert APIs whenever ESO is running with a generated encryption key. Expose a property or function that returns a boolean indicating whether the API is disabled in this scenario (to be used by SIEM). There is currently no way to find this out; supporting it would require some code changes in the ESO plugin.
Prevent CRUD on any ESO whenever a generated encryption key is being used (in other words, removing generated encryption keys). This option would automatically work for our alerting and actions plugin and seems to be a better approach by preventing the user from creating data that will be lost on restart.
Some other fresh idea, maybe I'll have something better tomorrow 🙂
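The first two options could be sketched roughly as follows. All names here (`EsoGuard`, `usingGeneratedKey`, `assertCanWrite`, `createEsoGuard`, `createAlert`) are hypothetical and are not the ESO plugin's actual API; the sketch assumes that a missing or empty configured key means a key was generated at startup.

```typescript
// Hypothetical names throughout; this is not the ESO plugin's actual API.
interface EsoGuard {
  // True when no key was configured, so one was generated at startup.
  usingGeneratedKey: boolean;
  // Throws instead of allowing a write that a restart would make unreadable.
  assertCanWrite(type: string): void;
}

function createEsoGuard(configuredKey: string | undefined): EsoGuard {
  const usingGeneratedKey = configuredKey === undefined || configuredKey.length === 0;
  return {
    usingGeneratedKey,
    assertCanWrite(type: string): void {
      if (usingGeneratedKey) {
        throw new Error(
          `Cannot write "${type}": set xpack.encrypted_saved_objects.encryptionKey first.`
        );
      }
    },
  };
}

// Second option: the check lives at the ESO level, so alerting and actions inherit it.
function createAlert(guard: EsoGuard, name: string): { name: string } {
  guard.assertCanWrite("alert");
  return { name };
}

// First option: a consumer such as SIEM reads the boolean to disable its setup UI.
function canSetUpDetectionEngine(guard: EsoGuard): boolean {
  return !guard.usingGeneratedKey;
}
```

With this shape, a consumer can check `canSetUpDetectionEngine(guard)` before rendering setup controls, while `assertCanWrite` remains the backstop for API users who bypass the UI.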
cc @peterschretlen