
Change hashing algorithm in Saved Object Migrations to be FIPS compliant #169734

Closed · kc13greiner opened this issue Oct 25, 2023 · 19 comments

Labels: Feature:FIPS · Feature:Migrations · Team:Core · Team:Security

Comments

@kc13greiner (Contributor)

Describe the feature:

On Kibana startup, index mappings are currently compared using an md5 hash. This should be updated to use a compliant algorithm.

Describe a specific use case for the feature:

To be FIPS compliant, we need to remove all uses of insecure hashing algorithms, which includes md5.

Acceptable hash algorithms can be found here on pg. 18 in Table 8
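
For context, the check in question is a digest-based comparison along these lines (an illustrative sketch only; the function names and JSON serialization here are assumptions, not Kibana's actual code):

import { createHash } from 'crypto';

// Hash a type's mappings and compare the result against the hash stored
// in the index's _meta. md5 is the non-compliant algorithm in question.
const hashMappings = (mappings: unknown): string =>
  createHash('md5').update(JSON.stringify(mappings)).digest('hex');

const mappingsChanged = (storedHash: string, currentMappings: unknown): boolean =>
  hashMappings(currentMappings) !== storedHash;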

kc13greiner added the Feature:FIPS and Team:Core labels on Oct 25, 2023
@elasticmachine (Contributor)

Pinging @elastic/kibana-core (Team:Core)

@pgayvallet (Contributor)

pgayvallet commented Oct 25, 2023

This hashing is only used internally, as a serialization / comparison mechanism over internal and controlled data. Also note that swapping the algorithm would be a somewhat complicated challenge, given we would need a way to still compare the two hash formats during the transition.

Knowing that, @kc13greiner, can we know how critical this request is?

pgayvallet added the Team:Security label on Oct 25, 2023
@elasticmachine (Contributor)

Pinging @elastic/kibana-security (Team:Security)

@azasypkin (Member)

Knowing that, @kc13greiner, can we know how critical this request is?

If we run Node.js in FIPS mode, the md5 algorithm won't be available at all. This is not something we can control on a case-by-case basis, so feature-wise, it's critical.

If you're asking about the time-critical aspect, we're still figuring it out, but it's likely to be one of the top priorities since it might prevent Kibana from starting (assuming I understand the code correctly, md5Object will throw in FIPS mode).
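
For illustration, under a FIPS-enabled OpenSSL build of Node.js, even constructing an md5 hash fails outright (a minimal sketch; the exact error text varies by OpenSSL version):

import { createHash } from 'crypto';

try {
  createHash('md5').update('some mappings').digest('hex');
} catch (err) {
  // With FIPS enabled, OpenSSL disables md5, so this throws something like
  // "error:0308010C:digital envelope routines::unsupported".
  console.error('md5 is unavailable in FIPS mode:', err);
}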

@pgayvallet (Contributor)

pgayvallet commented Oct 25, 2023

So basically, we need to change the algorithm used to generate the SO mapping hashes that the v2 version of the migration algorithm uses for mapping comparisons (zdt doesn't store any hashes or use a hashing algorithm).

I don't think we can do any 2-stage conversion/migration, because this impacts the traditional offering, and we can't force customers to migrate to a given minor before upgrading higher.

Meaning that at some point we will have hashes stored in md5 "format", and a Kibana that isn't able to use that algorithm anymore.

So... there are two ways to approach this: the easy way, and the far less easy way.

  1. The easy way is just changing the algorithm and not worrying about the transition. The next upgrade will consider that each and every type has mapping differences, and update everything (I don't remember the full implications, but surely it's fine, right?)

  2. The hard way is to find a way to compare two hashes from different algorithms without having access to the first algorithm. I'm not sure there's really a way tbh, but maybe someone will be more imaginative than I am?

cc @rudolf @gsoldevila what do you think about this? Do you think just blindly changing the algo we're using for the hashes without doing anything else could be acceptable?

@rudolf (Contributor)

rudolf commented Oct 26, 2023

Yeah, we use the hashes stored in the _meta and compare them to the hashes calculated from the current mappings to decide if everything is up to date; if not, we update the mappings and run an update_by_query to "pick up" the mappings. This is potentially quite an expensive operation, so we optimize it by detecting which types of saved objects have new mappings and then only updating those types.
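
Roughly, that pickup step looks like the following (a sketch assuming the Elasticsearch JS client; the index name and changedTypes are placeholders, and the real implementation wraps this with retries and task polling):

import { Client } from '@elastic/elasticsearch';

declare const client: Client;
declare const changedTypes: string[]; // SO types whose mappings changed

// Re-index documents in place so the updated mappings get applied,
// restricted to the saved object types that actually changed.
await client.updateByQuery({
  index: '.kibana',
  conflicts: 'proceed',
  query: { terms: { type: changedTypes } },
  wait_for_completion: false,
});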

One way to preserve all of this in a FIPS compliant way would be to generate FIPS compliant hash snapshots of all previous versions. That way Kibana can find an md5sum in the _meta, look up what the equivalent sha256 is in the snapshots, and then generate a sha256 from the mappings we expect, to compare. These optimizations only matter if we're not going to have to reindex anyway, so we don't have to generate a snapshot of each and every possible version, just the "index compatible" versions: for .kibana_task_manager this means a snapshot for every migration introduced since 7.17, since they're all compatible, but for .kibana (and the other indices) it only goes back to the split (8.8.0).

Taking this idea one step further, we could consider whether this is a good time to switch from mappings comparisons to comparing model versions. Kibana would then use the md5sum in the _meta and look up the type's model version from the snapshots. If the model versions are different and contain mappings changes, we can run an update_by_query. Instead of storing the hashes in the _meta, we'd then store the model versions in the _meta for future migrations.
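
In code, the snapshot idea might look something like this (hypothetical names; the table would be generated once per "index compatible" release):

// Hypothetical snapshot: md5 hashes written by older compatible versions,
// mapped to the sha256 of the same mappings.
const HASH_SNAPSHOTS: Record<string, string> = {
  // '<md5 stored in _meta>': '<sha256 of the same mappings>',
};

// A stored md5 is up to date if its sha256 equivalent matches the sha256
// computed from the mappings we expect for the current version.
const isUpToDate = (storedMd5: string, expectedSha256: string): boolean =>
  HASH_SNAPSHOTS[storedMd5] === expectedSha256;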

@gsoldevila (Contributor)

At first glance, (1) seems like a valid solution.

But as Rudolf said, v2 will consider that all SO types have been updated. It will run a "compatible update" flow, and the overhead, as Pierre hints, is that v2 will think all mapping types have changed and will run an updateAndPickupMappings() operation, updating all SOs in all indices so that the mapping changes are picked up.

This might be too expensive, so we have to consider possible strategies to mitigate this and allow updating only the SO types that have actually changed.

@pgayvallet (Contributor)

pgayvallet commented Oct 26, 2023

One way to preserve all of this in a FIPS compliant way would be to generate FIPS compliant hash snapshots of all previous versions

That sounds incredibly tedious, painful and error-prone, to be totally honest... Even if I won't deny this is a possibility, I think I would first focus on my initial question: "do we think this would be strictly necessary / what would be the impact of NOT doing it?"

Because I feared it would cause a full reindex (to another index, "old-style", considering the mappings are not compatible), but given @gsoldevila's answer:

and the overhead, as Pierre hints, is that v2 will think all mapping types have changed and will run an updateAndPickupMappings() operation, updating all SOs in all indices so that the mapping changes are picked up.

I understand that it would "just" trigger a full updateAndPickupMappings, staying within the reindex-into-same-index operation, which sounds like a far more acceptable impact?

Taking this idea one step further, we could consider whether this is a good time to switch from mappings comparisons to comparing model versions. Kibana would then use the md5sum in the _meta and look up the type's model version from the snapshots. If the model versions are different and contain mappings changes, we can run an update_by_query. Instead of storing the hashes in the _meta, we'd then store the model versions in the _meta for future migrations.

You've lost me here. What about migration from versions prior to model versions being a thing?

@gsoldevila (Contributor)

gsoldevila commented Oct 26, 2023

I understand that it would "just" trigger a full updateAndPickupMappings, staying within the reindex-into-same-index operation, which sounds like a far more acceptable impact?

Not sure it's acceptable though. Since we no longer reindex, updateAndPickupMappings has become the most expensive step of the migrations, and if we have customers with thousands of saved objects, we kinda rely on them upgrading regularly (and having only a few types change between upgrades). Full update + pickup (all SO types) can still be painful.
CC @rudolf to confirm

You've lost me here. What about migration from versions prior to model versions being a thing?

I discussed that one with Rudolf, as I was lost too:

It would also involve having a hardcoded snapshot with all model versions for each compatible stack version, and comparing current model versions against that.

@gsoldevila (Contributor)

Circling back to Pierre's original comment: since this hash is not protecting any sensitive information, but is rather used mainly for comparison, couldn't we have an in-house "comparison" algorithm that happens to be backwards compatible with the existing hashes?

@gsoldevila (Contributor)

gsoldevila commented Oct 26, 2023

I think we can get away with only storing a snapshot of a single version: the one where we release the update.

If we store a mapping of the form:

// NB: it only contains the md5s of the current version's types
const MIGRATION_HASHES_TO_MODEL_VERSION = {
  md5_1: '10.0.1', // might correspond to type "T1"
  // ...
  md5_n: '10.0.0', // might correspond to type "Tn"
};

We can update the logic so that:

  • if model versions are present in the _meta => use the same logic as zdt
  • if model versions are NOT present in the _meta => we're upgrading from an "old" hashing version:
    • check whether the stored _meta.migrationMappingPropertyHashes md5s can be translated to model versions
      • if they are NOT present in MIGRATION_HASHES_TO_MODEL_VERSION => the type has changed
      • if they are present in MIGRATION_HASHES_TO_MODEL_VERSION => compare them with the current model versions (reuse the comparison from 'zdt'?)

WDYT?
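
A sketch of that decision logic (hypothetical, reusing the MIGRATION_HASHES_TO_MODEL_VERSION map from above; the _meta property names are illustrative, not Kibana's actual ones):

declare const MIGRATION_HASHES_TO_MODEL_VERSION: Record<string, string>;

interface IndexMeta {
  mappingVersions?: Record<string, string>; // modelVersion per type (zdt-style)
  migrationMappingPropertyHashes?: Record<string, string>; // legacy md5s
}

const typeNeedsPickup = (
  type: string,
  currentModelVersion: string,
  meta: IndexMeta
): boolean => {
  if (meta.mappingVersions) {
    // Model versions already stored in _meta: reuse the zdt-style comparison.
    return meta.mappingVersions[type] !== currentModelVersion;
  }
  const md5 = meta.migrationMappingPropertyHashes?.[type];
  const snapshotVersion = md5 ? MIGRATION_HASHES_TO_MODEL_VERSION[md5] : undefined;
  // Unknown hash => the type changed since the snapshot was taken; known
  // hash => compare the snapshot's model version with the current one.
  return snapshotVersion === undefined || snapshotVersion !== currentModelVersion;
};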

@rudolf (Contributor)

rudolf commented Oct 27, 2023

Full update + pickup (all SO types) can still be painful.

It's definitely only the largest clusters where this is a problem, maybe even fewer than 0.01%, but we have had SDHs where millions of saved objects took hours to complete the update_by_query step. So if we can come up with a solution that we're confident would be robust, that would be really good.

I think we can get away with only storing a snapshot of a single version: the one where we release the update.

Yes, that's a good point; we'd just need one md5 hash snapshot to know if a type is up to date or old.

@pgayvallet (Contributor)

I think we can get away with only storing a snapshot of a single version: the one where we release the update.

I'm probably missing something, but is that really true for traditional deployments, given a customer can jump versions between upgrades?

E.g.:

  • We release that change in version 8.12.0
  • Later a customer upgrades from version 8.7.0 to 8.14.2

Would that still work?

@gsoldevila (Contributor)

gsoldevila commented Nov 2, 2023

E.g.:

We release that change in version 8.12.0
Later a customer upgrades from version 8.7.0 to 8.14.2
Would that still work?

Yes, I believe that would still work. Given a typeX SO type:

  • If typeX changed in 8.8.0, our hardcoded map would contain typeX_md5_8.8.0's hash, so when upgrading, we wouldn't find the typeX_md5_8.7.0 hash in the MIGRATION_HASHES_TO_MODEL_VERSION. As a result, we can conclude that typeX SOs need to be picked up.
  • Conversely, if typeX changed in 8.13.0, it'd be using a newer model version than the one present in MIGRATION_HASHES_TO_MODEL_VERSION, so even though we'd find the typeX_md5_8.7.0 entry in the map, the model versions would not match. Same conclusion, typeX SOs need to be picked up.

@kc13greiner (Contributor, Author)

Are there any additional questions or concerns that need to be discussed before this work can be planned?

@pgayvallet (Contributor)

@kc13greiner There's likely still implementation details to be discussed by @elastic/kibana-core, but I don't think we have any more questions for your team.

Or, well, maybe one: do we have better visibility on when we'll need this done by?

@kc13greiner (Contributor, Author)

@pgayvallet We'd like this work complete by "early February" so Ops can begin their work. Please let me know if I can provide any additional information!

@pgayvallet (Contributor)

@gsoldevila do you think this could realistically be done by "early February"? It feels very close tbh.

@gsoldevila gsoldevila self-assigned this Feb 8, 2024
gsoldevila added a commit that referenced this issue Mar 25, 2024
…6803)

## Summary

Addresses #169734.

We're currently storing information about _Saved Object_ types in the
`<index>.mapping._meta`.
More specifically, we're storing hashes of the mappings of each type,
under the `migrationMappingPropertyHashes` property.
This allows us to detect which types' mappings have changed, in order to
_update and pick up_ changes only for those types.

**The problem is that `md5` cannot be used if we want to be FIPS
compliant.**

Thus, the idea is to stop using hashes to track changes in the SO
mappings.
Instead, we're going to use the same `modelVersions` introduced by ZDT:

Whenever mappings change for a given SO type, the corresponding
`modelVersion` will change too, so `modelVersions` can be used to
determine if mappings have changed.
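
Concretely, a hypothetical before/after of the index _meta (the property names are illustrative and may differ from the actual implementation):

// Before: per-type md5 hashes of the mappings.
const metaBefore = {
  migrationMappingPropertyHashes: {
    dashboard: '<md5 of the dashboard mappings>',
  },
};

// After: per-type model versions, which change whenever mappings change.
const metaAfter = {
  mappingVersions: {
    dashboard: '10.2.0',
  },
};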

---------

Co-authored-by: kibanamachine <[email protected]>
@legrego (Member)

legrego commented Mar 28, 2024

Resolved via #176803

@legrego legrego closed this as completed Mar 28, 2024