[FEATURE] Add support for searchable snapshots within index management #808

Open
kotwanikunal opened this issue Jun 6, 2023 · 9 comments

Comments

@kotwanikunal
Member

Is your feature request related to a problem?

What solution would you like?

  • Add support for a policy that operates as follows:
    • Take a snapshot
    • Delete the index
    • Restore the index as a searchable snapshot index and make any changes needed to link this index to the alias
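The three steps above can be sketched as a sequence of REST requests. This is only an illustration of the requested flow, not how index management implements (or would implement) it. The function name, the repository name `my-repo`, and the choice to name the snapshot after the index are assumptions made for the example; `storage_type: remote_snapshot` is the restore setting that produces a searchable snapshot index.

```python
from typing import Optional

Request = tuple[str, str, Optional[dict]]


def build_conversion_requests(repository: str, index: str) -> list[Request]:
    """Return (method, path, body) tuples for snapshot -> delete -> restore.

    Hypothetical helper: it only describes the requests, it does not send them.
    """
    snapshot = index  # assumption: the snapshot is named after the index
    return [
        # 1. Take a snapshot of the index and wait until it has finished.
        (
            "PUT",
            f"/_snapshot/{repository}/{snapshot}?wait_for_completion=true",
            {"indices": index, "include_global_state": False},
        ),
        # 2. Delete the local index.
        ("DELETE", f"/{index}", None),
        # 3. Restore it as a searchable snapshot (remote) index.
        (
            "POST",
            f"/_snapshot/{repository}/{snapshot}/_restore",
            {"indices": index, "storage_type": "remote_snapshot"},
        ),
    ]


if __name__ == "__main__":
    for method, path, body in build_conversion_requests("my-repo", "logs-2023.06.01"):
        print(method, path, body)
```

A real policy would also have to confirm that the snapshot succeeded before issuing the delete, since step 2 removes the only local copy of the data.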

What alternatives have you considered?

  • Manually creating searchable snapshot indices (as per the current instructions)

Do you have any additional context?

@bugmakerrrrrr

bugmakerrrrrr commented Jun 7, 2023

@kotwanikunal maybe we can use the _aliases API to atomically replace the original index, which could work as follows:

  • Take a snapshot
  • Restore the snapshot as a searchable snapshot index
  • Replace the original index with the restored index using the _aliases API
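A minimal sketch of the last step, assuming an alias already points at the original index. A single `POST /_aliases` request can carry several actions that are applied together, so readers of the alias never see a moment where it points at nothing. The helper name and the index and alias names below are hypothetical.

```python
import json


def build_alias_swap(alias: str, old_index: str, new_index: str) -> dict:
    """Return a body for POST /_aliases that moves `alias` in one request.

    Hypothetical helper: it only builds the JSON body, it does not send it.
    """
    return {
        "actions": [
            # Detach the alias from the original (local) index ...
            {"remove": {"index": old_index, "alias": alias}},
            # ... and attach it to the restored searchable snapshot index.
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


if __name__ == "__main__":
    body = build_alias_swap("logs", "logs-2023.06.01", "remote-logs-2023.06.01")
    print(json.dumps(body, indent=2))
```

The original index can be deleted afterwards, once nothing reads from it directly.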

@lezzago removed the untriaged label Jun 15, 2023
@sandervandegeijn

Would love this as well. Opened an issue at the forums but missed this. Going to script it myself for the time being.

@beejaygee

beejaygee commented Feb 10, 2024

"Going to script it myself for the time being."

@sandervandegeijn Could you please provide your script or at least advise what logic you used to make this? Should I rotate the index daily, snapshot it, delete it, and restore it under the same name with remote_snapshot configured? Or would I need to use an alias?

I really am looking forward to this being built in.

@sandervandegeijn

sandervandegeijn commented Feb 11, 2024

https://github.com/sandervandegeijn/opensearch-searchable-snapshot-management

Sure, it's by no means abstracted enough to be a general library, but it serves our purposes. I should move some configs to command-line parameters. But it's good enough to get an idea.

It gathers all indices; everything older than 7 days is snapshotted (the snapshot gets the same name as the index), removed, and restored as a searchable snapshot index with the same name as the original. The data stays available as is, so everything remains searchable through the Dashboards visualisations and the same index patterns the users were used to.

After 185 days everything gets removed.
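The 7-day and 185-day thresholds described above amount to a small decision rule per index. The sketch below is not taken from the linked script; it assumes the creation time of each index is known and that the caller can tell whether an index is already a searchable snapshot index. All names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

SNAPSHOT_AFTER = timedelta(days=7)    # older than this: convert to searchable snapshot
DELETE_AFTER = timedelta(days=185)    # older than this: remove entirely


def plan_action(created_at: datetime, now: datetime, is_remote: bool) -> str:
    """Return "delete", "convert", or "keep" for one index."""
    age = now - created_at
    if age > DELETE_AFTER:
        return "delete"
    if age > SNAPSHOT_AFTER and not is_remote:
        # Snapshot, remove, and restore as a searchable snapshot index.
        return "convert"
    return "keep"


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    for days in (1, 30, 200):
        print(days, plan_action(now - timedelta(days=days), now, is_remote=False))
```

Indices that were already converted fall through to "keep" until they cross the 185-day mark, so running the rule repeatedly (for example from cron) does not convert the same index twice.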

I like the aliases idea btw; it gives the option to distinguish between local and remote indices by name while maintaining functionality. The script will be less complex as well. I might implement this later this week.

The distinction between the two in Dashboards / index management is nonexistent at this time.

@beejaygee

@sandervandegeijn Thanks for this. Looks good. One question: it looks like one should use this in addition to an ISM policy to perform backups. Can you back up a remote snapshot? Would a backup policy just back up all indices (including remote) and prefix their names with backup (to avoid backups being marked as remote snapshots and being picked up by your script)?

I assume you just run this as a cron job every hour or something? It probably needs to run more often than daily, since the snapshot needs time to complete before it can be restored as remote?

Regarding aliases, would there be any impact on performance? I thought I read a while back that using aliases affects performance.

@sandervandegeijn

I don't know; we would store that on the same storage, so there is no benefit for us in doing that. If you want to do it, it may be better to save an index to two repos before deleting it.

I'm going to restore the snapshots under a different index name, something like remote-* and create an alias. This thread gave me that idea.
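A rough sketch of that plan, under stated assumptions: the restore request renames the index with `rename_pattern` / `rename_replacement`, and a follow-up `_aliases` request deletes the original index and adds an alias carrying the original name in the same call (the `remove_index` action). The helper name and the `remote-` prefix default are illustrative, and this is not code from the linked script.

```python
def build_remote_restore(index: str, prefix: str = "remote-") -> tuple[dict, dict]:
    """Return (restore_body, alias_body) for restoring `index` as `<prefix><index>`.

    Hypothetical helper: it only builds the JSON bodies, it does not send them.
    """
    restored = f"{prefix}{index}"
    # Body for POST /_snapshot/<repository>/<snapshot>/_restore
    restore_body = {
        "indices": index,
        "storage_type": "remote_snapshot",
        "rename_pattern": "(.+)",             # capture the whole index name
        "rename_replacement": f"{prefix}$1",  # e.g. remote-<original name>
    }
    # Body for POST /_aliases: drop the local index and let an alias with the
    # original name point at the restored index, in one request.
    alias_body = {
        "actions": [
            {"remove_index": {"index": index}},
            {"add": {"index": restored, "alias": index}},
        ]
    }
    return restore_body, alias_body


if __name__ == "__main__":
    restore_body, alias_body = build_remote_restore("logs-2024.02.01")
    print(restore_body)
    print(alias_body)
```

Because the alias reuses the original index name, existing index patterns would keep matching, while `remote-*` makes the searchable snapshot indices easy to tell apart.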

We run it every hour, yes, but you can run it less frequently. The index will stay there until the next run, whenever that happens.

We use aliases extensively in our other clusters and have never had any issues.

@spapadop

spapadop commented Jul 9, 2024

We find this feature quite crucial for enabling extensive use of searchable snapshots, which are otherwise a great way to support "cold storage". It's a pity this one is not implemented yet. We all have to implement our own scripting to automate this process, which can be way more prone to errors. Is there an interest in moving this feature forward?

@ccben87

ccben87 commented Jul 9, 2024

I am very much interested in having this feature implemented.

@wntmddus

What should I do to contribute to this? I think I have implemented the solution.

Projects
Status: Todo
Development

No branches or pull requests

9 participants