[Doc] Best practice to setup Redis for GCS FT #2582
Comments
Options for Redis persistence: https://redis.io/docs/latest/operate/oss_and_stack/management/persistence/
I put together this gist to persist Redis using GCSFuse: https://gist.github.com/andrewsykim/55088178684b5a692854f932c8120914
@rueian Kai-Hsun mentioned you're a Redis expert. Do you have opinions on whether we should use RDB or AOF for persistence?
Hi @andrewsykim, generally AOF is better because it persists changes more frequently as an append-only log, while RDB takes a periodic snapshot of the whole Redis memory. With GCSFuse, however, I believe RDB is the better choice because GCS doesn't support append operations. If we use AOF with GCSFuse, the whole AOF file has to be re-uploaded every time a new entry is appended, and as the file keeps growing this will slow Redis down.
Is this a problem specific to GCSFuse? Wouldn't this be a problem for any file-system based approach?
It is not specific to GCSFuse. Most cloud object storage services don't support append operations, with the exceptions of Azure Blob Storage and AWS S3 Express One Zone. A growing AOF is not a problem on file systems that support append operations, because they don't need to rewrite the whole file when appending a new entry.
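To make the cost difference concrete, here is a rough Python sketch. It assumes the worst case described above, where every appended entry triggers a re-upload of the whole file on storage without append support. The function name and the entry sizes are made up for illustration and are not measurements from Redis or GCSFuse.

```python
def bytes_uploaded(entry_sizes, supports_append):
    """Total bytes sent to storage after appending each entry in turn."""
    total_uploaded = 0
    file_size = 0
    for size in entry_sizes:
        file_size += size
        # With append support only the new entry is written. Without it,
        # the whole file is uploaded again (assumed worst case).
        total_uploaded += size if supports_append else file_size
    return total_uploaded


# 1,000 hypothetical AOF entries of 100 bytes each
entries = [100] * 1000
print(bytes_uploaded(entries, supports_append=True))   # 100000
print(bytes_uploaded(entries, supports_append=False))  # 50050000
```

Under that assumption the upload volume grows quadratically with the number of entries, which is why a periodic RDB snapshot is the safer fit for object storage.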
Okay, makes sense. So we should either use block storage for AOF, or use RDB persistence if using GCSFuse.
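That rule of thumb could be sketched as a small Python helper that renders the matching redis.conf directives. The helper name, the storage labels, and the `/data` directory are hypothetical. The directives themselves (`dir`, `appendonly`, `appendfsync`, `save`) are standard Redis settings, but the values are only illustrative and should be tuned per workload.

```python
def redis_persistence_config(storage, data_dir="/data"):
    """Render redis.conf lines for the given storage type.

    storage: "object" for an object-store mount such as GCSFuse,
             "block" for a block device such as a persistent disk.
    """
    if storage == "object":
        # RDB only: snapshot if at least 1 key changed in 60 seconds.
        lines = ["appendonly no", "save 60 1"]
    elif storage == "block":
        # AOF: appends are cheap on block storage; fsync once per second.
        lines = ["appendonly yes", "appendfsync everysec"]
    else:
        raise ValueError(f"unknown storage type: {storage!r}")
    return "\n".join([f"dir {data_dir}"] + lines)


print(redis_persistence_config("object"))
print(redis_persistence_config("block"))
```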
I'm interested in getting this doc together.
Without clustering Redis (all of the current guides have 1 replica), we're talking about using Redis as a save-to-disk engine. I'd like to include some clustering and failover in our best-practice recommendations to get the most out of Redis. For reentrant workloads that can handle rolling back a few minutes, I wonder if it would be simpler and cheaper to write the GCS state to disk occasionally.
Exactly.
Ray only supports standalone Redis (single master). It doesn't support Redis Cluster (multiple sharded masters) or Redis Sentinel, the two deployment modes that have failover built in. So, for now, users need to implement Redis HA themselves. As far as I know, https://github.com/dragonflydb/dragonfly-operator is the only open-source solution that has automatic failover.
Do you mean you want to skip Redis?
Well, I was curious whether it's the best design choice if we're not clustering. But I understand that's not the main point, thanks.
Apologies for the delay, I just put up what I've got to iterate on before the holidays. I'll be back in January. After we get a sample config and best practice here, I figure we'll want something in ray/cluster/kubernetes/user-guides as well, correct?
Description
Create a user guide for configuring a RayCluster with GCS Fault Tolerance using Redis on AWS or GCP. The guide should include a persistence option for Redis to ensure that Redis state can be recovered after a restart.