Distribute replica forests evenly #330
Thanks @dmcassel - are you able to use this feature to show what the forest plan is for your config? See https://github.com/marklogic-community/ml-gradle/wiki/Creating-forests#previewing-forest-creation . I haven't tried to reproduce this yet, but I don't recall adding support for this.
Looks like the preview feature doesn't look at the forest directory. My staging database has forests laid out under ml-config/forests/(staging-db-name)/staging-forests.json, with a total of 54 forests over six hosts. (I'm hoping to make this property-driven at some point, but that's what we have right now.) When I run the preview command, it doesn't see that forest config:
Note that I'm exploring this with a 3-node Docker cluster, rather than the six nodes we have in prod.
I just added
What I'd like to see is my-staging-1's replica on host2 and my-staging-2's replica on host3. I haven't thought through the generalized algorithm yet, but I think you see what I'm going for, right?
@rjrudin FYI, I'm working on a PR for this.
I added this commit after merging in the PR: BuildForestTest is passing, and I used ConfigureReplicaForestsDebug on a local 3-host cluster to try out both strategies, and all appears well. I may do a 3.12 beta release of ml-gradle so you can try this out ASAP.
Added docs at https://github.com/marklogic-community/ml-gradle/wiki/Creating-forests#replica-forest-creation . This is in case we run into any issues with the Distributed implementation in the 3.12.0 release. |
Given some number of forests per host and a number of replicas (in my case, 1), ml-gradle currently puts each forest's replica on the next host, so all of host1's forests are replicated on host2, all of host2's forests are replicated on host3, etc. Erin Miller's Hardware Reference Architecture: Direct Attached Storage recommends against this: "Assuming 6 primary and 6 replica forests per host, it's important to distribute forests equally across hosts. Specifically, you don't want to replicate all forests from host 1 to host 2. If host 1 then goes down, host 2 will be supporting 12 primary forests, since the six replicas will have changed roles to primary."
Modify the replica host assignment such that replicas for a host's forests are evenly distributed around the cluster.
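To make the request concrete, here is a minimal sketch (not ml-gradle's actual implementation; all class and forest names are hypothetical) of one way to rotate replica assignments: replica r of the f-th forest on host h goes to host (h + f + r + 1) mod hostCount, so a host's forests are spread across the other hosts instead of all landing on the next host.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of round-robin replica distribution; assumes at
// least two hosts and fewer replicas per forest than hosts.
public class ReplicaDistributionSketch {

    public static Map<String, List<String>> assignReplicas(
            List<String> hosts, int forestsPerHost, int replicaCount) {
        Map<String, List<String>> replicaHostsByForest = new LinkedHashMap<>();
        int hostCount = hosts.size();
        for (int h = 0; h < hostCount; h++) {
            for (int f = 0; f < forestsPerHost; f++) {
                String forestName = "forest-" + hosts.get(h) + "-" + (f + 1);
                List<String> replicaHosts = new ArrayList<>();
                for (int r = 0; r < replicaCount; r++) {
                    // Rotate: offset by f + r + 1 so consecutive forests on
                    // the same host replicate to different hosts.
                    int target = (h + f + r + 1) % hostCount;
                    if (target == h) {
                        // Never place a replica on its own primary host.
                        target = (target + 1) % hostCount;
                    }
                    replicaHosts.add(hosts.get(target));
                }
                replicaHostsByForest.put(forestName, replicaHosts);
            }
        }
        return replicaHostsByForest;
    }

    public static void main(String[] args) {
        // With 3 hosts, 2 forests per host, 1 replica: host1's first forest
        // replicates to host2 and its second to host3, matching the layout
        // described above.
        assignReplicas(List.of("host1", "host2", "host3"), 2, 1).forEach(
            (forest, replicas) -> System.out.println(forest + " -> " + replicas));
    }
}
```

With this rotation, if one host goes down, its promoted replicas are spread across the surviving hosts rather than doubling the load on a single neighbor.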