Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Specify individual backupRepo availability in SolrCloud Status #358

Merged
6 changes: 6 additions & 0 deletions api/v1beta1/solrbackup_types.go
Original file line number Diff line number Diff line change
Expand Up @@ -36,10 +36,16 @@ const (
// SolrBackupSpec defines the desired state of SolrBackup
type SolrBackupSpec struct {
// A reference to the SolrCloud to create a backup for
//
// +kubebuilder:validation:Pattern:=[a-z0-9]([-a-z0-9]*[a-z0-9])?
// +kubebuilder:validation:MinLength:=1
// +kubebuilder:validation:MinLength:=63
SolrCloud string `json:"solrCloud"`

// The name of the repository to use for the backup. Defaults to "legacy_local_repository" if not specified (the
// auto-configured repository for legacy singleton volumes).
//
// +kubebuilder:validation:Pattern:=[a-zA-Z0-9]([-_a-zA-Z0-9]*[a-zA-Z0-9])?
// +optional
RepositoryName string `json:"repositoryName,omitempty"`

Expand Down
12 changes: 9 additions & 3 deletions api/v1beta1/solrcloud_types.go
Original file line number Diff line number Diff line change
Expand Up @@ -385,6 +385,8 @@ type SolrBackupRestoreOptions struct {
type SolrBackupRepository struct {
// A name used to identify this local storage profile. Values should follow RFC-1123. (See here for more details:
// https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#dns-label-names)
//
// +kubebuilder:validation:Pattern:=[a-zA-Z0-9]([-_a-zA-Z0-9]*[a-zA-Z0-9])?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[-1/?] This regex allows capitalized letters, but the validation on the 'solrCloud' string in solrbackup_types.go looks like it requires all lowercase.

Was that an oversight, or is there some rationale for that that I'm missing?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this is just a name for the backup repository, it is pretty independent from the name of the SolrCloud (which has limitations given by Kubernetes, which I linked below).
I used the same regex, but changed it to allow capital letters and underscores, since we don't need to be as strict with the names of the repositories (as far as I know).

Is there any limitation in Solr for the name of backup repositories? If so we should definitely change this to match.

Name string `json:"name"`

// A GCSRepository for Solr to use when backing up and restoring collections.
Expand Down Expand Up @@ -987,16 +989,16 @@ type SolrCloudStatus struct {
// SolrNodes contain the statuses of each solr node running in this solr cloud.
SolrNodes []SolrNodeStatus `json:"solrNodes"`

// Replicas is the number of number of desired replicas in the cluster
// Replicas is the number of desired replicas in the cluster
Replicas int32 `json:"replicas"`

// PodSelector for SolrCloud pods, required by the HPA
PodSelector string `json:"podSelector"`

// ReadyReplicas is the number of number of ready replicas in the cluster
// ReadyReplicas is the number of ready replicas in the cluster
ReadyReplicas int32 `json:"readyReplicas"`

// UpToDateNodes is the number of number of Solr Node pods that are running the latest pod spec
// UpToDateNodes is the number of Solr Node pods that are running the latest pod spec
UpToDateNodes int32 `json:"upToDateNodes"`

// The version of solr that the cloud is running
Expand All @@ -1021,6 +1023,10 @@ type SolrCloudStatus struct {
// BackupRestoreReady announces whether the solrCloud has the backupRestorePVC mounted to all pods
// and therefore is ready for backups and restores.
BackupRestoreReady bool `json:"backupRestoreReady"`

// BackupRepositoriesAvailable lists the backupRepositories specified in the SolrCloud and whether they are available across all Pods.
// +optional
BackupRepositoriesAvailable map[string]bool `json:"backupRepositoriesAvailable,omitempty"`
}

// SolrNodeStatus is the status of a solrNode in the cloud, with readiness status
Expand Down
7 changes: 7 additions & 0 deletions api/v1beta1/zz_generated.deepcopy.go

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

3 changes: 3 additions & 0 deletions config/crd/bases/solr.apache.org_solrbackups.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -1053,9 +1053,12 @@ spec:
type: object
repositoryName:
description: The name of the repository to use for the backup. Defaults to "legacy_local_repository" if not specified (the auto-configured repository for legacy singleton volumes).
pattern: '[a-zA-Z0-9]([-_a-zA-Z0-9]*[a-zA-Z0-9])?'
type: string
solrCloud:
description: A reference to the SolrCloud to create a backup for
minLength: 63
HoustonPutman marked this conversation as resolved.
Show resolved Hide resolved
pattern: '[a-z0-9]([-a-z0-9]*[a-z0-9])?'
type: string
required:
- solrCloud
Expand Down
12 changes: 9 additions & 3 deletions config/crd/bases/solr.apache.org_solrclouds.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -1021,6 +1021,7 @@ spec:
type: object
name:
description: 'A name used to identify this local storage profile. Values should follow RFC-1123. (See here for more details: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#dns-label-names)'
pattern: '[a-zA-Z0-9]([-_a-zA-Z0-9]*[a-zA-Z0-9])?'
type: string
s3:
description: An S3Repository for Solr to use when backing up and restoring collections.
Expand Down Expand Up @@ -6796,6 +6797,11 @@ spec:
status:
description: SolrCloudStatus defines the observed state of SolrCloud
properties:
backupRepositoriesAvailable:
additionalProperties:
type: boolean
description: BackupRepositoriesAvailable lists the backupRepositories specified in the SolrCloud and whether they are available across all Pods.
type: object
backupRestoreReady:
description: BackupRestoreReady announces whether the solrCloud has the backupRestorePVC mounted to all pods and therefore is ready for backups and restores.
type: boolean
Expand All @@ -6809,11 +6815,11 @@ spec:
description: PodSelector for SolrCloud pods, required by the HPA
type: string
readyReplicas:
description: ReadyReplicas is the number of number of ready replicas in the cluster
description: ReadyReplicas is the number of ready replicas in the cluster
format: int32
type: integer
replicas:
description: Replicas is the number of number of desired replicas in the cluster
description: Replicas is the number of desired replicas in the cluster
format: int32
type: integer
solrNodes:
Expand Down Expand Up @@ -6855,7 +6861,7 @@ spec:
description: The version of solr that the cloud is meant to be running. Will only be provided when the cloud is migrating between versions
type: string
upToDateNodes:
description: UpToDateNodes is the number of number of Solr Node pods that are running the latest pod spec
description: UpToDateNodes is the number of Solr Node pods that are running the latest pod spec
format: int32
type: integer
version:
Expand Down
75 changes: 65 additions & 10 deletions controllers/solrbackup_controller.go
Original file line number Diff line number Diff line change
Expand Up @@ -20,7 +20,12 @@ package controllers
import (
"context"
"fmt"
"k8s.io/apimachinery/pkg/fields"
"reflect"
"sigs.k8s.io/controller-runtime/pkg/builder"
"sigs.k8s.io/controller-runtime/pkg/handler"
"sigs.k8s.io/controller-runtime/pkg/predicate"
"sigs.k8s.io/controller-runtime/pkg/source"
"time"

"github.com/apache/solr-operator/controllers/util"
Expand Down Expand Up @@ -194,12 +199,10 @@ func (r *SolrBackupReconciler) reconcileSolrCloudBackup(ctx context.Context, bac
return solrCloud, collectionBackupsFinished, actionTaken, err
}

// Make sure that all solr nodes are active and have the backupRestore shared volume mounted
// TODO: we do not need all replicas to be healthy. We should just check that leaders exist for all shards. (or just let Solr do that)
cloudReady := solrCloud.Status.BackupRestoreReady && (solrCloud.Status.Replicas == solrCloud.Status.ReadyReplicas)
if !cloudReady {
logger.Info("Cloud not ready for backup backup", "solrCloud", solrCloud.Name)
return solrCloud, collectionBackupsFinished, actionTaken, errors.NewServiceUnavailable("Cloud is not ready for backups or restores")
// Make sure that all solr living Solr pods have the backupRepo configured
if !solrCloud.Status.BackupRepositoriesAvailable[backupRepository.Name] {
logger.Info("Cloud not ready for backup", "solrCloud", solrCloud.Name, "repository", backupRepository.Name)
return solrCloud, collectionBackupsFinished, actionTaken, errors.NewServiceUnavailable(fmt.Sprintf("Cloud is not ready for backups in the %s repository", backupRepository.Name))
}

// Only set the solr version at the start of the backup. This shouldn't change throughout the backup.
Expand Down Expand Up @@ -338,11 +341,63 @@ func (r *SolrBackupReconciler) persistSolrCloudBackups(ctx context.Context, back
}

// SetupWithManager sets up the controller with the Manager.
func (r *SolrBackupReconciler) SetupWithManager(mgr ctrl.Manager) error {
func (r *SolrBackupReconciler) SetupWithManager(mgr ctrl.Manager) (err error) {
r.config = mgr.GetConfig()

return ctrl.NewControllerManagedBy(mgr).
ctrlBuilder := ctrl.NewControllerManagedBy(mgr).
For(&solrv1beta1.SolrBackup{}).
Owns(&batchv1.Job{}).
Complete(r)
Owns(&batchv1.Job{})

ctrlBuilder, err = r.indexAndWatchForSolrClouds(mgr, ctrlBuilder)
if err != nil {
return err
}

return ctrlBuilder.Complete(r)
}

func (r *SolrBackupReconciler) indexAndWatchForSolrClouds(mgr ctrl.Manager, ctrlBuilder *builder.Builder) (*builder.Builder, error) {
solrCloudField := ".spec.solrCloud"

if err := mgr.GetFieldIndexer().IndexField(context.Background(), &solrv1beta1.SolrBackup{}, solrCloudField, func(rawObj client.Object) []string {
// grab the SolrBackup object, extract the used SolrCloud...
return []string{rawObj.(*solrv1beta1.SolrBackup).Spec.SolrCloud}
}); err != nil {
return ctrlBuilder, err
}

return ctrlBuilder.Watches(
&source.Kind{Type: &solrv1beta1.SolrCloud{}},
handler.EnqueueRequestsFromMapFunc(func(obj client.Object) []reconcile.Request {
solrCloud := obj.(*solrv1beta1.SolrCloud)
foundBackups := &solrv1beta1.SolrBackupList{}
listOps := &client.ListOptions{
FieldSelector: fields.OneTermEqualSelector(solrCloudField, obj.GetName()),
Namespace: obj.GetNamespace(),
}
err := r.List(context.Background(), foundBackups, listOps)
if err != nil {
// if no exporters found, just no-op this
return []reconcile.Request{}
}

requests := make([]reconcile.Request, 0)
for _, item := range foundBackups.Items {
// Only queue the request if the Cloud is ready.
cloudIsReady := solrCloud.Status.BackupRestoreReady
if item.Spec.RepositoryName != "" {
cloudIsReady = solrCloud.Status.BackupRepositoriesAvailable[item.Spec.RepositoryName]
}
if cloudIsReady {
requests = append(requests, reconcile.Request{
NamespacedName: types.NamespacedName{
Name: item.GetName(),
Namespace: item.GetNamespace(),
},
})
}
}
return requests
}),
builder.WithPredicates(predicate.GenerationChangedPredicate{})), nil
}
40 changes: 18 additions & 22 deletions controllers/solrcloud_controller.go
Original file line number Diff line number Diff line change
Expand Up @@ -508,9 +508,12 @@ func (r *SolrCloudReconciler) reconcileCloudStatus(ctx context.Context, solrClou
return outOfDatePods, outOfDatePodsNotStarted, availableUpdatedPodCount, err
}
newStatus.PodSelector = selector.String()
allPodsBackupReady := true
for idx, p := range foundPods.Items {
nodeNames[idx] = p.Name
backupReposAvailable := make(map[string]bool, len(solrCloud.Spec.BackupRepositories))
for _, repo := range solrCloud.Spec.BackupRepositories {
backupReposAvailable[repo.Name] = false
}
for podIdx, p := range foundPods.Items {
nodeNames[podIdx] = p.Name
nodeStatus := solrv1beta1.SolrNodeStatus{}
nodeStatus.Name = p.Name
nodeStatus.NodeName = p.Spec.NodeName
Expand All @@ -537,9 +540,10 @@ func (r *SolrCloudReconciler) reconcileCloudStatus(ctx context.Context, solrClou
newStatus.ReadyReplicas += 1
}

// Skip "backup-readiness" check for pod if we've already found a pod that's not ready
if allPodsBackupReady {
allPodsBackupReady = allPodsBackupReady && isPodReadyForBackup(&p, solrCloud)
// Merge BackupRepository availability for this pod
backupReposAvailableForPod := util.GetAvailableBackupRepos(&p)
for repo, availableSoFar := range backupReposAvailable {
backupReposAvailable[repo] = (availableSoFar || podIdx == 0) && backupReposAvailableForPod[repo]
}

// A pod is out of date if it's revision label is not equal to the statefulSetStatus' updateRevision.
Expand Down Expand Up @@ -578,11 +582,15 @@ func (r *SolrCloudReconciler) reconcileCloudStatus(ctx context.Context, solrClou
for idx, nodeName := range nodeNames {
newStatus.SolrNodes[idx] = nodeStatusMap[nodeName]
}
if allPodsBackupReady && len(foundPods.Items) > 0 {
newStatus.BackupRestoreReady = true
} else {
newStatus.BackupRestoreReady = false
newStatus.BackupRepositoriesAvailable = backupReposAvailable
allPodsBackupReady := len(backupReposAvailable) > 0
for _, backupRepo := range solrCloud.Spec.BackupRepositories {
allPodsBackupReady = allPodsBackupReady && backupReposAvailable[backupRepo.Name]
if !allPodsBackupReady {
break
}
}
newStatus.BackupRestoreReady = allPodsBackupReady

// If there are multiple versions of solr running, use the first otherVersion as the current running solr version of the cloud
if len(otherVersions) > 0 {
Expand All @@ -602,18 +610,6 @@ func (r *SolrCloudReconciler) reconcileCloudStatus(ctx context.Context, solrClou
return outOfDatePods, outOfDatePodsNotStarted, availableUpdatedPodCount, nil
}

func isPodReadyForBackup(pod *corev1.Pod, solrCloud *solrv1beta1.SolrCloud) bool {
// If solrcloud doesn't request backup support then everything is 'ready' implicitly
if len(solrCloud.Spec.BackupRepositories) == 0 {
return false
}

// TODO: There is no way to possibly do this with the new S3 option.
// This is wrong, but not the end of the world.
// Replace with new functionality in https://github.com/apache/solr-operator/issues/326
return true
}

func (r *SolrCloudReconciler) reconcileNodeService(ctx context.Context, logger logr.Logger, instance *solrv1beta1.SolrCloud, nodeName string) (err error, ip string) {
// Generate Node Service
service := util.GenerateNodeService(instance, nodeName)
Expand Down
54 changes: 53 additions & 1 deletion controllers/solrcloud_controller_backup_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -75,6 +75,7 @@ var _ = FDescribe("SolrCloud controller - Backup Repositories", func() {
PodOptions: &solrv1beta1.PodOptions{
EnvVariables: extraVars,
Volumes: extraVolumes,
Annotations: testPodAnnotations,
},
},
BackupRepositories: []solrv1beta1.SolrBackupRepository{
Expand All @@ -98,7 +99,10 @@ var _ = FDescribe("SolrCloud controller - Backup Repositories", func() {

// Annotations for the solrxml configMap
solrXmlMd5 := fmt.Sprintf("%x", md5.Sum([]byte(configMap.Data[util.SolrXmlFile])))
Expect(statefulSet.Spec.Template.Annotations).To(HaveKeyWithValue(util.SolrXmlMd5Annotation, solrXmlMd5), "Wrong solr.xml MD5 annotation in the pod template!")
Expect(statefulSet.Spec.Template.Annotations).To(Equal(util.MergeLabelsOrAnnotations(testPodAnnotations, map[string]string{
"solr.apache.org/solrXmlMd5": solrXmlMd5,
util.SolrBackupRepositoriesAnnotation: "test-repo",
})), "Incorrect pod annotations")

// Env Variable Tests
expectedEnvVars := map[string]string{
Expand Down Expand Up @@ -273,4 +277,52 @@ var _ = FDescribe("SolrCloud controller - Backup Repositories", func() {
})
})
})

FContext("Multiple Repositories - Annotations", func() {
BeforeEach(func() {
solrCloud.Spec = solrv1beta1.SolrCloudSpec{
ZookeeperRef: &solrv1beta1.ZookeeperRef{
ConnectionInfo: &solrv1beta1.ZookeeperConnectionInfo{
InternalConnectionString: "host:7271",
},
},
CustomSolrKubeOptions: solrv1beta1.CustomSolrKubeOptions{
PodOptions: &solrv1beta1.PodOptions{
EnvVariables: extraVars,
Volumes: extraVolumes,
},
},
BackupRepositories: []solrv1beta1.SolrBackupRepository{
{
Name: "test-repo",
S3: &solrv1beta1.S3Repository{
Region: "test-region",
Bucket: "test-bucket",
},
},
{
Name: "another",
S3: &solrv1beta1.S3Repository{
Region: "test-region-2",
Bucket: "test-bucket-2",
},
},
},
}
})
FIt("has the correct resources", func() {
By("testing the Solr ConfigMap")
configMap := expectConfigMap(ctx, solrCloud, solrCloud.ConfigMapName(), map[string]string{"solr.xml": util.GenerateSolrXMLStringForCloud(solrCloud)})

By("testing the Solr StatefulSet with explicit volumes and envVars before adding S3Repo credentials")
// Make sure envVars and Volumes are correct be
statefulSet := expectStatefulSet(ctx, solrCloud, solrCloud.StatefulSetName())

// Annotations for the solrxml configMap
Expect(statefulSet.Spec.Template.Annotations).To(Equal(map[string]string{
"solr.apache.org/solrXmlMd5": fmt.Sprintf("%x", md5.Sum([]byte(configMap.Data["solr.xml"]))),
util.SolrBackupRepositoriesAnnotation: "another,test-repo",
}), "Incorrect pod annotations")
})
})
})
5 changes: 3 additions & 2 deletions controllers/solrprometheusexporter_controller_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -190,8 +190,9 @@ var _ = FDescribe("SolrPrometheusExporter controller - General", func() {
Expect(deployment.Labels).To(Equal(util.MergeLabelsOrAnnotations(expectedDeploymentLabels, testDeploymentLabels)), "Incorrect deployment labels")
Expect(deployment.Annotations).To(Equal(testDeploymentAnnotations), "Incorrect deployment annotations")
Expect(deployment.Spec.Template.ObjectMeta.Labels).To(Equal(util.MergeLabelsOrAnnotations(expectedDeploymentLabels, testPodLabels)), "Incorrect pod labels")
testPodAnnotations[util.PrometheusExporterConfigXmlMd5Annotation] = fmt.Sprintf("%x", md5.Sum([]byte(testExporterConfig)))
Expect(deployment.Spec.Template.ObjectMeta.Annotations).To(Equal(testPodAnnotations), "Incorrect pod annotations")
Expect(deployment.Spec.Template.ObjectMeta.Annotations).To(Equal(util.MergeLabelsOrAnnotations(testPodAnnotations, map[string]string{
util.PrometheusExporterConfigXmlMd5Annotation: fmt.Sprintf("%x", md5.Sum([]byte(testExporterConfig))),
})), "Incorrect pod annotations")

Expect(deployment.Spec.Template.Spec.Containers).To(HaveLen(len(extraContainers1)+1), "Wrong number of containers for the Deployment")
Expect(deployment.Spec.Template.Spec.Containers[1:]).To(Equal(extraContainers1), "Incorrect sidecar containers")
Expand Down
Loading