This repo contains sample configuration files to install Apache Solr on Amazon Elastic Kubernetes Service (EKS). It also contains the files required to run the demo. The repository walks through the installation and configuration of the following components:
- An Amazon EKS Cluster with three managed node groups
- An Apache Solr cluster, also known as SolrCloud
- A ZooKeeper ensemble, required by SolrCloud
- The Apache Solr autoscaler to scale Solr replicas
- Prometheus to extract custom metrics from the SolrCloud cluster, to be used by the Horizontal Pod Autoscaler
- The Horizontal Pod Autoscaler (HPA) to scale the Pods within the managed node groups, and the Cluster Autoscaler (CA) to scale the compute for the EKS cluster
You will need the following prerequisites:
- An AWS account.
- An AWS Cloud9 workspace. Set up a Cloud9 workspace by following the instructions found here.
- Install the Kubernetes tools eksctl and kubectl, and the AWS CLI.
- Create an IAM role for your Cloud9 workspace.
- Attach the IAM role to the Cloud9 workspace.
- Update the IAM settings for your Cloud9 workspace.
Once the prerequisites are in place, follow the steps below to deploy and configure the environment.
- From a terminal in your Cloud9 workspace, clone this Git repository and change to the config directory:
git clone <repo_url> apache-solr-k8s-main
cd apache-solr-k8s-main/config
- Create an Amazon EKS cluster using eksctl. Note: replace <region of choice> with the AWS region where you wish to deploy your EKS cluster, for example --region=us-west-2.
eksctl create cluster --version=1.21 \
--name=solr8demo \
--region=<region of choice> \
--node-private-networking \
--alb-ingress-access \
--asg-access \
--without-nodegroup
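Before moving on, you can confirm that the control plane was created and that kubectl can reach it (eksctl updates your kubeconfig automatically); the commands below assume the solr8demo name and region used above:
# List the cluster in the chosen region
eksctl get cluster --region=<region of choice>
# Confirm kubectl can reach the new control plane
kubectl cluster-info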
- Create the Managed Node Groups in private subnets within the cluster using:
⚠️ The managed node groups config file uses the EC2 instance type m5.xlarge, which is not free tier eligible, so your AWS account may incur charges for EC2. For pricing details of Amazon Elastic Kubernetes Service, refer to the Amazon EKS pricing page.
eksctl create nodegroup -f managedNodegroups.yml
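As a quick check, confirm that the node groups were created and that their nodes have joined the cluster and report a Ready status:
# List the managed node groups attached to the cluster
eksctl get nodegroup --cluster solr8demo --region=<region of choice>
# Worker nodes should show STATUS Ready once they have joined
kubectl get nodes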
- Set up Helm and add the chart repositories needed to install Prometheus:
curl -sSL https://raw.githubusercontent.com/helm/helm/master/scripts/get-helm-3 | bash
helm repo add stable https://charts.helm.sh/stable/
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
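You can verify that Helm 3 was installed and that both chart repositories were added:
helm version --short
helm repo list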
- Install the Kubernetes Metrics Server:
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
Verify that the metrics-server deployment is running the desired number of pods with the following command.
kubectl get deployment metrics-server -n kube-system
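Once the deployment reports the desired number of ready replicas, the Metrics Server should begin serving resource metrics; a quick check (it can take a minute or two before data appears):
# Resource metrics for the worker nodes; an error here usually means the Metrics Server is not ready yet
kubectl top nodes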
- Install ZooKeeper to create the ZooKeeper ensemble required by SolrCloud:
kubectl create configmap zookeeper-ensemble-config --from-env-file=zk-config.properties
kubectl apply -f zookeeper.yml
Check the status of the pods in the ZooKeeper StatefulSet by running the following command:
kubectl get pods -l app=zk
Expected output should look like
NAME READY STATUS RESTARTS AGE
zk-0 1/1 Running 0 4h4m
zk-1 1/1 Running 0 4h3m
zk-2 1/1 Running 0 4h3m
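You can also check the ZooKeeper StatefulSet itself by its label; the selector below assumes the app=zk label shown above:
# All replicas of the ZooKeeper StatefulSet should be ready, for example 3/3
kubectl get statefulset -l app=zk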
- Install Solr and Solr-metrics exporter:
kubectl create configmap solr-cluster-config --from-env-file=solr-config.properties
kubectl apply -f solr-cluster.yml
kubectl apply -f solr-exporter.yml
Check the status of the Solr pods:
kubectl get pods -l app=solr-app
Expected output
NAME READY STATUS RESTARTS AGE
solr-0 1/1 Running 0 3h59m
solr-1 1/1 Running 0 3h59m
solr-2 1/1 Running 0 3h58m
Verify that the Solr exporter service is running on port 9983. This is important because the HPA depends on Solr metrics being exposed to Kubernetes via Prometheus and the Prometheus adapter.
kubectl get service/solr-exporter-service
Expected output (Note: CLUSTER-IP will likely be different)
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
solr-exporter-service ClusterIP 10.100.205.122 <none> 9983/TCP 4h1m
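If you want to inspect the raw metrics the exporter publishes, you can port-forward the service locally; the /metrics path matches the default Prometheus scrape path implied by prom.yml, so treat it as an assumption if solr-exporter.yml overrides it:
# Forward the exporter service to localhost in one terminal...
kubectl port-forward service/solr-exporter-service 9983:9983
# ...then fetch the Prometheus-format metrics from another terminal
curl -s http://localhost:9983/metrics | head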
- Update the prom.yml Prometheus configuration file with the solr-exporter-service IP and port.
Find the solr-exporter-service cluster IP address using the command below:
kubectl get service/solr-exporter-service -o jsonpath='{.spec.clusterIP}'
Update the prometheus.yml property in the prom.yml file as shown below, replacing <solr-exporter-service-IP> with the cluster IP returned by the command above. Save the file.
scrape_configs:
- job_name: prometheus
static_configs:
- targets:
- localhost:9090
- job_name: solr
scheme: http
static_configs:
- targets: ['<solr-exporter-service-IP>:9983']
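If you prefer to script the substitution instead of editing the file by hand, something like the following works from the config directory; this is a sketch that assumes the literal placeholder <solr-exporter-service-IP> is present in prom.yml:
# Capture the exporter's cluster IP and substitute it into prom.yml in place
SOLR_EXPORTER_IP=$(kubectl get service/solr-exporter-service -o jsonpath='{.spec.clusterIP}')
sed -i "s/<solr-exporter-service-IP>/${SOLR_EXPORTER_IP}/" prom.yml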
- Install the Prometheus adapter and Prometheus:
helm install prometheus-adapter prometheus-community/prometheus-adapter \
--set prometheus.url=http://prometheus-server.default.svc.cluster.local \
--set prometheus.port=80 \
--values=adapterConfig.yml
helm install prometheus prometheus-community/prometheus \
--values prom.yml
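After both releases are installed, you can confirm that they deployed and that the adapter's metrics API is responding; the exact metric name and API group (custom vs. external metrics) depend on the rules in adapterConfig.yml, so the query below is only a sanity check:
# Both Helm releases should show a deployed status
helm list
# The custom metrics API should answer once the adapter has scraped Prometheus;
# if adapterConfig.yml registers solr_metrics as an external metric, query /apis/external.metrics.k8s.io/v1beta1 instead
kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1"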
- Configure the Horizontal Pod Autoscaler (HPA) and Cluster Autoscaler (CA) using kubectl:
kubectl apply -f hpa.yml
kubectl apply -f cluster-autoscaler-autodiscover.yaml
Verify that the HPA has been set up correctly:
kubectl describe hpa
Expected output
Name: solr-hpa
Namespace: default
Labels: <none>
Annotations: <none>
CreationTimestamp: Wed, 22 Dec 2021 19:25:18 +0000
Reference: StatefulSet/solr
Metrics: ( current / target )
"solr_metrics" (target value): 4021 / 50k
Min replicas: 3
Max replicas: 20
StatefulSet pods: 20 current / 20 desired
Conditions:
Type Status Reason Message
---- ------ ------ -------
AbleToScale True ReadyForNewScale recommended size matches current size
ScalingActive True ValidMetricFound the HPA was able to successfully calculate a replica count from external metric solr_metrics(nil)
ScalingLimited True TooManyReplicas the desired replica count is more than the maximum replica count
Events: <none>
⚠️ The "solr_metrics" value may be 0 or a lower number while the Solr deployment is being set up; this number is expected to change once Solr receives client requests. Also note that the maxReplicas value used in the hpa.yml config file is set to 10. You may consider changing this to meet the needs of your Solr deployment; maxReplicas defines the maximum number of pods the HPA can scale up to.
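To watch the HPA react while load is applied later, you can leave a watch running in a separate terminal:
# Shows the current metric value, target, and replica count, refreshing as they change
kubectl get hpa solr-hpa --watch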
- Obtain the SolrCloud Administration UI URL by running kubectl get services solr-service from a terminal in your Cloud9 workspace. The URL will be of the form http://<xxxxx>.<region>.elb.amazonaws.com:8983.
- Create a Solr Collection named Books using the Solr Administration UI and upload the sample data file data/books.json.
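If you prefer the command line over the Admin UI, the standard Solr Collections and update APIs can achieve the same result; this is a sketch, and the shard/replica counts and the relative path to books.json are assumptions you may need to adjust:
SOLR_URL=http://<xxxxx>.<region>.elb.amazonaws.com:8983
# Create the Books collection (shard and replica counts are illustrative)
curl "${SOLR_URL}/solr/admin/collections?action=CREATE&name=Books&numShards=1&replicationFactor=3"
# Index the sample documents from the repository's data directory
curl -X POST -H 'Content-type:application/json' --data-binary @../data/books.json "${SOLR_URL}/solr/Books/update?commit=true"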
- Configure the SolrCloud autoscaler by setting a Search Rate Trigger. The autoscaler config can be set using the endpoint http://<xxxxx>.<region>.elb.amazonaws.com:8983/api/cluster/autoscaling/:
curl -X POST -H 'Content-type:application/json' -d '{
"set-trigger": {
"name" : "search_rate_trigger",
"event" : "searchRate",
"collections" : "Books",
"metric" : "QUERY./select.requestTimes:1minRate",
"aboveRate" : 10.0,
"belowRate" : 0.01,
"waitFor" : "30s",
"enabled" : true,
"actions" : [
{
"name" : "compute_plan",
"class": "solr.ComputePlanAction"
},
{
"name" : "execute_plan",
"class": "solr.ExecutePlanAction"
}
]
}
}' http://<xxxxx>.<region>.elb.amazonaws.com:8983/api/cluster/autoscaling/
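You can read back the active autoscaling configuration to confirm the trigger was registered by issuing a GET against the same endpoint:
# The response should list search_rate_trigger under the configured triggers
curl -s http://<xxxxx>.<region>.elb.amazonaws.com:8983/api/cluster/autoscaling/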
A Python script that can be used to test the deployment is included in the scripts directory.
- Change to the scripts directory and make the test script executable:
cd scripts
chmod 744 ./submit_mc_pi_k8s_requests_books.py
- Install the required dependencies
sudo python3 -m pip install -r ./requirements.txt
- Run the script
python ./submit_mc_pi_k8s_requests_books.py -p 1 -r 1 -i 1
To run a short load test, the values of the -p, -r, and -i flags can be increased:
python ./submit_mc_pi_k8s_requests_books.py -p 100 -r 30 -i 30000000 > result.txt
Review the result.txt file to ensure you are getting search query responses from Solr.
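A quick way to confirm the responses without reading the whole file is to count the query hits; this assumes the script writes the raw Solr JSON responses, which include a numFound field, to result.txt:
# Count Solr query responses captured in the output file
grep -c numFound result.txt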
Use the following steps to clean up the Solr environment.
- Uninstall Horizontal Pod Autoscaler (HPA) and Cluster Autoscaler (CA):
kubectl delete -f hpa.yml
kubectl delete -f cluster-autoscaler-autodiscover.yaml
- Uninstall Solr:
kubectl delete -f solr-cluster.yml
kubectl delete configmap solr-cluster-config
kubectl delete -f solr-exporter.yml
- Uninstall Zookeeper:
kubectl delete -f zookeeper.yml
kubectl delete configmap zookeeper-ensemble-config
- Delete the Managed Node Groups:
eksctl delete nodegroup -f managedNodegroups.yml --approve
- Delete the Amazon EKS cluster:
eksctl delete cluster --name=solr8demo
See CONTRIBUTING for more information.
This library is licensed under the MIT-0 License. See the LICENSE file.