Volcano is a system for running high-performance workloads on Kubernetes. It provides a suite of mechanisms currently missing from Kubernetes that are commonly required by many classes of high-performance workloads, including:
- machine learning/deep learning,
- bioinformatics/genomics, and
- other "big data" applications.
These types of applications typically run on generalized domain frameworks such as TensorFlow, Spark, PyTorch, and MPI, which Volcano integrates with.
Some examples of the mechanisms and features that Volcano adds to Kubernetes are:
- Job management extensions and improvements, e.g.:
  - Multi-pod jobs
  - Lifecycle management extensions, including suspend/resume and restart
  - Improved error handling
  - Indexed jobs
  - Task dependencies
- Scheduling extensions, e.g.:
  - Co-scheduling
  - Fair-share scheduling
  - Queue scheduling
  - Preemption and reclaims
  - Reservations and backfills
  - Topology-based scheduling
- Runtime extensions, e.g.:
  - Support for specialized container runtimes like Singularity, with GPU accelerator extensions and enhanced security features
- Other
  - Data locality awareness and intelligent scheduling
  - Optimizations for data throughput, round-trip latency, etc.
Volcano builds upon a decade and a half of experience running a wide variety of high performance workloads at scale using several systems and platforms, combined with best-of-breed ideas and practices from the open source community.
The easiest way to deploy Volcano is to use the Helm chart.
First of all, clone the repo to your local path:
# mkdir -p $GOPATH/src/github.com/kubernetes-sigs
# cd $GOPATH/src/github.com/kubernetes-sigs
# git clone https://github.com/kubernetes-sigs/volcano
Official images are available on Docker Hub; however, you can also build them locally with the following commands:
cd $GOPATH/src/volcano.sh/volcano
make images
## Verify your images
# docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
volcanosh/volcano-admission latest a83338506638 8 seconds ago 41.4MB
volcanosh/volcano-scheduler latest faa3c2a25ac3 9 seconds ago 49.6MB
volcanosh/volcano-controllers latest 7b11606ebfb8 10 seconds ago 44.2MB
NOTE: Make sure the images are correctly loaded into your Kubernetes cluster. For
example, if you are using a kind cluster,
run the command kind load docker-image <image-name>:<tag>
for each of the images.
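The per-image step above can be scripted. The following is a minimal sketch, assuming the three image names and the `latest` tag shown in the `docker images` listing; the `DRY_RUN` variable and the `load_images` function name are illustrative and not part of Volcano. By default it only prints the commands so you can review them first.

```shell
#!/bin/sh
# Image names and tag taken from the `docker images` listing above.
IMAGES="volcanosh/volcano-admission volcanosh/volcano-scheduler volcanosh/volcano-controllers"
TAG="latest"

# load_images: run (or, with DRY_RUN=1, just print) one `kind load` per image.
# DRY_RUN is a hypothetical helper variable for this sketch; it defaults to 1.
load_images() {
  for image in $IMAGES; do
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "kind load docker-image ${image}:${TAG}"
    else
      kind load docker-image "${image}:${TAG}"
    fi
  done
}

load_images
```

Set `DRY_RUN=0` to actually load the images into the kind cluster.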
Second, install the required Helm plugin and generate a valid certificate. Volcano uses the Helm plugin gen-admission-secret to generate the certificate that the admission service needs to communicate with the Kubernetes API server.
#1. Install helm plugin
helm plugin install installer/chart/volcano/plugins/gen-admission-secret
#2. Generate the secret for the named admission service
helm gen-admission-secret --service <specified-name>-admission-service --namespace <namespace>
## For example:
kubectl create namespace volcano-trial
helm gen-admission-secret --service volcano-trial-admission-service --namespace volcano-trial
Finally, install the Helm chart.
helm install installer/chart/volcano --namespace <namespace> --name <specified-name>
For example:
helm install installer/chart/volcano --namespace volcano-trial --name volcano-trial
NOTE: The <specified-name> used in the two commands above must be identical.
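Because the name must match in both commands, it can help to define it once. The following is a minimal sketch, assuming the commands exactly as given above (including the Helm 2 style `--name` flag) and that it is run from the repository root; `RELEASE`, `NAMESPACE`, `DRY_RUN`, and the `run` and `install_volcano` helpers are illustrative names introduced here.

```shell
#!/bin/sh
set -e

# Single source of truth for the name shared by both commands.
RELEASE="${RELEASE:-volcano-trial}"
NAMESPACE="${NAMESPACE:-volcano-trial}"

# run: print the command, and execute it unless DRY_RUN=1 (the default here).
# DRY_RUN is a hypothetical helper variable for this sketch.
run() {
  echo "+ $*"
  if [ "${DRY_RUN:-1}" != "1" ]; then
    "$@"
  fi
}

# install_volcano: the steps from this section, with the name derived once.
install_volcano() {
  run kubectl create namespace "$NAMESPACE"
  run helm plugin install installer/chart/volcano/plugins/gen-admission-secret
  run helm gen-admission-secret --service "${RELEASE}-admission-service" --namespace "$NAMESPACE"
  run helm install installer/chart/volcano --namespace "$NAMESPACE" --name "$RELEASE"
}

install_volcano
```

By default the script only prints the commands; set `DRY_RUN=0` to execute them.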
To verify your installation, run the following commands:
#1. Verify the Running Pods
# kubectl get pods --namespace <namespace>
NAME READY STATUS RESTARTS AGE
<specified-name>-admission-84fd9b9dd8-9trxn 1/1 Running 0 43s
<specified-name>-controllers-75dcc8ff89-42v6r 1/1 Running 0 43s
<specified-name>-scheduler-b94cdb867-89pm2 1/1 Running 0 43s
#2. Verify the Services
# kubectl get services --namespace <namespace>
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
<specified-name>-admission-service ClusterIP 10.105.78.53 <none> 443/TCP 91s
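The pod check can also be automated. The following is a minimal sketch, assuming the column layout shown above (STATUS in the third column); `all_running` is a hypothetical helper name, and the pod names in the sample input are made up. It reads `kubectl get pods` output on standard input and succeeds only if at least one pod is listed and every pod is `Running`.

```shell
#!/bin/sh
# all_running: exit 0 if every pod row on stdin has STATUS "Running".
# Skips the header row; fails on empty input so that a missing
# installation is not mistaken for success.
all_running() {
  awk 'NR > 1 { seen = 1; if ($3 != "Running") bad = 1 }
       END { exit ((seen && !bad) ? 0 : 1) }'
}

# Example with captured output. Against a live cluster, pipe the output of
# `kubectl get pods --namespace <namespace>` into all_running instead.
if all_running <<'EOF'
NAME                             READY   STATUS    RESTARTS   AGE
demo-admission-84fd9b9dd8-9trxn  1/1     Running   0          43s
demo-scheduler-b94cdb867-89pm2   1/1     Running   0          43s
EOF
then
  echo "all pods running"
else
  echo "some pods are not running"
fi
```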
Volcano also uses a kind cluster to provide a simple way to run E2E tests. Make sure the kubectl and kind binaries are installed in your local environment before running the tests with the command below:
make e2e-kind
For debugging, you can keep the Kubernetes cluster environment after the tests finish via:
CLEANUP_CLUSTER=-1 make e2e-kind
To run only a subset of the tests, execute:
TEST_FILE=<test-file-name> make e2e-kind
The command above is ultimately translated into:
go test ./test/e2e/volcano -v -timeout 30m -args --ginkgo.regexScansFilePath=true --ginkgo.focus=<test-file-name>
You can reach the maintainers of this project at:
Slack: #volcano-sh
Mailing List: https://groups.google.com/forum/#!forum/volcano-sh