Presented at the O'Reilly Artificial Intelligence Conference: "Industrialized capsule networks for text analytics"
Highlights of the session:
- Overview of Capsule Networks (what and why)
- Capsule Networks for text classification
- How to leverage Kubeflow for industrialization
- Set up Kubeflow on GCP with multi-GPU support enabled
- Use TensorFlow to create a CapsNet Estimator [from TensorFlow Keras Model to TensorFlow Estimator]
- Distributed multi-GPU training of CapsNet [using TensorFlow MirroredStrategy]
- Use TF-Job for distributed training on a K8S cluster [for Single-Class & Multi-Class Classification with Multiple Neural Network Architectures]
- Use Katib for highly scalable hyper-parameter tuning [with Random Search, Grid Search and Bayesian Search for hyper-parameters]
- Challenges and Future Work
We would like to acknowledge the work done by researchers and the community in this field.
Research Papers
- Text Classification Using Capsules
- Investigating Capsule Networks with Dynamic Routing for Text Classification
GitHub Repos
We have modified and adapted the following implementation, focusing on the Kubeflow implementation for scalability and performance.
- Clone the GitHub repo
git clone
- Navigate to the code directory
cd capsule-text-kubeflow
Make sure the gcloud SDK is installed and pointing to the right GCP project. You can use
gcloud init
to perform this action.
- Set up environment variables
gcloud config set project ${PROJECT_ID}
gcloud config set compute/zone ${ZONE}
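The commands in this guide read `PROJECT_ID`, `ZONE`, and `DEPLOYMENT_NAME` from the environment, but the text does not show where they are defined. A minimal sketch of exporting them before running anything else might look like this. The values are hypothetical placeholders, not the authors' settings, and need to be replaced with your own project, zone, and Kubeflow deployment name.

```shell
# Placeholder values (assumptions for illustration); replace with your own.
export PROJECT_ID="my-gcp-project"     # hypothetical GCP project ID
export ZONE="us-central1-a"            # hypothetical zone
export DEPLOYMENT_NAME="kubeflow"      # hypothetical Kubeflow deployment name

# Abort with a clear message if any of the variables is empty or unset.
: "${PROJECT_ID:?PROJECT_ID must be set}"
: "${ZONE:?ZONE must be set}"
: "${DEPLOYMENT_NAME:?DEPLOYMENT_NAME must be set}"

echo "project=${PROJECT_ID} zone=${ZONE} deployment=${DEPLOYMENT_NAME}"
```

Exporting the variables once per shell session keeps the later `gcloud`, `docker`, and `kustomize` commands copy-pasteable as written.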
- Use the one-click deploy interface provided by GCP to set up Kubeflow. For more details, refer to the official documentation.
Once the deployment is complete, you can connect to the cluster.
- Connecting to the cluster
gcloud container clusters get-credentials ${DEPLOYMENT_NAME} \
--project ${PROJECT_ID} \
--zone ${ZONE}
Set context
kubectl config set-context $(kubectl config current-context) --namespace=kubeflow
kubectl get all
- If you want to use GPUs for training, you can add a GPU-backed node pool to the Kubernetes cluster.
gcloud container node-pools create accel \
--project ${PROJECT_ID} \
--zone ${ZONE} \
--cluster ${DEPLOYMENT_NAME} \
--accelerator type=nvidia-tesla-k80,count=1 \
--num-nodes 1 \
--machine-type n1-highmem-8 \
--disk-size=220 \
--scopes cloud-platform \
--verbosity error
- Then install the required NVIDIA drivers so that the GPUs can be used.
kubectl apply -f
You can then open the Ambassador interface from your GCP console: Kubernetes Engine -> Services -> Ambassador -> click on "Port Forwarding". Follow the instructions to open the Ambassador interface.
Build a custom TensorFlow image with all GPU driver configuration (including installing CUDATOOLKIT).
# allow docker to access our GCR registry
gcloud auth configure-docker --quiet
cd jupyter-image && make build PROJECT_ID=$PROJECT_ID && cd ..
cd jupyter-image && make push PROJECT_ID=$PROJECT_ID && cd ..
Use Notebooks in the Ambassador UI to run your experiments. Select the custom image option and enter the name of the image you just built. You can also set the resources and GPUs.
Upload the notebook available in the notebooks subfolder inside the code directory.
- Install Kustomize
Set current working directory
cd multi-label
# download kustomize for Linux (including Cloud Shell)
# for macOS use kustomize_2.0.3_darwin_amd64; for Linux use kustomize_2.0.3_linux_amd64
# download the kustomize binary
wget --no-check-certificate \$KS_VER
mv kustomize_2.0.3_darwin_amd64 kustomize
# make kustomize executable and add it to PATH
chmod +x kustomize
export PATH=${PATH}:$(pwd)
# check the kustomize version and ensure it is 2.0.3
kustomize version
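The comments above name a different release asset for macOS and Linux. A small sketch of choosing the right one automatically with `uname -s` could look like the following. `KUSTOMIZE_ASSET` is a hypothetical variable name, and the two asset names are taken from the comments above. The download URL is deliberately left out because it is not given here.

```shell
# Pick the kustomize 2.0.3 asset name that matches the current operating system.
case "$(uname -s)" in
  Darwin) KUSTOMIZE_ASSET="kustomize_2.0.3_darwin_amd64" ;;
  Linux)  KUSTOMIZE_ASSET="kustomize_2.0.3_linux_amd64" ;;
  *)
    echo "unsupported OS: $(uname -s)" >&2
    KUSTOMIZE_ASSET=""
    ;;
esac

echo "asset: ${KUSTOMIZE_ASSET}"
```

With the asset name in a variable, the rename step becomes `mv "${KUSTOMIZE_ASSET}" kustomize` on either platform.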
- Build Image
cd $WORKING_DIR/train-image/
# sample image name and tag: capsnet-kubeflow:v1
# set the path on GCR you want to push the image to
# build the TensorFlow model into a container
# the container is tagged with its eventual path on GCR, but it stays local for now
docker build $WORKING_DIR/train-image -t $TRAIN_PATH -f $WORKING_DIR/train-image/Dockerfile.model
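The build command relies on `WORKING_DIR` and `TRAIN_PATH`, neither of which is defined in the text above. One way to set them is sketched below. It assumes `WORKING_DIR` is meant to be the `multi-label` directory entered earlier, reuses the sample name and tag `capsnet-kubeflow:v1` from the comment, and follows the usual `gcr.io/<project>/<image>:<tag>` layout for Container Registry paths. `IMAGE_NAME` and `VERSION_TAG` are hypothetical helper variables.

```shell
# Assumption: run this from the multi-label directory, before changing into train-image.
WORKING_DIR="$(pwd)"

# Fall back to a placeholder project ID if none is set yet (for illustration only).
PROJECT_ID="${PROJECT_ID:-my-gcp-project}"

# Sample image name and tag from the comment above.
IMAGE_NAME="capsnet-kubeflow"
VERSION_TAG="v1"

# Full image path on Google Container Registry.
TRAIN_PATH="gcr.io/${PROJECT_ID}/${IMAGE_NAME}:${VERSION_TAG}"

echo "image will be tagged as ${TRAIN_PATH}"
```

Bumping `VERSION_TAG` for every rebuild makes it easier to tell which image a given training job actually used.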
- Check locally
docker run -it $TRAIN_PATH
- Push Docker image to GCR
# allow docker to access our GCR registry
gcloud auth configure-docker --quiet
# push the container to GCR
docker push $TRAIN_PATH
Move to the training folder
cd $WORKING_DIR/training/GCS
- check service account access
gcloud --project=$PROJECT_ID iam service-accounts list | grep $DEPLOYMENT_NAME
- check kubernetes secrets
kubectl describe secret user-gcp-sa
- Train on the cluster
The example below trains the CNN model. For capsule A, set modelType to "capsule-A"; for capsule B, set modelType to "capsule-B". Change the job name accordingly.
# set the parameters for this job: CNN
kustomize edit add configmap capsule-map-training --from-literal=modelType=CNN
kustomize edit add configmap capsule-map-training --from-literal=name=train-capsnet-text-cnn-1
kustomize edit set image training-image=$TRAIN_PATH
kustomize edit add configmap capsule-map-training --from-literal=learningRate=0.0005
kustomize edit add configmap capsule-map-training --from-literal=batchSize=25
kustomize edit add configmap capsule-map-training --from-literal=numEpochs=5
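Only two of the settings above change when switching model types: `modelType` and `name`. This can be sketched as a small helper that prints the two matching `kustomize edit` commands so they can be reviewed before being run. `print_job_config` and its run-number argument are hypothetical. The job-name pattern simply mirrors `train-capsnet-text-cnn-1` and is lower-cased because Kubernetes object names must be lowercase.

```shell
# Print (not run) the kustomize commands that differ between model types.
print_job_config() {
  model_type="$1"   # e.g. CNN, capsule-A, capsule-B
  run_id="$2"       # hypothetical run counter used as the job-name suffix

  # Lower-case the model type for use inside the job name.
  suffix=$(printf '%s' "$model_type" | tr '[:upper:]' '[:lower:]')

  echo "kustomize edit add configmap capsule-map-training --from-literal=modelType=${model_type}"
  echo "kustomize edit add configmap capsule-map-training --from-literal=name=train-capsnet-text-${suffix}-${run_id}"
}

print_job_config capsule-A 1
```

Once kustomize is on the PATH and the current directory holds the kustomization file, the printed commands can be executed by piping the output to `sh`.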
- Set Google Application Credentials
# set credentials
kustomize edit add configmap capsule-map-training --from-literal=secretName=user-gcp-sa
kustomize edit add configmap capsule-map-training --from-literal=secretMountPath=/var/secrets
kustomize edit add configmap capsule-map-training --from-literal=GOOGLE_APPLICATION_CREDENTIALS=/var/secrets/user-gcp-sa.json
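The three credential settings above are related: the credentials path is the secret's mount path joined with a key file inside that secret. A sketch of deriving the path from the other two values, so the three cannot drift apart, is shown here. The variable names are hypothetical, and the sketch assumes the key file is named after the secret, as the literal value above suggests.

```shell
SECRET_NAME="user-gcp-sa"
SECRET_MOUNT_PATH="/var/secrets"

# Assumption: the key file inside the secret is called <secret name>.json.
CREDENTIALS_FILE="${SECRET_MOUNT_PATH}/${SECRET_NAME}.json"

echo "GOOGLE_APPLICATION_CREDENTIALS=${CREDENTIALS_FILE}"
```

The derived value can then be passed to the third command as `--from-literal=GOOGLE_APPLICATION_CREDENTIALS=${CREDENTIALS_FILE}`.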
- Train at Scale
kustomize build . | kubectl apply -f -
kubectl describe tfjob
kubectl logs -f train-capsnet-text-cnn-1-chief-0
We use Katib for hyper-parameter tuning. To test Katib, you can try the sample example; adjust the YAML file as needed.
kubectl create -f