diff --git a/content/docs/extensions/deployment/kubernetes.md b/content/docs/extensions/deployment/kubernetes.md deleted file mode 100644 index e3542a49..00000000 --- a/content/docs/extensions/deployment/kubernetes.md +++ /dev/null @@ -1,503 +0,0 @@ -# Kubernetes Deployments Support - -Deploy a model to a kubernetes cluster exposing its prediction endpoints through -a service. - -## Preparation - -- Make sure you have a Kubernetes cluster accessible, with the corresponding - kubeconfig file available. -- The cluster has access to a docker registry so as to pull docker images. -- Relevant permissions to create resources on the cluster -- deployment, - service, etc. are present. -- Nodes are accessible and reachable, with an external IP address (valid for a - NodePort service, more details to come below). - -## Description - -Deploying to a Kubernetes cluster involves 2 main steps: - -1. Build the docker image and upload it to a registry. -2. Create resources on the Kubernetes cluster -- specifically, a `namespace`, a - `deployment` and a `service`. - -Once this is done, one can use the usual workflow of -[`mlem deployment run`](/doc/command-reference/deployment/run) to deploy on -Kubernetes. - - - -You can use [`mlem types deployment kubernetes`](/doc/command-reference/types) -to list all the configurable parameters. - - - -Most of the configurable parameters in the list above come with sensible -defaults. But at the least, one needs to follow the structure given below: - -```cli -$ mlem deployment run service_name --model model --env kubernetes --conf service_type=loadbalancer - -⏳️ Loading model from model.mlem -💾 Saving deployment to service_name.mlem -🛠 Creating docker image ml - 🛠 Building MLEM wheel file... - 💼 Adding model files... - 🛠 Generating dockerfile... - 💼 Adding sources... - 💼 Generating requirements file... - 🛠 Building docker image ml:4ee45dc33804b58ee2c7f2f6be447cda... - ✅ Built docker image ml:4ee45dc33804b58ee2c7f2f6be447cda -namespace created. 
status='{'conditions': None, 'phase': 'Active'}' -deployment created. status='{'available_replicas': None, - 'collision_count': None, - 'conditions': None, - 'observed_generation': None, - 'ready_replicas': None, - 'replicas': None, - 'unavailable_replicas': None, - 'updated_replicas': None}' -service created. status='{'conditions': None, 'load_balancer': {'ingress': None}}' -✅ Deployment ml is up in mlem namespace -``` - -where: - -- `service_name` is a name of one's own choice, of which corresponding - `service_name.mlem` and `service_name.mlem.state` files will be created. -- `model` denotes the path to model saved via `mlem`. -- `service_type` is configurable and is passed as `loadbalancer`. The default - value is `nodeport` if not passed. - -### Checking the docker images - -One can check the docker image built via `docker image ls` which should give the -following output: - -``` -REPOSITORY TAG IMAGE ID CREATED SIZE -ml 4ee45dc33804b58ee2c7f2f6be447cda 16cf3d92492f 3 minutes ago 778MB -... -``` - -### Checking the kubernetes resources - -Pods created can be checked via `kubectl get pods -A` which should have a pod in -the `mlem` namespace present as shown below: - -``` -NAMESPACE NAME READY STATUS RESTARTS AGE -kube-system coredns-6d4b75cb6d-xp68b 1/1 Running 7 (12m ago) 7d22h -... -kube-system storage-provisioner 1/1 Running 59 (11m ago) 54d -mlem ml-cddbcc89b-zkfhx 1/1 Running 0 5m58s -``` - -By default, all resources are created in the `mlem` namespace. This ofcourse is -configurable using `--conf namespace=prod` where `prod` is the desired namespace -name. - -### Making predictions via mlem - -One can of course use the -[`mlem deployment apply`](/doc/command-reference/deployment/apply) command to -ping the deployed endpoint to get the predictions back. 
An example could be: - -```cli -$ mlem deployment apply service_name data --json - -[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2] -``` - -where `data` is the dataset saved via `mlem`. - -### Deleting the Kubernetes resources - -A model can easily be undeployed using `mlem deploy remove service_name` which -will delete the `pods`, `services` and the `namespace` i.e. clear the resources -from the cluster. The docker image will still persist in the registry though. - -
- -### ⚙️ About which cluster to use - -MLEM tries to find the kubeconfig file from the environment variable -`KUBECONFIG` or the default location `~/.kube/config`. - -If you need to use another path, one can pass it with - -`--conf kube_config_file_path=...` - -
- -## Case Study: Using EKS cluster with ECR on AWS - -The deployment to a cloud managed kubernetes cluster such as EKS is simple and -analogous to how it is done in the steps above for a local cluster (such as -minikube). - - - -To setup an EKS cluster, you can simply use [`eksctl`](https://eksctl.io/) - -A simple command such as - -```cli -eksctl create cluster --name cluster-name --region us-east-1 -``` - -will setup an EKS cluster for you with default parameters such as two `m5.large` -worker nodes. - -Other tools such as -[`terraform`](https://learn.hashicorp.com/tutorials/terraform/eks) can also be -used. - - - -The popular docker registry choice to be used with EKS is ECR (Elastic Container -Registry). Make sure the EKS cluster has at least read access to ECR. - -### ECR - -Make sure you have a repository in ECR where docker images can be uploaded. In -the sample screenshot below, there exists a `classifier` repository: - -![alt text](/img/ecr.png) - -### Using MLEM with ECR and EKS - -Provided that the default kubeconfig file (present at `~/.kube/config`) can -communicate with EKS, execute the following command: - -```cli -$ mlem deploy run service_name --model model --env kubernetes --conf registry=ecr --conf registry.account=342840881361 --conf registry.region="us-east-1" --conf registry.host="342840881361.dkr.ecr.us-east-1.amazonaws.com/classifier" --conf image_name=classifier --conf service_type=loadbalancer - -⏳️ Loading model from model.mlem -💾 Saving deployment to service_name.mlem -🛠 Creating docker image classifier - 🛠 Building MLEM wheel file... - 💼 Adding model files... - 🛠 Generating dockerfile... - 💼 Adding sources... - 💼 Generating requirements file... - 🛠 Building docker image 342840881361.dkr.ecr.us-east-1.amazonaws.com/classifier:4ee45dc33804b58ee2c7f2f6be447cda... 
- 🗝 Logged in to remote registry at host 342840881361.dkr.ecr.us-east-1.amazonaws.com - ✅ Built docker image 342840881361.dkr.ecr.us-east-1.amazonaws.com/classifier:4ee45dc33804b58ee2c7f2f6be447cda - 🔼 Pushing image 342840881361.dkr.ecr.us-east-1.amazonaws.com/classifier:4ee45dc33804b58ee2c7f2f6be447cda to -342840881361.dkr.ecr.us-east-1.amazonaws.com - ✅ Pushed image 342840881361.dkr.ecr.us-east-1.amazonaws.com/classifier:4ee45dc33804b58ee2c7f2f6be447cda to -342840881361.dkr.ecr.us-east-1.amazonaws.com -namespace created. status='{'conditions': None, 'phase': 'Active'}' -deployment created. status='{'available_replicas': None, - 'collision_count': None, - 'conditions': None, - 'observed_generation': None, - 'ready_replicas': None, - 'replicas': None, - 'unavailable_replicas': None, - 'updated_replicas': None}' -service created. status='{'conditions': None, 'load_balancer': {'ingress': None}}' -✅ Deployment classifier is up in mlem namespace -``` - -- Note that the repository name in ECR i.e. `classifier` has to match with the - `image_name` supplied through `--conf` - -### Checking the docker images - -One can check the docker image built via `docker image ls` which should give the -following output: - -``` -REPOSITORY TAG IMAGE ID CREATED SIZE -342840881361.dkr.ecr.us-east-1.amazonaws.com/classifier 4ee45dc33804b58ee2c7f2f6be447cda 96afb03ad6f5 2 minutes ago 778MB -... -``` - -This can also be verified in ECR: - -![alt text](/img/ecr_image.png) - -### Checking the kubernetes resources - -Pods created can be checked via `kubectl get pods -A` which should have a pod in -the `mlem` namespace present as shown below: - -``` -NAMESPACE NAME READY STATUS RESTARTS AGE -kube-system aws-node-pr8cn 1/1 Running 0 11m -... -kube-system kube-proxy-dfxsv 1/1 Running 0 11m -mlem classifier-687655f977-h7wsl 1/1 Running 0 83s -``` - -By default, all resources are created in the `mlem` namespace. 
This ofcourse is -configurable using `--conf namespace=prod` where `prod` is the desired namespace -name. - -Services created can be checked via `kubectl get svc -A` which should look like -the following: - -``` -NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE -default kubernetes ClusterIP 10.100.0.1 443/TCP 20m -kube-system kube-dns ClusterIP 10.100.0.10 53/UDP,53/TCP 20m -mlem classifier LoadBalancer 10.100.87.16 a069daf48f9f244338a4bf5c60c6b823-1734837081.us-east-1.elb.amazonaws.com 8080:32067/TCP 2m32s -``` - -### Making predictions via mlem or otherwise - -One can clearly visit the External IP of the service `classifier` created by -`mlem` i.e. - -**a069daf48f9f244338a4bf5c60c6b823-1734837081.us-east-1.elb.amazonaws.com:8080** - -using the browser and see the usual FastAPI docs page: - -![alt text](/img/fastapi.png) - -But one can also use the -[`mlem deployment apply`](/doc/command-reference/deployment/apply) command to -ping the deployed endpoint to get the predictions back. An example could be: - -```cli -$ mlem deployment apply service_name data --json - -[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2] -``` - -i.e. `mlem` knows how to calculate the externally reachable endpoint given the -service type. - -### A note about NodePort Service - - - -While the example discussed above deploys a LoadBalancer Service Type, but one -can also use NodePort (which is the default) OR via -`--conf service_type=nodeport` - -While `mlem` knows how to calculate externally reachable IP address, make sure -the EC2 machine running the pod has external traffic allowed to it. 
This can be -configured in the inbound rules of the node's security group. - -This can be seen as the last rule being added below: - -![alt text](/img/inbound.png) - - - -## Swapping the model in deployment - -If you want to change the model that is currently under deployment, simply run - -```cli -$ mlem deploy run service_name --model other-model -``` - -This will build a new docker image corresponding to the `other-model` and will -terminate the existing pod and create a new one, thereby replacing it, without -downtime. - -This can be seen below: - -### Checking the docker images - -``` -REPOSITORY TAG IMAGE ID CREATED SIZE -342840881361.dkr.ecr.us-east-1.amazonaws.com/classifier d57e4cacec82ebd72572d434ec148f1d 9bacd4cd9cc0 11 minutes ago 2.66GB -342840881361.dkr.ecr.us-east-1.amazonaws.com/classifier 4ee45dc33804b58ee2c7f2f6be447cda 26cb86b55bc4 About an hour ago 778MB -... -``` - -Notice how a new docker image with the tag `d57e4cacec82ebd72572d434ec148f1d` is -built. - -### Checking the deployment process - -``` -⏳️ Loading deployment from service_name.mlem -⏳️ Loading model from other-model.mlem -🛠 Creating docker image classifier - 🛠 Building MLEM wheel file... - 💼 Adding model files... - 🛠 Generating dockerfile... - 💼 Adding sources... - 💼 Generating requirements file... - 🛠 Building docker image 342840881361.dkr.ecr.us-east-1.amazonaws.com/classifier:d57e4cacec82ebd72572d434ec148f1d... 
- 🗝 Logged in to remote registry at host 342840881361.dkr.ecr.us-east-1.amazonaws.com - ✅ Built docker image 342840881361.dkr.ecr.us-east-1.amazonaws.com/classifier:d57e4cacec82ebd72572d434ec148f1d - 🔼 Pushing image 342840881361.dkr.ecr.us-east-1.amazonaws.com/classifier:d57e4cacec82ebd72572d434ec148f1d to 342840881361.dkr.ecr.us-east-1.amazonaws.com - ✅ Pushed image 342840881361.dkr.ecr.us-east-1.amazonaws.com/classifier:d57e4cacec82ebd72572d434ec148f1d to 342840881361.dkr.ecr.us-east-1.amazonaws.com -✅ Deployment classifier is up in mlem namespace -``` - -Here, an existing deployment i.e. `service_name` is used but with a newer model. -Hence, details of registry need not be passed again. The contents of -`service_name` can be checked by inspecting the `service_name.mlem` file. - -### Checking the kubernetes resources - -We can see the existing pod being terminated and the new one running in its -place below: - -``` -NAMESPACE NAME READY STATUS RESTARTS AGE -kube-system aws-node-pr8cn 1/1 Running 0 90m -... 
-kube-system kube-proxy-dfxsv 1/1 Running 0 90m -mlem classifier-66b9588df5-wmc2v 1/1 Running 0 99s -mlem classifier-687655f977-bm4w8 1/1 Terminating 0 60m -``` - -## Requirements - -```bash -pip install mlem[kubernetes] -# or -pip install kubernetes docker -``` - -## Examples - -```python - -``` - -## Implementation reference - -### `class K8sYamlBuilder` - -**MlemABC parent type**: `builder` - -**MlemABC type**: `kubernetes` - - MlemBuilder implementation for building Kubernetes manifests/yamls - -**Fields**: - -- `target: str` _(required)_ - Target path for the manifest/yaml - -- `namespace: str = "mlem"` - Namespace to create kubernetes resources such as - pods, service in - -- `image_name: str = "ml"` - Name of the docker image to be deployed - -- `image_uri: str = "ml:latest"` - URI of the docker image to be deployed - -- `image_pull_policy: ImagePullPolicy = "Always"` - Image pull policy for the - docker image to be deployed - -- `port: int = 8080` - Port where the service should be available - -- `service_type: ServiceType = NodePortService()` - Type of service by which - endpoints of the model are exposed - ---- - -### `class K8sDeploymentState` - -**MlemABC parent type**: `deploy_state` - -**MlemABC type**: `kubernetes` - - DeployState implementation for Kubernetes deployments - -**Fields**: - -- `model_hash: str` - hash of deployed model meta - -- `image: DockerImage` - Docker Image being used for Deployment - -- `deployment_name: str` - Name of Deployment - ---- - -### `class K8sDeployment` - -**MlemABC parent type**: `deployment` - -**MlemABC type**: `kubernetes` - - MlemDeployment implementation for Kubernetes deployments - -**Fields**: - -- `namespace: str = "mlem"` - Namespace to create kubernetes resources such as - pods, service in - -- `image_name: str = "ml"` - Name of the docker image to be deployed - -- `image_uri: str = "ml:latest"` - URI of the docker image to be deployed - -- `image_pull_policy: ImagePullPolicy = "Always"` - Image pull 
policy for the - docker image to be deployed - -- `port: int = 8080` - Port where the service should be available - -- `service_type: ServiceType = NodePortService()` - Type of service by which - endpoints of the model are exposed - -- `state_manager: StateManager` - State manager used - -- `server: Server` - Type of Server to use, with options such as FastAPI, - RabbitMQ etc. - -- `registry: DockerRegistry = DockerRegistry()` - Docker registry - -- `daemon: DockerDaemon = host=''` - Docker daemon - -- `kube_config_file_path: str` - Path for kube config file of the cluster - ---- - -### `class K8sEnv` - -**MlemABC parent type**: `env` - -**MlemABC type**: `kubernetes` - - MlemEnv implementation for Kubernetes Environments - -**Fields**: - -- `registry: DockerRegistry` - Docker registry - ---- - -### `class ClusterIPService` - -**MlemABC parent type**: `k8s_service_type` - -**MlemABC type**: `clusterip` - - ClusterIP Service implementation for service inside a Kubernetes - Cluster - -**No fields** - ---- - -### `class LoadBalancerService` - -**MlemABC parent type**: `k8s_service_type` - -**MlemABC type**: `loadbalancer` - - LoadBalancer Service implementation for service inside a Kubernetes - Cluster - -**No fields** - ---- - -### `class NodePortService` - -**MlemABC parent type**: `k8s_service_type` - -**MlemABC type**: `nodeport` - - NodePort Service implementation for service inside a Kubernetes Cluster - -**No fields** diff --git a/content/docs/object-reference/deployment/kubernetes.md b/content/docs/object-reference/deployment/kubernetes.md index 0af08039..145915ea 100644 --- a/content/docs/object-reference/deployment/kubernetes.md +++ b/content/docs/object-reference/deployment/kubernetes.md @@ -1,6 +1,53 @@ -# kubernetes +## kubernetes -## `class K8sDeployment` +### `class K8sYamlBuilder` + +**MlemABC parent type**: `builder` + +**MlemABC type**: `kubernetes` + + MlemBuilder implementation for building Kubernetes manifests/yamls + +**Fields**: + +- `target: 
str` _(required)_ - Target path for the manifest/yaml + +- `namespace: str = "mlem"` - Namespace to create kubernetes resources such as + pods, service in + +- `image_name: str = "ml"` - Name of the docker image to be deployed + +- `image_uri: str = "ml:latest"` - URI of the docker image to be deployed + +- `image_pull_policy: ImagePullPolicy = "Always"` - Image pull policy for the + docker image to be deployed + +- `port: int = 8080` - Port where the service should be available + +- `service_type: ServiceType = NodePortService()` - Type of service by which + endpoints of the model are exposed + +--- + +### `class K8sDeploymentState` + +**MlemABC parent type**: `deploy_state` + +**MlemABC type**: `kubernetes` + + DeployState implementation for Kubernetes deployments + +**Fields**: + +- `model_hash: str` - hash of deployed model meta + +- `image: DockerImage` - Docker Image being used for Deployment + +- `deployment_name: str` - Name of Deployment + +--- + +### `class K8sDeployment` **MlemABC parent type**: `deployment` @@ -38,32 +85,52 @@ --- -## `class K8sDeploymentState` +### `class K8sEnv` -**MlemABC parent type**: `deploy_state` +**MlemABC parent type**: `env` **MlemABC type**: `kubernetes` - DeployState implementation for Kubernetes deployments + MlemEnv implementation for Kubernetes Environments **Fields**: -- `model_hash: str` - Hash of deployed model meta +- `registry: DockerRegistry` - Docker registry -- `image: DockerImage` - Docker Image being used for Deployment +--- -- `deployment_name: str` - Name of Deployment +### `class ClusterIPService` + +**MlemABC parent type**: `k8s_service_type` + +**MlemABC type**: `clusterip` + + ClusterIP Service implementation for service inside a Kubernetes + Cluster + +**No fields** --- -## `class K8sEnv` +### `class LoadBalancerService` -**MlemABC parent type**: `env` +**MlemABC parent type**: `k8s_service_type` -**MlemABC type**: `kubernetes` +**MlemABC type**: `loadbalancer` - MlemEnv implementation for Kubernetes 
Environments + LoadBalancer Service implementation for service inside a Kubernetes + Cluster -**Fields**: +**No fields** -- `registry: DockerRegistry` - Docker registry +--- + +### `class NodePortService` + +**MlemABC parent type**: `k8s_service_type` + +**MlemABC type**: `nodeport` + + NodePort Service implementation for service inside a Kubernetes Cluster + +**No fields** diff --git a/content/docs/user-guide/deploying/kubernetes.md b/content/docs/user-guide/deploying/kubernetes.md index 634da5f0..20990c76 100644 --- a/content/docs/user-guide/deploying/kubernetes.md +++ b/content/docs/user-guide/deploying/kubernetes.md @@ -1,8 +1,17 @@ # Kubernetes -## Description +Deploy a model to a kubernetes cluster exposing its prediction endpoints through +a service. + +## Preparation -**TODO** +- Make sure you have a Kubernetes cluster accessible, with the corresponding + kubeconfig file available. +- The cluster has access to a docker registry so as to pull docker images. +- Relevant permissions to create resources on the cluster -- deployment, + service, etc. are present. +- Nodes are accessible and reachable, with an external IP address (valid for a + NodePort service, more details to come below). ## Requirements @@ -12,6 +21,344 @@ pip install mlem[kubernetes] pip install kubernetes docker ``` +## Description + +Deploying to a Kubernetes cluster involves 2 main steps: + +1. Build the docker image and upload it to a registry. +2. Create resources on the Kubernetes cluster -- specifically, a `namespace`, a + `deployment` and a `service`. + +Once this is done, one can use the usual workflow of +[`mlem deployment run`](/doc/command-reference/deployment/run) to deploy on +Kubernetes. + + + +You can use [`mlem types deployment kubernetes`](/doc/command-reference/types) +to list all the configurable parameters. + + + +Most of the configurable parameters in the list above come with sensible +defaults. 
At a minimum, the command needs to follow the structure given below: + +```cli +$ mlem deployment run service_name --model model --env kubernetes --conf service_type=loadbalancer + +⏳️ Loading model from model.mlem +💾 Saving deployment to service_name.mlem +🛠 Creating docker image ml + 🛠 Building MLEM wheel file... + 💼 Adding model files... + 🛠 Generating dockerfile... + 💼 Adding sources... + 💼 Generating requirements file... + 🛠 Building docker image ml:4ee45dc33804b58ee2c7f2f6be447cda... + ✅ Built docker image ml:4ee45dc33804b58ee2c7f2f6be447cda +namespace created. status='{'conditions': None, 'phase': 'Active'}' +deployment created. status='{'available_replicas': None, + 'collision_count': None, + 'conditions': None, + 'observed_generation': None, + 'ready_replicas': None, + 'replicas': None, + 'unavailable_replicas': None, + 'updated_replicas': None}' +service created. status='{'conditions': None, 'load_balancer': {'ingress': None}}' +✅ Deployment ml is up in mlem namespace +``` + +where: + +- `service_name` is a name of one's own choice, for which the corresponding + `service_name.mlem` and `service_name.mlem.state` files will be created. +- `model` denotes the path to the model saved via `mlem`. +- `service_type` is configurable and is passed as `loadbalancer`. The default + value is `nodeport` if not passed. + +### Checking the docker images + +One can check the docker image built via `docker image ls` which should give the +following output: + +``` +REPOSITORY TAG IMAGE ID CREATED SIZE +ml 4ee45dc33804b58ee2c7f2f6be447cda 16cf3d92492f 3 minutes ago 778MB +... +``` + +### Checking the kubernetes resources + +Pods created can be checked via `kubectl get pods -A` which should have a pod in +the `mlem` namespace present as shown below: + +``` +NAMESPACE NAME READY STATUS RESTARTS AGE +kube-system coredns-6d4b75cb6d-xp68b 1/1 Running 7 (12m ago) 7d22h +... 
+kube-system storage-provisioner 1/1 Running 59 (11m ago) 54d +mlem ml-cddbcc89b-zkfhx 1/1 Running 0 5m58s +``` + +By default, all resources are created in the `mlem` namespace. This is, of course, +configurable using `--conf namespace=prod` where `prod` is the desired namespace +name. + +### Making predictions via mlem + +One can of course use the +[`mlem deployment apply`](/doc/command-reference/deployment/apply) command to +ping the deployed endpoint to get the predictions back. An example could be: + +```cli +$ mlem deployment apply service_name data --json + +[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2] +``` + +where `data` is the dataset saved via `mlem`. + +### Deleting the Kubernetes resources + +A model can easily be undeployed using `mlem deploy remove service_name` which +will delete the `pods`, `services` and the `namespace`, i.e. clear the resources +from the cluster. The docker image will still persist in the registry though. + +
+ +### ⚙️ About which cluster to use + +MLEM tries to find the kubeconfig file from the environment variable +`KUBECONFIG` or the default location `~/.kube/config`. + +If you need to use another path, you can pass it with + +`--conf kube_config_file_path=...` + +
+ +## Case Study: Using EKS cluster with ECR on AWS + +The deployment to a cloud-managed kubernetes cluster such as EKS is simple and +analogous to how it is done in the steps above for a local cluster (such as +minikube). + + + +To set up an EKS cluster, you can simply use [`eksctl`](https://eksctl.io/). + +A simple command such as + +```cli +eksctl create cluster --name cluster-name --region us-east-1 +``` + +will set up an EKS cluster for you with default parameters such as two `m5.large` +worker nodes. + +Other tools such as +[`terraform`](https://learn.hashicorp.com/tutorials/terraform/eks) can also be +used. + + + +A popular docker registry choice to be used with EKS is ECR (Elastic Container +Registry). Make sure the EKS cluster has at least read access to ECR. + +### ECR + +Make sure you have a repository in ECR where docker images can be uploaded. In +the sample screenshot below, there exists a `classifier` repository: + +![alt text](/img/ecr.png) + +### Using MLEM with ECR and EKS + +Provided that the default kubeconfig file (present at `~/.kube/config`) can +communicate with EKS, execute the following command: + +```cli +$ mlem deploy run service_name --model model --env kubernetes --conf registry=ecr --conf registry.account=342840881361 --conf registry.region="us-east-1" --conf registry.host="342840881361.dkr.ecr.us-east-1.amazonaws.com/classifier" --conf image_name=classifier --conf service_type=loadbalancer + +⏳️ Loading model from model.mlem +💾 Saving deployment to service_name.mlem +🛠 Creating docker image classifier + 🛠 Building MLEM wheel file... + 💼 Adding model files... + 🛠 Generating dockerfile... + 💼 Adding sources... + 💼 Generating requirements file... + 🛠 Building docker image 342840881361.dkr.ecr.us-east-1.amazonaws.com/classifier:4ee45dc33804b58ee2c7f2f6be447cda... 
+ 🗝 Logged in to remote registry at host 342840881361.dkr.ecr.us-east-1.amazonaws.com + ✅ Built docker image 342840881361.dkr.ecr.us-east-1.amazonaws.com/classifier:4ee45dc33804b58ee2c7f2f6be447cda + 🔼 Pushing image 342840881361.dkr.ecr.us-east-1.amazonaws.com/classifier:4ee45dc33804b58ee2c7f2f6be447cda to +342840881361.dkr.ecr.us-east-1.amazonaws.com + ✅ Pushed image 342840881361.dkr.ecr.us-east-1.amazonaws.com/classifier:4ee45dc33804b58ee2c7f2f6be447cda to +342840881361.dkr.ecr.us-east-1.amazonaws.com +namespace created. status='{'conditions': None, 'phase': 'Active'}' +deployment created. status='{'available_replicas': None, + 'collision_count': None, + 'conditions': None, + 'observed_generation': None, + 'ready_replicas': None, + 'replicas': None, + 'unavailable_replicas': None, + 'updated_replicas': None}' +service created. status='{'conditions': None, 'load_balancer': {'ingress': None}}' +✅ Deployment classifier is up in mlem namespace +``` + +- Note that the repository name in ECR i.e. `classifier` has to match with the + `image_name` supplied through `--conf` + +### Checking the docker images + +One can check the docker image built via `docker image ls` which should give the +following output: + +``` +REPOSITORY TAG IMAGE ID CREATED SIZE +342840881361.dkr.ecr.us-east-1.amazonaws.com/classifier 4ee45dc33804b58ee2c7f2f6be447cda 96afb03ad6f5 2 minutes ago 778MB +... +``` + +This can also be verified in ECR: + +![alt text](/img/ecr_image.png) + +### Checking the kubernetes resources + +Pods created can be checked via `kubectl get pods -A` which should have a pod in +the `mlem` namespace present as shown below: + +``` +NAMESPACE NAME READY STATUS RESTARTS AGE +kube-system aws-node-pr8cn 1/1 Running 0 11m +... +kube-system kube-proxy-dfxsv 1/1 Running 0 11m +mlem classifier-687655f977-h7wsl 1/1 Running 0 83s +``` + +By default, all resources are created in the `mlem` namespace. 
This is, of course, +configurable using `--conf namespace=prod` where `prod` is the desired namespace +name. + +Services created can be checked via `kubectl get svc -A` which should look like +the following: + +``` +NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE +default kubernetes ClusterIP 10.100.0.1 <none> 443/TCP 20m +kube-system kube-dns ClusterIP 10.100.0.10 <none> 53/UDP,53/TCP 20m +mlem classifier LoadBalancer 10.100.87.16 a069daf48f9f244338a4bf5c60c6b823-1734837081.us-east-1.elb.amazonaws.com 8080:32067/TCP 2m32s +``` + +### Making predictions via mlem or otherwise + +One can visit the `EXTERNAL-IP` of the service `classifier` created by +`mlem`, i.e. + +**a069daf48f9f244338a4bf5c60c6b823-1734837081.us-east-1.elb.amazonaws.com:8080** + +using the browser and see the usual FastAPI docs page: + +![alt text](/img/fastapi.png) + +But one can also use the +[`mlem deployment apply`](/doc/command-reference/deployment/apply) command to +ping the deployed endpoint to get the predictions back. An example could be: + +```cli +$ mlem deployment apply service_name data --json + +[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2] +``` + +i.e. `mlem` knows how to calculate the externally reachable endpoint given the +service type. + +### A note about NodePort Service + + + +The example discussed above deploys a LoadBalancer Service Type, but one +can also use NodePort (which is the default), either by omitting `service_type` or via +`--conf service_type=nodeport`. 
+ +While `mlem` knows how to calculate the externally reachable IP address, make sure +the EC2 machine running the pod has external traffic allowed to it. 
This can be +configured in the inbound rules of the node's security group. + +This can be seen as the last rule being added below: + +![alt text](/img/inbound.png) + + + +## Swapping the model in deployment + +If you want to change the model that is currently under deployment, simply run + +```cli +$ mlem deploy run service_name --model other-model +``` + +This will build a new docker image corresponding to the `other-model` and will +terminate the existing pod and create a new one, thereby replacing it, without +downtime. + +This can be seen below: + +### Checking the docker images + +``` +REPOSITORY TAG IMAGE ID CREATED SIZE +342840881361.dkr.ecr.us-east-1.amazonaws.com/classifier d57e4cacec82ebd72572d434ec148f1d 9bacd4cd9cc0 11 minutes ago 2.66GB +342840881361.dkr.ecr.us-east-1.amazonaws.com/classifier 4ee45dc33804b58ee2c7f2f6be447cda 26cb86b55bc4 About an hour ago 778MB +... +``` + +Notice how a new docker image with the tag `d57e4cacec82ebd72572d434ec148f1d` is +built. + +### Checking the deployment process + +``` +⏳️ Loading deployment from service_name.mlem +⏳️ Loading model from other-model.mlem +🛠 Creating docker image classifier + 🛠 Building MLEM wheel file... + 💼 Adding model files... + 🛠 Generating dockerfile... + 💼 Adding sources... + 💼 Generating requirements file... + 🛠 Building docker image 342840881361.dkr.ecr.us-east-1.amazonaws.com/classifier:d57e4cacec82ebd72572d434ec148f1d... 
+ 🗝 Logged in to remote registry at host 342840881361.dkr.ecr.us-east-1.amazonaws.com + ✅ Built docker image 342840881361.dkr.ecr.us-east-1.amazonaws.com/classifier:d57e4cacec82ebd72572d434ec148f1d + 🔼 Pushing image 342840881361.dkr.ecr.us-east-1.amazonaws.com/classifier:d57e4cacec82ebd72572d434ec148f1d to 342840881361.dkr.ecr.us-east-1.amazonaws.com + ✅ Pushed image 342840881361.dkr.ecr.us-east-1.amazonaws.com/classifier:d57e4cacec82ebd72572d434ec148f1d to 342840881361.dkr.ecr.us-east-1.amazonaws.com +✅ Deployment classifier is up in mlem namespace +``` + +Here, an existing deployment i.e. `service_name` is used but with a newer model. +Hence, details of registry need not be passed again. The contents of +`service_name` can be checked by inspecting the `service_name.mlem` file. + +### Checking the kubernetes resources + +We can see the existing pod being terminated and the new one running in its +place below: + +``` +NAMESPACE NAME READY STATUS RESTARTS AGE +kube-system aws-node-pr8cn 1/1 Running 0 90m +... +kube-system kube-proxy-dfxsv 1/1 Running 0 90m +mlem classifier-66b9588df5-wmc2v 1/1 Running 0 99s +mlem classifier-687655f977-bm4w8 1/1 Terminating 0 60m +``` + ## Examples ```python