Commit

[SDK] Make service account configurable for build_image_from_working_dir (kubeflow#3419)

* Add kfp-container-builder sa

* Allow service account to be configurable

* Fix tests

* Fix test

* Use documentation for service account to introduce compatibility with different types of installation

* updated doc

* clean up

* Update container_builder_test.py

* Update _build_image_api.py

* Update kustomization.yaml

* Add executable permission for presubmit tests mkp.sh
Bobgy authored and Jeffwan committed Dec 9, 2020
1 parent 4d3f544 commit 7b0f2bc
Showing 11 changed files with 95 additions and 11 deletions.
@@ -698,3 +698,8 @@ spec:
- containerPort: 8888
- containerPort: 8887
serviceAccountName: ml-pipeline
---
apiVersion: v1
kind: ServiceAccount
metadata:
name: kubeflow-pipelines-container-builder
4 changes: 4 additions & 0 deletions manifests/kustomize/base/pipeline/container-builder-sa.yaml
@@ -0,0 +1,4 @@
apiVersion: v1
kind: ServiceAccount
metadata:
name: kubeflow-pipelines-container-builder
1 change: 1 addition & 0 deletions manifests/kustomize/base/pipeline/kustomization.yaml
@@ -30,3 +30,4 @@ resources:
- pipeline-runner-role.yaml
- pipeline-runner-rolebinding.yaml
- pipeline-runner-sa.yaml
- container-builder-sa.yaml
2 changes: 1 addition & 1 deletion manifests/kustomize/gcp-workload-identity-setup.sh
@@ -22,7 +22,7 @@ USER_GSA=${USER_GSA:-$CLUSTER_NAME-kfp-user}

# Kubernetes Service Account (KSA)
SYSTEM_KSA=(ml-pipeline-ui ml-pipeline-visualizationserver)
USER_KSA=(pipeline-runner default) # default service account is used for container building, TODO: give it a specific name
USER_KSA=(pipeline-runner kubeflow-pipelines-container-builder)

cat <<EOF
1 change: 1 addition & 0 deletions sdk/python/kfp/containers/__init__.py
@@ -12,3 +12,4 @@
# See the License for the speci

from ._build_image_api import *
from ._container_builder import *
7 changes: 7 additions & 0 deletions sdk/python/kfp/containers/_build_image_api.py
@@ -80,6 +80,13 @@ def build_image_from_working_dir(image_name: str = None, working_dir: str = None
timeout: Optional. The image building timeout in seconds.
base_image: Optional. The container image to use as the base for the new image. If not set, the Google Deep Learning Tensorflow CPU image will be used.
builder: Optional. An instance of ContainerBuilder or compatible class that will be used to build the image.
The default builder uses the "kubeflow-pipelines-container-builder" service account in the "kubeflow" namespace. It works with Kubeflow Pipelines clusters installed in the "kubeflow" namespace using Google Cloud Marketplace or Standalone with version > 0.4.0.
If Kubeflow Pipelines is installed in a different namespace, use ContainerBuilder(namespace='<your-kfp-namespace>', ...).
Depending on how you installed Kubeflow Pipelines, you need to configure your ContainerBuilder instance's namespace and service_account:
For clusters installed with Kubeflow >= 0.7, use ContainerBuilder(namespace='<your-user-namespace>', service_account='default-editor', ...). You can omit the namespace if you use the KFP SDK from an in-cluster notebook; it uses the notebook namespace by default.
For clusters installed with Kubeflow < 0.7, use ContainerBuilder(service_account='default', ...).
For clusters installed using Google Cloud Marketplace or Standalone with version <= 0.4.0, use ContainerBuilder(namespace='<your-kfp-namespace>', service_account='default').
Refer to https://www.kubeflow.org/docs/pipelines/installation/overview/ for more details about the different installation options.
Returns:
The full name of the container image including the hash digest. E.g. gcr.io/my-org/my-image@sha256:86c1...793c.
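
As a rough illustration of the docstring guidance above, the choice of namespace and service account can be written as a small lookup. The helper name `builder_kwargs` and the installation labels are hypothetical and not part of the kfp SDK; only the service account names and version boundaries come from the docstring. The returned dictionary is meant to be unpacked into `ContainerBuilder(...)`.

```python
from typing import Dict, Optional

# Hypothetical labels for the installation types named in the docstring.
# The service account names are the ones the docstring recommends.
_SERVICE_ACCOUNTS = {
    'standalone-new': 'kubeflow-pipelines-container-builder',  # Marketplace/Standalone > 0.4.0
    'standalone-old': 'default',                               # Marketplace/Standalone <= 0.4.0
    'kubeflow-new': 'default-editor',                          # Kubeflow >= 0.7
    'kubeflow-old': 'default',                                 # Kubeflow < 0.7
}


def builder_kwargs(install: str, namespace: Optional[str] = None) -> Dict[str, str]:
    """Return keyword arguments for ContainerBuilder (illustrative helper only)."""
    if install not in _SERVICE_ACCOUNTS:
        raise ValueError('unknown installation type: %r' % install)
    kwargs = {'service_account': _SERVICE_ACCOUNTS[install]}
    # Leave the namespace out when it is not given, so that ContainerBuilder
    # can fall back to its own default (notebook namespace or 'kubeflow').
    if namespace is not None:
        kwargs['namespace'] = namespace
    return kwargs


# Intended usage, assuming kfp is installed:
#   from kfp.containers import ContainerBuilder
#   builder = ContainerBuilder(**builder_kwargs('kubeflow-new', namespace='my-user-ns'))
```

Keeping the mapping in one place makes the version boundaries easy to audit against the installation docs.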
22 changes: 16 additions & 6 deletions sdk/python/kfp/containers/_container_builder.py
@@ -12,6 +12,10 @@
# See the License for the specific language governing permissions and
# limitations under the License.

__all__ = [
'ContainerBuilder',
]

import logging
import tarfile
import tempfile
@@ -22,7 +26,6 @@
GCS_STAGING_BLOB_DEFAULT_PREFIX = 'kfp_container_build_staging'
GCR_DEFAULT_IMAGE_SUFFIX = 'kfp_container'


def _get_project_id():
import requests
URL = "http://metadata.google.internal/computeMetadata/v1/project/project-id"
@@ -51,20 +54,27 @@ class ContainerBuilder(object):
"""
ContainerBuilder helps build a container image
"""
def __init__(self, gcs_staging=None, default_image_name=None, namespace=None):
def __init__(self, gcs_staging=None, default_image_name=None, namespace=None,
service_account='kubeflow-pipelines-container-builder'):
"""
Args:
gcs_staging (str): GCS bucket/blob that can store temporary build files,
default is gs://PROJECT_ID/kfp_container_build_staging.
default is gs://PROJECT_ID/kfp_container_build_staging. You have to
specify this when it doesn't run in cluster.
default_image_name (str): Target container image name that will be used by the build method if the target_image argument is not specified.
namespace (str): kubernetes namespace where the pod is launched,
namespace (str): Kubernetes namespace where the container builder pod is launched,
default is the same namespace as the notebook service account in cluster
or 'kubeflow' if not in cluster
or 'kubeflow' if not in cluster. If using the full Kubeflow
deployment and not in cluster, you should specify your own user namespace.
service_account (str): Kubernetes service account the pod uses for container building.
The default value is "kubeflow-pipelines-container-builder". It works with Kubeflow Pipelines clusters installed using Google Cloud Marketplace or Standalone with version > 0.4.0.
The service account should have permission to read from and write to the staging GCS path and to upload built images to gcr.io.
"""
self._gcs_staging = gcs_staging
self._gcs_staging_checked = False
self._default_image_name = default_image_name
self._namespace = namespace
self._service_account = service_account

def _get_namespace(self):
if self._namespace is None:
@@ -134,7 +144,7 @@ def _generate_kaniko_spec(self, context, docker_filename, target_image):
],
'image': 'gcr.io/kaniko-project/executor@sha256:78d44ec4e9cb5545d7f85c1924695c89503ded86a59f92c7ae658afa3cff5400',
}],
'serviceAccountName': 'default'}
'serviceAccountName': self._service_account}
}
return content
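
To make the effect of this change concrete, here is a minimal standalone sketch of how the service account ends up in the generated pod spec. The function `kaniko_pod_spec` is hypothetical and is not the SDK's `_generate_kaniko_spec`; its fields mirror the `kaniko.kubeflow.yaml` test fixture in this commit, and the real method may differ in detail.

```python
def kaniko_pod_spec(context, docker_filename, target_image, namespace,
                    service_account='kubeflow-pipelines-container-builder'):
    """Build a kaniko pod spec as a plain dict (illustrative sketch only)."""
    return {
        'apiVersion': 'v1',
        'kind': 'Pod',
        'metadata': {
            'generateName': 'kaniko-',
            'namespace': namespace,
            'annotations': {'sidecar.istio.io/inject': 'false'},
        },
        'spec': {
            'restartPolicy': 'Never',
            # Previously hard-coded to 'default'; now taken from the caller.
            'serviceAccountName': service_account,
            'containers': [{
                'name': 'kaniko',
                'image': 'gcr.io/kaniko-project/executor@sha256:78d44ec4e9cb5545d7f85c1924695c89503ded86a59f92c7ae658afa3cff5400',
                'args': [
                    '--cache=true',
                    '--dockerfile=' + docker_filename,
                    '--context=' + context,
                    '--destination=' + target_image,
                    '--digest-file=/dev/termination-log',
                ],
            }],
        },
    }


spec = kaniko_pod_spec(
    context='gs://mlpipeline/kaniko_build.tar.gz',
    docker_filename='dockerfile',
    target_image='gcr.io/mlpipeline/kaniko_image:latest',
    namespace='user',
    service_account='default-editor',
)
print(spec['spec']['serviceAccountName'])
```

Passing the account as a parameter with a default keeps existing callers working while letting full Kubeflow deployments substitute `default-editor`.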

28 changes: 25 additions & 3 deletions sdk/python/tests/compiler/container_builder_test.py
@@ -58,10 +58,32 @@ def test_generate_kaniko_yaml(self, mock_gcshelper):
test_data_dir = os.path.join(os.path.dirname(__file__), 'testdata')

# check
builder = ContainerBuilder(gcs_staging=GCS_BASE, default_image_name=DEFAULT_IMAGE_NAME, namespace='default')
builder = ContainerBuilder(gcs_staging=GCS_BASE,
default_image_name=DEFAULT_IMAGE_NAME,
namespace='default')
generated_yaml = builder._generate_kaniko_spec(docker_filename='dockerfile',
context='gs://mlpipeline/kaniko_build.tar.gz', target_image='gcr.io/mlpipeline/kaniko_image:latest')
context='gs://mlpipeline/kaniko_build.tar.gz',
target_image='gcr.io/mlpipeline/kaniko_image:latest')
with open(os.path.join(test_data_dir, 'kaniko.basic.yaml'), 'r') as f:
golden = yaml.safe_load(f)

self.assertEqual(golden, generated_yaml)
self.assertEqual(golden, generated_yaml)

def test_generate_kaniko_yaml_kubeflow(self, mock_gcshelper):
""" Test generating the kaniko job yaml for Kubeflow deployment """

# prepare
test_data_dir = os.path.join(os.path.dirname(__file__), 'testdata')

# check
builder = ContainerBuilder(gcs_staging=GCS_BASE,
default_image_name=DEFAULT_IMAGE_NAME,
namespace='user',
service_account='default-editor',)
generated_yaml = builder._generate_kaniko_spec(docker_filename='dockerfile',
context='gs://mlpipeline/kaniko_build.tar.gz',
target_image='gcr.io/mlpipeline/kaniko_image:latest',)
with open(os.path.join(test_data_dir, 'kaniko.kubeflow.yaml'), 'r') as f:
golden = yaml.safe_load(f)

self.assertEqual(golden, generated_yaml)
2 changes: 1 addition & 1 deletion sdk/python/tests/compiler/testdata/kaniko.basic.yaml
@@ -22,7 +22,7 @@ metadata:
sidecar.istio.io/inject: 'false'
spec:
restartPolicy: Never
serviceAccountName: default
serviceAccountName: kubeflow-pipelines-container-builder
containers:
- name: kaniko
image: gcr.io/kaniko-project/executor@sha256:78d44ec4e9cb5545d7f85c1924695c89503ded86a59f92c7ae658afa3cff5400
34 changes: 34 additions & 0 deletions sdk/python/tests/compiler/testdata/kaniko.kubeflow.yaml
@@ -0,0 +1,34 @@
# Copyright 2018 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.


apiVersion: v1
kind: Pod
metadata:
generateName: kaniko-
namespace: user
annotations:
sidecar.istio.io/inject: 'false'
spec:
restartPolicy: Never
serviceAccountName: default-editor
containers:
- name: kaniko
image: gcr.io/kaniko-project/executor@sha256:78d44ec4e9cb5545d7f85c1924695c89503ded86a59f92c7ae658afa3cff5400
args: ["--cache=true",
"--dockerfile=dockerfile",
"--context=gs://mlpipeline/kaniko_build.tar.gz",
"--destination=gcr.io/mlpipeline/kaniko_image:latest",
"--digest-file=/dev/termination-log",
]
Empty file modified test/presubmit-tests-mkp.sh
100644 → 100755
Empty file.
