Skip to content
This repository has been archived by the owner on Jan 31, 2022. It is now read-only.

speedup development by using skaffold file sync #78

Open
jlewi opened this issue Dec 25, 2019 · 6 comments
Open

speedup development by using skaffold file sync #78

jlewi opened this issue Dec 25, 2019 · 6 comments

Comments

@jlewi
Copy link
Contributor

jlewi commented Dec 25, 2019

PR #77 contains skaffold.yaml files to continuously build and deploy the microservices.

Skaffold is building the containers on the cluster using Kaniko. The builds look like they are taking O(minutes) to rebuild.

We should be able to speed that up by taking advantage of skaffold's file sync feature to directly synchronize files to the containers and skipping the container rebuilds.

@issue-label-bot
Copy link

Issue-Label Bot is automatically applying the label kind/feature to this issue, with a confidence of 0.81. Please mark this comment with 👍 or 👎 to give our bot feedback!

Links: app homepage, dashboard and code for this bot.

@jlewi
Copy link
Contributor Author

jlewi commented Dec 26, 2019

I gave this a quick try and skaffold gives me the warning

WARN[0772] error adding dirty artifact to changeset: inferring syncmap for image gcr.io/issue-label-bot-dev/bot-worker: SyncMap is not supported by this builder 

Opend GoogleContainerTools/skaffold#3448

@jlewi
Copy link
Contributor Author

jlewi commented Dec 26, 2019

Using manual mode for skaffold file sync seems to work; Here's my sample stanza

    sync:
        manual:
        - src: 'py/code_intelligence/*.py'
          dest: '/'
        - src: 'py/label_microservice/*.py'
          dest: '/'        

I was relying on flask's reloader to reload things when the code changed. If instead of using flask we just have a binary running pubsub; we might need to run it in a subprocess and restart it ourselves.

Then again maybe we should continue to use flask to do things like a health check.

@jlewi
Copy link
Contributor Author

jlewi commented Dec 26, 2019

Also hit GoogleContainerTools/skaffold#3454

@jlewi
Copy link
Contributor Author

jlewi commented Dec 27, 2019

I created a simple file to run a program in a subprocess and auto restart it whenever some files change.
https://github.com/kubeflow/code-intelligence/blob/06125b0cd2e25e01b1ec67cf444987747f071600/py/code_intelligence/run_with_auto_restart.py

So far this appears to be working as expected with skaffold's file sync.

jlewi pushed a commit to jlewi/code-intelligence that referenced this issue Dec 27, 2019
…er models.

* This PR includes a number of changes that make it easier to spin up
  the issue embedding service as a microservice intended as an internal
  backend for other models/service.

* issue_embbedding service should not be exposed externally or do auth
  based on a token

  * Fix kubeflow#76

  * Doing auth just complicates matters and is unnecessary; if we want to expose
    it externally and secure it (e.g. with IAP) we can do so by using Kubeflow
    and putting it behind the Kubeflow ingress

  * Doing auth makes it harder to use the embedding service as backend because
    the client and server need to aggree on the API key used for authorization.

  * As a result delete the K8s resources related to the ingress and certificate

* Create a kustomize package for deploying it.

  * Using kustomize will make it easier to define different versions of the
    service corresponding to dev, staging, and prod environments

* Create a skaffold configuration for the issue embedding service

  * Skaffold makes it easy to continually rebuild and deploy the service;
    this is very useful for development.

  * Related to: kubeflow#78

* Cleanup the Dockerfile for the embedding service
  * We don't install requirements.txt
  * Need to install flask session.

* Simplify the naming scheme for the issue embedding service.

* The embedding service flask app can't run with debug mode turned on
  so raise an exception if debug mode is on.

* Fixes to Embedding service Dockerfile

  * Don't cache the model inside the dockerfile
  * Make the docker context the root of the repo as opposed to Issue_Embeddings
    This makes it possible to include files in /py in the docker container

  * It doesn't look like we need to pip install requirements.txt because the
    Dockerfile is explicitly installing the dependencies we need.

* Use port 80 for the kubernetes service; its more convenient to expose it on
  the common port rather than a unique service specific port.

Related to: kubeflow#70 combine universal label model
  and repo specific model.
jlewi pushed a commit to jlewi/code-intelligence that referenced this issue Dec 27, 2019
* Related to kubeflow#78 - Speedup development using
  skaffold file sync

* Skaffold can auto sync python files to a deployed container skipping the
  expensive image rebuilding and deployment steps.

* However, for that to work we need to restart our program when the files
  change.

* We can't rely on Flask's autoreloader because some of our models interact
  poorly with Flask's debug mode.

* This is a simple python program that launches another program in a subprocess.
  The program will also watch directories an automatically restart the
  subprocess if any file changes are detected.
k8s-ci-robot pushed a commit that referenced this issue Dec 31, 2019
* Related to #78 - Speedup development using
  skaffold file sync

* Skaffold can auto sync python files to a deployed container skipping the
  expensive image rebuilding and deployment steps.

* However, for that to work we need to restart our program when the files
  change.

* We can't rely on Flask's autoreloader because some of our models interact
  poorly with Flask's debug mode.

* This is a simple python program that launches another program in a subprocess.
  The program will also watch directories an automatically restart the
  subprocess if any file changes are detected.
k8s-ci-robot pushed a commit that referenced this issue Dec 31, 2019
…er models. (#80)

* This PR includes a number of changes that make it easier to spin up
  the issue embedding service as a microservice intended as an internal
  backend for other models/service.

* issue_embbedding service should not be exposed externally or do auth
  based on a token

  * Fix #76

  * Doing auth just complicates matters and is unnecessary; if we want to expose
    it externally and secure it (e.g. with IAP) we can do so by using Kubeflow
    and putting it behind the Kubeflow ingress

  * Doing auth makes it harder to use the embedding service as backend because
    the client and server need to aggree on the API key used for authorization.

  * As a result delete the K8s resources related to the ingress and certificate

* Create a kustomize package for deploying it.

  * Using kustomize will make it easier to define different versions of the
    service corresponding to dev, staging, and prod environments

* Create a skaffold configuration for the issue embedding service

  * Skaffold makes it easy to continually rebuild and deploy the service;
    this is very useful for development.

  * Related to: #78

* Cleanup the Dockerfile for the embedding service
  * We don't install requirements.txt
  * Need to install flask session.

* Simplify the naming scheme for the issue embedding service.

* The embedding service flask app can't run with debug mode turned on
  so raise an exception if debug mode is on.

* Fixes to Embedding service Dockerfile

  * Don't cache the model inside the dockerfile
  * Make the docker context the root of the repo as opposed to Issue_Embeddings
    This makes it possible to include files in /py in the docker container

  * It doesn't look like we need to pip install requirements.txt because the
    Dockerfile is explicitly installing the dependencies we need.

* Use port 80 for the kubernetes service; its more convenient to expose it on
  the common port rather than a unique service specific port.

Related to: #70 combine universal label model
  and repo specific model.
@psykidellic
Copy link

I hit the same issue today using the custom builder. If i set infer, it gives me the synmacp error. But doing manual worked. This is a simple Flask based app. Let me know if you need more info to work on this or some other issue, i can respond to.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

2 participants