Let pipelines use model from PVC #87

Open

piotrpdev opened this issue Sep 21, 2023 · 4 comments
Assignees

Labels

  • feature/pipelines: Support for MLOps pipelines that package and deliver models to the Edge
  • good first issue: Good for newcomers
  • kind/documentation: Improvements or additions to documentation
  • kind/enhancement: New feature or request
  • priority/high: Important issue that needs to be resolved asap

Comments

@piotrpdev
Member

piotrpdev commented Sep 21, 2023

Description

You could provide a local model back in #18 (piotrpdev/azureml-model-to-edge@626d5a5), but that was replaced by the S3 method (and soon Git as well, in #57). Ideally a single task should handle fetching the model and take an input that selects the source, e.g. Git, S3, Azure, PVC (local), etc. Modifying our existing kserve-download-model Tekton Task is probably the best way to do this.
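For illustration, a single fetch task with a source-selector param might look roughly like this (`fetch-model` and `model-source` are placeholder names, not existing resources in the repo):

```yaml
apiVersion: tekton.dev/v1beta1
kind: Task
metadata:
  name: fetch-model            # placeholder name
spec:
  params:
    - name: model-source
      type: string
      description: "Where to fetch the model from: s3, git, azure, or pvc"
      default: s3
  workspaces:
    - name: model-dir          # where the fetched model ends up
  steps:
    - name: fetch
      image: registry.access.redhat.com/ubi9/ubi-minimal
      script: |
        #!/bin/sh
        # Each source would get its own branch here (S3 download, git clone,
        # cp from a mounted PVC, ...); this only shows the dispatch point.
        echo "Fetching model via $(params.model-source) into $(workspaces.model-dir.path)"
```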

Since no downloading actually needs to be done in the PVC (local) scenario, you could maybe skip the kserve-download-model Task altogether using a Tekton when guard. However, I think modifying the Task to move the model from the provided PVC to the buildah-cache workspace is better, especially since we're probably going to make each PipelineRun use its own PVC soon (using volumeClaimTemplate).
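If we did go the when-guard route, the Pipeline-level sketch would be something like this (the `model-source` param name is a placeholder; kserve-download-model is the existing Task):

```yaml
# Sketch only: skip the download task when the model already lives in a PVC.
tasks:
  - name: kserve-download-model
    taskRef:
      name: kserve-download-model
    when:
      - input: "$(params.model-source)"
        operator: notin
        values: ["pvc"]
```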


Isn't downloading from Git enough?

The PoC phase two requirements specify using a model stored in a PVC as a 'P0: must have' [1]. When RHODS integration and the new Model Registry [2] come into play, this will probably be an important feature.

A/C

  • Provide a template PVC file that follows best practices (see the sketch after this list)
  • An easy and documented way of uploading a local model to the PVC, e.g. using a ubi9 pod and oc cp
  • A step/task that can copy the model from one PV (e.g. one being used by a Jupyter Notebook) to the PV/workspace being used by the pipelines.
  • azureml-container-pipeline still works when using:
    • PVC (local model)
    • S3
    • Git
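As a starting point for the first two items, something like the following could work (model-pvc, model-upload-pod, and the storage size are placeholders, not agreed-on values):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: model-pvc              # placeholder name
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi             # size to the model being stored
---
# Throwaway ubi9 pod mounting the PVC so a local model can be copied in with:
#   oc cp ./model model-upload-pod:/mnt/models
apiVersion: v1
kind: Pod
metadata:
  name: model-upload-pod       # placeholder name
spec:
  containers:
    - name: shell
      image: registry.access.redhat.com/ubi9/ubi-minimal
      command: ["sleep", "infinity"]
      volumeMounts:
        - name: model
          mountPath: /mnt/models
  volumes:
    - name: model
      persistentVolumeClaim:
        claimName: model-pvc
```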
@piotrpdev piotrpdev added the good first issue, kind/documentation, kind/enhancement, and priority/high labels Sep 21, 2023
@piotrpdev piotrpdev moved this from Backlog to Todo in AI Edge Tracking Sep 21, 2023
@jackdelahunt jackdelahunt self-assigned this Sep 25, 2023
@adelton
Contributor

adelton commented Sep 26, 2023

You could provide a local model back in #18 (piotrpdev/azureml-model-to-edge@626d5a5), but that was replaced by the S3 method (and soon Git as well, in #57).

I wonder: if this is a PoC, shouldn't it showcase all the methods? Have three (or however many are needed) models and applications: one with the model in Git, one with the model in S3 pulled in at build time, and one with the model in S3 pulled in at run time? Unless we have a strong reason to avoid any of those approaches, we shouldn't be replacing them; we should be adding the new options.

And the documentation should explain how they differ.

Ideally a single task should handle fetching the model and take an input that selects the source, e.g. Git, S3, Azure, PVC (local), etc.

Overall though, this is not Edge-specific, is it? You might want something like that in the general Open Data Hub case as well.

@piotrpdev
Member Author

piotrpdev commented Sep 26, 2023

we shouldn't be replacing them, we should be adding the new options.

I agree, I think I removed the local model method by mistake or assumed we weren't going to use it at the time.

You might want something like that in the general Open Data Hub case as well.

Can you elaborate on this please?

@adelton
Contributor

adelton commented Sep 26, 2023

The various approaches to connecting models to runtimes might go into whatever demos / guides / learning materials we have about using Open Data Hub and building pipelines there. If we have a place to carry that information outside of this ai-edge repo, we wouldn't need to cover the wealth of options here.

@piotrpdev piotrpdev mentioned this issue Oct 3, 2023
@piotrpdev
Member Author

piotrpdev commented Oct 3, 2023

I'm sorry if I didn't make it clear enough, but for the issue to be resolved the pipelines need to be able to use a model from any PVC. Ideally this means we:

  1. Provide a PVC template along with a way to upload a local model to it (like Local model upload to PVC #115)
  2. Have a step/task that can copy the model from one PV (e.g. one being used by a Jupyter Notebook, or the one from step 1) to the PV/workspace being used by the pipelines (a rough sketch is below).

(I updated the issue A/C to reflect this)
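For point 2, a minimal copy Task could look something like this (all names here are placeholders, not existing resources in the repo):

```yaml
apiVersion: tekton.dev/v1beta1
kind: Task
metadata:
  name: copy-model-from-pvc    # placeholder name
spec:
  workspaces:
    - name: source             # bound to the PVC that already holds the model
    - name: destination        # the workspace the rest of the pipeline uses
  params:
    - name: model-path
      type: string
      description: Path of the model inside the source workspace
      default: model
  steps:
    - name: copy
      image: registry.access.redhat.com/ubi9/ubi-minimal
      script: |
        #!/bin/sh
        set -e
        cp -r "$(workspaces.source.path)/$(params.model-path)" "$(workspaces.destination.path)/"
```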

Projects
Status: In Progress
Development

No branches or pull requests

4 participants