Future of metadata development and deployment - standalone? Only as part of KFP #225
Quick question: here, does "metadata deployment" mean google/ml-metadata or KF metadata? If google/ml-metadata, then it's already in KFP (google/ml-metadata provides a gRPC server). I think we don't have a plan/resources to continue the KF metadata, right? I may lack knowledge/info on the context. |
I think #217 (comment) is the only planned future feature. Other than that, keeping the current status and asking for community help is the best I can foresee. |
If we don't plan to extend it, and it becomes redundant with google/ml-metadata, and that's where KFP is focused, it's best to bring it up in the community meeting to decide on the future.
If #217 is the only planned feature, then what does this mean for creating a generic metadata story? When people deploy Kubeflow Pipelines, do they get:
/cc @paveldournov |
Given the current status, both 1. and 2. mentioned above are true. If we had the bandwidth, it would be better to keep maintaining the metadata UI separately, but I don't think that's the case now. |
I believe when you install the full KFP there are two different URLs for the UI for the metadata store.
Are these both pointing at the same UI service, or are they two different servers? I suspect they are two different servers, but I could be wrong. |
They are two different servers. Some code is reused in kubeflow/frontend, but the codebases are built and distributed completely separately. |
Yes, these are two different code bases (kubeflow/metadata and kubeflow/pipelines), both of which import MLMD Lineage from kubeflow/frontend.
Ideally pipelines would not have shown the link to the artifact store when running in iframe mode (within central-dashboard). But currently both UIs display the same information.
|
Here's my understanding. To summarize, I think there are several components:
|
I think a major feature, lineage tracking, was introduced with 1.0.
|
Some clarifications on the current status:
The Google ml-metadata repo also provides a gRPC server for interacting with metadata: https://github.com/kubeflow/manifests/blob/master/metadata/base/metadata-deployment.yaml#L73. This repository built a REST server on top of it (I'm not sure about the technical details; it could be a wrapper around the metadata client or the gRPC server). The KFP UI and this repository already reuse shared components in kubeflow/frontend for the lineage view, but not yet for lists, and both repos agree on the same schema. |
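To make the "REST server on top of the gRPC service" pattern mentioned above concrete, here is a minimal, hypothetical sketch of how such a wrapper might map REST paths onto MLMD RPC calls. The route paths and the `translate` helper are illustrative assumptions, not the actual kubeflow/metadata implementation; the RPC names follow MLMD's `MetadataStoreService`.

```python
# Hypothetical sketch: how a REST shim might route requests to the MLMD
# gRPC service. Paths and helper names are illustrative assumptions.
from urllib.parse import urlparse

# Maps a REST path to the MLMD gRPC method a wrapper might forward to.
ROUTES = {
    "/api/v1alpha1/artifacts": "GetArtifacts",
    "/api/v1alpha1/artifact_types": "GetArtifactTypes",
    "/api/v1alpha1/executions": "GetExecutions",
}

def translate(path: str) -> str:
    """Return the gRPC method name a REST GET on `path` would forward to."""
    rpc = ROUTES.get(urlparse(path).path)
    if rpc is None:
        raise KeyError(f"no route for {path}")
    return rpc
```

A real wrapper would then call the corresponding stub method on a gRPC channel to the metadata service and serialize the response as JSON; the sketch only shows the routing step.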
We do use metadata for some metrics tracking in non-KFP projects. The reason this looks like a KFP-only project is that we don't have an experiment concept for other workloads. For example, a user has to use the SDK manually in their distributed training operator or notebook to log params or metrics, and visualization is limited as well. Even though adoption of this project is not high at the moment, I hope to keep it separate and well designed; it will become more important once we have generic experiment concepts across Kubeflow projects. |
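For context on the "use the SDK manually to log params or metrics" point above, here is a purely hypothetical illustration of what such logging from a training job could look like. The `log_metric` helper and the JSON-lines format are assumptions for the sketch, not the kubeflow-metadata SDK API.

```python
# Hypothetical sketch: logging metrics from a training loop to an
# append-only JSON-lines file. Not the actual kubeflow-metadata SDK.
import json
import time

def log_metric(name, value, step, path="metrics.jsonl"):
    """Append one metric record; a real SDK would send this to the
    metadata service instead of a local file."""
    record = {"name": name, "value": value, "step": step, "ts": time.time()}
    with open(path, "a") as f:
        f.write(json.dumps(record) + "\n")
```

In a real workload the equivalent call would sit inside the training loop of an operator job or a notebook cell, which is exactly the manual step the comment is describing.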
Current KFP is using the google/ml-metadata repo for MLMD. It's deployed as a gRPC server that connects to the DB; pipeline tasks/steps call the gRPC server to access data. |
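For readers unfamiliar with the manifest linked earlier, the gRPC deployment described above looks roughly like the following. This is an abbreviated sketch of the kind of Kubernetes Deployment the linked manifest defines; the image tag, port, and labels are illustrative assumptions, not copied verbatim from kubeflow/manifests.

```yaml
# Abbreviated sketch of an MLMD gRPC server Deployment (illustrative values).
apiVersion: apps/v1
kind: Deployment
metadata:
  name: metadata-grpc-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      component: grpc-server
  template:
    metadata:
      labels:
        component: grpc-server
    spec:
      containers:
      - name: container
        image: gcr.io/tfx-oss-public/ml_metadata_store_server:latest
        ports:
        - name: grpc-api
          containerPort: 8080
```

A matching Service would expose the `grpc-api` port so that pipeline steps (and the REST wrapper) can reach the store.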
/cc @zhitaoli |
/cc @aronchick |
If I recall correctly, this repository might have originally been providing the following functionality on top of TFX metadata:
Some of this functionality might no longer be needed; I think tfx-metadata might support gRPC now. Regarding the UI, I believe, as @avdaredevil mentioned above, some of the frontend code has been refactored into reusable libraries in kubeflow/frontend. I'm unclear to what extent the KFP UI and metadata UIs have been updated to use those shared libraries. /cc @zhenghuiwang |
I think we don't have a replacement for these items:
The following might not be required any more:
Current status: kubeflow/pipelines uses those shared libraries entirely; I'm not sure about kubeflow/metadata. |
@neuromage @Bobgy I thought KFP was defining some higher level schemas? |
I filed #250 to get rid of the standalone metadata UI. It's lagging behind the KFP metadata UI and no one seems to be maintaining it. Regarding the SDK: I stumbled upon |
/kind feature
What is the future of metadata deployment?
There are currently at least two variants of metadata
I think the differences might pertain mostly to the UI. I think KFP ships a UI for metadata integrated into the KFP UI, but the backend might be the same.
I think the net effect is that a lot of development is happening in the KFP UI while the generic metadata UI is lagging behind; e.g. #217 is tracking upstreaming changes for lineage that are in the KFP UI but not the metadata UI.
I think metadata is largely based on MLMD, which is developed in google/ml-metadata.
What's the path forward for providing a metadata story?
/cc @neuromage @Bobgy @rmgogogo @avdaredevil @zhenghuiwang