
Vertex AI Pipeline - Container OP set_cpu_limit does not work with parameter_values nor at runtime #6681

Closed
SaschaHeyer opened this issue Oct 5, 2021 · 19 comments
Labels: area/sdk, kind/bug, lifecycle/stale

Comments

@SaschaHeyer

SaschaHeyer commented Oct 5, 2021

Hello Kubeflow Team,
Hello Google Team,

The container OP's .set_cpu_limit only works when the value is set explicitly, not via parameter_values or at runtime.

Reproduce

  1. parameter_values: see steps to reproduce
  2. runtime: see https://github.com/kubeflow/pipelines/blob/master/samples/core/resource_spec/runtime_resource_request.py

Environment

  • How did you deploy Kubeflow Pipelines (KFP)? Vertex AI Pipelines
  • kfp 1.8.4
  • kfp-pipeline-spec 0.1.11
  • kfp-server-api 1.7.0

Steps to reproduce

Not working

from kfp.v2 import compiler
from kfp.v2.dsl import pipeline
from kfp.v2.google.client import AIPlatformClient

# `train` is a previously defined component (omitted here).
@pipeline(name="reproduction",
          pipeline_root="ADD PIPELINE ROOT")
def pipeline(cpu_limit: str):
    train_op = train().set_cpu_limit(cpu_limit)

compiler.Compiler().compile(pipeline_func=pipeline,
                            package_path='pipeline.json')

api_client = AIPlatformClient(project_id="ADD PROJECT",
                              region="us-central1")

response = api_client.create_run_from_job_spec(
    'pipeline.json',
    parameter_values={'cpu_limit': "16"}
)

Working

from kfp.v2 import compiler
from kfp.v2.dsl import pipeline
from kfp.v2.google.client import AIPlatformClient

# `train` is a previously defined component (omitted here).
@pipeline(name="reproduction",
          pipeline_root="ADD PIPELINE ROOT")
def pipeline():
    train_op = train().set_cpu_limit("16")

compiler.Compiler().compile(pipeline_func=pipeline,
                            package_path='pipeline.json')

api_client = AIPlatformClient(project_id="ADD PROJECT",
                              region="us-central1")

response = api_client.create_run_from_job_spec('pipeline.json')

Expected result

The CPU limit can be set via parameter_values.

Looking forward to your feedback

@zijianjoy
Collaborator

cc @chensun

@SaschaHeyer
Author

Morning, any updates?

@chensun chensun self-assigned this Nov 11, 2021
@chensun
Member

chensun commented Nov 11, 2021

Hi @SaschaHeyer , this is indeed a known limitation and we plan to discuss the best solution for this in Q1/Q2 2022.

Can you help us understand what your use case is for setting a dynamic CPU limit, and how critical this feature is to you? Thanks!

@SaschaHeyer
Author

SaschaHeyer commented Nov 23, 2021

Hi @chensun
Thanks a lot for your feedback.

I work for one of the biggest Google Cloud partners, and we get this request regularly from our customers, at least once every two weeks.
Parameterizing the machine type (CPU and memory) can be really useful when you use the same pipeline with different datasets and/or hyperparameters (this way there is no need to re-compile).

Changing those hyperparameters might also require bigger machines, for example when you increase the batch size.

Currently, a re-compile of the pipeline is required. It would be useful if we could set this via a parameter as well.
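To illustrate the compile-time workaround described above (a sketch only — `build_task_spec` and `limits_for_batch_size` are hypothetical helpers, not KFP APIs), the resource limits can be derived from the hyperparameters and baked into the spec before each compile:

```python
# Sketch of the compile-time workaround: bake the CPU/memory limits into
# the pipeline at build time instead of passing them as runtime parameters.
# `build_task_spec` is a hypothetical stand-in for a real KFP component;
# in practice you would call compiler.Compiler().compile() per run.

def limits_for_batch_size(batch_size: int) -> tuple:
    """Pick machine resources from a hyperparameter, per the use case above."""
    if batch_size <= 64:
        return ("4", "16G")
    if batch_size <= 512:
        return ("16", "64G")
    return ("32", "128G")

def build_task_spec(cpu_limit: str, memory_limit: str) -> dict:
    """Return a per-run task spec with the limits hard-coded."""
    return {
        "container": {"image": "gcr.io/my-project/train:latest"},
        "resources": {"cpu_limit": cpu_limit, "memory_limit": memory_limit},
    }

cpu, mem = limits_for_batch_size(1024)
spec = build_task_spec(cpu, mem)
```

In real KFP code the factory would return a freshly decorated pipeline function, and the compiler would be invoked once per resource configuration, which is exactly the re-compile step this issue asks to avoid.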

@iuiu34

iuiu34 commented Nov 30, 2021

Along these lines, it would also be nice that when a task fails with an out-of-memory error:
a) you can adjust the memory limit as a parameter, as SaschaHeyer requested, re-running just the failed task rather than the whole pipeline (though if caching is enabled, this may already be solved)
b) the pipeline upscales automatically and re-runs the task
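Suggestion (b) could be sketched as a simple retry loop with escalating memory tiers (a sketch under stated assumptions — `run_task` and the `OutOfMemory` exception are hypothetical stand-ins for the pipeline backend's task runner and its OOM signal):

```python
# Sketch of auto-upscaling on OOM: re-run the task with the next larger
# memory limit until it succeeds or the largest tier is exhausted.

MEMORY_TIERS = ["16G", "32G", "64G"]

class OutOfMemory(Exception):
    """Hypothetical signal that a task was OOM-killed."""

def run_with_upscale(run_task, tiers=MEMORY_TIERS):
    """Try each memory tier in order, re-running the task on OOM."""
    for limit in tiers:
        try:
            return run_task(memory_limit=limit)
        except OutOfMemory:
            continue  # escalate to the next tier
    raise OutOfMemory("task failed even at " + tiers[-1])
```

As chensun notes below in the thread, any such auto-upscaling would need an explicit opt-in, since it can silently increase billing.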

@chensun
Member

chensun commented Dec 16, 2021

@SaschaHeyer Thanks for the context!

@chensun
Member

chensun commented Dec 16, 2021

> in this line, would be nice also if when a task throws an kfp error for being out of memory, that
> a) you can play with the memory-limit as a parameter as SaschaHeyer request, just re-runing the task, not the whole pipeline (though if cache is enabled this maybe is already solved)

Yes, caching would help here if the upstream doesn't have any changes on their inputs.

> b) does the upscale automatically and re-runs the task again

This might create some surprise billing issue :)

@iuiu34

iuiu34 commented Dec 16, 2021

> This might create some surprise billing issue :)

Yep. If implemented, there should be an autoscale: bool = False argument on kfp.v2.compiler.Compiler().compile.
But I agree that option b), auto-scaling, could cause dramatic cost problems for the user that option a) doesn't have.

@ashrafgt

Huge +1 on this!

@stale

stale bot commented Apr 17, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the lifecycle/stale label Apr 17, 2022
@iuiu34

iuiu34 commented Apr 17, 2022

Are there plans to support this? Or is it being explored in another ticket?

@stale stale bot removed the lifecycle/stale label Apr 17, 2022
@SaschaHeyer
Author

Hi, are there any updates? This would be a huge benefit for re-using pipelines without the need to re-compile them.

@saigirishgilly98

> Hi, are there any updates? This would be a huge benefit for re-using pipelines without the need to re-compile them.

+++

I agree with @SaschaHeyer. We are building reusable pipeline templates where only the data changes, and depending on the data size we want to be able to configure the CPU and memory for each component through pipeline params or any other way.

@acarvalho2-wiq

Hi guys, do we have any updates on this?
I am also looking for exactly the same dynamic parameterisation of my pipeline.


This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the lifecycle/stale label Jun 25, 2024

This issue has been automatically closed because it has not had recent activity. Please comment "/reopen" to reopen it.


@entsarangi: You can't reopen an issue/PR unless you authored it or you are a collaborator.

In response to this:

/reopen

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@entsarangi

It would be a useful feature to have cpu_limit available via pipeline params. Any update, or a workaround that doesn't involve hard-coded values?

@tymorton

tymorton commented Jan 7, 2025

This has been resolved.
#11097
