[BUG] support setting extended resources for array node map tasks #2592

Merged Jul 22, 2024 (2 commits)

pvditt (Contributor) commented Jul 20, 2024

Tracking issue

fixes: https://linear.app/unionai/issue/COR-1565/problem-with-array-node-map-tasks-and-gpu-accelerators-pod-toleration

Why are the changes needed?

gpu_accelerator is not set in the extended resources of tasks that are subtasks of ArrayNodes.

(This also affects legacy map tasks, in case we want to fix those too.)

What changes were proposed in this pull request?

Return the underlying Python function task's extended resources so that they are set when an array node map task is serialized.
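The delegation described above can be sketched in a few lines. This is a simplified, self-contained illustration, not flytekit's code: the classes `ExtendedResources`, `PythonFunctionTask`, and `ArrayNodeMapTask` below are hypothetical stand-ins, and the `serialize` helper and all method signatures are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ExtendedResources:
    """Hypothetical stand-in for the extended-resources message."""

    gpu_accelerator_device: Optional[str] = None


class PythonFunctionTask:
    """Hypothetical stand-in for the task wrapped by a map task."""

    def __init__(self, name: str, accelerator_device: Optional[str] = None):
        self.name = name
        self._accelerator_device = accelerator_device

    def get_extended_resources(self) -> Optional[ExtendedResources]:
        # No accelerator configured means no extended resources.
        if self._accelerator_device is None:
            return None
        return ExtendedResources(gpu_accelerator_device=self._accelerator_device)


class ArrayNodeMapTask:
    """Hypothetical stand-in for the map task wrapper."""

    def __init__(self, python_function_task: PythonFunctionTask):
        self.python_function_task = python_function_task

    def get_extended_resources(self) -> Optional[ExtendedResources]:
        # The idea of the fix: delegate to the wrapped task instead of
        # returning nothing, so the accelerator survives serialization.
        return self.python_function_task.get_extended_resources()


def serialize(task) -> dict:
    """Toy serializer (assumption): copies extended resources into the spec."""
    extended = task.get_extended_resources()
    if extended is None:
        return {"extended_resources": None}
    return {
        "extended_resources": {
            "gpu_accelerator": {"device": extended.gpu_accelerator_device}
        }
    }


subtask = PythonFunctionTask(
    "with_defined_gpu_accelerator_l4", accelerator_device="nvidia-l4"
)
spec = serialize(ArrayNodeMapTask(subtask))
print(spec["extended_resources"])
```

Without the delegating method on the wrapper, the toy serializer would have nothing to copy, which mirrors the missing accelerator reported in the tracking issue.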

How was this patch tested?

  • Added a unit test

Setup process

Confirmed that the GPU Accelerator was set under the task's extended resources

from flytekit import Resources, task, workflow, map_task
from flytekit.extras.accelerators import GPUAccelerator


# Task that requests one GPU and a specific accelerator device.
@task(
    limits=Resources(gpu="1"),
    interruptible=False,
    accelerator=GPUAccelerator("nvidia-l4"),
)
def with_defined_gpu_accelerator_l4(input: str) -> str:
    return f"Hello {input}!"


# Plain task call: the accelerator was already set in this case.
@workflow
def wf() -> str:
    return with_defined_gpu_accelerator_l4(input="1")


# Map task call: the ArrayNode subtask should now carry the accelerator too.
@workflow
def wf1():
    map_task(with_defined_gpu_accelerator_l4)(input=["1", "2", "3"])

Generated flyteworkflow for the ArrayNode subtask

Tasks:
  resource_type:TASK  project:"flytesnacks"  domain:"development"  name:"tests.flytekit.integration.map_task_issue.map_with_defined_gpu_accelerator_l4_1_6b3bd0353da5de6e84d7982921ead2b3-arraynode"  version:"OLIadKy0uixtSn6C-3Y4FA":
    Container:
      Args:
        pyflyte-fast-execute
        --additional-distribution
        s3://my-s3-bucket/flytesnacks/development/TND7O55TRQU7DOG45ZVB2NN73I======/script_mode.tar.gz
        --dest-dir
        .
        --
        pyflyte-map-execute
        --inputs
        {{.input}}
        --output-prefix
        {{.outputPrefix}}
        --raw-output-data-prefix
        {{.rawOutputDataPrefix}}
        --checkpoint-path
        {{.checkpointOutputPrefix}}
        --prev-checkpoint
        {{.prevCheckpointPrefix}}
        --resolver
        flytekit.core.array_node_map_task.ArrayNodeMapTaskResolver
        --
        vars
        
        resolver
        flytekit.core.python_auto_container.default_task_resolver
        task-module
        tests.flytekit.integration.map_task_issue
        task-name
        with_defined_gpu_accelerator_l4_1
      Image:  cr.flyte.org/flyteorg/flytekit:py3.11-latest
      Resources:
        Limits:
          Name:   CPU
          Value:  500m
          Name:   MEMORY
          Value:  500Mi
          Name:   GPU
          Value:  1
        Requests:
          Name:   CPU
          Value:  500m
          Name:   MEMORY
          Value:  500Mi
          Name:   GPU
          Value:  1
    Custom:
      Min Success Ratio:  1
    Extended Resources:
      Gpu Accelerator:
        Device:  nvidia-l4
    Id:
      Domain:         development
      Name:           tests.flytekit.integration.map_task_issue.map_with_defined_gpu_accelerator_l4_1_6b3bd0353da5de6e84d7982921ead2b3-arraynode
      Project:        flytesnacks
      Resource Type:  TASK
      Version:        OLIadKy0uixtSn6C-3Y4FA
    Interface:
      Inputs:
        Variables:
          Input:
            Type:
              Collection Type:
                Simple:  STRING
      Outputs:
        Variables:
          o0:
            Type:
              Collection Type:
                Simple:  STRING
    Metadata:
      Interruptible:  false
      Retries:
      Runtime:
        Flavor:         python
        Type:           FLYTE_SDK
        Version:        1.12.0b7.dev4+g1c5e56358.d20240506
    Task Type Version:  1
    Type:               python-task

Check all the applicable boxes

  • I updated the documentation accordingly.
  • All new and existing tests passed.
  • All commits are signed-off.

pvditt (Contributor, Author) commented Jul 20, 2024

Do we also want to fix this for legacy map tasks? Really simple fix, was just unsure what the procedure is with fixing bugs in legacy entities

eapolinario (Collaborator) commented:

> Do we also want to fix this for legacy map tasks? Really simple fix, was just unsure what the procedure is with fixing bugs in legacy entities

No, let's leave legacy map tasks alone.

eapolinario merged commit 8ee5281 into master on Jul 22, 2024
46 of 47 checks passed
Mecoli1219 pushed a commit to Mecoli1219/flytekit that referenced this pull request Jul 27, 2024
mao3267 pushed a commit to mao3267/flytekit that referenced this pull request Jul 29, 2024