GPU Type #419
Codecov Report
Patch coverage has no change and project coverage change: +2.55%
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #419      +/-   ##
==========================================
+ Coverage   75.92%   78.48%   +2.55%
==========================================
  Files          18       18
  Lines        1458     1250     -208
==========================================
- Hits         1107      981     -126
+ Misses        294      212      -82
  Partials       57       57
// Additional metadata associated with resources to allocate to a task
message ResourceMetadata {
  oneof accelerator_value {
Why oneof? If it's not set, it'll just be nil. Do you want to allow nullifying a field? If so, maybe that should go into the override message?
The thought was to support other accelerator types, such as TPUs, in the future. One option would have been to add every possible accelerator type as a separate field, but that loses the guarantee that oneof provides: only one type is set at any given time.
I had not actually considered using the override to nullify an accelerator specified in the task metadata. That does sound nice. If we were to do that, would it be a bad idea to add an accelerator_not_set field to the oneof?
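For illustration, the shape being proposed here might look like the sketch below. Only `ResourceMetadata`, `accelerator_value`, and `accelerator_not_set` come from this thread; the package name, the `GPUAccelerator` and `TPUAccelerator` messages, their fields, and all field numbers are hypothetical and are not taken from the PR.

```protobuf
syntax = "proto3";

// Hypothetical package name, used only for this sketch.
package example.accelerators;

import "google/protobuf/empty.proto";

// Hypothetical accelerator messages; the fields are illustrative only.
message GPUAccelerator {
  string device = 1;
}

message TPUAccelerator {
  string topology = 1;
}

// Additional metadata associated with resources to allocate to a task
message ResourceMetadata {
  // At most one member of a oneof can be set at a time; setting one
  // clears whichever member was set before.
  oneof accelerator_value {
    GPUAccelerator gpu_accelerator = 1;
    TPUAccelerator tpu_accelerator = 2;
    // Sentinel member: an override could set this to explicitly clear an
    // accelerator that was specified in the task metadata.
    google.protobuf.Empty accelerator_not_set = 3;
  }
}
```

With a sentinel member like this, a consumer could distinguish "the override says nothing about accelerators" (no oneof member set) from "the override explicitly removes the accelerator" (`accelerator_not_set` is set).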
Since oneofs do not change the wire format, I would keep it out for now and decide when we actually add a different accelerator.
Unfortunately, there isn't a great way to nullify a field in protobuf...
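One reading of the wire-format point is that a plain field can be wrapped in a oneof later without breaking serialized data, as long as the field number and type stay the same. The sketch below shows that before-and-after shape under that assumption. The `Before` and `After` message names exist only so both versions fit in one file, and `GPUAccelerator`, `TPUAccelerator`, their fields, and the field numbers are hypothetical.

```protobuf
syntax = "proto3";

// Hypothetical package name, used only for this sketch.
package example.evolution;

// Hypothetical accelerator messages; the fields are illustrative only.
message GPUAccelerator {
  string device = 1;
}

message TPUAccelerator {
  string topology = 1;
}

// Version 1: a single plain field, no oneof.
message ResourceMetadataBefore {
  GPUAccelerator gpu_accelerator = 1;
}

// Version 2: the same field, same number and type, moved into a new oneof
// alongside a newly added accelerator type. Bytes written with version 1
// still parse, because oneof membership is not encoded on the wire.
message ResourceMetadataAfter {
  oneof accelerator_value {
    GPUAccelerator gpu_accelerator = 1;
    TPUAccelerator tpu_accelerator = 2;
  }
}
```

The trade-off is on the generated-code side: in some languages, notably Go, moving a field into a oneof changes the generated API, so callers would need updating even though the serialized form is compatible.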
This reverts commit ef37788. Signed-off-by: Jeev B <[email protected]>
* Revert "GPU Type (#419)"
  This reverts commit ef37788.
  Signed-off-by: Jeev B <[email protected]>
* Restore .readthedocs.yml
  Signed-off-by: Jeev B <[email protected]>
---------
Signed-off-by: Jeev B <[email protected]>
* add field
  Signed-off-by: Yee Hing Tong <[email protected]>
  Signed-off-by: Jeev B <[email protected]>
* Pass task execution metadata from agent (#422)
  * Pass task execution metadata from agent
    Signed-off-by: Hongxin Liang <[email protected]>
  * Add doc
    Signed-off-by: Hongxin Liang <[email protected]>
  * Update protos/flyteidl/admin/agent.proto
    Co-authored-by: Kevin Su <[email protected]>
    Signed-off-by: Honnix <[email protected]>
  * Regenerate
  ---------
  Signed-off-by: Hongxin Liang <[email protected]>
  Signed-off-by: Honnix <[email protected]>
  Co-authored-by: Kevin Su <[email protected]>
  Signed-off-by: Jeev B <[email protected]>
* Add tags to execution spec (#414)
  * add tags to execution spec
    Signed-off-by: Kevin Su <[email protected]>
  * add tags to execution spec
    Signed-off-by: Kevin Su <[email protected]>
  * add comment
    Signed-off-by: Kevin Su <[email protected]>
  ---------
  Signed-off-by: Kevin Su <[email protected]>
  Signed-off-by: Jeev B <[email protected]>
* Correct comment for array job max parallelism (#431)
  Signed-off-by: Katrina Rogan <[email protected]>
  Signed-off-by: Jeev B <[email protected]>
* Add the scalar to the operand (#427)
  Signed-off-by: Kevin Su <[email protected]>
  Signed-off-by: Jeev B <[email protected]>
* add selector
  Signed-off-by: Yee Hing Tong <[email protected]>
  Signed-off-by: Jeev B <[email protected]>
* move selectors from container to task metadata
  Signed-off-by: Yee Hing Tong <[email protected]>
  Signed-off-by: Jeev B <[email protected]>
* drop only_preferred
  Signed-off-by: Jeev B <[email protected]>
* Updating boilerplate to lock golangci-lint version (#435)
  Signed-off-by: Daniel Rammer <[email protected]>
  Signed-off-by: Jeev B <[email protected]>
* add unpartitioned selector
  Signed-off-by: Jeev B <[email protected]>
* refactor
  Signed-off-by: Jeev B <[email protected]>
* refactor
  Signed-off-by: Jeev B <[email protected]>
* fix oneof names
  Signed-off-by: Jeev B <[email protected]>
* add build.os for read the docs
  Signed-off-by: Jeev B <[email protected]>
---------
Signed-off-by: Yee Hing Tong <[email protected]>
Signed-off-by: Jeev B <[email protected]>
Signed-off-by: Hongxin Liang <[email protected]>
Signed-off-by: Honnix <[email protected]>
Signed-off-by: Kevin Su <[email protected]>
Signed-off-by: Katrina Rogan <[email protected]>
Signed-off-by: Daniel Rammer <[email protected]>
Co-authored-by: Honnix <[email protected]>
Co-authored-by: Kevin Su <[email protected]>
Co-authored-by: Kevin Su <[email protected]>
Co-authored-by: Katrina Rogan <[email protected]>
Co-authored-by: Jeev B <[email protected]>
Co-authored-by: Dan Rammer <[email protected]>
* Revert "GPU Type (#419)"
  This reverts commit 7bd98a9.
  Signed-off-by: Jeev B <[email protected]>
* Restore .readthedocs.yml
  Signed-off-by: Jeev B <[email protected]>
---------
Signed-off-by: Jeev B <[email protected]>
TL;DR
Adds support for passing extra resource-related metadata (e.g. accelerator type) to the backend.
Type
Are all requirements met?
Complete description
How did you fix the bug, make the feature etc. Link to any design docs etc
Tracking Issue
Remove the 'fixes' keyword if there will be multiple PRs to fix the linked issue
fixes https://github.com/flyteorg/flyte/issues/
Follow-up issue
NA
OR
https://github.com/flyteorg/flyte/issues/