Skip to content
This repository has been archived by the owner on Oct 9, 2023. It is now read-only.

Add quality of service logic to admin #107

Merged
merged 6 commits into from
Jul 22, 2020
Merged

Add quality of service logic to admin #107

merged 6 commits into from
Jul 22, 2020

Conversation

katrogan
Copy link
Contributor

@katrogan katrogan commented Jul 8, 2020

TL;DR

This PR implements translation from a user-specified quality of service to a queueing budget used for cost optimization.

Users can specify the quality of service for an execution (in order of decreasing specificity)

  • At CreateExecution request time
  • In the LaunchPlan spec
  • In the Workflow spec
  • As an overridable MatchableResource

System administrators can specify default quality of service specs for different tiers and default tiers for different domains.

Type

  • Bug Fix
  • Feature
  • Plugin

Are all requirements met?

  • Code completed
  • Smoke tested
  • Unit tests added
  • Code documentation added
  • Any pending items have an associated Issue

Complete description

https://docs.google.com/document/d/1rYqr01x0XL6n1AdmEOUbTIGeS9Z2hJCcnG5NMy8aQ68/edit#

Tracking Issue

flyteorg/flyte#343

Follow-up issue

NA

@codecov-commenter
Copy link

codecov-commenter commented Jul 8, 2020

Codecov Report

Merging #107 into master will decrease coverage by 0.07%.
The diff coverage is 64.64%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #107      +/-   ##
==========================================
- Coverage   62.60%   62.52%   -0.08%     
==========================================
  Files         102      104       +2     
  Lines        7418     7587     +169     
==========================================
+ Hits         4644     4744     +100     
- Misses       2228     2286      +58     
- Partials      546      557      +11     
Flag Coverage Δ
#unittests 62.52% <64.64%> (-0.08%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
pkg/workflowengine/impl/propeller_executor.go 52.20% <ø> (ø)
pkg/runtime/quality_of_service_provider.go 23.07% <23.07%> (ø)
pkg/runtime/configuration_provider.go 51.72% <33.33%> (-2.13%) ⬇️
pkg/manager/impl/executions/quality_of_service.go 64.34% <64.34%> (ø)
pkg/manager/impl/execution_manager.go 68.67% <85.71%> (+0.03%) ⬆️
pkg/config/serverconfig_flags.go 70.83% <100.00%> (+3.39%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 6d15a2b...ae66d06. Read the comment docs.


// If nothing in the hierarchy has set the quality of service, see if an override exists in the matchable attributes
// resource table.
resource, err := q.resourceManager.GetResource(ctx, interfaces.ResourceRequest{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we only do this if qualityOfServiceTier == UNDEFINED ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good call, thanks! refactored this before opening the PR and missed the check here

@katrogan
Copy link
Contributor Author

friendly ping @EngHabu @migueltol22

@@ -159,3 +159,16 @@ cluster_resources:
valueFrom:
env: SHELL
refresh: 3s
qualityOfService:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a fan of this name - I just don't like the connotation of a low quality of service. Is it too late to change this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's not used so technically i can go make a breaking change :) any suggestions instead? basic/baseline?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually we've had bad luck with supposedly no-op breaking changes so I would rather not (this is checked into flyteidl, unfortunately)

// Config values should always be vetted so there's no need to check the error from conversion.
duration, _ := ptypes.Duration(executionValues.QueueingBudget)

return QualityOfServiceSpec{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we add a debug log line to each case so we can more easily understand what was used?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@@ -0,0 +1,26 @@
package mocks
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why doesn't this file have those generated file comments? is this not generated?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nope, see flyteorg/flyte#149

wild-endeavor
wild-endeavor previously approved these changes Jul 21, 2020
@katrogan
Copy link
Contributor Author

@migueltol22 @EngHabu

EngHabu
EngHabu previously approved these changes Jul 22, 2020
return &workflowengineInterfaces.ExecutionInfo{
Cluster: testCluster,
}, nil
})
//qosProvider := runtimeMocks.NewMockQualityOfServiceProvider()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: deadcode... delete?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done, thanks

@katrogan katrogan dismissed stale reviews from EngHabu and wild-endeavor via ae66d06 July 22, 2020 19:57
@katrogan katrogan merged commit 06d9b79 into master Jul 22, 2020
eapolinario pushed a commit that referenced this pull request Sep 6, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants