fix: update default aux container limits and instance types (#8959)
erikwilson authored Mar 6, 2024
1 parent c7e5d43 commit dcaa893
Showing 8 changed files with 15 additions and 15 deletions.
4 changes: 2 additions & 2 deletions harness/determined/deploy/aws/templates/efs.yaml
@@ -61,7 +61,7 @@ Parameters:
AuxAgentInstanceType:
Type: String
Description: Instance Type of agents in the auxiliary resource pool
- Default: t2.xlarge
+ Default: m5.xlarge

ComputeAgentInstanceType:
Type: String
@@ -111,7 +111,7 @@ Parameters:
MaxAuxContainersPerAgent:
Type: Number
Description: Maximum number of CPU containers to launch on agents in the default auxiliary resource pool.
- Default: 100
+ Default: 16

MaxIdleAgentPeriod:
Type: String
4 changes: 2 additions & 2 deletions harness/determined/deploy/aws/templates/fsx.yaml
@@ -61,7 +61,7 @@ Parameters:
AuxAgentInstanceType:
Type: String
Description: Instance Type of agents in the auxiliary resource pool
- Default: t2.xlarge
+ Default: m5.xlarge

ComputeAgentInstanceType:
Type: String
@@ -111,7 +111,7 @@ Parameters:
MaxAuxContainersPerAgent:
Type: Number
Description: Maximum number of CPU containers to launch on agents in the default auxiliary resource pool.
- Default: 100
+ Default: 16

MaxIdleAgentPeriod:
Type: String
4 changes: 2 additions & 2 deletions harness/determined/deploy/aws/templates/govcloud.yaml
@@ -22,7 +22,7 @@ Parameters:
AuxAgentInstanceType:
Type: String
Description: Instance Type of agents in the auxiliary resource pool
- Default: t2.xlarge
+ Default: m5.xlarge

ComputeAgentInstanceType:
Type: String
@@ -77,7 +77,7 @@ Parameters:
MaxAuxContainersPerAgent:
Type: Number
Description: Maximum number of CPU containers to keep running on agents in the CPU resource pool
- Default: 100
+ Default: 16

MaxIdleAgentPeriod:
Type: String
4 changes: 2 additions & 2 deletions harness/determined/deploy/aws/templates/lore.yaml
@@ -61,7 +61,7 @@ Parameters:
AuxAgentInstanceType:
Type: String
Description: Instance Type of agents in the auxiliary resource pool
- Default: t2.xlarge
+ Default: m5.xlarge

ComputeAgentInstanceType:
Type: String
@@ -111,7 +111,7 @@ Parameters:
MaxAuxContainersPerAgent:
Type: Number
Description: Maximum number of CPU containers to launch on agents in the default auxiliary resource pool.
- Default: 100
+ Default: 16

MaxIdleAgentPeriod:
Type: String
4 changes: 2 additions & 2 deletions harness/determined/deploy/aws/templates/secure.yaml
@@ -77,7 +77,7 @@ Parameters:
AuxAgentInstanceType:
Type: String
Description: Instance Type of agents in the auxiliary resource pool
- Default: t2.xlarge
+ Default: m5.xlarge

ComputeAgentInstanceType:
Type: String
@@ -132,7 +132,7 @@ Parameters:
MaxAuxContainersPerAgent:
Type: Number
Description: Maximum number of CPU containers to launch on agents in the default auxiliary resource pool.
- Default: 100
+ Default: 16

MaxIdleAgentPeriod:
Type: String
4 changes: 2 additions & 2 deletions harness/determined/deploy/aws/templates/simple-rds.yaml
@@ -48,7 +48,7 @@ Parameters:
AuxAgentInstanceType:
Type: String
Description: Instance Type of agents in the auxiliary resource pool
- Default: t2.xlarge
+ Default: m5.xlarge

ComputeAgentInstanceType:
Type: String
@@ -116,7 +116,7 @@ Parameters:
MaxAuxContainersPerAgent:
Type: Number
Description: Maximum number of CPU containers to launch on agents in the default auxiliary resource pool.
- Default: 100
+ Default: 16

MaxIdleAgentPeriod:
Type: String
4 changes: 2 additions & 2 deletions harness/determined/deploy/aws/templates/simple.yaml
@@ -48,7 +48,7 @@ Parameters:
AuxAgentInstanceType:
Type: String
Description: Instance Type of agents in the auxiliary resource pool
- Default: t2.xlarge
+ Default: m5.xlarge

ComputeAgentInstanceType:
Type: String
@@ -103,7 +103,7 @@ Parameters:
MaxAuxContainersPerAgent:
Type: Number
Description: Maximum number of CPU containers to launch on agents in the default auxiliary resource pool.
- Default: 100
+ Default: 16

MaxIdleAgentPeriod:
Type: String
2 changes: 1 addition & 1 deletion harness/determined/deploy/gcp/constants.py
@@ -8,7 +8,7 @@ class defaults:
GPU_NUM = 4
GPU_TYPE = "nvidia-tesla-t4"
MASTER_INSTANCE_TYPE = "n1-standard-2"
- MAX_AUX_CONTAINERS_PER_AGENT = 100
+ MAX_AUX_CONTAINERS_PER_AGENT = 16
MAX_IDLE_AGENT_PERIOD = "10m"
MAX_AGENT_STARTING_PERIOD = "20m"
OPERATION_TIMEOUT_PERIOD = "5m"