Computation

This module sets up the resources needed to run Metaflow steps on AWS Batch. You can adjust how much compute capacity is available and configure autoscaling.

This module is not required to use Metaflow: steps can also run locally or in a Kubernetes cluster.

To read more, see the Metaflow docs

Inputs

| Name | Description | Type | Default | Required |
|------|-------------|------|---------|:--------:|
| batch_type | AWS Batch Compute Type ('ec2', 'fargate', 'spot') | `string` | `"ec2"` | no |
| compute_environment_ami_id | The AMI ID to use for Batch Compute Environment EC2 instances. If not specified, defaults to the latest ECS-optimised AMI. | `string` | `null` | no |
| compute_environment_desired_vcpus | Desired starting vCPUs [0-16] for the EC2 Batch Compute Environment (ignored for Fargate) | `number` | n/a | yes |
| compute_environment_egress_cidr_blocks | CIDR blocks to which egress is allowed from the Batch Compute Environment's security group | `list(string)` | `["0.0.0.0/0"]` | no |
| compute_environment_instance_types | The instance types for the compute environment as a comma-separated list | `list(string)` | n/a | yes |
| compute_environment_max_vcpus | Maximum vCPUs [16-96] for the Batch Compute Environment | `number` | n/a | yes |
| compute_environment_min_vcpus | Minimum vCPUs [0-16] for the EC2 Batch Compute Environment (ignored for Fargate) | `number` | n/a | yes |
| compute_environment_spot_bid_percentage | The maximum percentage of the on-demand EC2 instance price to bid for spot instances when using the 'spot' AWS Batch Compute Type | `number` | `100` | no |
| compute_environment_user_data_base64 | Base64-encoded user data to use for Batch Compute Environment EC2 instances | `string` | `null` | no |
| database_password_secret_manager_arn | The ARN of the database password stored in AWS Secrets Manager | `string` | n/a | yes |
| ecs_cluster_id | The ID of an existing ECS cluster to run services on. If no cluster ID is specified, a new cluster will be created. | `string` | `null` | no |
| iam_partition | IAM Partition (select aws-us-gov for AWS GovCloud, otherwise leave as is) | `string` | `"aws"` | no |
| metaflow_vpc_id | ID of the Metaflow VPC in which the Batch Compute Environment is deployed | `string` | n/a | yes |
| resource_prefix | Prefix given to all AWS resources to differentiate between applications | `string` | n/a | yes |
| resource_suffix | Suffix given to all AWS resources to differentiate between environment and workspace | `string` | n/a | yes |
| standard_tags | The standard tags to apply to every AWS resource | `map(string)` | n/a | yes |
| subnet1_id | The first private subnet used for redundancy | `string` | n/a | yes |
| subnet2_id | The second private subnet used for redundancy | `string` | n/a | yes |
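
As an illustration of how the required inputs fit together, the following is a minimal sketch of a module call, not a reference configuration. The `source` path is an assumption about where the module is published, the module name `metaflow_computation` is hypothetical, and every ID, ARN, tag, and instance type is a placeholder to replace with values from your own account.

```hcl
# Hypothetical module name; all values below are placeholders.
module "metaflow_computation" {
  # Assumed registry path; point this at wherever the module lives in your setup.
  source = "outerbounds/metaflow/aws//modules/computation"

  # Naming and tagging applied to every resource the module creates.
  resource_prefix = "metaflow-"
  resource_suffix = "-dev"
  standard_tags = {
    application = "metaflow"
    environment = "dev"
  }

  # Compute environment sizing. The vCPU settings drive Batch autoscaling:
  # a minimum of 0 lets the environment scale down to nothing when idle.
  batch_type                         = "ec2"
  compute_environment_instance_types = ["c5.large", "c5.xlarge"]
  compute_environment_min_vcpus      = 0
  compute_environment_desired_vcpus  = 0
  compute_environment_max_vcpus      = 16

  # Networking: the VPC and the two private subnets used for redundancy.
  metaflow_vpc_id = "vpc-0123456789abcdef0"
  subnet1_id      = "subnet-0123456789abcdef0"
  subnet2_id      = "subnet-0fedcba9876543210"

  # Placeholder ARN of the database password secret in AWS Secrets Manager.
  database_password_secret_manager_arn = "arn:aws:secretsmanager:us-east-1:123456789012:secret:example-db-password"
}
```

Optional inputs such as `compute_environment_ami_id` and `ecs_cluster_id` are omitted here, so they fall back to the defaults listed in the table.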

Outputs

| Name | Description |
|------|-------------|
| METAFLOW_BATCH_JOB_QUEUE | AWS Batch Job Queue ARN for Metaflow |
| batch_compute_environment_security_group_id | The ID of the security group attached to the Batch Compute environment. |
| batch_job_queue_arn | The ARN of the job queue we'll use to accept Metaflow tasks |
| ecs_execution_role_arn | The IAM role that grants access to ECS and Batch services which we'll use as our Metadata Service API's execution_role for our Fargate instance |
| ecs_instance_role_arn | This role will be granted access to our S3 Bucket which acts as our blob storage. |
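
The outputs above can be re-exported from a root configuration or passed to other modules. The sketch below assumes the module was instantiated under the hypothetical name `metaflow_computation`; the output names on the left are illustrative, and only the attribute names on the right come from the table.

```hcl
# Assumes a module block named "metaflow_computation" exists in this configuration.

# Expose the job queue ARN, for example to copy into a Metaflow client configuration.
output "metaflow_batch_job_queue" {
  description = "AWS Batch Job Queue ARN for Metaflow"
  value       = module.metaflow_computation.METAFLOW_BATCH_JOB_QUEUE
}

# Expose the security group ID so other resources can allow traffic from Batch jobs.
output "batch_security_group_id" {
  description = "Security group attached to the Batch Compute environment"
  value       = module.metaflow_computation.batch_compute_environment_security_group_id
}
```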