Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Signed-off-by: Elveskevtar <[email protected]>
1 parent
99689bd
commit a1f5d1d
Showing 16 changed files with 1,300 additions and 1 deletion.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
FROM lambci/lambda:build-python2.7 | ||
RUN yum install -y libffi-devel openssl-devel | ||
RUN mkdir /var/package | ||
ADD requirements-lambda.txt /var/task |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,170 @@ | ||
# CfnCluster Step Functions | ||
|
||
CfnCluster Step Functions is a state management solution for deploying high-performance computing (HPC) CfnClusters in an environment with a configurable state machine. It lets customers run jobs based on the state of previous job executions and follow progress through real-time visualizations in AWS Step Functions. The Step Function state machine also handles cluster setup and teardown during execution, so customers can focus on their workloads instead of the compute infrastructure. | ||
|
||
## Usage | ||
|
||
* Dependencies: | ||
* `docker` installed | ||
* `aws-cli` installed | ||
* Ensure that your AWS credentials are properly configured | ||
* Visit the [AWS Documentation](https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-getting-started.html) for more information | ||
|
||
``` | ||
$ pip install -r requirements.txt | ||
$ ./deploy.py --bucket <name> --region <AWS region> --config <file> --jobs <file> | ||
``` | ||
|
||
To Run Step Function: | ||
|
||
1. Wait for the CloudFormation stack to finish deploying | ||
2. Click the link printed by the `deploy.py` script, which opens the [AWS Step Functions Console](https://console.aws.amazon.com/states) | ||
3. Enter the execution input in this format: | ||
``` | ||
{ | ||
"cluster_name": "<cluster name>" | ||
} | ||
``` | ||
4. Click `Start Execution` | ||
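The execution input from step 3 can also be built in code. The following is a minimal sketch, not part of `deploy.py`: the helper names and the `state_machine_arn` argument are hypothetical, and the second function assumes `boto3` is installed and AWS credentials are configured.

```python
import json


def build_execution_input(cluster_name):
    """Return the JSON input document described in step 3."""
    if not cluster_name:
        raise ValueError("cluster_name must be a non-empty string")
    return json.dumps({"cluster_name": cluster_name})


def start_execution(state_machine_arn, cluster_name):
    """Start an execution without the console (hypothetical helper)."""
    # Imported lazily so build_execution_input works without boto3 installed.
    import boto3

    client = boto3.client("stepfunctions")
    return client.start_execution(
        stateMachineArn=state_machine_arn,
        input=build_execution_input(cluster_name),
    )


if __name__ == "__main__":
    # Prints the JSON that can be pasted into the console's input box.
    print(build_execution_input("my-cluster"))
```

The string returned by `build_execution_input` is the same document you would paste into the console before clicking `Start Execution`.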
|
||
## How to Specify Jobs | ||
|
||
Jobs are specified in a configuration file whose path is passed to the `--jobs` or `-j` parameter. An example configuration file: | ||
|
||
``` | ||
[order] | ||
sequential = job1, job2 | ||
[job job1] | ||
handler = thejobtouse.sh | ||
s3_uri = s3://job-bucket/thejobtouse.sh | ||
[job job2] | ||
handler = a_real_job.sh | ||
local_path = /path/where/job/lives | ||
wait_time = 30 | ||
``` | ||
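Because the jobs file uses INI-style sections, it can be read with Python's standard `configparser` module. The sketch below is an illustration under that assumption, not the parser `deploy.py` actually uses. `load_jobs_config` and `SAMPLE` are hypothetical names, and the sample is modelled on the example above.

```python
import configparser

SAMPLE = """
[order]
sequential = job1, job2

[job job1]
handler = thejobtouse.sh
s3_uri = s3://job-bucket/thejobtouse.sh

[job job2]
handler = a_real_job.sh
local_path = /path/where/job/lives
wait_time = 30
"""


def load_jobs_config(text):
    """Split a jobs config into its [order] section and its [job <name>] sections."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    order = dict(parser["order"]) if parser.has_section("order") else {}
    jobs = {}
    for section in parser.sections():
        # Job sections are named "job <job_name>".
        if section.startswith("job "):
            job_name = section[len("job "):].strip()
            jobs[job_name] = dict(parser[section])
    return order, jobs


if __name__ == "__main__":
    order, jobs = load_jobs_config(SAMPLE)
    print(order)
    print(sorted(jobs))
```

All values come back as strings, so numeric options such as `wait_time` still need to be converted by the caller.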
|
||
### Order Section [order] | ||
|
||
Required Parameters: | ||
|
||
`sequential`: Comma-separated list of job names to schedule one after another; order matters | ||
|
||
``` | ||
[order] | ||
sequential = goodjob, badjob, otherjob | ||
``` | ||
|
||
OR | ||
|
||
`parallel`: Comma-separated list of job names to schedule in parallel; order does not matter | ||
|
||
``` | ||
[order] | ||
parallel = goodjob, badjob, otherjob | ||
``` | ||
|
||
**Important**: specify exactly one of `sequential` or `parallel`, never both | ||
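The rule for the `[order]` section can be expressed as a small check. This is a sketch only: `parse_order` is a hypothetical helper, it takes the section as a plain dict, and rejecting an empty list is an assumption rather than documented behavior.

```python
def parse_order(order):
    """Return (mode, job_names) for an [order] section given as a dict."""
    has_sequential = "sequential" in order
    has_parallel = "parallel" in order
    # Exactly one of the two keys may be present.
    if has_sequential == has_parallel:
        raise ValueError("specify exactly one of 'sequential' or 'parallel'")
    mode = "sequential" if has_sequential else "parallel"
    # Split the comma-separated list and drop surrounding whitespace.
    names = [name.strip() for name in order[mode].split(",") if name.strip()]
    if not names:
        raise ValueError("'%s' must list at least one job" % mode)
    return mode, names


if __name__ == "__main__":
    print(parse_order({"sequential": "goodjob, badjob, otherjob"}))
```

For `sequential`, the returned list preserves the order in which the jobs were written.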
|
||
### Job Section [job <job_name>] | ||
|
||
Required Parameters: | ||
|
||
`s3_uri`: An S3 URI pointing to the script or folder to package for job scheduling and execution | ||
|
||
``` | ||
[job apple] | ||
s3_uri = s3://thebucket/thefolder | ||
handler = thescript | ||
``` | ||
|
||
OR | ||
|
||
`local_path`: A local path (absolute, or relative to the jobs config file) pointing to the script or folder to package for job scheduling and execution | ||
|
||
``` | ||
[job banana] | ||
local_path = /path/to/the/script | ||
handler = script | ||
``` | ||
|
||
AND | ||
|
||
`handler`: The path and name of the script to run. Since `s3_uri` and `local_path` can both be directories, `handler` specifies which file is submitted to the scheduler | ||
|
||
``` | ||
[job carrot] | ||
local_path = relative/path/project | ||
handler = script/path/in/project.sh | ||
``` | ||
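A relative `local_path`, as in the `carrot` example, is interpreted relative to the jobs config file rather than the current working directory. A minimal sketch of that resolution, using a hypothetical `resolve_local_path` helper that is not taken from `deploy.py`:

```python
import os.path


def resolve_local_path(jobs_config_file, local_path):
    """Resolve local_path against the directory that holds the jobs config file."""
    if os.path.isabs(local_path):
        # Absolute paths are used as given.
        return os.path.normpath(local_path)
    base_dir = os.path.dirname(os.path.abspath(jobs_config_file))
    return os.path.normpath(os.path.join(base_dir, local_path))


if __name__ == "__main__":
    print(resolve_local_path("/etc/jobs/jobs.cfg", "relative/path/project"))
```

How `handler` is then located inside the resolved directory is left out here, since the text does not spell it out.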
|
||
**Important**: specify exactly one of `s3_uri` or `local_path`, never both | ||
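The required-parameter rules for a job section can be checked before anything is packaged. The sketch below assumes a job section has already been read into a dict. `validate_job` is a hypothetical helper, and the `s3://` prefix check is an added assumption.

```python
def validate_job(name, job):
    """Check one [job <name>] section; return which source key it uses."""
    has_s3 = "s3_uri" in job
    has_local = "local_path" in job
    # Exactly one source location may be given.
    if has_s3 == has_local:
        raise ValueError(
            "job %r: specify exactly one of 's3_uri' or 'local_path'" % name
        )
    if not job.get("handler"):
        raise ValueError("job %r: 'handler' is required" % name)
    # Assumption: an S3 URI always starts with the s3:// scheme.
    if has_s3 and not job["s3_uri"].startswith("s3://"):
        raise ValueError("job %r: 's3_uri' must start with s3://" % name)
    return "s3_uri" if has_s3 else "local_path"


if __name__ == "__main__":
    print(validate_job("apple", {"s3_uri": "s3://thebucket/thefolder",
                                 "handler": "thescript"}))
```

Failing early with the job name in the message makes a misconfigured section easier to find than an error raised later during deployment.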
|
||
Optional Parameters: | ||
|
||
`wait_time`: Time to wait between checks of the job's status while it is running; default = 10; range 1-240 due to scheduler limitations | ||
|
||
``` | ||
[job donut] | ||
s3_uri = s3://bucket/script | ||
handler = script | ||
wait_time = 240 | ||
``` | ||
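The default and the allowed range of `wait_time` translate into a short conversion step. This is a sketch under assumptions: `parse_wait_time` is a hypothetical helper, and rejecting out-of-range values (rather than clamping them) is a choice made here, not documented behavior.

```python
DEFAULT_WAIT_TIME = 10
MIN_WAIT_TIME = 1
MAX_WAIT_TIME = 240


def parse_wait_time(job):
    """Return the wait_time of a job section (a dict of strings) as an int."""
    raw = job.get("wait_time")
    if raw is None:
        return DEFAULT_WAIT_TIME
    try:
        value = int(raw)
    except ValueError:
        raise ValueError("wait_time must be an integer, got %r" % raw)
    if not MIN_WAIT_TIME <= value <= MAX_WAIT_TIME:
        raise ValueError(
            "wait_time must be between %d and %d, got %d"
            % (MIN_WAIT_TIME, MAX_WAIT_TIME, value)
        )
    return value


if __name__ == "__main__":
    print(parse_wait_time({"wait_time": "240"}))
    print(parse_wait_time({}))
```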
|
||
## Arguments | ||
|
||
### `--config` or `-c` | ||
|
||
Specifies the CfnCluster configuration file to use. The step function uses this file to deploy user-defined clusters. For more information on how to configure CfnCluster, visit the [CfnCluster Documentation](http://cfncluster.readthedocs.io/en/latest/getting_started.html#configuring-cfncluster). | ||
|
||
### `--bucket` or `-b` | ||
|
||
Specifies the name of the S3 bucket used to store the source code that creates and terminates the CfnClusters. **Important**: if the bucket already exists, it must be in the same region as the one given by the `--region` argument. If it does not exist, it is created for you in the specified region. | ||
|
||
### `--jobs` or `-j` | ||
|
||
Specifies the job configuration file to use. This will be used to package your jobs for use in the Step Function. | ||
|
||
## Optional Arguments | ||
|
||
### `--region` or `-r` | ||
|
||
Specifies the AWS region in which to deploy the CloudFormation stack, which contains the Step Function and the source code that deploys and terminates CfnClusters. Defaults to `us-east-1`. | ||
|
||
### `--stack-name` or `-s` | ||
|
||
Specifies the name that should be given to the CloudFormation stack that the script deploys. | ||
|
||
### `--key-name` or `-k` | ||
|
||
Specifies the name of the EC2 key pair to use for the CfnCluster master node. **Important**: `--key-name` is optional, but if you specify it, an [EC2 key pair](https://console.aws.amazon.com/ec2#KeyPairs) with this name must exist, and a secret with the same name must exist in [AWS Secrets Manager](https://console.aws.amazon.com/secretsmanager) with its secret value set to the private key. If `--key-name` is omitted, it defaults to `cfncluster-stepfunctions`. | ||
|
||
## Flags | ||
|
||
### `--help` or `-h` | ||
|
||
Prints the help menu and usage to standard output. | ||
|
||
``` | ||
usage: deploy.py [-h] --bucket BUCKET_NAME --config CONFIG_FILE --jobs | ||
JOBS_CONFIG [--stack-name STACK_NAME] [--region REGION] | ||
[--key-name KEY_NAME] | ||
Deploys CfnCluster Step Function | ||
optional arguments: | ||
-h, --help show this help message and exit | ||
--bucket BUCKET_NAME, -b BUCKET_NAME | ||
Specify s3 bucket to use/create | ||
--config CONFIG_FILE, -c CONFIG_FILE | ||
Specify config file to use | ||
--jobs JOBS_CONFIG, -j JOBS_CONFIG | ||
Specify jobs config file to use | ||
--stack-name STACK_NAME, -s STACK_NAME | ||
Specify the stack name to use | ||
--region REGION, -r REGION | ||
Specify the region to deploy in | ||
--key-name KEY_NAME, -k KEY_NAME | ||
Specify the ec2 key pair | ||
``` |
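The help text above is the kind of output Python's `argparse` module generates. The following reconstruction is a sketch, not the actual `deploy.py` source. The `dest` names are inferred from the metavars shown in the usage line, the defaults for `--region` and `--key-name` come from the argument descriptions, and no default is assumed for `--stack-name` because none is stated.

```python
import argparse


def build_parser():
    """Build a parser that mirrors the documented deploy.py interface."""
    parser = argparse.ArgumentParser(
        prog="deploy.py", description="Deploys CfnCluster Step Function"
    )
    parser.add_argument("--bucket", "-b", dest="bucket_name", required=True,
                        metavar="BUCKET_NAME",
                        help="Specify s3 bucket to use/create")
    parser.add_argument("--config", "-c", dest="config_file", required=True,
                        metavar="CONFIG_FILE",
                        help="Specify config file to use")
    parser.add_argument("--jobs", "-j", dest="jobs_config", required=True,
                        metavar="JOBS_CONFIG",
                        help="Specify jobs config file to use")
    parser.add_argument("--stack-name", "-s",
                        help="Specify the stack name to use")
    parser.add_argument("--region", "-r", default="us-east-1",
                        help="Specify the region to deploy in")
    parser.add_argument("--key-name", "-k", default="cfncluster-stepfunctions",
                        help="Specify the ec2 key pair")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(
        ["-b", "my-bucket", "-c", "cfncluster.cfg", "-j", "jobs.cfg"]
    )
    print(args.region, args.key_name)
```

`argparse` lists arguments marked `required=True` under "optional arguments" in older Python versions, which is consistent with the help output shown above.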