-
Notifications
You must be signed in to change notification settings - Fork 14
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Merge pull request #61 from STRIDES/rework-readme-structure
added tutorials dir back to accomodate ms flows emails for a few more months
- Loading branch information
Showing
21 changed files
with
6,619 additions
and
1 deletion.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Large diffs are not rendered by default.
Oops, something went wrong.
202 changes: 202 additions & 0 deletions
202
tutorials/notebooks/ElasticBLAST/run_elastic_blast.ipynb
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,202 @@ | ||
{ | ||
"cells": [ | ||
{ | ||
"cell_type": "markdown", | ||
"id": "8c3f3bb2", | ||
"metadata": {}, | ||
"source": [ | ||
"# Run ElasticBLAST using AWS Batch" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "aee3b229", | ||
"metadata": {}, | ||
"source": [ | ||
"This notebook is based on the [this tutorial](https://blast.ncbi.nlm.nih.gov/doc/elastic-blast/quickstart-aws.html). Make sure you select a kernel with Python 3.7 for the Elastic BLAST install. One good option is `conda_mxnet_latest_p37`. " | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "38dfb579", | ||
"metadata": {}, | ||
"source": [ | ||
"### 1) Install elastic blast" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "d96bb988", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"!pip3 install elastic-blast" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "684e79f6", | ||
"metadata": {}, | ||
"source": [ | ||
"Test your install, it should print out a version and full help menu." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "2aa11ccc", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"!elastic-blast --version\n", | ||
"!elastic-blast --help" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "58b59cb0", | ||
"metadata": {}, | ||
"source": [ | ||
"### 2) Optionally, create a bucket for this tutorial if one does not yet exist" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "319ff226", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"!aws s3 mb s3://elasticblast-sagemaker" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "449d7511", | ||
"metadata": {}, | ||
"source": [ | ||
"### 3) Create a config file that defines the job parameters" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "b578c1ea", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"!touch BDQA.ini" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "a1b0a866", | ||
"metadata": {}, | ||
"source": [ | ||
"Open the config file and add the following:\n", | ||
"```\n", | ||
"[cloud-provider]\n", | ||
"aws-region = us-east-1\n", | ||
"aws-vpc = vpc-0eaafe0236e351a36\n", | ||
"aws-subnet = subnet-043d7614ae5dc30c9\n", | ||
"aws-key-pair = cloud-lab-testing\n", | ||
"\n", | ||
"[cluster]\n", | ||
"num-nodes = 3\n", | ||
"labels = owner=ec2-user\n", | ||
"\n", | ||
"[blast]\n", | ||
"program = blastp\n", | ||
"db = refseq_protein\n", | ||
"queries = s3://elasticblast-test/queries/BDQA01.1.fsa_aa\n", | ||
"results = s3://elasticblast-sagemaker/results/BDQA\n", | ||
"options = -task blastp-fast -evalue 0.01 -outfmt \"7 std sskingdoms ssciname\"\n", | ||
"```\n", | ||
"\n", | ||
"You can add additional configuration values from [this guide](https://blast.ncbi.nlm.nih.gov/doc/elastic-blast/configuration.html). If you need to run this a few times, make sure you either rename the ouput folder, or delete the results folder from the S3 bucket. If you are using your own data, make sure to modify the database and the S3 queries path." | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "9a9f8192", | ||
"metadata": {}, | ||
"source": [ | ||
"### 4) Submit the job" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "398253e8", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"!elastic-blast submit --cfg BDQA.ini" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "9a8e7716", | ||
"metadata": {}, | ||
"source": [ | ||
"### 5) Check results and troubleshoot" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "94a43c5e", | ||
"metadata": {}, | ||
"source": [ | ||
"+ You can monitor the job initially by going to `CloudFormation` and viewing the events tab of the elastic blast stack. If there is an error, you should be able to pinpoint it in these event logs.\n", | ||
"+ You can view the progress by going to `AWS Batch`, select the Job queue that begins with `elasticblast`, and then make sure jobs are moving from Runnable to Running to Succeeded. The number of jobs that run together will be the number of nodes you selected in the config file. To run more jobs at once, increase the `cluster` parameter `num-nodes`. \n", | ||
"+ Finally, to view your outputs, look at the files in your S3 output bucket, something like `aws s3 ls s3://elasticblast-sagemaker/results/BDQA/`." | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "292947f1-5247-4da5-81bd-7fc8fc420ca4", | ||
"metadata": {}, | ||
"source": [ | ||
"### 6) Clean up cloud resources" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "e677ba64-38e0-49d4-919b-4bb51de83cdd", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"!elastic-blast delete --cfg BDQA.ini" | ||
] | ||
} | ||
], | ||
"metadata": { | ||
"environment": { | ||
"kernel": "python3", | ||
"name": "common-cpu.m93", | ||
"type": "gcloud", | ||
"uri": "gcr.io/deeplearning-platform-release/base-cpu:m93" | ||
}, | ||
"kernelspec": { | ||
"display_name": "Python 3", | ||
"language": "python", | ||
"name": "python3" | ||
}, | ||
"language_info": { | ||
"codemirror_mode": { | ||
"name": "ipython", | ||
"version": 3 | ||
}, | ||
"file_extension": ".py", | ||
"mimetype": "text/x-python", | ||
"name": "python", | ||
"nbconvert_exporter": "python", | ||
"pygments_lexer": "ipython3", | ||
"version": "3.7.12" | ||
} | ||
}, | ||
"nbformat": 4, | ||
"nbformat_minor": 5 | ||
} |
Oops, something went wrong.