Prepare your workflow
Start your workflow
Get your workflow Id for your workflow
Abort an in-progress workflow
For any pipeline, you can create a WDL file that calls your tools in Docker containers. Note that, for security reasons, Cromwell on Azure only supports tasks that define a Docker container.
To run a WDL file, create or modify a workflow so that its tasks use the following runtime attributes, which are compliant with TES (Task Execution Schemas):
runtime {
cpu: 1
memory: "2 GB"
disk: "10 GB"
docker:
maxRetries: 0
preemptible: true
}
Ensure that the attributes memory and disk (note: use the singular form disk, NOT disks) have units. Supported units from Cromwell:
KB - "KB", "K", "KiB", "Ki"
MB - "MB", "M", "MiB", "Mi"
GB - "GB", "G", "GiB", "Gi"
TB - "TB", "T", "TiB", "Ti"
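As a rough illustration of the unit rule above, the following Python sketch checks that a memory or disk value carries one of the listed units before a workflow is submitted. The function name and the regular expression are hypothetical, and the case-sensitive matching is a simplifying assumption rather than documented Cromwell behavior.

```python
import re

# Unit spellings copied from the list above; anything else is rejected.
SUPPORTED_UNITS = {
    "KB", "K", "KiB", "Ki",
    "MB", "M", "MiB", "Mi",
    "GB", "G", "GiB", "Gi",
    "TB", "T", "TiB", "Ti",
}

# A number (integer or decimal), optional whitespace, then a unit token.
_SIZE_PATTERN = re.compile(r"^\s*(\d+(?:\.\d+)?)\s*([A-Za-z]+)\s*$")


def has_supported_unit(value: str) -> bool:
    """Return True if value looks like '<number> <unit>' with a listed unit."""
    match = _SIZE_PATTERN.match(value)
    # Assumption: units are compared case-sensitively against the list.
    return bool(match) and match.group(2) in SUPPORTED_UNITS
```

For example, has_supported_unit("2 GB") accepts the value used in the runtime block above, while a bare "10" with no unit is rejected.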
The preemptible attribute is a boolean (not an integer). You can specify preemptible as true or false for each task. When set to true, Cromwell on Azure will use a low-priority batch VM to run the task. If set to false, Cromwell on Azure will use a dedicated VM to run the task.
The bootDiskSizeGb and zones attributes are not supported by the TES backend.
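The runtime rules described so far can be collected into a small pre-submission check. This is only a sketch: it assumes the runtime section of a task has already been loaded into a Python dict (how that happens is not covered here), and check_runtime is a hypothetical helper, not part of Cromwell on Azure.

```python
from typing import Any, Dict, List

# Attributes the text above says the TES backend does not support.
UNSUPPORTED_ATTRIBUTES = ("bootDiskSizeGb", "zones")


def check_runtime(runtime: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable problems found in a runtime mapping."""
    problems = []
    # Tasks without a Docker container are not supported.
    if not runtime.get("docker"):
        problems.append("docker: a container image is required")
    # The attribute is spelled 'disk', not 'disks'.
    if "disks" in runtime:
        problems.append("disks: use the singular form 'disk'")
    # preemptible must be a boolean; an integer such as 1 is flagged.
    if "preemptible" in runtime and not isinstance(runtime["preemptible"], bool):
        problems.append("preemptible: must be true or false, not an integer")
    for name in UNSUPPORTED_ATTRIBUTES:
        if name in runtime:
            problems.append(f"{name}: not supported by the TES backend")
    return problems
```

An empty list means none of these particular rules were violated; it does not guarantee that the workflow will run.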
Each of these runtime attributes is specific to your workflow and the tasks within it. The default values for resource requirements are those shown above.
Learn more about Cromwell's runtime attributes in the Cromwell documentation.
To run a workflow using Cromwell on Azure, you need to specify the location of your WDL or CWL file and your inputs JSON file in a Cromwell on Azure-specific trigger JSON file, which also includes any workflow options and dependencies. Submitting this trigger file initiates the Cromwell workflow.
All trigger JSON files include the following information:
- The "WorkflowUrl" is the URL for your WDL or CWL file.
- The "WorkflowInputsUrl" is the URL for your input JSON file. You can use this file to customize inputs to any workflow file.
- The "WorkflowOptionsUrl" is only used with some workflow files. If you are not using it, set this to null.
- The "WorkflowDependenciesUrl" is only used with some workflow files. If you are not using it, set this to null.
Your trigger file should be configured as follows:
{
"WorkflowUrl": <URL path to your WDL file in quotes>,
"WorkflowInputsUrl": <URL path to your input json file in quotes>,
"WorkflowOptionsUrl": <URL path to your workflow options json in quotes>,
"WorkflowDependenciesUrl": <URL path to your workflow dependencies file in quotes>
}
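If you generate trigger files from a script, a sketch along these lines produces the layout shown above. The function name build_trigger and its parameter names are hypothetical; the four JSON keys are the ones described in this section, and unused optional entries are written as null.

```python
import json
from typing import Optional


def build_trigger(
    workflow_url: str,
    inputs_url: str,
    options_url: Optional[str] = None,
    dependencies_url: Optional[str] = None,
) -> str:
    """Serialize a trigger document; Python None becomes JSON null."""
    trigger = {
        "WorkflowUrl": workflow_url,
        "WorkflowInputsUrl": inputs_url,
        "WorkflowOptionsUrl": options_url,
        "WorkflowDependenciesUrl": dependencies_url,
    }
    return json.dumps(trigger, indent=2)
```

Leaving options_url and dependencies_url at their defaults yields a file in which both entries are null.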
By default, Cromwell on Azure mounts a storage account to your instance, which is found in your resource group after a successful deployment. You can follow these steps to mount a different storage account that you manage or own, to your Cromwell on Azure instance.
There are four main ways to specify blob paths within your trigger JSON file, the "WorkflowUrl" WDL or CWL file, the "WorkflowInputsUrl" JSON file, and the "WorkflowDependenciesUrl" file. The "WorkflowOptionsUrl" file only supports the first format.
If using the default storage account or using a storage account connected to your Cromwell on Azure instance:
1. For blobs/files hosted on an Azure Storage account that is connected to your Cromwell on Azure instance, the input path consists of 3 parts (the storage account name, the blob container name, and the blob/file path with extension), following this format:
/<storageaccountname>/<containername>/<blobName>
An example file path for an "inputs" container in a storage account "msgenpublicdata" looks like:
"/msgenpublicdata/inputs/chr21.read1.fq.gz"
This is the only supported format for blob paths within the "WorkflowOptionsUrl" file.
2. You can also use the https URLs, which can be found by clicking on the blob to view its properties in the Azure portal. The URL path to "WorkflowUrl" for a test WDL file in a container called "inputs" will look like:
https://<storageaccountname>.blob.core.windows.net/inputs/test/test.wdl
If using files in locations that are not connected to your Cromwell on Azure instance:
3. Via SAS URLs for Azure Storage account blobs/files that are not connected to your Cromwell on Azure instance
4. Via public http or https URLs like GitHub raw URLs
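To make the path formats above concrete, here is a hedged Python sketch that builds a path in the first format and makes a rough guess at which format a given string uses. Both function names are hypothetical. Detecting a SAS URL by the presence of a "sig" query parameter is a heuristic assumption, and plain https URLs for connected accounts cannot be told apart from public URLs by syntax alone, so both are reported as "http-url".

```python
from urllib.parse import parse_qs, urlsplit


def local_blob_path(account: str, container: str, blob_name: str) -> str:
    """Build /<storageaccountname>/<containername>/<blobName>."""
    return f"/{account}/{container}/{blob_name.lstrip('/')}"


def describe_path(path: str) -> str:
    """Classify a path string as account-path, sas-url, http-url, or unknown."""
    parts = urlsplit(path)
    if parts.scheme in ("http", "https"):
        # Heuristic (assumption): a SAS URL carries a 'sig' query parameter.
        if "sig" in parse_qs(parts.query):
            return "sas-url"
        return "http-url"
    # The account-path form needs at least account, container, and blob name.
    if path.startswith("/") and len(path.strip("/").split("/")) >= 3:
        return "account-path"
    return "unknown"
```

Since the "WorkflowOptionsUrl" file only supports the first format, a check such as describe_path(p) == "account-path" could be applied to paths found in that file.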
Any additional scripts or subworkflows must be accessible to TES. Apart from the above methods, the "WorkflowDependenciesUrl" property can also be defined via a ZIP file in a storage container accessible by Cromwell.
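One way to produce such a ZIP file is with Python's standard zipfile module, as in the sketch below. The helper name and the example file name are hypothetical, and the internal layout Cromwell expects inside the archive is not specified here, so treat the flat structure as an assumption.

```python
import io
import zipfile
from typing import Dict


def zip_dependencies(files: Dict[str, str]) -> bytes:
    """Pack {archive name: file text} into an in-memory ZIP archive."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name, text in files.items():
            # writestr stores the text under the given name inside the archive.
            archive.writestr(name, text)
    return buffer.getvalue()
```

The returned bytes can then be uploaded to a storage container that Cromwell can access, and the resulting blob path used as the "WorkflowDependenciesUrl" value.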
To start a WDL workflow, go to your Cromwell on Azure Storage account associated with your host VM. In the workflows container, place the trigger JSON file in the "new" virtual directory (note: virtual directories do not exist on their own, they are just part of a blob's name). This initiates a Cromwell workflow and returns a workflow ID. The ID is appended to the trigger JSON file name, and the file is moved to the "inprogress" directory in the workflows container.
This can be done programmatically using the Azure Storage SDKs, or manually via the Azure Portal or Azure Storage Explorer.
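A minimal sketch of the programmatic route follows. It assumes a container client object for the workflows container that exposes an upload_blob(name=..., data=...) method, as the ContainerClient class in the azure-storage-blob Python package does; submit_trigger and new_trigger_blob_name are hypothetical helper names.

```python
import json
from typing import Any, Dict


def new_trigger_blob_name(file_name: str) -> str:
    """The 'new' virtual directory is just a prefix in the blob name."""
    return f"new/{file_name}"


def submit_trigger(container_client: Any, file_name: str, trigger: Dict[str, Any]) -> str:
    """Upload the trigger JSON under new/ and return the blob name used."""
    blob_name = new_trigger_blob_name(file_name)
    # Assumption: container_client points at the 'workflows' container.
    container_client.upload_blob(name=blob_name, data=json.dumps(trigger))
    return blob_name
```

With azure-storage-blob installed, a suitable client can be obtained with BlobServiceClient.from_connection_string(connection_string).get_container_client("workflows"), where connection_string is your own storage account connection string.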
For example, a trigger JSON file named task1.json in the "new" directory will be moved to the "inprogress" directory with a modified name, task1.uuid.json. This uuid is a workflow ID assigned by Cromwell.
Once your workflow completes, you can view the output files of your workflow in the cromwell-executions container within your Azure Storage Account. Additional output files from the Cromwell endpoint, including metadata and the timing file, are found in the outputs container. To learn more about Cromwell's metadata and timing information, visit the Cromwell documentation.
The Cromwell workflow ID is generated by Cromwell once the workflow is in progress, and it is appended to the trigger JSON file name.
For example, placing a trigger JSON file named task1.json in the "new" directory will initiate the workflow. Once the workflow begins, the JSON file will be moved to the "inprogress" directory in the "workflows" container with a modified name, task1.guid.json.
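The naming convention above can be used to recover the workflow ID from a blob name. The sketch below assumes the ID is a standard 36-character hyphenated UUID and that the blob name starts with the "inprogress/" prefix; the function name and the sample ID in the usage note are made up for illustration.

```python
import re
from typing import Optional

# Assumption: the ID is a hyphenated UUID placed just before the .json suffix.
_INPROGRESS = re.compile(
    r"^inprogress/(?P<name>.+)\."
    r"(?P<id>[0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12})\.json$"
)


def workflow_id_from_blob(blob_name: str) -> Optional[str]:
    """Return the workflow ID embedded in an in-progress blob name, or None."""
    match = _INPROGRESS.match(blob_name)
    return match.group("id") if match else None
```

For instance, "inprogress/task1.123e4567-e89b-12d3-a456-426614174000.json" yields "123e4567-e89b-12d3-a456-426614174000", while a name still under "new/" yields None.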
To abort a workflow that is in progress, go to your Cromwell on Azure Storage account associated with your host VM. In the workflows container, place an empty file named cromwellID.json in the "abort" virtual directory, where "cromwellID" is the ID of the Cromwell workflow you wish to abort.
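The abort step can be scripted in the same spirit. This sketch again assumes a container client for the workflows container with an upload_blob(name=..., data=...) method, such as ContainerClient from the azure-storage-blob package; the helper names are hypothetical.

```python
from typing import Any


def abort_blob_name(workflow_id: str) -> str:
    """The 'abort' virtual directory is just a prefix in the blob name."""
    return f"abort/{workflow_id}.json"


def request_abort(container_client: Any, workflow_id: str) -> str:
    """Upload an empty blob named abort/<workflow ID>.json and return its name."""
    blob_name = abort_blob_name(workflow_id)
    # The file content is irrelevant; an empty payload is uploaded.
    container_client.upload_blob(name=blob_name, data=b"")
    return blob_name
```

The workflow_id argument is the ID that was appended to the trigger file name when the workflow started.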