Workflow to manage processing of FOVs and Cells for the Cell Variance Analysis program.
All steps and functionality in this package can be run as single steps or all together by using the command line.
In general, all commands for this package will follow the format:
cvapipe {step} {command}
step
is the name of the step such as "ValidateDataset"command
is what you want that step to do, such as "run" or "push"
Available Steps
validate_dataset
:cvapipe validatedataset run --raw_dataset {path_to_dataset}
will validate that the provided dataset can be processed by the downstream steps.prep_analysis_single_cell_ds
:cvapipe prepanalysissinglecellds run --dataset /path/to/cell_table.parquet
will prepare the data table for analysis and other downstream stepsmito_class
:cvapipe mitoclass run --dataset /path/to/preprocessed/cell_table.csv
will run mitotic classifer and generate the manifest for analysismerge_dataset
:cvapipe mergedataset run --dataset_with_annotation /path/to/manifest/from/step1 --dataset_from_labkey /path/to/manifest/from/step3
will generate the manifest for CFE
To run the entire pipeline from start to finish you can simply run:
cvapipe all run --raw_dataset {path to dataset}
Note: The mitotic classifier step was implemented with pytorch-lightning (PLT). PLT support running on slurm in two ways: by submitting a slurm job or with a customized SlurmCluster API, which is different from the SlurmClaster from Dask. So, the whole pipeline will only run through first 2 steps. The last 2 steps need to run as single steps
run
: run the processing for that single step or the entire pipelinepull
: pull down the data required for the step provided (takes your current git branch into account)push
: push the steps data up (takes your current git branch into account)checkout
: checkout the most recent data for your step and git branchclean
: clean the steps local staging directory
Stable Release: pip install cvapipe
Development Head: pip install git+https://github.com/AllenCell/cvapipe.git
See CONTRIBUTING.md for information related to developing the code.
For more details on how this pipeline is constructed please see cookiecutter-stepworkflow and datastep.
To add new steps to this pipeline, run make_new_step
and follow the instructions in
CONTRIBUTING.md
Additionally, for step workflow specific development recommendations please read: DEV_RECOMMENDATIONS.md
If you do not have the raw pipeline four data to run through the pipeline, run the following commands to generate the starting dataset:
pip install -e .[all]
python scripts/create_aics_dataset.py
Options for this script are available and can be viewed with:
python scripts/create_aics_dataset.py --help
Free software: Allen Institute Software License