The predict function uses a maxATAC model to predict TF binding in a new condition. The user must provide a model (or a TF name) and a bigWig file that contains an ATAC-seq signal track.
maxatac predict --model CTCF.h5 --signal GM12878.bigwig
or
maxatac predict --tf CTCF --signal GM12878.bigwig
The user must provide either the TF name that they want to make predictions for OR the h5 model file they desire. If the user provides a TF name, the best model will be used and the correct threshold file will be provided for peak calling.
The ATAC-seq signal bigwig track that will be used to make TF binding predictions.
Output filename prefix (without extension) to use. Default: maxatac_predict.
This argument specifies the path to the 2bit DNA sequence for the genome of interest. maxATAC models are trained with hg38, so you will need the correct .2bit file.
The cutoff type (i.e. Precision, Recall, F1, log2FC). (F1 = F1-score, and log2FC = Log2(Precision : Random Precision)). Default: F1.
The cutoff value for the cutoff type provided. Note precision, recall, and F1-scores range 0-1, while better-than-random log2FC scores range from 0 to infinity. Example: 0.7.
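The log2FC cutoff described above can be illustrated with a short calculation. This is a minimal sketch, not maxATAC's own code: the function name log2_fold_change is hypothetical, and it assumes "Random Precision" means the precision a random predictor would achieve on the same data.

```python
import math


def log2_fold_change(precision: float, random_precision: float) -> float:
    """Return log2(precision / random_precision).

    Hypothetical helper for illustration; both inputs are fractions in (0, 1].
    """
    if precision <= 0 or random_precision <= 0:
        raise ValueError("both precision values must be positive")
    return math.log2(precision / random_precision)


# A model with precision 0.5, where a random predictor would reach 0.125,
# is 4x better than random, which corresponds to a log2FC of 2.
print(log2_fold_change(0.5, 0.125))
```

A value of 0 means the model is no better than random, which is why better-than-random scores start at 0 and have no upper bound.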
The cutoff file provided in /data/models that corresponds to the average validation performance metrics for the TF model.
Output directory path. Default: ./prediction_results
The path to a bigWig file that has regions to exclude. Default: maxATAC-defined blacklist.
The path to a BED file containing genomic regions to focus TF predictions on. These peaks will be used to refine the prediction windows. Default: whole-chromosome predictions.
The number of regions to predict on per batch. Default: 10000. Decrease this value if you are having memory issues.
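To see why a smaller batch size lowers memory use, the sketch below splits a list of prediction regions into batches so that only one batch needs to be held in memory at a time. It is an illustration under assumptions: the batched helper and the (chromosome, start, end) tuple layout are hypothetical and are not taken from the maxATAC source.

```python
from typing import Iterator, List, Sequence, Tuple

# Hypothetical region layout: (chromosome, start, end)
Region = Tuple[str, int, int]


def batched(regions: Sequence[Region], batch_size: int = 10000) -> Iterator[List[Region]]:
    """Yield consecutive slices of `regions` holding at most `batch_size` items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for offset in range(0, len(regions), batch_size):
        yield list(regions[offset:offset + batch_size])


# 25,000 regions with the default batch size give batches of 10000, 10000, and 5000.
regions = [("chr1", i * 256, i * 256 + 1024) for i in range(25000)]
print([len(batch) for batch in batched(regions)])
```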
The step size to use for building the prediction intervals. Overlapping prediction bins will be averaged together. Default: INPUT_LENGTH/4, where INPUT_LENGTH is the maxATAC model input size of 1,024 bp.
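The interaction between step size and averaging can be sketched as follows. This is a simplified illustration, not maxATAC's implementation: build_windows and average_overlaps are hypothetical names, and each window is assumed to carry a single score, whereas a real model produces several values per window.

```python
INPUT_LENGTH = 1024             # model input size stated above
STEP_SIZE = INPUT_LENGTH // 4   # default step: 256 bp


def build_windows(region_start, region_end, width=INPUT_LENGTH, step=STEP_SIZE):
    """Tile fixed-width (start, end) windows across a region at the given step."""
    return [(s, s + width) for s in range(region_start, region_end - width + 1, step)]


def average_overlaps(windows, scores, region_start, region_end):
    """Average the scores of all windows covering each position in the region."""
    length = region_end - region_start
    totals = [0.0] * length
    counts = [0] * length
    for (start, end), score in zip(windows, scores):
        for pos in range(start - region_start, end - region_start):
            totals[pos] += score
            counts[pos] += 1
    # Positions not covered by any window are reported as None.
    return [t / c if c else None for t, c in zip(totals, counts)]


windows = build_windows(0, 2048)          # starts at 0, 256, 512, 768, 1024
scores = [1.0, 2.0, 3.0, 4.0, 5.0]        # one made-up score per window
averaged = average_overlaps(windows, scores, 0, 2048)
print(len(windows), averaged[0], averaged[256], averaged[1023])
```

With a step of INPUT_LENGTH/4, interior positions are covered by up to four windows, so a smaller step gives smoother tracks at the cost of more predictions.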
The path to the chromosome sizes file. This is used to generate the bigwig signal tracks.
The chromosomes to make predictions on. Our models do not currently consider chromosomes X or Y. This means that most of the files will not contain this information. You should not predict in chrX or chrY unless you know your bigWig contains these chromosomes. Default: autosomal chromosomes 1-22.
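A quick check like the one below can help confirm that every requested chromosome is present in the signal track before predicting. It is a sketch under assumptions: the names are hypothetical, chromosomes are assumed to follow the UCSC-style "chr" prefix used by hg38, and the set of available chromosomes is supplied by the caller rather than read from the bigWig.

```python
# Autosomes 1-22, matching the default described above (UCSC-style names assumed).
DEFAULT_CHROMOSOMES = [f"chr{i}" for i in range(1, 23)]


def select_chromosomes(requested, available):
    """Split requested chromosomes into those present in `available` and those missing."""
    present = [chrom for chrom in requested if chrom in available]
    missing = [chrom for chrom in requested if chrom not in available]
    return present, missing


# Example: the signal track only covers the autosomes, but chrX was also requested.
available = set(DEFAULT_CHROMOSOMES)
present, missing = select_chromosomes(DEFAULT_CHROMOSOMES + ["chrX"], available)
print(len(present), missing)
```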
This argument is used to set the logging level. Currently, the only working logging level is ERROR.
The windows to use for prediction. These windows must be 1,024 bp wide and have a consistent step size.
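Custom windows can be checked against these two requirements before they are passed to the tool. The sketch below is one possible check, not part of maxATAC: validate_windows is a hypothetical name, windows are assumed to be (chromosome, start, end) tuples sorted by chromosome and start, and "consistent step size" is interpreted as an equal distance between consecutive window starts on the same chromosome.

```python
def validate_windows(windows, width=1024):
    """Check window width and step consistency; return the step (or None if undetermined)."""
    steps = set()
    previous = None
    for chrom, start, end in windows:
        if end - start != width:
            raise ValueError(f"{chrom}:{start}-{end} is {end - start} bp wide, expected {width}")
        # Only compare starts of neighbouring windows on the same chromosome.
        if previous is not None and previous[0] == chrom:
            steps.add(start - previous[1])
        previous = (chrom, start)
    if len(steps) > 1:
        raise ValueError(f"inconsistent step sizes: {sorted(steps)}")
    return steps.pop() if steps else None


windows = [("chr1", 0, 1024), ("chr1", 256, 1280), ("chr2", 0, 1024), ("chr2", 256, 1280)]
print(validate_windows(windows))
```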
This will skip calling peaks on prediction tracks.