We expect a quality_control
HDF5 group at the root of the file, containing parameters and statistics for quality control on the RNA count matrix.
The group itself contains the parameters
and results
subgroups.
Definitions:
num_cells
: number of cells in the dataset, prior to any filtering. This is typically determined from theinputs
step.num_samples
: number of samples in the dataset. This is typically determined from theinputs
step.
parameters
should contain:
use_mito_default
: a scalar integer to be interpreted as a boolean. This specifies whether to use the default mitochondrial gene list.mito_prefix
: a scalar string containing the expected prefix for mitochondrial gene symbols.nmads
: a scalar float specifying the number of MADs to use to define the QC thresholds.skip
: a scalar integer indicating whether quality control should be skipped.
results
should contain:
metrics
, a group containing per-cell QC metrics derived from the RNA count data. This contains:sums
: a float dataset of length equal tonum_cells
, containing the total count for each cell.detected
: an integer dataset of length equal tonum_cells
, containing the total number of detected genes for each cell.proportion
: a float dataset of length equal tonum_cells
, containing the percentage of counts in (mitochondrial) genes.
thresholds
, a group containing thresholds on the metrics for each sample. This contains:sums
: a float dataset of length equal tonum_samples
, containing the total count threshold for each sample.detected
: an integer dataset of length equal tonum_samples
, containing the threshold on the total number of detected genes for each sample.proportion
: a float dataset of length equal tonum_samples
, containing the threshold on the percentage of counts in (mitochondrial) genes for each sample.
discards
: an integer dataset of length equal tonum_cells
. Each value is interpreted as a boolean and specifies whether the corresponding cell would be discarded by the RNA-based filter thresholds.
If skip = true
, results
may be an empty group.
However, if any of metrics
, thresholds
or discards
is present, they should follow the requirements listed above.
Updated in version 2.1, with the following changes from the previous version:
- Support skipping of the QC step.