Overview

We expect an adt_quality_control HDF5 group at the root of the file, containing information about the quality control metrics and filters derived from the ADT counts. The group itself contains the parameters and results subgroups.

No ADT data was available prior to version 2.0 of the format, so the adt_quality_control group may be absent in such files.

Definitions:

  • adt_available: whether ADTs are present in the dataset. This is typically determined by examining the inputs.
  • num_cells: number of cells in the dataset, prior to any filtering. This is typically determined from the inputs step.
  • num_samples: number of samples in the dataset. This is typically determined from the inputs step.

Parameters

parameters should contain:

  • igg_prefix: a scalar string containing the expected prefix for IgG features.
  • nmads: a scalar float specifying the number of MADs to use to define the QC thresholds.
  • min_detected_drop: a scalar float specifying the minimum relative drop in the number of detected features before a cell is considered to be low-quality.
  • skip: a scalar integer to be interpreted as a boolean, specifying whether to skip the QC for the ADTs.
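As an illustration, the checks below sketch how a reader might validate these four fields once they have been loaded from the parameters group. The function name `check_parameters` is hypothetical, and the sketch assumes the scalar datasets have already been read into a plain dict of Python values (with the string decoded); it does not touch HDF5 itself.

```python
from numbers import Integral, Real

def check_parameters(params):
    """Return a list of problems found in a dict of ADT QC parameters.

    `params` is assumed to hold values already read from the HDF5
    `parameters` group; this function is an illustrative helper only.
    """
    # Expected type for each field, following the list above.
    expected = {
        "igg_prefix": str,          # scalar string
        "nmads": Real,              # scalar float
        "min_detected_drop": Real,  # scalar float
        "skip": Integral,           # scalar integer, read as a boolean
    }
    problems = []
    for name, kind in expected.items():
        if name not in params:
            problems.append(f"missing '{name}'")
        elif not isinstance(params[name], kind):
            problems.append(f"'{name}' has the wrong type")
    return problems

def should_skip(params):
    """Interpret the integer `skip` field as a boolean."""
    return params["skip"] != 0
```

For example, `check_parameters({"igg_prefix": "IgG", "nmads": 3.0, "min_detected_drop": 0.1, "skip": 0})` returns an empty list, and `should_skip` on the same dict returns `False`.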

Results

If adt_available = false, results should be empty.

If adt_available = true, results should contain:

  • metrics, a group containing per-cell QC metrics derived from the ADT count data. This contains:
    • sums: a float dataset of length equal to num_cells, containing the total count for each cell.
    • detected: an integer dataset of length equal to num_cells, containing the total number of detected features for each cell.
    • igg_total: a float dataset of length equal to num_cells, containing the total count in IgG features.
  • thresholds, a group containing thresholds on the metrics for each sample. This group contains:
    • detected: a float dataset of length equal to num_samples, containing the threshold on the total number of detected features for each sample.
    • igg_total: a float dataset of length equal to num_samples, containing the threshold on the total counts in IgG features for each sample.
  • discards: an integer dataset of length equal to num_cells. Each value is interpreted as a boolean and specifies whether the corresponding cell would be discarded by the ADT-based filter thresholds.

If adt_available = true and skip = true, results may be an empty group. However, any of metrics, thresholds or discards that is present should still follow the requirements listed above.
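The rules for results can be sketched as a single check. This is a minimal illustration rather than a reference validator: `check_results` is a hypothetical name, `results` is assumed to be any nested mapping whose leaves support `len()` (plain dicts of lists are used here; an opened HDF5 group is expected to behave similarly but is not exercised), and dataset types are not checked.

```python
def check_results(results, adt_available, skip, num_cells, num_samples):
    """Return a list of problems found in a `results`-like nested mapping."""
    problems = []

    # Without ADTs, the group must be empty.
    if not adt_available:
        if len(results) > 0:
            problems.append("results should be empty when adt_available is false")
        return problems

    # Expected length of each dataset, mirroring the list above.
    layout = {
        "metrics": {"sums": num_cells, "detected": num_cells, "igg_total": num_cells},
        "thresholds": {"detected": num_samples, "igg_total": num_samples},
        "discards": num_cells,
    }

    for name, spec in layout.items():
        if name not in results:
            # Top-level entries may only be absent when QC was skipped.
            if not skip:
                problems.append(f"missing '{name}'")
            continue
        entry = results[name]
        if isinstance(spec, int):
            # A dataset stored directly under results (discards).
            if len(entry) != spec:
                problems.append(f"'{name}' has length {len(entry)}, expected {spec}")
            continue
        # A subgroup: anything present must be complete and correctly sized.
        for child, length in spec.items():
            if child not in entry:
                problems.append(f"missing '{name}/{child}'")
            elif len(entry[child]) != length:
                problems.append(
                    f"'{name}/{child}' has length {len(entry[child])}, expected {length}"
                )
    return problems
```

With `skip` set, an empty mapping passes, while a lone `discards` of the wrong length is still reported. This reflects the rule that skipping relaxes presence but not the requirements on whatever is present.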

History

Updated in version 2.1, with the following changes from the previous version:

  • Allow the QC step to be skipped.