Overview

We expect a crispr_pca HDF5 group at the root of the file, describing how the PCA was performed on the CRISPR guide count matrix. The group contains two subgroups, parameters and results.

Definitions:

  • crispr_available: whether CRISPR data is present in the dataset. This is typically determined by examining the inputs step.
  • num_cells: number of cells remaining after QC filtering. This is typically determined from the cell_filtering step.

Parameters

parameters should contain:

  • num_pcs: a scalar integer containing the number of PCs to compute.
  • block_method: a scalar string specifying the method to use when dealing with multiple blocks in the dataset. This may be "none", "regress" or "weight".
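The parameter requirements above can be sketched as a small check. This is a minimal illustration, not part of the specification: it assumes the parameters subgroup has already been read into a plain Python dict (reading the HDF5 file itself would need an HDF5 library), and the names check_parameters and ALLOWED_BLOCK_METHODS are hypothetical.

```python
# Hypothetical helper: the spec only lists the three allowed strings.
ALLOWED_BLOCK_METHODS = {"none", "regress", "weight"}


def check_parameters(parameters):
    """Return a list of problems found in a parameters mapping (empty if valid).

    `parameters` is assumed to be a plain dict standing in for the
    `parameters` HDF5 subgroup after its scalars have been read.
    """
    problems = []

    num_pcs = parameters.get("num_pcs")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if not isinstance(num_pcs, int) or isinstance(num_pcs, bool):
        problems.append("num_pcs must be a scalar integer")

    block_method = parameters.get("block_method")
    if not isinstance(block_method, str):
        problems.append("block_method must be a scalar string")
    elif block_method not in ALLOWED_BLOCK_METHODS:
        problems.append(
            "block_method must be one of " + ", ".join(sorted(ALLOWED_BLOCK_METHODS))
        )

    return problems
```

For example, check_parameters({"num_pcs": 10, "block_method": "weight"}) returns an empty list, while a non-integer num_pcs or an unrecognized block_method each add one entry.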

Results

If crispr_available = false, results should be empty.

Otherwise, results should contain:

  • pcs: a 2-dimensional float dataset containing the PC coordinates in a row-major layout. Each row corresponds to a cell (after QC filtering), with num_cells rows in total. Each column corresponds to a PC, with at most num_pcs columns in total (possibly fewer). PCs may be computed with block-specific weights or regression, depending on block_method.
  • var_exp: a float dataset of length equal to the number of PCs, containing the percentage of variance explained by each PC.
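The shape constraints on results can be sketched in the same spirit. This is an illustrative check only, under stated assumptions: results is a plain dict standing in for the results subgroup, pcs is a list of rows (one per cell), var_exp is a flat list, and check_results is a hypothetical name. Element types are not checked here.

```python
def check_results(results, crispr_available, num_cells, num_pcs):
    """Return a list of problems found in a results mapping (empty if valid).

    `results` is assumed to be a plain dict standing in for the `results`
    HDF5 subgroup, with `pcs` as a list of rows and `var_exp` as a list.
    """
    problems = []

    # Without CRISPR data, the results group should hold nothing.
    if not crispr_available:
        if results:
            problems.append("results must be empty when crispr_available is false")
        return problems

    pcs = results.get("pcs")
    var_exp = results.get("var_exp")
    if pcs is None or var_exp is None:
        problems.append("results must contain pcs and var_exp")
        return problems

    # One row per cell that passed QC filtering.
    if len(pcs) != num_cells:
        problems.append("pcs must have num_cells rows")

    # A nested list cannot express a column count when it has no rows,
    # so the column checks are skipped in that case.
    widths = {len(row) for row in pcs}
    if len(widths) > 1:
        problems.append("pcs rows must all have the same length")
    elif widths:
        ncols = widths.pop()
        if ncols > num_pcs:
            problems.append("pcs must have at most num_pcs columns")
        if len(var_exp) != ncols:
            problems.append("var_exp length must equal the number of PCs")

    return problems
```

With three cells and two PCs stored, check_results({"pcs": [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]], "var_exp": [60.0, 25.0]}, True, 3, 5) returns an empty list, since having fewer columns than num_pcs is permitted.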

History

Added in version 3.0.