Contents are stored inside an inputs
HDF5 group at the root of the file.
The cell_labelling
group itself contains the parameters
and results
subgroups.
Definitions:
num_clusters
: number of clusters generated in the analysis. This is determined by thechoose_clustering
step.
parameters
should contain:
human_references
: a string dataset defining the human reference datasets used for labelling. Each entry contains the name of a reference dataset, e.g.,"BlueprintEncode"
.mouse_references
: a string dataset defining the mouse reference datasets used for labelling. Each entry contains the name of a reference dataset, e.g.,"ImmGen"
.
results
should contain:
per_reference
: a group containing the label assignments for each cluster in each reference. Each child is named after its corresponding reference as listed inparameters
, though not all references listed inparameters
need to be present here. Each child is a 1-dimensional string dataset of length equal tonum_clusters
, where each entry contains the assigned label for the corresponding cluster.
For multiple references of the relevant species, results
will also contain:
integrated
: a string dataset of length equal tonum_clusters
. This specifies the reference with the top-scoring label for each cluster, after integrating the results of all per-reference classifications.
Added in version 1.0.