Prepare Monocle 3.0 for Cell Ranger 3.0 #267

Open: wants to merge 3 commits into base branch monocle3_alpha


evolvedmicrobe
Contributor

This is the same as #243, but for this branch.

Hopefully this will help resolve issues like #265.

10X has released a new version of CellRanger that changes the output format of its matrices. They have also deprecated the cellrangerRkit R package, so it can no longer help users load their data into monocle. This pull request makes Monocle compatible with the new version and avoids having users download a separate R package by moving the data-loading functionality of cellrangerRkit into monocle, updated to handle CellRanger 3.0 data.

In particular, the following changes were made to the output file format in 3.0:

**Text File Formats** - To save disk space, the sparse matrix and barcode text files are now gzipped. As R automatically identifies and correctly reads gzipped files, no changes were needed to account for this other than appending a `.gz` suffix to the file names when necessary. Additionally, to accommodate experiments with "multimodal" datasets, the `genes.tsv` file becomes the `features.tsv` file. This file contains an additional column describing the type of feature referred to in that row of the matrix.

**Feature Data** - CellRanger now supports feature barcoding data (e.g. CRISPR/Antibody/Dextramer) in addition to standard Gene Expression data.
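As a rough illustration of the two format changes above, the sketch below writes a tiny gzipped feature table to a temporary file and reads it back with base R, then keeps only the Gene Expression rows. The feature IDs and names are made up, and the column names `id`, `name`, and `feature_type` are labels chosen for this example; this is not the code added by the pull request.

```r
# Write a two-row, three-column feature table to a gzipped temporary file.
# The IDs and names are invented for illustration only.
tmp <- tempfile(fileext = ".tsv.gz")
con <- gzfile(tmp, "w")
writeLines(c("FAKE_ID_1\tGENE1\tGene Expression",
             "FAKE_ID_2\tAB1\tAntibody Capture"), con)
close(con)

# read.delim() opens the path through file(), which detects gzip
# compression on its own, so no explicit decompression step is needed.
features <- read.delim(tmp, header = FALSE, stringsAsFactors = FALSE,
                       col.names = c("id", "name", "feature_type"))

# The third column gives the feature type of each row; keep only the
# gene expression rows, dropping feature barcoding rows.
gene_rows <- features[features$feature_type == "Gene Expression", ]
print(gene_rows)
```

Filtering on the third column is one simple way to separate gene expression features from feature barcoding features in a multimodal dataset.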

To replace the functionality of cellrangerRkit, this pull request adds a new function to `monocle` called `load_cellranger_data`. It behaves similarly to the old cellrangerRkit function `load_cellranger_matrix`, with a few important distinctions. (The name was changed slightly to avoid confusion while still hinting at the strong similarity.)

1 - **It directly returns a `CellDataSet` object.**  Rather than having the user convert the result after the fact, it loads the data directly into a `CellDataSet`.

2 - **It ignores Feature Barcoding data.**  As this is a new feature in CellRanger 3.0, it is not loaded into monocle for now.

3 - **It transparently handles v2.0 vs v3.0 data.**  Although the formats are different, the function detects the version used and loads the data appropriately.
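The version handling in point 3 could be approached by checking which file names are present in the matrix directory. The sketch below is one plausible way to do that in base R; `detect_matrix_version` is a hypothetical helper written for this illustration, and the detection logic actually used by `load_cellranger_data` in this pull request may differ.

```r
# Hypothetical helper: guess the CellRanger output version from the
# file names found in a matrix directory. Not part of this pull request.
detect_matrix_version <- function(matrix_dir) {
  # v3.0 output: gzipped files, with features.tsv.gz replacing genes.tsv
  v3_files <- file.path(matrix_dir,
                        c("matrix.mtx.gz", "barcodes.tsv.gz", "features.tsv.gz"))
  # v2.0 output: uncompressed files with genes.tsv
  v2_files <- file.path(matrix_dir,
                        c("matrix.mtx", "barcodes.tsv", "genes.tsv"))

  if (all(file.exists(v3_files))) {
    return("3")
  }
  if (all(file.exists(v2_files))) {
    return("2")
  }
  stop("No complete set of CellRanger matrix files found in: ", matrix_dir)
}

# Build two empty mock directories to exercise both branches.
v2_dir <- tempfile("v2_matrix_")
dir.create(v2_dir)
invisible(file.create(file.path(v2_dir,
                                c("matrix.mtx", "barcodes.tsv", "genes.tsv"))))

v3_dir <- tempfile("v3_matrix_")
dir.create(v3_dir)
invisible(file.create(file.path(v3_dir,
                                c("matrix.mtx.gz", "barcodes.tsv.gz", "features.tsv.gz"))))

print(detect_matrix_version(v2_dir))
print(detect_matrix_version(v3_dir))
```

Checking for the v3.0 file set first means a directory that somehow contains both layouts is treated as v3.0; that ordering is a choice made for this sketch.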

Very small test data files and associated tests were added to verify the expected behavior.
This was in the documentation and has no effect.