New tutorial - single cell data import and format conversion #4590
Thanks @shiltemann! I've added a wee news post to this PR and updated some of the contributions to tutorials :)
Hi, this tutorial is very much needed in the community. Every now and then we see questions about getting data into Galaxy and converting it. A very good start!!
I like the concept of this tutorial, but it would be nice to have alternative hands-on boxes for each type of conversion. The tutorial seems to be built to work with the other tutorials based on the EBI tools, but some users may just need to perform basic clustering with the PBMC tutorial. Some of the conversions can be done in a single click with the latest SCEasy, which produces the latest version of AnnData that is compatible with the PBMC tutorial.
Currently, these seem to be the only conversion options. Do you think it makes sense to add an alternative hands-on with the SCEasy converter (with choose your tutorial)? Then we could produce both the latest and the old versions of AnnData.
> 3. Alternatively, you can import history where we created the Seurat object: [Input history](https://singlecell.usegalaxy.eu/u/j.jakiela/h/ebi-scxa---anndata-scanpy-or-seurat-object-1)
>
> {% snippet faqs/galaxy/histories_import.md %}
>
By default, Galaxy detected this file as rds format, so the SCEasy tool did not accept it as input. An additional step to set the datatype to rdata would be nice.
Edit: I saw that you added a new trick for running tools on an unsupported file format.
Thanks for spotting that - I can add a note for the users to change the datatype to rdata when importing a file in this way.
Most of our conversions involve extracting tables from different data objects and importing them into the target object.

First, we will extract observations (cell metadata) and the full matrix from our AnnData.
Is there any reason to extract the matrix and then use the DropletUtils and Seurat Read10x tools? This can be done in a single step with the new SCEasy converter.
Indeed, the idea was to show the manual conversion to give users an idea of the structure of those objects and how they are related.
The new SCEasy converter is introduced first in this tutorial, because it's the easiest to use and probably covers all the conversions users might need. I thought it was straightforward enough to mention it at the beginning and then just show the other methods, but your comment gave me the idea of adding an additional hands-on box here that shows the one-step conversion with the new SCEasy converter.
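To illustrate the structure behind the manual route discussed here, below is a minimal Python sketch of how a cells-by-genes matrix plus its cell and gene identifiers map onto the three 10x-style files that Read10x-type tools expect. It is only a toy under stated assumptions: the function name `to_10x_files` and the example identifiers are hypothetical, plain lists stand in for a real AnnData object, and the gene symbol column simply repeats the gene ID. It is not what the Galaxy tools do internally.

```python
from io import StringIO


def to_10x_files(matrix, cell_ids, gene_ids):
    """Return (matrix_mtx, barcodes_tsv, genes_tsv) as strings.

    `matrix` is cells x genes (one row per cell), mirroring the layout of
    AnnData's X. 10x-style matrices are genes x cells, so the coordinates
    are swapped when the entries are written out.
    """
    entries = []
    for c, row in enumerate(matrix):
        for g, value in enumerate(row):
            if value != 0:
                # Matrix Market coordinates are 1-based: gene row, cell column, count
                entries.append((g + 1, c + 1, value))

    out = StringIO()
    out.write("%%MatrixMarket matrix coordinate integer general\n")
    # Size line: number of genes, number of cells, number of non-zero entries
    out.write(f"{len(gene_ids)} {len(cell_ids)} {len(entries)}\n")
    for g, c, v in entries:
        out.write(f"{g} {c} {v}\n")

    barcodes = "".join(f"{cid}\n" for cid in cell_ids)
    # Assumption: no separate gene symbols are available, so the ID is repeated
    genes = "".join(f"{gid}\t{gid}\n" for gid in gene_ids)
    return out.getvalue(), barcodes, genes


# Hypothetical toy data: 2 cells x 3 genes
mtx, barcodes, genes = to_10x_files(
    [[0, 2, 0], [1, 0, 3]],
    cell_ids=["AAAC", "TTTG"],
    gene_ids=["g1", "g2", "g3"],
)
print(mtx)
```

The point of the sketch is the transposition: AnnData stores cells as rows, while the 10x-style matrix stores genes as rows, which is why the manual route extracts the tables first and reassembles them for the target object.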
>
{: .hands_on}

Finally, let's combine those files that we have just generated and turn them into the SingleCellExperiment!
Alternatively, this can be done with SCEasy: AnnData -> Seurat -> SCE.
Right, will add an additional hands-on box!
>
{: .hands_on}

And that's it! The latest tool {% tool [SCEasy Converter](toolshed.g2.bx.psu.edu/repos/iuc/sceasy_convert/sceasy_convert/0.0.7+galaxy2) %} will do the same but the output file will be the newest AnnData version and will not work with the tool used below.
Maybe good to have an alternative hands-on with the latest SCEasy converter tool too.
I only mentioned the latest SCEasy here, without a hands-on box, because I thought it's similar enough to the old one that users will know how to use it. It would also produce AnnData output that isn't compatible with the Filter, Plot, Explore workflow, and the main aim of this section was to do the conversion and pass the converted object into that workflow. As you pointed out previously, that workflow needs an older version of AnnData, which is why the older version of SCEasy is used here.
> <hands-on-title> Modify AnnData object to make it compatible with Filter, Plot, Explore workflow </hands-on-title>
>
> 1. {% tool [AnnData Operations](toolshed.g2.bx.psu.edu/repos/ebi-gxa/anndata_ops/anndata_ops/1.8.1+galaxy92) %}
Did it work for you? It failed for me.
Oh no!
It did work for me: https://singlecell.usegalaxy.eu/u/j.jakiela/h/seurat---anndata - see dataset 266 (old SCEasy conversion), then dataset 268 (AnnData Operations); all the datasets that come after are outputs of the Filter, Plot, Explore workflow.
* Forbid ?s fixes galaxyproject#4658
* Fixes galaxyproject#4664
* fix raw tag

Co-authored-by: Helena Rasche <[email protected]>
Apologies for updating this PR for so long, but I think I'm done!
That linting error is fine 👍
A tutorial describing the most common single-cell datatypes, how to import data using the EBI SCXA and HCA tools, and how to convert between the formats.
It can be left as one big tutorial or split into separate smaller ones.
@nomadscientist
UPDATES AFTER THE REVIEW: