New NCBI to AnnData tutorial #4480

hexhowells · 2023-11-03T12:30:24Z

New tutorial that takes raw NCBI data and processes it into an AnnData object with metadata annotations. It still requires some proofreading and minor updates!

shiltemann · 2023-11-09T10:52:03Z

@hexhowells awesome, thanks a lot!

Could you put a copy of the input data on Zenodo? We have automations that populate a Galaxy Data Library with tutorial data from Zenodo. This way, learners with slow or unreliable internet connections can import the data directly from the data library instead of downloading it over their own networks.

hexhowells · 2023-11-10T12:41:32Z

I've made a Zenodo record and linked to it in the tutorial. It requires review before being published, but should be good after that!

mtekman

Looks good! I just had some small word changes.

Question: do the users really need to manually download the data first, or is that part an optional step in the tutorial?

topics/single-cell/tutorials/scrna-ncbi-anndata/tutorial.md

hexhowells · 2023-11-10T13:13:31Z

Downloading the data manually is an optional step. However, being able to find and download the relevant data is part of the tutorial for those who may not be familiar with the process.

shiltemann · 2023-11-10T13:31:29Z

@hexhowells agreed, still important to show the standard process. The Zenodo copy can be the fallback option.

You should also be able to import the tar file from NCBI/GEO directly into Galaxy via URL, then use Galaxy's "Unzip" tool to unpack it into a collection with all the individual files.

hexhowells · 2023-11-10T14:44:15Z

@shiltemann I was initially going to use the Unzip tool in Galaxy, but I was getting errors when trying to unzip the .GZ files. I'm not entirely sure what is causing the issue.
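
For anyone who wants to check the archive outside Galaxy, here is a minimal local sketch of unpacking a tar archive whose members are individually gzip-compressed. It does not diagnose the Unzip tool error mentioned above. The function names and the demo file name and content are made up for illustration, and the assumption that the archive is a plain tar of `.gz` members comes only from the description in this thread.

```python
import gzip
import io
import tarfile


def unpack_geo_tar(tar_bytes: bytes) -> dict[str, bytes]:
    """Return {member name without .gz: decompressed bytes} for a tar of gzip files."""
    unpacked = {}
    # "r:*" lets tarfile detect the archive compression (if any) on its own.
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r:*") as archive:
        for member in archive.getmembers():
            if not member.isfile():
                continue  # skip directories, links, and other non-file entries
            data = archive.extractfile(member).read()
            name = member.name
            if name.endswith(".gz"):
                # Each member is compressed separately, so decompress it here.
                data = gzip.decompress(data)
                name = name[: -len(".gz")]
            unpacked[name] = data
    return unpacked


def build_demo_tar() -> bytes:
    """Build a tiny in-memory tar holding one gzip-compressed text file (made-up content)."""
    payload = gzip.compress(b"GENE\tCELL1\nACTB\t3\n")
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as archive:
        info = tarfile.TarInfo(name="demo_dge.txt.gz")  # hypothetical file name
        info.size = len(payload)
        archive.addfile(info, io.BytesIO(payload))
    return buffer.getvalue()


files = unpack_geo_tar(build_demo_tar())
print(sorted(files))
```

If this two-step unpacking works locally on the real archive, the problem is more likely on the Galaxy side than in the files themselves.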

pavanvidem

A very useful tutorial! Thanks @hexhowells !!

topics/single-cell/tutorials/scrna-ncbi-anndata/tutorial.md

pavanvidem · 2023-11-10T13:32:05Z

topics/single-cell/tutorials/scrna-ncbi-anndata/tutorial.md

+GSM5353214_PA_AUG_PB_1A_S1.dge.txt
+GSM5353215_PA_AUG_PB_1B_S2.dge.txt
+GSM5353216_PA_PB1A_Pool_1_3_S50_L002_dge.txt
+GSM5353217_PA_PB1A_Pool_2_S107_L004_dge.txt
+GSM5353218_PA_PB1B_Pool_1_2_S74_L003_dge.txt
+GSM5353219_PA_PB1B_Pool_2_S24_L001_dge.txt
+GSM5353220_PA_PB1B_Pool_3_S51_L002_dge.txt
+GSM5353221_PA_PB2A_Pool_1_3_S25_L001_dge.txt
+GSM5353222_PA_PB2B_Pool_1_3_S52_L002_dge.txt
+GSM5353223_PA_PB2B_Pool_2_S26_L001_dge.txt


If we keep the manual download, then please move this step under the "Obtaining the Data" section, so that the order is as follows: get the data into Galaxy (either by manual download or from Zenodo), then look at the metadata, and finally add tags.

This was added later because there are 53 raw files but only 10 are needed for the tutorial. The process of figuring out which files are needed is covered in the "Understanding the Data" section. I'll see if I can reorganise it so that it makes better sense.
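
As a rough illustration of narrowing the raw files down to the ten listed above, the sketch below filters file names by their GSM accession. The accession range is read off the file list quoted in this thread. The function name and the unrelated example file name are hypothetical, and this is not how the tutorial itself selects the files.

```python
import re

# Accessions GSM5353214 to GSM5353223, taken from the file list quoted above.
WANTED = {f"GSM{number}" for number in range(5353214, 5353224)}


def select_tutorial_files(filenames):
    """Keep only the files whose leading GSM accession is in WANTED."""
    selected = []
    for name in filenames:
        match = re.match(r"(GSM\d+)_", name)
        if match and match.group(1) in WANTED:
            selected.append(name)
    return sorted(selected)


example = [
    "GSM5353214_PA_AUG_PB_1A_S1.dge.txt",
    "GSM5353223_PA_PB2B_Pool_2_S26_L001_dge.txt",
    "GSM0000001_unrelated_dge.txt",  # hypothetical name, not part of the dataset
]
print(select_tutorial_files(example))
```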

topics/single-cell/tutorials/scrna-ncbi-anndata/tutorial.md

Co-authored-by: Pavankumar Videm <[email protected]>

topics/single-cell/tutorials/scrna-ncbi-anndata/tutorial.md

nomadscientist · 2023-12-01T00:03:03Z

topics/single-cell/tutorials/scrna-ncbi-anndata/tutorial.md

+>         - *"Find pattern"*: `batch`
+>         - *"Replace with"*: `replicate`
+>
+> 2. {% tool [Cut](Cut1) %} with the following parameters:


I suspect you could speed this section up by removing the 'Cut' option and using the Multi-Join (combine multiple files) tool (Galaxy Version 1.1.1) instead, using the initial barcode column as the key, and THEN cutting once to remove it
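
The idea of joining on the barcode key and cutting only once can be sketched in plain Python as below. This illustrates the data flow only and does not reproduce the Galaxy Multi-Join tool. The inner-join behaviour, the function names, and the barcode and column values are all assumptions made up for the example.

```python
def multi_join(tables):
    """Join tables on their first column, keeping keys present in every table.

    Each table is a list of rows, and each row is a list of strings whose
    first element is the barcode key. Row order follows the first table.
    """
    lookups = [{row[0]: row[1:] for row in table} for table in tables]
    joined = []
    for row in tables[0]:
        key = row[0]
        if all(key in lookup for lookup in lookups):
            merged = [key]
            for lookup in lookups:
                merged.extend(lookup[key])
            joined.append(merged)
    return joined


def cut_key(rows):
    """Drop the key column once, after all tables have been joined."""
    return [row[1:] for row in rows]


# Made-up barcodes and values, purely for illustration.
batch = [["AAAC", "1"], ["AAAG", "2"]]
replicate = [["AAAC", "PB1A"], ["AAAG", "PB1B"]]

joined = multi_join([batch, replicate])
metadata = cut_key(joined)
print(metadata)
```

Joining first and cutting last means the key column is removed in a single step, instead of being cut from each table separately before the tables are pasted together.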

nomadscientist · 2023-12-01T00:04:25Z

topics/single-cell/tutorials/scrna-ncbi-anndata/tutorial.md

+
+Lets now add the replicate column which tells us which rows are part of pools of the same patient and tumor location.
+
+> <hands-on-title>Create replicate metadata</hands-on-title>


A few screenshots showing where you get this data from in the spreadsheet would be good. If you needed lines from the paper or the supplemental data/figures/text to help you decode what the hell was happening, screenshots of those would be good too. It's about walking the user through how you were able to figure this out.

mtekman

Nice tutorial. I think it just gets a bit long with the commands; maybe reward the user with some kind of image, info box, or something else that helps them feel satisfied that the large-parameter tool they just executed did something big.

topics/single-cell/tutorials/scrna-ncbi-anndata/tutorial.md

mtekman · 2023-12-07T18:21:18Z

topics/single-cell/tutorials/scrna-ncbi-anndata/tutorial.md

+>
+{: .hands_on}
+
+We will now add a column to indicate which sample each row came from using the sample ID's described earlier.


Such a big command should be rewarded with some kind of image or screenshot or something to engage the user again.

mtekman · 2023-12-07T18:21:27Z

topics/single-cell/tutorials/scrna-ncbi-anndata/tutorial.md

+> 3. **Rename** {% icon galaxy-pencil %} output `Specimen Metadata`
+>
+{: .hands_on}
+


Same comment as above

topics/single-cell/tutorials/scrna-ncbi-anndata/tutorial.md

Co-authored-by: mtekman <[email protected]>

pavanvidem · 2023-12-12T15:39:20Z

Workflow and tests are missing. @hexhowells do you have a workflow ready?

hexhowells · 2023-12-12T16:13:46Z

@pavanvidem Yes the tutorial was made from this workflow I built: https://usegalaxy.eu/u/hexhowells/h/ncbi

mtekman · 2023-12-12T18:41:10Z

topics/single-cell/images/scrna-ncbi-anndata/cell_metadata.png

nice addition!

pavanvidem · 2023-12-12T20:30:20Z

@hexhowells I guess moving the tip snippet out of the "Tag your datasets" tip should fix the linting error.

pavanvidem

That hopefully fixes the linting error.

topics/single-cell/tutorials/scrna-ncbi-anndata/tutorial.md

Co-authored-by: Pavankumar Videm <[email protected]>

pavanvidem

Fixed the links to the first 2 files, tested the tutorial, and added links to an example history and the workflow. Ready to merge! Thanks @hexhowells for this great tutorial!

bgruening · 2023-12-13T18:43:37Z

🎉 !!!

hexhowells and others added 7 commits October 23, 2023 11:01

init NCBI to AnnData tutorial

6ac1f10

add introduction and initial descriptions

94c9ee2

add more information

e189f93

Merge branch 'galaxyproject:main' into NCBI-to-AnnData

6cd049a

update tutorial

219d645

Merge branch 'galaxyproject:main' into NCBI-to-AnnData

b458fc8

Merge branch 'NCBI-to-AnnData' of https://github.com/hexhowells/train…

0c96981

…ing-material into NCBI-to-AnnData

github-actions bot added the single-cell label Nov 3, 2023

fix linting issues

6e2e685

shiltemann marked this pull request as ready for review November 9, 2023 10:42

shiltemann requested a review from a team as a code owner November 9, 2023 10:42

hexhowells added 2 commits November 10, 2023 12:09

add bibliography

2c7175f

update tutorial

2f5567a

mtekman reviewed Nov 10, 2023

pavanvidem reviewed Nov 10, 2023

hexhowells and others added 8 commits November 10, 2023 15:38

updated tutorial

ac64076

Update topics/single-cell/tutorials/scrna-ncbi-anndata/tutorial.md

9b0ac04

Co-authored-by: Pavankumar Videm <[email protected]>

Update topics/single-cell/tutorials/scrna-ncbi-anndata/tutorial.md

7e760ea

Co-authored-by: Pavankumar Videm <[email protected]>

Update topics/single-cell/tutorials/scrna-ncbi-anndata/tutorial.md

d98ae6d

Co-authored-by: Pavankumar Videm <[email protected]>

include all 10 samples in example image

c32381b

Update tool version

660ae61

Co-authored-by: Pavankumar Videm <[email protected]>

update replicate tool parameters

270fc05

add comment box about parameter usage

bdd9b65

hexylena reviewed Nov 20, 2023

hexylena reviewed Nov 20, 2023

nomadscientist added 2 commits November 30, 2023 23:51

fix grammar

c1d2530

more details icon

169f4ed

nomadscientist reviewed Dec 1, 2023

nomadscientist added 2 commits December 1, 2023 00:11

remove default tool options

7e32d85

minor text edits

fc2e065

mtekman reviewed Dec 7, 2023

hexhowells and others added 4 commits December 12, 2023 12:21

Update topics/single-cell/tutorials/scrna-ncbi-anndata/tutorial.md

2c4560d

Co-authored-by: mtekman <[email protected]>

Update topics/single-cell/tutorials/scrna-ncbi-anndata/tutorial.md

315a310

Co-authored-by: mtekman <[email protected]>

Update topics/single-cell/tutorials/scrna-ncbi-anndata/tutorial.md

fc40054

Co-authored-by: mtekman <[email protected]>

left align table

774294d

add explanation of NCBI database

9561836

add screenshot of cell metadata

1cd21b6

mtekman reviewed Dec 12, 2023

pavanvidem and others added 5 commits December 12, 2023 20:08

add workflow

cd3728b

add more detail about finding metadata

3f3c8fb

Merge branch 'NCBI-to-AnnData' of https://github.com/hexhowells/train…

ce4296b

…ing-material into NCBI-to-AnnData

add details about replicate files

e59c559

fix typos

12536a6

pavanvidem reviewed Dec 12, 2023

hexhowells and others added 4 commits December 13, 2023 10:19

Update topics/single-cell/tutorials/scrna-ncbi-anndata/tutorial.md

d482a13

Co-authored-by: Pavankumar Videm <[email protected]>

Fix linting error

a5579e7

Co-authored-by: Pavankumar Videm <[email protected]>

Update to a working worflow and change test files type to tabular

5503254

Update editors and testers; also added links to workflow and example …

8622962

…history

pavanvidem approved these changes Dec 13, 2023

bgruening merged commit 9db8751 into galaxyproject:main Dec 13, 2023
3 checks passed
