Is MAESTRO compatible with 10X data derived from nuclear RNA? #98

Dazcam · 2021-01-07T11:46:01Z

Hello,

I'm currently installing the MAESTRO prerequisites and, after reading the paper, I'd like to ask if MAESTRO is compatible with 10X data derived from nuclear RNA, particularly if I'm looking to integrate single-modal snRNA- and snATAC-seq data?

And more specifically, could the use of a pre-mRNA reference and GTF files for alignment, as opposed to standard reference/annotation files, impact a MAESTRO analysis at all?

Until now I have been using Cell Ranger 4 for my analysis, which recommends using a pre-mRNA reference and GTF file for nuclear RNA. I had started creating STARsolo-compatible versions of these files for my MAESTRO analysis and wondered whether this is the best course of action, particularly as 10X have recently released v5, which includes a new option for handling intronic reads without the need for a pre-mRNA reference, and STARsolo provides a similar option.

Regardless, it would be useful to hear if you have any recommendations or points of interest that I should consider when running MAESTRO using single-nuclear data.

Many Thanks,

Darren

crazyhottommy · 2021-01-11T00:06:46Z

Hi,
MAESTRO uses STARsolo for scRNA-seq quantification. For single-nuclei data, you can add --soloFeatures GeneFull by manually editing the Snakefile after you initialize it; the relevant line is at https://github.com/liulab-dfci/MAESTRO/blob/master/MAESTRO/Snakemake/scRNA/Snakefile#L48

In the future, we should expose that as a parameter in the config.yaml file.
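
As a rough illustration of the suggestion above, the sketch below builds a STARsolo argument list in which `--soloFeatures` switches between `Gene` and `GeneFull`. The function name, its parameters, and the file paths are hypothetical and not taken from the MAESTRO Snakefile, which passes further options; only the STAR flags themselves are real.

```python
import shlex


def starsolo_args(sample, genome_dir, fastq_cdna, fastq_barcode, whitelist,
                  single_nucleus=False):
    """Build a STARsolo argument list (hypothetical helper, not MAESTRO code)."""
    # GeneFull counts reads over the whole gene body (exons and introns),
    # which suits nuclear RNA; Gene counts exonic reads only.
    feature = "GeneFull" if single_nucleus else "Gene"
    return [
        "STAR",
        "--genomeDir", genome_dir,
        # STARsolo expects the cDNA read first, then the barcode/UMI read.
        "--readFilesIn", fastq_cdna, fastq_barcode,
        "--soloType", "CB_UMI_Simple",
        "--soloCBwhitelist", whitelist,
        "--soloFeatures", feature,
        "--outSAMtype", "BAM", "SortedByCoordinate",
        "--outFileNamePrefix", f"Result/STAR/{sample}",
    ]


if __name__ == "__main__":
    # Illustrative paths only.
    args = starsolo_args("14510_PFC_RNA", "ref/star_index",
                         "R2.fastq", "R1.fastq", "whitelist.txt",
                         single_nucleus=True)
    print(shlex.join(args))
```

Driving the choice from a single boolean is one way such a setting could later be exposed in a config file.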

Thanks!

Dazcam · 2021-01-11T09:53:54Z

Many thanks for responding. I will add that option to the Snakefile today and see if it runs to completion. The pipeline hit the skids after the scrna_rseqc_genecov rule. Although that rule completed without error, the logs reported the following warning:

Cannot get coverage signal from 14510_PFC_RNAAligned.sortedByCoord.out.sample.bam ! Skip

	Sample	Skewness
@ 2021-01-09 00:14:17: Running R script ...

This is likely a mismatch between the BED and BAM files. It caused the pipeline to choke during the scrna_rseqc_plot rule, as the RNAGenebodyCoveragePlot could not be generated.

Error in `[.data.frame`(gene_cov, , 2) : undefined columns selected
Calls: RNAGenebodyCoveragePlot -> [ -> [.data.frame
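
One way to test the suspected BED/BAM mismatch, assuming it comes from differing chromosome naming (for example `chr1` versus `1`), is to compare the contig names in both files. The sketch below is a minimal, hypothetical check: the function names are made up for illustration, and the header lines are assumed to come from something like `samtools view -H` on the BAM file.

```python
def bed_contigs(bed_lines):
    """Collect chromosome names from BED lines, skipping headers and blanks."""
    names = set()
    for line in bed_lines:
        line = line.strip()
        if not line or line.startswith(("#", "track", "browser")):
            continue
        names.add(line.split()[0])
    return names


def sam_header_contigs(header_lines):
    """Collect reference names from the SN: fields of @SQ header lines."""
    names = set()
    for line in header_lines:
        if not line.startswith("@SQ"):
            continue
        for field in line.rstrip("\n").split("\t")[1:]:
            if field.startswith("SN:"):
                names.add(field[3:])
    return names


def shared_contigs(bed_lines, header_lines):
    """Return contig names present in both the BED file and the BAM header."""
    return bed_contigs(bed_lines) & sam_header_contigs(header_lines)


if __name__ == "__main__":
    # Toy data: the BED uses 'chr' prefixes, the BAM header does not.
    bed = ["chr1\t100\t200\tgeneA\t0\t+"]
    header = ["@HD\tVN:1.4\tSO:coordinate", "@SQ\tSN:1\tLN:248956422"]
    print(shared_contigs(bed, header) or "no contig names in common")
```

An empty intersection would be consistent with the "Cannot get coverage signal" warning, although other causes are possible.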

I also had a buffer size issue. I assume this is due to my samples being sequenced extremely deeply?

EXITING because of fatal error: buffer size for SJ output is too small
Solution: increase input parameter --limitOutSJcollapsed

I managed to solve it by adding the following option to the shell command of the scrna_map rule.

--limitOutSJcollapsed 5000000

Source here. May be worth adding this somewhere in config or docs?
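
If this were handled programmatically rather than by hand, a wrapper could scan the STAR log for the error and pick a larger buffer for a rerun. The sketch below is only an illustration: the function name, the multiplication factor, and the starting value of 1000000 (believed to be STAR's documented default for `--limitOutSJcollapsed`; check the manual for your version) are assumptions.

```python
SJ_ERROR = "buffer size for SJ output is too small"


def next_sj_limit(log_text, current_limit=1_000_000, factor=5):
    """Suggest a --limitOutSJcollapsed value for a rerun (hypothetical helper).

    Returns current_limit * factor if the SJ buffer error appears in the
    log text, otherwise returns current_limit unchanged.
    """
    if SJ_ERROR in log_text:
        return current_limit * factor
    return current_limit


if __name__ == "__main__":
    log = ("EXITING because of fatal error: buffer size for SJ output is too small\n"
           "Solution: increase input parameter --limitOutSJcollapsed\n")
    print("--limitOutSJcollapsed", next_sj_limit(log))
```

With the assumed defaults, the suggested value matches the 5000000 used above.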

Are you planning on adding ssclusteval to the pipeline?

Dazcam · 2021-01-13T08:38:06Z

UPDATE: 13th Jan 2021

When running with the --soloFeatures GeneFull parameter, the directory names of some of the output files change, so they no longer match what is specified in the Snakefile.

Instead of: Result/STAR/%sSolo.out/Gene/raw/matrix.mtx

They are stored in Result/STAR/%sSolo.out/GeneFull/raw/matrix.mtx

I think this only affects the scrna_map and scrna_qc rules.

Error message:

MissingOutputException in line 21 of /scratch/c.c1477909/maestro_analysis/14510_PFC_RNAv2/Snakefile:
Job Missing files after 5 seconds:
Result/STAR/14510_PFC_RNASolo.out/Gene/raw/matrix.mtx
Result/STAR/14510_PFC_RNASolo.out/Gene/raw/features.tsv
Result/STAR/14510_PFC_RNASolo.out/Gene/raw/barcodes.tsv
This might be due to filesystem latency. If that is the case, consider to increase the wait time with --latency-wait.
Job id: 0 completed successfully, but some output files are missing. 0
 
Removing output files of failed job scrna_map since they might be corrupted:
Result/STAR/14510_PFC_RNAAligned.sortedByCoord.out.bam, Result/STAR/14510_PFC_RNAAligned.sortedByCoord.out.bam.bai
Shutting down, this might take some time.

I have modified the Snakefile and am now running MAESTRO again.
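
For reference, the kind of change involved can be sketched as deriving the expected output paths from the feature type instead of hard-coding `Gene`. This is a hypothetical helper, not the actual Snakefile modification; only the directory layout is taken from the paths reported above.

```python
from pathlib import PurePosixPath


def solo_raw_outputs(sample, feature="Gene"):
    """Return the raw matrix paths STARsolo writes for a given feature type.

    'feature' should match the value passed to --soloFeatures,
    e.g. "Gene" or "GeneFull" (hypothetical helper, not MAESTRO code).
    """
    base = PurePosixPath("Result/STAR") / f"{sample}Solo.out" / feature / "raw"
    return [str(base / name)
            for name in ("matrix.mtx", "features.tsv", "barcodes.tsv")]


if __name__ == "__main__":
    for path in solo_raw_outputs("14510_PFC_RNA", feature="GeneFull"):
        print(path)
```

Using one such function for both the rule outputs and the downstream inputs keeps the scrna_map and scrna_qc rules in agreement about where the matrix lives.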

crazyhottommy · 2021-01-13T14:11:57Z

Thanks for reporting. We will keep this in mind and address it in our next release!

crazyhottommy · 2021-07-26T18:48:11Z

Hi, we just made a new release, MAESTRO 1.5.1, which supports single-nuclei data. Can you please give it a try?
Thanks!

Dazcam · 2021-07-27T14:48:02Z

Thanks for the update. Unfortunately, I had to abandon MAESTRO because of the issues I was having around the time I posted. I now have a well-developed pipeline of my own for my single-nuclei data, but I will keep an eye on MAESTRO's development and may consider using it in the future.

crazyhottommy · 2021-07-27T15:05:33Z

Thanks for the feedback!

njohnso6 · 2022-12-07T15:04:19Z

I got the same error when running the multiome pipeline with the newest version, 1.5.4 (only available on the macs3 fork):

EXITING because of fatal error: buffer size for SJ output is too small
Solution: increase input parameter --limitOutSJcollapsed

I have yet to try the solution proposed earlier. I will let you know.

crazyhottommy added the enhancement New feature or request label Jan 13, 2021

crazyhottommy self-assigned this Jan 13, 2021

crazyhottommy assigned crazyhottommy and baigal628 and unassigned crazyhottommy Jan 26, 2021

crazyhottommy closed this as completed Jul 27, 2021
