check spelling
lldelisle committed Sep 12, 2023
1 parent 5141c08 commit 3457a62
Showing 2 changed files with 9 additions and 9 deletions.
16 changes: 8 additions & 8 deletions workflows/transcriptomics/rnaseq-sr/README.md
@@ -2,7 +2,7 @@

## Inputs dataset

-- The workflow needs a list of dataset of fastqsanger.
+- The workflow needs a list of datasets of fastqsanger.
- As well as a gtf file with genes
- Optional, but recommended: a gtf file with regions to exclude from normalization in Cufflinks.

@@ -15,21 +15,21 @@ chrM chrM_gene exon 0 16299 . - . gene_id "chrM_gene_minus"; transcript_id "chrM

## Inputs values

-- forward adapter sequence: this depends on the library preparation. Usually classical RNA libraries are Truseq and ISML (relatively new Illumina library) is Nextera. If you don't know, use FastQC to determine if it is Truseq or Nextera. If the read length is relatively short (50bp), there is probably no adapter.
+- forward adapter sequence: this depends on the library preparation. Usually classical Illumina RNA libraries are Truseq and ISML (relatively new Illumina library) is Nextera. If you don't know, use FastQC to determine if it is Truseq or Nextera. If the read length is relatively short (50bp), there is probably no adapter so it will not impact your results.
- reference_genome: this field will be adapted to the genomes available for STAR
- strandness: For stranded RNA, reverse means that the read is complementary to the coding sequence, forward means that the read is in the same orientation as the coding sequence. This will help you to get from STAR only the counts corresponding to your library preparation. This is also used for the stranded coverage and for FPKM computation with cufflinks/StringTie.
- cufflinks_FPKM: Whether you want to get FPKM with Cufflinks (pretty long)
-- stringtie_FPKM: Whether you want to get FPKM/TPM etc... with Cufflinks.
+- stringtie_FPKM: Whether you want to get FPKM/TPM etc... with Stringtie.
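
To illustrate how the strandness value might pick the right counts from STAR, here is a minimal Python sketch. It assumes STAR's `ReadsPerGene.out.tab` layout (gene ID, unstranded counts, counts for reads in the same orientation as the gene, counts for reads in the opposite orientation, preceded by summary rows whose names start with `N_`). The function name `select_counts` and the strandness labels are hypothetical, and this is not presented as the workflow's actual implementation.

```python
import csv
import io

# Column index in STAR's ReadsPerGene.out.tab for each strandness choice.
# The labels on the left are assumptions made for this sketch.
COLUMN_FOR_STRANDNESS = {"unstranded": 1, "forward": 2, "reverse": 3}


def select_counts(handle, strandness):
    """Return {gene_id: count} for the chosen strandness, skipping N_* summary rows."""
    column = COLUMN_FOR_STRANDNESS[strandness]
    counts = {}
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("N_"):
            continue  # drop N_unmapped, N_multimapping, N_noFeature, N_ambiguous
        counts[row[0]] = int(row[column])
    return counts


# Made-up example data, only to show the shape of the table.
example = io.StringIO(
    "N_unmapped\t10\t10\t10\n"
    "N_multimapping\t5\t5\t5\n"
    "N_noFeature\t3\t40\t4\n"
    "N_ambiguous\t1\t0\t0\n"
    "geneA\t100\t2\t98\n"
    "geneB\t50\t49\t1\n"
)
reverse_counts = select_counts(example, "reverse")
print(reverse_counts)
```

Dropping the `N_*` rows and keeping a single count column gives a two-column table of gene and count, which is roughly the shape of HTSeq-count output.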

## Processing

-- The workflow will remove adapters and low quality bases and filter out any read smaller than 15bp
-- The filtered reads are mapped with STAR with ENCODE parameters (for long RNA-seq but I use it for short also). STAR is also used to count reads per gene and stranded specific normalized coverage (on uniquely mapped reads).
+- The workflow will remove adapters and low quality bases and filter out any read smaller than 15bp.
+- The filtered reads are mapped with STAR with ENCODE parameters (for long RNA-seq but I use it for short also). STAR is also used to count reads per gene and generate stranded specific normalized coverage (on uniquely mapped reads).
- A multiQC is run to have an overview of the QC. This can also be used to get the strandness.
-- FPKM values for tenes and transcripts are computed with cufflinks using correction for multi-mapped reads (optionnal).
-- FPKM/TMP values for genes are computed with SstringTie.
+- FPKM values for genes and transcripts are computed with cufflinks using correction for multi-mapped reads (this step is optionnal).
+- FPKM/TPM values for genes are computed with StringTie (this step is optional).
- The BAM is filtered to keep only uniquely mapped reads (tag NH:i:1).
-- Coverage unstranded, and each strand independently is computed with bedtools and normalized to the number of million uniquely mapped reads.
+- Coverage unstranded is computed with bedtools and normalized to the number of million uniquely mapped reads.
- The three coverage files are converted to bigwig.
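
The unique-read filter and the per-million normalization described in the list above can be sketched in plain Python. This is only an illustration under stated assumptions: the helper names `is_uniquely_mapped` and `normalize_bedgraph` are hypothetical, the input is assumed to be SAM text lines and four-column bedGraph lines, and the workflow itself relies on dedicated tools rather than code like this.

```python
def is_uniquely_mapped(sam_line):
    """True if the alignment carries the optional tag NH:i:1 (one reported hit)."""
    fields = sam_line.rstrip("\n").split("\t")
    # Optional tags come after the 11 mandatory SAM columns.
    return "NH:i:1" in fields[11:]


def normalize_bedgraph(lines, n_unique_reads):
    """Scale bedGraph values to coverage per million uniquely mapped reads."""
    if n_unique_reads <= 0:
        raise ValueError("n_unique_reads must be positive")
    factor = 1_000_000 / n_unique_reads
    for line in lines:
        chrom, start, end, value = line.rstrip("\n").split("\t")
        yield f"{chrom}\t{start}\t{end}\t{float(value) * factor:g}"


# Made-up alignments: r1 maps once, r2 maps to three places.
sam = [
    "r1\t0\tchr1\t1\t255\t10M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII\tNH:i:1",
    "r2\t0\tchr1\t50\t1\t10M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII\tNH:i:3",
]
unique = [line for line in sam if is_uniquely_mapped(line)]

# Made-up raw coverage, normalized as if 2 million reads were uniquely mapped.
raw = ["chr1\t0\t100\t4", "chr1\t100\t200\t10"]
normalized = list(normalize_bedgraph(raw, 2_000_000))
```

With 2 million uniquely mapped reads the scale factor is 0.5, so a raw depth of 4 becomes 2 in the normalized track.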

### Warning
2 changes: 1 addition & 1 deletion workflows/transcriptomics/rnaseq-sr/rnaseq-sr.ga
@@ -1,6 +1,6 @@
{
"a_galaxy_workflow": "true",
-"annotation": "This workflow takes as input a list of single-reads fastqs. Adapters and bad quality bases are removed with cutadapt. Reads are mapped with STAR with ENCODE parameters and genes are counted simultaneously as well as normalized coverage (per million mapped reads) on uniquely mapped reads. The counts are reprocessed to be similar to HTSeq-count output. FPKM are computed with cufflinks and/or with StringTie.",
+"annotation": "This workflow takes as input a list of single-reads fastqs. Adapters and bad quality bases are removed with cutadapt. Reads are mapped with STAR with ENCODE parameters and genes are counted simultaneously as well as normalized coverage (per million mapped reads) on uniquely mapped reads. The counts are reprocessed to be similar to HTSeq-count output. FPKM are computed with cufflinks and/or with StringTie. The unstranded normalized coverage is computed with bedtools.",
"creator": [
{
"class": "Person",