Ready for v5.3.0
SQANTI-reads ready for production. Updated to version 5.3.0 in all files that include the SQANTI3 version.
Fabian-RY authored Dec 4, 2024
2 parents 5c6544a + b15626d commit c0194f3
Showing 13 changed files with 110,612 additions and 14 deletions.
8 changes: 4 additions & 4 deletions Dockerfile
Original file line number Diff line number Diff line change
@@ -1,18 +1,18 @@
# Base image for SQANTI3/v5.2.2 with Ubuntu 22.04
# Base image for SQANTI3/v5.3.0 with Ubuntu 22.04

# Using ubuntu 22.04
# Right now edlib doesn't work with python 3.12 which is the default version
# of python in Ubuntu 24.04. edlib has had no updates since April 19, 2023
# so no compatibility is expected for the time being.
# Dockerfile originally developed by @skchronicles, updated and optimized
# for version v5.2.2 by @Fabian-RY
# for version v5.3.0 by @Fabian-RY

FROM ubuntu:22.04
SHELL ["/bin/bash", "--login" ,"-c"]

# Set the versions of different softwares dependencies and SQANTI3 version
# To install
ENV SQANTI3_VERSION="5.2.2"
ENV SQANTI3_VERSION="5.3.0"
ENV DESALT_VERSION="1.5.6"
ENV NAMFINDER_VERSION="0.1.3"

@@ -124,7 +124,7 @@ ENV PATH="${PATH}:/opt2/namfinder/${NAMFINDER_VERSION}/namfinder-${NAMFINDER_VER
WORKDIR /opt2

########### SQANTI3 (currently v${NAMFINDER_VERSION}) ############
# Installs SQANTI3 with the version defined in the ENV variable (currently 5.2.2)
# Installs SQANTI3 with the version defined in the ENV variable (currently 5.3.0)
# dependencies and requirements have already been
# satisfied, for more info see:
# https://github.com/ConesaLab/SQANTI3
3 changes: 2 additions & 1 deletion README.md
@@ -13,7 +13,7 @@ SQANTI3 is the newest version of the [SQANTI tool](https://www.ncbi.nlm.nih.gov/
SQANTI3 is the first module of the [Functional IsoTranscriptomics (FIT)](https://tappas.org/) framework, which also includes IsoAnnot and tappAS.

## Installation
The [latest SQANTI3 release](https://github.com/ConesaLab/SQANTI3/releases/tag/v5.2.2) (31/07/2024) is **version 5.2.2**. See our wiki for [installation instructions](https://github.com/ConesaLab/SQANTI3/wiki/Dependencies-and-installation).
The [latest SQANTI3 release](https://github.com/ConesaLab/SQANTI3/releases/tag/v5.3.0) (04/12/2024) is **version 5.3.0**. See our wiki for [installation instructions](https://github.com/ConesaLab/SQANTI3/wiki/Dependencies-and-installation).

For information about previous releases and the features introduced in them, see the [version history](https://github.com/ConesaLab/SQANTI3/wiki/Version-history).

@@ -56,3 +56,4 @@ If you are using SQANTI3 in your research, please cite the following paper in ad

- Pardo-Palacios, F.J., Arzalluz-Luque, A. et al. **SQANTI3: curation of long-read transcriptomes for accurate identification of known and novel isoforms**. *Nat Methods* (2024). https://doi.org/10.1038/s41592-024-02229-2

- Keil, N., Monzó, C., McIntyre, L., Conesa, A. **SQANTI-reads: a tool for the quality assessment of long read data in multi-sample lrRNA-seq experiments**. BioRxiv (2024) https://www.biorxiv.org/content/10.1101/2024.08.23.609463v2
5 changes: 4 additions & 1 deletion SQANTI3.conda_env.yml
@@ -6,6 +6,7 @@ channels:
- r
- defaults
dependencies:
- argcomplete=3.4.0
- bcbio-gff
- bedtools
- biopython<=1.81
@@ -15,7 +16,9 @@ dependencies:
- cython
- desalt
- gffread
- gtftools=0.9.0
- gmap
- jinja2
- kallisto=0.48.0
- minimap2
- numpy
@@ -61,8 +64,8 @@ dependencies:
- seqtk
- star
- slamem
- gtftools
- seaborn
- scikit-learn
- pip:
- ultra-bioinformatics
- pdf2image
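
The pins added to the environment file (`argcomplete=3.4.0`, `gtftools=0.9.0`) use conda's `name=version` form. The following is a minimal sketch of how such entries could be split into name, operator, and version, for example to list which dependencies are pinned. It assumes only the simple forms visible in this diff (`name`, `name=version`, `name<=version`); `split_spec` is a hypothetical helper and not part of SQANTI3 or conda.

```python
import re

# Assumption: only simple specs such as "numpy", "kallisto=0.48.0" or
# "biopython<=1.81" occur; build strings and compound constraints are ignored.
SPEC_RE = re.compile(r"^([A-Za-z0-9_.\-]+)(?:\s*(<=|>=|==|=|<|>)\s*(\S+))?$")


def split_spec(spec):
    """Split a dependency entry into (name, operator, version).

    Operator and version are None when the entry is unpinned.
    """
    match = SPEC_RE.match(spec.strip())
    if match is None:
        raise ValueError(f"unrecognised dependency spec: {spec!r}")
    return match.groups()


# Entries taken from the dependencies list in this diff.
for entry in ["argcomplete=3.4.0", "biopython<=1.81", "minimap2"]:
    print(split_spec(entry))
```

Unpinned entries such as `minimap2` come back with `None` for both operator and version, which makes it easy to see which packages the solver is free to upgrade.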
36,747 changes: 36,747 additions & 0 deletions example/sqanti_reads_test/sqR1.gtf

Large diffs are not rendered by default.

36,747 changes: 36,747 additions & 0 deletions example/sqanti_reads_test/sqR2.gtf

Large diffs are not rendered by default.

36,747 changes: 36,747 additions & 0 deletions example/sqanti_reads_test/sqR3.gtf

Large diffs are not rendered by default.

4 changes: 4 additions & 0 deletions example/sqanti_reads_test/sqR_design_file.csv
@@ -0,0 +1,4 @@
sampleID,file_acc,classification_file,junction_file
SQ_R1,sqR1,./example/sqanti_reads_test//sqR1/SQ_R1_reads_classification.txt,./example/sqanti_reads_test//sqR1/SQ_R1_junctions.txt
SQ_R2,sqR2,./example/sqanti_reads_test//sqR2/SQ_R2_reads_classification.txt,./example/sqanti_reads_test//sqR2/SQ_R2_junctions.txt
SQ_R3,sqR3,./example/sqanti_reads_test//sqR3/SQ_R3_reads_classification.txt,./example/sqanti_reads_test//sqR3/SQ_R3_junctions.txt
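
The design file is a plain CSV with one row per sample. The sketch below shows one way to read it and sanity-check the header before a run. It assumes the four columns in the header above are all required, which this diff does not confirm; `read_design` and `REQUIRED_COLUMNS` are hypothetical names, not SQANTI-reads code.

```python
import csv
import io
import posixpath

# Assumption: these four columns (taken from the header in this diff) are required.
REQUIRED_COLUMNS = ["sampleID", "file_acc", "classification_file", "junction_file"]


def read_design(handle):
    """Return the design rows as dicts, failing early on a bad header."""
    reader = csv.DictReader(handle)
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"design file is missing columns: {missing}")
    return list(reader)


# First data row of the example design file, embedded so the sketch is self-contained.
EXAMPLE = (
    "sampleID,file_acc,classification_file,junction_file\n"
    "SQ_R1,sqR1,./example/sqanti_reads_test//sqR1/SQ_R1_reads_classification.txt,"
    "./example/sqanti_reads_test//sqR1/SQ_R1_junctions.txt\n"
)

rows = read_design(io.StringIO(EXAMPLE))
# normpath collapses the doubled slash and the leading "./" in the stored paths.
clean_path = posixpath.normpath(rows[0]["classification_file"])
print(rows[0]["sampleID"], clean_path)
```

The doubled slashes in the example paths are harmless on POSIX systems; normalising them is only cosmetic.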
2 changes: 1 addition & 1 deletion sqanti3_filter.py
@@ -1,6 +1,6 @@
#!/usr/bin/env python3
__author__ = "[email protected]"
__version__ = '5.2.2' # Python 3.7 syntax!
__version__ = '5.3.0' # Python 3.7 syntax!

"""
New SQANTI3 filter. It will serve as a wrapper for "rules" filter and "Machine-Learning" filter.
2 changes: 1 addition & 1 deletion sqanti3_qc.py
@@ -5,7 +5,7 @@
# Modified by Fran ([email protected]) currently as SQANTI3 version (05/15/2020)

__author__ = "[email protected]"
__version__ = '5.2.2' # Python 3.7
__version__ = '5.3.0' # Python 3.7

import pdb
import os, re, sys, subprocess, timeit, glob, copy
2 changes: 1 addition & 1 deletion sqanti3_rescue.py
@@ -1,6 +1,6 @@
#!/usr/bin/env python3
__author__ = "[email protected]"
__version__ = '5.2.2'
__version__ = '5.3.0'

###################################################
########## SQANTI3 RESCUE WRAPPER ##########
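
Because the version string is repeated by hand in several scripts, a small consistency check can catch a stray character before a release. The following is a sketch only: it assumes a strict `MAJOR.MINOR.PATCH` format, and `find_version` and `is_release_version` are hypothetical helpers that do not exist in the SQANTI3 repository.

```python
import re

# Matches a module-level assignment such as: __version__ = '5.3.0'
ASSIGNMENT_RE = re.compile(r"^__version__\s*=\s*['\"]([^'\"]*)['\"]", re.MULTILINE)
# Assumption: release versions are exactly three dot-separated integers.
RELEASE_RE = re.compile(r"\d+\.\d+\.\d+")


def find_version(source):
    """Return the __version__ value found in Python source text, or None."""
    match = ASSIGNMENT_RE.search(source)
    return match.group(1) if match else None


def is_release_version(value):
    """True only for strings of the form MAJOR.MINOR.PATCH."""
    return RELEASE_RE.fullmatch(value) is not None


sample = "#!/usr/bin/env python3\n__version__ = '5.3.0'\n"
print(find_version(sample), is_release_version(find_version(sample)))
```

A value with a trailing dot, such as `5.3.0.`, is rejected by `is_release_version`, so running the check over every script that defines `__version__` would flag it.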
142 changes: 142 additions & 0 deletions sqanti3_wrapper.conf
@@ -0,0 +1,142 @@
#!/bin/bash

# Conf for SQANTI3 wrapper
# Includes all variables needed to execute all optional parameters of SQANTI3_QC,
# SQANTI3_filter and SQANTI3_rescue.
# Using this wrapper helps to know the execution parameters of a run

sqanti3_qc="python3 sqanti3_qc.py"
sqanti3_filter="python3 sqanti3_filter.py"
sqanti3_rescue="python3 sqanti3_rescue.py"

# Arguments commonly shared by QC, filter and rescue
# These parameters will (99.9% of the time) be the same across steps
# They only need to be defined once and can be referenced within the script or conf
reference_gtf="example/gencode.v38.basic_chr22.gtf"
reference_fasta="example/GRCh38.p13_chr22.fasta"
cpus="4"
json_for_rules="utilities/filter/filter_default.json" # json file with rules for filter and rescue rules
threshold="0.7"

# Execution control
# Skip steps at will. However, be careful: if you skip QC or filter and a later
# step needs one of its outputs that does not exist yet, the run will fail.

skip_qc="true" #true to skip the QC
skip_filter="true" # true to skip the filter
filter_mode="both" # rules, ml or both
skip_rescue="false" # true to skip the rescue
rescue_mode="both" # rules, ml or both

# Sqanti3.qc

QC_input="example/UHR_chr22.gtf" # Input data
QC_min_ref_length="" # minimum reference transcript length. Default 0 bp
QC_force_id_ignore=""
QC_cage_peak_bed_file="data/ref_TSS_annotation/human.refTSS_v3.1.hg38.bed"
QC_aligner_choice="" # minimap2, deSALT, gmap or uLTRA
QC_polyA_motif_list="data/polyA_motifs/mouse_and_human.polyA_motif.txt"
QC_polyA_peak=""
QC_phylobed=""
QC_skipORF="true"
QC_is_fusion="false"
QC_orf_input=""
QC_is_fastq="false" # Required to be true if QC_input is a FASTQ file.
QC_expression_matrix=""
QC_gmap_index=""
QC_chunks=1
QC_output_prefix="UHR_chr22"
QC_destination_folder="/tmp/sqanti3_wrapper/QC/"
QC_coverage="" # Junctions coverage file
QC_sites=""
QC_window=""
QC_genename="" # Column name from GTF to define gene names
QC_full_length_pacbio_abundance_tsv="example/UHR_abundance.tsv"
QC_saturation="true"
QC_report_file="both" # pdf, html or both
QC_isoAnnotLite=""
QC_gff3=""
QC_short_reads_fofn="example/UHR_chr22_short_reads.fofn"
QC_SR_bam=""
QC_isoform_hits=""
QC_ratio_TSS_metric="" # Which metric should be reported in the ratio_TSS column



# Sqanti3.filter

# Common elements for filter rules and ml
filter_skip_report="false"
filter_input_classification="${QC_destination_folder}/${QC_output_prefix}_classification.txt"
filter_corrected_gtf="${QC_destination_folder}/${QC_output_prefix}_corrected.gtf"
filter_isoforms="GMST/GMST_tmp.faa"
filter_isoannotgff3="example/SQANTI3_QC_output/UHR_chr22.gff3"
filter_sam=""
filter_faa="${QC_destination_folder}/${QC_output_prefix}_corrected.faa"
filter_monoexonic="true"

filter_cpus=${cpus}

# Filter rules parameters
filter_rules_ouput_folder="/tmp/sqanti3_wrapper/sqanti3_filter_rules"
filter_rules_prefix="${QC_output_prefix}"
filter_rules_json_file=${json_for_rules}

# Filter ml parameters
filter_ml_ouput_folder="/tmp/sqanti3_wrapper/sqanti3_filter_ml"
filter_ml_prefix="${QC_output_prefix}"
filter_ml_percent_training="0.8"
filter_ml_TP=""
filter_ml_TN=""
filter_ml_threshold=${threshold}
filter_ml_remove_columns=""
filter_ml_intermediate_files=""
filter_ml_max_class_size=""
filter_ml_intrapriming=""

# Sqanti3.rescue

# Rescue rules parameters
rescue_rules_filtered_classification="${filter_rules_ouput_folder}/${filter_rules_prefix}_RulesFilter_result_classification.txt"
rescue_rules_reference_classification="${QC_destination_folder}/${QC_output_prefix}_classification.txt"
rescue_rules_isoforms="${QC_destination_folder}/${QC_output_prefix}_corrected.fasta"
rescue_rules_gtf="${filter_rules_ouput_folder}/${filter_rules_prefix}.filtered.gtf"
rescue_rules_monoexons="all"
rescue_rules_mode="full"
rescue_rules_output_prefix="${QC_output_prefix}"
rescue_rules_output_folder="/tmp/sqanti3_wrapper/sqanti3_rescue_rules"


# Rescue ml parameters
rescue_ml_filtered_classification="${filter_ml_ouput_folder}/${filter_ml_prefix}_MLresult_classification.txt"
rescue_ml_reference_classification="${QC_destination_folder}/${QC_output_prefix}_classification.txt"
rescue_ml_isoforms="${QC_destination_folder}/${QC_output_prefix}_corrected.fasta"
rescue_ml_gtf="${filter_ml_ouput_folder}/${filter_ml_prefix}.filtered.gtf"
rescue_ml_monoexons="all"
rescue_ml_mode="full"
rescue_ml_output_prefix="${QC_output_prefix}"
rescue_ml_output_folder="/tmp/sqanti3_wrapper/sqanti3_rescue_ml"
rescue_ml_randomforest_rdata="${filter_ml_ouput_folder}/randomforest.RData"


# WARNING ZONE: By default, QC, filter and rescue use the common
# parameters for references and static files, but they can be changed
# if needed (although you probably won't need to).

# QC warning zone
QC_reference_gtf=${reference_gtf}
QC_reference_fasta=${reference_fasta}
QC_cpus=${cpus}

# Rescue warning zone

rescue_rules_reference_genome=${reference_fasta}
rescue_rules_reference_gtf=${reference_gtf}
rescue_rules_json_file="${json_for_rules}"

rescue_ml_reference_genome=${reference_fasta}
rescue_ml_reference_gtf=${reference_gtf}
rescue_ml_threshold=${threshold}
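
Many values in this conf file are derived from earlier ones through `${name}` references, so it can help to see the fully expanded paths before launching a run. The sketch below resolves those references for simple `KEY=value` lines. It is an assumption-laden illustration: the wrapper presumably reads the file with bash itself, and `parse_conf` is a hypothetical helper that handles only the plain assignment forms used in this file.

```python
import re
from string import Template

# KEY="value" or KEY=value; anything after the value (e.g. a comment) is ignored.
ASSIGN_RE = re.compile(r'^([A-Za-z_][A-Za-z0-9_]*)=(?:"([^"]*)"|(\S*))')


def parse_conf(text):
    """Resolve simple assignments in order, expanding ${name} references."""
    values = {}
    for line in text.splitlines():
        match = ASSIGN_RE.match(line.strip())
        if match is None:
            continue  # blank lines, comments, shebang
        name, quoted, bare = match.groups()
        raw = quoted if quoted is not None else bare
        # As in bash, a later assignment overrides an earlier one.
        values[name] = Template(raw).safe_substitute(values)
    return values


# A few lines copied from the conf file in this diff.
EXAMPLE = '''
cpus="4"
QC_output_prefix="UHR_chr22"
QC_destination_folder="/tmp/sqanti3_wrapper/QC/"
filter_input_classification="${QC_destination_folder}/${QC_output_prefix}_classification.txt"
filter_cpus=${cpus}
'''

conf = parse_conf(EXAMPLE)
print(conf["filter_input_classification"])
```

Because `QC_destination_folder` already ends in a slash, the expanded path contains `//`. That is harmless on POSIX file systems, but it shows the kind of detail that is easier to spot once the references are resolved.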