ENSEMBL annotations not populating correctly #43

Open
carnold-sjsu opened this issue Oct 9, 2023 · 4 comments

Comments

@carnold-sjsu

carnold-sjsu commented Oct 9, 2023

Hi, I'm running this pipeline on the SJSU HPC, and about 8.5k of my 29k rows of data have both the symbol and genename populated as "NA". Each NA row has a valid ENSEMBL ID that, when I look it up on ENSEMBL, leads to a valid gene product with an annotation, so it looks as though the annotation is just not being populated correctly. I am running OSD-511 with the following command (using the cached files set up on the spartan01 HPC by Jonathan Oribello):

NXF_SINGULARITY_CACHEDIR=/home/joribello/test_install/singularity nextflow run NF_RCP-F_1.0.3/main.nf -profile singularity,slurm -resume --gldsAccession GLDS-511 -c /home/joribello/test_install/cos_hpc_nextflow.config -c give_ALIGN_STAR_more_memory.config --runsheetPath /home/carnold/GLDS-511/Metadata/GLDS-511_bulkRNASeq_v1_runsheet.csv

The original runsheet was edited to correct the switched R1 and R2 files for one of the samples (they were entered incorrectly in the downloaded version from GL).
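
For anyone trying to reproduce the NA counts described above, the affected rows can be tallied with a short script. This is only a minimal sketch: the column names ENSEMBL, SYMBOL and GENENAME, the helper count_unannotated, and the sample rows are all assumptions made for illustration and are not taken from the pipeline's actual output.

```python
import csv
import io

# Values treated as "missing". Assumption: the table writes missing
# annotations as the literal string "NA" or as an empty field.
NA_TOKENS = {"", "NA"}


def count_unannotated(rows, id_col="ENSEMBL", cols=("SYMBOL", "GENENAME")):
    """Return (total_rows, ids) for rows where every column in `cols` is NA.

    The column names are hypothetical defaults; adjust them to the real header.
    """
    total = 0
    missing_ids = []
    for row in rows:
        total += 1
        if all((row.get(c) or "").strip() in NA_TOKENS for c in cols):
            missing_ids.append(row[id_col])
    return total, missing_ids


# Tiny made-up table standing in for the real annotated output file.
sample = io.StringIO(
    "ENSEMBL,SYMBOL,GENENAME\n"
    "ID_EXAMPLE_1,GeneA,example gene A\n"
    "ID_EXAMPLE_2,NA,NA\n"
    "ID_EXAMPLE_3,GeneC,NA\n"
)
total, missing = count_unannotated(csv.DictReader(sample))
print(f"{len(missing)} of {total} rows lack both SYMBOL and GENENAME")
```

To run it against a real file, open that file and pass csv.DictReader(handle) in place of the sample. The returned ID list can then be spot-checked on ENSEMBL.
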
give_ALIGN_STAR_more_memory.config reads as follows:
process {
    withName: 'ALIGN_STAR' {
        memory = '45GB'
    }
    withName: 'SORT_INDEX_BAM' {
        memory = '45GB'
    }
    withName: "COUNT_ALIGNED" {
        maxRetries = 3
        errorStrategy = 'retry'
        memory = { 8.GB + 4.GB * task.attempt }
    }
}
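
As a quick check on what the COUNT_ALIGNED rule above requests, the memory closure can be evaluated by hand for each attempt. The sketch below simply redoes that arithmetic in Python; it assumes task.attempt starts at 1 and that maxRetries = 3 allows up to four attempts in total (the first run plus three retries). The function name memory_gb is made up for illustration.

```python
def memory_gb(attempt, base_gb=8, step_gb=4):
    """Mirror of `8.GB + 4.GB * task.attempt` from the config, in GB."""
    return base_gb + step_gb * attempt


MAX_RETRIES = 3  # value of maxRetries in the config

# Assumption: attempts are numbered 1..MAX_RETRIES + 1.
schedule = [memory_gb(attempt) for attempt in range(1, MAX_RETRIES + 2)]
print(schedule)  # [12, 16, 20, 24]
```

Under those assumptions COUNT_ALIGNED would start at 12 GB and top out at 24 GB, while ALIGN_STAR and SORT_INDEX_BAM stay fixed at 45 GB.
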

@asaravia-butler
Collaborator

Hi @carnold-sjsu, will you please attach the following files:
/home/joribello/test_install/cos_hpc_nextflow.config
/home/carnold/GLDS-511/Metadata/GLDS-511_bulkRNASeq_v1_runsheet.csv

@carnold-sjsu
Author

carnold-sjsu commented Jan 15, 2024

Sure thing, here are those files. Thanks for getting back to me!
Note: I changed the file extension of the .config file to .txt, since GitHub doesn't support uploading .config files.
cos_hpc_nextflow.config.txt
GLDS-511_bulkRNASeq_v1_runsheet.csv

A quick note that I may lose access to the SJSU HPC and files from this run, as I believe they kick graduates out at some point!

@asaravia-butler
Collaborator

Thanks, @carnold-sjsu. I don't see an issue with either of those files. I'd like to check a few more. Will you please send these two files to me?
NF_RCP-F_1.0.3/config/default.config
NF_RCP-F_1.0.3/config/software/by_docker_image.config

@carnold-sjsu
Author

Here you go.
default.config.txt
by_docker_image.config.txt
