ASCAT executing R files #125

Closed
ggabernet opened this issue Feb 24, 2020 · 25 comments

@ggabernet
Member

ggabernet commented Feb 24, 2020

Hi, when running Sarek with multiple variant callers, it seems like only the first one is picked and the rest are ignored. I ran it with Strelka and ASCAT, and ASCAT was just ignored.


nextflow run ggabernet/nf-core-sarek -r v2.5.2-branch \
--outdir 's3://qbic-bucket-virginia/resultsdirsarekicgc1' \
-w 's3://qbic-bucket-virginia/workdirsarekicgc1' \
--tracedir 's3://qbic-bucket-virginia/tracesarekicgc1' \
--input 's3://qbic-bucket-virginia/icgc-sarek/input-icgc-1.tsv' \
--genome 'GRCh38' \
--tools 'Strelka,ASCAT,snpEff' \
-c awsbatch.config \
--awsregion 'us-east-1' \
--igenomes_base 's3://qbic-bucket-virginia/references' \
--awscli '/home/ec2-user/miniconda/bin/aws' -resume
@maxulysse maxulysse self-assigned this Feb 24, 2020
@maxulysse maxulysse added the bug Something isn't working label Feb 24, 2020
@maxulysse
Member

I'll look at it right away.

@ggabernet
Member Author

Hi Maxime, sorry, my bad. The convertAlleleCounts process failed, and that is why ASCAT was not triggered.

@ggabernet
Member Author

so that's not the issue

@maxulysse
Member

OK, good to know, any idea what the problem was?

@ggabernet
Member Author

Fatal error: cannot open file '/home/ec2-user/.nextflow/assets/ggabernet/nf-core-sarek/bin/convertAlleleCounts.r': No such file or directory

@ggabernet
Member Author

I had to make a fork to fix the SamToFastq issue, but all the rest of the code is the same as the 2.5.2 release

@maxulysse
Member

Did it work before? Or do you think that it was already an issue?
Maybe we shouldn't use that to call the R script:

Rscript ${workflow.projectDir}/bin/convertAlleleCounts.r ...

@ggabernet
Member Author

Yes, that could be the issue. Shouldn't it work directly with convertAlleleCounts.r, since the bin directory is added to the PATH?

@ggabernet
Member Author

I can test it out and let you know

@maxulysse
Member

I'll try it out on our cluster as well.

@ggabernet ggabernet changed the title from "ASCAT tool is ignored" to "ASCAT executing R files" Feb 24, 2020
@maxulysse
Member

By the way, if you're using ASCAT, the current dev has some good improvements.
You can now specify purity and ploidy.

@ggabernet
Member Author

ggabernet commented Feb 24, 2020

I've tried Rscript convertAlleleCounts.r and calling convertAlleleCounts.r directly (as you have the Rscript shebang line). Nothing works:

.command.sh: //nextflow-bin/convertAlleleCounts.r: /bin/env: bad interpreter: No such file or directory

It's a bit weird, as the last option worked for me in Bcellmagic.

@ggabernet
Member Author

Ah, I just saw the shebang line was missing /usr/. I'll try with this now.

@maxulysse
Member

It seems that there's a typo in the shebang for run_ascat.R as well.

@ggabernet
Member Author

yes, I fixed both now, let's see
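
For anyone hitting the same `bad interpreter` error: the message means the interpreter path on the script's first line does not exist in the environment where the task runs. Here the scripts pointed at `/bin/env`, and the fix discussed above was to use `/usr/bin/env`. The following is only a minimal sketch of how one might check this before launching a run; `check_shebang` is a hypothetical helper, not part of Sarek or Nextflow.

```shell
# Hypothetical helper (not part of Sarek): report whether the interpreter
# named on a script's shebang line exists and is executable here.
check_shebang() {
    first_line=$(head -n 1 "$1")
    case "$first_line" in
        '#!'*) ;;
        *) echo "no shebang"; return 1 ;;
    esac
    # Drop the leading "#!" and let word splitting pick out the interpreter
    # path, ignoring any arguments such as "Rscript".
    set -- ${first_line#??}
    if [ -x "$1" ]; then
        echo "ok: $1"
    else
        echo "missing: $1"
        return 1
    fi
}
```

A call such as `check_shebang bin/convertAlleleCounts.r` would need to be run inside the same container or image as the pipeline task, since the check only reflects the filesystem it is executed on.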

@maxulysse
Member

You're trying on AWS?

@ggabernet
Member Author

Yes, I have to set it up there so we can run ASCAT on ICGC data.

@ggabernet
Member Author

Looks good now; before, the job was killed immediately. To make sure, I'll post again when it runs through.

@maxulysse
Member

Good, I made the same changes, and I'm trying it out on our server.
You can make a PR, and if it works for everyone we can merge

@ggabernet
Member Author

perfect, will do!

@ggabernet
Member Author

ggabernet commented Feb 24, 2020

This is solved now, but I am having another issue with ASCAT. This time I have already switched to the dev branch, as suggested:

[1] Reading Tumor LogR data...
[1] Reading Tumor BAF data...
[1] Reading Germline LogR data...
[1] Reading Germline BAF data...
[1] Registering SNP locations...
[1] Splitting genome in distinct chunks...
Error in names(x) <- value : 
  'names' attribute [2] must be the same length as the vector [1]
Calls: ascat.GCcorrect -> colnames<-
In addition: Warning message:
In read.table(file = GCcontentfile, header = TRUE, as.is = TRUE) :
  incomplete final line found by readTableHeader on 'input.5'
Execution halted

I love that R does not print the line number in errors...

@ggabernet
Member Author

Skipping ascat.GCcorrect works, so there must be a problem with the GCcontentfile, but it's super hard to debug on AWS. I'll try on the cluster tomorrow.
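
One quick check that does not need R: the `incomplete final line found by readTableHeader` warning in the log above indicates that the file does not end with a newline, which often points to a placeholder or truncated input rather than a real table. The sketch below summarises a file's shape under the assumption that it is tab-separated; `inspect_table` is a hypothetical helper name, and this is a generic sanity check rather than an ASCAT-specific one.

```shell
# Hypothetical helper: summarise a tab-separated file so that a placeholder
# or truncated input is easy to spot before R reads it.
inspect_table() {
    # Number of newline-terminated lines in the file.
    lines=$(wc -l < "$1" | tr -d ' ')
    # Number of tab-separated fields on the first line.
    cols=$(head -n 1 "$1" | awk -F '\t' '{ print NF }')
    # A non-empty file whose last byte is not a newline is what triggers
    # R's "incomplete final line" warning.
    if [ -s "$1" ] && [ "$(tail -c 1 "$1" | wc -l | tr -d ' ')" -eq 0 ]; then
        newline="no"
    else
        newline="yes"
    fi
    echo "lines=$lines header_columns=${cols:-0} final_newline=$newline"
}
```

Running it on the staged file in the task's work directory (for example `inspect_table input.5`) would show at a glance whether the file has the expected number of header columns.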

@maxulysse
Member

No problem executing R as we planned.
But I had an issue with the GC file that wasn't recognized.
I'll try to fix that.

@maxulysse
Member

OK, I found why you're currently having this bug with #127.
I made a mistake with #107 and forgot to fully snake-case the params for the ASCAT GC file in conf/igenomes.config.
Since you already have a PR open, I'll let you correct ac_lociGC to ac_loci_gc.
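
When renaming a parameter like this, it can help to confirm that no reference to the old name remains anywhere in the repository. A minimal sketch, assuming a checkout of the pipeline in the current directory; `find_old_param` is a hypothetical helper name.

```shell
# Hypothetical helper: list every line under a directory that still uses
# the old parameter name.
find_old_param() {
    # -r: recurse into directories, -n: show line numbers,
    # -F: treat the name as a literal string rather than a regex.
    # grep exits non-zero when nothing matches, which here means "all clean".
    grep -rnF "$1" "$2"
}
```

For example, `find_old_param ac_lociGC .` should print nothing once the rename to ac_loci_gc is complete.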

@ggabernet
Member Author

Great, let's hope this solves the issue.
