
File not found error when using downloaded kraken2_db #99

Closed
artefaritaKuniklo opened this issue Sep 10, 2020 · 4 comments


artefaritaKuniklo commented Sep 10, 2020

Describe the bug

When I download a Kraken2 database archive, place it somewhere locally, and run the pipeline with the --kraken2_db parameter,
the kraken2_db_preparation process fails because the expected output files are not found.

Steps to reproduce

$ nextflow run nf-core/mag --reads '../data/*_{1,2}.fastq' --busco_reference bacteria_odb9.tar.gz --task.memory 120 --task.cpus 32 -profile docker --kraken2_db minikraken2_v2_8GB_201904.tgz
....
Error executing process > 'kraken2_db_preparation (1)'

Caused by:
  Missing output file(s) `minikraken2_v2_8GB_201904/*.k2d` expected by process `kraken2_db_preparation (1)`

Command executed:

  tar -xf "minikraken2_v2_8GB_201904.tgz"

Command exit status:
  0

Command output:
  (empty)

Work dir:
  /home/stella/Proj/20200908_NF/run/work/2a/aa7e195cef0882b13f789c378442af

Tip: you can replicate the issue by changing to the process work dir and entering the command `bash .command.run`

I am sure this bug is caused by the folder name inside the archive differing from the base name of the archive:

➜   tree ./minikraken2_v2_8GB_201904_UPDATE 
./minikraken2_v2_8GB_201904_UPDATE
├── database100mers.kmer_distrib
├── database150mers.kmer_distrib
├── database200mers.kmer_distrib
├── hash.k2d
├── opts.k2d
└── taxo.k2d

The pipeline looks for minikraken2_v2_8GB_201904/*.k2d, but the extracted folder is named minikraken2_v2_8GB_201904_UPDATE, so nothing matches.

System:

  • Hardware: Dell Server R720
  • OS:
➜  uname -a
Linux hal9003 5.4.0-47-generic #51-Ubuntu SMP Fri Sep 4 19:50:52 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
  • Executor: local, run inside tmux

Nextflow Installation:

  • Version: Nextflow 20.07.1, nf-core/mag 1.0.0
@d4straub
Collaborator

Hi there, thanks for reporting this.

The pipeline expects that extraction of minikraken2_v2_8GB_201904.tgz yields a folder minikraken2_v2_8GB_201904 with the respective database files. The tgz archive you are using however contains a folder with a different name. To fix this, you could just rename the tgz file to match the folder name, i.e. minikraken2_v2_8GB_201904.tgz -> minikraken2_v2_8GB_201904_UPDATE.tgz. I haven't tested that, but I think that should work.
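
For illustration, the rename workaround could be sketched roughly as below. This is only a sketch and, like the suggestion itself, has not been tested against the pipeline. The function names `archive_top_dir` and `rename_to_match` are hypothetical helpers made up for this example, and it assumes the archive contains a single top-level folder and that a `tar` which auto-detects compression when listing (e.g. GNU tar) is available.

```shell
#!/usr/bin/env bash

# Hypothetical helper: print the top-level folder name stored in a tar archive.
archive_top_dir() {
    # tar -tf lists member paths; strip a leading "./" and print the first
    # non-empty leading path component, then stop reading.
    tar -tf "$1" 2>/dev/null | awk -F/ '{ sub(/^\.\//, ""); if ($1 != "") { print $1; exit } }'
}

# Hypothetical helper: rename the archive so that its base name matches the
# folder it extracts to, and print the resulting path.
rename_to_match() {
    local archive="$1"
    local top target
    top="$(archive_top_dir "$archive")"
    target="$(dirname "$archive")/${top}.tgz"
    # Only move the file when the base name actually differs.
    if [ "$(basename "$archive")" != "${top}.tgz" ]; then
        mv -- "$archive" "$target"
    fi
    echo "$target"
}
```

With the archive from this issue, `rename_to_match minikraken2_v2_8GB_201904.tgz` would be expected to leave a file named minikraken2_v2_8GB_201904_UPDATE.tgz, which could then be passed to --kraken2_db.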

This should actually already be fixed in the dev branch; see #54.

@artefaritaKuniklo
Author

Okay, I have already fixed it and will submit a pull request soon.

@d4straub
Collaborator

I appreciate your effort, but as I said, it is fixed already.

@artefaritaKuniklo
Author

Oh sorry, I forgot about that.
