singlem issue #159
Comments
Hi Francesco, In the future, can you put the error in a code block (e.g. https://github.com/adam-p/markdown-here/wiki/Markdown-Cheatsheet#code)? It's a bit hard to read as is. I've done this below. It looks truncated. Is there anything after Note that
01/10/2025 04:33:04 PM INFO: SingleM v0.18.1
01/10/2025 04:33:04 PM INFO: Acquiring SingleM metapackage from Zenodo backpack directory specified ..
01/10/2025 04:33:04 PM INFO: Retrieval successful. Location of backpack is: /home/fricci/rp24/fra/database_files/S3.2.1.GTDB_r214.metapackage_20231006.smpkg.zb
01/10/2025 04:33:04 PM INFO: Loaded 59 SingleM packages
01/10/2025 04:33:04 PM INFO: Using as input 1 different pairs of sequence files e.g. /home/fricci/rp24/fra/analyses/Tess/raw_reads/DML-31_L3.1.fq.gz & /home/fricci/r>
01/10/2025 04:33:04 PM INFO: Filtering sequence files through DIAMOND blastx
01/10/2025 05:38:04 PM INFO: Finished DIAMOND prefilter phase
01/10/2025 05:38:04 PM INFO: Assigning sequences to SingleM packages with DIAMOND ..
01/10/2025 05:42:25 PM INFO: Running taxonomic assignment ..
01/10/2025 05:42:25 PM INFO: Assigning taxonomy by singlem query ..
01/10/2025 05:42:25 PM INFO: Querying against species database with 430 sequences, using method smafa-naive and max divergence 2
01/10/2025 05:42:25 PM INFO: Searching with SMAFA NAIVE by nucleotide sequence ..
01/10/2025 05:42:25 PM INFO: Querying index for S3.1.ribosomal_protein_L2_rplB
Traceback (most recent call last):
File "/fs04/rp24/fra/analyses/Tess/output/binning/binchicken/path_to_conda_envs/8af436ae67d59fa6a12aa49a286a3324_/bin/singlem", line 709, in
singlem.pipe.SearchPipe().run(
File "/fs04/rp24/fra/analyses/Tess/output/binning/binchicken/path_to_conda_envs/8af436ae67d59fa6a12aa49a286a3324_/lib/python3.12/site-packages/singlem/pipe.py", li>
otu_table_object = self.run_to_otu_table(**kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/fs04/rp24/fra/analyses/Tess/output/binning/binchicken/path_to_conda_envs/8af436ae67d59fa6a12aa49a286a3324_/lib/python3.12/site-packages/singlem/pipe.py", li>
otu_table_object = self.assign_taxonomy_and_process(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/fs04/rp24/fra/analyses/Tess/output/binning/binchicken/path_to_conda_envs/8af436ae67d59fa6a12aa49a286a3324_/lib/python3.12/site-packages/singlem/pipe.py", li>
assignment_result = self.assign_taxonomy(
^^^^^^^^^^^^^^^^^^^^^^
File "/fs04/rp24/fra/analyses/Tess/output/binning/binchicken/path_to_conda_envs/8af436ae67d59fa6a12aa49a286a3324/lib/python3.12/site-packages/singlem/pipe.py", li>
query_based_assignment_result = PipeTaxonomyAssignerByQuery().assign_taxonomy(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/fs04/rp24/fra/analyses/Tess/output/binning/binchicken/path_to_conda_envs/8af436ae67d59fa6a12aa49a286a3324_/lib/python3.12/site-packages/singlem/pipe_taxonom>
query_single_set(queries[0], 0)
File "/fs04/rp24/fra/analyses/Tess/output/binning/binchicken/path_to_conda_envs/8af436ae67d59fa6a12aa49a286a3324_/lib/python3.12/site-packages/singlem/pipe_taxonom>
for hit in querier.query_with_queries(queries, sdb, max_species_divergence, method, SequenceDatabase.NUCLEOTIDE_TYPE, 1, None, False, None):
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/fs04/rp24/fra/analyses/Tess/output/binning/binchicken/path_to_conda_envs/8af436ae67d59fa6a12aa49a286a3324_/lib/python3.12/site-packages/singlem/querier.py",>
smafa_stdout = extern.run(smafa_cmd, stdin='\n'.join([">{}\n{}".format(i, q.sequence) for i, q in enumerate(chunked_queries)]))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/fs04/rp24/fra/analyses/Tess/output/binning/binchicken/path_to_conda_envs/8af436ae67d59fa6a12aa49a286a3324_/lib/python3.12/site-packages/extern/init.py",>
raise ExternCalledProcessError(process, command)
extern.ExternCalledProcessError: Command smafa query --database '/home/fricci/rp24/fra/database_files/S3.2.1.GTDB_r214.metapackage_20231006.smpkg.zb/payload_director>
STDERR was: b'[2025-01-10T06:42:25Z INFO bird_tool_utils::clap_utils] Smafa version 0.8.0\n[2025-01-10T06:42:25Z INFO smafa] Decoding db file "/home/fricci/rp24/fr>
Damn, sorry Sam. Unfortunately I don't have that file anymore, it's been overwritten. How can I change the database that singlem sources from within binchicken?
No problem. You can change it with: Otherwise, rerunning binchicken should give the same error message?
Thanks! Yes, I am rerunning binchicken atm, but I'll probably stop it and install the new SingleM database. When I was troubleshooting the issue earlier, I thought it could be because the path to the SingleM db is SINGLEM_METAPACKAGE_PATH='/home/fricci/rp24/fra/database_files/S3.2.1.GTDB_r214.metapackage_20231006.smpkg.zb'. Do you think this could be the issue? Should I point the path to the subfolders within the db?
Pointing to the
Smafa v0.8.0, which is what you were running, requires a singlem S4.x database. It won't work with S3.x ones, and that is likely the source of the issue, so I suggest updating. Not sure if this is a general thing @AroneyS?
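For anyone hitting the same thing, here is a rough pre-flight check based only on the comment above. It is a sketch, not part of singlem or binchicken: the function names are made up for illustration, and the rule "smafa 0.8.0 or newer needs an S4.x metapackage" is an assumption extrapolated from the single v0.8.0 data point in this thread. It also assumes the metapackage directory name starts with `S<major>.`, as the one in the log does.

```python
import re
from pathlib import Path
from typing import Optional, Tuple


def metapackage_major(path: str) -> Optional[int]:
    """Return N for a metapackage whose name starts with 'S<N>.', else None."""
    match = re.match(r"S(\d+)\.", Path(path).name)
    return int(match.group(1)) if match else None


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '0.8.0' or 'v0.8.0' into (0, 8, 0)."""
    return tuple(int(part) for part in text.lstrip("v").split("."))


def likely_incompatible(smafa_version: str, metapackage_path: str) -> bool:
    """Flag the combination reported in this thread.

    ASSUMPTION: smafa >= 0.8.0 requires an S4.x (or newer) metapackage.
    Only v0.8.0 with S3.x is confirmed by the comment above.
    """
    major = metapackage_major(metapackage_path)
    if major is None:
        return False  # unknown naming scheme, cannot tell
    return parse_version(smafa_version) >= (0, 8, 0) and major < 4


if __name__ == "__main__":
    db = "/home/fricci/rp24/fra/database_files/S3.2.1.GTDB_r214.metapackage_20231006.smpkg.zb"
    print(likely_incompatible("0.8.0", db))
```

The smafa version can be read from the log itself (the STDERR line reports "Smafa version 0.8.0"), and the path is whatever SINGLEM_METAPACKAGE_PATH points to.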
Thanks Ben and Sam, I'll try what you guys recommended and let you know how it goes!
If the pipeline encounters an error, is it possible to restart it from where it stopped?
Just rerun the command as is. Should be fine, depending on the error and what you did to fix it.
Hello, since I tried to fix the SingleM issue, I have hit a new issue earlier in the pipeline, specifically: Error in rule genome_transcripts. When I check genomes_protein.log, this is the output: PRODIGAL v2.6.3 [February, 2016]
What command are you using? It looks like you are providing a list of genomes to
This is my command: I did not change it from before tho, and that step worked fine previously. So you are saying to modify it to: right?
Thanks Sam, binchicken seems to work fine so far. I'll keep you posted!
Hi Sam, I had to restart binchicken a few times because of the node walltime limit of 3 days. Since the last restart, I keep getting the following error: [Fri Jan 24 08:04:02 2025] Activating conda environment: path_to_conda_envs/61ed490c404ac70f052761cb9a62d3f6_ Removing output files of failed job checkm2 since they might be corrupted: Unfortunately I can't locate the input, output and log folders reported at this step. I think that's where the problem is coming from. Do you have any advice? Thanks
What is in
Hello
I am getting the following issue when binchicken runs singlem, I think it is because /home/fricci/rp24/fra/database_files/S3.2.1.GTDB_r214.metapackage_20231006.smpkg.zb is being recognized as a directory, not as a single file by smafa. Do you have any suggestion on how to fix the following?
This is the error I get:
01/10/2025 04:33:04 PM INFO: SingleM v0.18.1
01/10/2025 04:33:04 PM INFO: Acquiring SingleM metapackage from Zenodo backpack directory specified ..
01/10/2025 04:33:04 PM INFO: Retrieval successful. Location of backpack is: /home/fricci/rp24/fra/database_files/S3.2.1.GTDB_r214.metapackage_20231006.smpkg.zb
01/10/2025 04:33:04 PM INFO: Loaded 59 SingleM packages
01/10/2025 04:33:04 PM INFO: Using as input 1 different pairs of sequence files e.g. /home/fricci/rp24/fra/analyses/Tess/raw_reads/DML-31_L3.1.fq.gz & /home/fricci/r>
01/10/2025 04:33:04 PM INFO: Filtering sequence files through DIAMOND blastx
01/10/2025 05:38:04 PM INFO: Finished DIAMOND prefilter phase
01/10/2025 05:38:04 PM INFO: Assigning sequences to SingleM packages with DIAMOND ..
01/10/2025 05:42:25 PM INFO: Running taxonomic assignment ..
01/10/2025 05:42:25 PM INFO: Assigning taxonomy by singlem query ..
01/10/2025 05:42:25 PM INFO: Querying against species database with 430 sequences, using method smafa-naive and max divergence 2
01/10/2025 05:42:25 PM INFO: Searching with SMAFA NAIVE by nucleotide sequence ..
01/10/2025 05:42:25 PM INFO: Querying index for S3.1.ribosomal_protein_L2_rplB
Traceback (most recent call last):
File "/fs04/rp24/fra/analyses/Tess/output/binning/binchicken/path_to_conda_envs/8af436ae67d59fa6a12aa49a286a3324_/bin/singlem", line 709, in
singlem.pipe.SearchPipe().run(
File "/fs04/rp24/fra/analyses/Tess/output/binning/binchicken/path_to_conda_envs/8af436ae67d59fa6a12aa49a286a3324_/lib/python3.12/site-packages/singlem/pipe.py", li>
otu_table_object = self.run_to_otu_table(**kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/fs04/rp24/fra/analyses/Tess/output/binning/binchicken/path_to_conda_envs/8af436ae67d59fa6a12aa49a286a3324_/lib/python3.12/site-packages/singlem/pipe.py", li>
otu_table_object = self.assign_taxonomy_and_process(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/fs04/rp24/fra/analyses/Tess/output/binning/binchicken/path_to_conda_envs/8af436ae67d59fa6a12aa49a286a3324_/lib/python3.12/site-packages/singlem/pipe.py", li>
assignment_result = self.assign_taxonomy(
^^^^^^^^^^^^^^^^^^^^^^
File "/fs04/rp24/fra/analyses/Tess/output/binning/binchicken/path_to_conda_envs/8af436ae67d59fa6a12aa49a286a3324/lib/python3.12/site-packages/singlem/pipe.py", li>
query_based_assignment_result = PipeTaxonomyAssignerByQuery().assign_taxonomy(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/fs04/rp24/fra/analyses/Tess/output/binning/binchicken/path_to_conda_envs/8af436ae67d59fa6a12aa49a286a3324_/lib/python3.12/site-packages/singlem/pipe_taxonom>
query_single_set(queries[0], 0)
File "/fs04/rp24/fra/analyses/Tess/output/binning/binchicken/path_to_conda_envs/8af436ae67d59fa6a12aa49a286a3324_/lib/python3.12/site-packages/singlem/pipe_taxonom>
for hit in querier.query_with_queries(queries, sdb, max_species_divergence, method, SequenceDatabase.NUCLEOTIDE_TYPE, 1, None, False, None):
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/fs04/rp24/fra/analyses/Tess/output/binning/binchicken/path_to_conda_envs/8af436ae67d59fa6a12aa49a286a3324_/lib/python3.12/site-packages/singlem/querier.py",>
smafa_stdout = extern.run(smafa_cmd, stdin='\n'.join([">{}\n{}".format(i, q.sequence) for i, q in enumerate(chunked_queries)]))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/fs04/rp24/fra/analyses/Tess/output/binning/binchicken/path_to_conda_envs/8af436ae67d59fa6a12aa49a286a3324_/lib/python3.12/site-packages/extern/init.py",>
raise ExternCalledProcessError(process, command)
extern.ExternCalledProcessError: Command smafa query --database '/home/fricci/rp24/fra/database_files/S3.2.1.GTDB_r214.metapackage_20231006.smpkg.zb/payload_director>
STDERR was: b'[2025-01-10T06:42:25Z INFO bird_tool_utils::clap_utils] Smafa version 0.8.0\n[2025-01-10T06:42:25Z INFO smafa] Decoding db file "/home/fricci/rp24/fr>