"Error in rule fix_gtdb_taxonomy" for custom data #235

Closed
andrekind17 opened this issue Apr 17, 2023 · 2 comments · Fixed by #226

Comments

@andrekind17
Collaborator

INFO 17/04 15:28:16 Reading GTDB metadata .json files...
INFO 17/04 15:28:16 Getting taxonomic information...
INFO 17/04 15:28:16 Getting metadata into table...
INFO 17/04 15:28:16 No additional metadata found.
INFO 17/04 15:28:16 Finalizing table...
Traceback (most recent call last):
File "/bigdata/home/WIN.DTU.DK/gentile/bgcflow/.snakemake/conda/03517672abe9c665423eb3b1c199390f_/lib/python3.9/site-packages/pandas/core/indexes/base.py", line 3652, in get_loc
return self._engine.get_loc(casted_key)
File "pandas/_libs/index.pyx", line 147, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/index.pyx", line 176, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/hashtable_class_helper.pxi", line 7080, in pandas._libs.hashtable.PyObjectHashTable.get_item
File "pandas/_libs/hashtable_class_helper.pxi", line 7088, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'metadata'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "/bigdata/home/WIN.DTU.DK/gentile/bgcflow/workflow/bgcflow/bgcflow/data/fix_gtdb_taxonomy.py", line 48, in summarize_gtdb_json
{i: df.loc[i, "metadata"] for i in df.index}
File "/bigdata/home/WIN.DTU.DK/gentile/bgcflow/workflow/bgcflow/bgcflow/data/fix_gtdb_taxonomy.py", line 48, in <dictcomp>
{i: df.loc[i, "metadata"] for i in df.index}
File "/bigdata/home/WIN.DTU.DK/gentile/bgcflow/.snakemake/conda/03517672abe9c665423eb3b1c199390f_/lib/python3.9/site-packages/pandas/core/indexing.py", line 1096, in __getitem__
return self.obj._get_value(*key, takeable=self._takeable)
File "/bigdata/home/WIN.DTU.DK/gentile/bgcflow/.snakemake/conda/03517672abe9c665423eb3b1c199390f_/lib/python3.9/site-packages/pandas/core/frame.py", line 3879, in _get_value
series = self._get_item_cache(col)
File "/bigdata/home/WIN.DTU.DK/gentile/bgcflow/.snakemake/conda/03517672abe9c665423eb3b1c199390f_/lib/python3.9/site-packages/pandas/core/frame.py", line 4264, in _get_item_cache
loc = self.columns.get_loc(item)
File "/bigdata/home/WIN.DTU.DK/gentile/bgcflow/.snakemake/conda/03517672abe9c665423eb3b1c199390f_/lib/python3.9/site-packages/pandas/core/indexes/base.py", line 3654, in get_loc
raise KeyError(key) from err
KeyError: 'metadata'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/bigdata/home/WIN.DTU.DK/gentile/bgcflow/workflow/bgcflow/bgcflow/data/fix_gtdb_taxonomy.py", line 81, in <module>
summarize_gtdb_json(sys.argv[1], sys.argv[2])
File "/bigdata/home/WIN.DTU.DK/gentile/bgcflow/workflow/bgcflow/bgcflow/data/fix_gtdb_taxonomy.py", line 71, in summarize_gtdb_json
[df.loc[:, ["genome_id", "gtdb_release"]], df_taxonomy], axis=1
File "/bigdata/home/WIN.DTU.DK/gentile/bgcflow/.snakemake/conda/03517672abe9c665423eb3b1c199390f_/lib/python3.9/site-packages/pandas/core/indexing.py", line 1097, in __getitem__
return self._getitem_tuple(key)
File "/bigdata/home/WIN.DTU.DK/gentile/bgcflow/.snakemake/conda/03517672abe9c665423eb3b1c199390f_/lib/python3.9/site-packages/pandas/core/indexing.py", line 1289, in _getitem_tuple
return self._getitem_tuple_same_dim(tup)
File "/bigdata/home/WIN.DTU.DK/gentile/bgcflow/.snakemake/conda/03517672abe9c665423eb3b1c199390f_/lib/python3.9/site-packages/pandas/core/indexing.py", line 955, in _getitem_tuple_same_dim
retval = getattr(retval, self.name)._getitem_axis(key, axis=i)
File "/bigdata/home/WIN.DTU.DK/gentile/bgcflow/.snakemake/conda/03517672abe9c665423eb3b1c199390f_/lib/python3.9/site-packages/pandas/core/indexing.py", line 1332, in _getitem_axis
return self._getitem_iterable(key, axis=axis)
File "/bigdata/home/WIN.DTU.DK/gentile/bgcflow/.snakemake/conda/03517672abe9c665423eb3b1c199390f_/lib/python3.9/site-packages/pandas/core/indexing.py", line 1272, in _getitem_iterable
keyarr, indexer = self._get_listlike_indexer(key, axis)
File "/bigdata/home/WIN.DTU.DK/gentile/bgcflow/.snakemake/conda/03517672abe9c665423eb3b1c199390f_/lib/python3.9/site-packages/pandas/core/indexing.py", line 1462, in _get_listlike_indexer
keyarr, indexer = ax._get_indexer_strict(key, axis_name)
File "/bigdata/home/WIN.DTU.DK/gentile/bgcflow/.snakemake/conda/03517672abe9c665423eb3b1c199390f_/lib/python3.9/site-packages/pandas/core/indexes/base.py", line 5876, in _get_indexer_strict
self._raise_if_missing(keyarr, indexer, axis_name)
File "/bigdata/home/WIN.DTU.DK/gentile/bgcflow/.snakemake/conda/03517672abe9c665423eb3b1c199390f_/lib/python3.9/site-packages/pandas/core/indexes/base.py", line 5938, in _raise_if_missing
raise KeyError(f"{not_found} not in index")
KeyError: "['gtdb_release'] not in index"
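
For readers following the traceback, here is a minimal sketch of how the two chained `KeyError`s can arise. It assumes the table has neither a `metadata` nor a `gtdb_release` column, and that the second lookup runs inside the handler for the first. The DataFrame contents and the helper `collect_errors` are hypothetical; only the two `df.loc` expressions are taken from the traceback.

```python
import pandas as pd

# Hypothetical stand-in for a table built from custom FASTA inputs:
# it has no "metadata" column and no "gtdb_release" column.
df = pd.DataFrame({"genome_id": ["custom_1", "custom_2"]})


def collect_errors(frame):
    """Return the KeyError messages raised by the two lookups in the traceback."""
    messages = []
    try:
        # Scalar lookup of a missing column (the expression on line 48).
        {i: frame.loc[i, "metadata"] for i in frame.index}
    except KeyError as err:
        messages.append(str(err))
        try:
            # List-like lookup inside the handler (the expression on line 71).
            frame.loc[:, ["genome_id", "gtdb_release"]]
        except KeyError as err2:
            messages.append(str(err2))
    return messages


errors = collect_errors(df)
print(errors)
```

Because the second lookup fails while the first `KeyError` is still being handled, Python reports it with "During handling of the above exception, another exception occurred".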

@matinnuhamunada linked a pull request on Apr 17, 2023 that will close this issue
@matinnuhamunada
Collaborator

This issue occurs when a project contains only custom FASTA files and no additional taxonomy information. It will be fixed in 0.6.1.
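
Until that release is available, one way to guard against absent columns is sketched below. This is not the fix from the linked pull request, only an illustration using `DataFrame.reindex`. The helper name `select_with_defaults`, the placeholder value `"unknown"`, and the sample genome IDs are assumptions made for the example.

```python
import pandas as pd


def select_with_defaults(df, columns, fill_value="unknown"):
    """Return the requested columns, creating any missing ones with fill_value.

    Unlike df.loc[:, columns], reindex does not raise KeyError when a
    requested column is absent.
    """
    return df.reindex(columns=columns, fill_value=fill_value)


# Hypothetical table for a project with only custom FASTA files.
df = pd.DataFrame({"genome_id": ["custom_1", "custom_2"]})
table = select_with_defaults(df, ["genome_id", "gtdb_release"])
print(table)
```

Existing columns keep their values, and only the columns that were missing are filled with the placeholder.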

@andrekind17
Collaborator Author

Thanks!
