[Feature Request] Increase generalizability of software #12

jolespin · 2024-08-15T19:28:08Z

First off, thank you so much for developing this software suite! I've been struggling to calculate KEGG module completion ratios (MCR) at scale ever since KEGG removed their MAPLE software, and even MAPLE couldn't run at scale because it depended on the web interface. I found a KEGG MCR calculation in MicrobeAnnotator that I reimplemented in my VEBA software (doi:10.1093/nar/gkae528), but the methodology from the original implementation is hard-coded and doesn't capture alternative paths as well as it could. Separately, I've been developing some "shortest path" based approaches for MCR calculations, but I would rather use your implementation since it's already further along.

For general usage, I'd like to recommend a few things:
0. Remove forced dependency versions

biopython==1.83
networkx==3.3
graphviz==0.20.3

This is very restrictive. Maybe you can do something like graphviz>=0.20.3 if this is the minimum version.

1. Move all of the functions into a package with few dependencies. This would let users either import the functions and run the tool within a Python environment (e.g., if they want to include it as a dependency of another package, as I plan to do) or run it externally through a command line interface (current usage).
2. For the "list" option, I would recommend using line breaks instead of commas, because most tools (e.g., grep, seqkit, skani) take identifier lists with each item on a new line.
3. Provide a batch option that accepts a table of [id_genome, id_ko] (or alternatively [id_contig, id_ko] if not genome-resolved).
4. Make some of the packages optional (e.g., Biopython), with an error message telling the user to install a package when they try to run functionality that requires it and it's not installed. The idea here is to keep the package as lightweight and flexible as possible.
5. Output to a directory instead of using the output argument as a base name prefix.
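
To make the suggestions about the line-separated list, the batch table, and the optional dependencies more concrete, here is a minimal sketch of what I have in mind. All function names (`read_identifiers`, `read_batch_table`, `require_optional`) are hypothetical and not part of the current tool, and the tab-separated two-column layout is only an assumption about how a batch table could look.

```python
import importlib
from collections import defaultdict
from io import StringIO


def read_identifiers(handle):
    """Read one identifier per line, skipping blank lines (hypothetical helper)."""
    return [line.strip() for line in handle if line.strip()]


def read_batch_table(handle, sep="\t"):
    """Group KO identifiers by genome (or contig) from [id_genome, id_ko] rows.

    Assumes every non-blank row has at least two columns separated by `sep`.
    """
    genome_to_kos = defaultdict(set)
    for line in handle:
        line = line.strip()
        if not line:
            continue
        id_genome, id_ko = line.split(sep)[:2]
        genome_to_kos[id_genome].add(id_ko)
    return dict(genome_to_kos)


def require_optional(module_name, install_hint):
    """Import an optional dependency, or raise an error that says how to install it."""
    try:
        return importlib.import_module(module_name)
    except ImportError as error:
        raise ImportError(
            f"'{module_name}' is required for this feature. "
            f"Install it with: {install_hint}"
        ) from error


# Example input: one KO per line, and a two-column batch table.
kos = read_identifiers(StringIO("K00001\nK00002\n\nK00003\n"))
genome_to_kos = read_batch_table(
    StringIO("genome_A\tK00001\ngenome_A\tK00002\ngenome_B\tK00001\n")
)
```

With this layout, a feature that needs Biopython could call something like `require_optional("Bio", "pip install biopython")` only at the point where it is used, so the core package would not need it installed.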


KateSakharova · 2024-08-20T09:21:13Z

Hi @jolespin, thank you very much for your suggestions!
I will definitely improve the tool in the next release!
Best,
Kate

jolespin · 2024-08-23T20:17:05Z

Hi @KateSakharova, just reaching out to let you know that I tried pushing some changes to your repo to address the items above, but I had a lot of difficulty with the package structure/layout.

I need the pathway completion functionality for my VEBA software suite (https://github.com/jolespin/veba) to alleviate some bottlenecks in my workflow. I'm currently backlogged on some analyses, and I realized that it would be faster for me to reimplement the tool than to push/pull changes to the current repo. However, I would be more than happy to help integrate the changes into your package if you are interested. In the meantime, I have fully acknowledged you at the top and bottom of the repo so people know that the theory and base code are credited to you.

The reimplementation is below:
https://github.com/jolespin/kegg_pathway_profiler/

I designed the reimplementation so it can be used within Python and through the CLI
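
As a rough illustration of that dual-use layout (not the actual API of kegg_pathway_profiler), the sketch below keeps the logic in an importable function and wraps it in a thin `argparse` entry point that writes to an output directory. The names `count_unique_kos`, `main`, the flags, and `summary.tsv` are all hypothetical, and the placeholder function only counts distinct KOs; it does not compute pathway completeness.

```python
import argparse
from pathlib import Path


def count_unique_kos(kos):
    """Placeholder library function: number of distinct KO identifiers.

    Stands in for the real analysis; it is NOT a completeness calculation.
    """
    return len(set(kos))


def main(argv=None):
    """Hypothetical CLI wrapper that calls the same library function."""
    parser = argparse.ArgumentParser(description="Illustrative CLI wrapper")
    parser.add_argument("-i", "--input", required=True,
                        help="File with one KO identifier per line")
    parser.add_argument("-o", "--output_directory", required=True,
                        help="Directory where results are written")
    args = parser.parse_args(argv)

    # Read identifiers, one per line, ignoring blank lines.
    with open(args.input) as handle:
        kos = [line.strip() for line in handle if line.strip()]

    # Write results into a directory rather than using a base name prefix.
    output_directory = Path(args.output_directory)
    output_directory.mkdir(parents=True, exist_ok=True)
    output_path = output_directory / "summary.tsv"
    output_path.write_text(f"n_unique_kos\t{count_unique_kos(kos)}\n")
    return 0
```

From Python, one would import and call `count_unique_kos` directly; from the shell, a console-script entry point could call `main()`, which parses `sys.argv` when `argv` is `None`.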

If you would like me to make any adjustments (e.g., adding a preprint or any other citations), please let me know.

This package is a reimplementation of kegg-pathways-completeness-tool (e.g., base code and theory).
For any publications or usage, please cite the original implementation and credit the lead developer (See Acknowledgements below).

Acknowledgements:
Ekaterina Sakharova, the developer of the original implementation, kegg-pathways-completeness-tool.

KateSakharova self-assigned this Aug 20, 2024

KateSakharova added the enhancement New feature or request label Aug 20, 2024

KateSakharova mentioned this issue Sep 30, 2024

Code improvements #14

Open
