Subsetting the database when using poppunk_visualise --cytoscape with --include-files #196

Closed
muppi1993 opened this issue Feb 10, 2022 · 5 comments · Fixed by #204
Labels
bug Something isn't working

Comments

@muppi1993
Contributor

Versions

poppunk 2.4.0
poppunk_sketch 1.7.4

Command used and output returned

poppunk_visualise --ref-db GPS_v4 --query-db poppunk_clusters --cytoscape --output example_cytoscape --tree none --include-files gps_cluster3_list.txt --network-file GPS_v4/GPS_v4_graph.gt 

Output: one .graphml and two .csv files

Describe the bug

I tried to only include a subset of the dataset in the output with --include-files, which worked fine for the --microreact output. However, the .graphml network contains all isolates from the database rather than just those listed in gps_cluster3_list.txt.

@johnlees added the bug label on Feb 24, 2022
@johnlees
Member

This should be done by this bit of code which masks the not-included nodes: https://github.com/johnlees/PopPUNK/blob/46aff5d5715b26a7582c733d6956cb4c78748a99/PopPUNK/plot.py#L488-L495

But maybe when we print the graphml it prints the masked nodes too? @nickjcroucher do you remember if this is the case?

@nickjcroucher
Collaborator

All gets saved with the same function - masking can be a little tricky in graph-tool. Will take a look.

@nickjcroucher
Collaborator

Also, @muppi1993 highlighted that running with just --cytoscape still generates a tree, which is not needed and is quite slow. I think we should change this default, unless there are any objections?

@johnlees mentioned this issue on May 12, 2022
@johnlees linked a pull request on May 12, 2022 that will close this issue
@johnlees
Member

johnlees commented Aug 3, 2022

> All gets saved with the same function - masking can be a little tricky in graph-tool. Will take a look.

I think we just need a GraphView (adding in #204)

> Also, @muppi1993 highlighted that running with just --cytoscape still generates a tree, which is not needed and is quite slow. I think we should change this default, unless there are any objections?

Agree, will also add in 2.5.0

@sydelstan

sydelstan commented Apr 4, 2024

poppunk_visualise --ref-db poppunk_clusters --output cytoscape_5 --cytoscape --network-file /poppunk_clusters/poppunk_clusters_refs_graph.gt --include-files strains.csv --external-clustering meta.csv

I have the same issue when running this command: several strains from the reference database are still included in the final Cytoscape output, even though it should include only the strains from the query/strain list.
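As a possible stopgap while the export still contains unwanted isolates, the .graphml file could be filtered after the fact. The sketch below is not PopPUNK code and does not use graph-tool; it only post-processes the XML with the Python standard library. It rests on assumptions that should be checked against a real output file: that each sample name is stored in a node-level `<data>` element whose key has `attr.name="id"`, and that the include list holds one sample name per line. The function names `read_include_list` and `filter_graphml` are hypothetical.

```python
import xml.etree.ElementTree as ET

GRAPHML_NS = "http://graphml.graphdrawing.org/xmlns"


def read_include_list(path):
    """Read sample names, assuming one name per line (blank lines ignored)."""
    with open(path) as handle:
        return {line.strip() for line in handle if line.strip()}


def filter_graphml(in_path, out_path, keep_names, name_attr="id"):
    """Write a copy of a GraphML file keeping only nodes named in keep_names.

    Assumption: the sample name lives in a node <data> element whose <key>
    declares attr.name == name_attr. Returns the number of nodes removed.
    """
    ET.register_namespace("", GRAPHML_NS)  # keep the default namespace on output
    tree = ET.parse(in_path)
    root = tree.getroot()

    # Find the key id under which the node name attribute is stored.
    name_key = None
    for key in root.findall(f"{{{GRAPHML_NS}}}key"):
        if key.get("for") == "node" and key.get("attr.name") == name_attr:
            name_key = key.get("id")
    if name_key is None:
        raise ValueError(f"no node attribute named {name_attr!r} in {in_path}")

    keep = set(keep_names)
    removed = 0
    for graph in root.findall(f"{{{GRAPHML_NS}}}graph"):
        dropped_ids = set()
        # findall returns a list, so removing elements while looping is safe.
        for node in graph.findall(f"{{{GRAPHML_NS}}}node"):
            label = None
            for data in node.findall(f"{{{GRAPHML_NS}}}data"):
                if data.get("key") == name_key:
                    label = (data.text or "").strip()
            if label not in keep:
                dropped_ids.add(node.get("id"))
                graph.remove(node)
        # Drop every edge that touches a removed node.
        for edge in graph.findall(f"{{{GRAPHML_NS}}}edge"):
            if edge.get("source") in dropped_ids or edge.get("target") in dropped_ids:
                graph.remove(edge)
        removed += len(dropped_ids)

    tree.write(out_path, xml_declaration=True, encoding="utf-8")
    return removed
```

It would be called as `filter_graphml("in.graphml", "out.graphml", read_include_list("strains.csv"))`, with the file names replaced by the actual output and include list. If the include list has a header row or extra columns, `read_include_list` would need adjusting first.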

