Accessory distance issue #135
Hi Stephen - what version of the software are you using? The most recent versions have a --qc-filter continue flag to prevent runs halting at the database creation stage; the --max-a-dist 1.0 will prevent such filtering at the distance estimation stage. Hopefully those will help! |
Hello Nicholas,
I hope you are doing well and staying safe. I got my first dose of the COVID-19 vaccine about an hour ago. I had some difficulty getting poppunk to install using the “conda install” process. On one Linux/Ubuntu 18.04 workstation conda only wanted to install an early version 1 variant. I ended up putting poppunk on a second Ubuntu 18.04 workstation, where conda accepted version 2.0.2 for install. Initially after install I was getting an error message indicating a problem with biopython and Bio.Alphabet, but that was because I needed a newer version of python3. After updating python to 3.8.6, poppunk ran without throwing error messages. I have not yet mastered the process of setting up different environments to isolate different versions of perl and python from each other and eliminate running/package/library conflicts.
SBB
Houston Methodist. Leading Medicine. Houston Methodist is ranked by U.S. News & World Report as the No. 1 hospital in Texas for patient care and safety and one of the top 20 hospitals in the nation. In addition, we are nationally ranked in 11 specialties, the most in the state, and have been national leaders throughout the COVID-19 pandemic in research, offering innovative treatments and surpassing CDC safety standards. For more than 100 years, Houston Methodist has provided the best — and safest — clinical care, advanced technology and patient experience. That is our promise of leading medicine.
houstonmethodist.org
twitter.com/MethodistHosp
facebook.com/HoustonMethodist ***CONFIDENTIALITY NOTICE*** This e-mail is the property of Houston Methodist and/or its relevant affiliates and may contain restricted and privileged material for the sole use of the intended recipient(s). Any review, use, distribution or disclosure by others is strictly prohibited. If you are not the intended recipient (or authorized to receive for the recipient), please contact the sender and delete all copies of the message. Thank you.
|
Congratulations on getting the vaccine so quickly! The conda version should be 2.2.0 at the moment, though we are hopeful of upgrading this in the next couple of days. |
If you cannot get 2.2.0, please make sure all your input is alphabetically sorted. See: Ideally you should upgrade to the most recent version, but I appreciate conda can be a pain to get right. My advice would be:
|
Also make sure the channel order is correct (conda-forge -> bioconda -> defaults) and see the advice here if there are problems: https://conda-forge.org/docs/user/tipsandtricks.html#using-multiple-channels |
Nick,
Early access to the vaccine comes as a consequence of working in a large medical center. The Musser lab has sequenced over 20,000 SARS-CoV-2 patient isolates as part of the ongoing epidemiology and evolution analysis of the virus, with no end to the patient isolate sequencing likely in the near future. I will try building poppunk from source to see if I can get the latest version installed, and I will also try the hints that John just sent. I plan to attend the upcoming Lancefield 2021 meeting and hope to see you there and get a chance to buy you your libation of choice.
Sincerely,
SBB
*************************************************************
Stephen B. Beres Ph.D.
Professor of Pathology and Genomic Medicine,
Institute for Academic Medicine
Director of Microbial Informatics,
Center for Molecular and Translational Human Infectious Disease Research
Houston Methodist Research Institute
6670 Bertner Ave.
Houston, TX 77030
MS RIB R6-111
(713) 441-5067
[email protected]
|
John,
Pursuant to my initial request for assistance with getting poppunk installed:
I am running poppunk on a large set of sequences that essentially lack any accessory gene content. I can complete the first step of generating the DB, but when trying to run an initial model fit, I get numerous warnings, "Accessory outlier at a= ...", and a final output, "Distances failed quality control (change QC options to run anyway)". Can you suggest which QC options should be changed? I have tried adding --ignore-length and --core-only with the same end results. Thanks for the software and any assistance.
I have gotten the latest version of poppunk 2.2.0 installed and have run it a couple of times on a large set of 27,000 sequences using different values of K:
➜ 27065x442708_fna-files poppunk --fit-model --distances 27065x442708/27065x442708.dists --output 27065x442708 --full-db --ref-db 27065x442708 --K 2 --threads 54
Graph-tools OpenMP parallelisation enabled: with 54 threads
PopPUNK (POPulation Partitioning Using Nucleotide Kmers)
(with backend: sketchlib v1.5.3
sketchlib: /home/bioinfo-4/anaconda3/lib/python3.8/site-packages/pp_sketchlib.cpython-38-x86_64-linux-gnu.so)
Mode: Fitting model to reference database
Fit summary:
Avg. entropy of assignment 0.0003
Number of components used 2
Scaled component means:
[0.8505769 0.60737495]
[0.03222048 0.01719254]
Network summary:
Components 294
Density 0.1041
Transitivity 0.9998
Score 0.8957
Done
➜ 27065x442708_fna-files poppunk --fit-model --distances 27065x442708/27065x442708.dists --output 27065x442708 --full-db --ref-db 27065x442708 --K 3 --threads 54
Graph-tools OpenMP parallelisation enabled: with 54 threads
PopPUNK (POPulation Partitioning Using Nucleotide Kmers)
(with backend: sketchlib v1.5.3
sketchlib: /home/bioinfo-4/anaconda3/lib/python3.8/site-packages/pp_sketchlib.cpython-38-x86_64-linux-gnu.so)
Mode: Fitting model to reference database
Fit summary:
Avg. entropy of assignment 0.0001
Number of components used 3
Scaled component means:
[0.86022602 0.60916511]
[0.02274602 0.00908855]
[0.35283459 0.28778643]
Network summary:
Components 1065
Density 0.0893
Transitivity 0.9834
Score 0.8956
Done
The model fit values seem to be pretty good, but the number of components determined using K = 2 is closer to my expectations. I would now like to try running poppunk clustering at different fixed threshold distances, but I get the following error messages:
➜ 27065x442708_fna-files poppunk --threshold 0.05 --distances 27065x442708/27065x442708.dists --output 27065x442708 --full-db --ref-db 27065x442708 --threads 54
Graph-tools OpenMP parallelisation enabled: with 54 threads
PopPUNK (POPulation Partitioning Using Nucleotide Kmers)
(with backend: sketchlib v1.5.3
sketchlib: /home/bioinfo-4/anaconda3/lib/python3.8/site-packages/pp_sketchlib.cpython-38-x86_64-linux-gnu.so)
Mode: Applying a core distance threshold
Traceback (most recent call last):
File "/home/bioinfo-4/anaconda3/bin/poppunk", line 10, in <module>
sys.exit(main())
File "/home/bioinfo-4/anaconda3/lib/python3.8/site-packages/PopPUNK/__main__.py", line 476, in main
assignments = new_model.apply_threshold(distMat, args.threshold)
File "/home/bioinfo-4/anaconda3/lib/python3.8/site-packages/PopPUNK/models.py", line 653, in apply_threshold
y = self.assign(X)
File "/home/bioinfo-4/anaconda3/lib/python3.8/site-packages/PopPUNK/models.py", line 749, in assign
y = pp_sketchlib.assignThreshold(X/self.scale, 0, self.core_boundary, 0, cpus)
TypeError: assignThreshold(): incompatible function arguments. The following argument types are supported:
1. (distMat: numpy.ndarray[numpy.float32[m, n], flags.writeable, flags.c_contiguous], slope: int, x_max: float, y_max: float, num_threads: int = 1) -> numpy.ndarray[numpy.float32[m, 1]]
Invoked with: array([[2.50596058e-04, 1.74374913e-03],
[1.93757660e-04, 1.56915467e-03],
[2.35754022e-04, 1.51339627e-03],
...,
[1.06428655e-04, 4.23902558e-04],
[8.54968166e-05, 1.78980772e-04],
[8.96912243e-05, 1.83410230e-04]]), 0, 0.05, 0, 1
I have checked all of the listed poppunk dependencies, and all are in place and exceed the minimum versions. As the error messages seem to indicate an incompatibility with numpy, I have installed numpy 1.19.4.
I appreciate any assistance or additional guidance you can provide.
Happy Holidays,
SBB
|
Hi Stephen, The second issue may be a bug at our end, it was previously reported here too: #106 |
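A note on the traceback above: the only signature the error lists for assignThreshold takes a float32, C-contiguous array, while the printed argument carries no dtype=float32 suffix (NumPy adds that suffix when printing float32 arrays), so the array passed in was probably float64. The sketch below only illustrates that kind of dtype promotion and one way to cast back. The arrays and the helper name as_float32_contiguous are hypothetical stand-ins, and this is a guess from the error text, not a confirmed diagnosis of the PopPUNK bug.

```python
import numpy as np


def as_float32_contiguous(dist_mat):
    """Return dist_mat as a C-contiguous float32 array (copies only if needed)."""
    return np.ascontiguousarray(dist_mat, dtype=np.float32)


# Hypothetical stand-ins for a two-column (core, accessory) distance matrix
# and a per-column scale; neither is taken from PopPUNK itself.
distances = np.array([[2.5e-04, 1.7e-03],
                      [1.9e-04, 1.6e-03]], dtype=np.float32)
scale = np.array([1.0, 1.0])  # NumPy defaults to float64 here

scaled = distances / scale    # float32 / float64 array promotes to float64
fixed = as_float32_contiguous(scaled)

print(scaled.dtype, fixed.dtype, fixed.flags.c_contiguous)
```

If the division really does promote the array inside PopPUNK, a cast of this kind before the call would match the signature shown in the error; upgrading as the maintainers suggest remains the safer route.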
I have just released a new version of PopPUNK. Could you try updating by running conda install poppunk==2.3.0 pp-sketchlib==1.6.0 in your conda environment?
The --threshold command works on my tests in PopPUNK 2.3.0 w/ pp-sketchlib 1.6.0. Its format has changed slightly, so in your case would be:
poppunk --fit-model threshold --threshold 0.05 --distances 27065x442708/27065x442708.dists --output 27065x442708 --full-db --ref-db 27065x442708 --threads 54
(see https://poppunk.readthedocs.io/en/latest/model_fitting.html#threshold)
If that still doesn't work, perhaps you could share your .h5 file with me so I can see if I can replicate your issue? |
John,
Happy New Year and Brexit to you. Thanks for the update, it may take me a few days. Both of my big linux workstations are busy running a RAxML alignment and a Mauve alignment and have been doing so for the last 3 days. I do not know how long it will be until they finish, but when/if they do, I will give it a shot.
Best,
SBB
|
Hi there,
I found that a good workaround was to use Florent |
John,
Some time ago I installed the updated version of PopPUNK as advised (as below) and have used it to parse my large population using a variety of the provided models. I ran into a problem when running HDBSCAN: it kept failing because it was running out of RAM. I would start PopPUNK running later in the day so that it could process all night, then come in in the morning to find that it had failed. It was not until I started it running early in the morning and stayed late to monitor the run that it became clear that it was exceeding my installed 128 GB of RAM. Running refinement of the HDBSCAN clustering seemed to need even more RAM than did the initial HDBSCAN clustering. I set up swap space and the HDBSCAN clustering worked fine. A RAM requirement warning for the HDBSCAN clustering in the manual or a more informative error message in the program might be helpful.
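For a sense of scale, the number of pairwise distances grows quadratically with the number of samples. The sketch below does that arithmetic under stated assumptions: 27,065 samples (read off the dataset name), two values per pair (core and accessory, as in the two-column array in the earlier traceback), and 4-byte float32 values. It gives only the size of a dense distance table, not HDBSCAN's own working memory, which this thread does not quantify.

```python
def n_pairs(n_samples):
    """Number of unordered sample pairs, n * (n - 1) / 2."""
    return n_samples * (n_samples - 1) // 2


def dense_distance_bytes(n_samples, n_columns=2, bytes_per_value=4):
    """Bytes needed to hold one value per column for every sample pair."""
    return n_pairs(n_samples) * n_columns * bytes_per_value


n = 27065  # assumed sample count, taken from the dataset name
print(n_pairs(n), "pairs")
print(round(dense_distance_bytes(n) / 1024**3, 2), "GiB for the distance table alone")
```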
I am now trying to evaluate/compare the different clusterings by calculating the Rand index. There seems to have been a name change for the programs in the scripts folder. The manual gives “calculate_rand_indices.py”, but I could not find this in my execution path for the conda PopPUNK install. Searching anaconda/bin I found “poppunk_calculate_rand_indices.py”, and most of the other scripts were also prepended with “poppunk_”. I ran “poppunk_calculate_rand_indices.py” and got the following error messages:
➜ 27065x442708 poppunk_calculate_rand_indices.py
Traceback (most recent call last):
File "/home/bioinfo-4/anaconda3/bin/poppunk_calculate_rand_indices.py", line 11, in <module>
from sklearn.metrics.cluster.supervised import check_clusterings
ModuleNotFoundError: No module named 'sklearn.metrics.cluster.supervised'
➜ 27065x442708 poppunk_calculate_rand_indices.py --help
Traceback (most recent call last):
File "/home/bioinfo-4/anaconda3/bin/poppunk_calculate_rand_indices.py", line 11, in <module>
from sklearn.metrics.cluster.supervised import check_clusterings
ModuleNotFoundError: No module named 'sklearn.metrics.cluster.supervised'
➜ 27065x442708 poppunk_calculate_rand_indices.py --input dist_0.003/27065x442708_clusters.csv,dbscan/refined/27065x442708_clusters.csv
Traceback (most recent call last):
File "/home/bioinfo-4/anaconda3/bin/poppunk_calculate_rand_indices.py", line 11, in <module>
from sklearn.metrics.cluster.supervised import check_clusterings
ModuleNotFoundError: No module named 'sklearn.metrics.cluster.supervised'
➜ 27065x442708 poppunk_calculate_silhouette.py
usage: calculate_silhouette [-h] --distances DISTANCES --cluster-csv CLUSTER_CSV [--cluster-col CLUSTER_COL] [--id-col ID_COL] [--sub SUB]
calculate_silhouette: error: the following arguments are required: --distances, --cluster-csv
Running “poppunk_calculate_silhouette.py” seems to work. The rand indices script seems to be looking for a scikit-learn module. The PopPUNK manual lists scikit-learn 0.19.1 as a dependency; from conda list I have scikit-learn 0.24.1 installed.
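While the script is unavailable, the plain (unadjusted) Rand index between two clusterings can be computed with the standard library alone. The sketch below is not the PopPUNK script's implementation, and it assumes the two label lists are already in the same sample order; the function names are illustrative. It counts pairs through a contingency table rather than looping over every pair, so it stays practical for tens of thousands of samples.

```python
from collections import Counter


def _pairs(k):
    """Number of unordered pairs that can be drawn from k items."""
    return k * (k - 1) // 2


def rand_index(labels_a, labels_b):
    """Fraction of sample pairs on which two clusterings agree."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both clusterings must label the same samples")
    total = _pairs(len(labels_a))
    if total == 0:
        raise ValueError("need at least two samples")
    # Pairs placed together in both clusterings, in A only, and in B only.
    same_both = sum(_pairs(c) for c in Counter(zip(labels_a, labels_b)).values())
    same_a = sum(_pairs(c) for c in Counter(labels_a).values())
    same_b = sum(_pairs(c) for c in Counter(labels_b).values())
    # Agreements = together in both + apart in both.
    agreements = total + 2 * same_both - same_a - same_b
    return agreements / total


print(rand_index([0, 0, 1, 1], [1, 1, 0, 0]))  # same partition, different label names
```

Cluster labels are compared only for equality, so the label values themselves (numbers or strings) do not need to match between the two clusterings.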
Thanks for your assistance.
Best regards,
SBB
|
John,
Please disregard my last message. I did a conda --force-reinstall for scikit-learn 0.24.1, and now “poppunk_calculate_rand_indices.py” seems to be working.
Sincerely,
SBB
|
John,
Please disregard my last “Please disregard” message. I am still getting the same error message looking for 'sklearn.metrics.cluster.supervised' when running “poppunk_calculate_rand_indices.py” with PopPUNK 2.3.0 and scikit-learn 0.24.1.
SBB
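On the ModuleNotFoundError above: as far as I recall, scikit-learn moved this module to a private name (sklearn.metrics.cluster._supervised) around release 0.22 and dropped the old import path by 0.24, which would explain why a script written against 0.19 fails on 0.24.1. Treat those version numbers as an assumption to check against the scikit-learn changelog. The sketch below shows a generic fallback import; import_first and load_check_clusterings are hypothetical helpers, not part of PopPUNK or scikit-learn.

```python
import importlib


def import_first(attr, module_paths):
    """Return `attr` from the first module in `module_paths` that provides it."""
    errors = []
    for path in module_paths:
        try:
            return getattr(importlib.import_module(path), attr)
        except (ImportError, AttributeError) as exc:
            errors.append(f"{path}: {exc}")
    raise ImportError(f"could not import {attr!r}; tried " + "; ".join(errors))


# Candidate locations for check_clusterings, newest layout first (assumed).
SKLEARN_PATHS = [
    "sklearn.metrics.cluster._supervised",  # private module name in newer releases
    "sklearn.metrics.cluster.supervised",   # path used by the failing script
]


def load_check_clusterings():
    """Look up check_clusterings wherever the installed scikit-learn keeps it."""
    return import_first("check_clusterings", SKLEARN_PATHS)
```

Relying on a private module is fragile, so pinning scikit-learn to a version the script was written for, or updating PopPUNK once the script is fixed upstream, may be the more durable option.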
From: Work ***@***.***>
Date: Tuesday, March 9, 2021 at 14:25
To: johnlees/PopPUNK ***@***.***>
Subject: FW: [EXTERNAL] Re: [johnlees/PopPUNK] Accessory distance issue (#135)
John,
Please disregard my last message. I did a conda –force-reinstall for scikit-learn 0.24.1, and now “poppunk_calculate_rand_indicies.py” seems to be working.
Sincerely,
SBB
From: Work ***@***.***>
Date: Tuesday, March 9, 2021 at 14:01
To: johnlees/PopPUNK ***@***.***>
Subject: Re: [EXTERNAL] Re: [johnlees/PopPUNK] Accessory distance issue (#135)
John,
I some time ago installed the updated version of PopPUNK as advised (as below) and have used it to parse my large population using a variety of the provided models. I ran into some problems when running HDBSCAN that it kept failing. The problem was that it was running out of ram. I would start PopPUNK running later in the day so that it could process all night, come in in the morning to find that it had failed. It was not until I started it running early in the morning and then stayed late to monitor the running that it became clear that it was exceeding my installed 128 GB of RAM. Running refinement of the HDBSCAN clustering seemed to need even more RAM than did the initial HDBSCAN clustering. I setup swap space and the HDBSDCAN clustering worked fine. A RAM requirement warning for the HDBSCAN clustering in the manual or a more informative error message in the program might be helpful.
I am now trying to evaluate and compare the different clusterings by calculating the Rand index. There seems to have been a name change for the programs in the scripts folder. The manual gives “calculate_rand_indices.py”, but I could not find this in my execution path for the conda PopPUNK install. Searching anaconda/bin I found “poppunk_calculate_rand_indices.py”, and most of the other scripts were also prefixed with “poppunk_”. I ran “poppunk_calculate_rand_indices.py” and got the following error messages:
➜ 27065x442708 poppunk_calculate_rand_indices.py
Traceback (most recent call last):
File "/home/bioinfo-4/anaconda3/bin/poppunk_calculate_rand_indices.py", line 11, in <module>
from sklearn.metrics.cluster.supervised import check_clusterings
ModuleNotFoundError: No module named 'sklearn.metrics.cluster.supervised'
➜ 27065x442708 poppunk_calculate_rand_indices.py --help
Traceback (most recent call last):
File "/home/bioinfo-4/anaconda3/bin/poppunk_calculate_rand_indices.py", line 11, in <module>
from sklearn.metrics.cluster.supervised import check_clusterings
ModuleNotFoundError: No module named 'sklearn.metrics.cluster.supervised'
➜ 27065x442708 poppunk_calculate_rand_indices.py --input dist_0.003/27065x442708_clusters.csv,dbscan/refined/27065x442708_clusters.csv
Traceback (most recent call last):
File "/home/bioinfo-4/anaconda3/bin/poppunk_calculate_rand_indices.py", line 11, in <module>
from sklearn.metrics.cluster.supervised import check_clusterings
ModuleNotFoundError: No module named 'sklearn.metrics.cluster.supervised'
➜ 27065x442708 poppunk_calculate_silhouette.py
usage: calculate_silhouette [-h] --distances DISTANCES --cluster-csv CLUSTER_CSV [--cluster-col CLUSTER_COL] [--id-col ID_COL] [--sub SUB]
calculate_silhouette: error: the following arguments are required: --distances, --cluster-csv
Running “poppunk_calculate_silhouette.py” seems to work. The Rand index script seems to be looking for a scikit-learn module. The PopPUNK manual lists scikit-learn 0.19.1 as a dependency; from conda list I have scikit-learn 0.24.1 installed.
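As an illustration of the quantity being compared here (this is not the PopPUNK script, just a sketch), the plain Rand index between two cluster assignments can be computed with the Python standard library alone, which can serve as a sanity check while the scikit-learn import is broken. It assumes the two label lists are in the same sample order; `rand_index` is a hypothetical helper name, and `math.comb` requires Python 3.8 or later.

```python
from collections import Counter
from math import comb


def rand_index(labels_a, labels_b):
    """Fraction of sample pairs on which two clusterings agree."""
    if len(labels_a) != len(labels_b):
        raise ValueError("label lists must have the same length")
    n = len(labels_a)
    if n < 2:
        raise ValueError("need at least two samples")

    total_pairs = comb(n, 2)
    # Pairs placed together by clustering A, by clustering B, and by both.
    same_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    same_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    same_both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    # Pairs kept apart by both clusterings (inclusion-exclusion).
    apart_both = total_pairs - same_a - same_b + same_both
    return (same_both + apart_both) / total_pairs


if __name__ == "__main__":
    # Identical partitions with different label names agree on every pair.
    print(rand_index(["x", "x", "y", "y"], [2, 2, 1, 1]))
```

Cluster labels only need to be hashable, so strain names or integer cluster IDs read from the two CSV files would both work.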
Thanks for your assistance.
Best regards,
SBB
*************************************************************
Stephen B. Beres Ph.D.
Professor of Pathology and Genomic Medicine,
Institute for Academic Medicine
Director of Microbial Informatics,
Center for Molecular and Translational Human Infectious Disease Research
Houston Methodist Research Institute
6670 Bertner Ave.
Houston, TX 77030
MS RIB R6-111
(713) 441-5067
***@***.***
From: John Lees ***@***.***>
Reply-To: johnlees/PopPUNK ***@***.***>
Date: Tuesday, January 5, 2021 at 06:50
To: johnlees/PopPUNK ***@***.***>
Cc: Work ***@***.***>, Author ***@***.***>
Subject: [EXTERNAL] Re: [johnlees/PopPUNK] Accessory distance issue (#135)
I have just released a new version of PopPUNK. Could you try updating by running conda install poppunk==2.3.0 pp-sketchlib==1.6.0 in your conda environment?
The --threshold command works on my tests in PopPUNK 2.3.0 w/ pp-sketchlib 1.6.0. Its format has changed slightly, so in your case would be:
poppunk --fit-model threshold --threshold 0.05 --distances 27065x442708/27065x442708.dists --output 27065x442708 --full-db --ref-db 27065x442708 --threads 54
(see https://poppunk.readthedocs.io/en/latest/model_fitting.html#threshold)
If that still doesn't work, perhaps you could share your .h5 file with me so I can see if I can replicate your issue?
|
John,
I tried to use conda to install scikit-learn 0.19.1 which is listed as a dependency for PopPUNK 2.3.0, but got back the error message that scikit-learn 0.19.1 is not compatible with Python 3.8.
SBB
|
@flass Was this with using |
yes I was using the |
@sbberes To try and answer your points in turn:
These fixes will be included in PopPUNK v2.4.0, which we hope will be out in the next couple of weeks. With the scripts: These will be in poppunk v2.4.0 also, but to run it now you can just download the updated standalone script from here, which will run from any path as long as you have your conda environment activated. |
John,
No apologies necessary, you always come through quickly with assistance, and I for one appreciate it. As the available genome sequence data sets become larger and larger, PopPUNK will only become a more essential tool. Thanks so much.
SBB
From: John Lees ***@***.***>
Reply-To: johnlees/PopPUNK ***@***.***>
Date: Wednesday, March 10, 2021 at 06:51
To: johnlees/PopPUNK ***@***.***>
Cc: Work ***@***.***>, Mention ***@***.***>
Subject: [EXTERNAL] Re: [johnlees/PopPUNK] Accessory distance issue (#135)
@sbberes To try and answer your points in turn:
* The RAM use (and time) of HDBSCAN fits on large datasets is indeed a problem, and one I ran into in parallel to you. I've fixed it in the recent code and it should now stay below a few Gb in most cases (I was able to fit to 50k genomes using 8 cores in about three hours, and around 30Gb of memory).
* I believe that fit refinement should also now benefit from similar improvements.
These fixes will be included in PopPUNK v2.4.0, which we hope will be out in the next couple of weeks.
With the scripts:
Yes, my apologies that these have changed. I had missed the warning from sklearn that these modules would be moved in v0.24. I've updated these paths to fix this. They are runnable from poppunk_calculate_rand_indices.py, and I have also updated the manual with this.
These will be in poppunk v2.4.0 also, but to run it now you can just download the updated standalone script from here (https://raw.githubusercontent.com/johnlees/PopPUNK/a541845dc121d2ce70bb9efc898e1a31e7a7030c/scripts/poppunk_calculate_rand_indices.py), which will run from any path as long as you have your conda environment activated.
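For readers patching an older copy of the script by hand, a version-tolerant import can be sketched as below. This is not necessarily how the linked fix does it. It assumes, as far as I know, that newer scikit-learn keeps `check_clusterings` in the private module `sklearn.metrics.cluster._supervised`, while older releases used the `supervised` path shown in the traceback. `import_first` is a hypothetical helper.

```python
import importlib


def import_first(module_paths, attr):
    """Return `attr` from the first module in `module_paths` that provides it."""
    errors = []
    for path in module_paths:
        try:
            module = importlib.import_module(path)
            return getattr(module, attr)
        except (ImportError, AttributeError) as exc:
            errors.append(f"{path}: {exc}")
    raise ImportError(f"could not import {attr!r}; tried: " + "; ".join(errors))


if __name__ == "__main__":
    try:
        check_clusterings = import_first(
            [
                "sklearn.metrics.cluster._supervised",  # assumed newer location
                "sklearn.metrics.cluster.supervised",   # old path from the traceback
            ],
            "check_clusterings",
        )
        print("using", check_clusterings.__module__)
    except ImportError as exc:
        print(exc)
```

Because the newer path is a private module, it may move again, so pinning a scikit-learn version known to work with the installed PopPUNK release is the more robust option.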
|
John,
Looking for more help: I tried running the updated poppunk_calculate_rand_indices.py (calc-rand.py) script and got the following message (below), indicating an incompatibility with the newer version of NumPy. I have NumPy version 1.20.1. I could not fix the error by simply replacing np.asscalar(a) with a.item in the script. I need to learn more about NumPy and sklearn.
➜ rand calc-rand.py
usage: calculate_rand_indices [-h] --input INPUT [--output OUTPUT] [--subset SUBSET]
calculate_rand_indices: error: the following arguments are required: --input
➜ rand calc-rand.py --input emm-ST.csv,refined.csv
/home/bioinfo-4/Desktop/dev-scripts/calc-rand.py:127: DeprecationWarning: np.asscalar(a) is deprecated since NumPy v1.16, use a.item() instead
indices_x.append(np.asscalar(np.where(names_list[input_x].values == name)[0]))
Thanks,
SBB
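A note on the message above: the replacement that NumPy suggests is the method call `a.item()`, with parentheses, applied to the array that was previously passed to `np.asscalar`. The sketch below shows the pattern from line 127 of the script rewritten that way. The helper `index_of` and the sample names are invented for illustration, and this is not necessarily the change made in the upstream fix.

```python
import numpy as np


def index_of(values, name):
    """Return the position of `name` in `values` as a plain Python int."""
    matches = np.where(np.asarray(values) == name)[0]
    if matches.size != 1:
        # .item() only works on one-element arrays, so fail with a clear message.
        raise ValueError(f"expected exactly one match for {name!r}, found {matches.size}")
    # Old form: np.asscalar(matches); current form: matches.item()
    return matches.item()


if __name__ == "__main__":
    names = np.array(["sample_a", "sample_b", "sample_c"])
    print(index_of(names, "sample_b"))
```

The explicit size check matters in practice: a sample name that is missing from, or duplicated in, one of the cluster files would otherwise surface as a less obvious error from `.item()`.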
|
Looks like another deprecation I've missed! I've got a fix here for you to try: I don't have files to hand to test it myself so I hope that it just works, but let me know if there is still an error and I'll get set up to test it properly |
John,
Worked with the first of my data sets. Thanks again.
Best regards,
SBB
From: John Lees ***@***.***>
Reply-To: johnlees/PopPUNK ***@***.***>
Date: Wednesday, March 10, 2021 at 12:05
To: johnlees/PopPUNK ***@***.***>
Cc: Work ***@***.***>, Mention ***@***.***>
Subject: [EXTERNAL] Re: [johnlees/PopPUNK] Accessory distance issue (#135)
Looks like another deprecation I've missed! I've got a fix here for you to try:
https://raw.githubusercontent.com/johnlees/PopPUNK/30cd9c15b2503e090a0a2bf02617e724584dc576/scripts/poppunk_calculate_rand_indices.py
I don't have files to hand to test it myself so I hope that it just works, but let me know if there is still an error and I'll get set up to test it properly
|
I am running PopPUNK on a large set of sequences that essentially lack any accessory gene content. I can complete the first step of generating the DB, but when trying to run an initial model fit, I get numerous warnings, "Accessory outlier at a= ...", and a final output of "Distances failed quality control (change QC options to run anyway)". Can you suggest which QC options should be changed? I have tried adding --ignore-length and --core-only with the same end results. Thanks for the software and any assistance.
Best regards,
Stephen Beres