Usher version is not reported with --all-versions or updated with --update #439

Closed
dbtara opened this issue Apr 28, 2022 · 7 comments

@dbtara

dbtara commented Apr 28, 2022

Hi,
pangolin --all-versions does not report the UShER version, nor does pangolin --update update it. The latest patch to UShER would have gone unnoticed by us if we had not been going through issues and interacting with @AngieHinrichs. My concern is that the vast majority of users will miss this update, which will cause more confusion. Can reporting the UShER version and updating UShER be added to the pangolin version/update process? This is imperative, as the default mode uses UShER for calls.

Thanks

@aretchless

It may also be helpful to include the usher version when reporting the assignment cache version, if that would affect those assignments (unless it can be synced with the locally installed version of usher).

@AngieHinrichs
Member

Thanks @dbtara for raising this issue. I agree that we need to make it easier to find the version of usher used by pangolin (and other tools while we're at it: faToVcf, minimap2 and gofasta are also important), and to make it easier for users to a) know when an environment update is necessary and b) update their environments. We were already aware that this general issue with environment updates affected major version updates of pangolin which happen about once a year, but updates of other tools between those times are important too -- and currently invisible to users as you point out.

Adding versions of other tools to the output of pangolin --all-versions is a great idea and should be very easy to do. @aineniamh I will send a PR for that ASAP.
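
As a rough illustration of how tool versions could be collected for that kind of report, here is a minimal Python sketch. It is not pangolin's actual implementation. The `VERSION_COMMANDS` mapping and both function names are hypothetical, and apart from `usher --version` (mentioned in this thread) the version flag for each tool is an assumption to verify against that tool's own help output.

```python
import shutil
import subprocess

# Hypothetical mapping of tool name -> command that prints its version.
# Assumption: each listed tool accepts the flag shown; verify per tool.
VERSION_COMMANDS = {
    "usher": ["usher", "--version"],
    "minimap2": ["minimap2", "--version"],
}


def tool_version(command):
    """Return the first line a version command prints, or None if unavailable."""
    if shutil.which(command[0]) is None:
        return None  # the tool is not on PATH
    try:
        result = subprocess.run(command, capture_output=True, text=True, timeout=30)
    except (OSError, subprocess.TimeoutExpired):
        return None
    # Some tools print their version to stderr rather than stdout.
    output = result.stdout.strip() or result.stderr.strip()
    return output.splitlines()[0] if output else None


def report_versions(commands=VERSION_COMMANDS):
    """Map each tool name to its reported version line, or 'not found'."""
    return {name: tool_version(cmd) or "not found" for name, cmd in commands.items()}


if __name__ == "__main__":
    for name, version in report_versions().items():
        print(f"{name}: {version}")
```

Tools without a version flag would need a different lookup, for example asking the package manager which version is installed.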

Dynamically updating packages that don't come from github cov-lineages/... repositories is a stickier issue because there are so many installations of pangolin on so many platforms. I think some installations can't even do the current cov-lineages-only pangolin --update because of container write permissions issues, although I guess the situation for container users can be addressed by making new containers from fresh installations that have the latest versions of all tools installed.

If most of the non-container installations are in a user-writeable conda environment, then in theory it should be possible for pangolin to run conda update commands as part of pangolin --update. Unfortunately I don't have sufficient conda experience to predict the proportion of installations in which that would work, though I bet it's not 100%, nor to know the pitfalls of that approach.
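
To make the idea concrete, here is a minimal Python sketch of the kind of guard an update step might apply before calling conda. It is an assumption-laden illustration, not what pangolin --update does: the function name `conda_update_command` is hypothetical, and it relies only on the `CONDA_PREFIX` variable that `conda activate` sets and on a write-permission check of that directory.

```python
import os


def conda_update_command(packages, env=os.environ):
    """Return a conda update command as a list, or None if an update looks unsafe."""
    prefix = env.get("CONDA_PREFIX")  # set by `conda activate`
    if not prefix:
        return None  # not running inside an activated conda environment
    if not os.access(prefix, os.W_OK):
        return None  # e.g. a read-only container or a system-wide install
    # Build the command only; the caller decides whether to actually run it.
    return ["conda", "update", "--yes", "--prefix", prefix, *packages]


if __name__ == "__main__":
    command = conda_update_command(["usher"])
    if command is None:
        print("Environment is not an updatable conda environment; skipping.")
    else:
        print("Would run:", " ".join(command))
```

A writable prefix is necessary but not sufficient: pinned packages, solver conflicts, or mamba-only setups could still make the update fail, so a real implementation would need to handle a non-zero exit code as well.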

@kapsakcj @pvanheus @fmaguire your thoughts?

@AngieHinrichs
Member

@aretchless Thanks for the suggestion of making versions explicit for the assignment cache. Since the cache is precomputed at UCSC from my local installation of pangolin (with the version of the UShER lineageTree.pb that is about to be released via cov-lineages/pangolin-data), I don't know of a way to sync the cache version with any other installation of pangolin. But I will add my local version info to the pangolin-assignment release notes going forward - will that help?

@AngieHinrichs
Member

P.S. Regarding the latest release 0.5.4 of UShER: I see it has been merged into bioconda, but at the moment 0.5.3 is the latest version of UShER that I see from conda search usher. So while 0.5.4 is not available from bioconda quite yet, hopefully it will be very soon. (?)

When 0.5.4 becomes available, this should work to update conda installations of pangolin:

conda activate pangolin
conda update usher

@AngieHinrichs
Member

Looks like 0.5.4 is available now:

conda search usher
...
usher                          0.5.4      hf1ae886_0  bioconda            
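
For anyone who wants to script this check, here is a minimal Python sketch that picks the newest version out of text in the column layout shown above. The function names are hypothetical, it assumes purely numeric dotted versions, and the 0.5.3 row in the usage example uses a made-up placeholder build string.

```python
def parse_version(text):
    """Turn '0.5.4' into (0, 5, 4) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def latest_listed_version(search_output, package):
    """Return the highest version listed for `package`, or None if none is found."""
    versions = []
    for line in search_output.splitlines():
        fields = line.split()
        # Rows of interest look like: name, version, build, channel.
        if len(fields) >= 2 and fields[0] == package:
            try:
                versions.append(parse_version(fields[1]))
            except ValueError:
                continue  # skip version strings that are not purely numeric
    if not versions:
        return None
    return ".".join(str(number) for number in max(versions))


if __name__ == "__main__":
    # The build string on the 0.5.3 row is a placeholder, not real data.
    sample = """\
usher   0.5.3   h0000000_0   bioconda
usher   0.5.4   hf1ae886_0   bioconda
"""
    print(latest_listed_version(sample, "usher"))
```

Comparing tuples of integers rather than raw strings matters once a component reaches two digits, since "0.10.0" sorts before "0.9.0" as text.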

@kapsakcj

+1 for adding the output of usher --version to the end of the output of pangolin --all-versions command.

Other dependencies may be useful to include there too, but in my opinion they are less important if they are not updated frequently. For example, minimap2 probably isn't going to change in the future.

If using conda/mamba - users can always run conda list with the environment activated to see all things & versions installed via conda & pip. Perhaps you could add a tip like this into the Pangolin documentation somewhere?

If folks are using containers and do plan on updating pangolin dependencies via pangolin --update in the ephemeral container, then they will most likely have to run as the root user, which is the default in some configs. Totally fine, but in my opinion it defeats the purpose of having a static & reproducible environment. That's why I've gone to great lengths to have a thorough Dockerfile with versions of the important bits pinned and spelled out 😄

@aineniamh
Member

I believe this is resolved with the recent release of pangolin 4.1, although we're working through an issue that this change has introduced. See #467 for progress!
