-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement Dependency Resolver Mapping #3444
Implement Dependency Resolver Mapping #3444
Conversation
This looks excellent! 👍 (also for merging before branching off 17.01)
For this we'd need to ship a template with galaxy, right? I think we could even strip out the cached_dependency stuff ... this should become a corner case where a tool defines both toolshed and conda dependencies that differ in name or version. So if people do run into locking issues, all they (or us if we ship a template) have to do is map the redundant dependency onto the dependency that can be resolved. We just need to do a set on the requirements. |
Yeah - but I think the pieces to do that are here in this file. I'd consider it to be a bug fix to ship this file after the release myself. I can add an empty variant of such a file if you'd like.
Obviously I'm on board for that too but lets keep that out of this PR. I feel like a lot of what happened in the failed #3117 was people stepping up and describing use cases and the desire to implement the mapping stuff (now in this PR) - lets get this in as is and then we can think about use cases driving the specification stuff and make an argument for with the mapping stuff as given.
That is an excellent point. |
495802d
to
039b206
Compare
Rebased so all the links above are invalid. But I've added a variant of #3433 and a default Conda mapping file in 039b206 along with a test case that verifies R is mapped to "r-base" when using the Conda resolver. Update: Actually forget to git add some stuff - maybe a95eebb is a better attempt at the sample and default stuff. |
a95eebb
to
89b920d
Compare
I'm fairly confident that failing API test is unrelated. |
@jmchilton can we also map such things? https://github.com/galaxyproject/galaxy/blob/dev/tools/plotting/bar_chart.xml#L39 |
@bgruening We can just update those to use the actual conda requirements directly because requirements of type |
Fair enough, but then we can remove these deps from the tool and add real conda deps? And you could iterate during startup over all internal tools and install deps? |
…ency resolver. I'm not sure if we need to update the version since previously these tags were purely for documentation and had no functional purpose. xref this comment galaxyproject#3444 (comment) from @bgruening
@bgruening I suppose - the PR that adds the script that iterates over all the tools and creates missing Conda environments is #3430 - no one really clamored to have it in the release so it remains as WIP. If we want to have it in 17.01 - I can rebase it on top of this PR and keep working on polishing it up. It might take another day or so. It takes so long though and certain things are always going to fail - that I think it would be better to just document that you can run the script (perhaps each time there is a failed dependency resolution). This would still be a significant step forward in terms of getting a working Galaxy running quickly. I'm open to working on that - it is up to everyone else how much we delay the release for this stuff. Also it would be nice to have a UI - that just showed all the tools and gave a little install button to run through and do this work. |
I agree with you here. Maybe we can collapse identical requirements, so that it's not attempting X times to install the thing that fails, but even then I think this needs a UI and shouldn't delay branching.
I agree on that as well, but for me this PR is good and ready to go -- I would be in favor of merging. |
@galaxybot test this |
I'm a little bit scared by this large PR, but I'm also in favor of merging it now. |
Merge it then 😉, we can delay the tool installation portion for the next release I'm fine with that. |
Sorry, can you merge in dev, I will merge it asap then! |
Doh - that went very stale, very quickly. I'm working on it. |
89b920d
to
47e39b0
Compare
Okay - I rebased everything on top of dev again. There was some oddness when merging with differences in #3461. I've looked over the resulting diff and ran the tool dependency related tests though and things seem fine. |
return ToolRequirement( name=name, type=type, version=version, specs=specs ) | ||
|
||
def __eq__(self, other): | ||
return self.name == other.name and self.type == other.type and self.version == other.version |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is only correct if we assume that specs are the same if name type and version are the same, otherwise we also need to check that the specs are equal (same for __hash__
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also it looks like something went wrong (duplication of lines 75-82 41-48?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah - I messed something up - I had hand copied something are the rebase from around here - let me take a look.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So I figured out what happened - it took these lines and added them to RequirementSpecification (or put the class definition inside of ToolRequirement) and I didn't notice. Then the tests were failing so I added the eq, hash, etc... lines back to ToolRequirement not noticing that RequirementSpecification had the wrong lines.
I've pushed a commit to fix this I think and I think I also handled specs correctly in the eq and hash calculations - but let me know if you think they could be improved or need to be fixed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, looking good. I'm gonna merge once the tests pass.
This adapts CWL-style ``SoftwareRequirement`` ``specs`` to solve galaxyproject#1927. Here I'm trying to implement the CWL specification in a way that helps enable the feasibility of Conda packaging in Galaxy. It is a delicate balancing act aimed to upset as many interested parties as I can. To understand the problem - consider the ``blast+`` requirement found in the Galaxy wrappers. It looks something like this: ``` <requirement type="package" version="2.2.31" name="blast+"> ``` Great, that works for Galaxy and Tool Shed packages. It doesn't work for bioconda at all. I think this problem is fairly uncommon - most packages have simple names shared across Debian, Brew, Conda, etc... - but it does happen in some cases that there are inconsistencies. Some people have taken to duplicating the requirement - this is bad and should not be done since they are mutually exclusive and Galaxy will attempt to resolve both. This introduces the following syntax for tools with profile >= 16.10: ``` <requirement type="package" version="2.2.31" name="blast+"> <specification uri="https://anaconda.org/bioconda/blast" /> <specification uri="https://packages.debian.org/sid/ncbi-blast+" version="2.2.31-3" /> </requirement> ``` This allows finer grain resolution of the requirement without sacrificing the abstract name at the top. It allows the name and the version to be adapted by resolvers as needed (hopefully rarely so). This syntax is the future facing one, but obviously this tool would not work on older Galaxy versions. To remedy this - an alternative syntax can be used for tools targetting Galaxy verions pre-16.10: ``` <requirement type="package" version="2.2" specification_uris="https://anaconda.org/bioconda/[email protected],https://packages.debian.org/jessie/[email protected]">blast+</requirement> ``` This syntax sucks - but it does give newer Galaxies the ability to resolve the specifications without breaking the more simple functionality for older Galaxies. For more information on the CWL side of this - checkout the discussion on common-workflow-language/cwltool#214. The CWL specification information is defined at http://www.commonwl.org/v1.0/CommandLineTool.html#SoftwarePackage. Conflicts: lib/galaxy/tools/deps/__init__.py test/functional/tools/samples_tool_conf.xml
Per discussion on galaxyproject#3117 - some people were in favor of the idea but some were opposed. I think it is still important to implement this for CWL tools and useful from a library perspective. This just removes changes to Galaxy tool XML.
The handling of global configuration options for Conda specific options was hacky and a poor design. This is much more modular and allows the elegant cascading option sources for all resolvers.
This follows the existing pattern within the module.
If resolve_all takes in requirements, then so should resolve.
…laxy. - Construct a DependencyManager according to Galaxy's configuration before Galaxy starts to allow auto installation of Conda. - Add script for managing tool dependencies outside of a Galaxy environment. - Refactor to allow this to be done without duplicating logic related to defaults, configuration handling, etc.... Implements galaxyproject#3426.
Everyone gets a Conda when Galaxy starts up.
- Map R to r-base and blast+ to blast for conda dependencies. - Add sample environment modules file showing off the new mapping feature. - Add integration test case to verify the default conda mapping (R -> r-base in particular).
This became the default behavior with conda_auto_init defaulting to True.
47e39b0
to
3e3d41f
Compare
Alright, thanks a lot @jmchilton, I think this is a big step ahead! |
@@ -22,10 +25,11 @@ class DependencyResolver(Dictifiable, object): | |||
# resolution. | |||
disabled = False | |||
resolves_simple_dependencies = True | |||
config_options = {} | |||
__metaclass__ = ABCMeta |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI this is not Python3-safe, the @six.add_metaclass(ABCMeta) decorator should be used instead. There are a few more instances in this file and elsewhere in lib/galaxy/
, I will take care of fixing them all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done in #3491.
@@ -98,6 +103,11 @@ if [ -z "$GALAXY_CONFIG_FILE" ]; then | |||
export GALAXY_CONFIG_FILE | |||
fi | |||
|
|||
if [ $INITIALIZE_TOOL_DEPENDENCIES -eq 1 ]; then | |||
# Install Conda environment if needed. | |||
python ./scripts/manage_tool_dependencies.py -c "$GALAXY_CONFIG_FILE" init_if_needed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jmchilton @natefoo Should this be run as part of ./scripts/common_startup.sh
(called at line 66) instead of only when Galaxy is started through run.sh
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really don't think so - this is of no use to the reports app or the tool shed for instance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right, common_startup.sh
is not the right place for the reason you mentioned. But I was concerned about servers started through e.g. supervisor by invoking directly uwsgi
or ./scripts/paster.py
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They will need to setup Conda - and it would be best to do that manually before hand but Galaxy will attempt to do it on its own also I think (with the multi-process problems that entails). I guess this becomes a documentation problem in my mind - at least until we get to the point where we have a GUI for managing this stuff (doing it from the GUI when an admin logs in would be ideal).
resolve_all
took inToolRequirement
objects whereasresolve
took in name, version, and type specifically. Mapping and specification handling is a lot cleaner the interface more cohesive when both methods just consumeToolRequirement
s - so this change is made in 6de74a7.This is labelled as 17.05 but if anyone wants to merge it before #3433 is merged and we branch - I wouldn't complain 😉. It will give us greater flexibility to adapt old tools to new Conda packages.