Initial Support for Consuming GA4GH DRS URIs #11819

Open
jmchilton opened this issue Apr 12, 2021 · 5 comments

Comments

@jmchilton
Member

Some documentation on resolving GA4GH DRS URIs can be found at:

https://ga4gh.github.io/data-repository-service-schemas/preview/develop/docs/#_hostname_based_drs_uris

Both upload.py and data_fetch.py use lib/galaxy/datatypes/sniff.py::stream_url_to_file to resolve URIs/URLs to POSIX files. This functionality is included in galaxy-data and would presumably be used by future job setup code supporting deferred, as-needed datasets (i.e. #10873) to materialize files for jobs. If this method supported DRS URIs I think most things would fall into place naturally from there.

  • Find a way to test DRS (maybe build a container from https://github.com/ga4gh/drs-server or maybe an existing server exists and I just haven't found it).
  • Add DRS resolution to that method.
  • Test (manually or automatically) uploads coming from both upload1 and the data fetch API that use DRS URIs.
  • Make sure such uploads result in dataset sources being tracked (https://github.com/galaxyproject/galaxy/pull/7487/files). We probably should make sure all URIs are tracked this way - including for galaxy file source plugins. This isn't related to DRS specifically but part of a similar thread.
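
As a rough illustration of the "Add DRS resolution to that method" step, here is a minimal sketch of the hostname-based mapping described in the GA4GH documentation linked above. The function names are hypothetical and are not part of Galaxy. The sketch makes no network call: fetching the DrsObject JSON and wiring the result into stream_url_to_file are left out. It also assumes the URI is hostname-based (not a compact identifier) and that the server exposes an access method of type "https" with an inline access_url.

```python
from typing import Optional
from urllib.parse import urlsplit


def drs_uri_to_object_url(drs_uri: str) -> str:
    """Map a hostname-based DRS URI to the HTTPS URL of its DrsObject.

    drs://<hostname>/<id>  ->  https://<hostname>/ga4gh/drs/v1/objects/<id>
    """
    parts = urlsplit(drs_uri)
    if parts.scheme != "drs":
        raise ValueError(f"not a DRS URI: {drs_uri!r}")
    object_id = parts.path.lstrip("/")
    # Compact identifier-based URIs (drs://prefix:accession) have no path
    # component and are deliberately not handled by this sketch.
    if not parts.netloc or not object_id:
        raise ValueError(f"not a hostname-based DRS URI: {drs_uri!r}")
    return f"https://{parts.netloc}/ga4gh/drs/v1/objects/{object_id}"


def pick_https_access_url(drs_object: dict) -> Optional[str]:
    """Return the first inline HTTPS access URL of a DrsObject, if any.

    Access methods that only carry an access_id (which needs a second
    request to the server) are skipped here.
    """
    for method in drs_object.get("access_methods", []):
        access_url = method.get("access_url")
        if method.get("type") == "https" and access_url:
            return access_url["url"]
    return None
```

A caller would fetch the JSON at the URL returned by drs_uri_to_object_url, pass the parsed document to pick_https_access_url, and hand the resulting plain HTTPS URL to the existing download path.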
@luke-c-sargent
Member

Not sure if this is what you meant by "maybe an existing server exists and I just haven't found it", but:

Martha v3 is an aggregator of DRS URI resolver results that we use in the AnVILFS plugin to translate DRS to viable endpoints. It requires a valid Google bearer token to auth against, however.

@hexylena
Member

Just an xref to the work I'm doing for CINECA: we're working on obtaining Elixir AAI refresh tokens + ga4gh_v1_passports within the login system, and then passing those to tools (e.g. the EGA downloader). Maybe that will help with the authn/z portion of accessing private data?

@nuwang
Member

nuwang commented Sep 6, 2022

@hexylena Is there an issue for tracking this? Galaxy Australia is also interested in this functionality, so was wondering how far along you were?

@hexylena
Member

hexylena commented Sep 6, 2022

@nuwang no, no separate issue. We got a proof of concept deployed internally, but it needs changes to the dependencies to support the additional attributes, and I need to convert my cron job to a Celery task.

@nuwang
Member

nuwang commented Sep 6, 2022

Thanks. If you have any PRs/commits etc. handy, would be great to take a look.
