Make it possible to Data API act as a htsget proxy #51

Closed
juhtornr opened this issue Oct 5, 2018 · 3 comments

Comments

@juhtornr
Collaborator

juhtornr commented Oct 5, 2018

We have a use case in the Tryggve project where a user wants to stream data from multiple htsget endpoints but, for security reasons, does not have network access to all of them (they are not visible on the public network). The Data API servers, however, can access each other's endpoints.

Therefore we need a proxy service that fetches the data from another endpoint, caches it, and serves the content to the original requestor. So when a user makes the following query to Data API:

GET https://data-api.csc.fi/another-endpoint/data/reads

Data API should then fetch the data from another endpoint:

GET https://another-endpoint.se/data/reads

And serve the content to the original requestor.
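
The URL rewrite above could be sketched roughly as follows. This is only an illustration, assuming the first path segment names the endpoint and that a lookup table exists; `ENDPOINTS` and `resolve_upstream` are hypothetical names, not part of Data API.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical mapping from endpoint name to base URL; in practice this
# would come from a database or from configuration.
ENDPOINTS = {"another-endpoint": "https://another-endpoint.se"}


def resolve_upstream(request_url: str) -> str:
    """Translate a Data API proxy URL into the URL of the remote endpoint."""
    parts = urlsplit(request_url)
    # The first path segment names the endpoint; the rest is passed on unchanged.
    name, _, rest = parts.path.lstrip("/").partition("/")
    base = urlsplit(ENDPOINTS[name])  # raises KeyError for an unknown endpoint
    return urlunsplit((base.scheme, base.netloc, "/" + rest, parts.query, ""))
```

With this table, `resolve_upstream("https://data-api.csc.fi/another-endpoint/data/reads")` yields `https://another-endpoint.se/data/reads`, and any query string is carried over as-is.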

To make this work, endpoint names and URLs need to be mapped in a database or in configuration. In Data API the flow would be the following:

  1. Data API receives the query
  2. Data API does a database/configuration lookup to determine the URL that has to be queried
  3. Data API checks whether it already has the data cached
  4. a) If the data is not cached, Data API queries the other endpoint and returns the data
     b) If the data is cached, Data API returns it from the cache
  5. Data API caches the data if it was not already cached
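
As a rough sketch of this flow (not an implementation proposal), the lookup, cache check, fetch and store steps might look like the following. `CachingProxy` and its arguments are hypothetical names; the upstream request is injected as a callable so the logic can be exercised without network access, and the cache is a plain in-memory dict with no expiry or size limit.

```python
from typing import Callable, Dict


class CachingProxy:
    """Sketch of the lookup / cache-check / fetch / store flow."""

    def __init__(self, endpoints: Dict[str, str], fetch: Callable[[str], bytes]):
        self.endpoints = endpoints  # endpoint name -> base URL
        self.fetch = fetch          # performs the actual upstream request
        self.cache: Dict[str, bytes] = {}

    def handle(self, endpoint: str, path: str) -> bytes:
        # Look up the URL that has to be queried.
        url = self.endpoints[endpoint].rstrip("/") + "/" + path.lstrip("/")
        # Cache hit: serve the stored content without contacting the endpoint.
        if url in self.cache:
            return self.cache[url]
        # Cache miss: query the other endpoint, store the result, return it.
        data = self.fetch(url)
        self.cache[url] = data
        return data
```

A real `fetch` could be as simple as `lambda url: urllib.request.urlopen(url).read()`, although holding whole responses in memory is unlikely to suit large read files.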

Most likely the Zuul service is the right component in which to implement this feature.

@juhtornr
Collaborator Author

juhtornr commented Oct 5, 2018

@AlexanderSenf @blankdots @dtitov Comments about this? :)

@juhtornr changed the title from "Make it possible to use Data API as htsget proxy" to "Make it possible to Data API act as a htsget proxy" on Oct 5, 2018

@AlexanderSenf
Contributor

My two thoughts:

  1. You could run a simple proxy/redirect endpoint next to Zuul. That way you would enable access to the various htsget endpoints that are not directly accessible to the user, and no software changes would be required to the htsget client or server. I have done a setup like this before; it should be quite easy. Maybe Zuul itself could actually be set up for that.
  2. htsget could be modified to allow proxy setups, although that should then be added to the htsget specs, if it works. (htsget networks..?)

The caching solution might also work, although I think I would prefer simply forwarding the request. But I am open to any solution. Zuul itself should not be doing too much "work", however: the htsget functionality should be a step behind, so in DataEdge or in a new htsget proxy microservice.

@juhtornr
Collaborator Author

We ended up using proposed solution number 1. The required configuration for Zuul is:

zuul.routes.sweden.path=/sweden/elixir/data/**
zuul.routes.sweden.url=http://<another Data API url>/elixir/data/
zuul.routes.sweden.sensitive-headers=

With that, the following query is forwarded to the other Data API:

curl -vvvv -H "Authorization: Bearer $TOKEN" http://<original Data API url>/elixir/sweden/elixir/data/files/EGAF01?destinationFormat=plain\&startCoordinate=0\&endCoordinate=100

No code changes were needed, but of course this is just a simple proxy without caching.
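
For readers unfamiliar with Zuul routes, here is a small Python model of what I understand the configuration above to do. It is a sketch under assumptions, not Zuul's actual code: it assumes the default strip-prefix behaviour (the part of the path before `**` is removed before forwarding), it ignores any leading context path such as the `/elixir` in the curl example, and the host name is a placeholder. The empty `sensitive-headers` value matters because Zuul by default withholds `Cookie`, `Set-Cookie` and `Authorization` from the downstream service, which would drop the bearer token.

```python
from typing import Dict, Optional, Set

# Values taken from the route configuration; the host is a placeholder.
ROUTE_PATH = "/sweden/elixir/data/**"
ROUTE_URL = "http://other-data-api.example/elixir/data/"
# Headers Zuul withholds from the downstream service by default.
DEFAULT_SENSITIVE = {"cookie", "set-cookie", "authorization"}


def forward_url(request_path: str) -> Optional[str]:
    """Return the downstream URL for a matching path, else None."""
    prefix = ROUTE_PATH[: -len("**")]  # "/sweden/elixir/data/"
    if not request_path.startswith(prefix):
        return None
    # Assumed strip-prefix behaviour: drop the matched prefix and append
    # the remainder to the route URL.
    return ROUTE_URL.rstrip("/") + "/" + request_path[len(prefix):]


def forwarded_headers(headers: Dict[str, str],
                      sensitive: Set[str] = DEFAULT_SENSITIVE) -> Dict[str, str]:
    """Drop headers whose name (case-insensitive) is in the sensitive set."""
    return {k: v for k, v in headers.items() if k.lower() not in sensitive}
```

Passing `sensitive=set()` mirrors the empty `sensitive-headers=` line: the `Authorization` header then reaches the other Data API unchanged.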
