no Http Proxy for Connector Download and no authentication method #2787

Closed
shalberd opened this issue Jun 21, 2022 · 5 comments

@shalberd
Contributor

shalberd commented Jun 21, 2022

https://medium.com/ibm-data-ai/getting-started-with-apache-airflow-operators-in-elyra-aae882f80c4a

On the face of it, the component catalog feature is great, though I do not understand why common Airflow and Kubeflow Pipelines components are not included by default in, for example, the Red Hat OperatorHub Elyra image.

opendatahub-io/odh-manifests#546

In most enterprise environments, including those running OpenShift, cluster-level HTTP and HTTPS proxies are often involved.

https://docs.openshift.com/container-platform/4.8/networking/enable-cluster-wide-proxy.html

I see no way to fetch wheel files for the Apache Airflow package catalog via a download URL when a proxy is involved.
For the GitLab plugin, I was able to do it via the command line.

@akchinSTC
Member

Sounds like a short-term solution would be to include the most common AF and KF components in the image OR to pre-download the packages and upload them to a location on the whitelist. Long term, we would build in the functionality required to work with the OpenShift cluster-level proxies, e.g. username and password.

I haven't much experience yet with the cluster-level proxy functionality in OpenShift. Looking at the link provided, would another short-term solution be to add a rule to the proxy that allows bypass for the required URLs (e.g. files.pythonhosted.org)? Perhaps some bigger companies have an internal mirror for packages and can use that?

@shalberd
Contributor Author

shalberd commented Jun 22, 2022

  • " like a short term solution would be to include the most common AF and KF components in with the image"

Yes, that would be great in my opinion. I am not sure how the people working on the ODH project and its image updates coordinate with you, as in opendatahub-io/odh-manifests#546, where a new Docker image was integrated into ODH.

  • "long term, the functionality required to work with the openshift cluster level proxies e.g. username and password"

Yes, for example environment variables for http_proxy and https_proxy that one can pass into the container.
For example, in the Jupyter GitHub/GitLab plugin, I can make it work with an enterprise proxy by setting git config --global http.proxy http://myproxy:port, and it asks for an API key when cloning a repo.

  • " to tell the proxy to add a rule to allow bypass to the required urls (e.g. files.pythonhosted.org)?"

Not an option in our case. We work only with Docker images in enterprise-internal repositories or with enterprise-internal package repositories like Artifactory. Those internal domains are then included in the noproxy section of the OpenShift cluster config.

  • " perhaps some bigger companies may have an internal mirror for packages and can use that?"

That is what I did now. I uploaded the wheel file in question to a repository in our internal Artifactory. However, I believe I am not the only one facing this issue: Chief Information Security Officers insist on some sort of authentication and non-anonymous access, either via Bearer Token or Basic Auth. https://www.jfrog.com/confluence/display/JFROG/Artifactory+REST+API

@kiersten-stokes Is it possible to include Bearer Token and/or Basic Auth functionality to the airflow package catalog connector at

?

In your article
https://medium.com/ibm-data-ai/getting-started-with-apache-airflow-operators-in-elyra-aae882f80c4a

you mention in the section "Airflow Package Catalog Connector"

"Lastly, you’ll need to configure the Airflow package download URL. The URL must meet a few constraints:

  • it must point to a built distribution (wheel) file
  • it must reference a location that Elyra can access using an HTTP GET request without the need to authenticate"

In some sensitive enterprise environments, the second constraint (no need to authenticate) is not feasible, even when using an internal package repository like Artifactory.
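To illustrate the request: a minimal sketch of an HTTP GET for a wheel file that optionally carries Basic Auth or Bearer Token credentials. This is not Elyra's connector code. The function name `build_download_request`, its parameters, and the Artifactory URL are hypothetical and only use the Python standard library.

```python
import base64
import urllib.request


def build_download_request(url, user=None, password=None, token=None):
    """Build a GET request for a wheel file, optionally with credentials.

    Hypothetical helper for illustration only: a bearer token takes
    precedence; otherwise user/password are sent as HTTP Basic Auth.
    """
    request = urllib.request.Request(url, method="GET")
    if token:
        request.add_header("Authorization", f"Bearer {token}")
    elif user is not None and password is not None:
        raw = f"{user}:{password}".encode("utf-8")
        encoded = base64.b64encode(raw).decode("ascii")
        request.add_header("Authorization", f"Basic {encoded}")
    return request


# Placeholder URL; an internal repository would go here.
example = build_download_request(
    "https://artifactory.example.com/pypi/some_package-1.0-py3-none-any.whl",
    user="u",
    password="p",
)
```

Passing the returned object to `urllib.request.urlopen` would perform the actual download. With no credentials given, the request is sent anonymously, which matches the current behavior described in the article.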

@shalberd shalberd changed the title from "no Http Proxy for Connector Download" to "no Http Proxy for Connector Download and no authentication method" Jun 22, 2022
@ptitzler
Member

  • " like a short term solution would be to include the most common AF and KF components in with the image"

I don't believe we should do this for the following reasons:

  • We can enable connectors that download resources from the web to optionally accept credentials - in theory these should be minor changes.
  • If we pre-package resources (such as components) in container images, it's not trivial to remove them in subsequent releases, as users might rely on them being present by default. We've encountered this with the system-owned runtime images, which we have deprecated and can't remove until the next major release (4.0) because it is considered a breaking change.
  • In general we are trying to include less in our base images to keep them more flexible. Users can always customize images to meet their specific requirements.

@kiersten-stokes Is it possible to include Bearer Token and/or Basic Auth functionality to the airflow package catalog connector at
...
in some sensitive enterprise environments, that is not feasible (no need to authenticate), even if using an internal package repository like Artifactory.

Understood. Adding optional basic authentication should not be a problem.

@ptitzler ptitzler added kind:enhancement New feature or request and removed status:Needs Triage labels Jun 22, 2022
@shalberd
Contributor Author

@ptitzler
Thank you for taking on the authentication enhancement.

@akchinSTC
Regarding optional proxy handling: I found out the following for openshift: "The cluster-wide proxy configuration cascades to OpenShift-managed resources only. Proxy configuration for user workloads is handled as part of application management."

source: https://access.redhat.com/solutions/5251461

So, in any case, best practice would be to have optional environment variables HTTP_PROXY and HTTPS_PROXY that can be passed to the env section of the deployment, in our case via e.g. ConfigMaps on OpenShift.
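As a rough sketch of what honoring those variables could look like on the Python side (not Elyra's implementation; the helper names `proxies_from_env` and `opener_from_env` are hypothetical), the proxy settings can be read from the environment and handed to a standard-library opener:

```python
import os
import urllib.request


def proxies_from_env(env=None):
    """Collect proxy URLs from HTTP_PROXY / HTTPS_PROXY style variables.

    Hypothetical helper: checks the upper-case name first, then the
    lower-case one, and skips schemes that have no value set.
    """
    env = os.environ if env is None else env
    proxies = {}
    for scheme in ("http", "https"):
        value = env.get(f"{scheme.upper()}_PROXY") or env.get(f"{scheme}_proxy")
        if value:
            proxies[scheme] = value
    return proxies


def opener_from_env(env=None):
    """Build an opener that routes requests through the configured proxies."""
    handler = urllib.request.ProxyHandler(proxies_from_env(env))
    return urllib.request.build_opener(handler)


# Example with an explicit mapping instead of the real process environment.
settings = proxies_from_env({"HTTP_PROXY": "http://myproxy:3128"})
```

If neither variable is set, the mapping is empty and the opener connects directly, so the variables stay optional as proposed above.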

On your side, I believe the relevant section is here

https://github.com/akchinSTC/odh-manifests/blob/v3.6.0/jupyterhub/notebook-images/overlays/build/elyra-notebook-buildconfig.yaml

or, in the past, in ODH:

opendatahub-io/odh-manifests#546

@shalberd
Contributor Author

shalberd commented Sep 2, 2022

Basic auth support is now there and working, so I am closing this issue. Private PKI CA-bundle trust is handled in #2787. HTTP proxy support will be handled later in order not to overload this issue.

@shalberd shalberd closed this as completed Sep 2, 2022