
Pipelines upstream connect error or disconnect/reset before headers. reset reason: connection failure #4469

Closed
luigicr1965 opened this issue Sep 6, 2020 · 17 comments
Labels: kind/bug, lifecycle/stale

@luigicr1965

/kind bug

I have deployed Kubeflow on Azure using kfctl_k8s_istio.v1.1.0.yaml from master, after modifying the repos section:

repos:
- name: manifests
  uri: https://github.com/kubeflow/manifests/archive/master.tar.gz
  # uri: https://github.com/kubeflow/manifests/archive/v1.1-branch.tar.gz
  version: master
  # version: v1.1-branch
I used port forwarding as follows:

kubectl port-forward svc/istio-ingressgateway -n istio-system 8080:80

Everything appears to work from the UI except Pipelines: when I click the Pipelines entry in the left sidebar, I get the following error:

upstream connect error or disconnect/reset before headers. reset reason: connection failure

I expect the Pipelines UI to be shown.

Environment:

- Kubeflow config: kfctl_k8s_istio.v1.1.0.yaml (as stated above)
- kfctl version: kfctl v1.1.0-0-g9a3621e
- Kubernetes platform: Azure AKS
- Kubernetes version: v1.16.13
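For anyone triaging this: the "upstream connect error ... reset reason: connection failure" text is produced by Envoy (the gateway or a sidecar) when it cannot reach a backend, so a first step is to confirm the pipeline backends are running and to check the gateway's logs. A minimal sketch, assuming a default install in the kubeflow namespace (the app labels below are assumptions, not taken from this report):

# Check that the Kubeflow Pipelines backends are up
# (app=ml-pipeline / app=ml-pipeline-ui labels assumed from a default install)
kubectl get pods -n kubeflow -l app=ml-pipeline
kubectl get pods -n kubeflow -l app=ml-pipeline-ui

# Tail the ingress gateway's Envoy logs to see which upstream is failing
kubectl logs -n istio-system deployment/istio-ingressgateway --tail=100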

@amybachir

I have the same issue with Kubeflow 1.1. The readiness probe is failing on the ml-pipeline-ui pod in the user namespace.
Here are the pod details and container logs:

(base) root@US-BachirAm-1:~# kubectl get po -n amy-bachir
NAME                                               READY   STATUS    RESTARTS   AGE
ml-pipeline-ui-artifact-bd978746-c47ws             2/2     Running   0          6m58s
ml-pipeline-visualizationserver-865c7865bc-gqhss   2/2     Running   0          46m
(base) root@US-BachirAm-1:~# kubectl describe po ml-pipeline-ui-artifact-bd978746-c47ws -n amy-bachir
Name:         ml-pipeline-ui-artifact-bd978746-c47ws
Namespace:    amy-bachir
Priority:     0
Node:         aks-nodepool1-78398931-vmss000000/10.240.0.4
Start Time:   Wed, 09 Sep 2020 14:27:17 -0400
Labels:       app=ml-pipeline-ui-artifact
              pod-template-hash=bd978746
Annotations:  sidecar.istio.io/status:
                {"version":"2696c96840179bdcd8f86ed60a643396f0cb250f56458b46a7fc667e6c75ef7f","initContainers":["istio-init"],"containers":["istio-proxy"]...
Status:       Running
IP:           10.244.0.79
IPs:
  IP:           10.244.0.79
Controlled By:  ReplicaSet/ml-pipeline-ui-artifact-bd978746
Init Containers:
  istio-init:
    Container ID:  docker://5edd4af0624814024e8885acb873cd8736089b8958235a40721f91f82bb00375
    Image:         gcr.io/istio-release/proxy_init:release-1.3-latest-daily
    Image ID:      docker-pullable://gcr.io/istio-release/proxy_init@sha256:5c4489204016425da102d18ec31111d2fc56c787575f55582c7c5fcf69fd09df
    Port:          <none>
    Host Port:     <none>
    Args:
      -p
      15001
      -z
      15006
      -u
      1337
      -m
      REDIRECT
      -i
      *
      -x

      -b
      *
      -d
      15020
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Wed, 09 Sep 2020 14:27:19 -0400
      Finished:     Wed, 09 Sep 2020 14:27:20 -0400
    Ready:          True
    Restart Count:  0
    Limits:
      cpu:     100m
      memory:  50Mi
    Requests:
      cpu:        10m
      memory:     10Mi
    Environment:  <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-editor-token-b99bs (ro)
Containers:
  ml-pipeline-ui-artifact:
    Container ID:   docker://3ed453c801b8be439708835520dcb10cee6ddac90ea2111b698395a0836cb9ed
    Image:          gcr.io/ml-pipeline/frontend:1.0.0
    Image ID:       docker-pullable://gcr.io/ml-pipeline/frontend@sha256:b02ec2d2bc916b7cd1b18eb9b18d4c7c727ac3a8675b73d913ec929448958f2a
    Port:           3000/TCP
    Host Port:      0/TCP
    State:          Running
      Started:      Wed, 09 Sep 2020 14:27:21 -0400
    Ready:          True
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-editor-token-b99bs (ro)
  istio-proxy:
    Container ID:  docker://3c1078cb7827eeac8955b1a623f5f45fed92044a7dd6199c906a4fb51ef83c78
    Image:         gcr.io/istio-release/proxyv2:release-1.3-latest-daily
    Image ID:      docker-pullable://gcr.io/istio-release/proxyv2@sha256:de1bcf19d8d709dc6c66aab1d2b9f13f27626b86ab0690a015fb4c72b062eb1d
    Port:          15090/TCP
    Host Port:     0/TCP
    Args:
      proxy
      sidecar
      --domain
      $(POD_NAMESPACE).svc.cluster.local
      --configPath
      /etc/istio/proxy
      --binaryPath
      /usr/local/bin/envoy
      --serviceCluster
      ml-pipeline-ui-artifact.$(POD_NAMESPACE)
      --drainDuration
      45s
      --parentShutdownDuration
      1m0s
      --discoveryAddress
      istio-pilot.istio-system:15011
      --zipkinAddress
      zipkin.istio-system:9411
      --dnsRefreshRate
      300s
      --connectTimeout
      10s
      --proxyAdminPort
      15000
      --concurrency
      2
      --controlPlaneAuthPolicy
      MUTUAL_TLS
      --statusPort
      15020
      --applicationPorts
      3000
    State:          Running
      Started:      Wed, 09 Sep 2020 14:27:21 -0400
    Ready:          True
    Restart Count:  0
    Limits:
      cpu:     2
      memory:  1Gi
    Requests:
      cpu:      100m
      memory:   128Mi
    Readiness:  http-get http://:15020/healthz/ready delay=1s timeout=1s period=2s #success=1 #failure=30
    Environment:
      POD_NAME:                          ml-pipeline-ui-artifact-bd978746-c47ws (v1:metadata.name)
      ISTIO_META_POD_PORTS:              [
                                             {"containerPort":3000,"protocol":"TCP"}
                                         ]
      ISTIO_META_CLUSTER_ID:             Kubernetes
      POD_NAMESPACE:                     amy-bachir (v1:metadata.namespace)
      INSTANCE_IP:                        (v1:status.podIP)
      SERVICE_ACCOUNT:                    (v1:spec.serviceAccountName)
      ISTIO_META_POD_NAME:               ml-pipeline-ui-artifact-bd978746-c47ws (v1:metadata.name)
      ISTIO_META_CONFIG_NAMESPACE:       amy-bachir (v1:metadata.namespace)
      SDS_ENABLED:                       true
      ISTIO_META_INTERCEPTION_MODE:      REDIRECT
      ISTIO_META_INCLUDE_INBOUND_PORTS:  3000
      ISTIO_METAJSON_LABELS:             {"app":"ml-pipeline-ui-artifact","pod-template-hash":"bd978746"}

      ISTIO_META_WORKLOAD_NAME:          ml-pipeline-ui-artifact
      ISTIO_META_OWNER:                  kubernetes://api/apps/v1/namespaces/amy-bachir/deployments/ml-pipeline-ui-artifact
    Mounts:
      /etc/istio/proxy from istio-envoy (rw)
      /var/run/sds from sds-uds-path (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from default-editor-token-b99bs (ro)
      /var/run/secrets/tokens from istio-token (rw)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  default-editor-token-b99bs:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-editor-token-b99bs
    Optional:    false
  istio-envoy:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     Memory
    SizeLimit:  <unset>
  sds-uds-path:
    Type:          HostPath (bare host directory volume)
    Path:          /var/run/sds
    HostPathType:
  istio-token:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  43200
QoS Class:                   Burstable
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute for 300s
                             node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason     Age    From                                        Message
  ----     ------     ----   ----                                        -------
  Normal   Scheduled  7m11s  default-scheduler                           Successfully assigned amy-bachir/ml-pipeline-ui-artifact-bd978746-c47ws to aks-nodepool1-78398931-vmss000000
  Normal   Pulled     7m10s  kubelet, aks-nodepool1-78398931-vmss000000  Container image "gcr.io/istio-release/proxy_init:release-1.3-latest-daily" already present on machine
  Normal   Created    7m10s  kubelet, aks-nodepool1-78398931-vmss000000  Created container istio-init
  Normal   Started    7m9s   kubelet, aks-nodepool1-78398931-vmss000000  Started container istio-init
  Normal   Pulled     7m8s   kubelet, aks-nodepool1-78398931-vmss000000  Container image "gcr.io/ml-pipeline/frontend:1.0.0" already present on machine
  Normal   Created    7m8s   kubelet, aks-nodepool1-78398931-vmss000000  Created container ml-pipeline-ui-artifact
  Normal   Started    7m7s   kubelet, aks-nodepool1-78398931-vmss000000  Started container ml-pipeline-ui-artifact
  Normal   Pulled     7m7s   kubelet, aks-nodepool1-78398931-vmss000000  Container image "gcr.io/istio-release/proxyv2:release-1.3-latest-daily" already present on machine
  Normal   Created    7m7s   kubelet, aks-nodepool1-78398931-vmss000000  Created container istio-proxy
  Normal   Started    7m7s   kubelet, aks-nodepool1-78398931-vmss000000  Started container istio-proxy
  Warning  Unhealthy  7m6s   kubelet, aks-nodepool1-78398931-vmss000000  Readiness probe failed: HTTP probe failed with statuscode: 503
(base) root@US-BachirAm-1:~# kubectl logs ml-pipeline-ui-artifact-bd978746-c47ws -n amy-bachir
error: a container name must be specified for pod ml-pipeline-ui-artifact-bd978746-c47ws, choose one of: [ml-pipeline-ui-artifact istio-proxy] or one of the init containers: [istio-init]
(base) root@US-BachirAm-1:~# kubectl logs ml-pipeline-ui-artifact-bd978746-c47ws -n amy-bachir -c istio-init
Environment:
------------
ENVOY_PORT=
INBOUND_CAPTURE_PORT=
ISTIO_INBOUND_INTERCEPTION_MODE=
ISTIO_INBOUND_TPROXY_MARK=
ISTIO_INBOUND_TPROXY_ROUTE_TABLE=
ISTIO_INBOUND_PORTS=
ISTIO_LOCAL_EXCLUDE_PORTS=
ISTIO_SERVICE_CIDR=
ISTIO_SERVICE_EXCLUDE_CIDR=

Variables:
----------
PROXY_PORT=15001
PROXY_INBOUND_CAPTURE_PORT=15006
PROXY_UID=1337
INBOUND_INTERCEPTION_MODE=REDIRECT
INBOUND_TPROXY_MARK=1337
INBOUND_TPROXY_ROUTE_TABLE=133
INBOUND_PORTS_INCLUDE=*
INBOUND_PORTS_EXCLUDE=15020
OUTBOUND_IP_RANGES_INCLUDE=*
OUTBOUND_IP_RANGES_EXCLUDE=
OUTBOUND_PORTS_EXCLUDE=
KUBEVIRT_INTERFACES=
ENABLE_INBOUND_IPV6=

+ iptables -t nat -N ISTIO_REDIRECT
+ iptables -t nat -A ISTIO_REDIRECT -p tcp -j REDIRECT --to-port 15001
+ iptables -t nat -N ISTIO_IN_REDIRECT
+ '[' '*' == '*' ']'
+ iptables -t nat -A ISTIO_IN_REDIRECT -p tcp -j REDIRECT --to-port 15006
+ '[' -n '*' ']'
+ '[' REDIRECT = TPROXY ']'
+ table=nat
+ iptables -t nat -N ISTIO_INBOUND
+ iptables -t nat -A PREROUTING -p tcp -j ISTIO_INBOUND
+ '[' '*' == '*' ']'
+ iptables -t nat -A ISTIO_INBOUND -p tcp --dport 22 -j RETURN
+ '[' -n 15020 ']'
+ for port in '${INBOUND_PORTS_EXCLUDE}'
+ iptables -t nat -A ISTIO_INBOUND -p tcp --dport 15020 -j RETURN
+ '[' REDIRECT = TPROXY ']'
+ iptables -t nat -A ISTIO_INBOUND -p tcp -j ISTIO_IN_REDIRECT
+ iptables -t nat -N ISTIO_OUTPUT
+ iptables -t nat -A OUTPUT -p tcp -j ISTIO_OUTPUT
+ '[' -n '' ']'
+ iptables -t nat -A ISTIO_OUTPUT -o lo -s 127.0.0.6/32 -j RETURN
+ '[' -z '' ']'
+ iptables -t nat -A ISTIO_OUTPUT -o lo '!' -d 127.0.0.1/32 -j ISTIO_IN_REDIRECT
+ for uid in '${PROXY_UID}'
+ iptables -t nat -A ISTIO_OUTPUT -m owner --uid-owner 1337 -j RETURN
+ for gid in '${PROXY_GID}'
+ iptables -t nat -A ISTIO_OUTPUT -m owner --gid-owner 1337 -j RETURN
+ iptables -t nat -A ISTIO_OUTPUT -d 127.0.0.1/32 -j RETURN
+ '[' 0 -gt 0 ']'
+ '[' 1 -gt 0 ']'
+ '[' '*' == '*' ']'
+ iptables -t nat -A ISTIO_OUTPUT -j ISTIO_REDIRECT
+ set +o nounset
+ '[' -n '' ']'
+ ip6tables -F INPUT
ip6tables v1.6.0: can't initialize ip6tables table `filter': Table does not exist (do you need to insmod?)
Perhaps ip6tables or your kernel needs to be upgraded.
+ true
+ ip6tables -A INPUT -m state --state ESTABLISHED -j ACCEPT
ip6tables v1.6.0: can't initialize ip6tables table `filter': Table does not exist (do you need to insmod?)
Perhaps ip6tables or your kernel needs to be upgraded.
+ true
+ ip6tables -A INPUT -i lo -d ::1 -j ACCEPT
ip6tables v1.6.0: can't initialize ip6tables table `filter': Table does not exist (do you need to insmod?)
Perhaps ip6tables or your kernel needs to be upgraded.
+ true
+ ip6tables -A INPUT -j REJECT
ip6tables v1.6.0: can't initialize ip6tables table `filter': Table does not exist (do you need to insmod?)
Perhaps ip6tables or your kernel needs to be upgraded.
+ true
+ dump
+ iptables-save
# Generated by iptables-save v1.6.0 on Wed Sep  9 18:27:19 2020
*mangle
:PREROUTING ACCEPT [0:0]
:INPUT ACCEPT [0:0]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
:POSTROUTING ACCEPT [0:0]
COMMIT
# Completed on Wed Sep  9 18:27:19 2020
# Generated by iptables-save v1.6.0 on Wed Sep  9 18:27:19 2020
*nat
:PREROUTING ACCEPT [0:0]
:INPUT ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
:POSTROUTING ACCEPT [0:0]
:ISTIO_INBOUND - [0:0]
:ISTIO_IN_REDIRECT - [0:0]
:ISTIO_OUTPUT - [0:0]
:ISTIO_REDIRECT - [0:0]
-A PREROUTING -p tcp -j ISTIO_INBOUND
-A OUTPUT -p tcp -j ISTIO_OUTPUT
-A ISTIO_INBOUND -p tcp -m tcp --dport 22 -j RETURN
-A ISTIO_INBOUND -p tcp -m tcp --dport 15020 -j RETURN
-A ISTIO_INBOUND -p tcp -j ISTIO_IN_REDIRECT
-A ISTIO_IN_REDIRECT -p tcp -j REDIRECT --to-ports 15006
-A ISTIO_OUTPUT -s 127.0.0.6/32 -o lo -j RETURN
-A ISTIO_OUTPUT ! -d 127.0.0.1/32 -o lo -j ISTIO_IN_REDIRECT
-A ISTIO_OUTPUT -m owner --uid-owner 1337 -j RETURN
-A ISTIO_OUTPUT -m owner --gid-owner 1337 -j RETURN
-A ISTIO_OUTPUT -d 127.0.0.1/32 -j RETURN
-A ISTIO_OUTPUT -j ISTIO_REDIRECT
-A ISTIO_REDIRECT -p tcp -j REDIRECT --to-ports 15001
COMMIT
# Completed on Wed Sep  9 18:27:19 2020
+ ip6tables-save
(base) root@US-BachirAm-1:~# kubectl logs ml-pipeline-ui-artifact-bd978746-c47ws -n amy-bachir -c istio-proxy
2020-09-09T18:27:21.487334Z     info    FLAG: --applicationPorts="[3000]"
2020-09-09T18:27:21.487367Z     info    FLAG: --binaryPath="/usr/local/bin/envoy"
2020-09-09T18:27:21.487373Z     info    FLAG: --concurrency="2"
2020-09-09T18:27:21.487377Z     info    FLAG: --configPath="/etc/istio/proxy"
2020-09-09T18:27:21.487381Z     info    FLAG: --connectTimeout="10s"
2020-09-09T18:27:21.487385Z     info    FLAG: --controlPlaneAuthPolicy="MUTUAL_TLS"
2020-09-09T18:27:21.487389Z     info    FLAG: --controlPlaneBootstrap="true"
2020-09-09T18:27:21.487393Z     info    FLAG: --customConfigFile=""
2020-09-09T18:27:21.487396Z     info    FLAG: --datadogAgentAddress=""
2020-09-09T18:27:21.487399Z     info    FLAG: --disableInternalTelemetry="false"
2020-09-09T18:27:21.487403Z     info    FLAG: --discoveryAddress="istio-pilot.istio-system:15011"
2020-09-09T18:27:21.487406Z     info    FLAG: --dnsRefreshRate="300s"
2020-09-09T18:27:21.487410Z     info    FLAG: --domain="amy-bachir.svc.cluster.local"
2020-09-09T18:27:21.487414Z     info    FLAG: --drainDuration="45s"
2020-09-09T18:27:21.487417Z     info    FLAG: --envoyAccessLogService=""
2020-09-09T18:27:21.487420Z     info    FLAG: --envoyMetricsServiceAddress=""
2020-09-09T18:27:21.487423Z     info    FLAG: --help="false"
2020-09-09T18:27:21.487426Z     info    FLAG: --id=""
2020-09-09T18:27:21.487430Z     info    FLAG: --ip=""
2020-09-09T18:27:21.487433Z     info    FLAG: --lightstepAccessToken=""
2020-09-09T18:27:21.487436Z     info    FLAG: --lightstepAddress=""
2020-09-09T18:27:21.487439Z     info    FLAG: --lightstepCacertPath=""
2020-09-09T18:27:21.487442Z     info    FLAG: --lightstepSecure="false"
2020-09-09T18:27:21.487445Z     info    FLAG: --log_as_json="false"
2020-09-09T18:27:21.487448Z     info    FLAG: --log_caller=""
2020-09-09T18:27:21.487451Z     info    FLAG: --log_output_level="default:info"
2020-09-09T18:27:21.487454Z     info    FLAG: --log_rotate=""
2020-09-09T18:27:21.487458Z     info    FLAG: --log_rotate_max_age="30"
2020-09-09T18:27:21.487461Z     info    FLAG: --log_rotate_max_backups="1000"
2020-09-09T18:27:21.487465Z     info    FLAG: --log_rotate_max_size="104857600"
2020-09-09T18:27:21.487469Z     info    FLAG: --log_stacktrace_level="default:none"
2020-09-09T18:27:21.487475Z     info    FLAG: --log_target="[stdout]"
2020-09-09T18:27:21.487478Z     info    FLAG: --mixerIdentity=""
2020-09-09T18:27:21.487481Z     info    FLAG: --parentShutdownDuration="1m0s"
2020-09-09T18:27:21.487484Z     info    FLAG: --pilotIdentity=""
2020-09-09T18:27:21.487489Z     info    FLAG: --proxyAdminPort="15000"
2020-09-09T18:27:21.487492Z     info    FLAG: --proxyComponentLogLevel="misc:error"
2020-09-09T18:27:21.487495Z     info    FLAG: --proxyLogLevel="warning"
2020-09-09T18:27:21.487499Z     info    FLAG: --serviceCluster="ml-pipeline-ui-artifact.amy-bachir"
2020-09-09T18:27:21.487502Z     info    FLAG: --serviceregistry="Kubernetes"
2020-09-09T18:27:21.487505Z     info    FLAG: --statsdUdpAddress=""
2020-09-09T18:27:21.487509Z     info    FLAG: --statusPort="15020"
2020-09-09T18:27:21.487512Z     info    FLAG: --templateFile=""
2020-09-09T18:27:21.487515Z     info    FLAG: --trust-domain=""
2020-09-09T18:27:21.487519Z     info    FLAG: --zipkinAddress="zipkin.istio-system:9411"
2020-09-09T18:27:21.487537Z     info    Version [email protected]/istio-release-release-1.3-20200214-10-15-3db95dfc23ffc081803b42549934915ba3b0a3d5-Clean
2020-09-09T18:27:21.487752Z     info    Obtained private IP [10.244.0.79]
2020-09-09T18:27:21.487804Z     info    Proxy role: &model.Proxy{ClusterID:"", Type:"sidecar", IPAddresses:[]string{"10.244.0.79", "10.244.0.79"}, ID:"ml-pipeline-ui-artifact-bd978746-c47ws.amy-bachir", Locality:(*core.Locality)(nil), DNSDomain:"amy-bachir.svc.cluster.local", TrustDomain:"cluster.local", PilotIdentity:"", MixerIdentity:"", ConfigNamespace:"", Metadata:map[string]string{}, SidecarScope:(*model.SidecarScope)(nil), MergedGateway:(*model.MergedGateway)(nil), ServiceInstances:[]*model.ServiceInstance(nil), WorkloadLabels:labels.Collection(nil), IstioVersion:(*model.IstioVersion)(nil)}
2020-09-09T18:27:21.487818Z     info    PilotSAN []string{"spiffe://cluster.local/ns/istio-system/sa/istio-pilot-service-account"}
2020-09-09T18:27:21.488316Z     info    Effective config: binaryPath: /usr/local/bin/envoy
concurrency: 2
configPath: /etc/istio/proxy
connectTimeout: 10s
controlPlaneAuthPolicy: MUTUAL_TLS
discoveryAddress: istio-pilot.istio-system:15011
drainDuration: 45s
envoyAccessLogService: {}
envoyMetricsService: {}
parentShutdownDuration: 60s
proxyAdminPort: 15000
serviceCluster: ml-pipeline-ui-artifact.amy-bachir
statNameLength: 189
tracing:
  zipkin:
    address: zipkin.istio-system:9411

2020-09-09T18:27:21.488363Z     info    waiting 1m0s for /var/run/sds/uds_path
2020-09-09T18:27:21.488413Z     info    PilotSAN []string{"spiffe://cluster.local/ns/istio-system/sa/istio-pilot-service-account"}
2020-09-09T18:27:21.488430Z     info    Opening status port 15020

2020-09-09T18:27:21.488495Z     info    Starting proxy agent
2020-09-09T18:27:21.489264Z     info    Received new config, resetting budget
2020-09-09T18:27:21.489282Z     info    Reconciling retry (budget 10)
2020-09-09T18:27:21.489298Z     info    Epoch 0 starting
2020-09-09T18:27:21.489354Z     warn    watching /etc/certs encountered an error no such file or directory
2020-09-09T18:27:21.495161Z     info    Envoy command: [-c /etc/istio/proxy/envoy-rev0.json --restart-epoch 0 --drain-time-s 45 --parent-shutdown-time-s 60 --service-cluster ml-pipeline-ui-artifact.amy-bachir --service-node sidecar~10.244.0.79~ml-pipeline-ui-artifact-bd978746-c47ws.amy-bachir~amy-bachir.svc.cluster.local --max-obj-name-len 189 --local-address-ip-version v4 --allow-unknown-fields -l warning --component-log-level misc:error --concurrency 2]
[2020-09-09 18:27:21.508][18][warning][config] [external/envoy/source/server/options_impl.cc:193] --allow-unknown-fields is deprecated, use --allow-unknown-static-fields instead.
[2020-09-09 18:27:21.541][18][warning][config] [bazel-out/k8-opt/bin/external/envoy/source/common/config/_virtual_includes/grpc_stream_lib/common/config/grpc_stream.h:87] gRPC config stream closed: 14, no healthy upstream
[2020-09-09 18:27:21.541][18][warning][config] [bazel-out/k8-opt/bin/external/envoy/source/common/config/_virtual_includes/grpc_stream_lib/common/config/grpc_stream.h:50] Unable to establish new stream
2020-09-09T18:27:22.171412Z     info    Envoy proxy is NOT ready: config not received from Pilot (is Pilot running?): cds updates: 0 successful, 0 rejected; lds updates: 0 successful, 0 rejected
[2020-09-09 18:27:23.053][18][warning][filter] [src/envoy/http/authn/http_filter_factory.cc:102] mTLS PERMISSIVE mode is used, connection can be either plaintext or TLS, and client cert can be omitted. Please consider to upgrade to mTLS STRICT mode for more secure configuration that only allows TLS connection with client cert. See https://istio.io/docs/tasks/security/mtls-migration/
[2020-09-09 18:27:23.057][18][warning][filter] [src/envoy/http/authn/http_filter_factory.cc:102] mTLS PERMISSIVE mode is used, connection can be either plaintext or TLS, and client cert can be omitted. Please consider to upgrade to mTLS STRICT mode for more secure configuration that only allows TLS connection with client cert. See https://istio.io/docs/tasks/security/mtls-migration/
[2020-09-09 18:27:23.479][18][warning][filter] [src/envoy/http/authn/http_filter_factory.cc:102] mTLS PERMISSIVE mode is used, connection can be either plaintext or TLS, and client cert can be omitted. Please consider to upgrade to mTLS STRICT mode for more secure configuration that only allows TLS connection with client cert. See https://istio.io/docs/tasks/security/mtls-migration/
[2020-09-09 18:27:23.483][18][warning][filter] [src/envoy/http/authn/http_filter_factory.cc:102] mTLS PERMISSIVE mode is used, connection can be either plaintext or TLS, and client cert can be omitted. Please consider to upgrade to mTLS STRICT mode for more secure configuration that only allows TLS connection with client cert. See https://istio.io/docs/tasks/security/mtls-migration/
2020-09-09T18:27:24.173037Z     info    Envoy proxy is ready
[2020-09-09 18:32:24.357][18][warning][config] [bazel-out/k8-opt/bin/external/envoy/source/common/config/_virtual_includes/grpc_stream_lib/common/config/grpc_stream.h:87] gRPC config stream closed: 13,
(base) root@US-BachirAm-1:~# kubectl logs ml-pipeline-ui-artifact-bd978746-c47ws -n amy-bachir -c ml-pipeline-ui-artifact
{
  argo: {
    archiveArtifactory: 'minio',
    archiveBucketName: 'mlpipeline',
    archiveLogs: false,
    archivePrefix: 'logs'
  },
  artifacts: 'Artifacts config contains credentials, so it is omitted',
  metadata: { envoyService: { host: 'localhost', port: '9090' } },
  pipeline: { host: 'localhost', port: '3001' },
  server: {
    apiVersionPrefix: 'apis/v1beta1',
    basePath: '/pipeline',
    deployment: 'NOT_SPECIFIED',
    port: 3000,
    staticDir: '/client'
  },
  viewer: {
    tensorboard: {
      podTemplateSpec: undefined,
      tfImageName: 'tensorflow/tensorflow'
    }
  },
  visualizations: { allowCustomVisualizations: false },
  gkeMetadata: { disabled: false },
  auth: {
    enabled: false,
    kubeflowUserIdHeader: 'x-goog-authenticated-user-email',
    kubeflowUserIdPrefix: 'accounts.google.com:'
  }
}
[HPM] Proxy created: /  ->  http://localhost:9090
[HPM] Proxy created: /  ->  http://127.0.0.1
[HPM] Subscribed to http-proxy events:  [ 'error', 'close' ]
[HPM] Proxy created: /  ->  http://127.0.0.1
[HPM] Subscribed to http-proxy events:  [ 'error', 'close' ]
[HPM] Proxy created: /  ->  http://localhost:3001
[HPM] Subscribed to http-proxy events:  [ 'proxyReq', 'error', 'close' ]
[HPM] Proxy created: /  ->  http://localhost:3001
[HPM] Subscribed to http-proxy events:  [ 'proxyReq', 'error', 'close' ]
Server listening at http://localhost:3000

@amybachir commented Sep 9, 2020

ip6tables v1.6.0: can't initialize ip6tables table `filter': Table does not exist (do you need to insmod?)
Perhaps ip6tables or your kernel needs to be upgraded.

This is the only thing that stands out to me. Here is my Kubernetes version:
Server Version: version.Info{Major:"1", Minor:"16", GitVersion:"v1.16.10", GitCommit:"89d8075525967c7a619641fabcb267358d28bf08", GitTreeState:"clean", BuildDate:"2020-06-23T02:52:37Z", GoVersion:"go1.13.9", Compiler:"gc", Platform:"linux/amd64"}

@JasonPad19

I have the same issue after installing Kubeflow 1.1 on Kubernetes v1.18.

Besides, I also encountered an installation issue:
WARN[0226] Encountered error applying application kubeflow-apps:
(kubeflow.error): Code 500 with message: Apply.Run : error when creating "/tmp/kout768084929": CustomResourceDefinition.apiextensions.k8s.io "seldondeployments.machinelearning.seldon.io" is invalid:
[spec.validation.openAPIV3Schema.properties[spec].properties[predictors].items.properties[explainer].properties[containerSpec].properties[ports].items.properties[protocol].default: Required value: this property is in x-kubernetes-list-map-keys, so it must have a default or be a required property, spec.validation.openAPIV3Schema.properties[spec].properties[predictors].items.properties[componentSpecs].items.properties[spec].properties[initContainers].items.properties[ports].items.properties[protocol].default: Required value: this property is in x-kubernetes-list-map-keys, so it must have a default or be a required property, spec.validation.openAPIV3Schema.properties[spec].properties[predictors].items.properties[componentSpecs].items.properties[spec].properties[containers].items.properties[ports].items.properties[protocol].default: Required value: this property is in x-kubernetes-list-map-keys, so it must have a default or be a required property] filename="kustomize/kustomize.go:266"
WARN[0226] Will retry in 4 seconds. filename="kustomize/kustomize.go:267"

@danishsamad (Contributor)

Similar issue, with a hack to resolve it, discussed here.
Both kfctl_k8s_istio.v1.1.0.yaml and kfctl_istio_dex.v1.1.0.yaml produce the same issue on AKS.

@thesuperzapper (Member)

Hi all, the solution to this is simple: just make sure your Namespace/kubeflow has the right labels.

PLEASE NOTE: you will need to recreate all resources in Namespace/kubeflow after making this change.

apiVersion: v1
kind: Namespace
metadata:
  name: kubeflow
  labels:
    control-plane: kubeflow
    istio-injection: enabled
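For an existing cluster, a minimal sketch of the same fix with kubectl (label values taken from the YAML above; deleting the pods is just one way to force re-creation so the sidecar gets injected):

# Add the labels to the existing namespace (idempotent with --overwrite)
kubectl label namespace kubeflow control-plane=kubeflow istio-injection=enabled --overwrite

# Confirm the labels took effect
kubectl get namespace kubeflow --show-labels

# Sidecars are injected only at pod creation time, so delete the pods and
# let their Deployments/StatefulSets re-create them with sidecars
kubectl delete pods --all -n kubeflow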

@JasonPad19

Hi, after installing Kubeflow I checked the namespace; it already contains the correct labels.

[screenshot: kubeflow namespace labels]

@Bobgy (Contributor) commented Sep 30, 2020

Can you share the output of kubectl get pod -n kubeflow?

@midhun1998 (Member)

Hi @amybachir. I'm facing the same issue with the ml-pipeline pod. The readiness probe is failing with 503. Did you find any fix or hack?

@Junaid-Ahmed94

@Bobgy I am also facing the same issue. Below is the output you requested.

[screenshot: kubectl get pod -n kubeflow output]

What I did:

  1. Used kfctl_azure_aad.v1.2.0
  2. Updated generic to multi-user as suggested in Pipelines Multi User

But after the above steps I still get the upstream connect error.

@solarist commented Feb 2, 2021

Same here (HTTP 503) on-prem with version 1.2 and Dex auth. The namespace has the correct labels, and disabling mTLS on ml-pipeline leads to an "RBAC: access denied" error.
(The Istio sidecar is also running for ml-pipeline.)

@Junaid-Ahmed94 commented Feb 3, 2021

So after going through a few other issues and using their 'hack', it worked (for the Azure KF 1.2 deployment): I changed ISTIO_MUTUAL to DISABLE in istio-authorization-config.yaml.
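For context, that change amounts to flipping the TLS mode on the pipeline DestinationRule(s). A sketch of the shape only; the resource name and host below are illustrative assumptions, not copied from istio-authorization-config.yaml:

apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: ml-pipeline        # name assumed for illustration
  namespace: kubeflow
spec:
  host: ml-pipeline.kubeflow.svc.cluster.local
  trafficPolicy:
    tls:
      mode: DISABLE        # was ISTIO_MUTUAL; turns off mTLS for this service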

@berndverst, after comparing the AKS KF deployment with a GKE KF deployment, we found that the AKS ns/kubeflow was missing sidecars; see the screenshot above as well.

@solarist commented Feb 5, 2021

So instead of disabling mTLS, the suggestion from @jonasdebeukelaer in kubeflow/kubeflow#5561 (comment) solved it for me.
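For readers who can't follow the link: the suggestion there involves a MeshPolicy, i.e. keeping mTLS on but relaxing it mesh-wide rather than disabling it per service. A rough sketch on the legacy Istio authentication API, under the assumption that the suggested mode is PERMISSIVE (this is not copied from the linked comment):

apiVersion: authentication.istio.io/v1alpha1
kind: MeshPolicy
metadata:
  name: default            # the mesh-wide policy must be named "default"
spec:
  peers:
  - mtls:
      mode: PERMISSIVE     # accept both plaintext and mTLS traffic (assumed)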

@Junaid-Ahmed94

@solarist I just checked my Azure deployment; I have the MeshPolicy as suggested by @jonasdebeukelaer, but access still doesn't work without disabling mTLS.

[screenshot: MeshPolicy configuration]

@berndverst (Member)

@Junaid-Ahmed94 Sorry you've had to troubleshoot this. I sort of inherited whatever was done for v1.1 and moved it over to v1.2, so it's no surprise the issue still exists here.

For Kubeflow v1.3 the goal is to support modern versions of Istio (I believe that work is on track for the v1.3 release). Hopefully I can remove the bundled Istio then and have everything just work. I don't know too much about Istio configuration.

@andre-lx commented Apr 29, 2021

UPDATE:

The problem is indeed that the Istio sidecars are not being injected into pods in the kubeflow namespace; the reason is explained in my comment here:

#5244 (comment)
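A quick way to spot the missing sidecars described above: list each pod together with its container names; pods without an istio-proxy container were created before injection took effect and will trigger the upstream connect error.

# Print "<pod-name>  <container names>" for every pod in the namespace
kubectl get pods -n kubeflow -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.containers[*].name}{"\n"}{end}'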

stale bot commented Aug 28, 2021

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

The stale bot added the lifecycle/stale label on Aug 28, 2021.
stale bot commented Mar 3, 2022

This issue has been automatically closed because it has not had recent activity. Please comment "/reopen" to reopen it.
