Pod exits with error 139 In-cluster #869

Closed
timdesi opened this issue Apr 7, 2022 · 9 comments
Labels
bug Something isn't working

Comments

timdesi commented Apr 7, 2022

Current and expected behavior

I am running the event_watcher example from the source code on latest master.

The out-of-cluster example runs as expected, but the in-cluster pod exits with error code 139 (i.e. 128 + 11, a SIGSEGV).

I have checked that all ENV variables are available inside the pod. The pod is running with the default service account, which has all admin permissions on my cluster. I also checked with go-client to be sure that everything is fine in the cluster.

Possible solution

No response

Additional context

Out of cluster

❯ cargo run --example event_watcher
    Finished dev [unoptimized + debuginfo] target(s) in 0.37s
     Running /prgs/rust/kube-rs/target/debug/examples/event_watcher
[2022-04-07T07:20:54Z DEBUG kube_client::client::builder] HTTP; http.method=GET http.url=https://192.168.64.2:8443/api/v1/events? otel.name="list" otel.kind="client"
[2022-04-07T07:20:54Z DEBUG kube_client::client::builder] requesting
[2022-04-07T07:20:54Z DEBUG kube_client::client::builder] HTTP; http.status_code=200
[2022-04-07T07:20:54Z INFO  event_watcher] New Event: Created container dnsutils (via Pod dnsutils)
[2022-04-07T07:20:54Z INFO  event_watcher] New Event: Started container dnsutils (via Pod dnsutils)
[2022-04-07T07:20:54Z INFO  event_watcher] New Event: Node is not ready (via Pod dnsutils)
[2022-04-07T07:20:54Z INFO  event_watcher] New Event: Container image "k8s.gcr.io/e2e-test-images/jessie-dnsutils:1.3" already present on machine (via Pod dnsutils)
[2022-04-07T07:20:54Z INFO  event_watcher] New Event: Node is not ready (via Pod in-cluster-49cv6)
[2022-04-07T07:20:54Z INFO  event_watcher] New Event: Node minikube status is now: NodeHasSufficientMemory (via Node minikube)
[2022-04-07T07:20:54Z INFO  event_watcher] New Event: Node minikube status is now: NodeHasNoDiskPressure (via Node minikube)
[2022-04-07T07:20:54Z INFO  event_watcher] New Event: Node minikube status is now: NodeHasSufficientPID (via Node minikube)

In-cluster
no logs at all ...

Environment

k8s with minikube

❯ kubectl version --short
Client Version: v1.22.4
Server Version: v1.23.3

Dockerfile OS : busybox or busybox42/alpine-pod

Configuration and features

kube = { path = "../kube", version = "^0.70.0", default-features = false, features = ["admission"] }
kube-derive = { path = "../kube-derive", version = "^0.70.0", default-features = false } # only needed to opt out of schema
k8s-openapi = { version = "0.14.0", default-features = false }

❯ cargo tree | grep kube
kube v0.70.0 (/prgs/rust/kube-rs/kube)
├── kube-client v0.70.0 (/prgs/rust/kube-rs/kube-client)
│   ├── kube-core v0.70.0 (/prgs/rust/kube-rs/kube-core)
└── kube-core v0.70.0 (/prgs/rust/kube-rs/kube-core) (*)

Affected crates

No response

Would you like to work on fixing this bug?

No response

timdesi added the bug label Apr 7, 2022
clux (Member) commented Apr 7, 2022

Hey there.

I think the configuration you have supplied is insufficient to run the event_watcher example:

kube = { path = "../kube", version = "^0.70.0", default-features = false, features = ["admission"] }

You have turned off default-features, which is OK if you are building an admission controller without the need for a kube client or runtime, but not if you are using the watch API with watcher (which uses both of those features).

Try adding features = ["admission", "client", "runtime", "openssl-tls"].
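
For illustration, the dependency line from your configuration with those features enabled might look something like this (path and version kept as-is from the issue; treat the exact feature list as a sketch):

kube = { path = "../kube", version = "^0.70.0", default-features = false, features = ["admission", "client", "runtime", "openssl-tls"] }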

nightkr (Member) commented Apr 7, 2022

That should just cause the build to fail though, not a segfault or other suicide once the pod is up and running...

kazk (Member) commented Apr 7, 2022

Dockerfile OS : busybox or busybox42/alpine-pod

@timdesi Can you post the Dockerfile?

timdesi (Author) commented Apr 8, 2022

Hi, the Dockerfile is like below:

FROM rust:alpine AS build

# Install Alpine Dependencies
RUN apk add --update make git bash openssl-dev musl-dev pkgconfig protoc
RUN rustup component add rustfmt

WORKDIR /src
COPY . .

# Build
RUN cd examples; cargo build --example event_watcher

# Final container
FROM busybox42/alpine-pod AS bin
COPY --from=build /src/target/debug/examples/event_watcher /app/
WORKDIR /app

CMD ./event_watcher

timdesi (Author) commented Apr 8, 2022

I also tried to debug/trace the code and found that the application gets past the from_cluster_env() function at kube_client/src/config/mod.rs line 178, but after that the pod exits with a segfault and I could not trace any further.

    pub async fn infer() -> Result<Self, InferConfigError> {
        let mut config = match Self::from_kubeconfig(&KubeConfigOptions::default()).await {
            Err(kubeconfig_err) => {
                tracing::trace!(
                    error = &kubeconfig_err as &dyn std::error::Error,
                    "no local config found, falling back to local in-cluster config"
                );

                Self::from_cluster_env().map_err(|in_cluster_err| InferConfigError {
                    in_cluster: in_cluster_err,
                    kubeconfig: kubeconfig_err,
                })?
            }
            Ok(success) => success,
        };
        config.apply_debug_overrides();
        Ok(config)
    }

This may help: the ENV variables that are injected into pods, extracted from another pod in the same namespace.


root@in-cluster-fn8ql:/# export
declare -x HOME="/root"
declare -x HOSTNAME="in-cluster-fn8ql"
declare -x KUBERNETES_PORT="tcp://10.96.0.1:443"
declare -x KUBERNETES_PORT_443_TCP="tcp://10.96.0.1:443"
declare -x KUBERNETES_PORT_443_TCP_ADDR="10.96.0.1"
declare -x KUBERNETES_PORT_443_TCP_PORT="443"
declare -x KUBERNETES_PORT_443_TCP_PROTO="tcp"
declare -x KUBERNETES_SERVICE_HOST="10.96.0.1"
declare -x KUBERNETES_SERVICE_PORT="443"
declare -x KUBERNETES_SERVICE_PORT_HTTPS="443"

clux (Member) commented Apr 8, 2022

This looks very similar to the problem encountered in #331, i.e. alpine struggling with openssl; see #331 (comment). That workaround might work, or try a different build environment. Not sure how many of us use alpine these days.
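
As an illustration only (not verified against this issue), one workaround that is sometimes used for musl/alpine linking problems is to build and statically link OpenSSL via the openssl crate's vendored feature, e.g. by adding to the example's Cargo.toml:

openssl = { version = "0.10", features = ["vendored"] }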

clux (Member) commented May 23, 2022

@timdesi Did you get to the bottom of this?

timdesi (Author) commented May 24, 2022

Unfortunately not as expected. The expectation was to use the most minimal container possible, i.e. scratch, busybox, or alpine. I found the same workarounds with other build environments as proposed in the previous comment. Thx.

timdesi closed this as completed May 24, 2022
clux (Member) commented May 24, 2022

Thanks for getting back. Yeah, compiling from alpine is problematic. You should still be able to use a minimal container for your production environment (scratch, busybox, alpine, distroless) if you use a cross-compiling builder as part of your CI process, but that builder image will not be as small as the final production image.
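
A rough sketch of that kind of setup (untested here; it assumes the statically linked, vendored OpenSSL approach mentioned above, and the image tags and paths are illustrative) could be a glibc-based builder cross-compiling to the musl target, with a busybox runtime image:

# Builder: Debian-based Rust image producing a static musl binary
FROM rust:slim AS build
RUN apt-get update && apt-get install -y musl-tools perl make pkg-config
RUN rustup target add x86_64-unknown-linux-musl

WORKDIR /src
COPY . .

# Build the example for the musl target (needs vendored OpenSSL for static linking)
RUN cd examples && cargo build --release --example event_watcher --target x86_64-unknown-linux-musl

# Minimal runtime image
FROM busybox AS bin
COPY --from=build /src/target/x86_64-unknown-linux-musl/release/examples/event_watcher /app/
WORKDIR /app
CMD ["./event_watcher"]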
