tikv not running #299

Closed
yanyixing opened this issue Mar 6, 2019 · 25 comments

@yanyixing

yanyixing commented Mar 6, 2019

I used tidb-operator to install TiDB on my three-node k8s environment, but the tikv pods are not running.
The result is as below:

kubectl get pods -n tidb
NAME                              READY   STATUS             RESTARTS   AGE
demo-discovery-5468c7c556-5c624   1/1     Running            5          102m
demo-monitor-84446b7957-wrxlg     2/2     Running            0          102m
demo-monitor-configurator-v697r   1/1     Running            0          102m
demo-pd-0                         0/1     CrashLoopBackOff   5          102m
demo-pd-1                         1/1     Running            0          102m
demo-pd-2                         1/1     Running            5          102m
demo-tidb-initializer-jfxwl       1/1     Running            0          102m
demo-tikv-0                       1/2     CrashLoopBackOff   21         58m
demo-tikv-1                       1/2     CrashLoopBackOff   16         58m
demo-tikv-2                       1/2     CrashLoopBackOff   16         57m
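
In a listing like the one above, the failing pods are the ones whose READY count is below the container total (here the three tikv pods and demo-pd-0). A minimal sketch for filtering them out of the table, assuming the default column layout of kubectl get pods; the function name not_ready_pods is hypothetical and not part of kubectl or tidb-operator:

```shell
#!/bin/sh
# Read `kubectl get pods` table output on stdin and print the NAME, READY and
# STATUS columns of every pod whose ready count differs from its total.
# Assumes the default table layout: NAME READY STATUS RESTARTS AGE.
not_ready_pods() {
  awk 'NR > 1 && NF >= 3 {
    # READY looks like "1/2": ready containers / total containers.
    split($2, ready, "/")
    if (ready[1] != ready[2]) print $1, $2, $3
  }'
}
```

It could be used as: kubectl get pods -n tidb | not_ready_pods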


kubectl describe pods demo-tikv-0 -n tidb
Name:               demo-tikv-0
Namespace:          tidb
Priority:           0
PriorityClassName:  <none>
Node:               umstor14/192.168.180.138
Start Time:         Wed, 06 Mar 2019 17:52:08 +0800
Labels:             app.kubernetes.io/component=tikv
                    app.kubernetes.io/instance=tidb-cluster
                    app.kubernetes.io/managed-by=tidb-operator
                    app.kubernetes.io/name=tidb-cluster
                    controller-revision-hash=demo-tikv-bffdb79d9
                    statefulset.kubernetes.io/pod-name=demo-tikv-0
Annotations:        pingcap.com/last-applied-configuration:
                      {"volumes":[{"name":"annotations","downwardAPI":{"items":[{"path":"annotations","fieldRef":{"fieldPath":"metadata.annotations"}}]}},{"name...
                    prometheus.io/path: /metrics
                    prometheus.io/port: 9091
                    prometheus.io/scrape: true
Status:             Running
IP:                 10.200.1.17
Controlled By:      StatefulSet/demo-tikv
Containers:
  tikv:
    Container ID:  docker://5b14f6d21e524992ea9974e2a1d61d40476cd328d77606d2b5a8afc7014fa563
    Image:         pingcap/tikv:v2.1.0
    Image ID:      docker-pullable://pingcap/tikv@sha256:7611c99f244fe537b7a00288b178187200611661503ca8401fe27f019e365db0
    Port:          20160/TCP
    Host Port:     0/TCP
    Command:
      /bin/sh
      /usr/local/bin/tikv_start_script.sh
    State:          Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Wed, 06 Mar 2019 17:52:26 +0800
      Finished:     Wed, 06 Mar 2019 17:52:26 +0800
    Last State:     Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Wed, 06 Mar 2019 17:52:10 +0800
      Finished:     Wed, 06 Mar 2019 17:52:10 +0800
    Ready:          False
    Restart Count:  2
    Environment:
      NAMESPACE:              tidb (v1:metadata.namespace)
      CLUSTER_NAME:           demo
      HEADLESS_SERVICE_NAME:  demo-tikv-peer
      CAPACITY:               0
      TZ:                     UTC
    Mounts:
      /etc/podinfo from annotations (ro)
      /etc/tikv from config (ro)
      /usr/local/bin from startup-script (ro)
      /var/lib/tikv from tikv (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-f4sff (ro)
  pushgateway:
    Container ID:   docker://8bc0e2e8f26588818bc107f9371ed08339907141eeb1338bfade613c4030b4d6
    Image:          prom/pushgateway:v0.3.1
    Image ID:       docker-pullable://prom/pushgateway@sha256:a108d9749fc0b9e6dac38c3c1dd612b24ff34f278078b0b70aba39c0aaced81e
    Port:           9091/TCP
    Host Port:      0/TCP
    State:          Running
      Started:      Wed, 06 Mar 2019 17:52:09 +0800
    Ready:          True
    Restart Count:  0
    Limits:
      cpu:     100m
      memory:  100Mi
    Requests:
      cpu:     50m
      memory:  50Mi
    Environment:
      TZ:  UTC
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-f4sff (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  tikv:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  tikv-demo-tikv-0
    ReadOnly:   false
  annotations:
    Type:  DownwardAPI (a volume populated by information about the pod)
    Items:
      metadata.annotations -> annotations
  config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      demo-tikv
    Optional:  false
  startup-script:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      demo-tikv
    Optional:  false
  default-token-f4sff:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-f4sff
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason     Age               From               Message
  ----     ------     ----              ----               -------
  Normal   Scheduled  20s               tidb-scheduler     Successfully assigned tidb/demo-tikv-0 to umstor14
  Normal   Pulled     19s               kubelet, umstor14  Container image "prom/pushgateway:v0.3.1" already present on machine
  Normal   Created    19s               kubelet, umstor14  Created container
  Normal   Started    19s               kubelet, umstor14  Started container
  Normal   Pulled     2s (x3 over 19s)  kubelet, umstor14  Container image "pingcap/tikv:v2.1.0" already present on machine
  Normal   Created    2s (x3 over 19s)  kubelet, umstor14  Created container
  Normal   Started    2s (x3 over 19s)  kubelet, umstor14  Started container
  Warning  BackOff    1s (x3 over 17s)  kubelet, umstor14  Back-off restarting failed container



kubectl get pv -n tidb
NAME                CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      CLAIM                   STORAGECLASS    REASON   AGE
local-pv-1157f630   99Gi       RWO            Retain           Bound       tidb/pd-demo-pd-2       local-storage            84m
local-pv-1552be44   99Gi       RWO            Delete           Available                           local-storage            84m
local-pv-1ad1b13f   99Gi       RWO            Delete           Available                           local-storage            82m
local-pv-226a8e57   99Gi       RWO            Delete           Available                           local-storage            84m
local-pv-2ac2c85b   99Gi       RWO            Retain           Bound       tidb/pd-demo-pd-0       local-storage            84m
local-pv-4b3596b6   99Gi       RWO            Retain           Bound       tidb/tikv-demo-tikv-0   local-storage            84m
local-pv-5e37e14e   99Gi       RWO            Retain           Bound       tidb/tikv-demo-tikv-1   local-storage            82m
local-pv-6ee370e9   149Gi      RWO            Delete           Available                           local-storage            84m
local-pv-7437a961   99Gi       RWO            Retain           Bound       tidb/pd-demo-pd-1       local-storage            83m
local-pv-76003c42   99Gi       RWO            Delete           Available                           local-storage            82m
local-pv-79bdc895   99Gi       RWO            Retain           Bound       tidb/tikv-demo-tikv-2   local-storage            82m
local-pv-8a1a18e2   149Gi      RWO            Delete           Available                           local-storage            84m
local-pv-9c46be9d   99Gi       RWO            Delete           Available                           local-storage            84m
local-pv-c5fc6ec3   149Gi      RWO            Delete           Available                           local-storage            82m
local-pv-dcb06550   149Gi      RWO            Delete           Available                           local-storage            82m
local-pv-e9dfc52c   99Gi       RWO            Delete           Available                           local-storage            82m


kubectl get pvc -n tidb
NAME               STATUS   VOLUME              CAPACITY   ACCESS MODES   STORAGECLASS    AGE
pd-demo-pd-0       Bound    local-pv-2ac2c85b   99Gi       RWO            local-storage   105m
pd-demo-pd-1       Bound    local-pv-7437a961   99Gi       RWO            local-storage   105m
pd-demo-pd-2       Bound    local-pv-1157f630   99Gi       RWO            local-storage   105m
tikv-demo-tikv-0   Bound    local-pv-4b3596b6   99Gi       RWO            local-storage   82m
tikv-demo-tikv-1   Bound    local-pv-5e37e14e   99Gi       RWO            local-storage   82m
tikv-demo-tikv-2   Bound    local-pv-79bdc895   99Gi       RWO            local-storage   82m

@weekface
Contributor

weekface commented Mar 6, 2019

@yanyixing Can you get the logs of tikv:

$ kubectl logs -f -n tidb demo-tikv-0
$ kubectl logs -f -n tidb demo-tikv-0 -p

And which version of tidb-operator are you using?
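
Note that the describe output above lists two containers in demo-tikv-0 (tikv and pushgateway), so kubectl logs generally needs -c to select one; depending on the kubectl version, omitting it either fails with a request to choose a container or falls back to a default one. A minimal sketch that collects the current and previous logs of each named container, assuming the standard -c, --previous and --tail flags of kubectl logs; the function name dump_pod_logs and the KUBECTL override are hypothetical helpers introduced only for this example:

```shell
#!/bin/sh
# Print the current and previous logs of the given containers in one pod.
# KUBECTL may be overridden (for example KUBECTL=echo) for a dry run.
KUBECTL="${KUBECTL:-kubectl}"

# Usage: dump_pod_logs NAMESPACE POD CONTAINER...
dump_pod_logs() {
  ns="$1"
  pod="$2"
  shift 2
  for c in "$@"; do
    echo "== $pod/$c (current) =="
    $KUBECTL logs -n "$ns" "$pod" -c "$c" --tail=100
    # --previous shows the last terminated instance, which is the useful one
    # for a container stuck in CrashLoopBackOff.
    echo "== $pod/$c (previous) =="
    $KUBECTL logs -n "$ns" "$pod" -c "$c" --previous --tail=100
  done
}
```

For this pod it could be called as: dump_pod_logs tidb demo-tikv-0 tikv pushgateway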

@yanyixing
Author

@weekface

kubectl logs -f -n tidb demo-tikv-0 -c tikv
starting tikv-server ...
/tikv-server --pd=demo-pd:2379 --advertise-addr=demo-tikv-0.demo-tikv-peer.tidb.svc:20160 --addr=0.0.0.0:20160 --data-dir=/var/lib/tikv --capacity=0 --config=/etc/tikv/tikv.toml

2019/03/06 09:57:55.620 INFO mod.rs:26: Welcome to TiKV.
Release Version:   2.1.3
Git Commit Hash:   f8c478ad646d045fb99cdf06274a7ad229e8b822
Git Commit Branch: release-2.1
UTC Build Time:    2019-01-28 06:38:24
Rust Version:      rustc 1.29.0-nightly (4f3c7a472 2018-07-17)
2019/03/06 09:57:55.620 INFO tikv-server.rs:432: using config: {
  "log-level": "info",
  "log-file": "",
  "log-rotation-timespan": "24h",
  "readpool": {
    "storage": {
      "high-concurrency": 4,
      "normal-concurrency": 4,
      "low-concurrency": 4,
      "max-tasks-per-worker-high": 2000,
      "max-tasks-per-worker-normal": 2000,
      "max-tasks-per-worker-low": 2000,
      "stack-size": "10MB"
    },
    "coprocessor": {
      "high-concurrency": 25,
      "normal-concurrency": 25,
      "low-concurrency": 25,
      "max-tasks-per-worker-high": 2000,
      "max-tasks-per-worker-normal": 2000,
      "max-tasks-per-worker-low": 2000,
      "stack-size": "10MB"
    }
  },
  "server": {
    "addr": "0.0.0.0:20160",
    "advertise-addr": "demo-tikv-0.demo-tikv-peer.tidb.svc:20160",
    "status-addr": "127.0.0.1:20180",
    "status-thread-pool-size": 1,
    "grpc-compression-type": "none",
    "grpc-concurrency": 4,
    "grpc-concurrent-stream": 1024,
    "grpc-raft-conn-num": 10,
    "grpc-stream-initial-window-size": "2MB",
    "grpc-keepalive-time": "10s",
    "grpc-keepalive-timeout": "3s",
    "concurrent-send-snap-limit": 32,
    "concurrent-recv-snap-limit": 32,
    "end-point-recursion-limit": 1000,
    "end-point-stream-channel-size": 8,
    "end-point-batch-row-limit": 64,
    "end-point-stream-batch-row-limit": 128,
    "end-point-request-max-handle-duration": "1m",
    "snap-max-write-bytes-per-sec": "100MB",
    "snap-max-total-size": "0KB",
    "labels": {}
  },
  "storage": {
    "data-dir": "/var/lib/tikv",
    "gc-ratio-threshold": 1.1,
    "max-key-size": 4096,
    "scheduler-notify-capacity": 10240,
    "scheduler-concurrency": 2048000,
    "scheduler-worker-pool-size": 8,
    "scheduler-pending-write-threshold": "100MB"
  },
  "pd": {
    "endpoints": [
      "demo-pd:2379"
    ]
  },
  "metric": {
    "interval": "15s",
    "address": "http://localhost:9091",
    "job": "tikv"
  },
  "raftstore": {
    "sync-log": true,
    "prevote": true,
    "raftdb-path": "/var/lib/tikv/raft",
    "capacity": "0KB",
    "raft-base-tick-interval": "1s",
    "raft-heartbeat-ticks": 2,
    "raft-election-timeout-ticks": 10,
    "raft-min-election-timeout-ticks": 10,
    "raft-max-election-timeout-ticks": 20,
    "raft-max-size-per-msg": "1MB",
    "raft-max-inflight-msgs": 256,
    "raft-entry-max-size": "8MB",
    "raft-log-gc-tick-interval": "10s",
    "raft-log-gc-threshold": 50,
    "raft-log-gc-count-limit": 73728,
    "raft-log-gc-size-limit": "72MB",
    "raft-entry-cache-life-time": "30s",
    "raft-reject-transfer-leader-duration": "3s",
    "split-region-check-tick-interval": "10s",
    "region-split-check-diff": "6MB",
    "region-compact-check-interval": "5m",
    "clean-stale-peer-delay": "11m",
    "region-compact-check-step": 100,
    "region-compact-min-tombstones": 10000,
    "region-compact-tombstones-percent": 30,
    "pd-heartbeat-tick-interval": "1m",
    "pd-store-heartbeat-tick-interval": "10s",
    "snap-mgr-gc-tick-interval": "1m",
    "snap-gc-timeout": "4h",
    "lock-cf-compact-interval": "10m",
    "lock-cf-compact-bytes-threshold": "256MB",
    "notify-capacity": 40960,
    "messages-per-tick": 4096,
    "max-peer-down-duration": "5m",
    "max-leader-missing-duration": "2h",
    "abnormal-leader-missing-duration": "10m",
    "peer-stale-state-check-interval": "5m",
    "leader-transfer-max-log-lag": 10,
    "snap-apply-batch-size": "10MB",
    "consistency-check-interval": "0s",
    "report-region-flow-interval": "1m",
    "raft-store-max-leader-lease": "9s",
    "right-derive-when-split": true,
    "allow-remove-leader": false,
    "merge-max-log-gap": 10,
    "merge-check-tick-interval": "10s",
    "use-delete-range": false,
    "cleanup-import-sst-interval": "10m",
    "local-read-batch-size": 1024
  },
  "coprocessor": {
    "split-region-on-table": true,
    "batch-split-limit": 10,
    "region-max-size": "144MB",
    "region-split-size": "96MB",
    "region-max-keys": 1440000,
    "region-split-keys": 960000
  },
  "rocksdb": {
    "wal-recovery-mode": 2,
    "wal-dir": "",
    "wal-ttl-seconds": 0,
    "wal-size-limit": "0KB",
    "max-total-wal-size": "4GB",
    "max-background-jobs": 6,
    "max-manifest-file-size": "20MB",
    "create-if-missing": true,
    "max-open-files": 40960,
    "enable-statistics": true,
    "stats-dump-period": "10m",
    "compaction-readahead-size": "0KB",
    "info-log-max-size": "1GB",
    "info-log-roll-time": "0s",
    "info-log-keep-log-file-num": 10,
    "info-log-dir": "",
    "rate-bytes-per-sec": "0KB",
    "bytes-per-sync": "1MB",
    "wal-bytes-per-sync": "512KB",
    "max-sub-compactions": 1,
    "writable-file-max-buffer-size": "1MB",
    "use-direct-io-for-flush-and-compaction": false,
    "enable-pipelined-write": true,
    "defaultcf": {
      "block-size": "64KB",
      "block-cache-size": "16050MB",
      "disable-block-cache": false,
      "cache-index-and-filter-blocks": true,
      "pin-l0-filter-and-index-blocks": true,
      "use-bloom-filter": true,
      "whole-key-filtering": true,
      "bloom-filter-bits-per-key": 10,
      "block-based-bloom-filter": false,
      "read-amp-bytes-per-bit": 0,
      "compression-per-level": [
        "no",
        "no",
        "lz4",
        "lz4",
        "lz4",
        "zstd",
        "zstd"
      ],
      "write-buffer-size": "128MB",
      "max-write-buffer-number": 5,
      "min-write-buffer-number-to-merge": 1,
      "max-bytes-for-level-base": "512MB",
      "target-file-size-base": "8MB",
      "level0-file-num-compaction-trigger": 4,
      "level0-slowdown-writes-trigger": 20,
      "level0-stop-writes-trigger": 36,
      "max-compaction-bytes": "2GB",
      "compaction-pri": 3,
      "dynamic-level-bytes": true,
      "num-levels": 7,
      "max-bytes-for-level-multiplier": 10,
      "compaction-style": 0,
      "disable-auto-compactions": false,
      "soft-pending-compaction-bytes-limit": "64GB",
      "hard-pending-compaction-bytes-limit": "256GB"
    },
    "writecf": {
      "block-size": "64KB",
      "block-cache-size": "9630MB",
      "disable-block-cache": false,
      "cache-index-and-filter-blocks": true,
      "pin-l0-filter-and-index-blocks": true,
      "use-bloom-filter": true,
      "whole-key-filtering": false,
      "bloom-filter-bits-per-key": 10,
      "block-based-bloom-filter": false,
      "read-amp-bytes-per-bit": 0,
      "compression-per-level": [
        "no",
        "no",
        "lz4",
        "lz4",
        "lz4",
        "zstd",
        "zstd"
      ],
      "write-buffer-size": "128MB",
      "max-write-buffer-number": 5,
      "min-write-buffer-number-to-merge": 1,
      "max-bytes-for-level-base": "512MB",
      "target-file-size-base": "8MB",
      "level0-file-num-compaction-trigger": 4,
      "level0-slowdown-writes-trigger": 20,
      "level0-stop-writes-trigger": 36,
      "max-compaction-bytes": "2GB",
      "compaction-pri": 3,
      "dynamic-level-bytes": true,
      "num-levels": 7,
      "max-bytes-for-level-multiplier": 10,
      "compaction-style": 0,
      "disable-auto-compactions": false,
      "soft-pending-compaction-bytes-limit": "64GB",
      "hard-pending-compaction-bytes-limit": "256GB"
    },
    "lockcf": {
      "block-size": "16KB",
      "block-cache-size": "1GB",
      "disable-block-cache": false,
      "cache-index-and-filter-blocks": true,
      "pin-l0-filter-and-index-blocks": true,
      "use-bloom-filter": true,
      "whole-key-filtering": true,
      "bloom-filter-bits-per-key": 10,
      "block-based-bloom-filter": false,
      "read-amp-bytes-per-bit": 0,
      "compression-per-level": [
        "no",
        "no",
        "no",
        "no",
        "no",
        "no",
        "no"
      ],
      "write-buffer-size": "128MB",
      "max-write-buffer-number": 5,
      "min-write-buffer-number-to-merge": 1,
      "max-bytes-for-level-base": "128MB",
      "target-file-size-base": "8MB",
      "level0-file-num-compaction-trigger": 1,
      "level0-slowdown-writes-trigger": 20,
      "level0-stop-writes-trigger": 36,
      "max-compaction-bytes": "2GB",
      "compaction-pri": 0,
      "dynamic-level-bytes": true,
      "num-levels": 7,
      "max-bytes-for-level-multiplier": 10,
      "compaction-style": 0,
      "disable-auto-compactions": false,
      "soft-pending-compaction-bytes-limit": "64GB",
      "hard-pending-compaction-bytes-limit": "256GB"
    },
    "raftcf": {
      "block-size": "16KB",
      "block-cache-size": "128MB",
      "disable-block-cache": false,
      "cache-index-and-filter-blocks": true,
      "pin-l0-filter-and-index-blocks": true,
      "use-bloom-filter": true,
      "whole-key-filtering": true,
      "bloom-filter-bits-per-key": 10,
      "block-based-bloom-filter": false,
      "read-amp-bytes-per-bit": 0,
      "compression-per-level": [
        "no",
        "no",
        "no",
        "no",
        "no",
        "no",
        "no"
      ],
      "write-buffer-size": "128MB",
      "max-write-buffer-number": 5,
      "min-write-buffer-number-to-merge": 1,
      "max-bytes-for-level-base": "128MB",
      "target-file-size-base": "8MB",
      "level0-file-num-compaction-trigger": 1,
      "level0-slowdown-writes-trigger": 20,
      "level0-stop-writes-trigger": 36,
      "max-compaction-bytes": "2GB",
      "compaction-pri": 0,
      "dynamic-level-bytes": true,
      "num-levels": 7,
      "max-bytes-for-level-multiplier": 10,
      "compaction-style": 0,
      "disable-auto-compactions": false,
      "soft-pending-compaction-bytes-limit": "64GB",
      "hard-pending-compaction-bytes-limit": "256GB"
    }
  },
  "raftdb": {
    "wal-recovery-mode": 2,
    "wal-dir": "",
    "wal-ttl-seconds": 0,
    "wal-size-limit": "0KB",
    "max-total-wal-size": "4GB",
    "max-manifest-file-size": "20MB",
    "create-if-missing": true,
    "max-open-files": 40960,
    "enable-statistics": true,
    "stats-dump-period": "10m",
    "compaction-readahead-size": "0KB",
    "info-log-max-size": "1GB",
    "info-log-roll-time": "0s",
    "info-log-keep-log-file-num": 10,
    "info-log-dir": "",
    "max-sub-compactions": 1,
    "writable-file-max-buffer-size": "1MB",
    "use-direct-io-for-flush-and-compaction": false,
    "enable-pipelined-write": true,
    "allow-concurrent-memtable-write": false,
    "bytes-per-sync": "1MB",
    "wal-bytes-per-sync": "512KB",
    "defaultcf": {
      "block-size": "64KB",
      "block-cache-size": "1284MB",
      "disable-block-cache": false,
      "cache-index-and-filter-blocks": true,
      "pin-l0-filter-and-index-blocks": true,
      "use-bloom-filter": false,
      "whole-key-filtering": true,
      "bloom-filter-bits-per-key": 10,
      "block-based-bloom-filter": false,
      "read-amp-bytes-per-bit": 0,
      "compression-per-level": [
        "no",
        "no",
        "lz4",
        "lz4",
        "lz4",
        "zstd",
        "zstd"
      ],
      "write-buffer-size": "128MB",
      "max-write-buffer-number": 5,
      "min-write-buffer-number-to-merge": 1,
      "max-bytes-for-level-base": "512MB",
      "target-file-size-base": "8MB",
      "level0-file-num-compaction-trigger": 4,
      "level0-slowdown-writes-trigger": 20,
      "level0-stop-writes-trigger": 36,
      "max-compaction-bytes": "2GB",
      "compaction-pri": 0,
      "dynamic-level-bytes": true,
      "num-levels": 7,
      "max-bytes-for-level-multiplier": 10,
      "compaction-style": 0,
      "disable-auto-compactions": false,
      "soft-pending-compaction-bytes-limit": "64GB",
      "hard-pending-compaction-bytes-limit": "256GB"
    }
  },
  "security": {
    "ca-path": "",
    "cert-path": "",
    "key-path": ""
  },
  "import": {
    "import-dir": "/tmp/tikv/import",
    "num-threads": 8,
    "num-import-jobs": 8,
    "num-import-sst-jobs": 2,
    "max-prepare-duration": "5m",
    "region-split-size": "96MB",
    "stream-channel-window": 128,
    "max-open-engines": 8
  }
}

@weekface
Contributor

weekface commented Mar 6, 2019

@yanyixing Get the previous logs of this tikv:

$ kubectl logs -f -n tidb demo-tikv-0 -p

And which version of tidb-operator are you using?

@yanyixing
Author

@weekface
pushgateway log

kubectl logs -f -n tidb demo-tikv-0 -c pushgateway
time="2019-03-06T09:52:09Z" level=info msg="Starting pushgateway (version=0.3.1, branch=master, revision=602f856b0e840cbabc7e4893ea75cf3e9298af3e)" source="main.go:57"
time="2019-03-06T09:52:09Z" level=info msg="Build context (go=go1.7.3, user=root@ddfa0705f939, date=20161103-13:45:57)" source="main.go:58"
time="2019-03-06T09:52:09Z" level=info msg="Listening on :9091." source="main.go:102"

@yanyixing
Author

@weekface
tidb-operator version: v1.0.0-beta.1-p2

@yanyixing
Author

Output of the previous tikv container's logs:

kubectl logs -f -n tidb demo-tikv-0 -p -c tikv
starting tikv-server ...
/tikv-server --pd=demo-pd:2379 --advertise-addr=demo-tikv-0.demo-tikv-peer.tidb.svc:20160 --addr=0.0.0.0:20160 --data-dir=/var/lib/tikv --capacity=0 --config=/etc/tikv/tikv.toml

2019/03/06 10:08:07.631 INFO mod.rs:26: Welcome to TiKV.
Release Version:   2.1.3
Git Commit Hash:   f8c478ad646d045fb99cdf06274a7ad229e8b822
Git Commit Branch: release-2.1
UTC Build Time:    2019-01-28 06:38:24
Rust Version:      rustc 1.29.0-nightly (4f3c7a472 2018-07-17)
2019/03/06 10:08:07.631 INFO tikv-server.rs:432: using config: {
  "log-level": "info",
  "log-file": "",
  "log-rotation-timespan": "24h",
  "readpool": {
    "storage": {
      "high-concurrency": 4,
      "normal-concurrency": 4,
      "low-concurrency": 4,
      "max-tasks-per-worker-high": 2000,
      "max-tasks-per-worker-normal": 2000,
      "max-tasks-per-worker-low": 2000,
      "stack-size": "10MB"
    },
    "coprocessor": {
      "high-concurrency": 25,
      "normal-concurrency": 25,
      "low-concurrency": 25,
      "max-tasks-per-worker-high": 2000,
      "max-tasks-per-worker-normal": 2000,
      "max-tasks-per-worker-low": 2000,
      "stack-size": "10MB"
    }
  },
  "server": {
    "addr": "0.0.0.0:20160",
    "advertise-addr": "demo-tikv-0.demo-tikv-peer.tidb.svc:20160",
    "status-addr": "127.0.0.1:20180",
    "status-thread-pool-size": 1,
    "grpc-compression-type": "none",
    "grpc-concurrency": 4,
    "grpc-concurrent-stream": 1024,
    "grpc-raft-conn-num": 10,
    "grpc-stream-initial-window-size": "2MB",
    "grpc-keepalive-time": "10s",
    "grpc-keepalive-timeout": "3s",
    "concurrent-send-snap-limit": 32,
    "concurrent-recv-snap-limit": 32,
    "end-point-recursion-limit": 1000,
    "end-point-stream-channel-size": 8,
    "end-point-batch-row-limit": 64,
    "end-point-stream-batch-row-limit": 128,
    "end-point-request-max-handle-duration": "1m",
    "snap-max-write-bytes-per-sec": "100MB",
    "snap-max-total-size": "0KB",
    "labels": {}
  },
  "storage": {
    "data-dir": "/var/lib/tikv",
    "gc-ratio-threshold": 1.1,
    "max-key-size": 4096,
    "scheduler-notify-capacity": 10240,
    "scheduler-concurrency": 2048000,
    "scheduler-worker-pool-size": 8,
    "scheduler-pending-write-threshold": "100MB"
  },
  "pd": {
    "endpoints": [
      "demo-pd:2379"
    ]
  },
  "metric": {
    "interval": "15s",
    "address": "http://localhost:9091",
    "job": "tikv"
  },
  "raftstore": {
    "sync-log": true,
    "prevote": true,
    "raftdb-path": "/var/lib/tikv/raft",
    "capacity": "0KB",
    "raft-base-tick-interval": "1s",
    "raft-heartbeat-ticks": 2,
    "raft-election-timeout-ticks": 10,
    "raft-min-election-timeout-ticks": 10,
    "raft-max-election-timeout-ticks": 20,
    "raft-max-size-per-msg": "1MB",
    "raft-max-inflight-msgs": 256,
    "raft-entry-max-size": "8MB",
    "raft-log-gc-tick-interval": "10s",
    "raft-log-gc-threshold": 50,
    "raft-log-gc-count-limit": 73728,
    "raft-log-gc-size-limit": "72MB",
    "raft-entry-cache-life-time": "30s",
    "raft-reject-transfer-leader-duration": "3s",
    "split-region-check-tick-interval": "10s",
    "region-split-check-diff": "6MB",
    "region-compact-check-interval": "5m",
    "clean-stale-peer-delay": "11m",
    "region-compact-check-step": 100,
    "region-compact-min-tombstones": 10000,
    "region-compact-tombstones-percent": 30,
    "pd-heartbeat-tick-interval": "1m",
    "pd-store-heartbeat-tick-interval": "10s",
    "snap-mgr-gc-tick-interval": "1m",
    "snap-gc-timeout": "4h",
    "lock-cf-compact-interval": "10m",
    "lock-cf-compact-bytes-threshold": "256MB",
    "notify-capacity": 40960,
    "messages-per-tick": 4096,
    "max-peer-down-duration": "5m",
    "max-leader-missing-duration": "2h",
    "abnormal-leader-missing-duration": "10m",
    "peer-stale-state-check-interval": "5m",
    "leader-transfer-max-log-lag": 10,
    "snap-apply-batch-size": "10MB",
    "consistency-check-interval": "0s",
    "report-region-flow-interval": "1m",
    "raft-store-max-leader-lease": "9s",
    "right-derive-when-split": true,
    "allow-remove-leader": false,
    "merge-max-log-gap": 10,
    "merge-check-tick-interval": "10s",
    "use-delete-range": false,
    "cleanup-import-sst-interval": "10m",
    "local-read-batch-size": 1024
  },
  "coprocessor": {
    "split-region-on-table": true,
    "batch-split-limit": 10,
    "region-max-size": "144MB",
    "region-split-size": "96MB",
    "region-max-keys": 1440000,
    "region-split-keys": 960000
  },
  "rocksdb": {
    "wal-recovery-mode": 2,
    "wal-dir": "",
    "wal-ttl-seconds": 0,
    "wal-size-limit": "0KB",
    "max-total-wal-size": "4GB",
    "max-background-jobs": 6,
    "max-manifest-file-size": "20MB",
    "create-if-missing": true,
    "max-open-files": 40960,
    "enable-statistics": true,
    "stats-dump-period": "10m",
    "compaction-readahead-size": "0KB",
    "info-log-max-size": "1GB",
    "info-log-roll-time": "0s",
    "info-log-keep-log-file-num": 10,
    "info-log-dir": "",
    "rate-bytes-per-sec": "0KB",
    "bytes-per-sync": "1MB",
    "wal-bytes-per-sync": "512KB",
    "max-sub-compactions": 1,
    "writable-file-max-buffer-size": "1MB",
    "use-direct-io-for-flush-and-compaction": false,
    "enable-pipelined-write": true,
    "defaultcf": {
      "block-size": "64KB",
      "block-cache-size": "16050MB",
      "disable-block-cache": false,
      "cache-index-and-filter-blocks": true,
      "pin-l0-filter-and-index-blocks": true,
      "use-bloom-filter": true,
      "whole-key-filtering": true,
      "bloom-filter-bits-per-key": 10,
      "block-based-bloom-filter": false,
      "read-amp-bytes-per-bit": 0,
      "compression-per-level": [
        "no",
        "no",
        "lz4",
        "lz4",
        "lz4",
        "zstd",
        "zstd"
      ],
      "write-buffer-size": "128MB",
      "max-write-buffer-number": 5,
      "min-write-buffer-number-to-merge": 1,
      "max-bytes-for-level-base": "512MB",
      "target-file-size-base": "8MB",
      "level0-file-num-compaction-trigger": 4,
      "level0-slowdown-writes-trigger": 20,
      "level0-stop-writes-trigger": 36,
      "max-compaction-bytes": "2GB",
      "compaction-pri": 3,
      "dynamic-level-bytes": true,
      "num-levels": 7,
      "max-bytes-for-level-multiplier": 10,
      "compaction-style": 0,
      "disable-auto-compactions": false,
      "soft-pending-compaction-bytes-limit": "64GB",
      "hard-pending-compaction-bytes-limit": "256GB"
    },
    "writecf": {
      "block-size": "64KB",
      "block-cache-size": "9630MB",
      "disable-block-cache": false,
      "cache-index-and-filter-blocks": true,
      "pin-l0-filter-and-index-blocks": true,
      "use-bloom-filter": true,
      "whole-key-filtering": false,
      "bloom-filter-bits-per-key": 10,
      "block-based-bloom-filter": false,
      "read-amp-bytes-per-bit": 0,
      "compression-per-level": [
        "no",
        "no",
        "lz4",
        "lz4",
        "lz4",
        "zstd",
        "zstd"
      ],
      "write-buffer-size": "128MB",
      "max-write-buffer-number": 5,
      "min-write-buffer-number-to-merge": 1,
      "max-bytes-for-level-base": "512MB",
      "target-file-size-base": "8MB",
      "level0-file-num-compaction-trigger": 4,
      "level0-slowdown-writes-trigger": 20,
      "level0-stop-writes-trigger": 36,
      "max-compaction-bytes": "2GB",
      "compaction-pri": 3,
      "dynamic-level-bytes": true,
      "num-levels": 7,
      "max-bytes-for-level-multiplier": 10,
      "compaction-style": 0,
      "disable-auto-compactions": false,
      "soft-pending-compaction-bytes-limit": "64GB",
      "hard-pending-compaction-bytes-limit": "256GB"
    },
    "lockcf": {
      "block-size": "16KB",
      "block-cache-size": "1GB",
      "disable-block-cache": false,
      "cache-index-and-filter-blocks": true,
      "pin-l0-filter-and-index-blocks": true,
      "use-bloom-filter": true,
      "whole-key-filtering": true,
      "bloom-filter-bits-per-key": 10,
      "block-based-bloom-filter": false,
      "read-amp-bytes-per-bit": 0,
      "compression-per-level": [
        "no",
        "no",
        "no",
        "no",
        "no",
        "no",
        "no"
      ],
      "write-buffer-size": "128MB",
      "max-write-buffer-number": 5,
      "min-write-buffer-number-to-merge": 1,
      "max-bytes-for-level-base": "128MB",
      "target-file-size-base": "8MB",
      "level0-file-num-compaction-trigger": 1,
      "level0-slowdown-writes-trigger": 20,
      "level0-stop-writes-trigger": 36,
      "max-compaction-bytes": "2GB",
      "compaction-pri": 0,
      "dynamic-level-bytes": true,
      "num-levels": 7,
      "max-bytes-for-level-multiplier": 10,
      "compaction-style": 0,
      "disable-auto-compactions": false,
      "soft-pending-compaction-bytes-limit": "64GB",
      "hard-pending-compaction-bytes-limit": "256GB"
    },
    "raftcf": {
      "block-size": "16KB",
      "block-cache-size": "128MB",
      "disable-block-cache": false,
      "cache-index-and-filter-blocks": true,
      "pin-l0-filter-and-index-blocks": true,
      "use-bloom-filter": true,
      "whole-key-filtering": true,
      "bloom-filter-bits-per-key": 10,
      "block-based-bloom-filter": false,
      "read-amp-bytes-per-bit": 0,
      "compression-per-level": [
        "no",
        "no",
        "no",
        "no",
        "no",
        "no",
        "no"
      ],
      "write-buffer-size": "128MB",
      "max-write-buffer-number": 5,
      "min-write-buffer-number-to-merge": 1,
      "max-bytes-for-level-base": "128MB",
      "target-file-size-base": "8MB",
      "level0-file-num-compaction-trigger": 1,
      "level0-slowdown-writes-trigger": 20,
      "level0-stop-writes-trigger": 36,
      "max-compaction-bytes": "2GB",
      "compaction-pri": 0,
      "dynamic-level-bytes": true,
      "num-levels": 7,
      "max-bytes-for-level-multiplier": 10,
      "compaction-style": 0,
      "disable-auto-compactions": false,
      "soft-pending-compaction-bytes-limit": "64GB",
      "hard-pending-compaction-bytes-limit": "256GB"
    }
  },
  "raftdb": {
    "wal-recovery-mode": 2,
    "wal-dir": "",
    "wal-ttl-seconds": 0,
    "wal-size-limit": "0KB",
    "max-total-wal-size": "4GB",
    "max-manifest-file-size": "20MB",
    "create-if-missing": true,
    "max-open-files": 40960,
    "enable-statistics": true,
    "stats-dump-period": "10m",
    "compaction-readahead-size": "0KB",
    "info-log-max-size": "1GB",
    "info-log-roll-time": "0s",
    "info-log-keep-log-file-num": 10,
    "info-log-dir": "",
    "max-sub-compactions": 1,
    "writable-file-max-buffer-size": "1MB",
    "use-direct-io-for-flush-and-compaction": false,
    "enable-pipelined-write": true,
    "allow-concurrent-memtable-write": false,
    "bytes-per-sync": "1MB",
    "wal-bytes-per-sync": "512KB",
    "defaultcf": {
      "block-size": "64KB",
      "block-cache-size": "1284MB",
      "disable-block-cache": false,
      "cache-index-and-filter-blocks": true,
      "pin-l0-filter-and-index-blocks": true,
      "use-bloom-filter": false,
      "whole-key-filtering": true,
      "bloom-filter-bits-per-key": 10,
      "block-based-bloom-filter": false,
      "read-amp-bytes-per-bit": 0,
      "compression-per-level": [
        "no",
        "no",
        "lz4",
        "lz4",
        "lz4",
        "zstd",
        "zstd"
      ],
      "write-buffer-size": "128MB",
      "max-write-buffer-number": 5,
      "min-write-buffer-number-to-merge": 1,
      "max-bytes-for-level-base": "512MB",
      "target-file-size-base": "8MB",
      "level0-file-num-compaction-trigger": 4,
      "level0-slowdown-writes-trigger": 20,
      "level0-stop-writes-trigger": 36,
      "max-compaction-bytes": "2GB",
      "compaction-pri": 0,
      "dynamic-level-bytes": true,
      "num-levels": 7,
      "max-bytes-for-level-multiplier": 10,
      "compaction-style": 0,
      "disable-auto-compactions": false,
      "soft-pending-compaction-bytes-limit": "64GB",
      "hard-pending-compaction-bytes-limit": "256GB"
    }
  },
  "security": {
    "ca-path": "",
    "cert-path": "",
    "key-path": ""
  },
  "import": {
    "import-dir": "/tmp/tikv/import",
    "num-threads": 8,
    "num-import-jobs": 8,
    "num-import-sst-jobs": 2,
    "max-prepare-duration": "5m",
    "region-split-size": "96MB",
    "stream-channel-window": 128,
    "max-open-engines": 8
  }
}

@weekface
Contributor

weekface commented Mar 6, 2019

I can't find any useful information here. What do the kubelet logs show?

$ sudo journalctl -f -u kubelet

@yanyixing
Author

yanyixing commented Mar 6, 2019

journalctl -f -u kubelet
-- Logs begin at Wed 2019-03-06 14:37:27 CST. --
Mar 06 18:20:58 umstor12 kubelet[11037]: E0306 18:20:58.478023   11037 pod_workers.go:190] Error syncing pod 225d5d69-3fec-11e9-8e7d-0023aeee79dd ("demo-tikv-1_tidb(225d5d69-3fec-11e9-8e7d-0023aeee79dd)"), skipping: failed to "StartContainer" for "tikv" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=tikv pod=demo-tikv-1_tidb(225d5d69-3fec-11e9-8e7d-0023aeee79dd)"
Mar 06 18:21:04 umstor12 kubelet[11037]: E0306 18:21:04.477616   11037 pod_workers.go:190] Error syncing pod 2b6dc73e-3fec-11e9-8e7d-0023aeee79dd ("demo-tikv-2_tidb(2b6dc73e-3fec-11e9-8e7d-0023aeee79dd)"), skipping: failed to "StartContainer" for "tikv" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=tikv pod=demo-tikv-2_tidb(2b6dc73e-3fec-11e9-8e7d-0023aeee79dd)"
Mar 06 18:21:09 umstor12 kubelet[11037]: E0306 18:21:09.477560   11037 pod_workers.go:190] Error syncing pod 225d5d69-3fec-11e9-8e7d-0023aeee79dd ("demo-tikv-1_tidb(225d5d69-3fec-11e9-8e7d-0023aeee79dd)"), skipping: failed to "StartContainer" for "tikv" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=tikv pod=demo-tikv-1_tidb(225d5d69-3fec-11e9-8e7d-0023aeee79dd)"
Mar 06 18:21:15 umstor12 kubelet[11037]: E0306 18:21:15.477516   11037 pod_workers.go:190] Error syncing pod 2b6dc73e-3fec-11e9-8e7d-0023aeee79dd ("demo-tikv-2_tidb(2b6dc73e-3fec-11e9-8e7d-0023aeee79dd)"), skipping: failed to "StartContainer" for "tikv" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=tikv pod=demo-tikv-2_tidb(2b6dc73e-3fec-11e9-8e7d-0023aeee79dd)"
Mar 06 18:21:23 umstor12 kubelet[11037]: E0306 18:21:23.477544   11037 pod_workers.go:190] Error syncing pod 225d5d69-3fec-11e9-8e7d-0023aeee79dd ("demo-tikv-1_tidb(225d5d69-3fec-11e9-8e7d-0023aeee79dd)"), skipping: failed to "StartContainer" for "tikv" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=tikv pod=demo-tikv-1_tidb(225d5d69-3fec-11e9-8e7d-0023aeee79dd)"
Mar 06 18:21:30 umstor12 kubelet[11037]: E0306 18:21:30.477624   11037 pod_workers.go:190] Error syncing pod 2b6dc73e-3fec-11e9-8e7d-0023aeee79dd ("demo-tikv-2_tidb(2b6dc73e-3fec-11e9-8e7d-0023aeee79dd)"), skipping: failed to "StartContainer" for "tikv" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=tikv pod=demo-tikv-2_tidb(2b6dc73e-3fec-11e9-8e7d-0023aeee79dd)"
Mar 06 18:21:36 umstor12 kubelet[11037]: E0306 18:21:36.477599   11037 pod_workers.go:190] Error syncing pod 225d5d69-3fec-11e9-8e7d-0023aeee79dd ("demo-tikv-1_tidb(225d5d69-3fec-11e9-8e7d-0023aeee79dd)"), skipping: failed to "StartContainer" for "tikv" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=tikv pod=demo-tikv-1_tidb(225d5d69-3fec-11e9-8e7d-0023aeee79dd)"
Mar 06 18:21:41 umstor12 kubelet[11037]: E0306 18:21:41.477561   11037 pod_workers.go:190] Error syncing pod 2b6dc73e-3fec-11e9-8e7d-0023aeee79dd ("demo-tikv-2_tidb(2b6dc73e-3fec-11e9-8e7d-0023aeee79dd)"), skipping: failed to "StartContainer" for "tikv" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=tikv pod=demo-tikv-2_tidb(2b6dc73e-3fec-11e9-8e7d-0023aeee79dd)"
Mar 06 18:21:51 umstor12 kubelet[11037]: E0306 18:21:51.477568   11037 pod_workers.go:190] Error syncing pod 225d5d69-3fec-11e9-8e7d-0023aeee79dd ("demo-tikv-1_tidb(225d5d69-3fec-11e9-8e7d-0023aeee79dd)"), skipping: failed to "StartContainer" for "tikv" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=tikv pod=demo-tikv-1_tidb(225d5d69-3fec-11e9-8e7d-0023aeee79dd)"
Mar 06 18:21:52 umstor12 kubelet[11037]: E0306 18:21:52.477625   11037 pod_workers.go:190] Error syncing pod 2b6dc73e-3fec-11e9-8e7d-0023aeee79dd ("demo-tikv-2_tidb(2b6dc73e-3fec-11e9-8e7d-0023aeee79dd)"), skipping: failed to "StartContainer" for "tikv" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=tikv pod=demo-tikv-2_tidb(2b6dc73e-3fec-11e9-8e7d-0023aeee79dd)"
Mar 06 18:22:03 umstor12 kubelet[11037]: E0306 18:22:03.477726   11037 pod_workers.go:190] Error syncing pod 225d5d69-3fec-11e9-8e7d-0023aeee79dd ("demo-tikv-1_tidb(225d5d69-3fec-11e9-8e7d-0023aeee79dd)"), skipping: failed to "StartContainer" for "tikv" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=tikv pod=demo-tikv-1_tidb(225d5d69-3fec-11e9-8e7d-0023aeee79dd)"
Mar 06 18:22:03 umstor12 kubelet[11037]: E0306 18:22:03.477779   11037 pod_workers.go:190] Error syncing pod 2b6dc73e-3fec-11e9-8e7d-0023aeee79dd ("demo-tikv-2_tidb(2b6dc73e-3fec-11e9-8e7d-0023aeee79dd)"), skipping: failed to "StartContainer" for "tikv" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=tikv pod=demo-tikv-2_tidb(2b6dc73e-3fec-11e9-8e7d-0023aeee79dd)"
journalctl -f -u kubelet
-- Logs begin at Wed 2019-03-06 13:33:55 CST. --
Mar 06 18:21:26 umstor14 kubelet[70948]: E0306 18:21:26.453838   70948 pod_workers.go:190] Error syncing pod 818d9466-3ff5-11e9-8e7d-0023aeee79dd ("demo-tikv-0_tidb(818d9466-3ff5-11e9-8e7d-0023aeee79dd)"), skipping: failed to "StartContainer" for "tikv" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=tikv pod=demo-tikv-0_tidb(818d9466-3ff5-11e9-8e7d-0023aeee79dd)"
Mar 06 18:21:40 umstor14 kubelet[70948]: E0306 18:21:40.454023   70948 pod_workers.go:190] Error syncing pod 818d9466-3ff5-11e9-8e7d-0023aeee79dd ("demo-tikv-0_tidb(818d9466-3ff5-11e9-8e7d-0023aeee79dd)"), skipping: failed to "StartContainer" for "tikv" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=tikv pod=demo-tikv-0_tidb(818d9466-3ff5-11e9-8e7d-0023aeee79dd)"
Mar 06 18:21:53 umstor14 kubelet[70948]: E0306 18:21:53.454032   70948 pod_workers.go:190] Error syncing pod 818d9466-3ff5-11e9-8e7d-0023aeee79dd ("demo-tikv-0_tidb(818d9466-3ff5-11e9-8e7d-0023aeee79dd)"), skipping: failed to "StartContainer" for "tikv" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=tikv pod=demo-tikv-0_tidb(818d9466-3ff5-11e9-8e7d-0023aeee79dd)"
Mar 06 18:22:08 umstor14 kubelet[70948]: E0306 18:22:08.454433   70948 pod_workers.go:190] Error syncing pod 818d9466-3ff5-11e9-8e7d-0023aeee79dd ("demo-tikv-0_tidb(818d9466-3ff5-11e9-8e7d-0023aeee79dd)"), skipping: failed to "StartContainer" for "tikv" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=tikv pod=demo-tikv-0_tidb(818d9466-3ff5-11e9-8e7d-0023aeee79dd)"
Mar 06 18:22:22 umstor14 kubelet[70948]: E0306 18:22:22.454050   70948 pod_workers.go:190] Error syncing pod 818d9466-3ff5-11e9-8e7d-0023aeee79dd ("demo-tikv-0_tidb(818d9466-3ff5-11e9-8e7d-0023aeee79dd)"), skipping: failed to "StartContainer" for "tikv" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=tikv pod=demo-tikv-0_tidb(818d9466-3ff5-11e9-8e7d-0023aeee79dd)"
Mar 06 18:22:35 umstor14 kubelet[70948]: E0306 18:22:35.454132   70948 pod_workers.go:190] Error syncing pod 818d9466-3ff5-11e9-8e7d-0023aeee79dd ("demo-tikv-0_tidb(818d9466-3ff5-11e9-8e7d-0023aeee79dd)"), skipping: failed to "StartContainer" for "tikv" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=tikv pod=demo-tikv-0_tidb(818d9466-3ff5-11e9-8e7d-0023aeee79dd)"
Mar 06 18:22:46 umstor14 kubelet[70948]: E0306 18:22:46.454069   70948 pod_workers.go:190] Error syncing pod 818d9466-3ff5-11e9-8e7d-0023aeee79dd ("demo-tikv-0_tidb(818d9466-3ff5-11e9-8e7d-0023aeee79dd)"), skipping: failed to "StartContainer" for "tikv" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=tikv pod=demo-tikv-0_tidb(818d9466-3ff5-11e9-8e7d-0023aeee79dd)"
Mar 06 18:23:00 umstor14 kubelet[70948]: E0306 18:23:00.454085   70948 pod_workers.go:190] Error syncing pod 818d9466-3ff5-11e9-8e7d-0023aeee79dd ("demo-tikv-0_tidb(818d9466-3ff5-11e9-8e7d-0023aeee79dd)"), skipping: failed to "StartContainer" for "tikv" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=tikv pod=demo-tikv-0_tidb(818d9466-3ff5-11e9-8e7d-0023aeee79dd)"
Mar 06 18:23:15 umstor14 kubelet[70948]: E0306 18:23:15.453936   70948 pod_workers.go:190] Error syncing pod 818d9466-3ff5-11e9-8e7d-0023aeee79dd ("demo-tikv-0_tidb(818d9466-3ff5-11e9-8e7d-0023aeee79dd)"), skipping: failed to "StartContainer" for "tikv" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=tikv pod=demo-tikv-0_tidb(818d9466-3ff5-11e9-8e7d-0023aeee79dd)"
Mar 06 18:23:30 umstor14 kubelet[70948]: E0306 18:23:30.404429   70948 pod_workers.go:190] Error syncing pod 818d9466-3ff5-11e9-8e7d-0023aeee79dd ("demo-tikv-0_tidb(818d9466-3ff5-11e9-8e7d-0023aeee79dd)"), skipping: failed to "StartContainer" for "tikv" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=tikv pod=demo-tikv-0_tidb(818d9466-3ff5-11e9-8e7d-0023aeee79dd)"

@weekface

@weekface
Contributor

weekface commented Mar 6, 2019

cat /etc/systemd/system/docker.service

What's the value of LimitNOFILE?

@yanyixing
Author

cat /usr/lib/systemd/system/docker.service
[Unit]
Description=Docker Application Container Engine
Documentation=https://docs.docker.com
After=network-online.target firewalld.service
Wants=network-online.target

[Service]
Type=notify
# the default is not to use systemd for cgroups because the delegate issues still
# exists and systemd currently does not support the cgroup feature set required
# for containers run by docker
ExecStart=/usr/bin/dockerd
ExecReload=/bin/kill -s HUP $MAINPID
# Having non-zero Limit*s causes performance problems due to accounting overhead
# in the kernel. We recommend using cgroups to do container-local accounting.
LimitNOFILE=infinity
LimitNPROC=infinity
LimitCORE=infinity
# Uncomment TasksMax if your systemd version supports it.
# Only systemd 226 and above support this version.
#TasksMax=infinity
TimeoutStartSec=0
# set delegate yes so that systemd does not reset the cgroups of docker containers
Delegate=yes
# kill only the docker process, not all processes in the cgroup
KillMode=process
# restart the docker process if it exits prematurely
Restart=on-failure
StartLimitBurst=3
StartLimitInterval=60s

[Install]
WantedBy=multi-user.target

LimitNOFILE is set to infinity. @weekface

@weekface
Contributor

weekface commented Mar 6, 2019

What is the ulimit on your host and inside the container?

$ ulimit -n

@yanyixing
Author

The host ulimit is 1024, and the container ulimit is 65536.
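
To compare these values for a specific process rather than for an interactive shell, here is a minimal sketch that reads the soft "Max open files" limit from a `/proc/<pid>/limits` file. The helper name `nofile_soft_limit` is hypothetical, and the sketch assumes the usual Linux `/proc` layout.

```shell
# Print the soft "Max open files" limit from a /proc/<pid>/limits-style file.
# nofile_soft_limit is a hypothetical helper name. With no argument it reads
# /proc/self/limits, i.e. the awk child, which inherits the shell's limits.
nofile_soft_limit() {
  # Line layout: "Max open files  <soft>  <hard>  files", so the soft limit is field 4.
  awk '/^Max open files/ { print $4 }' "${1:-/proc/self/limits}"
}

nofile_soft_limit
```

For the Docker daemon, something like `nofile_soft_limit /proc/$(pidof dockerd)/limits` should work (it may need root), and running the same function inside the container shows what the TiKV process would inherit.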

@weekface
Contributor

weekface commented Mar 6, 2019

Set LimitNOFILE and LimitNPROC to 1048576 in the file: /usr/lib/systemd/system/docker.service

@weekface
Contributor

weekface commented Mar 6, 2019

Sorry, it may be /etc/systemd/system/docker.service, not /usr/lib/systemd/system/docker.service.
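
If neither file is convenient to edit, a systemd drop-in is another option: unit files and drop-ins under /etc/systemd/system take precedence over the packaged unit in /usr/lib/systemd/system. A minimal sketch, assuming the value 1048576 suggested above; the helper name `render_docker_limits` and the drop-in file name `limits.conf` are hypothetical.

```shell
# Emit a systemd drop-in that overrides only the two limits discussed above.
# render_docker_limits is a hypothetical helper; the default of 1048576 is the
# value suggested in this thread, not a verified minimum.
render_docker_limits() {
  printf '[Service]\nLimitNOFILE=%s\nLimitNPROC=%s\n' "${1:-1048576}" "${1:-1048576}"
}

# Print the drop-in to stdout so it can be reviewed before installing it.
render_docker_limits
```

The output would go into a file such as /etc/systemd/system/docker.service.d/limits.conf, followed by `sudo systemctl daemon-reload` and `sudo systemctl restart docker` so the daemon picks up the new limits.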

@yanyixing
Author

kubectl get pods -n tidb
NAME                              READY   STATUS      RESTARTS   AGE
demo-discovery-5468c7c556-5c624   1/1     Running     5          3h15m
demo-monitor-84446b7957-wrxlg     2/2     Running     0          3h15m
demo-monitor-configurator-v697r   1/1     Running     0          3h15m
demo-pd-0                         1/1     Running     6          3h15m
demo-pd-1                         1/1     Running     0          3h15m
demo-pd-2                         1/1     Running     5          3h15m
demo-tidb-0                       1/1     Running     0          2m7s
demo-tidb-1                       1/1     Running     0          2m7s
demo-tidb-initializer-jfxwl       0/1     Completed   0          3h15m
demo-tikv-0                       2/2     Running     0          107s
demo-tikv-1                       2/2     Running     34         151m
demo-tikv-2                       2/2     Running     0          2m23s

Thanks, TiKV is running now. @weekface

@aylei
Contributor

aylei commented Mar 6, 2019

I think we should update the corresponding section of setup.md, which seems outdated:

Because TiDB by default will use at most 40960 file descriptors, the worker node and its Docker daemon's ulimit must be configured to greater than 40960:

$ sudo vim /etc/systemd/system/docker.service

Set LimitNOFILE to equal or greater than 40960.

@weekface
Contributor

weekface commented Mar 6, 2019

Yes, I will open a PR.

@weekface
Contributor

Closing it in favor of #300

@wgimperial

I also used tidb-operator to install TiDB on my three-node k8s environment, and TiKV is not running.
tidb-operator version: release-1.0

kubectl logs -f -n tidb demo-tikv-0 -c tikv

result:
"starting tikv-server ...
/tikv-server --pd=demo-pd:2379 --advertise-addr=demo-tikv-0.demo-tikv-peer.tidb.svc:20160 --addr=0.0.0.0:20160 --data-dir=/var/lib/tikv --capacity=0 --config=/etc/tikv/tikv.toml"

All the logs are here.
@weekface, please take a look.

@weekface
Contributor

@wgimperial

First, add an annotation to the TiKV pod:

$ kubectl annotate po -n tidb demo-tikv-0 runmode=debug

Wait for the pod to restart, then exec into the tikv container and start tikv-server manually:

$ kubectl exec -it -n tidb demo-tikv-0 -c tikv -- sh

/tikv-server --pd=demo-pd:2379 --advertise-addr=demo-tikv-0.demo-tikv-peer.tidb.svc:20160 --addr=0.0.0.0:20160 --data-dir=/var/lib/tikv --capacity=0 --config=/etc/tikv/tikv.toml

You should then see the full error output. Please let me know if you run into a problem.

@tennix
Member

tennix commented Apr 18, 2019

Note that TiKV logs may be lost when it panics; see tikv/tikv#4328 and tikv/tikv#3387. This was fixed recently in tikv/tikv#4448. So to view the full error logs, @wgimperial, you have to follow @weekface's suggestion.

@wgimperial

@weekface

I have set LimitNOFILE and LimitNPROC to 1048576 in /usr/lib/systemd/system/docker.service and reloaded my docker service.

Then I followed your suggestion and added an annotation to the TiKV pod.
After running <kubectl annotate po -n tidb demo-tikv-0 runmode=debug>, demo-tikv-0 is running. Result:

demo-tikv-0 2/2 running 5 58m
demo-tikv-1 1/2 CrashLoopBackOff 7 58m
demo-tikv-2 1/2 CrashLoopBackOff 7 57m

Then I ran:
$ kubectl exec -it -n tidb demo-tikv-0
/tikv-server --pd=demo-pd:2379 --advertise-addr=demo-tikv-0.demo-tikv-peer.tidb.svc:20160 --addr=0.0.0.0:20160 --data-dir=/var/lib/tikv --capacity=0 --config=/etc/tikv/tikv.toml

I get the full error log:
ERROR tikv-server.rs:84: Limit("the maximum number of open file descriptors is too small ,got 65536, expect greater or equal to 82920")
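
The 82920 in this message matches 40960 + 40960 + 1000: the `max-open-files` value of 40960 shown for raftdb in the config dump, the same value assumed for the rocksdb section, plus a reserve of 1000. That breakdown is an assumption inferred from the numbers, not something confirmed from the TiKV source. A minimal sketch of the check, with hypothetical helper names:

```shell
# Hypothetical helper: sum the two max-open-files settings plus a reserve.
# The default reserve of 1000 is an assumption inferred from the error message.
required_fds() {
  echo $(( $1 + $2 + ${3:-1000} ))
}

# Succeed when the limit ($1) satisfies the requirement ($2).
# "unlimited" is what /proc/<pid>/limits and `ulimit -n` report for no cap.
fds_sufficient() {
  [ "$1" = "unlimited" ] || [ "$1" -ge "$2" ]
}

required=$(required_fds 40960 40960)
if fds_sufficient 65536 "$required"; then
  echo "65536 is enough (need $required)"
else
  echo "65536 is too small (need $required)"
fi
# prints: 65536 is too small (need 82920)
```

Under that assumption, the 1048576 value suggested earlier in the thread clears the requirement with a wide margin, while the container's 65536 does not.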

@weekface
Contributor

You should edit /etc/systemd/system/docker.service instead of /usr/lib/systemd/system/docker.service.

@wgimperial

@weekface It's OK now. Thank you very much!

@romberli

You should edit /etc/systemd/system/docker.service instead of /usr/lib/systemd/system/docker.service.

I'm using CentOS 7.8, and there is no /etc/systemd/system/docker.service file. There is, however, /etc/systemd/system/multi-user.target.wants/docker.service, which is a symlink to /usr/lib/systemd/system/docker.service,
so I wonder why we should edit /etc/systemd/system/docker.service instead of /usr/lib/systemd/system/docker.service.

