
Write binlog failed, the last error no avaliable pump to write binlog #753

Closed · thuyenptr opened this issue Sep 19, 2019 · 12 comments

@thuyenptr

General Question

  • Today, I tried to deploy a TiDB cluster and TiDB-Binlog using tidb-docker-compose. After the deployment, I got these errors in the TiDB log:
[2019/09/19 11:26:47.055 +00:00] [INFO] [printer.go:40] ["Welcome to TiDB."] ["Release Version"=v4.0.0-alpha-284-g6c0ec6247-dirty] ["Git Commit Hash"=6c0ec6247c8e5bf3eb38e87c4c3a08f10561761d] ["Git Branch"=master] ["UTC Build Time"="2019-09-18 01:09:59"] [GoVersion="go version go1.12 linux/amd64"] ["Race Enabled"=false] ["Check Table Before Drop"=false] ["TiKV Min Version"=v3.0.0-60965b006877ca7234adaced7890d7b029ed1306]
[2019/09/19 11:26:47.060 +00:00] [INFO] [printer.go:53] ["loaded config"] [config="{\"host\":\"0.0.0.0\",\"advertise-address\":\"\",\"port\":4000,\"cors\":\"\",\"store\":\"tikv\",\"path\":\"pd0:2379,pd1:2379,pd2:2379\",\"socket\":\"\",\"lease\":\"0\",\"run-ddl\":true,\"split-table\":true,\"token-limit\":1000,\"oom-action\":\"log\",\"mem-quota-query\":34359738368,\"enable-streaming\":false,\"txn-local-latches\":{\"enabled\":false,\"capacity\":2048000},\"lower-case-table-names\":2,\"log\":{\"level\":\"error\",\"format\":\"text\",\"disable-timestamp\":false,\"disable-error-stack\":true,\"file\":{\"filename\":\"/logs/tidb.log\",\"log-rotate\":true,\"max-size\":300,\"max-days\":0,\"max-backups\":0},\"slow-query-file\":\"\",\"slow-threshold\":300,\"expensive-threshold\":10000,\"query-log-max-len\":2048},\"security\":{\"skip-grant-table\":false,\"ssl-ca\":\"\",\"ssl-cert\":\"\",\"ssl-key\":\"\",\"cluster-ssl-ca\":\"\",\"cluster-ssl-cert\":\"\",\"cluster-ssl-key\":\"\"},\"status\":{\"status-host\":\"0.0.0.0\",\"metrics-addr\":\"pushgateway:9091\",\"status-port\":10080,\"metrics-interval\":15,\"report-status\":true,\"record-db-qps\":false},\"performance\":{\"max-procs\":0,\"max-memory\":0,\"stats-lease\":\"3s\",\"stmt-count-limit\":5000,\"feedback-probability\":0,\"query-feedback-limit\":1024,\"pseudo-estimate-ratio\":0.7,\"force-priority\":\"NO_PRIORITY\",\"bind-info-lease\":\"3s\",\"txn-total-size-limit\":104857600,\"tcp-keep-alive\":true,\"cross-join\":true,\"run-auto-analyze\":true},\"prepared-plan-cache\":{\"enabled\":false,\"capacity\":100,\"memory-guard-ratio\":0.1},\"opentracing\":{\"enable\":false,\"rpc-metrics\":false,\"sampler\":{\"type\":\"const\",\"param\":1,\"sampling-server-url\":\"\",\"max-operations\":0,\"sampling-refresh-interval\":0},\"reporter\":{\"queue-size\":0,\"buffer-flush-interval\":0,\"log-spans\":false,\"local-agent-host-port\":\"\"}},\"proxy-protocol\":{\"networks\":\"\",\"header-timeout\":5},\"tikv-client\":{\"grpc-connection-count\":16,\"grpc-keepalive-time\":10,\"grpc-keepalive-timeout\":3,\"commit-timeout\":\"41s\",\"max-txn-time-use\":590,\"max-batch-size\":128,\"overload-threshold\":200,\"max-batch-wait-time\":0,\"batch-wait-size\":8},\"binlog\":{\"enable\":true,\"ignore-error\":false,\"write-timeout\":\"15s\",\"binlog-socket\":\"\",\"strategy\":\"range\"},\"compatible-kill-query\":false,\"plugin\":{\"dir\":\"\",\"load\":\"\"},\"pessimistic-txn\":{\"enable\":true,\"max-retry-count\":256,\"ttl\":\"40s\"},\"check-mb4-value-in-utf8\":true,\"treat-old-version-utf8-as-utf8mb4\":true,\"enable-table-lock\":false,\"delay-clean-table-lock\":0,\"split-region-max-num\":1000,\"stmt-summary\":{\"max-stmt-count\":100,\"max-sql-length\":4096}}"]
[2019/09/19 11:27:30.216 +00:00] [INFO] [printer.go:40] ["Welcome to TiDB."] ["Release Version"=v4.0.0-alpha-284-g6c0ec6247-dirty] ["Git Commit Hash"=6c0ec6247c8e5bf3eb38e87c4c3a08f10561761d] ["Git Branch"=master] ["UTC Build Time"="2019-09-18 01:09:59"] [GoVersion="go version go1.12 linux/amd64"] ["Race Enabled"=false] ["Check Table Before Drop"=false] ["TiKV Min Version"=v3.0.0-60965b006877ca7234adaced7890d7b029ed1306]
[2019/09/19 11:27:30.218 +00:00] [INFO] [printer.go:53] ["loaded config"] [config="{\"host\":\"0.0.0.0\",\"advertise-address\":\"\",\"port\":4000,\"cors\":\"\",\"store\":\"tikv\",\"path\":\"pd0:2379,pd1:2379,pd2:2379\",\"socket\":\"\",\"lease\":\"0\",\"run-ddl\":true,\"split-table\":true,\"token-limit\":1000,\"oom-action\":\"log\",\"mem-quota-query\":34359738368,\"enable-streaming\":false,\"txn-local-latches\":{\"enabled\":false,\"capacity\":2048000},\"lower-case-table-names\":2,\"log\":{\"level\":\"error\",\"format\":\"text\",\"disable-timestamp\":false,\"disable-error-stack\":true,\"file\":{\"filename\":\"/logs/tidb.log\",\"log-rotate\":true,\"max-size\":300,\"max-days\":0,\"max-backups\":0},\"slow-query-file\":\"\",\"slow-threshold\":300,\"expensive-threshold\":10000,\"query-log-max-len\":2048},\"security\":{\"skip-grant-table\":false,\"ssl-ca\":\"\",\"ssl-cert\":\"\",\"ssl-key\":\"\",\"cluster-ssl-ca\":\"\",\"cluster-ssl-cert\":\"\",\"cluster-ssl-key\":\"\"},\"status\":{\"status-host\":\"0.0.0.0\",\"metrics-addr\":\"pushgateway:9091\",\"status-port\":10080,\"metrics-interval\":15,\"report-status\":true,\"record-db-qps\":false},\"performance\":{\"max-procs\":0,\"max-memory\":0,\"stats-lease\":\"3s\",\"stmt-count-limit\":5000,\"feedback-probability\":0,\"query-feedback-limit\":1024,\"pseudo-estimate-ratio\":0.7,\"force-priority\":\"NO_PRIORITY\",\"bind-info-lease\":\"3s\",\"txn-total-size-limit\":104857600,\"tcp-keep-alive\":true,\"cross-join\":true,\"run-auto-analyze\":true},\"prepared-plan-cache\":{\"enabled\":false,\"capacity\":100,\"memory-guard-ratio\":0.1},\"opentracing\":{\"enable\":false,\"rpc-metrics\":false,\"sampler\":{\"type\":\"const\",\"param\":1,\"sampling-server-url\":\"\",\"max-operations\":0,\"sampling-refresh-interval\":0},\"reporter\":{\"queue-size\":0,\"buffer-flush-interval\":0,\"log-spans\":false,\"local-agent-host-port\":\"\"}},\"proxy-protocol\":{\"networks\":\"\",\"header-timeout\":5},\"tikv-client\":{\"grpc-connection-count\":16,\"grpc-keepalive-time\":10,\"grpc-keepalive-timeout\":3,\"commit-timeout\":\"41s\",\"max-txn-time-use\":590,\"max-batch-size\":128,\"overload-threshold\":200,\"max-batch-wait-time\":0,\"batch-wait-size\":8},\"binlog\":{\"enable\":true,\"ignore-error\":false,\"write-timeout\":\"15s\",\"binlog-socket\":\"\",\"strategy\":\"range\"},\"compatible-kill-query\":false,\"plugin\":{\"dir\":\"\",\"load\":\"\"},\"pessimistic-txn\":{\"enable\":true,\"max-retry-count\":256,\"ttl\":\"40s\"},\"check-mb4-value-in-utf8\":true,\"treat-old-version-utf8-as-utf8mb4\":true,\"enable-table-lock\":false,\"delay-clean-table-lock\":0,\"split-region-max-num\":1000,\"stmt-summary\":{\"max-stmt-count\":100,\"max-sql-length\":4096}}"]
[2019/09/19 11:28:41.568 +00:00] [ERROR] [binloginfo.go:115] ["write binlog failed"] [error="write binlog failed, the last error rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection closed"]
[2019/09/19 11:28:41.571 +00:00] [ERROR] [ddl_worker.go:141] ["[ddl] handle DDL job failed"] [worker="worker 1, tp general"] [error="[global:3]critical error write binlog failed, the last error rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection closed"]
[2019/09/19 11:29:24.105 +00:00] [ERROR] [binloginfo.go:115] ["write binlog failed"] [error="write binlog failed, the last error rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection closed"]
[2019/09/19 11:29:24.106 +00:00] [ERROR] [ddl_worker.go:141] ["[ddl] handle DDL job failed"] [worker="worker 1, tp general"] [error="[global:3]critical error write binlog failed, the last error rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection closed"]
[2019/09/19 11:30:06.734 +00:00] [ERROR] [binloginfo.go:115] ["write binlog failed"] [error="write binlog failed, the last error rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection closed"]
[2019/09/19 11:30:06.735 +00:00] [ERROR] [ddl_worker.go:141] ["[ddl] handle DDL job failed"] [worker="worker 1, tp general"] [error="[global:3]critical error write binlog failed, the last error rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection closed"]
[2019/09/19 11:30:49.757 +00:00] [ERROR] [binloginfo.go:115] ["write binlog failed"] [error="write binlog failed, the last error rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection closed"]
[2019/09/19 11:30:49.758 +00:00] [ERROR] [ddl_worker.go:141] ["[ddl] handle DDL job failed"] [worker="worker 1, tp general"] [error="[global:3]critical error write binlog failed, the last error rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection closed"]
[2019/09/19 11:31:32.431 +00:00] [ERROR] [binloginfo.go:115] ["write binlog failed"] [error="write binlog failed, the last error rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection closed"]
[2019/09/19 11:31:32.432 +00:00] [ERROR] [ddl_worker.go:141] ["[ddl] handle DDL job failed"] [worker="worker 1, tp general"] [error="[global:3]critical error write binlog failed, the last error rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection closed"]
[2019/09/19 11:32:15.210 +00:00] [ERROR] [binloginfo.go:115] ["write binlog failed"] [error="write binlog failed, the last error rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection closed"]
[2019/09/19 11:32:15.215 +00:00] [ERROR] [ddl_worker.go:141] ["[ddl] handle DDL job failed"] [worker="worker 1, tp general"] [error="[global:3]critical error write binlog failed, the last error rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection closed"]
[2019/09/19 11:32:58.195 +00:00] [ERROR] [binloginfo.go:115] ["write binlog failed"] [error="write binlog failed, the last error rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection closed"]
[2019/09/19 11:32:58.196 +00:00] [ERROR] [ddl_worker.go:141] ["[ddl] handle DDL job failed"] [worker="worker 1, tp general"] [error="[global:3]critical error write binlog failed, the last error rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection closed"]
[2019/09/19 11:33:28.213 +00:00] [ERROR] [binloginfo.go:115] ["write binlog failed"] [error="write binlog failed, the last error no avaliable pump to write binlog"]
[2019/09/19 11:33:28.215 +00:00] [ERROR] [ddl_worker.go:141] ["[ddl] handle DDL job failed"] [worker="worker 1, tp general"] [error="[global:3]critical error write binlog failed, the last error no avaliable pump to write binlog"]
  • I tried to connect to PD and query the pump cluster info, and it seems to work normally. Here is the code:
// Connect to PD and list the pumps that the pumps client considers available
// (pumpCli and pd here are the tidb-binlog pump_client and PD client packages).
etcdUrls := "http://127.0.0.1:2379,http://127.0.0.1:2379,http://127.0.0.1:2379"
client, err := pumpCli.NewPumpsClient(etcdUrls, "range", 15*time.Second, pd.SecurityOption{})
if err != nil {
	logrus.Error(err)
	return
}

// Note: AvaliablePumps is the field's actual (misspelled) name in the client library.
for key, value := range client.Pumps.AvaliablePumps {
	logrus.Infof("id: %v, value: %v", key, value)
}

Can you help me to solve this problem? Thank you.

@july2993
Contributor

  • Make sure Pump is started before TiDB starts (one way to enforce this ordering with docker-compose is sketched below).
  • If Pump is working normally, you can simply restart TiDB to recover.
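
For a tidb-docker-compose style deployment, one possible way to enforce that ordering: compose file format 2.1 supports health-check-gated depends_on, so each pump service can be marked healthy only once it answers requests, and tidb can wait on that condition. This is only a sketch, not part of the original setup; it assumes the pingcap/tidb-binlog image ships busybox wget and that Pump serves its HTTP status route on its listen port (8250):

  pump0:
    # ... existing pump0 definition ...
    healthcheck:
      # Assumes wget exists in the image and Pump answers HTTP on /status.
      test: ["CMD", "wget", "-q", "-O", "-", "http://127.0.0.1:8250/status"]
      interval: 5s
      timeout: 3s
      retries: 10

  tidb:
    # ... existing tidb definition ...
    depends_on:
      pump0:
        condition: service_healthy
      pump1:
        condition: service_healthy
      pump2:
        condition: service_healthy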

@thuyenptr
Author

  • I tried starting the pumps before starting TiDB, but it did not work.
  • After restarting TiDB (about 2 minutes later), I got this error:
[2019/09/20 03:51:58.170 +00:00] [ERROR] [binloginfo.go:115] ["write binlog failed"] [error="write binlog failed, the last error rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection closed"]
[2019/09/20 03:51:58.173 +00:00] [ERROR] [ddl_worker.go:141] ["[ddl] handle DDL job failed"] [worker="worker 1, tp general"] [error="[global:3]critical error write binlog failed, the last error rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection closed"]
  • Here is my docker-compose file:
---
# Source: tidb-docker-compose/templates/docker-compose.yml
# dashboard-installer has been deleted, because we don't need it any more in docker-compose model
version: '2.1'

services:
  pd0:
    image: pingcap/pd:latest
    ports:
      - "2379:2379"
    volumes:
      - ./config/pd.toml:/pd.toml:ro
      - ./data:/data
      - ./logs:/logs
    command:
      - --name=pd0
      - --client-urls=http://0.0.0.0:2379
      - --peer-urls=http://0.0.0.0:2380
      - --advertise-client-urls=http://pd0:2379
      - --advertise-peer-urls=http://pd0:2380
      - --initial-cluster=pd0=http://pd0:2380,pd1=http://pd1:2380,pd2=http://pd2:2380
      - --data-dir=/data/pd0
      - --config=/pd.toml
      - --log-file=/logs/pd0.log
    # sysctls:
    #   net.core.somaxconn: 32768
    # ulimits:
    #   nofile:
    #     soft: 1000000
    #     hard: 1000000
    restart: on-failure
  
  pd1:
    image: pingcap/pd:latest
    ports:
      - "3379:2379"
    volumes:
      - ./config/pd.toml:/pd.toml:ro
      - ./data:/data
      - ./logs:/logs
    command:
      - --name=pd1
      - --client-urls=http://0.0.0.0:2379
      - --peer-urls=http://0.0.0.0:2380
      - --advertise-client-urls=http://pd1:2379
      - --advertise-peer-urls=http://pd1:2380
      - --initial-cluster=pd0=http://pd0:2380,pd1=http://pd1:2380,pd2=http://pd2:2380
      - --data-dir=/data/pd1
      - --config=/pd.toml
      - --log-file=/logs/pd1.log
    # sysctls:
    #   net.core.somaxconn: 32768
    # ulimits:
    #   nofile:
    #     soft: 1000000
    #     hard: 1000000
    restart: on-failure
  
  pd2:
    image: pingcap/pd:latest
    ports:
      - "4379:2379"
    volumes:
      - ./config/pd.toml:/pd.toml:ro
      - ./data:/data
      - ./logs:/logs
    command:
      - --name=pd2
      - --client-urls=http://0.0.0.0:2379
      - --peer-urls=http://0.0.0.0:2380
      - --advertise-client-urls=http://pd2:2379
      - --advertise-peer-urls=http://pd2:2380
      - --initial-cluster=pd0=http://pd0:2380,pd1=http://pd1:2380,pd2=http://pd2:2380
      - --data-dir=/data/pd2
      - --config=/pd.toml
      - --log-file=/logs/pd2.log
    # sysctls:
    #   net.core.somaxconn: 32768
    # ulimits:
    #   nofile:
    #     soft: 1000000
    #     hard: 1000000
    restart: on-failure
  
  tikv0:
    image: pingcap/tikv:latest
    volumes:
      - ./config/tikv.toml:/tikv.toml:ro
      - ./data:/data
      - ./logs:/logs
    command:
      - --addr=0.0.0.0:20160
      - --advertise-addr=tikv0:20160
      - --data-dir=/data/tikv0
      - --pd=pd0:2379,pd1:2379,pd2:2379
      - --config=/tikv.toml
      - --log-file=/logs/tikv0.log
    depends_on:
      - "pd0"
      - "pd1"
      - "pd2"
    # sysctls:
    #   net.core.somaxconn: 32768
    # ulimits:
    #   nofile:
    #     soft: 1000000
    #     hard: 1000000
    restart: on-failure
  
  tikv1:
    image: pingcap/tikv:latest
    volumes:
      - ./config/tikv.toml:/tikv.toml:ro
      - ./data:/data
      - ./logs:/logs
    command:
      - --addr=0.0.0.0:20160
      - --advertise-addr=tikv1:20160
      - --data-dir=/data/tikv1
      - --pd=pd0:2379,pd1:2379,pd2:2379
      - --config=/tikv.toml
      - --log-file=/logs/tikv1.log
    depends_on:
      - "pd0"
      - "pd1"
      - "pd2"
    # sysctls:
    #   net.core.somaxconn: 32768
    # ulimits:
    #   nofile:
    #     soft: 1000000
    #     hard: 1000000
    restart: on-failure
  
  tikv2:
    image: pingcap/tikv:latest
    volumes:
      - ./config/tikv.toml:/tikv.toml:ro
      - ./data:/data
      - ./logs:/logs
    command:
      - --addr=0.0.0.0:20160
      - --advertise-addr=tikv2:20160
      - --data-dir=/data/tikv2
      - --pd=pd0:2379,pd1:2379,pd2:2379
      - --config=/tikv.toml
      - --log-file=/logs/tikv2.log
    depends_on:
      - "pd0"
      - "pd1"
      - "pd2"
    # sysctls:
    #   net.core.somaxconn: 32768
    # ulimits:
    #   nofile:
    #     soft: 1000000
    #     hard: 1000000
    restart: on-failure

  tidb:
    image: pingcap/tidb:latest
    ports:
      - "4000:4000"
      - "10080:10080"
    volumes:
      - ./config/tidb.toml:/tidb.toml:ro
      - ./logs:/logs
    command:
      - --store=tikv
      - --path=pd0:2379,pd1:2379,pd2:2379
      - --config=/tidb.toml
      - --log-file=/logs/tidb.log
      - --enable-binlog=true
    depends_on:
      - "tikv0"
      - "tikv1"
      - "tikv2"
      - "pump0"
      - "pump1"
      - "pump2"
    # sysctls:
    #   net.core.somaxconn: 32768
    # ulimits:
    #   nofile:
    #     soft: 1000000
    #     hard: 1000000
    restart: on-failure
  
  pump0:
    image: pingcap/tidb-binlog:latest
    volumes:
      - ./config/pump.toml:/pump.toml:ro
      - ./data:/data
      - ./logs:/logs
    command: 
      - sh
      - -c 
      - |
        /pump --addr=0.0.0.0:8250 --advertise-addr=pump0:8250 --data-dir=/data/pump0 --log-file=/logs/pump0.log --node-id=pump0 --metrics-addr=pushgateway:9091 --pd-urls=http://pd0:2379,http://pd1:2379,http://pd2:2379 --config=/pump.toml;
    depends_on:
      - "pd0"
      - "pd1"
      - "pd2"
    restart: on-failure
  
  pump1:
    image: pingcap/tidb-binlog:latest
    volumes:
      - ./config/pump.toml:/pump.toml:ro
      - ./data:/data
      - ./logs:/logs
    command:
      - sh 
      - -c 
      - |
        /pump --addr=0.0.0.0:8250 --advertise-addr=pump1:8250 --data-dir=/data/pump1 --log-file=/logs/pump1.log --node-id=pump1 --metrics-addr=pushgateway:9091 --pd-urls=http://pd0:2379,http://pd1:2379,http://pd2:2379 --config=/pump.toml
    depends_on:
      - "pd0"
      - "pd1"
      - "pd2"
    restart: on-failure
  
  pump2:
    image: pingcap/tidb-binlog:latest
    volumes:
      - ./config/pump.toml:/pump.toml:ro
      - ./data:/data
      - ./logs:/logs
    command:
      - sh 
      - -c 
      - |
        /pump --addr=0.0.0.0:8250 --advertise-addr=pump2:8250 --data-dir=/data/pump2 --log-file=/logs/pump2.log --node-id=pump2 --metrics-addr=pushgateway:9091 --pd-urls=http://pd0:2379,http://pd1:2379,http://pd2:2379 --config=/pump.toml
    depends_on:
      - "pd0"
      - "pd1"
      - "pd2"
    restart: on-failure

  drainer:
    image: pingcap/tidb-binlog:latest
    volumes:
      - ./config/drainer.toml:/drainer.toml:ro
      - ./data:/data
      - ./logs:/logs
    command:
      - /drainer
      - --addr=0.0.0.0:8249
      - --data-dir=/data/data.drainer
      - --log-file=/logs/drainer.log
      - --pd-urls=http://pd0:2379,http://pd1:2379,http://pd2:2379
      - --config=/drainer.toml
      - --initial-commit-ts=0
      - --metrics-addr=pushgateway:9091
      - --dest-db-type=mysql
    depends_on:
      - "pd0"
      - "pd1"
      - "pd2"
      - "some-mysql"
      # - "kafka0"
      # - "kafka1"
      # - "kafka2"
    restart: on-failure

  some-mysql:
    container_name: some-mysql
    image: mysql:latest
    ports:
      - 3306:3306
    environment:
      MYSQL_ROOT_PASSWORD: rootpw
    restart: on-failure

  tidb-vision:
    image: pingcap/tidb-vision:latest
    environment:
      PD_ENDPOINT: pd0:2379
    ports:
      - "8010:8010"
    restart: on-failure
  pushgateway:
    image: prom/pushgateway:v0.3.1
    command:
      - --log.level=error
    restart: on-failure

  prometheus:
    user: root
    image: prom/prometheus:v2.2.1
    command:
      - --log.level=error
      - --storage.tsdb.path=/data/prometheus
      - --config.file=/etc/prometheus/prometheus.yml
    ports:
      - "9090:9090"
    volumes:
      - ./config/prometheus.yml:/etc/prometheus/prometheus.yml:ro
      - ./config/pd.rules.yml:/etc/prometheus/pd.rules.yml:ro
      - ./config/tikv.rules.yml:/etc/prometheus/tikv.rules.yml:ro
      - ./config/tidb.rules.yml:/etc/prometheus/tidb.rules.yml:ro
      - ./data:/data
    restart: on-failure
  grafana:
    image: grafana/grafana:5.3.0
    user: "0"
    environment:
      GF_LOG_LEVEL: error
      GF_PATHS_PROVISIONING: /etc/grafana/provisioning
      GF_PATHS_CONFIG: /etc/grafana/grafana.ini
    ports:
      - "3000:3000"
    volumes:
      - ./config/grafana:/etc/grafana
      - ./config/dashboards:/tmp/dashboards
      - ./data/grafana:/var/lib/grafana
    restart: on-failure

@thuyenptr
Author

Today, I tried deploying with binaries instead of docker-compose, but I got the same error.
I followed this blog exactly, step by step: https://pingcap.com/blog/tidb-binlog-tutorial/
Here is the TiDB error log:

[2019/09/23 09:35:33.401 +00:00] [INFO] [printer.go:40] ["Welcome to TiDB."] ["Release Version"=v4.0.0-alpha-316-g582076b5c-dirty] ["Git Commit Hash"=582076b5cc9e0743bdd4aa95fbfd2ca1cc0e86e0] ["Git Branch"=master] ["UTC Build Time"="2019-09-23 07:10:54"] [GoVersion="go version go1.12 linux/amd64"] ["Race Enabled"=false] ["Check Table Before Drop"=false] ["TiKV Min Version"=v3.0.0-60965b006877ca7234adaced7890d7b029ed1306]
[2019/09/23 09:35:33.401 +00:00] [INFO] [printer.go:53] ["loaded config"] [config="{\"host\":\"0.0.0.0\",\"advertise-address\":\"\",\"port\":4000,\"cors\":\"\",\"store\":\"tikv\",\"path\":\"127.0.0.1:2379\",\"socket\":\"\",\"lease\":\"45s\",\"run-ddl\":true,\"split-table\":true,\"token-limit\":1000,\"oom-action\":\"log\",\"mem-quota-query\":34359738368,\"enable-streaming\":false,\"txn-local-latches\":{\"enabled\":false,\"capacity\":2048000},\"lower-case-table-names\":2,\"log\":{\"level\":\"info\",\"format\":\"text\",\"disable-timestamp\":false,\"disable-error-stack\":true,\"file\":{\"filename\":\"tidb.log\",\"log-rotate\":true,\"max-size\":300,\"max-days\":0,\"max-backups\":0},\"slow-query-file\":\"tidb-slow.log\",\"slow-threshold\":300,\"expensive-threshold\":10000,\"query-log-max-len\":2048},\"security\":{\"skip-grant-table\":false,\"ssl-ca\":\"\",\"ssl-cert\":\"\",\"ssl-key\":\"\",\"cluster-ssl-ca\":\"\",\"cluster-ssl-cert\":\"\",\"cluster-ssl-key\":\"\"},\"status\":{\"status-host\":\"0.0.0.0\",\"metrics-addr\":\"\",\"status-port\":10080,\"metrics-interval\":15,\"report-status\":true,\"record-db-qps\":false},\"performance\":{\"max-procs\":0,\"max-memory\":0,\"stats-lease\":\"3s\",\"stmt-count-limit\":5000,\"feedback-probability\":0.05,\"query-feedback-limit\":1024,\"pseudo-estimate-ratio\":0.8,\"force-priority\":\"NO_PRIORITY\",\"bind-info-lease\":\"3s\",\"txn-total-size-limit\":104857600,\"tcp-keep-alive\":true,\"cross-join\":true,\"run-auto-analyze\":true},\"prepared-plan-cache\":{\"enabled\":false,\"capacity\":100,\"memory-guard-ratio\":0.1},\"opentracing\":{\"enable\":false,\"rpc-metrics\":false,\"sampler\":{\"type\":\"const\",\"param\":1,\"sampling-server-url\":\"\",\"max-operations\":0,\"sampling-refresh-interval\":0},\"reporter\":{\"queue-size\":0,\"buffer-flush-interval\":0,\"log-spans\":false,\"local-agent-host-port\":\"\"}},\"proxy-protocol\":{\"networks\":\"\",\"header-timeout\":5},\"tikv-client\":{\"grpc-connection-count\":16,\"grpc-keepalive-time\":10,\"grpc-keepalive-timeout\":3,\"commit-timeout\":\"41s\",\"max-txn-time-use\":590,\"max-batch-size\":128,\"overload-threshold\":200,\"max-batch-wait-time\":0,\"batch-wait-size\":8},\"binlog\":{\"enable\":true,\"ignore-error\":false,\"write-timeout\":\"15s\",\"binlog-socket\":\"\",\"strategy\":\"range\"},\"compatible-kill-query\":false,\"plugin\":{\"dir\":\"\",\"load\":\"\"},\"pessimistic-txn\":{\"enable\":true,\"max-retry-count\":256,\"ttl\":\"40s\"},\"check-mb4-value-in-utf8\":true,\"treat-old-version-utf8-as-utf8mb4\":true,\"enable-table-lock\":false,\"delay-clean-table-lock\":0,\"split-region-max-num\":1000,\"stmt-summary\":{\"max-stmt-count\":100,\"max-sql-length\":4096}}"]
[2019/09/23 09:35:33.401 +00:00] [INFO] [client.go:148] ["[pd] create pd client with endpoints"] [pd-address="[127.0.0.1:2379]"]
[2019/09/23 09:35:33.404 +00:00] [INFO] [client.go:256] ["[pd] switch leader"] [new-leader=http://127.0.0.1:2379] [old-leader=]
[2019/09/23 09:35:33.404 +00:00] [INFO] [client.go:167] ["[pd] init cluster id"] [cluster-id=6739797247068160123]
[2019/09/23 09:35:33.406 +00:00] [INFO] [main.go:274] [tidb-server] ["create pumps client success, ignore binlog error"=false]
[2019/09/23 09:35:33.406 +00:00] [INFO] [main.go:283] ["disable Prometheus push client"]
[2019/09/23 09:35:33.406 +00:00] [INFO] [store.go:68] ["new store"] [path=tikv://127.0.0.1:2379]
[2019/09/23 09:35:33.406 +00:00] [INFO] [client.go:148] ["[pd] create pd client with endpoints"] [pd-address="[127.0.0.1:2379]"]
[2019/09/23 09:35:33.407 +00:00] [INFO] [systime_mon.go:25] ["start system time monitor"]
[2019/09/23 09:35:33.408 +00:00] [INFO] [client.go:256] ["[pd] switch leader"] [new-leader=http://127.0.0.1:2379] [old-leader=]
[2019/09/23 09:35:33.408 +00:00] [INFO] [client.go:167] ["[pd] init cluster id"] [cluster-id=6739797247068160123]
[2019/09/23 09:35:33.409 +00:00] [INFO] [store.go:74] ["new store with retry success"]
[2019/09/23 09:35:33.421 +00:00] [INFO] [tidb.go:70] ["new domain"] [store=tikv-6739797247068160123] ["ddl lease"=45s] ["stats lease"=3s]
[2019/09/23 09:35:33.426 +00:00] [INFO] [ddl.go:423] ["[ddl] start DDL"] [ID=431c1995-c6a8-416d-8d7d-7fa5c4253c92] [runWorker=true]
[2019/09/23 09:35:33.427 +00:00] [INFO] [ddl.go:411] ["[ddl] start delRangeManager OK"] ["is a emulator"=false]
[2019/09/23 09:35:33.427 +00:00] [INFO] [ddl_worker.go:119] ["[ddl] start DDL worker"] [worker="worker 1, tp general"]
[2019/09/23 09:35:33.427 +00:00] [INFO] [ddl_worker.go:119] ["[ddl] start DDL worker"] [worker="worker 2, tp add index"]
[2019/09/23 09:35:33.429 +00:00] [INFO] [manager.go:316] ["get owner"] ["owner info"="[ddl] /tidb/ddl/fg/owner ownerManager 431c1995-c6a8-416d-8d7d-7fa5c4253c92"] [ownerID=431c1995-c6a8-416d-8d7d-7fa5c4253c92]
[2019/09/23 09:35:33.431 +00:00] [INFO] [domain.go:141] ["full load InfoSchema success"] [usedSchemaVersion=0] [neededSchemaVersion=0] ["start time"=594.477µs]
[2019/09/23 09:35:33.432 +00:00] [INFO] [domain.go:350] ["full load and reset schema validator"]
[2019/09/23 09:35:33.432 +00:00] [INFO] [tidb.go:168] ["rollbackTxn for ddl/autocommit failed"]
[2019/09/23 09:35:33.432 +00:00] [WARN] [session.go:1020] ["run statement failed"] [schemaVersion=0] [error="[schema:1049]Unknown database 'mysql'"] [session="{\n  \"currDBName\": \"\",\n  \"id\": 0,\n  \"status\": 2,\n  \"strictMode\": true,\n  \"user\": null\n}"]
[2019/09/23 09:35:33.432 +00:00] [WARN] [session.go:1106] ["compile SQL failed"] [error="[schema:1146]Table 'mysql.tidb' doesn't exist"] [SQL="SELECT HIGH_PRIORITY VARIABLE_VALUE FROM mysql.tidb WHERE VARIABLE_NAME=\"bootstrapped\""]
[2019/09/23 09:35:33.432 +00:00] [INFO] [session.go:1949] ["CRUCIAL OPERATION"] [conn=0] [schemaVersion=0] [cur_db=] [sql="CREATE DATABASE IF NOT EXISTS test"] [user=]
[2019/09/23 09:35:33.445 +00:00] [INFO] [ddl.go:584] ["[ddl] start DDL job"] [job="ID:2, Type:create schema, State:none, SchemaState:none, SchemaID:1, TableID:0, RowCount:0, ArgLen:1, start time: 2019-09-23 09:35:33.426 +0000 UTC, Err:<nil>, ErrCount:0, SnapshotVersion:0"] [query="CREATE DATABASE IF NOT EXISTS test"]
[2019/09/23 09:35:33.446 +00:00] [INFO] [ddl_worker.go:488] ["[ddl] run DDL job"] [worker="worker 1, tp general"] [job="ID:2, Type:create schema, State:none, SchemaState:none, SchemaID:1, TableID:0, RowCount:0, ArgLen:0, start time: 2019-09-23 09:35:33.426 +0000 UTC, Err:<nil>, ErrCount:0, SnapshotVersion:0"]
[2019/09/23 09:35:43.450 +00:00] [WARN] [client.go:288] ["[pumps client] write binlog to pump failed"] [NodeID=instance-1:8250] ["binlog type"=Prewrite] ["start ts"=411364578669625356] ["commit ts"=0] [length=72] [error="rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection closed"]
[2019/09/23 09:35:53.551 +00:00] [WARN] [client.go:288] ["[pumps client] write binlog to pump failed"] [NodeID=instance-1:8250] ["binlog type"=Prewrite] ["start ts"=411364578669625356] ["commit ts"=0] [length=72] [error="rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection closed"]
[2019/09/23 09:35:55.940 +00:00] [INFO] [domain.go:141] ["full load InfoSchema success"] [usedSchemaVersion=0] [neededSchemaVersion=0] ["start time"=2.589996ms]
[2019/09/23 09:35:55.940 +00:00] [INFO] [domain.go:350] ["full load and reset schema validator"]
[2019/09/23 09:36:03.652 +00:00] [WARN] [client.go:288] ["[pumps client] write binlog to pump failed"] [NodeID=instance-1:8250] ["binlog type"=Prewrite] ["start ts"=411364578669625356] ["commit ts"=0] [length=72] [error="rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection closed"]
[2019/09/23 09:36:03.652 +00:00] [INFO] [client.go:321] ["[pumps client] write binlog to avaliable pumps all failed, will try unavaliable pumps"]
[2019/09/23 09:36:03.652 +00:00] [ERROR] [binloginfo.go:115] ["write binlog failed"] [error="write binlog failed, the last error rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection closed"]
[2019/09/23 09:36:03.652 +00:00] [ERROR] [ddl_worker.go:141] ["[ddl] handle DDL job failed"] [worker="worker 1, tp general"] [error="[global:3]critical error write binlog failed, the last error rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection closed"]
[2019/09/23 09:36:03.652 +00:00] [INFO] [client.go:321] ["[pumps client] write binlog to avaliable pumps all failed, will try unavaliable pumps"]
[2019/09/23 09:36:03.655 +00:00] [INFO] [2pc.go:897] ["2PC clean up done"] [txnStartTS=411364578669625356]
[2019/09/23 09:36:03.656 +00:00] [INFO] [ddl_worker.go:488] ["[ddl] run DDL job"] [worker="worker 1, tp general"] [job="ID:2, Type:create schema, State:none, SchemaState:none, SchemaID:1, TableID:0, RowCount:0, ArgLen:0, start time: 2019-09-23 09:35:33.426 +0000 UTC, Err:<nil>, ErrCount:0, SnapshotVersion:0"]
[2019/09/23 09:36:03.658 +00:00] [WARN] [client.go:288] ["[pumps client] write binlog to pump failed"] [NodeID=instance-1:8250] ["binlog type"=Prewrite] ["start ts"=411364586585587714] ["commit ts"=0] [length=72] [error="rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection closed"]
[2019/09/23 09:36:13.758 +00:00] [WARN] [client.go:288] ["[pumps client] write binlog to pump failed"] [NodeID=instance-1:8250] ["binlog type"=Prewrite] ["start ts"=411364586585587714] ["commit ts"=0] [length=72] [error="rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection closed"]
[2019/09/23 09:36:18.439 +00:00] [INFO] [domain.go:141] ["full load InfoSchema success"] [usedSchemaVersion=0] [neededSchemaVersion=0] ["start time"=2.396826ms]
[2019/09/23 09:36:18.439 +00:00] [INFO] [domain.go:350] ["full load and reset schema validator"]
[2019/09/23 09:36:23.859 +00:00] [WARN] [client.go:288] ["[pumps client] write binlog to pump failed"] [NodeID=instance-1:8250] ["binlog type"=Prewrite] ["start ts"=411364586585587714] ["commit ts"=0] [length=72] [error="rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection closed"]
[2019/09/23 09:36:23.860 +00:00] [INFO] [client.go:321] ["[pumps client] write binlog to avaliable pumps all failed, will try unavaliable pumps"]
[2019/09/23 09:36:23.860 +00:00] [ERROR] [binloginfo.go:115] ["write binlog failed"] [error="write binlog failed, the last error rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection closed"]
[2019/09/23 09:36:23.860 +00:00] [ERROR] [ddl_worker.go:141] ["[ddl] handle DDL job failed"] [worker="worker 1, tp general"] [error="[global:3]critical error write binlog failed, the last error rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection closed"]
[2019/09/23 09:36:23.860 +00:00] [INFO] [client.go:321] ["[pumps client] write binlog to avaliable pumps all failed, will try unavaliable pumps"]
[2019/09/23 09:36:23.862 +00:00] [INFO] [2pc.go:897] ["2PC clean up done"] [txnStartTS=411364586585587714]
[2019/09/23 09:36:23.863 +00:00] [INFO] [ddl_worker.go:488] ["[ddl] run DDL job"] [worker="worker 1, tp general"] [job="ID:2, Type:create schema, State:none, SchemaState:none, SchemaID:1, TableID:0, RowCount:0, ArgLen:0, start time: 2019-09-23 09:35:33.426 +0000 UTC, Err:<nil>, ErrCount:0, SnapshotVersion:0"]
[2019/09/23 09:36:23.864 +00:00] [WARN] [client.go:288] ["[pumps client] write binlog to pump failed"] [NodeID=instance-1:8250] ["binlog type"=Prewrite] ["start ts"=411364591880896514] ["commit ts"=0] [length=72] [error="rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection closed"]
[2019/09/23 09:36:33.965 +00:00] [WARN] [client.go:288] ["[pumps client] write binlog to pump failed"] [NodeID=instance-1:8250] ["binlog type"=Prewrite] ["start ts"=411364591880896514] ["commit ts"=0] [length=72] [error="rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection closed"]

Tags: @july2993

@july2993
Contributor

https://pingcap.com/blog/tidb-binlog-tutorial/

Can you check the Pump log?

@thuyenptr
Author

Of course. It seems to work normally; here is the log:

[2019/09/23 09:35:30.395 +00:00] [INFO] [version.go:50] ["Welcome to Pump"] ["Release Version"=v3.0.3-7-g43f34a6] ["Git Commit Hash"=43f34a64f9ada61b4b4f00114bf561d165bba311] ["Build TS"="2019-09-18 10:43:53"] ["Go Version"=go1.12] ["Go OS/Arch"=linux/amd64]
[2019/09/23 09:35:30.395 +00:00] [INFO] [client.go:148] ["[pd] create pd client with endpoints"] [pd-address="[http://127.0.0.1:2379]"]
[2019/09/23 09:35:31.410 +00:00] [ERROR] [client.go:204] ["[pd] failed to get cluster id"] [error="rpc error: code = Unknown desc = server not started"] [errorVerbose="rpc error: code = Unknown desc = server not started\ngithub.com/pingcap/pd/client.(*client).getMembers\n\t/home/jenkins/workspace/build_tidb_binlog_master/go/pkg/mod/github.com/pingcap/[email protected]/client/client.go:239\ngithub.com/pingcap/pd/client.(*client).initClusterID\n\t/home/jenkins/workspace/build_tidb_binlog_master/go/pkg/mod/github.com/pingcap/[email protected]/client/client.go:201\ngithub.com/pingcap/pd/client.(*client).initRetry\n\t/home/jenkins/workspace/build_tidb_binlog_master/go/pkg/mod/github.com/pingcap/[email protected]/client/client.go:188\ngithub.com/pingcap/pd/client.NewClient\n\t/home/jenkins/workspace/build_tidb_binlog_master/go/pkg/mod/github.com/pingcap/[email protected]/client/client.go:161\ngithub.com/pingcap/tidb-binlog/pkg/util.GetPdClient\n\t/home/jenkins/workspace/build_tidb_binlog_master/go/src/github.com/pingcap/tidb-binlog/pkg/util/util.go:249\ngithub.com/pingcap/tidb-binlog/pump.NewServer\n\t/home/jenkins/workspace/build_tidb_binlog_master/go/src/github.com/pingcap/tidb-binlog/pump/server.go:122\nmain.main\n\t/home/jenkins/workspace/build_tidb_binlog_master/go/src/github.com/pingcap/tidb-binlog/cmd/pump/main.go:49\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:200\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1337"]
[2019/09/23 09:35:32.412 +00:00] [INFO] [client.go:256] ["[pd] switch leader"] [new-leader=http://127.0.0.1:2379] [old-leader=]
[2019/09/23 09:35:32.412 +00:00] [INFO] [client.go:167] ["[pd] init cluster id"] [cluster-id=6739797247068160123]
[2019/09/23 09:35:32.412 +00:00] [INFO] [server.go:128] ["get clusterID success"] [clusterID=6739797247068160123]
[2019/09/23 09:35:32.412 +00:00] [INFO] [client.go:148] ["[pd] create pd client with endpoints"] [pd-address="[http://127.0.0.1:2379]"]
[2019/09/23 09:35:32.413 +00:00] [INFO] [client.go:256] ["[pd] switch leader"] [new-leader=http://127.0.0.1:2379] [old-leader=]
[2019/09/23 09:35:32.413 +00:00] [INFO] [client.go:167] ["[pd] init cluster id"] [cluster-id=6739797247068160123]
[2019/09/23 09:35:32.414 +00:00] [INFO] [store.go:68] ["new store"] [path="tikv://127.0.0.1:2379?disableGC=true"]
[2019/09/23 09:35:32.414 +00:00] [INFO] [client.go:148] ["[pd] create pd client with endpoints"] [pd-address="[127.0.0.1:2379]"]
[2019/09/23 09:35:32.415 +00:00] [INFO] [client.go:256] ["[pd] switch leader"] [new-leader=http://127.0.0.1:2379] [old-leader=]
[2019/09/23 09:35:32.415 +00:00] [INFO] [client.go:167] ["[pd] init cluster id"] [cluster-id=6739797247068160123]
[2019/09/23 09:35:32.430 +00:00] [INFO] [store.go:74] ["new store with retry success"]
[2019/09/23 09:35:32.430 +00:00] [INFO] [storage.go:135] [NewAppendWithResolver] [options="{\"ValueLogFileSize\":524288000,\"Sync\":true,\"KVChanCapacity\":1048576,\"SlowWriteThreshold\":1,\"StopWriteAtAvailableSpace\":10737418240,\"KVConfig\":null}"]
[2019/09/23 09:35:32.430 +00:00] [INFO] [storage.go:1333] ["open metadata db"] [config="{\"block-cache-capacity\":8388608,\"block-restart-interval\":16,\"block-size\":4096,\"compaction-L0-trigger\":8,\"compaction-table-size\":67108864,\"compaction-total-size\":536870912,\"compaction-total-size-multiplier\":8,\"write-buffer\":67108864,\"write-L0-pause-trigger\":24,\"write-L0-slowdown-trigger\":17}"]
[2019/09/23 09:35:32.459 +00:00] [INFO] [storage.go:218] ["Append info"] [gcTS=0] [maxCommitTS=0] [headPointer="{\"Fid\":0,\"Offset\":0}"] [handlePointer="{\"Fid\":0,\"Offset\":0}"]
[2019/09/23 09:35:32.488 +00:00] [INFO] [registry.go:95] ["node dosen't exist, will create one"] [id=instance-1:8250]
[2019/09/23 09:35:32.489 +00:00] [INFO] [server.go:404] ["register success"] [NodeID=instance-1:8250]
[2019/09/23 09:35:32.490 +00:00] [INFO] [server.go:420] ["start to server request"] [addr=http://127.0.0.1:8250]
[2019/09/23 09:35:42.466 +00:00] [INFO] [storage.go:385] [DBStats] [DBStats="{\"WriteDelayCount\":0,\"WriteDelayDuration\":0,\"WritePaused\":false,\"AliveSnapshots\":0,\"AliveIterators\":0,\"IOWrite\":461,\"IORead\":0,\"BlockCacheSize\":0,\"OpenedTablesCount\":0,\"LevelSizes\":null,\"LevelTablesCounts\":null,\"LevelRead\":null,\"LevelWrite\":null,\"LevelDurations\":null}"]
[2019/09/23 09:35:42.470 +00:00] [INFO] [server.go:532] ["server info tick"] [writeBinlogCount=0] [alivePullerCount=0] [MaxCommitTS=411364580779098113]
[2019/09/23 09:35:52.466 +00:00] [INFO] [storage.go:385] [DBStats] [DBStats="{\"WriteDelayCount\":0,\"WriteDelayDuration\":0,\"WritePaused\":false,\"AliveSnapshots\":0,\"AliveIterators\":0,\"IOWrite\":1038,\"IORead\":0,\"BlockCacheSize\":0,\"OpenedTablesCount\":0,\"LevelSizes\":null,\"LevelTablesCounts\":null,\"LevelRead\":null,\"LevelWrite\":null,\"LevelDurations\":null}"]
[2019/09/23 09:35:52.470 +00:00] [INFO] [server.go:532] ["server info tick"] [writeBinlogCount=0] [alivePullerCount=0] [MaxCommitTS=411364583138394113]
[2019/09/23 09:36:02.466 +00:00] [INFO] [storage.go:385] [DBStats] [DBStats="{\"WriteDelayCount\":0,\"WriteDelayDuration\":0,\"WritePaused\":false,\"AliveSnapshots\":0,\"AliveIterators\":0,\"IOWrite\":1530,\"IORead\":0,\"BlockCacheSize\":0,\"OpenedTablesCount\":0,\"LevelSizes\":null,\"LevelTablesCounts\":null,\"LevelRead\":null,\"LevelWrite\":null,\"LevelDurations\":null}"]
[2019/09/23 09:36:02.470 +00:00] [INFO] [server.go:532] ["server info tick"] [writeBinlogCount=0] [alivePullerCount=0] [MaxCommitTS=411364585497690113]
[2019/09/23 09:36:12.466 +00:00] [INFO] [storage.go:385] [DBStats] [DBStats="{\"WriteDelayCount\":0,\"WriteDelayDuration\":0,\"WritePaused\":false,\"AliveSnapshots\":0,\"AliveIterators\":0,\"IOWrite\":2101,\"IORead\":0,\"BlockCacheSize\":0,\"OpenedTablesCount\":0,\"LevelSizes\":null,\"LevelTablesCounts\":null,\"LevelRead\":null,\"LevelWrite\":null,\"LevelDurations\":null}"]
[2019/09/23 09:36:12.470 +00:00] [INFO] [server.go:532] ["server info tick"] [writeBinlogCount=0] [alivePullerCount=0] [MaxCommitTS=411364588643418113]
[2019/09/23 09:36:22.466 +00:00] [INFO] [storage.go:385] [DBStats] [DBStats="{\"WriteDelayCount\":0,\"WriteDelayDuration\":0,\"WritePaused\":false,\"AliveSnapshots\":0,\"AliveIterators\":0,\"IOWrite\":2678,\"IORead\":0,\"BlockCacheSize\":0,\"OpenedTablesCount\":0,\"LevelSizes\":null,\"LevelTablesCounts\":null,\"LevelRead\":null,\"LevelWrite\":null,\"LevelDurations\":null}"]
[2019/09/23 09:36:22.470 +00:00] [INFO] [server.go:532] ["server info tick"] [writeBinlogCount=0] [alivePullerCount=0] [MaxCommitTS=411364591002714113]
[2019/09/23 09:36:32.466 +00:00] [INFO] [storage.go:385] [DBStats] [DBStats="{\"WriteDelayCount\":0,\"WriteDelayDuration\":0,\"WritePaused\":false,\"AliveSnapshots\":0,\"AliveIterators\":0,\"IOWrite\":3170,\"IORead\":0,\"BlockCacheSize\":0,\"OpenedTablesCount\":0,\"LevelSizes\":null,\"LevelTablesCounts\":null,\"LevelRead\":null,\"LevelWrite\":null,\"LevelDurations\":null}"]
[2019/09/23 09:36:32.470 +00:00] [INFO] [server.go:532] ["server info tick"] [writeBinlogCount=0] [alivePullerCount=0] [MaxCommitTS=411364593362010113]
[2019/09/23 09:36:42.466 +00:00] [INFO] [storage.go:385] [DBStats] [DBStats="{\"WriteDelayCount\":0,\"WriteDelayDuration\":0,\"WritePaused\":false,\"AliveSnapshots\":0,\"AliveIterators\":0,\"IOWrite\":3741,\"IORead\":0,\"BlockCacheSize\":0,\"OpenedTablesCount\":0,\"LevelSizes\":null,\"LevelTablesCounts\":null,\"LevelRead\":null,\"LevelWrite\":null,\"LevelDurations\":null}"]
[2019/09/23 09:36:42.470 +00:00] [INFO] [server.go:532] ["server info tick"] [writeBinlogCount=0] [alivePullerCount=0] [MaxCommitTS=411364596507738113]
[2019/09/23 09:36:52.466 +00:00] [INFO] [storage.go:385] [DBStats] [DBStats="{\"WriteDelayCount\":0,\"WriteDelayDuration\":0,\"WritePaused\":false,\"AliveSnapshots\":0,\"AliveIterators\":0,\"IOWrite\":4318,\"IORead\":0,\"BlockCacheSize\":0,\"OpenedTablesCount\":0,\"LevelSizes\":null,\"LevelTablesCounts\":null,\"LevelRead\":null,\"LevelWrite\":null,\"LevelDurations\":null}"]
[2019/09/23 09:36:52.470 +00:00] [INFO] [server.go:532] ["server info tick"] [writeBinlogCount=0] [alivePullerCount=0] [MaxCommitTS=411364598867034113]

@july2993
Contributor

It looks like Pump never receives the request and TiDB can't connect to Pump. Can you check the advertise-addr of the Pump (with binlogctl -cmd pumps, or by running the SQL statement show pump status;) and make sure it is reachable from TiDB? For example, see the commands below.
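
A rough example of both checks, assuming binlogctl sits in the current directory and PD listens on 127.0.0.1:2379:

# Query the pump nodes registered in PD (prints NodeID, Addr, State, MaxCommitTS).
./binlogctl -pd-urls=http://127.0.0.1:2379 -cmd pumps

# Or ask TiDB itself from a MySQL client:
mysql -h 127.0.0.1 -P 4000 -u root -e "SHOW PUMP STATUS;"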

@thuyenptr
Author

thuyenptr commented Sep 23, 2019

The Pump status is still online:

[2019/09/23 11:32:28.781 +00:00] [INFO] [nodes.go:49] ["query node"] [type=pump] [node="{NodeID: instance-1:8250, Addr: 127.0.0.1:8250, State: online, MaxCommitTS: 411366417107779585, UpdateTime: 2019-09-23 11:32:28 +0000 UTC}"]

advertise-addr and addr are the same: 127.0.0.1:8250

When I start TiDB, it can't start because it can't connect to the Pump instance (connection closed). But when I start Drainer, it can connect to the Pump instance and call PullBinlog.
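
As a side check, one can verify from the TiDB host that Pump answers HTTP on its advertised address; this is only a sketch, assuming Pump serves its HTTP status route on the same port it uses for gRPC:

# Should return a status response if 127.0.0.1:8250 is reachable from this host.
curl http://127.0.0.1:8250/status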

@july2993
Contributor


Can you use a release version of TiDB? It seems the latest master build of TiDB is broken for some unknown reason. I used the same version of TiDB as you and it also fails to write binlog. I will investigate and fix it later; sorry for that.

@july2993
Contributor

@billhcmus
This is caused by pingcap/tidb#12055, which upgraded gRPC to 1.23 while Pump still uses 1.17; the client can't connect to the server because of a gRPC bug in the older version.

@thuyenptr
Author

Thank you. Let's upgrade Pump's and Drainer's gRPC version.

@july2993
Contributor


You should use release versions of TiDB/Pump/Drainer in a production environment, but it seems you are maintaining your own build?
Updating the gRPC version in tidb-binlog will look like this: #756
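
Roughly, the dependency bump amounts to something like the following with Go modules (this is only an illustration; see #756 for the actual change):

# Bump the gRPC module and tidy go.mod/go.sum.
GO111MODULE=on go get google.golang.org/grpc@v1.23.0
go mod tidy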

@thuyenptr
Author

Thank you. I used the stable version of TiDB/Pump/Drainer instead of the latest version, and it works normally. Closing this issue.
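
A sketch of fetching a released (stable) build instead of a master build; the URL pattern follows the linked tutorial and the exact version is an assumption (adjust to the current stable release):

# Download and unpack a released build.
wget https://download.pingcap.org/tidb-v3.0.3-linux-amd64.tar.gz
tar -xzf tidb-v3.0.3-linux-amd64.tar.gz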
