Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dial tcp: i/o timeout during backup create #292

Closed
Philippians413 opened this issue Oct 27, 2021 · 9 comments · Fixed by #310
Closed

dial tcp: i/o timeout during backup create #292

Philippians413 opened this issue Oct 27, 2021 · 9 comments · Fixed by #310
Assignees

Comments

@Philippians413
Copy link

Hello,

With version 1.2.0 I'm trying to create backup, but get next error:

clickhouse-backup@clickhouse-backup-manual-ntnwr:~$ clickhouse-backup create
2021/10/26 14:41:12  info done                      backup=2021-10-26T14-41-11 operation=create table=M.DATA1
2021/10/26 14:41:12  info done                      backup=2021-10-26T14-41-11 operation=create table=A..inner.MV_DATA2
2021/10/26 14:41:23 error can't freeze partition '06549045342506dff3142ba0f3d2d58d': dial tcp: i/o timeout backup=2021-10-26T14-41-11 operation=create table=A.DATA3
2021/10/26 14:41:23  info done                      backup=2021-10-26T14-41-11 duration=60ms location=local operation=delete
2021/10/26 14:41:23 error can't freeze partition '06549045342506dff3142ba0f3d2d58d': dial tcp: i/o timeout

Could you take a look and help us, please?
With version 0.6.0 we don't have this error.

Regards,
Bohdan

@Slach
Copy link
Collaborator

Slach commented Oct 27, 2021

@Philippians413 could share following command result?

LOG_LEVEL=debug CLICKHOUSE_TIMEOUT=10m clickhouse-backup create

@Philippians413
Copy link
Author

@Slach The result is the same

2021/10/28 11:10:07 debug create data               backup=2021-10-28T11-10-06 operation=create table=A.DATA3
......
2021/10/28 11:05:27 debug ALTER TABLE `A`.`DATA3` FREEZE PARTITION ID '508199a1a2b90620204e8d3188ef35f9' WITH NAME '06bf3d90ca1742f9bb967df5ca40bfbb';
2021/10/28 11:05:32 error can't freeze partition '508199a1a2b90620204e8d3188ef35f9': dial tcp: i/o timeout backup=2021-10-28T11-04-12 operation=create table=A.DATA3
2021/10/28 11:05:32 debug SELECT value FROM `system`.`build_options` where name='VERSION_INTEGER'
2021/10/28 11:05:32 debug SELECT * FROM system.disks;
2021/10/28 11:05:32 debug SELECT value FROM `system`.`build_options` where name='VERSION_INTEGER'
2021/10/28 11:05:32 debug SELECT * FROM system.disks;
2021/10/28 11:05:32 debug remove '2021-10-28T11-04-12' path=/var/lib/clickhouse/
2021/10/28 11:05:32  info done                      backup=2021-10-28T11-04-12 duration=109ms location=local operation=delete
2021/10/28 11:05:32 error can't freeze partition '508199a1a2b90620204e8d3188ef35f9': dial tcp: i/o timeout

@Slach
Copy link
Collaborator

Slach commented Oct 29, 2021

2021/10/28 11:05:27 debug ALTER TABLE A.DATA3 FREEZE PARTITION ID '508199a1a2b90620204e8d3188ef35f9' WITH NAME '06bf3d90ca1742f9bb967df5ca40bfbb';
2021/10/28 11:05:32 error can't freeze partition '508199a1a2b90620204e8d3188ef35f9': dial tcp: i/o timeout backup=2021-10-28T11-04-12 operation=create table=A.DATA3

Only 5 seconds between begin query and i/o timeout error message

Are you sure you set up CLICKHOUSE_TIMEOUT=10m?

Could you share your /etc/clickhouse-backup/config.yml without credentials?

Also, could you check https://github.com/Altinity/clickhouse-backup/releases/tag/v1.2.1

Let's try again and add clickhouse-go debug level

LOG_LEVEL=debug CLICKHOUSE_DEBUG=true CLICKHOUSE_TIMEOUT=10m clickhouse-backup create

@Philippians413
Copy link
Author

Philippians413 commented Oct 29, 2021

Are you sure you set up CLICKHOUSE_TIMEOUT=10m?

Yes, I copy pasted your command.

Could you share your /etc/clickhouse-backup/config.yml without credentials?

I use env vars for configuration, so config.yml has default state.

Also, could you check https://github.com/Altinity/clickhouse-backup/releases/tag/v1.2.1

Yes

# clickhouse-backup --version
Version:	 1.2.1
Git Commit:	 d45f00d70d7b508305f02c7f3f23f7e2d8c3ad3f
Build Date:	 2021-10-29

Let's try again and add clickhouse-go debug level

The result (IP address hidden):

2021/10/29 18:11:27 debug   partition 'ccb66296a383fa75a3b54ff603662748'
2021/10/29 18:11:27 debug ALTER TABLE `A`.`DATA3` FREEZE PARTITION ID 'ccb66296a383fa75a3b54ff603662748' WITH NAME 'a61797d48600457ab0ba5bc08cb8ab32';
[clickhouse]host(s)=chi-dev-replcluster-0-0-0.chi-dev-replcluster-0-0.development.svc.cluster.local:9000, database=system, username=username
[clickhouse][dial err] secure=false, skip_verify=false, strategy=random, ident=7093, addr=chi-dev-replcluster-0-0-0.chi-dev-replcluster-0-0.development.svc.cluster.local:9000
&net.OpError{Op:"dial", Net:"tcp", Source:net.Addr(nil), Addr:net.Addr(nil), Err:(*net.timeoutError)(0x1e1ce60)}
2021/10/29 18:11:32 error can't freeze partition 'ccb66296a383fa75a3b54ff603662748': dial tcp: i/o timeout backup=2021-10-29T18-08-09 operation=create table=A.DATA3
[clickhouse]host(s)=chi-dev-replcluster-0-0-0.chi-dev-replcluster-0-0.development.svc.cluster.local:9000, database=system, username=username
[clickhouse][dial] secure=false, skip_verify=false, strategy=random, ident=7094, server=0 -> x.x.x.x:9000
[clickhouse][connect=7094][hello] -> Golang SQLDriver 1.1.54213
[clickhouse][connect=7094][hello] <- ClickHouse 20.8.54438 (UTC)
[clickhouse][connect=7094]-> ping
[clickhouse][connect=7094][process] <- pong
2021/10/29 18:11:32 debug SELECT value FROM `system`.`build_options` where name='VERSION_INTEGER'
[clickhouse]host(s)=chi-dev-replcluster-0-0-0.chi-dev-replcluster-0-0.development.svc.cluster.local:9000, database=system, username=username
[clickhouse][dial] secure=false, skip_verify=false, strategy=random, ident=7095, server=0 -> x.x.x.x:9000
[clickhouse][connect=7095][hello] -> Golang SQLDriver 1.1.54213
[clickhouse][connect=7095][hello] <- ClickHouse 20.8.54438 (UTC)
[clickhouse][connect=7095][prepare] SELECT value FROM `system`.`build_options` where name='VERSION_INTEGER'
[clickhouse][connect=7095][send query] SELECT value FROM `system`.`build_options` where name='VERSION_INTEGER'
[clickhouse][connect=7095][query settings] log_queries=0&receive_timeout=600&send_timeout=600
[clickhouse][connect=7095][send external tables] count 0
[clickhouse][connect=7095][read meta] <- data: packet=1, columns=1, rows=0
[clickhouse][connect=7095][rows] <- data: packet=1, columns=1, rows=1, elapsed=45.299µs
[clickhouse][connect=7095][rows] <- profiling: rows=1, bytes=1536, blocks=1
[clickhouse][connect=7095][rows] <- progress: rows=47, bytes=2056, total rows=0
[clickhouse][connect=7095][rows] <- data: packet=1, columns=0, rows=0, elapsed=1.094µs
[clickhouse][connect=7095][rows] <- progress: rows=0, bytes=0, total rows=0
[clickhouse][connect=7095][rows] <- end of stream
[clickhouse][connect=7095][rows] close
[clickhouse][connect=7095][stmt] close
2021/10/29 18:11:32 debug SELECT * FROM system.disks;
[clickhouse]host(s)=chi-dev-replcluster-0-0-0.chi-dev-replcluster-0-0.development.svc.cluster.local:9000, database=system, username=username
[clickhouse][dial] secure=false, skip_verify=false, strategy=random, ident=7096, server=0 -> x.x.x.x:9000
[clickhouse][connect=7096][hello] -> Golang SQLDriver 1.1.54213
[clickhouse][connect=7096][hello] <- ClickHouse 20.8.54438 (UTC)
[clickhouse][connect=7096][prepare] SELECT * FROM system.disks;
[clickhouse][connect=7096][send query] SELECT * FROM system.disks;
[clickhouse][connect=7096][query settings] log_queries=0&receive_timeout=600&send_timeout=600
[clickhouse][connect=7096][send external tables] count 0
[clickhouse][connect=7096][read meta] <- data: packet=1, columns=6, rows=0
[clickhouse][connect=7096][rows] <- data: packet=1, columns=6, rows=1, elapsed=58.593µs
[clickhouse][connect=7096][rows] <- profiling: rows=1, bytes=24768, blocks=1
[clickhouse][connect=7096][rows] <- progress: rows=1, bytes=83, total rows=0
[clickhouse][connect=7096][rows] <- data: packet=1, columns=0, rows=0, elapsed=1.359µs
[clickhouse][connect=7096][rows] <- progress: rows=0, bytes=0, total rows=0
[clickhouse][connect=7096][rows] <- end of stream
[clickhouse][connect=7096][rows] close
[clickhouse][connect=7096][stmt] close
[clickhouse]host(s)=chi-dev-replcluster-0-0-0.chi-dev-replcluster-0-0.development.svc.cluster.local:9000, database=system, username=username
[clickhouse][dial] secure=false, skip_verify=false, strategy=random, ident=7097, server=0 -> x.x.x.x:9000
[clickhouse][connect=7097][hello] -> Golang SQLDriver 1.1.54213
[clickhouse][connect=7097][hello] <- ClickHouse 20.8.54438 (UTC)
[clickhouse][connect=7097]-> ping
[clickhouse][connect=7097][process] <- pong
2021/10/29 18:11:32 debug SELECT value FROM `system`.`build_options` where name='VERSION_INTEGER'
[clickhouse]host(s)=chi-dev-replcluster-0-0-0.chi-dev-replcluster-0-0.development.svc.cluster.local:9000, database=system, username=username
[clickhouse][dial] secure=false, skip_verify=false, strategy=random, ident=7098, server=0 -> x.x.x.x:9000
[clickhouse][connect=7098][hello] -> Golang SQLDriver 1.1.54213
[clickhouse][connect=7098][hello] <- ClickHouse 20.8.54438 (UTC)
[clickhouse][connect=7098][prepare] SELECT value FROM `system`.`build_options` where name='VERSION_INTEGER'
[clickhouse][connect=7098][send query] SELECT value FROM `system`.`build_options` where name='VERSION_INTEGER'
[clickhouse][connect=7098][query settings] log_queries=0&receive_timeout=600&send_timeout=600
[clickhouse][connect=7098][send external tables] count 0
[clickhouse][connect=7098][read meta] <- data: packet=1, columns=1, rows=0
[clickhouse][connect=7098][rows] <- data: packet=1, columns=1, rows=1, elapsed=81.325µs
[clickhouse][connect=7098][rows] <- profiling: rows=1, bytes=1536, blocks=1
[clickhouse][connect=7098][rows] <- progress: rows=47, bytes=2056, total rows=0
[clickhouse][connect=7098][rows] <- data: packet=1, columns=0, rows=0, elapsed=1.465µs
[clickhouse][connect=7098][rows] <- progress: rows=0, bytes=0, total rows=0
[clickhouse][connect=7098][rows] <- end of stream
[clickhouse][connect=7098][rows] close
[clickhouse][connect=7098][stmt] close
2021/10/29 18:11:32 debug SELECT * FROM system.disks;
[clickhouse]host(s)=chi-dev-replcluster-0-0-0.chi-dev-replcluster-0-0.development.svc.cluster.local:9000, database=system, username=username
[clickhouse][dial] secure=false, skip_verify=false, strategy=random, ident=7099, server=0 -> x.x.x.x:9000
[clickhouse][connect=7099][hello] -> Golang SQLDriver 1.1.54213
[clickhouse][connect=7099][hello] <- ClickHouse 20.8.54438 (UTC)
[clickhouse][connect=7099][prepare] SELECT * FROM system.disks;
[clickhouse][connect=7099][send query] SELECT * FROM system.disks;
[clickhouse][connect=7099][query settings] log_queries=0&receive_timeout=600&send_timeout=600
[clickhouse][connect=7099][send external tables] count 0
[clickhouse][connect=7099][read meta] <- data: packet=1, columns=6, rows=0
[clickhouse][connect=7099][rows] <- data: packet=1, columns=6, rows=1, elapsed=45.207µs
[clickhouse][connect=7099][rows] <- profiling: rows=1, bytes=24768, blocks=1
[clickhouse][connect=7099][rows] <- progress: rows=1, bytes=83, total rows=0
[clickhouse][connect=7099][rows] <- data: packet=1, columns=0, rows=0, elapsed=931ns
[clickhouse][connect=7099][rows] <- progress: rows=0, bytes=0, total rows=0
[clickhouse][connect=7099][rows] <- end of stream
[clickhouse][connect=7099][rows] close
[clickhouse][connect=7099][stmt] close
2021/10/29 18:11:32 debug remove '2021-10-29T18-08-09' path=/var/lib/clickhouse/
2021/10/29 18:11:32  info done                      backup=2021-10-29T18-08-09 duration=82ms location=local operation=delete
2021/10/29 18:11:32 error can't freeze partition 'ccb66296a383fa75a3b54ff603662748': dial tcp: i/o timeout

@Philippians413
Copy link
Author

Env vars with config:

BACKUPS_TO_KEEP_LOCAL=1
CLICKHOUSE_PASSWORD=password
CLICKHOUSE_TIMEOUT=5m
CLICKHOUSE_HOST=chi-dev-replcluster-0-0-0.chi-dev-replcluster-0-0.development.svc.cluster.local
CLICKHOUSE_FREEZE_BY_PART=true
CLICKHOUSE_PORT=9000
CLICKHOUSE_DATA_PATH=/var/lib/clickhouse
CLICKHOUSE_USERNAME=username
CLICKHOUSE_SKIP_TABLES=system.*
SFTP_USERNAME=user
SFTP_PATH=backup/clickhouse/development
SFTP_PASSWORD=password
SFTP_ADDRESS=x.x.x.x:22

@Slach
Copy link
Collaborator

Slach commented Oct 30, 2021

Could you check the same backup with following parameter CLICKHOUSE_FREEZE_BY_PART=false?

@Philippians413
Copy link
Author

Philippians413 commented Nov 1, 2021

With CLICKHOUSE_FREEZE_BY_PART=false backup created successfully

# CLICKHOUSE_TIMEOUT=10m clickhouse-backup create
...
2021/11/01 11:51:51  info done                      backup=2021-11-01T11-50-43 duration=1m7.033s operation=create

Could you describe what was the problem, please?
How should we deal with it: set CLICKHOUSE_FREEZE_BY_PART=false or waiting to the new release with your commit?

@Slach
Copy link
Collaborator

Slach commented Nov 1, 2021

CLICKHOUSE_FREEZE_BY_PART=false is good solution as quick workaround.

Next release will contain fix which will respect CLICKHOUSE_TIMEOUT setting not only for read / write to socket and also for connection.

@Slach Slach self-assigned this Nov 1, 2021
@Philippians413
Copy link
Author

Thank you! We will be waiting for the next release

@Slach Slach mentioned this issue Nov 26, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants