tools, toc: reorganize TiDB Binlog documents #1078

Merged
merged 63 commits on May 5, 2019
Changes from 45 commits
Commits
63 commits
23905d9
reorg the binlog documents
ericsyh Apr 24, 2019
3ade0f9
update the document title
ericsyh Apr 24, 2019
ce7377b
update TOC
ericsyh Apr 24, 2019
0b2a7c5
Merge remote-tracking branch 'docs/master' into reorganize-TiDB-Binlo…
ericsyh Apr 25, 2019
c182d0c
Merge remote-tracking branch 'docs/master' into reorganize-TiDB-Binlo…
ericsyh Apr 28, 2019
fda0cf3
update the biglog-slave-client document
ericsyh Apr 28, 2019
52abd1f
update the deploy document
ericsyh Apr 28, 2019
9c436bd
update the operation document
ericsyh Apr 28, 2019
ddbf259
udpate the kafka document
ericsyh Apr 28, 2019
12e32ac
update the local version document
ericsyh Apr 28, 2019
20709f1
update the deploy
ericsyh Apr 28, 2019
7edb11e
update several documents
ericsyh Apr 28, 2019
808da97
update the TOC
ericsyh Apr 28, 2019
5c21945
update some changes
ericsyh Apr 29, 2019
e7030b6
Update tools/binlog/binlog-slave-client.md
IANTHEREAL Apr 29, 2019
f78a5a7
Update tools/binlog/binlog-slave-client.md
IANTHEREAL Apr 29, 2019
29c5638
Update tools/binlog/deploy.md
IANTHEREAL Apr 29, 2019
54ac579
Update tools/binlog/deploy.md
IANTHEREAL Apr 29, 2019
59b511d
Update tools/binlog/tidb-binlog-local.md
lilin90 Apr 29, 2019
6765193
Update tools/binlog/upgrade.md
lilin90 Apr 29, 2019
bfc1b78
Update TOC.md
lilin90 Apr 29, 2019
ef535a6
Update TOC.md
lilin90 Apr 29, 2019
f83d89f
Update tools/binlog/binlog-slave-client.md
lilin90 Apr 29, 2019
885e0ab
Update tools/binlog/deploy.md
lilin90 Apr 29, 2019
8867d2a
Update tools/binlog/deploy.md
lilin90 Apr 29, 2019
4f4b13f
Update tools/binlog/operation.md
lilin90 Apr 29, 2019
858ca57
Update tools/binlog/tidb-binlog-kafka.md
lilin90 Apr 29, 2019
01aa21a
Merge remote-tracking branch 'docs/master' into reorganize-TiDB-Binlo…
ericsyh Apr 30, 2019
d6e0147
Merge remote-tracking branch 'origin/reorganize-TiDB-Binlog-documents…
ericsyh Apr 30, 2019
fd774f5
change the note format in deployment
ericsyh Apr 30, 2019
bf9232f
keep heading consistent with the title in metadata
ericsyh Apr 30, 2019
88d5060
tools/binlog: update note format
lilin90 Apr 30, 2019
35f6fcf
tools/binlog: fix inline code format and typo
lilin90 Apr 30, 2019
7f53e3e
Update tools/binlog/tidb-binlog-kafka.md
lilin90 Apr 30, 2019
94bdce7
Update tools/binlog/tidb-binlog-kafka.md
lilin90 Apr 30, 2019
d721b22
Update tools/binlog/tidb-binlog-local.md
lilin90 Apr 30, 2019
222be88
Update tools/binlog/deploy.md
lilin90 Apr 30, 2019
bb6d47e
Update tools/binlog/deploy.md
lilin90 Apr 30, 2019
73fe454
Update tools/binlog/deploy.md
lilin90 Apr 30, 2019
74acad5
Update tools/binlog/overview.md
lilin90 Apr 30, 2019
e5d9fc7
Update tools/binlog/overview.md
lilin90 Apr 30, 2019
08b354d
Update tools/binlog/overview.md
lilin90 Apr 30, 2019
ced48f2
Update tools/binlog/tidb-binlog-local.md
lilin90 Apr 30, 2019
73ebbef
Update tools/binlog/overview.md
lilin90 Apr 30, 2019
30dc164
rm previous documents of tidb-binlog
ericsyh Apr 30, 2019
75e4e1f
Update tools/binlog/binlog-slave-client.md
lilin90 May 1, 2019
2d12037
Update tools/binlog/tidb-binlog-local.md
lilin90 May 1, 2019
cf86272
Update tools/binlog/upgrade.md
lilin90 May 1, 2019
3a58c32
Update tools/binlog/upgrade.md
lilin90 May 1, 2019
45c9714
Update tools/binlog/tidb-binlog-local.md
lilin90 May 1, 2019
251863b
Update tools/binlog/monitor.md
lilin90 May 1, 2019
e523e0b
Update tools/binlog/overview.md
lilin90 May 1, 2019
f49e843
Update tools/binlog/monitor.md
lilin90 May 1, 2019
115a25b
Update tools/binlog/deploy.md
lilin90 May 1, 2019
25387fb
Update tools/binlog/deploy.md
lilin90 May 1, 2019
a9a9704
Update tools/binlog/operation.md
lilin90 May 1, 2019
2c64995
Update tools/binlog/operation.md
lilin90 May 1, 2019
c92c183
Update tools/binlog/overview.md
lilin90 May 1, 2019
234ce4d
update the upgrade process
ericsyh May 2, 2019
1450a2b
update the upgrade process format
ericsyh May 2, 2019
a1bba97
update table format in documents
ericsyh May 2, 2019
c7cdaed
Update tools/binlog/deploy.md
lilin90 May 5, 2019
709b125
tools/binlog: update format
lilin90 May 5, 2019
7 changes: 6 additions & 1 deletion TOC.md
@@ -197,7 +197,12 @@
- [Table Filter](tools/lightning/filter.md)
- [CSV Support](tools/lightning/csv.md)
- [Monitor](tools/lightning/monitor.md)
- [TiDB-Binlog](tools/tidb-binlog-cluster.md)
+ TiDB-Binlog
- [Overview](tools/binlog/overview.md)
- [Deploy](tools/binlog/deploy.md)
- [Monitor](tools/binlog/monitor.md)
- [Maintain](tools/binlog/operation.md)
- [Upgrade](tools/binlog/upgrade.md)
- [PD Control](tools/pd-control.md)
- [PD Recover](tools/pd-recover.md)
- [TiKV Control](https://github.com/tikv/tikv/blob/master/docs/tools/tikv-control.md)
@@ -1,12 +1,12 @@
---
title: Binlog Slave Client User Guide
summary: Use Binglog Slave Client to parse the binlog data and output the data in a specific format to Kafka.
summary: Use Binlog Slave Client to consume TiDB slave binlog data from Kafka and output the data in a specific format.
category: tools
---

# Binlog Slave Client User Guide

Binlog Slave Client is used to parse the binlog data and output the data in a specific format to Kafka. Currently, Drainer supports outputting data in multiple formats including MySQL, TiDB, TheFlash, and pb. But sometimes users have customized requirements for outputting data to other formats, for example, Elasticsearch and Hive, so this feature is introduced. After data is output to Kafka, the user writes code to read data from Kafka and then processes the data.
Binlog Slave Client is used to consume TiDB slave binlog data from Kafka and output the data in a specific format. Currently, Drainer supports multiple kinds of downstream destinations, including MySQL, TiDB, file, and Kafka. But sometimes users have customized requirements for outputting data to other destinations, for example, Elasticsearch and Hive, so this feature is introduced.

## Configure Drainer

@@ -53,7 +53,7 @@ message ColumnInfo {
// https://dev.mysql.com/doc/refman/8.0/en/data-types.html
// for the `numeric` type: int bigint smallint tinyint float double decimal bit
// for the `string` type: text longtext mediumtext char tinytext varchar
// blob longblog mediumblog binary tinyblob varbinary
// blob longblob mediumblob binary tinyblob varbinary
// enum set
// for the `json` type: json
optional string mysql_type = 2 [ (gogoproto.nullable) = false ];
@@ -87,9 +87,9 @@ message TableMutation {
optional Row change_row = 3;
}

// `DMLData` stores all the mutations caused by DML in a table.
// `DMLData` stores all the mutations caused by DML in a transaction.
message DMLData {
// `tables` contains all the table changes.
// `tables` contains all the table changes in the transaction.
repeated Table tables = 1;
}

@@ -113,7 +113,6 @@
message Binlog {
optional BinlogType type = 1 [ (gogoproto.nullable) = false ];
optional int64 commit_ts = 2 [ (gogoproto.nullable) = false ];
// `dml_data` is marshalled from the DML type.
optional DMLData dml_data = 3;
optional DDLData ddl_data = 4;
}
@@ -134,6 +133,7 @@ You need to configure the following information when using Driver:
* `CommitTS`: from which `commit ts` to start reading the binlog
* `Offset`: from which Kafka `offset` to start reading data. If `CommitTS` is set, you needn't configure this parameter
* `ClusterID`: the cluster ID of the TiDB cluster
* `Topic`: the Kafka topic name. If `Topic` is empty, the default name `<ClusterID>_obinlog` in Drainer is used

You can use Driver by importing the Driver package into your own code and referring to the example code provided by Driver to learn how to use Driver and parse the binlog data.

306 changes: 45 additions & 261 deletions tools/tidb-binlog-cluster.md → tools/binlog/deploy.md

Large diffs are not rendered by default.

8 changes: 4 additions & 4 deletions tools/tidb-binlog-monitor.md → tools/binlog/monitor.md
@@ -1,12 +1,12 @@
---
title: TiDB-Binlog Monitoring Metrics and Alert Rules
summary: Learn about three levels of monitoring metrics and alert rules of TiDB-Binlog.
title: TiDB-Binlog monitoring
summary: Learn how to monitor the cluster version of TiDB-Binlog.
category: tools
---

# TiDB-Binlog Monitoring Metrics and Alert Rules
# TiDB-Binlog monitoring

This document describes TiDB-Binlog monitoring metrics in Grafana and explains the alert rules.
After you have successfully deployed TiDB-Binlog using Ansible, you can go to the Grafana web interface (default address: <http://grafana_ip:3000>, default account: admin, password: admin) to check the state of Pump and Drainer.

## Monitoring metrics

128 changes: 128 additions & 0 deletions tools/binlog/operation.md
@@ -0,0 +1,128 @@
---
title: TiDB-Binlog Cluster operations
summary: Learn how to operate the cluster version of TiDB-Binlog.
category: tools
---

# TiDB-Binlog Cluster operations

## Pump/Drainer state

Pump/Drainer state description:

* `online`: running normally.
* `pausing`: in the pausing process. It turns into this state after you use `kill` or press Ctrl + C to exit from the process. After Pump/Drainer safely exits all internal threads, it becomes `paused`.
* `paused`: has been stopped. While Pump is in this state, it rejects binlog write requests and no longer provides binlog for Drainer. When Drainer is in this state, it does not synchronize data to the downstream. After Pump and Drainer exit normally from all the threads, they switch the state to `paused` and then exit from the process.
* `closing`: in the offline process. `binlogctl` is used to take Pump/Drainer offline and Pump/Drainer is in this state before the process exits. In this state, Pump does not accept new binlog write requests and waits for all its binlog data to be consumed by Drainer.
* `offline`: has become offline. After Pump sends all the binlog data that it saves to Drainer, its state is switched to `offline`.

> **Note:**
>
> * When Pump/Drainer is `pausing` or `paused`, the data synchronization is interrupted.
> * Pump and Drainer have several states, including `online`, `paused`, and `offline`. If you press Ctrl + C or kill the process, both Pump and Drainer become `pausing` and finally turn to `paused`. Pump does not need to send all of its binlog data to Drainer before it becomes `paused`, but it does need to send all of its binlog data to Drainer before it becomes `offline`. If you need to exit from Pump for a long period of time (or permanently remove Pump from the cluster), use `binlogctl` to make Pump offline. The same goes for Drainer.
> * When Pump is `closing`, you need to guarantee that all the data has been consumed by all the Drainers that are not `offline`. So before taking Pump offline, you need to guarantee that all the Drainers are `online`. Otherwise, Pump cannot go offline normally.
> * The binlog data that Pump saves is processed by GC only when it has been consumed by all the Drainers that are not `offline`.
> * Close Drainer only when it will not be used any more.

For how to pause, close, check, and modify the state of Drainer, see the [binlogctl guide](#binlogctl-guide) as follows.

## `binlogctl` guide

[`binlogctl`](https://github.com/pingcap/tidb-tools/tree/master/tidb-binlog/binlogctl) is an operations tool for TiDB-Binlog with the following features:

* Obtaining the current `tso` of TiDB cluster
* Checking the Pump/Drainer state
* Modifying the Pump/Drainer state
* Pausing or closing Pump/Drainer

### Usage scenarios of `binlogctl`

* You run Drainer for the first time and need to obtain the current `tso` of the TiDB cluster.
* When Pump/Drainer exits abnormally, its state is not updated and the service is affected. You can use this tool to modify the state.
* An error occurs during synchronization and you need to check the running status and the Pump/Drainer state.
* While maintaining the cluster, you need to pause or close Pump/Drainer.

### Download `binlogctl`

Your distribution of TiDB or TiDB-Binlog may already include `binlogctl`. If not, download `binlogctl`:

```bash
wget https://download.pingcap.org/binlogctl-new-linux-amd64.{tar.gz,sha256}

# Check the file integrity. It should return OK.
sha256sum -c binlogctl-new-linux-amd64.sha256
```
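
As a hedged follow-up (the archive layout and the binary path below are assumptions, not stated in this document), unpack the archive and confirm the binary runs:

```bash
# Hedged sketch: the directory layout inside the tarball is an assumption.
tar -xzf binlogctl-new-linux-amd64.tar.gz

# -V prints the version information (see the parameter list below).
./binlogctl -V
```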

### `binlogctl` usage description

Command line parameters:

```
Usage of binlogctl:
-V
Outputs the binlogctl version information
-cmd string
the command mode, including "generate_meta", "pumps", "drainers", "update-pump" ,"update-drainer", "pause-pump", "pause-drainer", "offline-pump", and "offline-drainer"
-data-dir string
the file path where the checkpoint file of Drainer is stored ("binlog_position" by default)
-node-id string
ID of Pump/Drainer
-pd-urls string
the address of PD. If multiple addresses exist, use "," to separate each ("http://127.0.0.1:2379" by default)
-ssl-ca string
the file path of SSL CAs
-ssl-cert string
the file path of the X509 certificate file in the PEM format
-ssl-key string
the file path of X509 key file of the PEM format
-time-zone string
If a time zone is set, the corresponding time of the obtained `tso` is printed in the "generate_meta" mode. For example, "Asia/Shanghai" is the CST time zone and "Local" is the local time zone
```

Command examples (a few additional hedged command sketches follow this list):

- Check the state of all the Pumps or Drainers:

Set `cmd` as `pumps` or `drainers` to check the state of all the Pumps or Drainers. For example,

```bash
bin/binlogctl -pd-urls=http://127.0.0.1:2379 -cmd pumps

INFO[0000] pump: {NodeID: ip-172-16-30-67:8250, Addr: 172.16.30.192:8250, State: online, MaxCommitTS: 405197570529820673, UpdateTime: 2018-12-25 14:23:37 +0800 CST}
```

- Modify the Pump/Drainer state:

Set `cmd` as `update-pump` or `update-drainer` to modify the state of Pump or Drainer, which can be `online`, `pausing`, `paused`, `closing` or `offline`.

```bash
bin/binlogctl -pd-urls=http://127.0.0.1:2379 -cmd update-pump -node-id ip-127-0-0-1:8250 -state paused
```

This command modifies the Pump/Drainer state saved in PD.

- Pause or close Pump/Drainer:

- Set `cmd` as `pause-pump` or `pause-drainer` to pause Pump or Drainer.

- Set `cmd` as `offline-pump` or `offline-drainer` to close Pump or Drainer.

For example,

```bash
bin/binlogctl -pd-urls=http://127.0.0.1:2379 -cmd pause-pump -node-id ip-127-0-0-1:8250
```

`binlogctl` sends an HTTP request to Pump/Drainer; after receiving the command, Pump/Drainer exits from the process and sets its state to `paused`/`offline`.

- Generate the meta file that Drainer needs to start:

```bash
bin/binlogctl -pd-urls=http://127.0.0.1:2379 -cmd generate_meta

INFO[0000] [pd] create pd client with endpoints [http://192.168.199.118:32379]
INFO[0000] [pd] leader switches to: http://192.168.199.118:32379, previous:
INFO[0000] [pd] init cluster id 6569368151110378289
2018/06/21 11:24:47 meta.go:117: [info] meta: &{CommitTS:400962745252184065}
```

This command generates a `{data-dir}/savepoint` file. This file stores the `tso` information which is needed for the initial start of Drainer.
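
For completeness, the following is a hedged sketch of the remaining command modes listed in the parameter description above (`drainers`, `update-drainer`, `offline-pump`, and `offline-drainer`). The PD address and node IDs are placeholders that follow the same conventions as the examples above:

```bash
# Hedged sketches only; the PD address and node IDs are placeholders.

# Check the state of all the Drainers (analogous to the `pumps` example above).
bin/binlogctl -pd-urls=http://127.0.0.1:2379 -cmd drainers

# Modify the Drainer state saved in PD (analogous to `update-pump`).
bin/binlogctl -pd-urls=http://127.0.0.1:2379 -cmd update-drainer -node-id ip-127-0-0-1:8249 -state paused

# Take Pump or Drainer offline so that it stops accepting requests and exits.
bin/binlogctl -pd-urls=http://127.0.0.1:2379 -cmd offline-pump -node-id ip-127-0-0-1:8250
bin/binlogctl -pd-urls=http://127.0.0.1:2379 -cmd offline-drainer -node-id ip-127-0-0-1:8249
```
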
66 changes: 66 additions & 0 deletions tools/binlog/overview.md
@@ -0,0 +1,66 @@
---
title: TiDB-Binlog Cluster User Guide
summary: Learn an overview of the cluster version of TiDB-Binlog.
category: tools
aliases: ['docs/tools/tidb-binlog-cluster/']
---
# TiDB-Binlog Cluster User Guide

This document introduces the architecture and the deployment of the cluster version of TiDB-Binlog.

TiDB-Binlog is a tool used to collect binlog data from TiDB and provide real-time backup and synchronization to downstream platforms.

TiDB-Binlog has the following features:

* **Data synchronization:** synchronize the data in the TiDB cluster to other databases
* **Real-time backup and restoration:** back up the data in the TiDB cluster and restore the TiDB cluster when the cluster fails

## TiDB-Binlog architecture

The TiDB-Binlog architecture is as follows:

![TiDB-Binlog architecture](/media/tidb_binlog_cluster_architecture.png)

The TiDB-Binlog cluster is composed of Pump and Drainer.

### Pump

Pump is used to record the binlogs generated in TiDB, sort the binlogs based on the commit time of the transaction, and send binlogs to Drainer for consumption.

### Drainer

Drainer collects and merges binlogs from each Pump, converts the binlog to SQL or data of a specific format, and synchronizes the data to a specific downstream platform.

### `binlogctl` guide

[`binlogctl`](https://github.com/pingcap/tidb-tools/tree/master/tidb-binlog/binlogctl) is an operations tool for TiDB-Binlog with the following features:

* Obtaining the current `tso` of TiDB cluster
* Checking the Pump/Drainer state
* Modifying the Pump/Drainer state
* Pausing or closing Pump/Drainer

## Main features

* Multiple Pumps form a cluster which can scale out horizontally
* TiDB uses the built-in Pump Client to send the binlog to each Pump
* Pump stores binlogs and sends the binlogs to Drainer in order
* Drainer reads binlogs of each Pump, merges and sorts the binlogs, and sends the binlogs downstream

## Hardware requirements

Pump and Drainer can be deployed and run on common 64-bit hardware server platforms with the Intel x86-64 architecture.

The server hardware requirements for development, testing, and the production environment are as follows:

| Service | Number of Servers | CPU | Disk | Memory |
| -------- | -------- | --------| --------------- | ------ |
| Pump | 3 | 8 cores+ | SSD, 200 GB+ | 16 GB |
| Drainer | 1 | 8 cores+ | SAS, 100 GB+ (If you need to output a local file, use SSD and increase the disk size) | 16 GB |

## Notes

* You need to use TiDB v2.0.8-binlog, v2.1.0-rc.5 or a later version. Older versions of TiDB cluster are not compatible with the cluster version of TiDB-Binlog.
* Drainer supports synchronizing binlogs to MySQL, TiDB, Kafka, or local files. If you need to synchronize binlogs to destinations that Drainer does not support, you can set Drainer to synchronize the binlog to Kafka and then read the data from Kafka for customized processing according to the binlog slave protocol. See [Binlog Slave Client User Guide](/tools/binlog/binlog-slave-client.md).
* To use TiDB-Binlog for recovering incremental data, set the config `db-type` to `file` (local files in the proto buffer format). Drainer converts the binlog to data in the specified [proto buffer format](https://github.com/pingcap/tidb-binlog/blob/master/proto/binlog.proto) and writes the data to local files. In this way, you can use [Reparo](reparo.md) to recover data incrementally. A hedged configuration sketch follows this list.
* If the downstream is MySQL, MariaDB, or another TiDB cluster, you can use [sync-diff-inspector](/tools/sync-diff-inspector.md) to verify the data after data synchronization.
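
As a hedged illustration of the `db-type` note above (the `[syncer]` section name, the config file name, and the `bin/drainer` path are assumptions based on the cluster-version Drainer configuration, not taken from this PR), switching Drainer to local-file output could look like this:

```bash
# Hedged sketch: only the fragment relevant to db-type is shown; a real Drainer
# configuration needs more settings (PD addresses, data directory, and so on).
cat > drainer-file-output.toml <<'EOF'
[syncer]
# "file" makes Drainer write binlogs as local files in the proto buffer format,
# which can later be replayed with Reparo.
db-type = "file"
EOF

bin/drainer -config drainer-file-output.toml
```
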
26 changes: 13 additions & 13 deletions tools/reparo.md → tools/binlog/reparo.md
@@ -8,7 +8,7 @@ category: tools

Reparo is a TiDB-Binlog tool, used to recover the incremental data. To back up the incremental data, you can use Drainer of TiDB-Binlog to output the binlog data in the protobuf format to files. To restore the incremental data, you can use Reparo to parse the binlog data in the files and apply the binlog in TiDB/MySQL.

Download Reparo via [reparo-latest-linux-amd64.tar.gz](https://download.pingcap.org/reparo-latest-linux-amd64.tar.gz)
Download Reparo via [tidb-binlog-cluster-latest-linux-amd64.tar.gz](http://download.pingcap.org/tidb-binlog-cluster-latest-linux-amd64.tar.gz)

## Reparo usage

@@ -108,15 +108,15 @@ password = ""
./bin/reparo -config reparo.toml
```

### Note

* `data-dir` specifies the directory for the binlog file that Drainer outputs.
* Both `start-datatime` and `start-tso` are used to specify the time point for starting recovery, but they are different in the time format. If they are not set, the recovery process starts from the earliest binlog file by default.
* Both `stop-datetime` and `stop-tso` are used to specify the time point for finishing recovery, but they are different in the time format. If they are not set, the recovery process ends up with the last binlog file by default.
* `dest-type` specifies the destination type. Its value can be "mysql" and "print."

* When it is set tomysql, the data can be recovered to MySQL or TiDB that uses or is compatible with the MySQL protocol. In this case, you need to specify the database information in `[dest-db]` of the configuration information.
* When it is set to print, only the binlog information is printed. It is generally used for debugging and checking the binlog information. In this case, there is no need to specify `[dest-db]`.

* `replicate-do-db` specifies the database for recovery. If it is not set, all the databases are to be recovered.
* `replicate-do-table` specifies the table fo recovery. If it is not set, all the tables are to be recovered.
> **Note:**
>
> * `data-dir` specifies the directory for the binlog file that Drainer outputs.
> * Both `start-datetime` and `start-tso` are used to specify the time point for starting recovery, but they differ in the time format. If they are not set, the recovery process starts from the earliest binlog file by default.
> * Both `stop-datetime` and `stop-tso` are used to specify the time point for finishing recovery, but they differ in the time format. If they are not set, the recovery process ends with the last binlog file by default.
> * `dest-type` specifies the destination type. Its value can be `mysql` or `print`.
>
> * When it is set to `mysql`, the data can be recovered to MySQL or TiDB that uses or is compatible with the MySQL protocol. In this case, you need to specify the database information in `[dest-db]` of the configuration information.
> * When it is set to `print`, only the binlog information is printed. It is generally used for debugging and checking the binlog information. In this case, there is no need to specify `[dest-db]`.
>
> * `replicate-do-db` specifies the database for recovery. If it is not set, all the databases are to be recovered.
> * `replicate-do-table` specifies the table for recovery. If it is not set, all the tables are to be recovered.
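
The following is a hedged sketch of a minimal `reparo.toml` that pulls together the parameters described in the note above. The `[dest-db]` field names and all values are placeholders or assumptions, not taken from this PR:

```bash
# Hedged sketch of a minimal Reparo configuration; values are placeholders.
cat > reparo.toml <<'EOF'
# Directory of the binlog files that Drainer output
data-dir = "./data.drainer"

# Optional recovery window; if omitted, all binlog files are replayed
# start-datetime = "2019-05-01 00:00:00"
# stop-datetime = "2019-05-02 00:00:00"

# "mysql" replays into MySQL/TiDB; "print" only prints the binlog for debugging
dest-type = "mysql"

[dest-db]
host = "127.0.0.1"
port = 3306
user = "root"
password = ""
EOF

./bin/reparo -config reparo.toml
```
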
12 changes: 6 additions & 6 deletions tools/tidb-binlog-kafka.md → tools/binlog/tidb-binlog-kafka.md
@@ -21,7 +21,7 @@ TiDB-Binlog supports the following scenarios:

The TiDB-Binlog architecture is as follows:

![TiDB-Binlog architecture](../media/tidb_binlog_kafka_architecture.png)
![TiDB-Binlog architecture](/media/tidb_binlog_kafka_architecture.png)

The TiDB-Binlog cluster mainly consists of three components:

@@ -39,7 +39,7 @@ The Kafka cluster stores the binlog data written by Pump and provides the binlog

> **Note:**
>
> In the local version of TiDB-Binlog, the binlog is stored in files, while in the latest version, the binlog is stored using Kafka.
> In the local version of TiDB-Binlog, the binlog is stored in files, while in the Kafka version, the binlog is stored using Kafka.

## Install TiDB-Binlog

@@ -74,7 +74,7 @@ cd tidb-binlog-kafka-linux-amd64

We set the startup parameter `binlog-socket` as the specified unix socket file path of the corresponding parameter `socket` in Pump. The final deployment architecture is as follows:

![TiDB Pump deployment architecture](../media/tidb_pump_deployment.jpeg)
![TiDB Pump deployment architecture](/media/tidb_pump_deployment.jpeg)

- Drainer does not support renaming DDL on the table of the ignored schemas (schemas in the filter list).

@@ -109,9 +109,9 @@ db-type = "kafka"
db-type = "kafka"

# when db-type is kafka, you can uncomment this to config the down stream kafka, or it will be the same kafka addrs where drainer pulls binlog from.
# [syncer.to]
# kafka-addrs = "127.0.0.1:9092"
# kafka-version = "0.8.2.0"
[syncer.to]
kafka-addrs = "127.0.0.1:9092"
kafka-version = "0.8.2.0"
```

The data output to Kafka follows the binlog format, which is sorted by ts and defined by protobuf. See [driver](https://github.com/pingcap/tidb-tools/tree/master/tidb-binlog/driver) for how to access the data and replicate it to the downstream.