
tools, toc: reorganize TiDB Binlog documents #1078

Merged
merged 63 commits into from
May 5, 2019
Changes from 31 commits
Commits
63 commits
23905d9
reorg the binlog documents
ericsyh Apr 24, 2019
3ade0f9
update the document title
ericsyh Apr 24, 2019
ce7377b
update TOC
ericsyh Apr 24, 2019
0b2a7c5
Merge remote-tracking branch 'docs/master' into reorganize-TiDB-Binlo…
ericsyh Apr 25, 2019
c182d0c
Merge remote-tracking branch 'docs/master' into reorganize-TiDB-Binlo…
ericsyh Apr 28, 2019
fda0cf3
update the binlog-slave-client document
ericsyh Apr 28, 2019
52abd1f
update the deploy document
ericsyh Apr 28, 2019
9c436bd
update the operation document
ericsyh Apr 28, 2019
ddbf259
update the kafka document
ericsyh Apr 28, 2019
12e32ac
update the local version document
ericsyh Apr 28, 2019
20709f1
update the deploy
ericsyh Apr 28, 2019
7edb11e
update several documents
ericsyh Apr 28, 2019
808da97
update the TOC
ericsyh Apr 28, 2019
5c21945
update some changes
ericsyh Apr 29, 2019
e7030b6
Update tools/binlog/binlog-slave-client.md
IANTHEREAL Apr 29, 2019
f78a5a7
Update tools/binlog/binlog-slave-client.md
IANTHEREAL Apr 29, 2019
29c5638
Update tools/binlog/deploy.md
IANTHEREAL Apr 29, 2019
54ac579
Update tools/binlog/deploy.md
IANTHEREAL Apr 29, 2019
59b511d
Update tools/binlog/tidb-binlog-local.md
lilin90 Apr 29, 2019
6765193
Update tools/binlog/upgrade.md
lilin90 Apr 29, 2019
bfc1b78
Update TOC.md
lilin90 Apr 29, 2019
ef535a6
Update TOC.md
lilin90 Apr 29, 2019
f83d89f
Update tools/binlog/binlog-slave-client.md
lilin90 Apr 29, 2019
885e0ab
Update tools/binlog/deploy.md
lilin90 Apr 29, 2019
8867d2a
Update tools/binlog/deploy.md
lilin90 Apr 29, 2019
4f4b13f
Update tools/binlog/operation.md
lilin90 Apr 29, 2019
858ca57
Update tools/binlog/tidb-binlog-kafka.md
lilin90 Apr 29, 2019
01aa21a
Merge remote-tracking branch 'docs/master' into reorganize-TiDB-Binlo…
ericsyh Apr 30, 2019
d6e0147
Merge remote-tracking branch 'origin/reorganize-TiDB-Binlog-documents…
ericsyh Apr 30, 2019
fd774f5
change the note format in deployment
ericsyh Apr 30, 2019
bf9232f
keep heading consistent with the title in metadata
ericsyh Apr 30, 2019
88d5060
tools/binlog: update note format
lilin90 Apr 30, 2019
35f6fcf
tools/binlog: fix inline code format and typo
lilin90 Apr 30, 2019
7f53e3e
Update tools/binlog/tidb-binlog-kafka.md
lilin90 Apr 30, 2019
94bdce7
Update tools/binlog/tidb-binlog-kafka.md
lilin90 Apr 30, 2019
d721b22
Update tools/binlog/tidb-binlog-local.md
lilin90 Apr 30, 2019
222be88
Update tools/binlog/deploy.md
lilin90 Apr 30, 2019
bb6d47e
Update tools/binlog/deploy.md
lilin90 Apr 30, 2019
73fe454
Update tools/binlog/deploy.md
lilin90 Apr 30, 2019
74acad5
Update tools/binlog/overview.md
lilin90 Apr 30, 2019
e5d9fc7
Update tools/binlog/overview.md
lilin90 Apr 30, 2019
08b354d
Update tools/binlog/overview.md
lilin90 Apr 30, 2019
ced48f2
Update tools/binlog/tidb-binlog-local.md
lilin90 Apr 30, 2019
73ebbef
Update tools/binlog/overview.md
lilin90 Apr 30, 2019
30dc164
rm previous documents of tidb-binlog
ericsyh Apr 30, 2019
75e4e1f
Update tools/binlog/binlog-slave-client.md
lilin90 May 1, 2019
2d12037
Update tools/binlog/tidb-binlog-local.md
lilin90 May 1, 2019
cf86272
Update tools/binlog/upgrade.md
lilin90 May 1, 2019
3a58c32
Update tools/binlog/upgrade.md
lilin90 May 1, 2019
45c9714
Update tools/binlog/tidb-binlog-local.md
lilin90 May 1, 2019
251863b
Update tools/binlog/monitor.md
lilin90 May 1, 2019
e523e0b
Update tools/binlog/overview.md
lilin90 May 1, 2019
f49e843
Update tools/binlog/monitor.md
lilin90 May 1, 2019
115a25b
Update tools/binlog/deploy.md
lilin90 May 1, 2019
25387fb
Update tools/binlog/deploy.md
lilin90 May 1, 2019
a9a9704
Update tools/binlog/operation.md
lilin90 May 1, 2019
2c64995
Update tools/binlog/operation.md
lilin90 May 1, 2019
c92c183
Update tools/binlog/overview.md
lilin90 May 1, 2019
234ce4d
update the upgrade process
ericsyh May 2, 2019
1450a2b
update the upgrade process format
ericsyh May 2, 2019
a1bba97
update table format in documents
ericsyh May 2, 2019
c7cdaed
Update tools/binlog/deploy.md
lilin90 May 5, 2019
709b125
tools/binlog: update format
lilin90 May 5, 2019
7 changes: 6 additions & 1 deletion TOC.md
@@ -197,7 +197,12 @@
- [Table Filter](tools/lightning/filter.md)
- [CSV Support](tools/lightning/csv.md)
- [Monitor](tools/lightning/monitor.md)
- [TiDB-Binlog](tools/tidb-binlog-cluster.md)
+ TiDB-Binlog
- [Overview](tools/binlog/overview.md)
- [Deploy](tools/binlog/deploy.md)
- [Monitor](tools/binlog/monitor.md)
- [Maintain](tools/binlog/operation.md)
- [Upgrade](tools/binlog/upgrade.md)
- [PD Control](tools/pd-control.md)
- [PD Recover](tools/pd-recover.md)
- [TiKV Control](https://github.com/tikv/tikv/blob/master/docs/tools/tikv-control.md)
148 changes: 148 additions & 0 deletions tools/binlog/binlog-slave-client.md
@@ -0,0 +1,148 @@
---
title: Binlog Slave Client User Guide
summary: Use Binlog Slave Client to consume TiDB slave binlog data from Kafka and output the data in a specific format.
category: tools
---

# Binlog Slave Client User Guide

Binlog Slave Client is used to consume TiDB slave binlog data from Kafka and output the data in a specific format. Currently, Drainer supports multiple kinds of downstream targets, including MySQL, TiDB, file, and Kafka. However, users sometimes need to output data to other formats, for example, Elasticsearch and Hive, so this feature is introduced.

## Configure Drainer

Modify the configuration file of Drainer and set it to output the data to Kafka:

```toml
[syncer]
db-type = "kafka"

[syncer.to]
# the Kafka address
kafka-addrs = "127.0.0.1:9092"
# the Kafka version
kafka-version = "0.8.2.0"
```

## Customized development

### Data format

First, you need to obtain the format of the data that Drainer outputs to Kafka:

```protobuf
// `Column` stores the column data in the corresponding variable based on the data type.
message Column {
// Indicates whether the data is null
optional bool is_null = 1 [ default = false ];
// Stores `int` data
optional int64 int64_value = 2;
// Stores `uint`, `enum`, and `set` data
optional uint64 uint64_value = 3;
// Stores `float` and `double` data
optional double double_value = 4;
// Stores `bit`, `blob`, `binary` and `json` data
optional bytes bytes_value = 5;
// Stores `date`, `time`, `decimal`, `text`, `char` data
optional string string_value = 6;
}

// `ColumnInfo` stores the column information, including the column name, type, and whether it is the primary key.
message ColumnInfo {
optional string name = 1 [ (gogoproto.nullable) = false ];
// the lower case column field type in MySQL
// https://dev.mysql.com/doc/refman/8.0/en/data-types.html
// for the `numeric` type: int bigint smallint tinyint float double decimal bit
// for the `string` type: text longtext mediumtext char tinytext varchar
// blob longblob mediumblob binary tinyblob varbinary
// enum set
// for the `json` type: json
optional string mysql_type = 2 [ (gogoproto.nullable) = false ];
optional bool is_primary_key = 3 [ (gogoproto.nullable) = false ];
}

// `Row` stores the actual data of a row.
message Row { repeated Column columns = 1; }

// `MutationType` indicates the DML type.
enum MutationType {
Insert = 0;
Update = 1;
Delete = 2;
}

// `Table` contains mutations in a table.
message Table {
optional string schema_name = 1;
optional string table_name = 2;
repeated ColumnInfo column_info = 3;
repeated TableMutation mutations = 4;
}

// `TableMutation` stores mutations of a row.
message TableMutation {
required MutationType type = 1;
// data after modification
required Row row = 2;
// data before modification. It only takes effect when the mutation type is `Update`.
optional Row change_row = 3;
}

// `DMLData` stores all the mutations caused by DML in a transaction.
message DMLData {
// `tables` contains all the table changes in the transaction.
repeated Table tables = 1;
}

// `DDLData` stores the DDL information.
message DDLData {
// the database used currently
optional string schema_name = 1;
// the related table
optional string table_name = 2;
// `ddl_query` is the original DDL statement query.
optional bytes ddl_query = 3;
}

// `BinlogType` indicates the binlog type, including DML and DDL.
enum BinlogType {
DML = 0; // Has `dml_data`
DDL = 1; // Has `ddl_query`
}

// `Binlog` stores all the changes in a transaction. Kafka stores the serialized result of the structure data.
message Binlog {
optional BinlogType type = 1 [ (gogoproto.nullable) = false ];
optional int64 commit_ts = 2 [ (gogoproto.nullable) = false ];
optional DMLData dml_data = 3;
optional DDLData ddl_data = 4;
}
```

For the definition of the data format, see [`binlog.proto`](https://github.com/pingcap/tidb-tools/blob/master/tidb-binlog/slave_binlog_proto/proto/binlog.proto).
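As a minimal sketch (not the generated protobuf code), the structures above can be mirrored in plain Go to show how a consumer might turn an `Insert` mutation into an SQL statement; all type and function names below are illustrative stand-ins:

```go
package main

import (
	"fmt"
	"strings"
)

// Minimal stand-ins for the generated protobuf types (illustrative only).
type Column struct {
	IsNull      bool
	Int64Value  *int64
	StringValue *string
}

type ColumnInfo struct {
	Name string
}

type Row struct {
	Columns []Column
}

// columnLiteral renders one column value as an SQL literal.
func columnLiteral(c Column) string {
	switch {
	case c.IsNull:
		return "NULL"
	case c.Int64Value != nil:
		return fmt.Sprintf("%d", *c.Int64Value)
	case c.StringValue != nil:
		return fmt.Sprintf("'%s'", strings.ReplaceAll(*c.StringValue, "'", "''"))
	default:
		return "NULL"
	}
}

// insertSQL builds an INSERT statement for one Insert mutation,
// pairing each column value with its ColumnInfo name.
func insertSQL(schema, table string, info []ColumnInfo, row Row) string {
	names := make([]string, len(info))
	vals := make([]string, len(row.Columns))
	for i, ci := range info {
		names[i] = ci.Name
	}
	for i, c := range row.Columns {
		vals[i] = columnLiteral(c)
	}
	return fmt.Sprintf("INSERT INTO `%s`.`%s` (%s) VALUES (%s)",
		schema, table, strings.Join(names, ", "), strings.Join(vals, ", "))
}

func main() {
	id := int64(1)
	name := "tidb"
	sql := insertSQL("test", "t",
		[]ColumnInfo{{Name: "id"}, {Name: "name"}},
		Row{Columns: []Column{{Int64Value: &id}, {StringValue: &name}}})
	fmt.Println(sql) // INSERT INTO `test`.`t` (id, name) VALUES (1, 'tidb')
}
```

A real consumer would dispatch on `BinlogType` and `MutationType` first, and handle `Update` and `Delete` mutations similarly.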

### Driver

The [TiDB-Tools](https://github.com/pingcap/tidb-tools/) project provides [Driver](https://github.com/pingcap/tidb-tools/tree/master/tidb-binlog/driver), which is used to read the binlog data in Kafka. It has the following features:

* Read the Kafka data.
* Locate the binlog stored in Kafka based on `commit ts`.
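Because binlogs are written to Kafka in commit order, the `commit ts` lookup can be pictured as a binary search over messages sorted by `commit_ts`. This is only an illustrative sketch, not the Driver's actual implementation:

```go
package main

import (
	"fmt"
	"sort"
)

// seekOffset returns the index of the first message whose commit_ts is
// greater than or equal to target. commitTS must be sorted ascending,
// which holds because binlogs are produced in commit order.
func seekOffset(commitTS []int64, target int64) int {
	return sort.Search(len(commitTS), func(i int) bool {
		return commitTS[i] >= target
	})
}

func main() {
	ts := []int64{100, 200, 300, 400}
	fmt.Println(seekOffset(ts, 250)) // first message with commit_ts >= 250
}
```

Reading then starts from the returned position, so no binlog with a commit timestamp at or after the requested `commit ts` is skipped.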

You need to configure the following information when using Driver:

* `KafkaAddr`: the address of the Kafka cluster
* `CommitTS`: the `commit ts` from which to start reading the binlog
* `Offset`: the Kafka `offset` from which to start reading data. If `CommitTS` is set, you do not need to configure this parameter
* `ClusterID`: the cluster ID of the TiDB cluster
* `Topic`: the Kafka topic name. If `Topic` is empty, the default name `<ClusterID>_obinlog` in Drainer is used

You can use Driver by importing the Driver package in your code, and refer to the example code provided with Driver to learn how to use it and how to parse the binlog data.

Currently, two examples are provided:

* Using Driver to synchronize data to MySQL. This example shows how to convert a binlog to SQL.
* Using Driver to print data.

> **Note:**
>
> - The example code only shows how to use Driver. If you want to use Driver in the production environment, you need to optimize the code.
> - Currently, only the Golang version of Driver and example code are available. If you want to use other languages, you need to generate the code file in the corresponding language based on the binlog proto file and develop an application to read the binlog data in Kafka, parse the data, and output the data to the downstream. You are also welcome to optimize the example code and submit the example code of other languages to [TiDB-Tools](https://github.com/pingcap/tidb-tools).