From de39cfc2543482789ae92a1f7fd2445530d88c02 Mon Sep 17 00:00:00 2001 From: nhsmw Date: Mon, 2 Dec 2024 12:26:34 +0800 Subject: [PATCH 01/17] wip --- ticdc/ticdc-debezium.md | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/ticdc/ticdc-debezium.md b/ticdc/ticdc-debezium.md index 7092f04ba0d71..24ba253f73e5b 100644 --- a/ticdc/ticdc-debezium.md +++ b/ticdc/ticdc-debezium.md @@ -5,7 +5,7 @@ summary: Learn the concept of the TiCDC Debezium Protocol and how to use it. # TiCDC Debezium Protocol -[Debezium](https://debezium.io/) is a tool for capturing database changes. It converts each captured database change into a message called an "event" and sends these events to Kafka. Starting from v8.0.0, TiCDC supports sending TiDB changes to Kafka using a Debezium style output format, simplifying migration from MySQL databases for users who had previously been using Debezium's MySQL integration. +[Debezium](https://debezium.io/) is a tool for capturing database changes. It converts each captured database change into a message called an "event" and sends these events to Kafka. Starting from v8.0.0, TiCDC supports sending TiDB changes to Kafka using a Debezium style output format, simplifying migration from MySQL databases for users who had previously been using Debezium's MySQL integration. Starting from v8.6, TiCDC supports DDL events and watermark events. ## Use the Debezium message format @@ -27,11 +27,13 @@ In addition, the original Debezium format does not include important fields such This section describes the format definition of the DML event output in the Debezium format. +### DDL event + ### DML event TiCDC encodes a DML event into a Kafka message, with both the key and value encoded in the Debezium format. -### Key format +#### Key format ```json { @@ -164,6 +166,9 @@ The key fields of the preceding JSON data are explained as follows: | schema.optional| Boolean | Indicates whether the field is optional. 
When it is `true`, the field is optional. | | schema.type | String | The data type of the field. | +### WATERMARK + + ### Data type mapping The data format mapping in the TiCDC Debezium message basically follows the [Debezium data type mapping rules](https://debezium.io/documentation/reference/2.4/connectors/mysql.html#mysql-data-types), which is generally consistent with the native message of the Debezium Connector for MySQL. However, for some data types, the following differences exist between TiCDC Debezium and Debezium Connector messages: From 8c957e516f380832736578ac4eb554bc7e7e9f51 Mon Sep 17 00:00:00 2001 From: nhsmw Date: Fri, 6 Dec 2024 16:54:13 +0800 Subject: [PATCH 02/17] Update ticdc-debezium.md --- ticdc/ticdc-debezium.md | 400 +++++++++++++++++++++++++++++++++++++++- 1 file changed, 397 insertions(+), 3 deletions(-) diff --git a/ticdc/ticdc-debezium.md b/ticdc/ticdc-debezium.md index 24ba253f73e5b..71aa1831522a4 100644 --- a/ticdc/ticdc-debezium.md +++ b/ticdc/ticdc-debezium.md @@ -11,7 +11,11 @@ summary: Learn the concept of the TiCDC Debezium Protocol and how to use it. When you use Kafka as the downstream sink, specify the `protocol` field as `debezium` in `sink-uri` configuration. Then TiCDC encapsulates the Debezium messages based on the events and sends TiDB data change events to the downstream. -Currently, the Debezium protocol only supports Row Changed events and directly ignores DDL events and WATERMARK events. A Row changed event represents a data change in a row. When a row changes, the Row Changed event is sent, including relevant information about the row both before and after the change. A WATERMARK event marks the replication progress of a table, indicating that all events earlier than the watermark have been sent to the downstream. +There are three types of Events: + +DDL Event: Represents a DDL change record. It is sent after an upstream DDL statement is successfully executed. 
The DDL Event is sent to the MQ partition with index 0. +DML Event: Represents a row data change record. This type of Event is sent when a row change occurs. It contains the information about the row both before and after the change. +WATERMARK Event: Represents a special time point. It indicates that all Events earlier than this point have been sent to the downstream. The configuration example for using the Debezium message format is as follows: @@ -25,10 +29,400 @@ In addition, the original Debezium format does not include important fields such ## Message format definition -This section describes the format definition of the DML event output in the Debezium format. - ### DDL event +TiCDC encodes a DDL event into a Kafka message, with both the key and value encoded in the Debezium format. + +#### Key format + +```json +{ + "payload": { + "databaseName": "test" + }, + "schema": { + "type": "struct", + "name": "io.debezium.connector.mysql.SchemaChangeKey", + "optional": false, + "version": 1, + "fields": [ + { + "field": "databaseName", + "optional": false, + "type": "string" + } + ] + } +} +``` + +The fields in the key only include database name. The fields are explained as follows: + +| Field | Type | Description | +|:------------------|:--------|:----------------------------------------------------------------------------| +| `payload` | JSON | The information about database name. | +| `schema.fields` | JSON | The type information of each field in the payload. | +| `schema.name` | String | Constant value "io.debezium.connector.mysql.SchemaChangeKey" | +| `schema.type` | String | The data type of the field. | +| `schema.optional`| Boolean | Indicates whether the field is optional. When it is `true`, the field is optional. | +| `schema.version` | String | The schema version. 
| + + +#### Value format + +```json +{ + "payload": { + "source": { + "version": "2.4.0.Final", + "connector": "TiCDC", + "name": "test_cluster", + "ts_ms": 0, + "snapshot": "false", + "db": "test", + "table": "table1", + "server_id": 0, + "gtid": null, + "file": "", + "pos": 0, + "row": 0, + "thread": 0, + "query": null, + "commit_ts": 1, + "cluster_id": "test_cluster" + }, + "ts_ms": 1701326309000, + "databaseName": "test", + "schemaName": null, + "ddl": "RENAME TABLE test.table1 to test.table2", + "tableChanges": [ + { + "type": "ALTER", + "id": "\"test\".\"table2\",\"test\".\"table1\"", + "table": { + "defaultCharsetName": "", + "primaryKeyColumnNames": [ + "id" + ], + "columns": [ + { + "name": "id", + "jdbcType": 4, + "nativeType": null, + "comment": null, + "defaultValueExpression": null, + "enumValues": null, + "typeName": "INT", + "typeExpression": "INT", + "charsetName": null, + "length": 0, + "scale": null, + "position": 1, + "optional": false, + "autoIncremented": false, + "generated": false + } + ], + "comment": null + } + } + ] + }, + "schema": { + "optional": false, + "type": "struct", + "version": 1, + "name": "io.debezium.connector.mysql.SchemaChangeValue", + "fields": [ + { + "field": "source", + "name": "io.debezium.connector.mysql.Source", + "optional": false, + "type": "struct", + "fields": [ + { + "field": "version", + "optional": false, + "type": "string" + }, + { + "field": "connector", + "optional": false, + "type": "string" + }, + { + "field": "name", + "optional": false, + "type": "string" + }, + { + "field": "ts_ms", + "optional": false, + "type": "int64" + }, + { + "field": "snapshot", + "optional": true, + "type": "string", + "parameters": { + "allowed": "true,last,false,incremental" + }, + "default": "false", + "name": "io.debezium.data.Enum", + "version": 1 + }, + { + "field": "db", + "optional": false, + "type": "string" + }, + { + "field": "sequence", + "optional": true, + "type": "string" + }, + { + "field": "table", + "optional": 
true, + "type": "string" + }, + { + "field": "server_id", + "optional": false, + "type": "int64" + }, + { + "field": "gtid", + "optional": true, + "type": "string" + }, + { + "field": "file", + "optional": false, + "type": "string" + }, + { + "field": "pos", + "optional": false, + "type": "int64" + }, + { + "field": "row", + "optional": false, + "type": "int32" + }, + { + "field": "thread", + "optional": true, + "type": "int64" + }, + { + "field": "query", + "optional": true, + "type": "string" + } + ] + }, + { + "field": "ts_ms", + "optional": false, + "type": "int64" + }, + { + "field": "databaseName", + "optional": true, + "type": "string" + }, + { + "field": "schemaName", + "optional": true, + "type": "string" + }, + { + "field": "ddl", + "optional": true, + "type": "string" + }, + { + "field": "tableChanges", + "optional": false, + "type": "array", + "items": { + "name": "io.debezium.connector.schema.Change", + "optional": false, + "type": "struct", + "version": 1, + "fields": [ + { + "field": "type", + "optional": false, + "type": "string" + }, + { + "field": "id", + "optional": false, + "type": "string" + }, + { + "field": "table", + "optional": true, + "type": "struct", + "name": "io.debezium.connector.schema.Table", + "version": 1, + "fields": [ + { + "field": "defaultCharsetName", + "optional": true, + "type": "string" + }, + { + "field": "primaryKeyColumnNames", + "optional": true, + "type": "array", + "items": { + "type": "string", + "optional": false + } + }, + { + "field": "columns", + "optional": false, + "type": "array", + "items": { + "name": "io.debezium.connector.schema.Column", + "optional": false, + "type": "struct", + "version": 1, + "fields": [ + { + "field": "name", + "optional": false, + "type": "string" + }, + { + "field": "jdbcType", + "optional": false, + "type": "int32" + }, + { + "field": "nativeType", + "optional": true, + "type": "int32" + }, + { + "field": "typeName", + "optional": false, + "type": "string" + }, + { + "field": 
"typeExpression", + "optional": true, + "type": "string" + }, + { + "field": "charsetName", + "optional": true, + "type": "string" + }, + { + "field": "length", + "optional": true, + "type": "int32" + }, + { + "field": "scale", + "optional": true, + "type": "int32" + }, + { + "field": "position", + "optional": false, + "type": "int32" + }, + { + "field": "optional", + "optional": true, + "type": "boolean" + }, + { + "field": "autoIncremented", + "optional": true, + "type": "boolean" + }, + { + "field": "generated", + "optional": true, + "type": "boolean" + }, + { + "field": "comment", + "optional": true, + "type": "string" + }, + { + "field": "defaultValueExpression", + "optional": true, + "type": "string" + }, + { + "field": "enumValues", + "optional": true, + "type": "array", + "items": { + "type": "string", + "optional": false + } + } + ] + } + }, + { + "field": "comment", + "optional": true, + "type": "string" + } + ] + } + ] + } + } + ] + } +} +``` + +The key fields of the preceding JSON data are explained as follows: + +| Field | Type | Description | +|:----------|:-------|:-------------------------------------------------------| +| payload.op | String | The type of the change event. `"c"` indicates an `INSERT` event, `"u"` indicates an `UPDATE` event, and `"d"` indicates a `DELETE` event. | +| payload.ts_ms | Number | The timestamp (in milliseconds) when TiCDC generates this message. | +| payload.ddl | String | The SQL of DDL event. | +| payload.databaseName | String | The name of the database where the event occurs. | +| payload.source.commit_ts | Number | The `CommitTs` identifier when TiCDC generates this message. | +| payload.source.db | String | The name of the database where the event occurs. | +| payload.source.table | String | The name of the table where the event occurs. | +| payload.tableChanges | Array | A structured representation of the entire table schema after the schema change. 
The `tableChanges` field contains an array that includes entries for each column of the table. Because the structured representation presents data in JSON or Avro format, consumers can easily read messages without first processing them through a DDL parser. | +| payload.tableChanges.type | String | Describes the kind of change. The value is one of the following: `CREATE` (table created), `ALTER` (table modified), or `DROP` (table deleted). | +| payload.tableChanges.id | String | The full identifier of the table that was created, altered, or dropped. In the case of a table rename, this identifier is a concatenation of the old and new table names, separated by a comma. | +| payload.tableChanges.table.defaultCharsetName | String | The character set of the table where the event occurs. | +| payload.tableChanges.table.primaryKeyColumnNames | Array | The list of columns that compose the table’s primary key. | +| payload.tableChanges.table.columns | Array | Metadata for each column in the changed table. | +| payload.tableChanges.table.columns.name | String | The name of the column. | +| payload.tableChanges.table.columns.jdbcType | Number | The jdbc type of the column. | +| payload.tableChanges.table.columns.comment | String | The comment of the column. | +| payload.tableChanges.table.columns.defaultValueExpression | String | The default value of the column. notice "CURRENT_TIMESTAMP" is converted to "1970-01-01 00:00:00" | +| payload.tableChanges.table.columns.enumValues | String | The enum values of the column. Format is ENUM ('e1', 'e2') or SET ('e1', 'e2') | +| payload.tableChanges.table.columns.charsetName | String | The charset of the column. | +| payload.tableChanges.table.columns.length | Number | The length of the column. | +| payload.tableChanges.table.columns.scale | Number | The scale of the column. | +| payload.tableChanges.table.columns.position | Number | The position of the column. | +| payload.tableChanges.table.columns.optional | Boolean | Indicates whether the column is optional. When it is `true`, the column can be `NULL`. 
| +| schema.fields | JSON | The type information of each field in the payload, including the schema information of the row data before and after the change. | +| schema.name | String | The name of the schema, in the `"{cluster-name}.{schema-name}.{table-name}.Envelope"` format. | +| schema.optional| Boolean | Indicates whether the field is optional. When it is `true`, the field is optional. | +| schema.type | String | The data type of the field. + ### DML event TiCDC encodes a DML event into a Kafka message, with both the key and value encoded in the Debezium format. From a0dc693181be039e16e03f73608a7e6ab2ea8c03 Mon Sep 17 00:00:00 2001 From: nhsmw Date: Fri, 6 Dec 2024 17:05:09 +0800 Subject: [PATCH 03/17] Update ticdc-debezium.md --- ticdc/ticdc-debezium.md | 408 ++++++++++++++++++++++++++++++---------- 1 file changed, 311 insertions(+), 97 deletions(-) diff --git a/ticdc/ticdc-debezium.md b/ticdc/ticdc-debezium.md index 71aa1831522a4..0fb9955842e3d 100644 --- a/ticdc/ticdc-debezium.md +++ b/ticdc/ticdc-debezium.md @@ -418,10 +418,10 @@ The key fields of the preceding JSON data are explained as follows: | payload.tableChanges.table.columns.scale | Number | The scale of the column. | | payload.tableChanges.table.columns.position | Number | The position of the column. | | payload.tableChanges.table.columns.optional | Boolean | Indicates whether the column is not null. | -| schema.fields | JSON | The type information of each field in the payload, including the schema information of the row data before and after the change. | -| schema.name | String | The name of the schema, in the `"{cluster-name}.{schema-name}.{table-name}.Envelope"` format. | -| schema.optional| Boolean | Indicates whether the field is optional. When it is `true`, the field is optional. | -| schema.type | String | The data type of the field. +| schema.fields | JSON | The type information of each field in the payload, including the schema information of the column of table changes. 
| +| schema.name | String | The name of the schema. In the preceding example, it is `"io.debezium.connector.mysql.SchemaChangeValue"`. | +| schema.optional | Boolean | Indicates whether the field is optional. When it is `true`, the field is optional. | +| schema.type | String | The data type of the field. | ### DML event @@ -431,21 +431,21 @@ TiCDC encodes a DML event into a Kafka message, with both the key and value enco ```json { - "payload": { - "a": 4 - }, - "schema": { - "fields": [ - { - "field": "a", - "optional": true, - "type": "int32" - } - ], - "name": "default.test.t2.Key", - "optional": false, - "type": "struct" - } + "payload": { + "tiny": 1 + }, + "schema": { + "fields": [ + { + "field":"tiny", + "optional":true, + "type":"int16" + } + ], + "name": "test_cluster.test.table1.Key", + "optional": false, + "type":"struct" + } } ``` @@ -463,84 +463,102 @@ The fields in the key only include primary key or unique index columns. The fiel ```json { - "payload":{ - "ts_ms":1707103832957, - "transaction":null, - "op":"c", - "before":null, - "after":{ - "a":4, - "b":2 - }, - "source":{ - "version":"2.4.0.Final", - "connector":"TiCDC", - "name":"default", - "ts_ms":1707103832263, - "snapshot":"false", - "db":"test", - "table":"t2", - "server_id":0, - "gtid":null, - "file":"", - "pos":0, - "row":0, - "thread":0, - "query":null, - "commit_ts":447507027004751877, - "cluster_id":"default" - } - }, - "schema":{ - "type":"struct", - "optional":false, - "name":"default.test.t2.Envelope", - "version":1, - "fields":{ - { - "type":"struct", - "optional":true, - "name":"default.test.t2.Value", - "field":"before", - "fields":[ - { - "type":"int32", - "optional":false, - "field":"a" - }, - { - "type":"int32", - "optional":true, - "field":"b" - } - ] - }, - { - "type":"struct", - "optional":true, - "name":"default.test.t2.Value", - "field":"after", - "fields":[ - { - "type":"int32", - "optional":false, - "field":"a" - }, - { - "type":"int32", - "optional":true, - "field":"b" - } - 
] - }, - { - "type":"string", - "optional":false, - "field":"op" - }, - ... - } - } + "payload": { + "source": { + "version": "2.4.0.Final", + "connector": "TiCDC", + "name": "test_cluster", + "ts_ms": 0, + "snapshot": "false", + "db": "test", + "table": "table1", + "server_id": 0, + "gtid": null, + "file": "", + "pos": 0, + "row": 0, + "thread": 0, + "query": null, + "commit_ts": 1, + "cluster_id": "test_cluster" + }, + "ts_ms": 1701326309000, + "transaction": null, + "op": "u", + "before": { "tiny": 2 }, + "after": { "tiny": 1 } + }, + "schema": { + "type": "struct", + "optional": false, + "name": "test_cluster.test.table1.Envelope", + "version": 1, + "fields": [ + { + "type": "struct", + "optional": true, + "name": "test_cluster.test.table1.Value", + "field": "before", + "fields": [{ "type": "int16", "optional": true, "field": "tiny" }] + }, + { + "type": "struct", + "optional": true, + "name": "test_cluster.test.table1.Value", + "field": "after", + "fields": [{ "type": "int16", "optional": true, "field": "tiny" }] + }, + { + "type": "struct", + "fields": [ + { "type": "string", "optional": false, "field": "version" }, + { "type": "string", "optional": false, "field": "connector" }, + { "type": "string", "optional": false, "field": "name" }, + { "type": "int64", "optional": false, "field": "ts_ms" }, + { + "type": "string", + "optional": true, + "name": "io.debezium.data.Enum", + "version": 1, + "parameters": { "allowed": "true,last,false,incremental" }, + "default": "false", + "field": "snapshot" + }, + { "type": "string", "optional": false, "field": "db" }, + { "type": "string", "optional": true, "field": "sequence" }, + { "type": "string", "optional": true, "field": "table" }, + { "type": "int64", "optional": false, "field": "server_id" }, + { "type": "string", "optional": true, "field": "gtid" }, + { "type": "string", "optional": false, "field": "file" }, + { "type": "int64", "optional": false, "field": "pos" }, + { "type": "int32", "optional": false, 
"field": "row" }, + { "type": "int64", "optional": true, "field": "thread" }, + { "type": "string", "optional": true, "field": "query" } + ], + "optional": false, + "name": "io.debezium.connector.mysql.Source", + "field": "source" + }, + { "type": "string", "optional": false, "field": "op" }, + { "type": "int64", "optional": true, "field": "ts_ms" }, + { + "type": "struct", + "fields": [ + { "type": "string", "optional": false, "field": "id" }, + { "type": "int64", "optional": false, "field": "total_order" }, + { + "type": "int64", + "optional": false, + "field": "data_collection_order" + } + ], + "optional": true, + "name": "event.block", + "version": 1, + "field": "transaction" + } + ] + } } ``` @@ -562,6 +580,202 @@ The key fields of the preceding JSON data are explained as follows: ### WATERMARK +TiCDC encodes a Checkpoint event into a Kafka message, with both the key and value encoded in the Debezium format. + +#### Key format + +```json +{ + "payload": {}, + "schema": { + "fields": [], + "optional": false, + "name": "test_cluster.watermark.Key", + "type": "struct" + } +} +``` + +The fields are explained as follows: + +| Field | Type | Description | +|:------------------|:--------|:----------------------------------------------------------------------------| +| `schema.name` | String | The name of the schema, in the `"{cluster-name}.watermark.Key"` format. 
| + +#### Value format + +```json +{ + "payload": { + "source": { + "version": "2.4.0.Final", + "connector": "TiCDC", + "name": "test_cluster", + "ts_ms": 0, + "snapshot": "false", + "db": "", + "table": "", + "server_id": 0, + "gtid": null, + "file": "", + "pos": 0, + "row": 0, + "thread": 0, + "query": null, + "commit_ts": 3, + "cluster_id": "test_cluster" + }, + "op": "m", + "ts_ms": 1701326309000, + "transaction": null + }, + "schema": { + "type": "struct", + "optional": false, + "name": "test_cluster.watermark.Envelope", + "version": 1, + "fields": [ + { + "type": "struct", + "fields": [ + { + "type": "string", + "optional": false, + "field": "version" + }, + { + "type": "string", + "optional": false, + "field": "connector" + }, + { + "type": "string", + "optional": false, + "field": "name" + }, + { + "type": "int64", + "optional": false, + "field": "ts_ms" + }, + { + "type": "string", + "optional": true, + "name": "io.debezium.data.Enum", + "version": 1, + "parameters": { + "allowed": "true,last,false,incremental" + }, + "default": "false", + "field": "snapshot" + }, + { + "type": "string", + "optional": false, + "field": "db" + }, + { + "type": "string", + "optional": true, + "field": "sequence" + }, + { + "type": "string", + "optional": true, + "field": "table" + }, + { + "type": "int64", + "optional": false, + "field": "server_id" + }, + { + "type": "string", + "optional": true, + "field": "gtid" + }, + { + "type": "string", + "optional": false, + "field": "file" + }, + { + "type": "int64", + "optional": false, + "field": "pos" + }, + { + "type": "int32", + "optional": false, + "field": "row" + }, + { + "type": "int64", + "optional": true, + "field": "thread" + }, + { + "type": "string", + "optional": true, + "field": "query" + } + ], + "optional": false, + "name": "io.debezium.connector.mysql.Source", + "field": "source" + }, + { + "type": "string", + "optional": false, + "field": "op" + }, + { + "type": "int64", + "optional": true, + "field": "ts_ms" + 
}, + { + "type": "struct", + "fields": [ + { + "type": "string", + "optional": false, + "field": "id" + }, + { + "type": "int64", + "optional": false, + "field": "total_order" + }, + { + "type": "int64", + "optional": false, + "field": "data_collection_order" + } + ], + "optional": true, + "name": "event.block", + "version": 1, + "field": "transaction" + } + ] + } +} +``` + +The key fields of the preceding JSON data are explained as follows: + +| Field | Type | Description | +|:----------|:-------|:-------------------------------------------------------| +| payload.op | String | The type of the change event. `"m"` indicates a watermark event. | +| payload.ts_ms | Number | The timestamp (in milliseconds) when TiCDC generates this message. | +| payload.source.commit_ts | Number | The `CommitTs` identifier when TiCDC generates this message. | +| payload.source.db | String | The name of the database where the event occurs. | +| payload.source.table | String | The name of the table where the event occurs. | +| schema.fields | JSON | The type information of each field in the payload. | +| schema.name | String | The name of the schema, in the `"{cluster-name}.watermark.Envelope"` format. | +| schema.optional| Boolean | Indicates whether the field is optional. When it is `true`, the field is optional. | +| schema.type | String | The data type of the field. 
| ### Data type mapping From 793e4850dd8d7d9e2a8c5f1d0f6e27192d941912 Mon Sep 17 00:00:00 2001 From: nhsmw Date: Fri, 6 Dec 2024 17:36:29 +0800 Subject: [PATCH 04/17] Update ticdc-debezium.md --- ticdc/ticdc-debezium.md | 28 +++++++++++++++++++++++++++- 1 file changed, 27 insertions(+), 1 deletion(-) diff --git a/ticdc/ticdc-debezium.md b/ticdc/ticdc-debezium.md index 0fb9955842e3d..120f213299e44 100644 --- a/ticdc/ticdc-debezium.md +++ b/ticdc/ticdc-debezium.md @@ -777,7 +777,9 @@ The key fields of the preceding JSON data are explained as follows: | schema.optional| Boolean | Indicates whether the field is optional. When it is `true`, the field is optional. | | schema.type | String | The data type of the field. | -### Data type mapping +### Changes in TiCDC Debezium + +#### Data type mapping The data format mapping in the TiCDC Debezium message basically follows the [Debezium data type mapping rules](https://debezium.io/documentation/reference/2.4/connectors/mysql.html#mysql-data-types), which is generally consistent with the native message of the Debezium Connector for MySQL. However, for some data types, the following differences exist between TiCDC Debezium and Debezium Connector messages: @@ -786,3 +788,27 @@ The data format mapping in the TiCDC Debezium message basically follows the [Deb - For string-like data types, including Varchar, String, VarString, TinyBlob, MediumBlob, BLOB, and LongBlob, when the column has the BINARY flag, TiCDC encodes it as a String type after encoding it in Base64; when the column does not have the BINARY flag, TiCDC encodes it directly as a String type. The native Debezium Connector encodes it in different ways according to `binary.handling.mode`. - For the Decimal data type, including `DECIMAL` and `NUMERIC`, TiCDC uses the float64 type to represent it. The native Debezium Connector encodes it in float32 or float64 according to the different precision of the data type. 
+ +- TiCDC converts REAL to FLOAT when setting sql_mode='REAL_AS_FLOAT' + +- TiCDC converts BOOLEAN to TINYINT(1) + +#### Different values display + +The values of some columns may be different between Debezium and TiCDC: + +- Be careful with [time zone](https://debezium.io/documentation/reference/3.0/connectors/mysql.html#mysql-temporal-types). + +- In TiCDC, BLOB, TEXT, GEOMETRY, or JSON column 'binaryRepresentation' can't have a default value + +- The defaultValueExpression of BIT may not be the same when the length of the default value is not equal to the column length + +- Debezium FLOAT data convert "5.61" to "5.610000133514404", but TiCDC does not. + +- TiCDC print the wrong `flen` with the FLOAT tidb#57060 + +- Debezium converts charsetName to "utf8mb4" when column COLLATE is "utf8_unicode_ci" and CHARACTER is null, but TiCDC doesn't. + +- Debezium doesn't escape character, but TiCDC does. e.g. ENUM('c, 'd', 'g,''h') + +- TiCDC converts "TIME" default value '1000-00-00 01:00:00.000' to "1000-00-00", but Debezium doesn't. From 66f2dde0316884f2cd86f8fcc493e8923990bba5 Mon Sep 17 00:00:00 2001 From: nhsmw Date: Fri, 6 Dec 2024 17:44:06 +0800 Subject: [PATCH 05/17] Update ticdc-debezium.md --- ticdc/ticdc-debezium.md | 13 ++++++------- 1 file changed, 6 insertions(+), 7 deletions(-) diff --git a/ticdc/ticdc-debezium.md b/ticdc/ticdc-debezium.md index 120f213299e44..55b0d579d52f7 100644 --- a/ticdc/ticdc-debezium.md +++ b/ticdc/ticdc-debezium.md @@ -395,7 +395,6 @@ The key fields of the preceding JSON data are explained as follows: | Field | Type | Description | |:----------|:-------|:-------------------------------------------------------| -| payload.op | String | The type of the change event. `"c"` indicates an `INSERT` event, `"u"` indicates an `UPDATE` event, and `"d"` indicates a `DELETE` event. | | payload.ts_ms | Number | The timestamp (in milliseconds) when TiCDC generates this message. | | payload.ddl | String | The SQL of DDL event. 
| | payload.databaseName | String | The name of the database where the event occurs. | @@ -453,11 +452,11 @@ The fields in the key only include primary key or unique index columns. The fiel | Field | Type | Description | |:------------------|:--------|:----------------------------------------------------------------------------| -| `payload` | JSON | The information about primary key or unique index columns. The key and value in each field represent the column name and its current value, respectively. | -| `schema.fields` | JSON | The type information of each field in the payload, including the schema information of the row data before and after the change. | -| `schema.name` | String | The name of the schema, in the `"{cluster-name}.{schema-name}.{table-name}.Key"` format. | -| `schema.optional`| Boolean | Indicates whether the field is optional. When it is `true`, the field is optional. | -| `schema.type` | String | The data type of the field. | +| payload | JSON | The information about the primary key or unique index columns. The key and value in each field represent the column name and its current value, respectively. | +| schema.fields | JSON | The type information of each field in the payload. | +| schema.name | String | The name of the schema, in the `"{cluster-name}.{schema-name}.{table-name}.Key"` format. | +| schema.optional | Boolean | Indicates whether the field is optional. When it is `true`, the field is optional. | +| schema.type | String | The data type of the field. | #### Value format @@ -805,7 +804,7 @@ The values of some columns may be different between Debezium and TiCDC: - Debezium FLOAT data convert "5.61" to "5.610000133514404", but TiCDC does not. 
-- TiCDC print the wrong `flen` with the FLOAT tidb#57060
+- TiCDC print the wrong `flen` with the FLOAT [tidb#57060](https://github.com/pingcap/tidb/issues/57060)
 
 - Debezium converts charsetName to "utf8mb4" when column COLLATE is "utf8_unicode_ci" and CHARACTER is null, but TiCDC doesn't.
 
From b9c4f4b8585c34e84b1c74544425132bd2035c93 Mon Sep 17 00:00:00 2001
From: nhsmw
Date: Fri, 6 Dec 2024 17:45:22 +0800
Subject: [PATCH 06/17] Update ticdc-debezium.md

---
 ticdc/ticdc-debezium.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/ticdc/ticdc-debezium.md b/ticdc/ticdc-debezium.md
index 55b0d579d52f7..5dee7d66215db 100644
--- a/ticdc/ticdc-debezium.md
+++ b/ticdc/ticdc-debezium.md
@@ -599,7 +599,7 @@ The fields are explained as follows:
 
 | Field | Type | Description |
 |:------------------|:--------|:----------------------------------------------------------------------------|
-| `schema.name` | String | The name of the schema, in the `"{cluster-name}.watermark.Key"` format. |
+| schema.name | String | The name of the schema, in the `"{cluster-name}.watermark.Key"` format. |
 
 #### Value format
 
From 6da4609439e4667aa6970a75b4ef40fd46d0c5cc Mon Sep 17 00:00:00 2001
From: nhsmw
Date: Fri, 6 Dec 2024 17:47:17 +0800
Subject: [PATCH 07/17] Update ticdc-debezium.md

---
 ticdc/ticdc-debezium.md | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/ticdc/ticdc-debezium.md b/ticdc/ticdc-debezium.md
index 5dee7d66215db..314bed4c7215f 100644
--- a/ticdc/ticdc-debezium.md
+++ b/ticdc/ticdc-debezium.md
@@ -60,12 +60,12 @@ The fields in the key only include database name. The fields are explained as fo
 
 | Field | Type | Description |
 |:------------------|:--------|:----------------------------------------------------------------------------|
-| `payload` | JSON | The information about database name. |
-| `schema.fields` | JSON | The type information of each field in the payload. |
-| `schema.name` | String | Constant value "io.debezium.connector.mysql.SchemaChangeKey" |
-| `schema.type` | String | The data type of the field. |
-| `schema.optional`| Boolean | Indicates whether the field is optional. When it is `true`, the field is optional. |
-| `schema.version` | String | The schema version. |
+| payload | JSON | The information about database name. |
+| schema.fields | JSON | The type information of each field in the payload. |
+| schema.name | String | Constant value "io.debezium.connector.mysql.SchemaChangeKey" |
+| schema.type | String | The data type of the field. |
+| schema.optional | Boolean | Indicates whether the field is optional. When it is `true`, the field is optional. |
+| schema.version | String | The schema version. |
 
 #### Value format
 
From afc151ec2259e650752031d16b17e352e3724e57 Mon Sep 17 00:00:00 2001
From: nhsmw
Date: Mon, 9 Dec 2024 12:05:51 +0800
Subject: [PATCH 08/17] Update ticdc-debezium.md

---
 ticdc/ticdc-debezium.md | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/ticdc/ticdc-debezium.md b/ticdc/ticdc-debezium.md
index 314bed4c7215f..48056153570ae 100644
--- a/ticdc/ticdc-debezium.md
+++ b/ticdc/ticdc-debezium.md
@@ -410,7 +410,7 @@ The key fields of the preceding JSON data are explained as follows:
 | payload.tableChanges.table.columns.name | String | The name of the column. |
 | payload.tableChanges.table.columns.jdbcType | Number | The jdbc type of the column. |
 | payload.tableChanges.table.columns.comment | String | The comment of the column. |
-| payload.tableChanges.table.columns.defaultValueExpression | String | The default value of the column. notice "CURRENT_TIMESTAMP" is converted to "1970-01-01 00:00:00" |
+| payload.tableChanges.table.columns.defaultValueExpression | String | The default value of the column. |
 | payload.tableChanges.table.columns.enumValues | String | The enum values of the column. Format is ENUM ('e1', 'e2') or SET ('e1', 'e2') |
 | payload.tableChanges.table.columns.charsetName | String | The charset of the column. |
 | payload.tableChanges.table.columns.length | Number | The length of the column. |
@@ -796,7 +796,7 @@ The data format mapping in the TiCDC Debezium message basically follows the [Deb
 
 The values of some columns may be different between Debezium and TiCDC:
 
-- Be careful with [time zone](https://debezium.io/documentation/reference/3.0/connectors/mysql.html#mysql-temporal-types).
+- TIMESTAMP and DATETIME are converted to the column’s precision by using [UTC](https://debezium.io/documentation/reference/3.0/connectors/mysql.html#mysql-temporal-types).
 
 - In TiCDC, BLOB, TEXT, GEOMETRY, or JSON column 'binaryRepresentation' can't have a default value
 
@@ -806,8 +806,8 @@ The values of some columns may be different between Debezium and TiCDC:
 
 - TiCDC print the wrong `flen` with the FLOAT [tidb#57060](https://github.com/pingcap/tidb/issues/57060)
 
-- Debezium converts charsetName to "utf8mb4" when column COLLATE is "utf8_unicode_ci" and CHARACTER is null, but TiCDC doesn't.
+- Debezium converts charsetName to "utf8mb4" when column COLLATE is "utf8_unicode_ci" and CHARACTER is null, but TiCDC does not.
 
-- Debezium doesn't escape character, but TiCDC does. e.g. ENUM('c, 'd', 'g,''h')
+- Debezium escapes character, but TiCDC does not. e.g. ENUM('c, 'd', 'g,''h')
 
 - TiCDC converts "TIME" default value '1000-00-00 01:00:00.000' to "1000-00-00", but Debezium doesn't.
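
Reviewer note: the FLOAT difference listed in this series (Debezium renders `5.61` as `5.610000133514404`, while TiCDC does not) looks like ordinary widening of a 32-bit float to a 64-bit float. The following is a minimal Python sketch of that effect, assuming single-precision storage is the cause. `widen_float32` is a hypothetical helper written for illustration and is not part of TiCDC or Debezium.

```python
import struct


def widen_float32(value: float) -> float:
    """Round a Python float to 32-bit precision, then read it back as a 64-bit float."""
    packed = struct.pack("<f", value)       # store as IEEE 754 single precision
    return struct.unpack("<f", packed)[0]   # Python floats are double precision


widened = widen_float32(5.61)
print(widened)          # the widened value no longer prints as 5.61
print(widened == 5.61)  # False: the two doubles differ
```

A consumer that compares output from both systems may therefore want to compare FLOAT values with a tolerance rather than for exact equality.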
From 19aa23c453bfb7ee983c5833d238eecce5cae154 Mon Sep 17 00:00:00 2001
From: nhsmw
Date: Mon, 9 Dec 2024 14:15:07 +0800
Subject: [PATCH 09/17] Update ticdc-debezium.md

---
 ticdc/ticdc-debezium.md | 8 ++------
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/ticdc/ticdc-debezium.md b/ticdc/ticdc-debezium.md
index 48056153570ae..8f909bd09e10d 100644
--- a/ticdc/ticdc-debezium.md
+++ b/ticdc/ticdc-debezium.md
@@ -5,7 +5,7 @@ summary: Learn the concept of the TiCDC Debezium Protocol and how to use it.
 
 # TiCDC Debezium Protocol
 
-[Debezium](https://debezium.io/) is a tool for capturing database changes. It converts each captured database change into a message called an "event" and sends these events to Kafka. Starting from v8.0.0, TiCDC supports sending TiDB changes to Kafka using a Debezium style output format, simplifying migration from MySQL databases for users who had previously been using Debezium's MySQL integration. Starting from v8.6, TiCDC supports DDL events and watermark events.
+[Debezium](https://debezium.io/) is a tool for capturing database changes. It converts each captured database change into a message called an "event" and sends these events to Kafka. Starting from v8.0.0, TiCDC supports sending TiDB changes to Kafka using a Debezium style output format, simplifying migration from MySQL databases for users who had previously been using Debezium's MySQL integration. Starting from v9.1, TiCDC supports DDL events and watermark events.
 
 ## Use the Debezium message format
 
@@ -796,11 +796,7 @@ The data format mapping in the TiCDC Debezium message basically follows the [Deb
 
 The values of some columns may be different between Debezium and TiCDC:
 
-- TIMESTAMP and DATETIME are converted to the column’s precision by using [UTC](https://debezium.io/documentation/reference/3.0/connectors/mysql.html#mysql-temporal-types).
-
-- In TiCDC, BLOB, TEXT, GEOMETRY, or JSON column 'binaryRepresentation' can't have a default value
-
-- The defaultValueExpression of BIT may not be the same when the length of the default value is not equal to the column length
+- In TiCDC, BLOB, TEXT, GEOMETRY, or JSON column can't have a default value
 
 - Debezium FLOAT data convert "5.61" to "5.610000133514404", but TiCDC does not.
 
From 61b0b2a614b8dc1b1d9ccbbf050d5c825433bb14 Mon Sep 17 00:00:00 2001
From: nhsmw
Date: Mon, 9 Dec 2024 15:19:18 +0800
Subject: [PATCH 10/17] Update ticdc-debezium.md

---
 ticdc/ticdc-debezium.md | 15 ++++-----------
 1 file changed, 4 insertions(+), 11 deletions(-)

diff --git a/ticdc/ticdc-debezium.md b/ticdc/ticdc-debezium.md
index 8f909bd09e10d..51c7d4ccc34ac 100644
--- a/ticdc/ticdc-debezium.md
+++ b/ticdc/ticdc-debezium.md
@@ -776,9 +776,8 @@ The key fields of the preceding JSON data are explained as follows:
 | schema.optional| Boolean | Indicates whether the field is optional. When it is `true`, the field is optional. |
 | schema.type | String | The data type of the field. |
 
-### Changes in TiCDC Debezium
 
-#### Data type mapping
+### Data type mapping
 
 The data format mapping in the TiCDC Debezium message basically follows the [Debezium data type mapping rules](https://debezium.io/documentation/reference/2.4/connectors/mysql.html#mysql-data-types), which is generally consistent with the native message of the Debezium Connector for MySQL. However, for some data types, the following differences exist between TiCDC Debezium and Debezium Connector messages:
 
@@ -786,15 +785,9 @@ The data format mapping in the TiCDC Debezium message basically follows the [Deb
 
 - For string-like data types, including Varchar, String, VarString, TinyBlob, MediumBlob, BLOB, and LongBlob, when the column has the BINARY flag, TiCDC encodes it as a String type after encoding it in Base64; when the column does not have the BINARY flag, TiCDC encodes it directly as a String type. The native Debezium Connector encodes it in different ways according to `binary.handling.mode`.
 
-- For the Decimal data type, including `DECIMAL` and `NUMERIC`, TiCDC uses the float64 type to represent it. The native Debezium Connector encodes it in float32 or float64 according to the different precision of the data type.
+- For the Decimal data type, including DECIMAL and NUMERIC, TiCDC uses the float64 type to represent it. The native Debezium Connector encodes it in float32 or float64 according to the different precision of the data type.
 
-- TiCDC converts REAL to FLOAT when setting sql_mode='REAL_AS_FLOAT'
-
-- TiCDC converts BOOLEAN to TINYINT(1)
-
-#### Different values display
-
-The values of some columns may be different between Debezium and TiCDC:
+- TiCDC converts REAL to DOUBLE, and converts BOOLEAN to TINYINT(1) when the length is one.
 
 - In TiCDC, BLOB, TEXT, GEOMETRY, or JSON column can't have a default value
 
@@ -806,4 +799,4 @@ The values of some columns may be different between Debezium and TiCDC:
 
 - Debezium escapes character, but TiCDC does not. e.g. ENUM('c, 'd', 'g,''h')
 
-- TiCDC converts "TIME" default value '1000-00-00 01:00:00.000' to "1000-00-00", but Debezium doesn't.
+- TiCDC converts the default value of TIME like '1000-00-00 01:00:00.000' to "1000-00-00", but Debezium does not.
From 996d8f98fbe5619c64f37d23731bedd861136ea2 Mon Sep 17 00:00:00 2001
From: nhsmw
Date: Thu, 12 Dec 2024 11:06:37 +0800
Subject: [PATCH 11/17] Update ticdc-debezium.md

---
 ticdc/ticdc-debezium.md | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/ticdc/ticdc-debezium.md b/ticdc/ticdc-debezium.md
index 51c7d4ccc34ac..d2a6a6de0b85b 100644
--- a/ticdc/ticdc-debezium.md
+++ b/ticdc/ticdc-debezium.md
@@ -62,7 +62,6 @@ The fields in the key only include database name. The fields are explained as fo
 |:------------------|:--------|:----------------------------------------------------------------------------|
 | payload | JSON | The information about database name. |
 | schema.fields | JSON | The type information of each field in the payload. |
-| schema.name | String | Constant value "io.debezium.connector.mysql.SchemaChangeKey" |
 | schema.type | String | The data type of the field. |
 | schema.optional | Boolean | Indicates whether the field is optional. When it is `true`, the field is optional. |
 | schema.version | String | The schema version. |
@@ -405,7 +404,7 @@ The key fields of the preceding JSON data are explained as follows:
 | payload.tableChanges.type | String | Describes the kind of change. The value is one of the following: CREATE Table created. ALTER Table modified. DROP Table deleted. |
 | payload.tableChanges.id | String | Full identifier of the table that was created, altered, or dropped. In the case of a table rename, this identifier is a concatenation of , table names. |
 | payload.tableChanges.table.defaultCharsetName | string | The charset of the table where the event occurs. |
-| payload.tableChanges.table.primaryKeyColumnNames | string | List of columns that compose the table’s primary key. |
+| payload.tableChanges.table.primaryKeyColumnNames | string | List of columns that compose the table's primary key. |
 | payload.tableChanges.table.columns | Array | Metadata for each column in the changed table. |
 | payload.tableChanges.table.columns.name | String | The name of the column. |
 | payload.tableChanges.table.columns.jdbcType | Number | The jdbc type of the column. |
@@ -797,6 +796,6 @@ The data format mapping in the TiCDC Debezium message basically follows the [Deb
 
 - Debezium converts charsetName to "utf8mb4" when column COLLATE is "utf8_unicode_ci" and CHARACTER is null, but TiCDC does not.
 
-- Debezium escapes character, but TiCDC does not. e.g.
ENUM('c, 'd', 'g,''h') - +- Debezium escapes character, but TiCDC does not. for example, Debezium encode ENUM elements ('c', 'd', 'g,''h') to ('c','d','g,\'\'h') + - TiCDC converts the default value of TIME like '1000-00-00 01:00:00.000' to "1000-00-00", but Debezium does not. From 852aa42e0feb5c46c7160b2a45c0fd61a787b502 Mon Sep 17 00:00:00 2001 From: lilin90 Date: Mon, 6 Jan 2025 11:58:10 +0800 Subject: [PATCH 12/17] ticdc: replace each hard Tab with four spaces --- ticdc/ticdc-debezium.md | 1213 +++++++++++++++++++-------------------- 1 file changed, 606 insertions(+), 607 deletions(-) diff --git a/ticdc/ticdc-debezium.md b/ticdc/ticdc-debezium.md index d2a6a6de0b85b..de9ca451ab936 100644 --- a/ticdc/ticdc-debezium.md +++ b/ticdc/ticdc-debezium.md @@ -37,22 +37,22 @@ TiCDC encodes a DDL event into a Kafka message, with both the key and value enco ```json { - "payload": { - "databaseName": "test" - }, - "schema": { - "type": "struct", - "name": "io.debezium.connector.mysql.SchemaChangeKey", - "optional": false, - "version": 1, - "fields": [ - { - "field": "databaseName", - "optional": false, - "type": "string" - } - ] - } + "payload": { + "databaseName": "test" + }, + "schema": { + "type": "struct", + "name": "io.debezium.connector.mysql.SchemaChangeKey", + "optional": false, + "version": 1, + "fields": [ + { + "field": "databaseName", + "optional": false, + "type": "string" + } + ] + } } ``` @@ -71,322 +71,322 @@ The fields in the key only include database name. 
The fields are explained as fo ```json { - "payload": { - "source": { - "version": "2.4.0.Final", - "connector": "TiCDC", - "name": "test_cluster", - "ts_ms": 0, - "snapshot": "false", - "db": "test", - "table": "table1", - "server_id": 0, - "gtid": null, - "file": "", - "pos": 0, - "row": 0, - "thread": 0, - "query": null, - "commit_ts": 1, - "cluster_id": "test_cluster" - }, - "ts_ms": 1701326309000, - "databaseName": "test", - "schemaName": null, - "ddl": "RENAME TABLE test.table1 to test.table2", - "tableChanges": [ - { - "type": "ALTER", - "id": "\"test\".\"table2\",\"test\".\"table1\"", - "table": { - "defaultCharsetName": "", - "primaryKeyColumnNames": [ - "id" - ], - "columns": [ - { - "name": "id", - "jdbcType": 4, - "nativeType": null, - "comment": null, - "defaultValueExpression": null, - "enumValues": null, - "typeName": "INT", - "typeExpression": "INT", - "charsetName": null, - "length": 0, - "scale": null, - "position": 1, - "optional": false, - "autoIncremented": false, - "generated": false - } - ], - "comment": null - } - } - ] - }, - "schema": { - "optional": false, - "type": "struct", - "version": 1, - "name": "io.debezium.connector.mysql.SchemaChangeValue", - "fields": [ - { - "field": "source", - "name": "io.debezium.connector.mysql.Source", - "optional": false, - "type": "struct", - "fields": [ - { - "field": "version", - "optional": false, - "type": "string" - }, - { - "field": "connector", - "optional": false, - "type": "string" - }, - { - "field": "name", - "optional": false, - "type": "string" - }, - { - "field": "ts_ms", - "optional": false, - "type": "int64" - }, - { - "field": "snapshot", - "optional": true, - "type": "string", - "parameters": { - "allowed": "true,last,false,incremental" - }, - "default": "false", - "name": "io.debezium.data.Enum", - "version": 1 - }, - { - "field": "db", - "optional": false, - "type": "string" - }, - { - "field": "sequence", - "optional": true, - "type": "string" - }, - { - "field": "table", - 
"optional": true, - "type": "string" - }, - { - "field": "server_id", - "optional": false, - "type": "int64" - }, - { - "field": "gtid", - "optional": true, - "type": "string" - }, - { - "field": "file", - "optional": false, - "type": "string" - }, - { - "field": "pos", - "optional": false, - "type": "int64" - }, - { - "field": "row", - "optional": false, - "type": "int32" - }, - { - "field": "thread", - "optional": true, - "type": "int64" - }, - { - "field": "query", - "optional": true, - "type": "string" - } - ] - }, - { - "field": "ts_ms", - "optional": false, - "type": "int64" - }, - { - "field": "databaseName", - "optional": true, - "type": "string" - }, - { - "field": "schemaName", - "optional": true, - "type": "string" - }, - { - "field": "ddl", - "optional": true, - "type": "string" - }, - { - "field": "tableChanges", - "optional": false, - "type": "array", - "items": { - "name": "io.debezium.connector.schema.Change", - "optional": false, - "type": "struct", - "version": 1, - "fields": [ - { - "field": "type", - "optional": false, - "type": "string" - }, - { - "field": "id", - "optional": false, - "type": "string" - }, - { - "field": "table", - "optional": true, - "type": "struct", - "name": "io.debezium.connector.schema.Table", - "version": 1, - "fields": [ - { - "field": "defaultCharsetName", - "optional": true, - "type": "string" - }, - { - "field": "primaryKeyColumnNames", - "optional": true, - "type": "array", - "items": { - "type": "string", - "optional": false - } - }, - { - "field": "columns", - "optional": false, - "type": "array", - "items": { - "name": "io.debezium.connector.schema.Column", - "optional": false, - "type": "struct", - "version": 1, - "fields": [ - { - "field": "name", - "optional": false, - "type": "string" - }, - { - "field": "jdbcType", - "optional": false, - "type": "int32" - }, - { - "field": "nativeType", - "optional": true, - "type": "int32" - }, - { - "field": "typeName", - "optional": false, - "type": "string" - }, - { - 
"field": "typeExpression", - "optional": true, - "type": "string" - }, - { - "field": "charsetName", - "optional": true, - "type": "string" - }, - { - "field": "length", - "optional": true, - "type": "int32" - }, - { - "field": "scale", - "optional": true, - "type": "int32" - }, - { - "field": "position", - "optional": false, - "type": "int32" - }, - { - "field": "optional", - "optional": true, - "type": "boolean" - }, - { - "field": "autoIncremented", - "optional": true, - "type": "boolean" - }, - { - "field": "generated", - "optional": true, - "type": "boolean" - }, - { - "field": "comment", - "optional": true, - "type": "string" - }, - { - "field": "defaultValueExpression", - "optional": true, - "type": "string" - }, - { - "field": "enumValues", - "optional": true, - "type": "array", - "items": { - "type": "string", - "optional": false - } - } - ] - } - }, - { - "field": "comment", - "optional": true, - "type": "string" - } - ] - } - ] - } - } - ] - } + "payload": { + "source": { + "version": "2.4.0.Final", + "connector": "TiCDC", + "name": "test_cluster", + "ts_ms": 0, + "snapshot": "false", + "db": "test", + "table": "table1", + "server_id": 0, + "gtid": null, + "file": "", + "pos": 0, + "row": 0, + "thread": 0, + "query": null, + "commit_ts": 1, + "cluster_id": "test_cluster" + }, + "ts_ms": 1701326309000, + "databaseName": "test", + "schemaName": null, + "ddl": "RENAME TABLE test.table1 to test.table2", + "tableChanges": [ + { + "type": "ALTER", + "id": "\"test\".\"table2\",\"test\".\"table1\"", + "table": { + "defaultCharsetName": "", + "primaryKeyColumnNames": [ + "id" + ], + "columns": [ + { + "name": "id", + "jdbcType": 4, + "nativeType": null, + "comment": null, + "defaultValueExpression": null, + "enumValues": null, + "typeName": "INT", + "typeExpression": "INT", + "charsetName": null, + "length": 0, + "scale": null, + "position": 1, + "optional": false, + "autoIncremented": false, + "generated": false + } + ], + "comment": null + } + } + ] + }, + 
"schema": { + "optional": false, + "type": "struct", + "version": 1, + "name": "io.debezium.connector.mysql.SchemaChangeValue", + "fields": [ + { + "field": "source", + "name": "io.debezium.connector.mysql.Source", + "optional": false, + "type": "struct", + "fields": [ + { + "field": "version", + "optional": false, + "type": "string" + }, + { + "field": "connector", + "optional": false, + "type": "string" + }, + { + "field": "name", + "optional": false, + "type": "string" + }, + { + "field": "ts_ms", + "optional": false, + "type": "int64" + }, + { + "field": "snapshot", + "optional": true, + "type": "string", + "parameters": { + "allowed": "true,last,false,incremental" + }, + "default": "false", + "name": "io.debezium.data.Enum", + "version": 1 + }, + { + "field": "db", + "optional": false, + "type": "string" + }, + { + "field": "sequence", + "optional": true, + "type": "string" + }, + { + "field": "table", + "optional": true, + "type": "string" + }, + { + "field": "server_id", + "optional": false, + "type": "int64" + }, + { + "field": "gtid", + "optional": true, + "type": "string" + }, + { + "field": "file", + "optional": false, + "type": "string" + }, + { + "field": "pos", + "optional": false, + "type": "int64" + }, + { + "field": "row", + "optional": false, + "type": "int32" + }, + { + "field": "thread", + "optional": true, + "type": "int64" + }, + { + "field": "query", + "optional": true, + "type": "string" + } + ] + }, + { + "field": "ts_ms", + "optional": false, + "type": "int64" + }, + { + "field": "databaseName", + "optional": true, + "type": "string" + }, + { + "field": "schemaName", + "optional": true, + "type": "string" + }, + { + "field": "ddl", + "optional": true, + "type": "string" + }, + { + "field": "tableChanges", + "optional": false, + "type": "array", + "items": { + "name": "io.debezium.connector.schema.Change", + "optional": false, + "type": "struct", + "version": 1, + "fields": [ + { + "field": "type", + "optional": false, + "type": "string" + 
}, + { + "field": "id", + "optional": false, + "type": "string" + }, + { + "field": "table", + "optional": true, + "type": "struct", + "name": "io.debezium.connector.schema.Table", + "version": 1, + "fields": [ + { + "field": "defaultCharsetName", + "optional": true, + "type": "string" + }, + { + "field": "primaryKeyColumnNames", + "optional": true, + "type": "array", + "items": { + "type": "string", + "optional": false + } + }, + { + "field": "columns", + "optional": false, + "type": "array", + "items": { + "name": "io.debezium.connector.schema.Column", + "optional": false, + "type": "struct", + "version": 1, + "fields": [ + { + "field": "name", + "optional": false, + "type": "string" + }, + { + "field": "jdbcType", + "optional": false, + "type": "int32" + }, + { + "field": "nativeType", + "optional": true, + "type": "int32" + }, + { + "field": "typeName", + "optional": false, + "type": "string" + }, + { + "field": "typeExpression", + "optional": true, + "type": "string" + }, + { + "field": "charsetName", + "optional": true, + "type": "string" + }, + { + "field": "length", + "optional": true, + "type": "int32" + }, + { + "field": "scale", + "optional": true, + "type": "int32" + }, + { + "field": "position", + "optional": false, + "type": "int32" + }, + { + "field": "optional", + "optional": true, + "type": "boolean" + }, + { + "field": "autoIncremented", + "optional": true, + "type": "boolean" + }, + { + "field": "generated", + "optional": true, + "type": "boolean" + }, + { + "field": "comment", + "optional": true, + "type": "string" + }, + { + "field": "defaultValueExpression", + "optional": true, + "type": "string" + }, + { + "field": "enumValues", + "optional": true, + "type": "array", + "items": { + "type": "string", + "optional": false + } + } + ] + } + }, + { + "field": "comment", + "optional": true, + "type": "string" + } + ] + } + ] + } + } + ] + } } ``` @@ -419,7 +419,7 @@ The key fields of the preceding JSON data are explained as follows: | schema.fields 
| JSON | The type information of each field in the payload, including the schema information of the column of table changes. | | schema.name | String | The name of the schema, in the `"{cluster-name}.{schema-name}.{table-name}.SchemaChangeValue"` format. | | schema.optional | Boolean | Indicates whether the field is optional. When it is `true`, the field is optional. | -| schema.type | String | The data type of the field. | +| schema.type | String | The data type of the field. | ### DML event @@ -429,21 +429,21 @@ TiCDC encodes a DML event into a Kafka message, with both the key and value enco ```json { - "payload": { - "tiny": 1 - }, - "schema": { - "fields": [ - { - "field":"tiny", - "optional":true, - "type":"int16" - } - ], - "name": "test_cluster.test.table1.Key", - "optional": false, - "type":"struct" - } + "payload": { + "tiny": 1 + }, + "schema": { + "fields": [ + { + "field":"tiny", + "optional":true, + "type":"int16" + } + ], + "name": "test_cluster.test.table1.Key", + "optional": false, + "type":"struct" + } } ``` @@ -461,102 +461,102 @@ The fields in the key only include primary key or unique index columns. 
The fiel ```json { - "payload": { - "source": { - "version": "2.4.0.Final", - "connector": "TiCDC", - "name": "test_cluster", - "ts_ms": 0, - "snapshot": "false", - "db": "test", - "table": "table1", - "server_id": 0, - "gtid": null, - "file": "", - "pos": 0, - "row": 0, - "thread": 0, - "query": null, - "commit_ts": 1, - "cluster_id": "test_cluster" - }, - "ts_ms": 1701326309000, - "transaction": null, - "op": "u", - "before": { "tiny": 2 }, - "after": { "tiny": 1 } - }, - "schema": { - "type": "struct", - "optional": false, - "name": "test_cluster.test.table1.Envelope", - "version": 1, - "fields": [ - { - "type": "struct", - "optional": true, - "name": "test_cluster.test.table1.Value", - "field": "before", - "fields": [{ "type": "int16", "optional": true, "field": "tiny" }] - }, - { - "type": "struct", - "optional": true, - "name": "test_cluster.test.table1.Value", - "field": "after", - "fields": [{ "type": "int16", "optional": true, "field": "tiny" }] - }, - { - "type": "struct", - "fields": [ - { "type": "string", "optional": false, "field": "version" }, - { "type": "string", "optional": false, "field": "connector" }, - { "type": "string", "optional": false, "field": "name" }, - { "type": "int64", "optional": false, "field": "ts_ms" }, - { - "type": "string", - "optional": true, - "name": "io.debezium.data.Enum", - "version": 1, - "parameters": { "allowed": "true,last,false,incremental" }, - "default": "false", - "field": "snapshot" - }, - { "type": "string", "optional": false, "field": "db" }, - { "type": "string", "optional": true, "field": "sequence" }, - { "type": "string", "optional": true, "field": "table" }, - { "type": "int64", "optional": false, "field": "server_id" }, - { "type": "string", "optional": true, "field": "gtid" }, - { "type": "string", "optional": false, "field": "file" }, - { "type": "int64", "optional": false, "field": "pos" }, - { "type": "int32", "optional": false, "field": "row" }, - { "type": "int64", "optional": true, "field": 
"thread" }, - { "type": "string", "optional": true, "field": "query" } - ], - "optional": false, - "name": "io.debezium.connector.mysql.Source", - "field": "source" - }, - { "type": "string", "optional": false, "field": "op" }, - { "type": "int64", "optional": true, "field": "ts_ms" }, - { - "type": "struct", - "fields": [ - { "type": "string", "optional": false, "field": "id" }, - { "type": "int64", "optional": false, "field": "total_order" }, - { - "type": "int64", - "optional": false, - "field": "data_collection_order" - } - ], - "optional": true, - "name": "event.block", - "version": 1, - "field": "transaction" - } - ] - } + "payload": { + "source": { + "version": "2.4.0.Final", + "connector": "TiCDC", + "name": "test_cluster", + "ts_ms": 0, + "snapshot": "false", + "db": "test", + "table": "table1", + "server_id": 0, + "gtid": null, + "file": "", + "pos": 0, + "row": 0, + "thread": 0, + "query": null, + "commit_ts": 1, + "cluster_id": "test_cluster" + }, + "ts_ms": 1701326309000, + "transaction": null, + "op": "u", + "before": { "tiny": 2 }, + "after": { "tiny": 1 } + }, + "schema": { + "type": "struct", + "optional": false, + "name": "test_cluster.test.table1.Envelope", + "version": 1, + "fields": [ + { + "type": "struct", + "optional": true, + "name": "test_cluster.test.table1.Value", + "field": "before", + "fields": [{ "type": "int16", "optional": true, "field": "tiny" }] + }, + { + "type": "struct", + "optional": true, + "name": "test_cluster.test.table1.Value", + "field": "after", + "fields": [{ "type": "int16", "optional": true, "field": "tiny" }] + }, + { + "type": "struct", + "fields": [ + { "type": "string", "optional": false, "field": "version" }, + { "type": "string", "optional": false, "field": "connector" }, + { "type": "string", "optional": false, "field": "name" }, + { "type": "int64", "optional": false, "field": "ts_ms" }, + { + "type": "string", + "optional": true, + "name": "io.debezium.data.Enum", + "version": 1, + "parameters": { "allowed": 
"true,last,false,incremental" }, + "default": "false", + "field": "snapshot" + }, + { "type": "string", "optional": false, "field": "db" }, + { "type": "string", "optional": true, "field": "sequence" }, + { "type": "string", "optional": true, "field": "table" }, + { "type": "int64", "optional": false, "field": "server_id" }, + { "type": "string", "optional": true, "field": "gtid" }, + { "type": "string", "optional": false, "field": "file" }, + { "type": "int64", "optional": false, "field": "pos" }, + { "type": "int32", "optional": false, "field": "row" }, + { "type": "int64", "optional": true, "field": "thread" }, + { "type": "string", "optional": true, "field": "query" } + ], + "optional": false, + "name": "io.debezium.connector.mysql.Source", + "field": "source" + }, + { "type": "string", "optional": false, "field": "op" }, + { "type": "int64", "optional": true, "field": "ts_ms" }, + { + "type": "struct", + "fields": [ + { "type": "string", "optional": false, "field": "id" }, + { "type": "int64", "optional": false, "field": "total_order" }, + { + "type": "int64", + "optional": false, + "field": "data_collection_order" + } + ], + "optional": true, + "name": "event.block", + "version": 1, + "field": "transaction" + } + ] + } } ``` @@ -584,13 +584,13 @@ TiCDC encodes a Checkpoint event into a Kafka message, with both the key and val ```json { - "payload": {}, - "schema": { - "fields": [], - "optional": false, - "name": "test_cluster.watermark.Key", - "type": "struct" - } + "payload": {}, + "schema": { + "fields": [], + "optional": false, + "name": "test_cluster.watermark.Key", + "type": "struct" + } } ``` @@ -604,160 +604,160 @@ The fields are explained as follows: ```json { - "payload": { - "source": { - "version": "2.4.0.Final", - "connector": "TiCDC", - "name": "test_cluster", - "ts_ms": 0, - "snapshot": "false", - "db": "", - "table": "", - "server_id": 0, - "gtid": null, - "file": "", - "pos": 0, - "row": 0, - "thread": 0, - "query": null, - "commit_ts": 3, - 
"cluster_id": "test_cluster" - }, - "op": "m", - "ts_ms": 1701326309000, - "transaction": null - }, - "schema": { - "type": "struct", - "optional": false, - "name": "test_cluster.watermark.Envelope", - "version": 1, - "fields": [ - { - "type": "struct", - "fields": [ - { - "type": "string", - "optional": false, - "field": "version" - }, - { - "type": "string", - "optional": false, - "field": "connector" - }, - { - "type": "string", - "optional": false, - "field": "name" - }, - { - "type": "int64", - "optional": false, - "field": "ts_ms" - }, - { - "type": "string", - "optional": true, - "name": "io.debezium.data.Enum", - "version": 1, - "parameters": { - "allowed": "true,last,false,incremental" - }, - "default": "false", - "field": "snapshot" - }, - { - "type": "string", - "optional": false, - "field": "db" - }, - { - "type": "string", - "optional": true, - "field": "sequence" - }, - { - "type": "string", - "optional": true, - "field": "table" - }, - { - "type": "int64", - "optional": false, - "field": "server_id" - }, - { - "type": "string", - "optional": true, - "field": "gtid" - }, - { - "type": "string", - "optional": false, - "field": "file" - }, - { - "type": "int64", - "optional": false, - "field": "pos" - }, - { - "type": "int32", - "optional": false, - "field": "row" - }, - { - "type": "int64", - "optional": true, - "field": "thread" - }, - { - "type": "string", - "optional": true, - "field": "query" - } - ], - "optional": false, - "name": "io.debezium.connector.mysql.Source", - "field": "source" - }, - { - "type": "string", - "optional": false, - "field": "op" - }, - { - "type": "int64", - "optional": true, - "field": "ts_ms" - }, - { - "type": "struct", - "fields": [ - { - "type": "string", - "optional": false, - "field": "id" - }, - { - "type": "int64", - "optional": false, - "field": "total_order" - }, - { - "type": "int64", - "optional": false, - "field": "data_collection_order" - } - ], - "optional": true, - "name": "event.block", - "version": 1, - 
"field": "transaction" - } - ] - } + "payload": { + "source": { + "version": "2.4.0.Final", + "connector": "TiCDC", + "name": "test_cluster", + "ts_ms": 0, + "snapshot": "false", + "db": "", + "table": "", + "server_id": 0, + "gtid": null, + "file": "", + "pos": 0, + "row": 0, + "thread": 0, + "query": null, + "commit_ts": 3, + "cluster_id": "test_cluster" + }, + "op": "m", + "ts_ms": 1701326309000, + "transaction": null + }, + "schema": { + "type": "struct", + "optional": false, + "name": "test_cluster.watermark.Envelope", + "version": 1, + "fields": [ + { + "type": "struct", + "fields": [ + { + "type": "string", + "optional": false, + "field": "version" + }, + { + "type": "string", + "optional": false, + "field": "connector" + }, + { + "type": "string", + "optional": false, + "field": "name" + }, + { + "type": "int64", + "optional": false, + "field": "ts_ms" + }, + { + "type": "string", + "optional": true, + "name": "io.debezium.data.Enum", + "version": 1, + "parameters": { + "allowed": "true,last,false,incremental" + }, + "default": "false", + "field": "snapshot" + }, + { + "type": "string", + "optional": false, + "field": "db" + }, + { + "type": "string", + "optional": true, + "field": "sequence" + }, + { + "type": "string", + "optional": true, + "field": "table" + }, + { + "type": "int64", + "optional": false, + "field": "server_id" + }, + { + "type": "string", + "optional": true, + "field": "gtid" + }, + { + "type": "string", + "optional": false, + "field": "file" + }, + { + "type": "int64", + "optional": false, + "field": "pos" + }, + { + "type": "int32", + "optional": false, + "field": "row" + }, + { + "type": "int64", + "optional": true, + "field": "thread" + }, + { + "type": "string", + "optional": true, + "field": "query" + } + ], + "optional": false, + "name": "io.debezium.connector.mysql.Source", + "field": "source" + }, + { + "type": "string", + "optional": false, + "field": "op" + }, + { + "type": "int64", + "optional": true, + "field": "ts_ms" + }, 
+ { + "type": "struct", + "fields": [ + { + "type": "string", + "optional": false, + "field": "id" + }, + { + "type": "int64", + "optional": false, + "field": "total_order" + }, + { + "type": "int64", + "optional": false, + "field": "data_collection_order" + } + ], + "optional": true, + "name": "event.block", + "version": 1, + "field": "transaction" + } + ] + } } ``` @@ -775,7 +775,6 @@ The key fields of the preceding JSON data are explained as follows: | schema.optional| Boolean | Indicates whether the field is optional. When it is `true`, the field is optional. | | schema.type | String | The data type of the field. | - ### Data type mapping The data format mapping in the TiCDC Debezium message basically follows the [Debezium data type mapping rules](https://debezium.io/documentation/reference/2.4/connectors/mysql.html#mysql-data-types), which is generally consistent with the native message of the Debezium Connector for MySQL. However, for some data types, the following differences exist between TiCDC Debezium and Debezium Connector messages: @@ -797,5 +796,5 @@ The data format mapping in the TiCDC Debezium message basically follows the [Deb - Debezium converts charsetName to "utf8mb4" when column COLLATE is "utf8_unicode_ci" and CHARACTER is null, but TiCDC does not. - Debezium escapes character, but TiCDC does not. for example, Debezium encode ENUM elements ('c', 'd', 'g,''h') to ('c','d','g,\'\'h') - + - TiCDC converts the default value of TIME like '1000-00-00 01:00:00.000' to "1000-00-00", but Debezium does not. 
From abbe96d51923d1f90ea31b7c77b2c624cbf00ab9 Mon Sep 17 00:00:00 2001 From: lilin90 Date: Mon, 6 Jan 2025 12:02:31 +0800 Subject: [PATCH 13/17] ticdc: update format --- ticdc/ticdc-debezium.md | 11 +++++------ 1 file changed, 5 insertions(+), 6 deletions(-) diff --git a/ticdc/ticdc-debezium.md b/ticdc/ticdc-debezium.md index de9ca451ab936..e34151e4891db 100644 --- a/ticdc/ticdc-debezium.md +++ b/ticdc/ticdc-debezium.md @@ -66,7 +66,6 @@ The fields in the key only include database name. The fields are explained as fo | schema.optional | Boolean | Indicates whether the field is optional. When it is `true`, the field is optional. | | schema.version | String | The schema version. | - #### Value format ```json @@ -451,11 +450,11 @@ The fields in the key only include primary key or unique index columns. The fiel | Field | Type | Description | |:------------------|:--------|:----------------------------------------------------------------------------| -| payload | JSON | The information about primary key or unique index columns. The key and value in each field represent the column name and its current value, respectively. | -| schema.fields | JSON | The type information of each field in the payload, including the schema information of the row data before and after the change. | -| schema.name` | String | The name of the schema, in the `"{cluster-name}.{schema-name}.{table-name}.Key"` format. | -| schema.optional | Boolean | Indicates whether the field is optional. When it is `true`, the field is optional. | -| schema.type | String | The data type of the field. | +| `payload` | JSON | The information about primary key or unique index columns. The key and value in each field represent the column name and its current value, respectively. | +| `schema.fields` | JSON | The type information of each field in the payload, including the schema information of the row data before and after the change. 
| +| `schema.name` | String | The name of the schema, in the `"{cluster-name}.{schema-name}.{table-name}.Key"` format. | +| `schema.optional` | Boolean | Indicates whether the field is optional. When it is `true`, the field is optional. | +| `schema.type` | String | The data type of the field. | #### Value format From bc9013d483e164589a40b7cce1cb2e741dc0e023 Mon Sep 17 00:00:00 2001 From: lilin90 Date: Mon, 6 Jan 2025 12:14:34 +0800 Subject: [PATCH 14/17] ticdc: avoid manual line breaks --- ticdc/ticdc-debezium.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/ticdc/ticdc-debezium.md b/ticdc/ticdc-debezium.md index e34151e4891db..f5d60c889f194 100644 --- a/ticdc/ticdc-debezium.md +++ b/ticdc/ticdc-debezium.md @@ -14,7 +14,9 @@ When you use Kafka as the downstream sink, specify the `protocol` field as `debe There are three types of Events: DDL Event: Represents a DDL change record. It is sent after an upstream DDL statement is successfully executed. The DDL Event is sent to the MQ Partition with the index being 0. + DML Event: Represents a row data change record. This type of Event is sent when a row change occurs. It contains the information about the row after the change occurs. + WATERMARK Event: Represents a special time point. It indicates that the Events received before this point is complete. The configuration example for using the Debezium message format is as follows: From e6311da8d8719bf3aedba147776e7c32b76855ef Mon Sep 17 00:00:00 2001 From: lilin90 Date: Mon, 6 Jan 2025 14:32:03 +0800 Subject: [PATCH 15/17] ticdc: fix unclosed tags --- ticdc/ticdc-debezium.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/ticdc/ticdc-debezium.md b/ticdc/ticdc-debezium.md index f5d60c889f194..3383978a69d16 100644 --- a/ticdc/ticdc-debezium.md +++ b/ticdc/ticdc-debezium.md @@ -403,7 +403,7 @@ The key fields of the preceding JSON data are explained as follows: | payload.source.table | String | The name of the table where the event occurs. 
| | payload.tableChanges | Array | A structured representation of the entire table schema after the schema change. The tableChanges field contains an array that includes entries for each column of the table. Because the structured representation presents data in JSON or Avro format, consumers can easily read messages without first processing them through a DDL parser. | | payload.tableChanges.type | String | Describes the kind of change. The value is one of the following: CREATE Table created. ALTER Table modified. DROP Table deleted. | -| payload.tableChanges.id | String | Full identifier of the table that was created, altered, or dropped. In the case of a table rename, this identifier is a concatenation of <old>,<new> table names. | +| payload.tableChanges.id | String | Full identifier of the table that was created, altered, or dropped. In the case of a table rename, this identifier is a concatenation of `<old>` and `<new>` table names. | | payload.tableChanges.table.defaultCharsetName | string | The charset of the table where the event occurs. | | payload.tableChanges.table.primaryKeyColumnNames | string | List of columns that compose the table's primary key. | | payload.tableChanges.table.columns | Array | Metadata for each column in the changed table. | From 5c2a154bd55740ce4fe423b76b42ae803262afa8 Mon Sep 17 00:00:00 2001 From: nhsmw Date: Fri, 10 Jan 2025 12:32:39 +0800 Subject: [PATCH 16/17] Update ticdc-debezium.md --- ticdc/ticdc-debezium.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/ticdc/ticdc-debezium.md b/ticdc/ticdc-debezium.md index 3383978a69d16..f2d6a1d66bb87 100644 --- a/ticdc/ticdc-debezium.md +++ b/ticdc/ticdc-debezium.md @@ -5,7 +5,7 @@ summary: Learn the concept of the TiCDC Debezium Protocol and how to use it. # TiCDC Debezium Protocol -[Debezium](https://debezium.io/) is a tool for capturing database changes. It converts each captured database change into a message called an "event" and sends these events to Kafka.
Starting from v8.0.0, TiCDC supports sending TiDB changes to Kafka using a Debezium style output format, simplifying migration from MySQL databases for users who had previously been using Debezium's MySQL integration. Starting from v9.1, TiCDC supports DDL events and watermark events. +[Debezium](https://debezium.io/) is a tool for capturing database changes. It converts each captured database change into a message called an "event" and sends these events to Kafka. Starting from v8.0.0, TiCDC supports sending TiDB changes to Kafka using a Debezium style output format, simplifying migration from MySQL databases for users who had previously been using Debezium's MySQL integration. Starting from v9.0, TiCDC supports DDL events and watermark events. ## Use the Debezium message format @@ -788,7 +788,7 @@ The data format mapping in the TiCDC Debezium message basically follows the [Deb - TiCDC converts REAL to DOUBLE, and converts BOOLEAN to TINYINT(1) when the length is one. -- In TiCDC, BLOB, TEXT, GEOMETRY, or JSON column can't have a default value +- In TiCDC, BLOB, TEXT, GEOMETRY, and JSON columns do not have a default value. - Debezium FLOAT data convert "5.61" to "5.610000133514404", but TiCDC does not. @@ -796,6 +796,6 @@ The data format mapping in the TiCDC Debezium message basically follows the [Deb - Debezium converts charsetName to "utf8mb4" when column COLLATE is "utf8_unicode_ci" and CHARACTER is null, but TiCDC does not. -- Debezium escapes character, but TiCDC does not. for example, Debezium encode ENUM elements ('c', 'd', 'g,''h') to ('c','d','g,\'\'h') +- Debezium escapes ENUM elements, but TiCDC does not. For example, Debezium encodes ENUM elements ('c', 'd', 'g,''h') as ('c','d','g,\'\'h'). - TiCDC converts the default value of TIME like '1000-00-00 01:00:00.000' to "1000-00-00", but Debezium does not.
From b4fbf1d0a0749b5bac1b1d481e08f8115b301ea1 Mon Sep 17 00:00:00 2001 From: nhsmw Date: Mon, 13 Jan 2025 10:59:04 +0800 Subject: [PATCH 17/17] Update ticdc-debezium.md --- ticdc/ticdc-debezium.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/ticdc/ticdc-debezium.md b/ticdc/ticdc-debezium.md index f2d6a1d66bb87..1ffabd19e74eb 100644 --- a/ticdc/ticdc-debezium.md +++ b/ticdc/ticdc-debezium.md @@ -17,7 +17,7 @@ DDL Event: Represents a DDL change record. It is sent after an upstream DDL stat DML Event: Represents a row data change record. This type of Event is sent when a row change occurs. It contains the information about the row after the change occurs. -WATERMARK Event: Represents a special time point. It indicates that the Events received before this point is complete. +WATERMARK Event: Represents a special time point. It indicates that the Events received before this point are complete. It applies only to the TiDB extension field and takes effect when you set `enable-tidb-extension` to `true` in `sink-uri`. The configuration example for using the Debezium message format is as follows: