From dab6cdfd58e37f71ab93fb6fd8194314e4e1ec00 Mon Sep 17 00:00:00 2001
From: Nikhil Benesch
Date: Sun, 10 Dec 2023 17:24:47 -0500
Subject: [PATCH] doc/user: document configurable group and transactional IDs for Kafka

This commit adds documentation for the features added in #23792. See that PR for details.
---
 doc/user/content/sql/create-sink/kafka.md     | 17 +++++++-----
 doc/user/content/sql/create-source/kafka.md   | 27 +++++++------------
 .../content/sql/system-catalog/mz_internal.md |  2 +-
 3 files changed, 21 insertions(+), 25 deletions(-)

diff --git a/doc/user/content/sql/create-sink/kafka.md b/doc/user/content/sql/create-sink/kafka.md
index 56d8816e4a823..b794d4c996bed 100644
--- a/doc/user/content/sql/create-sink/kafka.md
+++ b/doc/user/content/sql/create-sink/kafka.md
@@ -56,10 +56,13 @@ _item_name_ | The name of the source, table or materialized view you want

 ### `CONNECTION` options

-Field | Value | Description
----------------------|--------|------------
-`TOPIC` | `text` | The prefix used to generate the Kafka topic name to create and write to.
-`COMPRESSION TYPE` | `text` | Default: `none`. The type of compression to apply to messages before they are sent to Kafka: `none`, `gzip`, `snappy`, `lz4`, or `zstd`.
+Field | Value | Description
+---------------------------|--------|------------
+`TOPIC` | `text` | The name of the Kafka topic to write to.
+`COMPRESSION TYPE` | `text` | Default: `none`. The type of compression to apply to messages before they are sent to Kafka: `none`, `gzip`, `snappy`, `lz4`, or `zstd`.
+`TRANSACTIONAL ID PREFIX` | `text` | The prefix of the transactional ID to use when producing to the Kafka topic. Default: `materialize-{REGION ID}-{CONNECTION ID}-{SINK ID}`
+`PROGRESS GROUP ID PREFIX` | `text` | The prefix of the consumer group ID to use when reading from the progress topic. Default: `materialize-{REGION ID}-{CONNECTION ID}-{SINK ID}`
+

 ### CSR `CONNECTION` options

@@ -391,9 +394,9 @@ to perform the following operations on the following resources:
 Operation type | Resource type | Resource name
 ----------------|------------------|--------------
 Read, Write | Topic | Consult `mz_kafka_connections.sink_progress_topic` for the sink's connection
-Write | Topic | The specified `TOPIC` option
-Write | Transactional ID | `mz-producer-{SINK ID}-*`
-Read | Group | `materialize-bootstrap-sink-{SINK ID}`
+Write | Topic | The specified [`TOPIC` option](#connection-options)
+Write | Transactional ID | All transactional IDs beginning with the specified [`TRANSACTIONAL ID PREFIX` option](#connection-options)
+Read | Group | All group IDs beginning with the specified [`PROGRESS GROUP ID PREFIX` option](#connection-options)

 When using [automatic topic creation](#automatic-topic-creation), Materialize
 additionally requires access to the following operations:
diff --git a/doc/user/content/sql/create-source/kafka.md b/doc/user/content/sql/create-source/kafka.md
index c1c9bdc6ae666..8f0b9044b6714 100644
--- a/doc/user/content/sql/create-source/kafka.md
+++ b/doc/user/content/sql/create-source/kafka.md
@@ -49,6 +49,7 @@ The same syntax, supported formats and features can be used to connect to a [Red
 Field | Value | Description
 -------------------------------------|-----------|-------------------------------------
 `TOPIC` | `text` | The Kafka topic you want to subscribe to.
+`GROUP ID PREFIX` | `text` | The prefix of the consumer group ID to use. See [Monitoring consumer lag](#monitoring-consumer-lag). Default: `materialize-{REGION-ID}-{CONNECTION-ID}-{SOURCE_ID}`

 ### `WITH` options

@@ -362,22 +363,20 @@ provided solely for the benefit of Kafka monitoring tools.
 {{< /note >}}

 Committed offsets are associated with a consumer group specific to the source.
-The ID of the consumer group has a prefix with the following format:
-
-```
-materialize-{REGION-ID}-{CONNECTION-ID}-{SOURCE_ID}
-```
+The ID of the consumer group consists of the prefix configured with the [`GROUP
+ID PREFIX` option](#connection-options) followed by a Materialize-generated
+suffix.

 You should not make assumptions about the number of consumer groups that
 Materialize will use to consume from a given source. The only guarantee is that
-the ID of each consumer group will begin with the above prefix.
+the ID of each consumer group will begin with the configured prefix.

-The rendered consumer group ID prefix for each Kafka source in the system is
-available in the `group_id_base` column of the [`mz_kafka_sources`] table. To
-look up the `group_id_base` for a source by name, use:
+The consumer group ID prefix for each Kafka source in the system is available in
+the `group_id_prefix` column of the [`mz_kafka_sources`] table. To look up the
+`group_id_prefix` for a source by name, use:

 ```sql
-SELECT group_id_base
+SELECT group_id_prefix
 FROM mz_internal.mz_kafka_sources ks
 JOIN mz_sources s ON s.id = ks.id
 WHERE s.name = ''
@@ -391,13 +390,7 @@ to perform the following operations on the following resources:
 Operation type | Resource type | Resource name
 ---------------|------------------|--------------
 Read | Topic | The specified `TOPIC` option
-
-To allow Materialize to [commit offsets](#monitoring-consumer-lag) to the Kafka
-broker, Materialize additionally requires access to the following operations:
-
-Operation type | Resource type | Resource name
----------------|------------------|--------------
-Read | Group | `materialize-{REGION-ID}-{CONNECTION-ID}-{SOURCE_ID}*`
+Read | Group | All group IDs starting with the specified [`GROUP ID PREFIX` option](#connection-options)

 ## Examples

diff --git a/doc/user/content/sql/system-catalog/mz_internal.md b/doc/user/content/sql/system-catalog/mz_internal.md
index 277defb2084d1..05029720d267c 100644
--- a/doc/user/content/sql/system-catalog/mz_internal.md
+++ b/doc/user/content/sql/system-catalog/mz_internal.md
@@ -286,7 +286,7 @@ The `mz_kafka_sources` table contains a row for each Kafka source in the system.
 | Field | Type | Meaning |
 |------------------------|----------------|-----------------------------------------------------------------------------------------------------------|
 | `id` | [`text`] | The ID of the Kafka source. Corresponds to [`mz_catalog.mz_sources.id`](../mz_catalog#mz_sources). |
-| `group_id_prefix` | [`text`] | The prefix of the group ID that Materialize will use when consuming data for the Kafka source. |
+| `group_id_prefix` | [`text`] | The value of the `GROUP ID PREFIX` connection option. |

 ### `mz_materialization_lag`