
CDC: 19.1 updates #4403

Merged: 14 commits merged from the cdc branch into master on Apr 3, 2019
Conversation

@lnhsingh (Contributor) commented Feb 20, 2019

Changes addressing #3992:

  • Added a Responses section to CREATE CHANGEFEED explaining which messages are emitted to a Kafka topic for DML statements.
  • Added more description for the updated and resolved timestamps, and for cursor.
  • Removed results_buffer_size from the docs.
  • Added info about schema changes with column backfill.
  • Added info about cloud storage sinks.
  • Added Avro data types.
  • Added info about how to debug changefeeds.

Misc changes:

  • Added / edited Avro core changefeed instructions

Closes #3992.
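
For orientation, a minimal sketch of the kind of statement these sections document. The Kafka address is a placeholder, and the options shown are the ones discussed in this PR (updated and resolved); it is not copied from the docs.

~~~ sql
-- Enterprise changefeed emitting to a Kafka sink with updated and resolved timestamps.
> CREATE CHANGEFEED FOR TABLE office_dogs
    INTO 'kafka://localhost:9092'
    WITH updated, resolved = '10s';
~~~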

@cockroach-teamcity (Member) commented:

This change is Reviewable

Lauren added 5 commits March 28, 2019 14:37
Changes include:

- Added / edited Avro core changefeed instructions

Minor edit

Add expected responses for enterprise changefeeds

CDC updates

- Fix broken links
- Add info about cursor
- Add info about updated timestamps
- Add info about schema changes with backfill
Minor edits / links
@lnhsingh lnhsingh changed the title from "(WIP) CDC: 19.1 updates" to "CDC: 19.1 updates" on Mar 28, 2019
@lnhsingh lnhsingh marked this pull request as ready for review March 28, 2019 18:39
@lnhsingh lnhsingh requested review from danhhz and rolandcrosby March 28, 2019 18:39
@lnhsingh lnhsingh requested a review from Amruta-Ranade March 28, 2019 18:39
@rolandcrosby left a comment

looking good so far!

Reviewed 4 of 8 files at r2, 1 of 2 files at r3.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @Amruta-Ranade, @danhhz, @lhirata, and @rolandcrosby)


v19.1/change-data-capture.md, line 82 at r3 (raw file):

Rows that have been backfilled by a schema change are always re-emitted because Avro's default schema change functionality is not powerful enough to represent the schema changes that CockroachDB supports (e.g., CockroachDB columns can have default values that are arbitrary SQL expressions, but Avro only supports static default values).

To ensure that the Avro schemas that CockroachDB publishes will work with the (undocumented and inconsistent) schema compatibility rules used by the Confluent schema registry, CockroachDB emits all fields in Avro as nullable unions. This ensures that Avro and Confluent consider the schemas to be both backward- and forward-compatible. Note that the original CockroachDB column definition is also included in the schema as a doc field, so it's still possible to distinguish between a `NOT NULL` CockroachDB column and a `NULL` CockroachDB column.

on second thought, that parenthetical I added about the schema compatibility rules is a bit gratuitous


v19.1/change-data-capture.md, line 146 at r3 (raw file):

{% include copy-clipboard.html %}
~~~ sql
> CREATE CHANGEFEED FOR TABLE name INTO 'schema://host:port';
~~~

nit: scheme


v19.1/change-data-capture.md, line 213 at r3 (raw file):

{{site.data.alerts.callout_info}}
Debugging is only available for enterprise changefeeds.

"debugging is only available" sounds a little strange. Maybe "This section only applies to enterprise changefeeds using Kafka"?


v19.1/change-data-capture.md, line 216 at r3 (raw file):

{{site.data.alerts.end}}

For changefeeds connected to Kafka, use log information to debug connection issues (i.e., `kafka: client has run out of available brokers to talk to (Is your cluster reachable?)`). Debug by looking for lines in the logs with `[kafka-producer]` in them:

Link 'log information' to a page explaining CockroachDB's log files (assuming we have one)
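
As a supplementary, hedged sketch (not part of the docs hunk above): a changefeed job's status and error message can also be inspected from SQL. The `job_type` filter assumes the standard `SHOW JOBS` output columns.

~~~ sql
-- List changefeed jobs with their status and most recent error, if any.
> SELECT job_id, status, error
    FROM [SHOW JOBS]
   WHERE job_type = 'CHANGEFEED';
~~~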


v19.1/change-data-capture.md, line 243 at r3 (raw file):

    {% include copy-clipboard.html %}
    ~~~ shell
    oach sql --url="postgresql://[email protected]:26257?sslmode=disable" --format=csv
    ~~~

what happened to the beginning of this line?


v19.1/change-data-capture.md, line 306 at r3 (raw file):

    ~~~

### Create a core changefeed in Avro

"in Avro" sounds odd to me; maybe "using the Avro output format" or something?


v19.1/change-data-capture.md, line 308 at r3 (raw file):

### Create a core changefeed in Avro

<span class="version-tag">New in v19.1:</span> In this example, you'll set up a core changefeed for a single-node cluster that emits [Avro](https://docs.confluent.io/current/schema-registry/docs/serializer-formatter.html#wire-format) records.

Add a quick explanation of what the Confluent stuff is for - like "The binary Avro encoding convention used by CockroachDB uses the Confluent Schema Registry to store Avro schemas"
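
For illustration, a hedged sketch of a core changefeed emitting Avro records through a Confluent Schema Registry. The registry address is a placeholder, and the option names assume the v19.1 syntax (`format = experimental_avro`, `confluent_schema_registry`).

~~~ sql
-- Core changefeed using Avro; schemas are registered with the Confluent Schema Registry.
> EXPERIMENTAL CHANGEFEED FOR office_dogs
    WITH format = experimental_avro, confluent_schema_registry = 'http://localhost:8081';
~~~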


v19.1/change-data-capture.md, line 752 at r3 (raw file):

{% include {{ page.version.version }}/misc/experimental-warning.md %}

<span class="version-tag">New in v19.1:</span> In this example, you'll set up a changefeed for a single-node cluster that is connected to an AWS sink. Note that you can set up changefeeds for any of [these cloud storages](create-changefeed.html#cloud-storage-sink).

nit: cloud storage providers (also maybe say AWS S3 instead of just AWS)


v19.1/change-data-capture.md, line 828 at r3 (raw file):

    {% include copy-clipboard.html %}
    ~~~ sql
    > CREATE CHANGEFEED FOR TABLE office_dogs INTO 'experimental-s3://test-s3encryption/test?AWS_ACCESS_KEY_ID=enter_key-here&AWS_SECRET_ACCESS_KEY=enter_key_here' with updated, resolved='10s';
    ~~~

'test-s3encryption' is a slightly confusing name, maybe just 'example-bucket-name'?


v19.1/create-changefeed.md, line 61 at r3 (raw file):

----------+-------+---------------
`topic_prefix` | [`STRING`](string.html) | Adds a prefix to all of the topic names.<br><br>For example, `CREATE CHANGEFEED FOR TABLE foo INTO 'kafka://...?topic_prefix=bar_'` would emit rows under the topic `bar_foo` instead of `foo`.
`tls_enabled=true` | [`BOOL`](bool.html) | If `true`, use a Transport Layer Security (TLS) connection. This can be used with a `ca_cert` (see below).

"If true, enable Transport Layer Security on the connection to Kafka"


v19.1/create-changefeed.md, line 63 at r3 (raw file):

`tls_enabled=true` | [`BOOL`](bool.html) | If `true`, use a Transport Layer Security (TLS) connection. This can be used with a `ca_cert` (see below).
`ca_cert` | [`STRING`](string.html) | The base64-encoded `ca_cert` file.<br><br>Note: To encode your `ca.cert`, run `base64 -w 0 ca.cert`.
`sasl_enabled` | [`BOOL`](bool.html) | If `true`, use Simple Authentication and Security Layer (SASL) to authenticate. This requires a `sasl_user` and `sasl_password` (see below).

specifically SASL/PLAIN (link to https://docs.confluent.io/current/kafka/authentication_sasl/authentication_sasl_plain.html)
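
Putting the parameters above together, a hedged sketch of a Kafka sink URI with TLS and SASL/PLAIN enabled. The broker host, certificate, and credentials are placeholders.

~~~ sql
-- Kafka sink secured with TLS and SASL/PLAIN; ca_cert is the base64-encoded ca.cert.
> CREATE CHANGEFEED FOR TABLE office_dogs
    INTO 'kafka://broker.example.com:9093?tls_enabled=true&ca_cert=<base64-encoded ca.cert>&sasl_enabled=true&sasl_user=<user>&sasl_password=<password>'
    WITH updated;
~~~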


v19.1/create-changefeed.md, line 69 at r3 (raw file):

#### Cloud storage sink

Example of a cloud storage sink (i.e., AWS) URI:

AWS S3


v19.1/create-changefeed.md, line 87 at r3 (raw file):

Option | Value | Description
-------|-------|------------
`updated` | N/A | Include updated timestamps with each row.<br><br>If a `cursor` is provided, the "updated" timestamps will match the [MVCC](../v19.1/architecture/storage-layer.html#mvcc) timestamps of the emitted rows, and there is no initial scan.. If a `cursor` is not provided, the changefeed will perform an initial scan (as of the time the changefeed was created), and the "updated" timestamp for each change record emitted in the initial scan will be the timestamp of the initial scan. Similarly, when a [backfill is performed for a schema change](change-data-capture.html#schema-changes-with-column-backfill), the "updated" timestamp is set to the first timestamp for when the new schema is valid.

nit: .. -> .


v19.1/create-changefeed.md, line 128 at r3 (raw file):

## Responses

The messages (i.e., keys and values) emitted to a Kafka topic are composed of the following:

this is specific to the envelope format specified by the user (the default format is 'wrapped' I believe, which produces this output)


v19.1/create-changefeed.md, line 130 at r3 (raw file):

The messages (i.e., keys and values) emitted to a Kafka topic are composed of the following:

- **Key**: Always composed of the table's `PRIMARY KEY` field (e.g., `[1]` or `{"id":1}`).

specifically, the key is an array of the primary key fields of the row


v19.1/create-changefeed.md, line 131 at r3 (raw file):

- **Key**: Always composed of the table's `PRIMARY KEY` field (e.g., `[1]` or `{"id":1}`).
- **Value**:

should specify that there are three possible level fields in the value of a record emitted to CDC:

  • after, which contains the state of the row after the update (or 'null' for deletes)
  • updated, which contains the updated timestamp
  • resolved, which is emitted for records representing resolved timestamps (these records won't include an after field since they only function as checkpoints)
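
To make the three fields concrete, a hedged sketch; the sample key/value lines in the comments are illustrative, not copied from the docs.

~~~ sql
> CREATE CHANGEFEED FOR TABLE office_dogs
    INTO 'kafka://localhost:9092'
    WITH updated, resolved;
-- An emitted row update might look like (illustrative):
--   key:   [1]
--   value: {"after": {"id": 1, "name": "Petee"}, "updated": "1536242855577149065.0000000000"}
-- A resolved-timestamp record carries no "after" field (illustrative):
--   value: {"resolved": "1536242856000000000.0000000000"}
~~~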

@lnhsingh (Contributor, Author) left a comment

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @Amruta-Ranade, @danhhz, and @rolandcrosby)


v19.1/change-data-capture.md, line 82 at r3 (raw file):

Previously, rolandcrosby (Roland Crosby) wrote…

on second thought, that parenthetical I added about the schema compatibility rules is a bit gratuitous

Removed


v19.1/change-data-capture.md, line 146 at r3 (raw file):

Previously, rolandcrosby (Roland Crosby) wrote…

nit: scheme

Done.


v19.1/change-data-capture.md, line 213 at r3 (raw file):

Previously, rolandcrosby (Roland Crosby) wrote…

"debugging is only available" sounds a little strange. Maybe "This section only applies to enterprise changefeeds using Kafka"?

On second thought, the callout seems redundant with the first sentence. Removing.


v19.1/change-data-capture.md, line 216 at r3 (raw file):

Previously, rolandcrosby (Roland Crosby) wrote…

Link 'log information' to a page explaining CockroachDB's log files (assuming we have one)

Done.


v19.1/change-data-capture.md, line 243 at r3 (raw file):

Previously, rolandcrosby (Roland Crosby) wrote…

what happened to the beginning of this line?

Weird. Fixed.


v19.1/change-data-capture.md, line 306 at r3 (raw file):

Previously, rolandcrosby (Roland Crosby) wrote…

"in Avro" sounds odd to me; maybe "using the Avro output format" or something?

Does "using Avro" make sense?


v19.1/change-data-capture.md, line 308 at r3 (raw file):

Previously, rolandcrosby (Roland Crosby) wrote…

Add a quick explanation of what the Confluent stuff is for - like "The binary Avro encoding convention used by CockroachDB uses the Confluent Schema Registry to store Avro schemas"

Edited


v19.1/change-data-capture.md, line 752 at r3 (raw file):

Previously, rolandcrosby (Roland Crosby) wrote…

nit: cloud storage providers (also maybe say AWS S3 instead of just AWS)

Done.


v19.1/change-data-capture.md, line 828 at r3 (raw file):

Previously, rolandcrosby (Roland Crosby) wrote…

'test-s3encryption' is a slightly confusing name, maybe just 'example-bucket-name'?

Done.


v19.1/create-changefeed.md, line 61 at r3 (raw file):

Previously, rolandcrosby (Roland Crosby) wrote…

"If true, enable Transport Layer Security on the connection to Kafka"

Done.


v19.1/create-changefeed.md, line 63 at r3 (raw file):

Previously, rolandcrosby (Roland Crosby) wrote…

specifically SASL/PLAIN (link to https://docs.confluent.io/current/kafka/authentication_sasl/authentication_sasl_plain.html)

Done.


v19.1/create-changefeed.md, line 69 at r3 (raw file):

Previously, rolandcrosby (Roland Crosby) wrote…

AWS S3

Done.


v19.1/create-changefeed.md, line 87 at r3 (raw file):

Previously, rolandcrosby (Roland Crosby) wrote…

nit: .. -> .

Done.


v19.1/create-changefeed.md, line 128 at r3 (raw file):

Previously, rolandcrosby (Roland Crosby) wrote…

this is specific to the envelope format specified by the user (the default format is 'wrapped' I believe, which produces this output)

Done.


v19.1/create-changefeed.md, line 130 at r3 (raw file):

Previously, rolandcrosby (Roland Crosby) wrote…

specifically, the key is an array of the primary key fields of the row

Done.


v19.1/create-changefeed.md, line 131 at r3 (raw file):

Previously, rolandcrosby (Roland Crosby) wrote…

should specify that there are three possible level fields in the value of a record emitted to CDC:

  • after, which contains the state of the row after the update (or 'null' for deletes)
  • updated, which contains the updated timestamp
  • resolved, which is emitted for records representing resolved timestamps (these records won't include an after field since they only function as checkpoints)

Done.

@danhhz (Contributor) left a comment

:lgtm: once roland is happy

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @Amruta-Ranade, @danhhz, @lhirata, and @rolandcrosby)


_includes/v19.1/cdc/core-url.md, line 2 at r8 (raw file):

{{site.data.alerts.callout_info}}
Because core changefeeds return results differently than other SQL statements, they require a dedicated database connection with specific settings around result buffering. In normal operation, CockroachDB improves performance by buffering results server-side before returning them to a client. Core changefeeds also have different cancellation behavior than other queries: they can only be canceled by closing the underlying connection or issuing a  [`CANCEL QUERY`](cancel-query.html) statement on a separate connection. Combined, these attributes of changefeeds mean that applications should explicitly create dedicated connections to consume changefeed data, instead of using a connection pool as most client drivers do by default.

If we're going to mention the results buffer, then in place of the sentence you deleted, we should mention that we automatically disable it for core changefeeds. I'm also okay just removing any mention of results buffering. Up to you
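
Relatedly, a hedged sketch of cancelling a core changefeed from a separate connection, per the quoted passage; the query ID shown is hypothetical.

~~~ sql
-- On the dedicated consuming connection:
> EXPERIMENTAL CHANGEFEED FOR office_dogs;

-- On a separate connection: find the changefeed's query ID, then cancel it.
> SHOW QUERIES;
> CANCEL QUERY '15f92c745fa69bd80000000000000001';  -- hypothetical query ID
~~~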


_includes/v19.1/sql/settings/settings.md, line 83 at r8 (raw file):

<tr><td><code>sql.defaults.experimental_vectorize</code></td><td>enumeration</td><td><code>0</code></td><td>default experimental_vectorize mode [off = 0, on = 1, always = 2]</td></tr>
<tr><td><code>sql.defaults.optimizer</code></td><td>enumeration</td><td><code>1</code></td><td>default cost-based optimizer mode [off = 0, on = 1, local = 2]</td></tr>
<tr><td><code>sql.defaults.results_buffer.size</code></td><td>byte size</td><td><code>16 KiB</code></td><td>default size of the buffer that accumulates results for a statement or a batch of statements before they are sent to the client. This can be overridden on an individual connection with the 'results_buffer_size' parameter. Note that auto-retries generally only happen while no results have been delivered to the client, so reducing this size can increase the number of retriable errors a client receives. On the other hand, increasing the buffer size can increase the delay until the client receives the first result row. Updating the setting only affects new connections. Setting to 0 disables any buffering.</td></tr>

This one is still true. May want to leave it.


v19.1/change-data-capture.md, line 82 at r3 (raw file):

Previously, lhirata wrote…

Removed

lol. I do think it's worth calling out (without the shade) that confluent schema registry has a different set of rules for backward and forward schema compatibility than avro does. this was surprising to me


v19.1/change-data-capture.md, line 15 at r8 (raw file):

The core feature of CDC is the [changefeed](create-changefeed.html). Changefeeds target a whitelist of tables, called the "watched rows". Every change to a watched row is emitted as a record in a configurable format (JSON or Avro) to a configurable sink ([Kafka](https://kafka.apache.org/)).

## Ordering guarantees

I gave an overview to roland once about how all these rules build up to some useful (and much easier to reason about) top-level invariants. We should document them at the top here and use that as context for all this stuff below. @rolandcrosby, do you have time to sync with lauren and go over that?

I'm happy letting this be a followup, just happened to think of it while reading the changes in this PR


v19.1/change-data-capture.md, line 80 at r8 (raw file):

When schema changes with column backfill (e.g., adding a column with a default, adding a computed column, adding a `NOT NULL` column, dropping a column) are made to watched rows, the changefeed will emit some duplicates during the backfill. When it finishes, CockroachDB outputs all watched rows using the new schema.

Rows that have been backfilled by a schema change are always re-emitted because Avro's default schema change functionality is not powerful enough to represent the schema changes that CockroachDB supports (e.g., CockroachDB columns can have default values that are arbitrary SQL expressions, but Avro only supports static default values).

"not powerful enough" feels too shade-y for my taste. can we rephrase?


v19.1/change-data-capture.md, line 80 at r8 (raw file):

When schema changes with column backfill (e.g., adding a column with a default, adding a computed column, adding a `NOT NULL` column, dropping a column) are made to watched rows, the changefeed will emit some duplicates during the backfill. When it finishes, CockroachDB outputs all watched rows using the new schema.

Rows that have been backfilled by a schema change are always re-emitted because Avro's default schema change functionality is not powerful enough to represent the schema changes that CockroachDB supports (e.g., CockroachDB columns can have default values that are arbitrary SQL expressions, but Avro only supports static default values).

the transition here to talking about avro feels abrupt
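
For example, a hedged sketch of a schema change that would trigger the column backfill described in the quoted passage (the column name is illustrative):

~~~ sql
-- Adding a column with a default value backfills existing rows, so the changefeed
-- emits some duplicates during the backfill and then all watched rows under the new schema.
> ALTER TABLE office_dogs ADD COLUMN likes_treats BOOL DEFAULT true;
~~~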


v19.1/create-changefeed.md, line 130 at r3 (raw file):

Previously, lhirata wrote…

Done.

Maybe [1] for json or {"id":1} for avro? I was confused when I first read this and only figured it out after I read the below.

Also nit: it's not an array in avro


v19.1/create-changefeed.md, line 67 at r8 (raw file):

`sasl_password` | [`STRING`](string.html) | Your SASL password.

#### Cloud storage sink

we should mention somewhere that cloud storage sink currently only works with format=json


v19.1/create-changefeed.md, line 90 at r8 (raw file):

`resolved` | [`INTERVAL`](interval.html) | Periodically emit resolved timestamps to the changefeed. Optionally, set a minimum duration between emitting resolved timestamps. If unspecified, all resolved timestamps are emitted.<br><br>Example: `resolved='10s'`
`envelope` | `key_only` / `wrapped` | Use `key_only` to emit only the key and no value, which is faster if you only want to know when the key changes.<br><br>Default: `envelope=wrapped`
`cursor` | [Timestamp](as-of-system-time.html#parameters)  | Emits any changes after the given timestamp, but does not output the current state of the table first. If `cursor` is not specified, the changefeed starts by doing an initial scan of all the watched rows and emits the current value, then moves to emitting any changes that happen after the scan.<br><br>When starting a changefeed at a specific `cursor`, the `cursor` cannot be before the configured garbage collection window (see [`gc.ttlseconds`](configure-replication-zones.html#replication-zone-variables)) for the table you're trying to follow; otherwise, the changefeed will error. By default, you cannot create a changefeed that starts more than 25 hours in the past.<br><br>`cursor` can be used to [start a new changefeed where a previous changefeed ended.](#start-a-new-changefeed-where-another-ended)<br><br>Example: `CURSOR=1536242855577149065.0000000000`

nit: "With default garbage collection settings, this means you cannot"


v19.1/create-changefeed.md, line 132 at r8 (raw file):

- **Key**: An array always composed of the row's `PRIMARY KEY` field(s) (e.g., `[1]` or `{"id":1}`).
- **Value**:
    - One of three possible level fields:

I think roland meant top-level : - )

@rolandcrosby left a comment

Reviewed 1 of 3 files at r6, 1 of 2 files at r8.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @Amruta-Ranade, @danhhz, @lhirata, and @rolandcrosby)


v19.1/change-data-capture.md, line 306 at r3 (raw file):

Previously, lhirata wrote…

Does "using Avro" make sense?

yeah, I like that


v19.1/change-data-capture.md, line 752 at r3 (raw file):

Previously, lhirata wrote…

Done.

can you change the link text to "these cloud storage providers" too?


v19.1/change-data-capture.md, line 15 at r8 (raw file):

Previously, danhhz (Daniel Harrison) wrote…

I gave an overview to roland once about how all these rules build up to some useful (and much easier to reason about) top-level invariants. We should document them at the top here and use that as context for all this stuff below. @rolandcrosby, do you have time to sync with lauren and go over that?

I'm happy letting this be a followup, just happened to think of it while reading the changes in this PR

I was just talking to Lauren offline about providing some pseudocode for "how to correctly consume a topic and interpret changefeed messages" - that might also be a good place to talk about the invariants?


v19.1/create-changefeed.md, line 130 at r3 (raw file):

Previously, danhhz (Daniel Harrison) wrote…

Maybe [1] for json or {"id":1} for avro? I was confused when I first read this and only figured it out after I read the below.

Also nit: it's not an array in avro

d'oh, I forgot it was a record in Avro, I second Dan's suggestion


v19.1/create-changefeed.md, line 67 at r8 (raw file):

Previously, danhhz (Daniel Harrison) wrote…

we should mention somewhere that cloud storage sink currently only works with format=json

good catch, yes, should specifically say it only works with JSON and always emits newline-delimited json files


v19.1/create-changefeed.md, line 132 at r8 (raw file):

Previously, danhhz (Daniel Harrison) wrote…

I think roland meant top-level : - )

yup!

@lnhsingh (Contributor, Author) left a comment

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @Amruta-Ranade, @danhhz, and @rolandcrosby)


_includes/v19.1/cdc/core-url.md, line 2 at r8 (raw file):

Previously, danhhz (Daniel Harrison) wrote…

If we're going to mention the results buffer, then in place of the sentence you deleted, we should mention that we automatically disable it for core changefeeds. I'm also okay just removing any mention of results buffering. Up to you

Added.


_includes/v19.1/sql/settings/settings.md, line 83 at r8 (raw file):

Previously, danhhz (Daniel Harrison) wrote…

This one is still true. May want to leave it.

Added back.


v19.1/change-data-capture.md, line 82 at r3 (raw file):

Previously, danhhz (Daniel Harrison) wrote…

lol. I do think it's worth calling out (without the shade) that confluent schema registry has a different set of rules for backward and forward schema compatibility than avro does. this was surprising to me

Done.


v19.1/change-data-capture.md, line 306 at r3 (raw file):

Previously, rolandcrosby (Roland Crosby) wrote…

yeah, I like that

👍


v19.1/change-data-capture.md, line 752 at r3 (raw file):

Previously, rolandcrosby (Roland Crosby) wrote…

can you change the link text to "these cloud storage providers" too?

Done.


v19.1/change-data-capture.md, line 80 at r8 (raw file):

Previously, danhhz (Daniel Harrison) wrote…

"not powerful enough" feels too shade-y for my taste. can we rephrase?

I think I can just remove and combine with the above paragraph


v19.1/change-data-capture.md, line 80 at r8 (raw file):

Previously, danhhz (Daniel Harrison) wrote…

the transition here to talking about avro feels abrupt

I feel like I was trying to shoehorn this into the section. Created a new section for it.


v19.1/create-changefeed.md, line 130 at r3 (raw file):

Previously, rolandcrosby (Roland Crosby) wrote…

d'oh, I forgot it was a record in Avro, I second Dan's suggestion

Done.


v19.1/create-changefeed.md, line 67 at r8 (raw file):

Previously, rolandcrosby (Roland Crosby) wrote…

good catch, yes, should specifically say it only works with JSON and always emits newline-delimited json files

Done.


v19.1/create-changefeed.md, line 90 at r8 (raw file):

Previously, danhhz (Daniel Harrison) wrote…

nit: "With default garbage collection settings, this means you cannot"

Done.


v19.1/create-changefeed.md, line 132 at r8 (raw file):

Previously, rolandcrosby (Roland Crosby) wrote…

yup!

My bad! Done.

@Amruta-Ranade (Contributor) left a comment

@lhirata Awesome work! 🎉


Parameter | Value | Description
----------+-------+---------------
`topic_prefix` | [`STRING`](string.html) | Adds a prefix to all of the topic names.<br><br>For example, `CREATE CHANGEFEED FOR TABLE foo INTO 'kafka://...?topic_prefix=bar_'` would emit rows under the topic `bar_foo` instead of `foo`.
@Amruta-Ranade (Contributor) commented:

nit: "to all of the" > "to all"

- **Key**: An array always composed of the row's `PRIMARY KEY` field(s) (e.g., `[1]` for `JSON` or `{"id":1}` for Avro).
- **Value**:
- One of three possible top-level fields:
- `after`, which contains the state of the row after the update (or 'null' for `DELETE`s).
@Amruta-Ranade (Contributor) commented:

nit: 'null' > null?

@lnhsingh (Contributor, Author) left a comment

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @Amruta-Ranade, @danhhz, and @rolandcrosby)


v19.1/change-data-capture.md, line 15 at r8 (raw file):

Previously, rolandcrosby (Roland Crosby) wrote…

I was just talking to Lauren offline about providing some pseudocode for "how to correctly consume a topic and interpret changefeed messages" - that might also be a good place to talk about the invariants?

FYI, moved this into a separate issue: #4590


v19.1/create-changefeed.md, line 60 at r9 (raw file):

Previously, Amruta-Ranade (Amruta Ranade) wrote…

nit: "to all of the" > "to all"

Done.


v19.1/create-changefeed.md, line 139 at r9 (raw file):

Previously, Amruta-Ranade (Amruta Ranade) wrote…

nit: 'null' > null?

Good catch. Done.

@rolandcrosby left a comment

:lgtm:

Reviewed 3 of 4 files at r9, 1 of 1 files at r10.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (and 1 stale)

@lnhsingh lnhsingh requested a review from jseldess April 3, 2019 14:50
@jseldess (Contributor) left a comment

Excellent work, @lhirata. :lgtm_strong: as long as one of the reviewers actually tested the steps.

Reviewable status: :shipit: complete! 2 of 0 LGTMs obtained (and 1 stale) (waiting on @jseldess)

@lnhsingh lnhsingh merged commit 6bba1a9 into master Apr 3, 2019
@lnhsingh lnhsingh deleted the cdc branch April 3, 2019 16:56
Successfully merging this pull request may close these issues: Change Data Capture (CDC) Iteration 2

6 participants