Transactional handling for Debezium PG CDC #81

Open
wants to merge 3 commits into
base: master

shawkins

There are some things here that could be teased apart, but it is probably good just to see what it's working towards. This was done for a POC of honoring the transactional metadata produced by Debezium, in particular for PostgreSQL.

With a config that includes new serdes and a transactions topic, such as:

topics:
  default:
    acks: "all"
    auto.offset.reset: "earliest"
    bootstrap.servers: "localhost:9092"
    client.id: "southpaw"
    group.id: "southpaw"
    enable.auto.commit: false
    key.serde.class: "com.jwplayer.southpaw.serde.DebeziumJsonSerde"
    schema.registry.url: "http://localhost:80"
    topic.class: "com.jwplayer.southpaw.topic.KafkaTopic"
    value.serde.class: "com.jwplayer.southpaw.serde.DebeziumJsonSerde"
  CustomerWithAddresses:
    compression.type: "snappy"
    jackson.serde.class: "com.jwplayer.southpaw.json.DenormalizedRecord"
    key.serde.class: "org.apache.kafka.common.serialization.Serdes$ByteArraySerde"
    topic.class: "com.jwplayer.southpaw.topic.KafkaTopic"
    topic.name: "customers-with-addresses"
    value.serde.class: "com.jwplayer.southpaw.serde.JacksonSerde"
  customer:
    topic.name: "dbserver1.inventory.customers"
  address:
    topic.name: "dbserver1.inventory.addresses"
  transactions:
    topic.name: "dbserver1.transaction"
    value.serde.class: "com.jwplayer.southpaw.serde.JsonSerde"
    key.serde.class: "com.jwplayer.southpaw.serde.JsonSerde"
    persistent: false

One can consume the events from https://github.com/debezium/debezium-examples/tree/master/kstreams-fk-join with the connector configured with "provide.transaction.metadata": "true" and emit denormalizations that are consistent with the transaction boundaries. If transaction metadata is not available, processing degrades to the normal eventually consistent behavior. Please reach out if something like that is of interest.
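To illustrate the idea of honoring transaction boundaries, here is a minimal sketch, not the code in this PR. It assumes change events carry a Debezium transaction id and that the END event on the transactions topic reports an event count; `TransactionBuffer`, `onChange`, and `onEnd` are hypothetical names, and events are simplified to strings.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: holds change events until their transaction is complete. */
final class TransactionBuffer {
    // Change events seen so far, keyed by transaction id.
    private final Map<String, List<String>> pending = new HashMap<>();
    // Event counts reported by END events, keyed by transaction id.
    private final Map<String, Long> expected = new HashMap<>();

    /** Buffers a change event; returns the events that are now safe to process. */
    List<String> onChange(String txId, String event) {
        if (txId == null) {
            // No transaction metadata: fall back to processing the event immediately.
            return List.of(event);
        }
        pending.computeIfAbsent(txId, k -> new ArrayList<>()).add(event);
        return releaseIfComplete(txId);
    }

    /** Records the END event for a transaction; returns its events if all have arrived. */
    List<String> onEnd(String txId, long eventCount) {
        expected.put(txId, eventCount);
        return releaseIfComplete(txId);
    }

    private List<String> releaseIfComplete(String txId) {
        Long want = expected.get(txId);
        List<String> have = pending.getOrDefault(txId, List.of());
        if (want == null || have.size() < want) {
            return List.of(); // END not seen yet, or some events are still missing
        }
        expected.remove(txId);
        pending.remove(txId);
        return have;
    }
}
```

Because the data topics and the transactions topic are consumed independently, the sketch tolerates the END event arriving before or after the last change event of its transaction.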

In the earliest commit I'm trying to address redundant or unnecessary deserialization by holding onto the deserialized value and making retrieval of the old value for filtering optional. It also adds support for wrapped Debezium JSON CDC events.
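The "hold onto the deserialized value" idea can be sketched roughly as follows. This is an illustration only and not the PR's implementation; `LazyValue` is a hypothetical name, and the deserializer is passed in as a plain function rather than a serde.

```java
import java.util.function.Function;

/** Hypothetical wrapper: deserializes raw bytes at most once, on first access. */
final class LazyValue<T> {
    private final byte[] raw;
    private final Function<byte[], T> deserializer;
    private T value;
    private boolean loaded;

    LazyValue(byte[] raw, Function<byte[], T> deserializer) {
        this.raw = raw;
        this.deserializer = deserializer;
    }

    /** Returns the cached value, deserializing only on the first call. */
    T get() {
        if (!loaded) {
            value = deserializer.apply(raw);
            loaded = true;
        }
        return value;
    }
}
```

A caller that never needs the old value for filtering simply never calls `get()` on it, so that deserialization is skipped entirely.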

In the next commit there's code for a tighter polling loop, which avoids setting or incurring a polling timeout on topics that don't change much.
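One way such a loop could look, as a sketch under assumptions rather than a description of the commit: every topic is polled without blocking, and the loop waits only when the whole pass came back empty. `PollableTopic`, `TightPoller`, and `idleBackoffMs` are hypothetical names.

```java
import java.util.List;

/** Hypothetical abstraction over a topic; returns the number of records read. */
interface PollableTopic {
    int poll(long timeoutMs);
}

final class TightPoller {
    /** Polls every topic without blocking; backs off once only if all were idle. */
    static int pollAll(List<PollableTopic> topics, long idleBackoffMs) {
        int total = 0;
        for (PollableTopic topic : topics) {
            total += topic.poll(0); // zero timeout: quiet topics do not stall busy ones
        }
        if (total == 0) {
            try {
                Thread.sleep(idleBackoffMs); // a single wait for the whole pass
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            }
        }
        return total;
    }
}
```

With this shape, the cost of an idle pass is one backoff, not one timeout per quiet topic.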

Let me know if you want separate PRs for those changes.
