Add an example of a WAL decoder #1845
Could I trouble you to put a README.md together for this crate that details what it does and how to use it?
Sure, I wrote the explanations in the preamble comment of the |
Thanks! Looks good to me. I indeed have some mostly-stylistic nits in mind here.
b5d1da7 to 62c2be1
I'm perplexed why we don't implement both deserialization and serialization.
The purpose of a WAL decoding extension is to receive a WAL through the logical decoding mechanism and transform it into text representations of the operations performed. In this example, the extension produces JSON output but never has to ingest any JSON input. It's really a one-way road. The question of who will consume the JSON output, and how, is out of scope. Imagine that you are legally required to archive any change made to a given table in your database. You might want to use Kafka as storage for that. First the WAL decoder generates the JSON output in the replication slot, and then you'd write a client that consumes events from the replication slot and pushes them to Kafka.
Hmm. I guess it just feels weird that we don't check we can reconstruct our type from its own output. If we could, that capability... the interface type... could live in its own library, you see, that both producer and consumer could reference.
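To make the round-trip idea concrete, here is a minimal sketch of a shared interface type that both a producer and a consumer could depend on. Everything in it is an assumption made for illustration: `Change`, `Op` and `round_trips` are hypothetical names that do not come from this PR, and a plain `OP table` text format stands in for JSON so the sketch needs nothing beyond the standard library. With Serde, the same shape would presumably come from deriving both `Serialize` and `Deserialize` on the type.

```rust
use std::fmt;
use std::str::FromStr;

/// Hypothetical operation kinds (illustrative, not taken from the PR).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Op {
    Insert,
    Update,
    Delete,
}

/// Hypothetical shared interface type: one decoded change event.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Change {
    pub table: String,
    pub op: Op,
}

/// Producer side: render the change as text ("OP table").
impl fmt::Display for Change {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let op = match self.op {
            Op::Insert => "INSERT",
            Op::Update => "UPDATE",
            Op::Delete => "DELETE",
        };
        write!(f, "{} {}", op, self.table)
    }
}

/// Consumer side: rebuild the change from the producer's text output.
impl FromStr for Change {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        // Split at the first space: operation keyword, then table name.
        let (op, table) = s
            .split_once(' ')
            .ok_or_else(|| format!("malformed change: {s:?}"))?;
        let op = match op {
            "INSERT" => Op::Insert,
            "UPDATE" => Op::Update,
            "DELETE" => Op::Delete,
            other => return Err(format!("unknown operation: {other}")),
        };
        if table.is_empty() {
            return Err("missing table name".to_string());
        }
        Ok(Change {
            table: table.to_string(),
            op,
        })
    }
}

/// Returns true when decoding the encoded form yields an equal value.
pub fn round_trips(change: &Change) -> bool {
    change
        .to_string()
        .parse::<Change>()
        .map_or(false, |decoded| &decoded == change)
}
```

A unit test in the shared library could call `round_trips` on a handful of representative values, so that the producer and the consumer cannot silently drift apart.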
I think this looks fine.
Does it work with |
It does :)
This example shows how to build a basic Change Data Capture (CDC) mechanism using the Postgres Logical Decoding capabilities. The changes occurring on the database are serialized into JSON and pushed to a queue (a "logical replication slot") where they can be consumed by remote clients.
A Postgres CDC extension can be used for various purposes:
* Ad hoc replication systems (e.g. Postgres => SQL Server)
* External commit log for a distributed system (such as Kafka)
* Advanced monitoring (e.g. Prometheus/Loki)
This example tries to find the right tradeoff between simplicity and usefulness. Currently it has strong limitations but they should be easy to overcome.
Rust really shines in this example compared to similar implementations in C (see the wal2json extension). The Serde crate provides JSON serialization out of the box, whereas all the C implementations are forced to write their own JSON formatter.
Renamed, squashed, and rebased :)
Thanks! Oh, re: soundness, nothing seems obviously wrong? But I haven't mind-numbingly-closely audited it for soundness as it's just an example, and thus is user code, basically: any brain cells I spend on doing that are better off spent on making it easier to write code soundly than torturing you with revisions of some pretty gnarly unsafe code.
Hi!
I recently wrote this example as an exercise to learn how to build a WAL decoder with PGRX... I built this as "the thing I would have loved to have before" when I started digging into this topic. Feel free to modify it heavily or discard it completely if it is too specific or if the coding style is not up to the PGRX standards. I tried my best to write idiomatic Rust but there may be some naïve/unsound parts...
This example shows how to build a basic Change Data Capture (CDC) mechanism using the Postgres Logical Decoding capabilities. The changes occurring on the database are serialized into JSON and pushed to a queue (a "logical replication slot") where they can be consumed by remote clients.
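A Postgres CDC extension can be used for various purposes:
* Ad hoc replication systems (e.g. Postgres => SQL Server)
* External commit log for a distributed system (such as Kafka)
* Advanced monitoring (e.g. Prometheus/Loki)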
This example tries to find the right tradeoff between simplicity and usefulness. Currently it has strong limitations but they should be easy to overcome.
Rust really shines in this example compared to similar implementations in C (see the wal2json extension). The Serde crate provides JSON serialization out of the box, whereas all the C implementations are forced to write their own JSON formatter.