Add an example of a WAL decoder #1845

Merged (1 commit) on Sep 12, 2024

@daamien (Contributor) commented Sep 9, 2024

Hi!

I recently wrote this example as an exercise to learn how to build a WAL decoder with PGRX. I built this as "the thing I would have loved to have before" when I started digging into this topic. Feel free to modify it heavily or discard it completely if it is too specific or if the coding style is not up to the PGRX standards. I tried my best to write idiomatic Rust but there may be some naïve/unsound parts...

This example shows how to build a basic Change Data Capture (CDC) mechanism using the Postgres Logical Decoding capabilities. The changes occurring on the database are serialized into JSON and pushed to a queue (a "logical replication slot") where they can be consumed by remote clients.

A Postgres CDC extension can be used for various purposes:

  • Ad hoc replication systems (e.g. Postgres => SQL Server)
  • External commit-log for a distributed system (such as Kafka)
  • Advanced monitoring (e.g. Prometheus/Loki)

This example tries to find the right tradeoff between simplicity and usefulness. Currently it has strong limitations but they should be easy to overcome.

Rust really shines in this example compared to similar implementations in C (see the wal2json extension). The Serde crate provides JSON serialization out of the box, whereas all the C implementations are forced to write their own JSON formatter.

@eeeebbbbrrrr (Contributor)

Could I trouble you to put a README.md together for this crate that details what it does and how to use it?

@daamien (Contributor Author) commented Sep 9, 2024

Could I trouble you to put a README.md together for this crate that details what it does and how to use it?

Sure. I wrote the explanation in the preamble comment of src/lib.rs, but I can move it into a README file.

@workingjubilee (Member) left a comment

Thanks! Looks good to me. I indeed have some mostly-stylistic nits in mind here.

pgrx-examples/decoder/src/lib.rs (outdated, resolved)
pgrx-examples/decoder/src/lib.rs (outdated, resolved)
pgrx-examples/decoder/src/lib.rs (outdated, resolved)
pgrx-examples/decoder/src/bin/pgrx_embed.rs (outdated, resolved)
pgrx-examples/decoder/README.md (outdated, resolved)
pgrx-examples/decoder/README.md (outdated, resolved)
pgrx-examples/decoder/src/lib.rs (outdated, resolved)
pgrx-examples/decoder/README.md (outdated, resolved)
pgrx-examples/decoder/src/lib.rs (outdated, resolved)
pgrx-examples/decoder/src/lib.rs (outdated, resolved)
@daamien force-pushed the decoder branch 2 times, most recently from b5d1da7 to 62c2be1 (September 10, 2024 08:41)
@workingjubilee (Member)

I'm perplexed why we don't implement both deserialization and serialization.

@daamien (Contributor Author) commented Sep 11, 2024

I'm perplexed why we don't implement both deserialization and serialization.

The purpose of a WAL decoding extension is to receive the WAL through the logical decoding mechanism and transform it into text representations of the operations performed. In this example, the extension produces JSON output but it never has to ingest any JSON input. It's really a one-way road.

The question of who will consume the JSON output, and how, is out of scope.

Imagine that you are legally required to archive any change made to a given table in your database. You might want to use Kafka for that. First the WAL decoder generates the JSON output in the replication slot, and then you'd write a client that consumes events from the replication slot and pushes them to Kafka.
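
The producer/consumer split described above might be sketched as follows. This is a dependency-free illustration, not code from this PR: `ChangeSource`, `Sink`, and `forward` are hypothetical names. `ChangeSource` stands in for a client polling the replication slot, `Sink` stands in for a Kafka producer, and the in-memory implementations exist only so the sketch can be exercised.

```rust
use std::collections::VecDeque;

/// Hypothetical stand-in for a client that polls a logical replication slot.
pub trait ChangeSource {
    /// Returns the next pending JSON event, or `None` once nothing is left.
    fn next_change(&mut self) -> Option<String>;
}

/// Hypothetical stand-in for a Kafka producer (or any other archive).
pub trait Sink {
    fn publish(&mut self, payload: &str) -> Result<(), String>;
}

/// Moves every pending event from the source to the sink and returns how
/// many events were forwarded. Stops at the first publish error.
/// Assumption: acknowledging events back to Postgres is not modelled here;
/// a real client would confirm an event only after publishing succeeded.
pub fn forward<S: ChangeSource, K: Sink>(source: &mut S, sink: &mut K) -> Result<usize, String> {
    let mut forwarded = 0;
    while let Some(event) = source.next_change() {
        sink.publish(&event)?;
        forwarded += 1;
    }
    Ok(forwarded)
}

// In-memory source: a queue of already-decoded JSON strings.
impl ChangeSource for VecDeque<String> {
    fn next_change(&mut self) -> Option<String> {
        self.pop_front()
    }
}

// In-memory sink: simply records everything that was published.
impl Sink for Vec<String> {
    fn publish(&mut self, payload: &str) -> Result<(), String> {
        self.push(payload.to_string());
        Ok(())
    }
}
```

Keeping the two sides behind traits means the decoder itself never needs to know whether the consumer is Kafka, a file archive, or a test harness.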

@workingjubilee (Member)

Hmm. I guess it just feels weird that we don't check we can reconstruct our type from its own output. If we could, that capability... the interface type... could live in its own library, you see, that both producer and consumer could reference.
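
A minimal sketch of that round-trip idea, using only the standard library: one shared type that the producer renders and the consumer parses back. The `Change` and `Op` types and the space-separated line format are hypothetical and are not taken from the decoder example; with Serde, the analogous step would be deriving both `Serialize` and `Deserialize` on the shared type.

```rust
use std::fmt;
use std::str::FromStr;

/// Hypothetical operation kinds; illustrative only.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Op {
    Insert,
    Update,
    Delete,
}

/// Hypothetical change event shared by producer and consumer.
#[derive(Debug, Clone, PartialEq)]
pub struct Change {
    pub op: Op,
    pub table: String,
    pub row_id: u64,
}

// Producer side: render the event as one line, "<OP> <table> <row_id>".
impl fmt::Display for Change {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let op = match self.op {
            Op::Insert => "INSERT",
            Op::Update => "UPDATE",
            Op::Delete => "DELETE",
        };
        write!(f, "{} {} {}", op, self.table, self.row_id)
    }
}

// Consumer side: rebuild the event from that same line.
// Assumption: table names contain no spaces in this toy format.
impl FromStr for Change {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        let mut parts = s.split(' ');
        let op = match parts.next() {
            Some("INSERT") => Op::Insert,
            Some("UPDATE") => Op::Update,
            Some("DELETE") => Op::Delete,
            other => return Err(format!("unknown operation: {:?}", other)),
        };
        let table = parts.next().ok_or("missing table name")?.to_string();
        let row_id = parts
            .next()
            .ok_or("missing row id")?
            .parse::<u64>()
            .map_err(|e| e.to_string())?;
        if parts.next().is_some() {
            return Err("unexpected trailing data".to_string());
        }
        Ok(Change { op, table, row_id })
    }
}
```

A test on the shared type can then assert that `change.to_string().parse::<Change>()` returns the original value, which is the "reconstruct our type from its own output" check.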

@workingjubilee (Member) left a comment

I think this looks fine.

pgrx-examples/decoder/src/lib.rs (outdated, resolved)
@workingjubilee (Member)

Does it work with ./run-examples.sh?

@daamien (Contributor Author) commented Sep 12, 2024

Does it work with ./run-examples.sh?

It does :)

This example shows how to build a basic Change Data Capture (CDC)
mechanism using the Postgres Logical Decoding capabilities. The changes
occurring on the database are serialized into JSON and pushed to a
queue (a "logical replication slot") where they can be consumed by
remote clients.

A Postgres CDC extension can be used for various purposes:

* Ad hoc replication systems (e.g. Postgres => SQL Server)
* External commit-log for a distributed system (such as Kafka)
* Advanced monitoring (e.g. Prometheus/Loki)

This example tries to find the right tradeoff between simplicity and
usefulness. Currently it has strong limitations but they should be
easy to overcome.

Rust really shines in this example compared to similar implementations
in C (see the wal2json extension). The Serde crate provides JSON
serialization out of the box, whereas all the C implementations are
forced to write their own JSON formatter.
@daamien (Contributor Author) commented Sep 12, 2024

Renamed, squashed and rebased :)

@workingjubilee (Member)

Thanks!

Oh, re: soundness, nothing seems obviously wrong? But I haven't mind-numbingly-closely audited it for soundness as it's just an example, and thus is user code, basically: any brain cells I spend on doing that are better off spent on making it easier to write code soundly than torturing you with revisions of some pretty gnarly unsafe code.

@workingjubilee merged commit 729e70f into pgcentralfoundation:develop on Sep 12, 2024 (14 checks passed)
@daamien deleted the decoder branch on September 23, 2024 19:47