Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

REPLICA IDENTITY inheritance #502

Closed
keithf4 opened this issue Jan 12, 2023 · 4 comments
Closed

REPLICA IDENTITY inheritance #502

keithf4 opened this issue Jan 12, 2023 · 4 comments

Comments

@keithf4
Copy link
Collaborator

keithf4 commented Jan 12, 2023

Check and see if the REPLICA IDENTITY property is inherited by native partitioning. If not, add this as a template feature.

@keithf4 keithf4 added this to the Future milestone Jan 12, 2023
@keithf4 keithf4 modified the milestones: Future, 5.1 Mar 22, 2023
@james-johnston-thumbtack

@keithf4 It looks like inheritance from parent to child partitions does not happen automatically. Here is an example:

CREATE TABLE kafka_outbox (
    message_id UUID,
    row_time TIMESTAMP NOT NULL
) PARTITION BY RANGE (row_time);
ALTER TABLE kafka_outbox REPLICA IDENTITY FULL;

CREATE SCHEMA partman;
CREATE EXTENSION pg_partman SCHEMA partman;
SELECT partman.create_parent(
    p_parent_table => 'public.kafka_outbox',
    p_control => 'row_time',
    p_type => 'native',
    p_interval => 'daily',
    p_premake => 7
);

The following query will show all the tables that do not have REPLICA IDENTITY FULL. Alternatively you could use the psql command to inspect that by hand.

SELECT
    pg_namespace.nspname AS "schema",
    pg_class.relname AS "table"
FROM pg_class
INNER JOIN pg_namespace ON pg_class.relnamespace = pg_namespace.oid
WHERE
    pg_class.relreplident != 'f'
    AND pg_class.relkind IN ('r', 'p')
    -- Exclude pg_partman schema: we aren't interested in replicating those tables to Kafka, and we
    -- have no direct control over their schema (and therefore replica identity)
    AND pg_namespace.nspname != 'partman'
    -- Exclude system schemas
    AND pg_namespace.nspname != 'information_schema'
    AND pg_namespace.nspname != 'pg_catalog'
ORDER BY "schema", "table";

And here is a list of tables returned by the above SELECT query. These partition tables did not get REPLICA IDENTITY FULL:

[('public', 'kafka_outbox_default'),
 ('public', 'kafka_outbox_p2023_03_19'),
 ('public', 'kafka_outbox_p2023_03_20'),
 ('public', 'kafka_outbox_p2023_03_21'),
 ('public', 'kafka_outbox_p2023_03_22'),
 ('public', 'kafka_outbox_p2023_03_23'),
 ('public', 'kafka_outbox_p2023_03_24'),
 ('public', 'kafka_outbox_p2023_03_25'),
 ('public', 'kafka_outbox_p2023_03_26'),
 ('public', 'kafka_outbox_p2023_03_27'),
 ('public', 'kafka_outbox_p2023_03_28'),
 ('public', 'kafka_outbox_p2023_03_29'),
 ('public', 'kafka_outbox_p2023_03_30'),
 ('public', 'kafka_outbox_p2023_03_31'),
 ('public', 'kafka_outbox_p2023_04_01'),
 ('public', 'kafka_outbox_p2023_04_02')]

This is a problem if the partitions are being replicated to Kafka via CDC, such as with Debezium. Since the Debezium subscriber does not have random access to old row versions in Kafka (that's not how Kafka works), it's common to set REPLICA IDENTITY FULL so that full row versions are always available to write to Kafka. However, if pg_partman doesn't copy over the replication identity to the child tables, then the whole replication scheme would break down.

This use case is described in more detail at:

At the moment, I am not blocked by this, because I am currently using pg_partman only for outbox table use case, which doesn't involve UPDATE or DELETE statements (pg_partman handles truncation of outbox data for me). However, this would be an issue for more general-purpose use of pg_partman with Debezium or other CDC use cases.

@keithf4
Copy link
Collaborator Author

keithf4 commented May 8, 2023

@james-johnston-thumbtack

Thank you for the investigation on this. You might want to also bring this up on the official PostgreSQL mailing lists as a missing inheritance feature for native partitioning. That's really where this should be being handled in the long run. pg_partman is just using native partitioning under the hood (trigger-based partitioning is being dropped soon) and adding support for a feature like this should only be done as a stop-gap until native properly supports it (as is being done for non-partition-key uniques and unlogged).

@keithf4
Copy link
Collaborator Author

keithf4 commented Jan 24, 2024

I have a beta PR up that should automatically inherit the REPLICA IDENTITY from the parent table. If you're able to test, any feedback would be appreciated

#615

@keithf4
Copy link
Collaborator Author

keithf4 commented Apr 5, 2024

Version 5.1 has been released with this fix

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants