All tables in Postgres should have a REPLICA IDENTITY
available so that Postgres logical replication can be used
#16224
Comments
Maybe we should give every table a primary key? IIRC that would have been useful elsewhere, e.g. #15583
Agreed, but it may not be trivial. I would hope that all new tables we create have primary keys. One thing to note is that many of these tables probably have worthy unique indices already. In Postgres, those can be converted for free with
I'm researching the possibility of an HA setup based on pgEdge, which, like logical replication, requires
These are the only tables that don't have unique indexes.
A unique index can be used as a REPLICA IDENTITY:
ALTER TABLE table_name REPLICA IDENTITY USING INDEX index_name;
Adding such statements in
And the following tables would seem to be likely candidates for
That leaves the tables below...
I don't have deep knowledge of Synapse, so take the above with a grain of salt. All table research aside, the main thing I'm hoping for is that unique indexes as REPLICA IDENTITY are the easy path forward here.
Going to try and put up a quick solution for this shortly. Here's a list of the tables above, either
Really looking forward to being able to use logical replication with our pg instances for Synapse. Just to clarify: if/when the PR is merged, will that make logical replication feasible with Synapse? Or is there other work to do still?
That's the hope! I haven't gotten around to trying it again since setting the replica identities yet, but it should be another hurdle removed from being able to do this. If I run into problems again then I will be trying to fix those. Long story short, I need to do this to move my homeserver without it taking hours to restore a backup, so I kind of need to get this done soon :)
(I should probably add some kind of caveat that you should only try this for now if you know a bit about what you're doing and can sort yourself out if you run into trouble.)
…cit one. This should allow use of Postgres logical replication. (#16456)
* Add Postgres replica identities to tables that don't have an implicit one
  Fixes #16224
* Newsfile
  Signed-off-by: Olivier Wilkinson (reivilibre) <[email protected]>
* Move the delta to version 83 as we missed the boat for 82
* Add a test that all tables have a REPLICA IDENTITY
* Extend the test to include when indices are deleted
* isort
* black
* Fully qualify `oid` as it is a 'hidden attribute' in Postgres 11
* Update tests/storage/test_database.py
  Co-authored-by: Patrick Cloke <[email protected]>
* Add missed tables
---------
Signed-off-by: Olivier Wilkinson (reivilibre) <[email protected]>
Co-authored-by: Patrick Cloke <[email protected]>
This is not quite correct yet. When I created the replication slot, sending messages started failing and I had to apply these (from reading logs):
ALTER TABLE stats_incremental_position REPLICA IDENTITY USING INDEX stats_incremental_position_lock_key;
ALTER TABLE room_forgetter_stream_pos REPLICA IDENTITY USING INDEX room_forgetter_stream_pos_lock_key;
ALTER TABLE devices REPLICA IDENTITY USING INDEX device_uniqueness;
ALTER TABLE redactions REPLICA IDENTITY USING INDEX redactions_event_id_key;
ALTER TABLE user_directory_stream_pos REPLICA IDENTITY USING INDEX user_directory_stream_pos_lock_key;
ALTER TABLE event_push_summary_last_receipt_stream_id REPLICA IDENTITY USING INDEX event_push_summary_last_receipt_stream_id_lock_key;
ALTER TABLE event_push_summary_stream_ordering REPLICA IDENTITY USING INDEX event_push_summary_stream_ordering_lock_key;
ALTER TABLE event_forward_extremities REPLICA IDENTITY USING INDEX event_forward_extremities_event_id_room_id_key;
ALTER TABLE event_backward_extremities REPLICA IDENTITY USING INDEX event_backward_extremities_event_id_room_id_key;
ALTER TABLE event_push_actions REPLICA IDENTITY FULL; -- only unique index has nullable column
ALTER TABLE appservice_stream_position REPLICA IDENTITY USING INDEX appservice_stream_position_lock_key;
Obviously my query to find tables that are missing a primary key is not quite correct. So this needs investigating and:
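The statements above follow a simple rule: tables with a primary key need nothing, tables with a suitable unique index can use it, and everything else falls back to FULL. A minimal sketch of that rule in Python is below. The `UniqueIndex` type and the function name are hypothetical and not part of Synapse; the only Postgres facts assumed are that `REPLICA IDENTITY USING INDEX` requires a unique, non-partial, non-deferrable index whose columns are all NOT NULL.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class UniqueIndex:
    """Hypothetical description of a unique index (not a Synapse type)."""

    name: str
    all_columns_not_null: bool
    is_partial: bool = False
    is_deferrable: bool = False


def replica_identity_statement(
    table: str, has_primary_key: bool, unique_indexes: List[UniqueIndex]
) -> Optional[str]:
    """Return the ALTER TABLE statement to run, or None if none is needed."""
    if has_primary_key:
        # The default replica identity already uses the primary key.
        return None
    for index in unique_indexes:
        usable = (
            index.all_columns_not_null
            and not index.is_partial
            and not index.is_deferrable
        )
        if usable:
            return (
                f"ALTER TABLE {table} REPLICA IDENTITY USING INDEX {index.name};"
            )
    # No usable index: fall back to logging the whole old row.
    return f"ALTER TABLE {table} REPLICA IDENTITY FULL;"


if __name__ == "__main__":
    print(
        replica_identity_statement(
            "devices", False, [UniqueIndex("device_uniqueness", True)]
        )
    )
```

FULL is the last resort here because it makes Postgres record the entire old row for every UPDATE and DELETE, which is why event_push_actions only gets it when its one unique index has a nullable column.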
This all got backed out because of deadlocking :(
I had to revert this when doing the 1.98.0rc. We saw migration pain on matrix.org:
The column is not nullable in schema dump 72: synapse/synapse/storage/schema/main/full_schemas/72/full.sql.postgres Lines 567 to 571 in 0a38c7e
We have a background update which rewrites the localpart column synapse/synapse/storage/databases/main/filtering.py Lines 58 to 60 in 7ec0a14
Presumably part of #15396. I don't understand the full machinations of all this (e.g. is everyone affected? only old deployments? are background updates relevant?)
Description:
When migrating a Postgres database, there are a few options:
* Dump and restore (pg_dump, which is then fed into psql on the restoring end). Simple, but in my case this is taking 150 minutes to restore, so I have really outgrown this solution, as 150 minutes of downtime is quite hard to schedule.
* Logical replication: run pg_dump --schema-only and restore that, then use CREATE PUBLICATION (primary) and CREATE SUBSCRIPTION (secondary) and Postgres takes care of the rest... even across different Postgres versions and different machine architectures... but with Synapse this currently causes a problem on the primary.

The problem is that Postgres needs to identify individual rows in the tables using a so-called REPLICA IDENTITY. This defaults to the primary key of the table if one is set, but many Synapse tables just don't have a primary key.

The net effect is that once you start logical replication, all UPDATEs and DELETEs to the tables without primary keys now fail. In turn, this causes basic features like /sync to stop working (as it deletes from device_inbox at least).

You can set the REPLICA IDENTITY per table manually, either to an existing unique index or to the full record as a fallback if you really have no better option. (see https://www.postgresql.org/docs/15/sql-altertable.html#SQL-ALTERTABLE-REPLICA-IDENTITY)

It would be nice to do this in Synapse so that logical replication works out of the box. We might even consider keeping a lint around to check that all tables have a replica identity?
I think this SQL can be used to find tables that are set to use the default replica identity, but which don't have a primary key (i.e. they don't have a valid replica identity after all):
This currently gives me
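A check along the lines described can be sketched as follows. This is not the query from the issue, only a minimal sketch assuming the tables live in the `public` schema and that a DB-API cursor (for example from psycopg2) is available; the function name is hypothetical. It relies on the `pg_class.relreplident` and `pg_index.indisprimary` catalog columns.

```python
from typing import Any, List

# Tables still on the default replica identity ('d') that have no primary key,
# i.e. tables that effectively have no replica identity at all.
FIND_TABLES_SQL = """
SELECT c.relname
FROM pg_class AS c
JOIN pg_namespace AS n ON n.oid = c.relnamespace
WHERE c.relkind = 'r'            -- ordinary tables only
  AND n.nspname = 'public'       -- assumption: tables live in "public"
  AND c.relreplident = 'd'       -- 'd' = default (use the primary key)
  AND NOT EXISTS (
      SELECT 1
      FROM pg_index AS i
      WHERE i.indrelid = c.oid AND i.indisprimary
  )
ORDER BY c.relname;
"""


def tables_without_replica_identity(cursor: Any) -> List[str]:
    """Run the check on a DB-API cursor and return the offending table names."""
    cursor.execute(FIND_TABLES_SQL)
    return [row[0] for row in cursor.fetchall()]
```

This sketch does not cover tables explicitly set to `REPLICA IDENTITY NOTHING`, nor tables whose identity index was later dropped, so a lint built on it would need to handle those cases separately.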