Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

startup error upgrading 1.69 -> 1.70 or greater: constraint "receipts_graph_uniqueness" does not exist #14377

Closed
kylrth opened this issue Nov 6, 2022 · 14 comments
Labels
O-Uncommon Most users are unlikely to come across this or unexpected workflow S-Minor Blocks non-critical functionality, workarounds exist. T-Other Questions, user support, anything else. X-Needs-Info This issue is blocked awaiting information from the reporter

Comments

@kylrth
Copy link

kylrth commented Nov 6, 2022

After upgrading the Docker container from v1.68.0 to v1.70.1, I see this error on startup:

Starting synapse with args -m synapse.app.homeserver --config-path /data/homeserver.yaml
2022-11-06 16:17:37,820 - root - 345 - WARNING - main - ***** STARTING SERVER *****
2022-11-06 16:17:37,820 - root - 346 - WARNING - main - Server /usr/local/lib/python3.9/site-packages/synapse/app/homeserver.py version 1.70.1
2022-11-06 16:17:37,820 - root - 351 - INFO - main - Server hostname: kylrth.com
2022-11-06 16:17:37,820 - root - 352 - INFO - main - Instance name: master
2022-11-06 16:17:37,820 - root - 353 - INFO - main - Twisted reactor: EPollReactor
2022-11-06 16:17:37,820 - synapse.app.homeserver - 372 - INFO - main - Setting up server
2022-11-06 16:17:37,820 - synapse.server - 307 - INFO - main - Setting up.
2022-11-06 16:17:37,825 - synapse.storage.databases - 66 - INFO - main - [database config 'master']: Checking database server
2022-11-06 16:17:37,827 - synapse.storage.databases - 69 - INFO - main - [database config 'master']: Preparing for databases ['main', 'state']
2022-11-06 16:17:37,827 - synapse.storage.prepare_database - 115 - INFO - main - ['main', 'state']: Checking existing schema version
2022-11-06 16:17:37,830 - synapse.storage.prepare_database - 119 - INFO - main - ['main', 'state']: Existing schema is 73 (+8 deltas)
2022-11-06 16:17:37,830 - synapse.storage.databases.main - 311 - INFO - main - Checking database for consistency with configuration...
2022-11-06 16:17:37,831 - synapse.storage.prepare_database - 411 - INFO - main - Applying schema deltas for v73
2022-11-06 16:17:37,832 - synapse.storage.prepare_database - 519 - INFO - main - Applying schema 73/06thread_notifications_thread_id_idx.sql
2022-11-06 16:17:37,833 - synapse.storage.prepare_database - 527 - INFO - main - Applying engine-specific schema 73/08thread_receipts_non_null.sql.postgres
2022-11-06 16:17:37,834 - synapse.app._base - 205 - ERROR - main - Exception during startup
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/synapse/app/homeserver.py", line 375, in setup
    hs.setup()
  File "/usr/local/lib/python3.9/site-packages/synapse/server.py", line 309, in setup
    self.datastores = Databases(self.DATASTORE_CLASS, self)
  File "/usr/local/lib/python3.9/site-packages/synapse/storage/databases/__init__.py", line 74, in __init__
    prepare_database(
  File "/usr/local/lib/python3.9/site-packages/synapse/storage/prepare_database.py", line 136, in prepare_database
    _upgrade_existing_database(
  File "/usr/local/lib/python3.9/site-packages/synapse/storage/prepare_database.py", line 528, in _upgrade_existing_database
    database_engine.execute_script_file(cur, absolute_path)
  File "/usr/local/lib/python3.9/site-packages/synapse/storage/engines/_base.py", line 145, in execute_script_file
    cls.executescript(cursor, f.read())
  File "/usr/local/lib/python3.9/site-packages/synapse/storage/engines/postgres.py", line 224, in executescript
    cursor.execute(script)
  File "/usr/local/lib/python3.9/site-packages/synapse/storage/database.py", line 388, in execute
    self._do_execute(self.txn.execute, sql, *args)
  File "/usr/local/lib/python3.9/site-packages/synapse/storage/database.py", line 436, in _do_execute
    return func(sql, *args, **kwargs)
psycopg2.errors.UndefinedObject: constraint "receipts_graph_uniqueness" of relation "receipts_graph" does not exist

**********************************************************************************
 Error during initialisation:
    constraint "receipts_graph_uniqueness" of relation "receipts_graph" does not exist
 
 There may be more information in the logs.
**********************************************************************************

I then proceeded to downgrade to 1.70.0, and then 1.69.0, at which point the server worked correctly. Just to be sure, I then tried to upgrade back to 1.70.1, but I saw the same issue. Currently running 1.69.0 now.

@squahtx squahtx added S-Minor Blocks non-critical functionality, workarounds exist. T-Other Questions, user support, anything else. O-Uncommon Most users are unlikely to come across this or unexpected workflow labels Nov 7, 2022
@squahtx
Copy link
Contributor

squahtx commented Nov 7, 2022

The Synapse database is missing the receipts_graph_uniqueness constraint for some reason.
That constraint was added in schema version 21, 7 years ago, so it's not clear how it went missing. (see https://github.com/matrix-org/synapse/blob/develop/synapse/storage/schema/main/delta/21/receipts.sql#L23)

Can you try running

ALTER TABLE ONLY receipts_graph
    ADD CONSTRAINT receipts_graph_uniqueness UNIQUE (room_id, receipt_type, user_id);

in psql, then upgrading again?

@squahtx squahtx added the X-Needs-Info This issue is blocked awaiting information from the reporter label Nov 7, 2022
@kylrth
Copy link
Author

kylrth commented Nov 7, 2022

What do I do if there's a duplicate key?

synapse=# ALTER TABLE ONLY receipts_graph ADD CONSTRAINT receipts_graph_uniqueness UNIQUE (room_id, receipt_type, user_id);
ERROR:  could not create unique index "receipts_graph_uniqueness"
DETAIL:  Key (room_id, receipt_type, user_id)=(!LZirCxnkkeBudrQzPj:matrix.org, m.read, @f:whomst.online) is duplicated.

Sorry, I'm a database noob.

@squahtx
Copy link
Contributor

squahtx commented Nov 7, 2022

In that case let's try something different. We're going to execute the 73/08thread_receipts_non_null.sql.postgres migration manually and mark it as done.

  1. First, double check that 73/06thread_notifications_thread_id_idx.sql is the last schema delta applied:
SELECT * FROM applied_schema_deltas;
 version |                               file
---------+-------------------------------------------------------------------
...
      73 | 73/04pending_device_list_updates.sql
      73 | 73/05old_push_actions.sql.postgres
      73 | 73/06thread_notifications_thread_id_idx.sql
(### rows)
  1. If that's the case, then try:
-- It's okay if this fails because `receipts_linearized_uniqueness` does not exist.
ALTER TABLE receipts_linearized DROP CONSTRAINT receipts_linearized_uniqueness;

-- It's okay if this fails because `receipts_graph_uniqueness` does not exist.
ALTER TABLE receipts_graph DROP CONSTRAINT receipts_graph_uniqueness;

INSERT INTO applied_schema_deltas(version, file) VALUES (73, '73/08thread_receipts_non_null.sql.postgres');
  1. Then try upgrading Synapse again.

@kylrth
Copy link
Author

kylrth commented Nov 7, 2022

Wow thank you so much for the help. I'm seeing the last delta is 73/05old_push_actions.sql.postgres. What do I do in this case?

@squahtx
Copy link
Contributor

squahtx commented Nov 8, 2022

I'm seeing the last delta is 73/05old_push_actions.sql.postgres. What do I do in this case?

We'll have to run 73/06... manually first. After running these in order, try upgrading Synapse again:

-- 73/06thread_notifications_thread_id_idx.sql
-- Allow there to be multiple summaries per user/room.
DROP INDEX IF EXISTS event_push_summary_unique_index;

INSERT INTO background_updates (ordering, update_name, progress_json, depends_on) VALUES
  (7306, 'event_push_actions_thread_id_null', '{}', 'event_push_backfill_thread_id');

INSERT INTO background_updates (ordering, update_name, progress_json, depends_on) VALUES
  (7306, 'event_push_summary_thread_id_null', '{}', 'event_push_backfill_thread_id');

INSERT INTO applied_schema_deltas(version, file) VALUES (73, '73/06thread_notifications_thread_id_idx.sql');
-- 73/08thread_receipts_non_null.sql.postgres
-- It's okay if this fails because `receipts_linearized_uniqueness` does not exist.
ALTER TABLE receipts_linearized DROP CONSTRAINT receipts_linearized_uniqueness;

-- It's okay if this fails because `receipts_graph_uniqueness` does not exist.
ALTER TABLE receipts_graph DROP CONSTRAINT receipts_graph_uniqueness;

INSERT INTO applied_schema_deltas(version, file) VALUES (73, '73/08thread_receipts_non_null.sql.postgres');

@kylrth
Copy link
Author

kylrth commented Nov 8, 2022

Ok I applied the migrations, upgraded to 1.71.0, and now federation is occurring but I'm seeing this error:

2022-11-08 16:21:42,226 - synapse.metrics.background_process_metrics - 244 - ERROR - background_updates-0 - Background process 'background_updates' threw an exception
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/synapse/storage/background_updates.py", line 294, in run_background_updates
    result = await self.do_next_background_update(sleep)
  File "/usr/local/lib/python3.9/site-packages/synapse/storage/background_updates.py", line 424, in do_next_background_update
    await self._do_background_update(desired_duration_ms)
  File "/usr/local/lib/python3.9/site-packages/synapse/storage/background_updates.py", line 467, in _do_background_update
    items_updated = await update_handler(progress, batch_size)
  File "/usr/local/lib/python3.9/site-packages/synapse/storage/background_updates.py", line 624, in updater
    await self.db_pool.runWithConnection(runner)
  File "/usr/local/lib/python3.9/site-packages/synapse/storage/database.py", line 976, in runWithConnection
    return await make_deferred_yieldable(
  File "/usr/local/lib/python3.9/site-packages/twisted/python/threadpool.py", line 244, in inContext
    result = inContext.theWork()  # type: ignore[attr-defined]
  File "/usr/local/lib/python3.9/site-packages/twisted/python/threadpool.py", line 260, in <lambda>
    inContext.theWork = lambda: context.call(  # type: ignore[attr-defined]
  File "/usr/local/lib/python3.9/site-packages/twisted/python/context.py", line 117, in callWithContext
    return self.currentContext().callWithContext(ctx, func, *args, **kw)
  File "/usr/local/lib/python3.9/site-packages/twisted/python/context.py", line 82, in callWithContext
    return func(*args, **kw)
  File "/usr/local/lib/python3.9/site-packages/twisted/enterprise/adbapi.py", line 282, in _runWithConnection
    result = func(conn, *args, **kw)
  File "/usr/local/lib/python3.9/site-packages/synapse/storage/database.py", line 969, in inner_func
    return func(db_conn, *args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/synapse/storage/background_updates.py", line 575, in create_index_psql
    c.execute(sql)
  File "/usr/local/lib/python3.9/site-packages/synapse/storage/database.py", line 388, in execute
    self._do_execute(self.txn.execute, sql, *args)
  File "/usr/local/lib/python3.9/site-packages/synapse/storage/database.py", line 436, in _do_execute
    return func(sql, *args, **kwargs)
psycopg2.errors.UniqueViolation: could not create unique index "receipts_graph_unique_index"
DETAIL:  Key (room_id, receipt_type, user_id)=(!LZirCxnkkeBudrQzPj:matrix.org, m.read, @f:whomst.online) is duplicated.


During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/synapse/metrics/background_process_metrics.py", line 242, in run
    return await func(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/synapse/storage/background_updates.py", line 299, in run_background_updates
    raise RuntimeError(
RuntimeError: 5 back-to-back background update failures; aborting.

In the mean time I downgraded to 1.69.0 because I worried that state wouldn't be stored correctly.

@squahtx
Copy link
Contributor

squahtx commented Nov 15, 2022

We're working on a fix that will go in 1.72.0. It's being tracked in #14406.

@erikjohnston erikjohnston removed the X-Needs-Info This issue is blocked awaiting information from the reporter label Nov 16, 2022
@kylrth
Copy link
Author

kylrth commented Nov 22, 2022

@squahtx upon upgrading to 1.72.0, I get this:

2022-11-22 15:11:46,824 - synapse.storage.background_updates - 431 - INFO - background_updates-0 - Starting update batch on background update 'receipts_graph_unique_index'
2022-11-22 15:11:46,865 - synapse.metrics.background_process_metrics - 244 - ERROR - background_updates-0 - Background process 'background_updates' threw an exception
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/synapse/storage/background_updates.py", line 294, in run_background_updates
    result = await self.do_next_background_update(sleep)
  File "/usr/local/lib/python3.9/site-packages/synapse/storage/background_updates.py", line 424, in do_next_background_update
    await self._do_background_update(desired_duration_ms)
  File "/usr/local/lib/python3.9/site-packages/synapse/storage/background_updates.py", line 467, in _do_background_update
    items_updated = await update_handler(progress, batch_size)
  File "/usr/local/lib/python3.9/site-packages/synapse/storage/databases/main/receipts.py", line 1053, in _background_receipts_graph_unique_index
    await self._create_receipts_index(
  File "/usr/local/lib/python3.9/site-packages/synapse/storage/databases/main/receipts.py", line 958, in _create_receipts_index
    await self.db_pool.runWithConnection(_create_index)
  File "/usr/local/lib/python3.9/site-packages/synapse/storage/database.py", line 976, in runWithConnection
    return await make_deferred_yieldable(
  File "/usr/local/lib/python3.9/site-packages/twisted/python/threadpool.py", line 244, in inContext
    result = inContext.theWork()  # type: ignore[attr-defined]
  File "/usr/local/lib/python3.9/site-packages/twisted/python/threadpool.py", line 260, in <lambda>
    inContext.theWork = lambda: context.call(  # type: ignore[attr-defined]
  File "/usr/local/lib/python3.9/site-packages/twisted/python/context.py", line 117, in callWithContext
    return self.currentContext().callWithContext(ctx, func, *args, **kw)
  File "/usr/local/lib/python3.9/site-packages/twisted/python/context.py", line 82, in callWithContext
    return func(*args, **kw)
  File "/usr/local/lib/python3.9/site-packages/twisted/enterprise/adbapi.py", line 282, in _runWithConnection
    result = func(conn, *args, **kw)
  File "/usr/local/lib/python3.9/site-packages/synapse/storage/database.py", line 969, in inner_func
    return func(db_conn, *args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/synapse/storage/databases/main/receipts.py", line 953, in _create_index
    c.execute(sql)
  File "/usr/local/lib/python3.9/site-packages/synapse/storage/database.py", line 388, in execute
    self._do_execute(self.txn.execute, sql, *args)
  File "/usr/local/lib/python3.9/site-packages/synapse/storage/database.py", line 436, in _do_execute
    return func(sql, *args, **kwargs)
psycopg2.errors.DuplicateTable: relation "receipts_graph_unique_index" already exists

That's repeated 5 times before giving up. I'm guessing this is because of the manual upgrades I did?

@squahtx
Copy link
Contributor

squahtx commented Nov 22, 2022

Was the receipts_graph_unique_index index created manually after we tried #14377 (comment) ?

Could you post the output of

 \d receipts_graph

and

\d receipts_linearized

?

@squahtx squahtx added the X-Needs-Info This issue is blocked awaiting information from the reporter label Nov 22, 2022
@kylrth
Copy link
Author

kylrth commented Nov 22, 2022

synapse=# \d receipts_graph
            Table "public.receipts_graph"
    Column    | Type | Collation | Nullable | Default 
--------------+------+-----------+----------+---------
 room_id      | text |           | not null | 
 receipt_type | text |           | not null | 
 user_id      | text |           | not null | 
 event_ids    | text |           | not null | 
 data         | text |           | not null | 
 thread_id    | text |           |          | 
Indexes:
    "receipts_graph_unique_index" UNIQUE, btree (room_id, receipt_type, user_id) WHERE thread_id IS NULL
    "receipts_graph_uniqueness_thread" UNIQUE CONSTRAINT, btree (room_id, receipt_type, user_id, thread_id)

synapse=# \d receipts_linearized
               Table "public.receipts_linearized"
        Column         |  Type  | Collation | Nullable | Default 
-----------------------+--------+-----------+----------+---------
 stream_id             | bigint |           | not null | 
 room_id               | text   |           | not null | 
 receipt_type          | text   |           | not null | 
 user_id               | text   |           | not null | 
 event_id              | text   |           | not null | 
 data                  | text   |           | not null | 
 instance_name         | text   |           |          | 
 event_stream_ordering | bigint |           |          | 
 thread_id             | text   |           |          | 
Indexes:
    "receipts_linearized_id" btree (stream_id)
    "receipts_linearized_room_stream" btree (room_id, stream_id)
    "receipts_linearized_unique_index" UNIQUE, btree (room_id, receipt_type, user_id) WHERE thread_id IS NULL
    "receipts_linearized_uniqueness_thread" UNIQUE CONSTRAINT, btree (room_id, receipt_type, user_id, thread_id)
    "receipts_linearized_user" btree (user_id)

@squahtx
Copy link
Contributor

squahtx commented Nov 22, 2022

Thanks. That all looks okay.
I'm not sure how or why the error in #14377 (comment) cleared itself up.

Could you run SELECT * FROM background_updates? I'm expecting to see receipts_graph_unique_index in there still.

@kylrth
Copy link
Author

kylrth commented Nov 22, 2022

synapse=# SELECT * FROM background_updates;
            update_name            | progress_json |          depends_on           | ordering 
-----------------------------------+---------------+-------------------------------+----------
 event_push_actions_thread_id_null | {}            | event_push_backfill_thread_id |     7306
 event_push_summary_thread_id_null | {}            | event_push_backfill_thread_id |     7306
 threads_backfill                  | {}            |                               |     7309
(3 rows)

If you're willing to help debug further somewhere other than GitHub (and huge thanks for what you've done so far), we could take this to matrix @kyle:kylrth.com

@squahtx
Copy link
Contributor

squahtx commented Nov 23, 2022

That looks okay then. Are you still seeing errors in the logs when restarting on 1.72.0?

@kylrth
Copy link
Author

kylrth commented Nov 23, 2022

Hmm, well now I restarted again with 1.72.0 and I'm not seeing any errors. Not sure how it got fixed, since I haven't made any changes since the previous attempt. Thanks so much for you help!

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
O-Uncommon Most users are unlikely to come across this or unexpected workflow S-Minor Blocks non-critical functionality, workarounds exist. T-Other Questions, user support, anything else. X-Needs-Info This issue is blocked awaiting information from the reporter
Projects
None yet
Development

No branches or pull requests

3 participants