Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Fix ambiguous column name that would prevent use of MSC2716 History Import when using Postgres as a database. #12843

Merged
merged 3 commits into from
May 26, 2022

Conversation

reivilibre
Copy link
Contributor

@reivilibre reivilibre commented May 23, 2022

Fixes #12825.

@MadLittleMods I'd like to request your eyes to check this is what you intended.
It matches my reading of the surrounding commentary, but SQLite's query plan would have used i according to some experimentation!
Complement tests pass with either c or i so weren't much use in discerning the two :/.

@reivilibre reivilibre requested a review from a team as a code owner May 23, 2022 13:56
@reivilibre reivilibre requested a review from MadLittleMods May 23, 2022 13:56
@reivilibre reivilibre force-pushed the rei/wcom_importhistory_columns branch from 310ccd7 to e69a9ae Compare May 23, 2022 13:57
@@ -1057,7 +1057,7 @@ def _get_connected_batch_event_backfill_results_txn(
INNER JOIN batch_events AS c
ON i.next_batch_id = c.batch_id
/* Get the depth of the batch start event from the events table */
INNER JOIN events AS e USING (event_id)
INNER JOIN events AS e ON c.event_id = e.event_id
Copy link
Contributor

@MadLittleMods MadLittleMods May 23, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

c.event_id (where batch_events AS c) makes sense 👍

We want to return connected batch events and the info about the batch event (depth, stream_ordering).

Thanks for fixing this up!

@@ -1057,7 +1057,7 @@ def _get_connected_batch_event_backfill_results_txn(
INNER JOIN batch_events AS c
ON i.next_batch_id = c.batch_id
/* Get the depth of the batch start event from the events table */
INNER JOIN events AS e USING (event_id)
INNER JOIN events AS e ON c.event_id = e.event_id
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we configure SQLite to yell about ambiguous columns as well?

Copy link
Contributor Author

@reivilibre reivilibre May 24, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't found a way. I've opened a thread on their forum about this (since I didn't find anyone mentioning this behaviour anywhere, so I may as well be the one to do so).
If any SQLite pro comes back to me with a 'oh you forgot to enable the PRAGMA no_silly_stuff=1 flag' then I'll come back and enable it :-).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Cross link please? 🙏 )

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's intentional behaviour and there are test cases ensuring it picks the first one: https://www.sqlite.org/forum/forumpost/941664fc450a4f56
Mostly as a legacy compatibility thing, but RIGHT and FULL joins will cause ambiguity checking (because old SQLites didn't support them and hence no applications can be relying on this).

@@ -1057,7 +1057,7 @@ def _get_connected_batch_event_backfill_results_txn(
INNER JOIN batch_events AS c
ON i.next_batch_id = c.batch_id
/* Get the depth of the batch start event from the events table */
INNER JOIN events AS e USING (event_id)
INNER JOIN events AS e ON c.event_id = e.event_id
Copy link
Contributor

@MadLittleMods MadLittleMods May 23, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Complement tests pass with either c or i so weren't much use in discerning the two :/.

For reference, backfilling from the insertion or batch event will work. But we already have the insertion event so there isn't a need to backfill it again. And the mistake here would only make it hint to backfill from the batch event earlier anyway.

They are both in the historical chain (diagram)

Copy link
Member

@erikjohnston erikjohnston left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Otherwise lgtm

@@ -0,0 +1 @@
Fix ambiguous column name that would prevent use of MSC2716 History Import when using Postgres as a database.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you rephrase this as what the end user would see, rather than what is happening internally? I.e. something like "Fix bug where servers using SQLite3 would fail to backfill in MSC2716 rooms" or something.

The changelog is aimed at server admins rather than developers

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the problem is that I have never used this feature and don't really understand how this would look to a server admin :-).

Fix bug where servers using a Postgres database would fail to import historical messages using MSC2716.

Would that be more suitable?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, potentially? @MadLittleMods may have a better idea

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't fail to import for me with my local Postgres Synapse setup.

Looking at this spot of code, I am guessing it will fail to backfill from another server when it sees an insertion event if you have experimental_features.msc2716_enabled enabled.

This seems more accurate:

Fix bug where servers using a Postgres database would fail to backfill from an insertion event when MSC2716 is enabled (experimental_features.msc2716_enabled).


$ psql -V
psql (PostgreSQL) 13.5

$ postgres -V
postgres (PostgreSQL) 13.5

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, my message was purely going off the name of the test. What you've written makes sense.

@reivilibre reivilibre enabled auto-merge (squash) May 26, 2022 09:28
@reivilibre reivilibre merged commit e768644 into develop May 26, 2022
@reivilibre reivilibre deleted the rei/wcom_importhistory_columns branch May 26, 2022 11:19
Fizzadar added a commit to Fizzadar/synapse that referenced this pull request Jun 15, 2022
Synapse 1.61.0 (2022-06-14)
===========================

This release removes support for the non-standard feature known both as 'groups' and as 'communities', which have been superseded by *Spaces*.

See [the upgrade notes](https://github.com/matrix-org/synapse/blob/develop/docs/upgrade.md#upgrading-to-v1610)
for more details.

Improved Documentation
----------------------

- Mention removed community/group worker endpoints in [upgrade.md](https://github.com/matrix-org/synapse/blob/develop/docs/upgrade.md#upgrading-to-v1610s). Contributed by @olmari. ([\matrix-org#13023](matrix-org#13023))

Synapse 1.61.0rc1 (2022-06-07)
==============================

Features
--------

- Add new `media_retention` options to the homeserver config for routinely cleaning up non-recently accessed media. ([\matrix-org#12732](matrix-org#12732), [\matrix-org#12972](matrix-org#12972), [\matrix-org#12977](matrix-org#12977))
- Experimental support for [MSC3772](matrix-org/matrix-spec-proposals#3772): Push rule for mutually related events. ([\matrix-org#12740](matrix-org#12740), [\matrix-org#12859](matrix-org#12859))
- Update to the `check_event_for_spam` module callback: Deprecate the current callback signature, replace it with a new signature that is both less ambiguous (replacing booleans with explicit allow/block) and more powerful (ability to return explicit error codes). ([\matrix-org#12808](matrix-org#12808))
- Add storage and module API methods to get monthly active users (and their corresponding appservices) within an optionally specified time range. ([\matrix-org#12838](matrix-org#12838), [\matrix-org#12917](matrix-org#12917))
- Support the new error code `ORG.MATRIX.MSC3823.USER_ACCOUNT_SUSPENDED` from [MSC3823](matrix-org/matrix-spec-proposals#3823). ([\matrix-org#12845](matrix-org#12845), [\matrix-org#12923](matrix-org#12923))
- Add a configurable background job to delete stale devices. ([\matrix-org#12855](matrix-org#12855))
- Improve URL previews for pages with empty elements. ([\matrix-org#12951](matrix-org#12951))
- Allow updating a user's password using the admin API without logging out their devices. Contributed by @jcgruenhage. ([\matrix-org#12952](matrix-org#12952))

Bugfixes
--------

- Always send an `access_token` in `/thirdparty/` requests to appservices, as required by the [Application Service API specification](https://spec.matrix.org/v1.1/application-service-api/#third-party-networks). ([\matrix-org#12746](matrix-org#12746))
- Implement [MSC3816](matrix-org/matrix-spec-proposals#3816): sending the root event in a thread should count as having 'participated' in it. ([\matrix-org#12766](matrix-org#12766))
- Delete events from the `federation_inbound_events_staging` table when a room is purged through the admin API. ([\matrix-org#12784](matrix-org#12784))
- Fix a bug where we did not correctly handle invalid device list updates over federation. Contributed by Carl Bordum Hansen. ([\matrix-org#12829](matrix-org#12829))
- Fix a bug which allowed multiple async operations to access database locks concurrently. Contributed by @sumnerevans @ Beeper. ([\matrix-org#12832](matrix-org#12832))
- Fix an issue introduced in Synapse 0.34 where the `/notifications` endpoint would only return notifications if a user registered at least one pusher. Contributed by Famedly. ([\matrix-org#12840](matrix-org#12840))
- Fix a bug where servers using a Postgres database would fail to backfill from an insertion event when MSC2716 is enabled (`experimental_features.msc2716_enabled`). ([\matrix-org#12843](matrix-org#12843))
- Fix [MSC3787](matrix-org/matrix-spec-proposals#3787) rooms being omitted from room directory, room summary and space hierarchy responses. ([\matrix-org#12858](matrix-org#12858))
- Fix a bug introduced in Synapse 1.54.0 which could sometimes cause exceptions when handling federated traffic. ([\matrix-org#12877](matrix-org#12877))
- Fix a bug introduced in Synapse 1.59.0 which caused room deletion to fail with a foreign key violation error. ([\matrix-org#12889](matrix-org#12889))
- Fix a long-standing bug which caused the `/messages` endpoint to return an incorrect `end` attribute when there were no more events. Contributed by @Vetchu. ([\matrix-org#12903](matrix-org#12903))
- Fix a bug introduced in Synapse 1.58.0 where `/sync` would fail if the most recent event in a room was a redaction of an event that has since been purged. ([\matrix-org#12905](matrix-org#12905))
- Fix a potential memory leak when generating thumbnails. ([\matrix-org#12932](matrix-org#12932))
- Fix a long-standing bug where a URL preview would break if the image failed to download. ([\matrix-org#12950](matrix-org#12950))

Improved Documentation
----------------------

- Fix typographical errors in documentation. ([\matrix-org#12863](matrix-org#12863))
- Fix documentation incorrectly stating the `sendToDevice` endpoint can be directed at generic workers. Contributed by Nick @ Beeper. ([\matrix-org#12867](matrix-org#12867))

Deprecations and Removals
-------------------------

- Remove support for the non-standard groups/communities feature from Synapse. ([\matrix-org#12553](matrix-org#12553), [\matrix-org#12558](matrix-org#12558), [\matrix-org#12563](matrix-org#12563), [\matrix-org#12895](matrix-org#12895), [\matrix-org#12897](matrix-org#12897), [\matrix-org#12899](matrix-org#12899), [\matrix-org#12900](matrix-org#12900), [\matrix-org#12936](matrix-org#12936), [\matrix-org#12966](matrix-org#12966))
- Remove contributed `kick_users.py` script. This is broken under Python 3, and is not added to the environment when `pip install`ing Synapse. ([\matrix-org#12908](matrix-org#12908))
- Remove `contrib/jitsimeetbridge`. This was an unused experiment that hasn't been meaningfully changed since 2014. ([\matrix-org#12909](matrix-org#12909))
- Remove unused `contrib/experiements/cursesio.py` script, which fails to run under Python 3. ([\matrix-org#12910](matrix-org#12910))
- Remove unused `contrib/experiements/test_messaging.py` script. This fails to run on Python 3. ([\matrix-org#12911](matrix-org#12911))

Internal Changes
----------------

- Test Synapse against Complement with workers. ([\matrix-org#12810](matrix-org#12810), [\matrix-org#12933](matrix-org#12933))
- Reduce the amount of state we pull from the DB. ([\matrix-org#12811](matrix-org#12811), [\matrix-org#12964](matrix-org#12964))
- Try other homeservers when re-syncing state for rooms with partial state. ([\matrix-org#12812](matrix-org#12812))
- Resume state re-syncing for rooms with partial state after a Synapse restart. ([\matrix-org#12813](matrix-org#12813))
- Remove Mutual Rooms' ([MSC2666](matrix-org/matrix-spec-proposals#2666)) endpoint dependency on the User Directory. ([\matrix-org#12836](matrix-org#12836))
- Experimental: expand `check_event_for_spam` with ability to return additional fields. This enables spam-checker implementations to experiment with mechanisms to give users more information about why they are blocked and whether any action is needed from them to be unblocked. ([\matrix-org#12846](matrix-org#12846))
- Remove `dont_notify` from the `.m.rule.room.server_acl` rule. ([\matrix-org#12849](matrix-org#12849))
- Remove the unstable `/hierarchy` endpoint from [MSC2946](matrix-org/matrix-spec-proposals#2946). ([\matrix-org#12851](matrix-org#12851))
- Pull out less state when handling gaps in room DAG. ([\matrix-org#12852](matrix-org#12852), [\matrix-org#12904](matrix-org#12904))
- Clean-up the push rules datastore. ([\matrix-org#12856](matrix-org#12856))
- Correct a type annotation in the URL preview source code. ([\matrix-org#12860](matrix-org#12860))
- Update `pyjwt` dependency to [2.4.0](https://github.com/jpadilla/pyjwt/releases/tag/2.4.0). ([\matrix-org#12865](matrix-org#12865))
- Enable the `/account/whoami` endpoint on synapse worker processes. Contributed by Nick @ Beeper. ([\matrix-org#12866](matrix-org#12866))
- Enable the `batch_send` endpoint on synapse worker processes. Contributed by Nick @ Beeper. ([\matrix-org#12868](matrix-org#12868))
- Don't generate empty AS transactions when the AS is flagged as down. Contributed by Nick @ Beeper. ([\matrix-org#12869](matrix-org#12869))
- Fix up the variable `state_store` naming. ([\matrix-org#12871](matrix-org#12871))
- Faster room joins: when querying the current state of the room, wait for state to be populated. ([\matrix-org#12872](matrix-org#12872))
- Avoid running queries which will never result in deletions. ([\matrix-org#12879](matrix-org#12879))
- Use constants for EDU types. ([\matrix-org#12884](matrix-org#12884))
- Reduce database load of `/sync` when presence is enabled. ([\matrix-org#12885](matrix-org#12885))
- Refactor `have_seen_events` to reduce memory consumed when processing federation traffic. ([\matrix-org#12886](matrix-org#12886))
- Refactor receipt linearization code. ([\matrix-org#12888](matrix-org#12888))
- Add type annotations to `synapse.logging.opentracing`. ([\matrix-org#12894](matrix-org#12894))
- Remove PyNaCl occurrences directly used in Synapse code. ([\matrix-org#12902](matrix-org#12902))
- Bump types-jsonschema from 4.4.1 to 4.4.6. ([\matrix-org#12912](matrix-org#12912))
- Rename storage classes. ([\matrix-org#12913](matrix-org#12913))
- Preparation for database schema simplifications: stop reading from `event_edges.room_id`. ([\matrix-org#12914](matrix-org#12914))
- Check if we are in a virtual environment before overriding the `PYTHONPATH` environment variable in the demo script. ([\matrix-org#12916](matrix-org#12916))
- Improve the logging when signature checks on events fail. ([\matrix-org#12925](matrix-org#12925))

# -----BEGIN PGP SIGNATURE-----
#
# iQFEBAABCgAuFiEEBTGR3/RnAzBGUif3pULk7RsPrAkFAmKoaa0QHGVyaWtAbWF0
# cml4Lm9yZwAKCRClQuTtGw+sCVn3B/sF8hdhrZ7hWW40ST3eG9cEKNFrj/xZXiaI
# zho3ryrxaQF68BSKot15AvZSprEdwBXWrb8WeTjyw+QH7vTKrCQDZ0p7qubn10Z7
# BuKq9hyYjyCLjBZrgy8d4U3Y8gsSByuO59YKHNLn+UTJLOs5GTH8Wprwh4mpU3Jl
# +o+cC+lMSVcyZij2hihFymcSxWq/I9WL0dsjRif8x0BUQwRXwmXc6+mhlgLBe2Zs
# 2dUouzJ8NVZcjfWvsg4noXPrNQ/IiyCVZlSIgaDftDIxVSPk5/rXiUUNex8Tn1+I
# TnOgnXhpOQD1vwVGYS8LrcfA0ubSili7xUJ8k2e5TkCjpkaVnYNu
# =q+7N
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue Jun 14 11:57:49 2022 BST
# gpg:                using RSA key 053191DFF4670330465227F7A542E4ED1B0FAC09
# gpg:                issuer "[email protected]"
# gpg: WARNING: server 'dirmngr' is older than us (2.2.34 < 2.3.6)
# gpg: Note: Outdated servers may lack important security fixes.
# gpg: Note: Use the command "gpgconf --kill all" to restart them.
# gpg: Can't check signature: No public key

# Conflicts:
#	docs/workers.md
#	synapse/handlers/message.py
#	synapse/handlers/pagination.py
#	synapse/push/bulk_push_rule_evaluator.py
#	synapse/push/push_rule_evaluator.py
#	synapse/rest/client/receipts.py
#	synapse/storage/databases/main/appservice.py
#	tests/push/test_push_rule_evaluator.py
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

TestImportHistoricalMessages test fails in Complement when using workers
4 participants