Consider supporting PostgreSQL (and switching to it for matrix.org/vector.im) #468

reivilibre · 2021-11-12T12:44:54Z

Some of the recent problems with running the casefolding migration script against the live SQLite database have led us to think about whether we should support PostgreSQL as an alternative to SQLite (and use it in production).

Advantages

As a team, we generally seem more experienced with Postgres
- this extends to things like reading query plans
Postgres' concurrency model (multi-version concurrency control) lends itself better to running these out-of-process jobs without locking up the main Sydent process.
- even after thinking through SQLite's locks for this job, we haven't deduced why it does not work properly. (It may be a SQLite bug, but assuming it's not, we appear to have a better grasp of Postgres' concurrency model.)
Dubious: I can see some people being keen on this for 'scalability' reasons but I'm not aware of there being a huge load placed on Sydent anyway.
Dubious: Postgres supports things like replication, which may be useful for operating a reliable service — not sure if we care about that
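
To make the locking concern concrete, here is a minimal sketch of SQLite's general single-writer behaviour with the default rollback journal. It is not a diagnosis of the casefolding migration problem, which the thread leaves unexplained. The table name, the connection names and the 0.1 second timeout are assumptions for illustration only.

```python
import sqlite3


def writer_blocks_writer(db_path: str) -> bool:
    """Return True if a second connection cannot start a write
    transaction while the first one holds an open write transaction."""
    # isolation_level=None means we issue BEGIN/ROLLBACK ourselves.
    job = sqlite3.connect(db_path, isolation_level=None)
    # Short busy timeout so the demo fails fast instead of waiting.
    server = sqlite3.connect(db_path, isolation_level=None, timeout=0.1)
    try:
        job.execute("CREATE TABLE IF NOT EXISTS demo (id INTEGER PRIMARY KEY, v TEXT)")
        # The "out-of-process job" takes the database-wide write lock.
        job.execute("BEGIN IMMEDIATE")
        job.execute("INSERT INTO demo (v) VALUES ('from job')")
        try:
            # The "main process" now tries to write as well.
            server.execute("BEGIN IMMEDIATE")
        except sqlite3.OperationalError:
            # Typically reported as "database is locked".
            return True
        server.execute("ROLLBACK")
        return False
    finally:
        if job.in_transaction:
            job.execute("ROLLBACK")
        job.close()
        server.close()
```

By contrast, under PostgreSQL's MVCC ordinary writes take row-level locks and readers are not blocked by a writer, so a long-running job is less likely to stall unrelated requests.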

Disadvantages

Obviously, it's potentially more code to support. We'd want to expand CI to run against Postgres.
If we needed to, moving back in the other direction seems less simple (I can't immediately find an off-the-shelf tool, though it may be possible to create an empty Sydent schema in SQLite and dump/load only the table contents).

Why adding support for PostgreSQL may not be so bad

SQLite is generally inspired by Postgres for its SQL dialect. Whilst not strictly true, it's almost as though Postgres SQL is a superset of SQLite SQL — I expect there to be very few (if any) query changes necessary. Moving in the other direction is harder because SQLite is pickier, especially around schema changes.
- essentially: if you write your queries for SQLite then they probably work on Postgres
Database migration can likely be done out-of-the-box with a tool such as pgloader, which converts from several database and file formats (including SQLite3) to Postgres
- it seems to handle indexes and NOT NULL properly.
The database is only small so we can probably do migration with a small amount of downtime without needing to worry too much
We only do fairly basic operations to the database so it's unlikely we would need to write engine-specific queries


DMRobertson · 2021-11-15T11:03:42Z

> Obviously, it's potentially more code to support. We'd want to expand CI to run against Postgres.

Only if you want to support both DBs in parallel. (But sqlalchemy exists to mitigate that somewhat anyway.)

reivilibre · 2021-12-21T19:00:25Z

As another reason I'm more comfortable with Postgres: I found out that sqlite3's .dump is dangerous because it doesn't preserve PRAGMA user_version. When you then start up a new Sydent on the restored backup, it goes and corrupts itself. I did this on a copy of the backup, so I didn't lose anything, but this sort of thing is concerning. In contrast, I have more faith in pg_dump, which I have used time and time again.
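
A minimal sketch of a workaround, assuming the backup is taken through Python's sqlite3 module rather than the sqlite3 command-line shell: read PRAGMA user_version from the source and set it explicitly on the restored copy, instead of relying on the dump to carry it. The function name is hypothetical and this is not what Sydent does today.

```python
import sqlite3


def copy_with_user_version(src: sqlite3.Connection, dst: sqlite3.Connection) -> int:
    """Copy schema and rows from src to dst via a SQL dump, then carry
    PRAGMA user_version across explicitly. Returns the copied version."""
    version = src.execute("PRAGMA user_version").fetchone()[0]
    # iterdump() yields the SQL statements that recreate schema and data.
    dst.executescript("\n".join(src.iterdump()))
    # PRAGMA values cannot be bound as parameters, so format the integer in.
    dst.execute(f"PRAGMA user_version = {int(version)}")
    dst.commit()
    return version
```

Checking that the restored database reports the same user_version before starting Sydent on it would catch the failure described above.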

I would also go so far as to say that maybe we should use a proper SQL migration library in Sydent instead of rolling our own, but oh well.

DMRobertson · 2022-01-07T14:57:44Z

> I would also go so far as to say that maybe we should use a proper SQL migration library in Sydent instead of rolling our own, but oh well.

I've had neutral to positive experiences with sqlalchemy + alembic. But I don't think Sydent's DB changes all that much; doubt the investment would be worth it. (Unless it was a testbed for synapse...)

jdauphant · 2022-03-18T18:38:04Z

In our case, we have high usage of Sydent: the Sydent process is regularly at 100% CPU on some of our Sydent deployments (we use 17 replicated Sydent servers).
Also, backup isn't simplified by SQLite.

anoadragon453 · 2022-10-12T10:53:48Z

In terms of the work required, off the top of my head one would need to:

Set up config/env vars for PostgreSQL.
Go through each query that SQLite currently executes and check that the same results would be returned by PostgreSQL. If not, try to rewrite it without sacrificing performance. As a last resort, pick separate queries to use based on the currently in-use database engine.
Write an SQLite → PostgreSQL migrator.
Get the tests to run on PostgreSQL (unit tests are probably fine as a start).
Write up documentation for using it.

So no small feat, but the codebase is fairly small and self-contained. I count 82 instances of cur.execute( in the source code, and the hope is that most of those queries will already “just work” with PostgreSQL.
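
As a rough illustration of the query audit and the "last resort" described above, here is a minimal sketch. One difference that is certain to come up is parameter style: Python's sqlite3 uses `?` placeholders while psycopg2 uses `%s`. The table `kv`, the query name and the helper functions are hypothetical and are not taken from Sydent's code.

```python
import sqlite3


def to_pyformat(sql: str) -> str:
    """Convert sqlite3-style '?' placeholders to psycopg2-style '%s'.

    Naive on purpose: it would also rewrite a literal '?' inside a quoted
    string, so each query still needs to be checked by hand.
    """
    return sql.replace("%", "%%").replace("?", "%s")


# Hypothetical example of a query that needs a per-engine variant.
QUERIES = {
    "insert_if_absent": {
        "sqlite": "INSERT OR IGNORE INTO kv (k, v) VALUES (?, ?)",
        "postgres": "INSERT INTO kv (k, v) VALUES (?, ?) ON CONFLICT DO NOTHING",
    },
}


def query_for(name: str, engine: str) -> str:
    """Pick the SQL text for the engine in use, in that driver's param style."""
    sql = QUERIES[name][engine]
    return to_pyformat(sql) if engine == "postgres" else sql


def demo_sqlite() -> list:
    """Run the SQLite variant twice with the same key; the second is ignored."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE kv (k TEXT PRIMARY KEY, v TEXT)")
    sql = query_for("insert_if_absent", "sqlite")
    conn.execute(sql, ("a", "first"))
    conn.execute(sql, ("a", "second"))
    rows = conn.execute("SELECT k, v FROM kv").fetchall()
    conn.close()
    return rows
```

Keeping the `?` style as the single source form and converting at the boundary means most of the existing cur.execute( calls could stay textually unchanged, with only the genuinely divergent queries listed per engine.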

One should also be wary of subtle differences between SQLite and PostgreSQL that may cause bugs.

It would also be nice to define minimum supported versions of each database engine. I propose just copying Synapse’s: https://matrix-org.github.io/synapse/latest/deprecation_policy.html

anoadragon453 added the Z-Time-Tracked label (Element employees should track their time spent on this issue/PR) on Oct 12, 2022

richvdh mentioned this issue Jan 31, 2023

i wish that it could have more database support. (feature) #125

Closed
