Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add more queries, fix DB sequence ID's #113

Merged
merged 1 commit into from
Aug 21, 2024

Conversation

richardhuaaa
Copy link
Contributor

@richardhuaaa richardhuaaa commented Aug 21, 2024

I've fixed a number of issues, would like to suggest some best practices going forward that I've implemented here:

  1. Reliable sequence ID's. Whenever we add a table that uses a sequence for ordering, we should also add a Postgres stored function immediately below it that performs insertions by taking an advisory lock first. This ensures that inserts on the table are processed serially.
  2. Idempotent inserts. Whenever we add an insert query, we should add an ON CONFLICT DO NOTHING clause, and either return the inserted count via :execrows or the row itself via :one. This saves the caller from having to parse untyped errors in search of unique constraint violations, which are sometimes expected.
  3. Result granularity.
    • Postgres functions should always return the inserted rows via RETURNING *, even if the result is not currently used. This is because any future changes to the function would require a DB migration.
    • In queries.sql, results should be defined as narrowly as possible - for example, do not return the row via :one if the code is not using it. This saves networking bandwidth, and the query can easily be changed later on without a migration.
  4. Consistent SID terminology. Settling on the terms Node ID and Sequence ID as the components of a SID, removing references to LocalID.

Some other stuff I've done:

  1. I've split up the sid field in the gateway_envelopes table into node_id and sequence_id, because the postgres BIGINT type is a signed 64-bit int that cannot hold a uint64 without bad side effects.
  2. I've added the insert_gateway_envelope and delete_staged_originator_envelope queries which are needed for the stacked PR's.

Copy link
Contributor Author

richardhuaaa commented Aug 21, 2024

This stack of pull requests is managed by Graphite. Learn more about stacking.

Join @richardhuaaa and the rest of your teammates on Graphite Graphite

This was referenced Aug 21, 2024
Comment on lines 131 to +135
if err != nil {
return fmt.Errorf("unable to insert node info into database: %v", err)
}

if numRows == 0 {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An example of where ON CONFLICT DO NOTHING + :execRows is useful

AS $$
BEGIN
-- Ensures that the generated sequence ID matches the insertion order
-- Only released at the end of the enclosing transaction - beware if called within a long transaction
Copy link
Contributor Author

@richardhuaaa richardhuaaa Aug 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Every SQL statement gets wrapped into a transaction if one is not explicitly created, but what we need to be careful of is if application code creates an enclosing transaction that performs a bunch of stuff after this function is called.

If that is needed, the application code can manually wrap the function call in a sub-transaction, to make sure the lock is released in a timely manner.

@richardhuaaa richardhuaaa force-pushed the 08-20-add_queries_and_sprocs branch 2 times, most recently from 0c684a8 to 966eb8b Compare August 21, 2024 04:29
@richardhuaaa richardhuaaa force-pushed the 08-20-add_queries_and_sprocs branch from 966eb8b to b342aa5 Compare August 21, 2024 04:30
@richardhuaaa richardhuaaa changed the title Add queries and sprocs Add more queries, fix DB sequence ID's Aug 21, 2024
@richardhuaaa richardhuaaa marked this pull request as ready for review August 21, 2024 04:35
@neekolas
Copy link
Contributor

Great work going deep on these sequencing problems. Unfortunate that we have to use table-level locks, but I don't see much of an alternative.

This article is also useful and lists a bunch of alternative solutions to the problem (and why they're all bad). Batching would help a bit, but only if we had enough volume to have large batches in very short time windows. We could always add batching later.

@richardhuaaa richardhuaaa merged commit 3879604 into main Aug 21, 2024
5 checks passed
@richardhuaaa richardhuaaa deleted the 08-20-add_queries_and_sprocs branch August 21, 2024 16:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants