
fix: wrong encoding on migration #651

Merged · 1 commit merged into main from fix/encoding on Jan 15, 2025
Conversation

gfyrag (Contributor) commented on Jan 15, 2025

No description provided.

gfyrag requested a review from a team as a code owner on January 15, 2025 at 10:39
gfyrag enabled auto-merge on January 15, 2025 at 10:39

coderabbitai bot commented Jan 15, 2025

Walkthrough

This pull request updates the SQL migration script with database schema modifications: new functions and triggers for the moves, accounts, transactions, and logs tables, plus a change to the encoding used for log hash generation. The goal is to improve data integrity and structural consistency across table operations.

Changes

File: internal/storage/bucket/migrations/11-make-stateless/up.sql

  • Updated set_log_hash() to use explicit UTF-8 encoding
  • Added new functions: enforce_reference_uniqueness(), set_transaction_inserted_at(), set_transaction_addresses(), set_transaction_addresses_segments(), set_address_array_for_account()
  • Created a new constraint trigger for reference uniqueness
  • Modified table structures for moves, transactions, logs, and accounts
  • Added a new accounts_volumes table

Sequence Diagram

sequenceDiagram
    participant DB as Database
    participant Trigger as Constraint Triggers
    participant Func as Custom Functions

    DB->>Trigger: Insert/Update Transaction
    Trigger->>Func: Enforce Reference Uniqueness
    Trigger->>Func: Set Transaction Metadata
    Trigger->>Func: Update Address Arrays
    Func-->>DB: Validate and Modify Data


Suggested reviewers

  • paul-nicolas

Poem

🐰 In the realm of SQL, where data dances free,
Triggers and functions weave their magic spree
Encoding shifts, tables transform with grace
A database ballet in its digital space
Migration's rabbit hops with coding delight! 🚀



coderabbitai bot left a comment


Actionable comments posted: 0

🔭 Outside diff range comments (2)
internal/storage/bucket/migrations/11-make-stateless/up.sql (2)

Line range hint 523-587: Add error handling in DO block

The DO block creates multiple triggers and sequences without proper error handling. Consider adding exception handling to ensure atomic execution.

DO
$do$
	declare
		ledger record;
		vsql text;
	BEGIN
+       -- Add transaction to ensure atomic execution
+       BEGIN
		for ledger in select * from _system.ledgers where bucket = current_schema loop
			-- Wrap each ledger's operations in a sub-transaction
+           BEGIN
			vsql = 'create sequence "transaction_id_' || ledger.id || '" owned by transactions.id';
			execute vsql;
			-- ... (rest of the operations)
+           EXCEPTION WHEN OTHERS THEN
+               RAISE NOTICE 'Failed to process ledger %: %', ledger.name, SQLERRM;
+               RAISE;
+           END;
		end loop;
+       EXCEPTION WHEN OTHERS THEN
+           RAISE NOTICE 'Migration failed: %', SQLERRM;
+           RAISE;
+       END;
	END
$do$;
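
Worth noting about the suggestion above: a DO block already executes as a single statement inside the surrounding transaction, and the final RAISE re-throws, so the added handlers do not change atomicity — their value is the NOTICE identifying which ledger failed before the rollback.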

Line range hint 589-617: Consider using a unique index instead of advisory locks

The current implementation uses advisory locks to enforce reference uniqueness, which could cause unnecessary blocking. A unique partial index would be more efficient.

-create or replace function enforce_reference_uniqueness() returns trigger
-	security definer
-	language plpgsql
-as
-$$
-begin
-	perform pg_advisory_xact_lock(hashtext('reference-check' || current_schema));
-
-	if exists(
-		select 1
-		from transactions
-		where reference = new.reference
-			and ledger = new.ledger
-			and id != new.id
-	) then
-		raise exception 'duplicate reference';
-	end if;
-
-	return new;
-end
-$$ set search_path from current;
-
-create constraint trigger enforce_reference_uniqueness
-after insert on transactions
-deferrable initially deferred
-for each row
-when ( new.reference is not null )
-execute procedure enforce_reference_uniqueness();

+-- Create a unique partial index instead
+CREATE UNIQUE INDEX transactions_reference_unique_idx 
+ON transactions (ledger, reference) 
+WHERE reference IS NOT NULL;
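
One trade-off worth flagging (assuming callers depend on the deferred timing): the trigger above checks at commit, whereas a plain unique index raises immediately on INSERT, and PostgreSQL does not support deferrable partial unique indexes. If immediate errors are acceptable, the index can also be built without blocking writes — a sketch, noting that CONCURRENTLY cannot run inside a transaction block and so would have to live outside a transactional migration:

-- Sketch only: must run outside a transaction block.
CREATE UNIQUE INDEX CONCURRENTLY transactions_reference_unique_idx
ON transactions (ledger, reference)
WHERE reference IS NOT NULL;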
🧹 Nitpick comments (2)
internal/storage/bucket/migrations/11-make-stateless/up.sql (2)

Line range hint 3-22: Consider using a more robust approach for transaction_date()

The current implementation using a temporary table could cause issues with concurrent transactions. Consider using a simpler approach:

-create or replace function transaction_date() returns timestamp as $$
-    declare
-        ret timestamp without time zone;
-    begin
-        create temporary table if not exists transaction_date on commit delete rows as
-        select statement_timestamp();
-
-        select *
-        from transaction_date
-        limit 1
-        into ret;
-
-        if not found then
-            ret = statement_timestamp();
-
-            insert into transaction_date
-            select ret at time zone 'utc';
-        end if;
-
-        return ret at time zone 'utc';
-    end
+create or replace function transaction_date() returns timestamp as $$
+    begin
+        return statement_timestamp() at time zone 'utc';
+    end
$$ language plpgsql;
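
A caveat on this simplification (assuming the temp table existed to pin a single timestamp per transaction): statement_timestamp() advances with each statement, so a multi-statement transaction would see differing dates. If per-transaction stability is the goal, a sketch using transaction_timestamp() preserves it without the temp table:

create or replace function transaction_date() returns timestamp as $$
    begin
        -- transaction_timestamp() is fixed for the whole transaction,
        -- matching the memoization behavior of the original temp table.
        return transaction_timestamp() at time zone 'utc';
    end
$$ language plpgsql;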

Line range hint 523-587: Consider optimizing trigger operations

The current implementation creates multiple triggers per ledger, which could impact insert performance. Consider:

  1. Combining related operations into single triggers where possible
  2. Moving non-critical operations to the application layer

For example, set_transaction_addresses and set_transaction_addresses_segments could be combined:

create or replace function set_transaction_addresses() returns trigger
	security definer
	language plpgsql
as
$$
begin
	new.sources = (
		select to_jsonb(array_agg(v->>'source')) as value
		from jsonb_array_elements(new.postings::jsonb) v
	);
	new.destinations = (
		select to_jsonb(array_agg(v->>'destination')) as value
		from jsonb_array_elements(new.postings::jsonb) v
	);
+	new.sources_arrays = (
+		select to_jsonb(array_agg(explode_address(v ->> 'source'))) as value
+		from jsonb_array_elements(new.postings::jsonb) v
+	);
+	new.destinations_arrays = (
+		select to_jsonb(array_agg(explode_address(v ->> 'destination'))) as value
+		from jsonb_array_elements(new.postings::jsonb) v
+	);

	return new;
end
$$ set search_path from current;
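
For completeness, a sketch of wiring up the combined function (the trigger name is illustrative; the migration's actual definitions may differ). BEFORE INSERT timing is needed because the function rewrites NEW:

create trigger set_transaction_addresses
    before insert
    on transactions
    for each row
execute procedure set_transaction_addresses();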
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2a7a065 and f422c88.

📒 Files selected for processing (1)
  • internal/storage/bucket/migrations/11-make-stateless/up.sql (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: Tests
🔇 Additional comments (1)
internal/storage/bucket/migrations/11-make-stateless/up.sql (1)

464-464: ⚠️ Potential issue

Verify hash consistency after encoding change

The change from implicit LATIN1 to explicit UTF-8 encoding is good for handling international characters, but it will affect how new log hashes are generated. This could create inconsistencies between old and new log entries.

Run this script to check for potential hash inconsistencies:

✅ Verification successful

Encoding change is safe and well-implemented

The change from LATIN1 to UTF-8 encoding is safe because:

  • Previous hashes are preserved in their base64-encoded form
  • The UTF-8 encoding is only applied to new JSON data
  • Hash calculation remains atomic within transactions
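
To make the impact concrete, a minimal illustration (hypothetical payload; assumes pgcrypto's digest(), which this thread does not confirm is the hash function used) of why old entries must keep their stored hashes:

create extension if not exists pgcrypto;

-- The same payload hashes differently under the two encodings whenever it
-- contains non-ASCII characters, so re-hashing old logs would break the chain.
select
    digest(convert_to('{"memo": "café"}', 'UTF8'), 'sha256')   as utf8_hash,
    digest(convert_to('{"memo": "café"}', 'LATIN1'), 'sha256') as latin1_hash;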
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check for potential hash inconsistencies in logs table
# Look for any logs with non-ASCII characters that might be affected by the encoding change

# Check for logs with non-ASCII characters
rg -l '[^\x00-\x7F]' | grep -i 'log'

# Check the logs table structure (psql meta-command; connection settings assumed)
psql -c '\d logs'

# Sample a few log entries to verify hash consistency
psql -c "SELECT id, encode(hash, 'hex') AS hash, type
FROM logs
WHERE date < NOW()
ORDER BY id DESC
LIMIT 5;"

Length of output: 368


Script:

#!/bin/bash
# Check the complete migration file content
cat internal/storage/bucket/migrations/11-make-stateless/up.sql

# Look for other encoding-related changes
rg -i "convert_to|encode|::bytea|latin1|utf-8|utf8" internal/storage/bucket/migrations/11-make-stateless/

# Find related test files
fd -e go -e sql test | grep -i "log\|hash\|encoding"

Length of output: 18594


codecov bot commented Jan 15, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 81.64%. Comparing base (2a7a065) to head (f422c88).
Report is 1 commit behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #651      +/-   ##
==========================================
+ Coverage   81.62%   81.64%   +0.01%     
==========================================
  Files         131      131              
  Lines        7059     7059              
==========================================
+ Hits         5762     5763       +1     
  Misses        994      994              
+ Partials      303      302       -1     

☔ View full report in Codecov by Sentry.

gfyrag added this pull request to the merge queue on Jan 15, 2025
Merged via the queue into main with commit 2949957 on Jan 15, 2025
10 checks passed
gfyrag deleted the fix/encoding branch on January 15, 2025 at 10:48