-
Notifications
You must be signed in to change notification settings - Fork 333
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: region_peers
returns same region_id for multi logical tables
#4190
Conversation
WalkthroughThe changes optimize how region and peer information is collected and stored in the Changes
Assessment against linked issues
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (invoked as PR comments)
Additionally, you can add CodeRabbit Configration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (1)
- src/catalog/src/information_schema/region_peers.rs (2 hunks)
Additional comments not posted (2)
src/catalog/src/information_schema/region_peers.rs (2)
34-34
: Updated import statement to includeRegionId
.The inclusion of
RegionId
in the import statement supports the changes in theadd_region_peers
method, which now utilizesRegionId
to ensure unique identifiers for different logical tables.
217-222
: Refactoredadd_region_peers
to accepttable_id
.The addition of the
table_id
parameter is a significant and necessary change to ensure uniqueregion_id
values across different logical tables. This directly addresses the bug reported in issue #4157. Consider adding a comment here explaining howtable_id
is used to generate uniqueregion_ids
.
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #4190 +/- ##
==========================================
- Coverage 85.11% 84.85% -0.27%
==========================================
Files 1028 1028
Lines 180903 180908 +5
==========================================
- Hits 153970 153503 -467
- Misses 26933 27405 +472 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
PTAL @WenyXu |
@realtaobo, that is nicely done. Would you like to add a sqlness test for this case? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (4)
- tests/cases/distributed/information_schema/region_peers.result (1 hunks)
- tests/cases/distributed/information_schema/region_peers.sql (1 hunks)
- tests/cases/standalone/information_schema/region_peers.result (1 hunks)
- tests/cases/standalone/information_schema/region_peers.sql (1 hunks)
Additional comments not posted (9)
tests/cases/standalone/information_schema/region_peers.sql (2)
4-26
: Ensure proper configuration for metric tables and partitioning.The tables are configured with the
metric
engine, which is suitable for time-series data. The partitioning logic inregion_peers_test
appears well-thought-out for scalability. However, ensure that the partition ranges (a < 10
,a >= 10 AND a < 20
,a >= 20
) cover all potential values without overlaps or gaps.
32-32
: Proper schema context switch.Switching back to the
public
schema at the end of the script is a good practice to maintain a clean testing environment.tests/cases/distributed/information_schema/region_peers.sql (3)
28-28
: Optimized SELECT statement for distributed environments.Specifying columns (
region_id
,is_leader
,status
,down_seconds
) instead of using*
is a good practice, especially in distributed environments where minimizing data transfer is crucial.
32-32
: Consistent schema context switching across test environments.Maintaining consistency in switching back to the
public
schema at the end of the script is crucial for ensuring a clean state across different test environments.
4-26
: Check distributed-specific configurations for metric tables.The CREATE TABLE statements are identical to the standalone version. Ensure that any distributed-specific configurations or optimizations are considered and applied if necessary.
Verification successful
Check distributed-specific configurations for metric tables.
The CREATE TABLE statements in the distributed environment are consistent with those in the standalone environment. The tables are configured with
engine = metric
andPARTITION ON COLUMNS
, which are typical for distributed environments. However, ensure that any additional distributed-specific optimizations or configurations are considered and applied if necessary.
region_peers_phy
:engine = metric with ("physical_metric_table" = "")
region_peers_t1
:engine = metric with ("on_physical_table" = "region_peers_phy")
region_peers_t2
:engine = metric with ("on_physical_table" = "region_peers_phy")
region_peers_test
:PARTITION ON COLUMNS (a)
Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Check for distributed-specific configurations in CREATE TABLE statements. # Test: Search for distributed-specific configurations in SQL files. rg --type sql $'CREATE TABLE' --context 10Length of output: 201978
tests/cases/distributed/information_schema/region_peers.result (2)
40-49
: Correct expected results reflecting uniqueregion_id
.The expected results correctly show unique
region_id
values, which is the main objective of the PR to address the issue of duplicateregion_id
s in different logical tables.
2-57
: Good use of affected rows messages for clarity in test results.Including messages about affected rows after each operation provides clear feedback on the impact of each SQL statement, which is beneficial for debugging and understanding the flow of the test.
tests/cases/standalone/information_schema/region_peers.result (2)
40-49
: Correct expected results reflecting uniqueregion_id
.The expected results correctly show unique
region_id
values, which is the main objective of the PR to address the issue of duplicateregion_id
s in different logical tables.
2-57
: Consistent use of affected rows messages across test environments.The inclusion of messages about affected rows after each operation is consistent with the distributed environment tests and provides clear feedback, which is beneficial for debugging and understanding the flow of the test.
Thanks! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (2)
- tests/cases/standalone/common/information_schema/region_peers.result (1 hunks)
- tests/cases/standalone/common/information_schema/region_peers.sql (1 hunks)
Additional comments not posted (5)
tests/cases/standalone/common/information_schema/region_peers.sql (4)
32-32
: Validate context switch to public schema.Switching back to the
public
schema at the end of the script is a good practice to reset the testing environment. Confirm that this switch effectively resets the session to avoid any unintended side effects in other tests.
4-10
: Verify table creation and partitioning logic.The SQL script correctly creates the tables
region_peers_phy
,region_peers_t1
, andregion_peers_t2
with appropriate engine settings and primary keys. The partitioning logic forregion_peers_test
is well-defined. Ensure that the partitioning ranges meet the expected use cases and data distribution.Also applies to: 12-16, 18-26
Verification successful
Verify the partitioning logic in the
region_peers_test
table.The partitioning logic for the
region_peers_test
table needs to be verified to ensure it meets the expected use cases and data distribution. The search results indicate the presence of partitioning logic in theregion_peers.sql
file. Let's extract the relevant partitioning details from this file.
Verified partitioning logic in the
region_peers_test
table.The partitioning logic for the
region_peers_test
table is correctly defined with appropriate ranges for columna
.
a < 10
a >= 10 AND a < 20
a >= 20
This meets the expected use cases and data distribution requirements.
Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Verify the partitioning logic in the `region_peers_test` table. # Test: Search for the partitioning logic in the SQL files. rg --type sql $'PARTITION ON COLUMNS'Length of output: 797
Script:
#!/bin/bash # Extract the partitioning logic from the `region_peers.sql` file. rg --type sql --context 5 'PARTITION ON COLUMNS' tests/cases/standalone/common/information_schema/region_peers.sqlLength of output: 298
28-28
: Ensure accuracy of the SELECT query.The query aims to count distinct
region_id
values from theregion_peers
table. This is crucial for verifying that the issue #4157 is resolved. Consider adding more comprehensive checks to ensure the correctness of the returned data.
30-30
: Check the cleanup process.The script drops all created tables at the end, which is good for cleanup. Ensure that no residual data or metadata persists after the tables are dropped, which could affect subsequent tests.
tests/cases/standalone/common/information_schema/region_peers.result (1)
4-4
: Check consistency of affected rows messages.The result file consistently reports 'Affected Rows: 0' for each table operation. Verify that these messages accurately reflect the operations performed, especially considering the creation and dropping of tables should affect rows.
Also applies to: 8-8, 16-16, 24-24, 36-36, 48-48, 52-52
Verification successful
Check consistency of affected rows messages.
The 'Affected Rows: 0' messages in the result file accurately reflect the operations performed in the
region_peers.sql
script. The creation and dropping of tables typically do not affect rows, and theSELECT
query returns a count without affecting rows.
CREATE TABLE
operations: Expected affected rows = 0SELECT COUNT
query: Expected affected rows = 0DROP TABLE
operations: Expected affected rows = 0Thus, the 'Affected Rows: 0' messages are consistent with the SQL operations performed.
Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Verify the affected rows messages in the result file. # Test: Check the logs or output directly after running the SQL script to ensure it matches the expected 'Affected Rows: 0'. echo "Check your database logs or direct output for consistency with affected rows."Length of output: 164
Script:
#!/bin/bash # Locate the SQL script associated with the result file. fd -e sql region_peersLength of output: 90
Script:
#!/bin/bash # Display the contents of the SQL script to inspect the operations performed. cat tests/cases/standalone/common/information_schema/region_peers.sqlLength of output: 938
I hereby agree to the terms of the GreptimeDB CLA.
Refer to a related PR or issue link (optional)
Resolve #4157
What's changed and what's your intention?
as title.
Checklist
Summary by CodeRabbit
New Features
INFORMATION_SCHEMA
database for storing and managing region peer information.Bug Fixes
Tests