fix: `region_peers` returns same region_id for multi logical tables #4190

poltao · 2024-06-23T17:03:50Z

I hereby agree to the terms of the GreptimeDB CLA.

Refer to a related PR or issue link (optional)

Resolve #4157

What's changed and what's your intention?

As the title says: `region_peers` returned the same `region_id` for multiple logical tables.

Checklist

I have written the necessary rustdoc comments.
I have added the necessary unit tests and integration tests.
This PR requires documentation updates.

Summary by CodeRabbit

New Features
- Introduced new tables in the INFORMATION_SCHEMA database for storing and managing region peer information.
- Added support for querying region peer data, including distinct counts of region IDs.
Bug Fixes
- Enhanced the method for adding region peers to handle additional parameters for better data management.
Tests
- Included comprehensive test cases covering the creation, querying, and deletion of region peer tables in both distributed and standalone environments.

coderabbitai · 2024-06-23T17:04:01Z

Walkthrough

The changes optimize how region and peer information is collected and stored in the information_schema.region_peers table of the INFORMATION_SCHEMA database. Key modifications include adjustments to the add_region_peers method to accept a table_id parameter and the introduction of new SQL test files illustrating the creation, querying, and dropping of region peer-related tables.

Changes

| File/Directory Path | Change Summary |
| --- | --- |
| `src/catalog/src/information_schema/region_peers.rs` | Modified the `add_region_peers` method to accept a `table_id` parameter and adjusted its logic for handling region and peer information. |
| `.../distributed/information_schema/region_peers.result` | Introduced the creation of multiple tables related to region peers and included a sample `SELECT` query on the `region_peers` table and subsequent table drops. |
| `.../standalone/information_schema/region_peers.result` | Included the creation of several region peer-related tables, data selection, and table drops before switching back to the `public` schema. |
| `.../distributed/information_schema/region_peers.sql` | Added definitions for multiple tables in the `INFORMATION_SCHEMA` database related to region peers, partitioning logic, and a `SELECT` query on region peers. |
| `.../standalone/information_schema/region_peers.sql` | Defined the structure of region peer-related tables and included partitioning, data selection logic, and table drops in the `INFORMATION_SCHEMA` database. |
| `.../standalone/common/information_schema/region_peers.result` | Added logic for creating/manipulating tables in the `information_schema` database and a query for counting distinct `region_id` values, followed by table drops. |
| `.../standalone/common/information_schema/region_peers.sql` | Introduced table definitions related to region peers in the `INFORMATION_SCHEMA` database, including a count query and table dropping logic. |

Assessment against linked issues

| Objective | Addressed | Explanation |
| --- | --- | --- |
| `information_schema.region_peers` returns same `region_id` multiple times (#4157) | ✅ | |

src/catalog/src/information_schema/region_peers.rs

coderabbitai

Actionable comments posted: 1

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 5566dd7 and d12200f.

Files selected for processing (1)

src/catalog/src/information_schema/region_peers.rs (2 hunks)

Additional comments not posted (2)

src/catalog/src/information_schema/region_peers.rs (2)

34-34: Updated import statement to include RegionId.

The inclusion of RegionId in the import statement supports the changes in the add_region_peers method, which now utilizes RegionId to ensure unique identifiers for different logical tables.

217-222: Refactored add_region_peers to accept table_id.

The addition of the table_id parameter is a significant and necessary change to ensure unique region_id values across different logical tables. This directly addresses the bug reported in issue #4157. Consider adding a comment here explaining how table_id is used to generate unique region_ids.
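
To make the role of `table_id` concrete, here is a minimal sketch. It assumes a region id packs a 32-bit table id into the high half of a 64-bit value and a 32-bit region number into the low half. The `RegionId` type below is a simplified stand-in written for illustration, not the definition used in the codebase, and `region_id_for_table` is a hypothetical helper, not the actual body of `add_region_peers`.

```rust
/// Simplified stand-in for a region identifier (an assumption for illustration):
/// the high 32 bits hold the table id, the low 32 bits hold the region number.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct RegionId(u64);

impl RegionId {
    /// Packs a table id and a region number into one 64-bit id.
    pub fn new(table_id: u32, region_number: u32) -> Self {
        RegionId(((table_id as u64) << 32) | region_number as u64)
    }

    /// Extracts the table id from the high 32 bits.
    pub fn table_id(self) -> u32 {
        (self.0 >> 32) as u32
    }

    /// Extracts the region number from the low 32 bits.
    pub fn region_number(self) -> u32 {
        self.0 as u32
    }

    /// Returns the raw 64-bit value.
    pub fn as_u64(self) -> u64 {
        self.0
    }
}

/// Hypothetical helper: build the id reported for a table from that table's own id
/// and the region number found in the route, instead of reusing the route's id as is.
pub fn region_id_for_table(table_id: u32, route_region: RegionId) -> RegionId {
    RegionId::new(table_id, route_region.region_number())
}
```

Under these assumptions, two logical tables that share one physical region still get different ids, because the table id differs in the high bits while the region number stays the same.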

src/catalog/src/information_schema/region_peers.rs

codecov · 2024-06-23T17:23:30Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 84.85%. Comparing base (07cbaba) to head (c4f26cc).

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #4190      +/-   ##
==========================================
- Coverage   85.11%   84.85%   -0.27%     
==========================================
  Files        1028     1028              
  Lines      180903   180908       +5     
==========================================
- Hits       153970   153503     -467     
- Misses      26933    27405     +472

fengjiachun

LGTM

fengjiachun · 2024-06-24T03:01:27Z

PTAL @WenyXu

WenyXu · 2024-06-24T04:18:43Z

@realtaobo, that is nicely done. Would you like to add a sqlness test for this case?

src/catalog/src/information_schema/region_peers.rs

coderabbitai

Actionable comments posted: 1

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between d12200f and 69d6335.

Files selected for processing (4)

tests/cases/distributed/information_schema/region_peers.result (1 hunks)
tests/cases/distributed/information_schema/region_peers.sql (1 hunks)
tests/cases/standalone/information_schema/region_peers.result (1 hunks)
tests/cases/standalone/information_schema/region_peers.sql (1 hunks)

Additional comments not posted (9)

tests/cases/standalone/information_schema/region_peers.sql (2)

4-26: Ensure proper configuration for metric tables and partitioning.

The tables are configured with the metric engine, which is suitable for time-series data. The partitioning logic in region_peers_test appears well-thought-out for scalability. However, ensure that the partition ranges (a < 10, a >= 10 AND a < 20, a >= 20) cover all potential values without overlaps or gaps.
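
The gap and overlap check asked for above can be sketched as follows. The three predicates mirror the ranges quoted in the comment; `matching_partitions` and `covers_exactly_once` are hypothetical helper names, and sampling a finite range of integers is only a spot check, not a proof.

```rust
/// Returns the indexes of every partition rule that value `a` satisfies.
/// The rules are the ones quoted in the review: a < 10, a >= 10 AND a < 20, a >= 20.
pub fn matching_partitions(a: i32) -> Vec<usize> {
    let rules: [fn(i32) -> bool; 3] = [
        |v: i32| v < 10,
        |v: i32| v >= 10 && v < 20,
        |v: i32| v >= 20,
    ];
    rules
        .iter()
        .enumerate()
        .filter(|(_, rule)| rule(a))
        .map(|(index, _)| index)
        .collect()
}

/// True when every sampled value matches exactly one rule:
/// zero matches would be a gap, two or more would be an overlap.
pub fn covers_exactly_once(samples: impl IntoIterator<Item = i32>) -> bool {
    samples
        .into_iter()
        .all(|a| matching_partitions(a).len() == 1)
}
```

Checking the boundary values 9, 10, 19, and 20 is the quickest way to confirm that the ranges meet without overlapping.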

32-32: Proper schema context switch.

Switching back to the public schema at the end of the script is a good practice to maintain a clean testing environment.

tests/cases/distributed/information_schema/region_peers.sql (3)

28-28: Optimized SELECT statement for distributed environments.

Specifying columns (region_id, is_leader, status, down_seconds) instead of using * is a good practice, especially in distributed environments where minimizing data transfer is crucial.

32-32: Consistent schema context switching across test environments.

Maintaining consistency in switching back to the public schema at the end of the script is crucial for ensuring a clean state across different test environments.

4-26: Check distributed-specific configurations for metric tables.

The CREATE TABLE statements are identical to the standalone version. Ensure that any distributed-specific configurations or optimizations are considered and applied if necessary.

Verification successful

Check distributed-specific configurations for metric tables.

The CREATE TABLE statements in the distributed environment are consistent with those in the standalone environment. The tables are configured with engine = metric and PARTITION ON COLUMNS, which are typical for distributed environments. However, ensure that any additional distributed-specific optimizations or configurations are considered and applied if necessary.

region_peers_phy: engine = metric with ("physical_metric_table" = "")

region_peers_t1: engine = metric with ("on_physical_table" = "region_peers_phy")

region_peers_t2: engine = metric with ("on_physical_table" = "region_peers_phy")

region_peers_test: PARTITION ON COLUMNS (a)

Scripts executed

The following scripts were executed for the analysis:

Script:
#!/bin/bash
# Description: Check for distributed-specific configurations in CREATE TABLE statements.

# Test: Search for distributed-specific configurations in SQL files.
rg --type sql $'CREATE TABLE' --context 10
Length of output: 201978

tests/cases/distributed/information_schema/region_peers.result (2)

40-49: Correct expected results reflecting unique region_id.

The expected results correctly show unique region_id values, which is the main objective of the PR to address the issue of duplicate region_ids in different logical tables.

2-57: Good use of affected rows messages for clarity in test results.

Including messages about affected rows after each operation provides clear feedback on the impact of each SQL statement, which is beneficial for debugging and understanding the flow of the test.

tests/cases/standalone/information_schema/region_peers.result (2)

40-49: Correct expected results reflecting unique region_id.

The expected results correctly show unique region_id values, which is the main objective of the PR to address the issue of duplicate region_ids in different logical tables.

2-57: Consistent use of affected rows messages across test environments.

The inclusion of messages about affected rows after each operation is consistent with the distributed environment tests and provides clear feedback, which is beneficial for debugging and understanding the flow of the test.

tests/cases/standalone/information_schema/region_peers.sql

tests/cases/distributed/information_schema/region_peers.result

WenyXu · 2024-06-24T14:03:45Z

Thanks!

coderabbitai

Actionable comments posted: 0

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 69d6335 and c4f26cc.

Files selected for processing (2)

tests/cases/standalone/common/information_schema/region_peers.result (1 hunks)
tests/cases/standalone/common/information_schema/region_peers.sql (1 hunks)

Additional comments not posted (5)

tests/cases/standalone/common/information_schema/region_peers.sql (4)

32-32: Validate context switch to public schema.

Switching back to the public schema at the end of the script is a good practice to reset the testing environment. Confirm that this switch effectively resets the session to avoid any unintended side effects in other tests.

4-10: Verify table creation and partitioning logic.

The SQL script correctly creates the tables region_peers_phy, region_peers_t1, and region_peers_t2 with appropriate engine settings and primary keys. The partitioning logic for region_peers_test is well-defined. Ensure that the partitioning ranges meet the expected use cases and data distribution.

Also applies to: 12-16, 18-26

Verification successful

Verify the partitioning logic in the region_peers_test table.

The partitioning logic for the region_peers_test table needs to be verified to ensure it meets the expected use cases and data distribution. The search results indicate the presence of partitioning logic in the region_peers.sql file. Let's extract the relevant partitioning details from this file.

Verified partitioning logic in the region_peers_test table.

The partitioning logic for the region_peers_test table is correctly defined with appropriate ranges for column a.

a < 10

a >= 10 AND a < 20

a >= 20

This meets the expected use cases and data distribution requirements.

Scripts executed

The following scripts were executed for the analysis:

Script:
#!/bin/bash
# Description: Verify the partitioning logic in the `region_peers_test` table.

# Test: Search for the partitioning logic in the SQL files.
rg --type sql $'PARTITION ON COLUMNS'
Length of output: 797

Script:
#!/bin/bash
# Extract the partitioning logic from the `region_peers.sql` file.
rg --type sql --context 5 'PARTITION ON COLUMNS' tests/cases/standalone/common/information_schema/region_peers.sql
Length of output: 298

28-28: Ensure accuracy of the SELECT query.

The query aims to count distinct region_id values from the region_peers table. This is crucial for verifying that the issue #4157 is resolved. Consider adding more comprehensive checks to ensure the correctness of the returned data.
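
One way to read the "more comprehensive checks" suggestion is to compare the distinct count with the total row count. The sketch below shows that idea on plain integers. The helper names are hypothetical, and the function takes ids already fetched by some means, since how rows are read from `region_peers` is not shown in this PR.

```rust
use std::collections::HashSet;

/// Counts the distinct region ids among the given values.
pub fn distinct_region_ids(region_ids: &[u64]) -> usize {
    region_ids.iter().collect::<HashSet<_>>().len()
}

/// True when no region id appears more than once, that is, when the
/// distinct count equals the number of rows.
pub fn all_unique(region_ids: &[u64]) -> bool {
    distinct_region_ids(region_ids) == region_ids.len()
}
```

A result where the same id repeats across rows would fail `all_unique`, which is the symptom described in #4157.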

30-30: Check the cleanup process.

The script drops all created tables at the end, which is good for cleanup. Ensure that no residual data or metadata persists after the tables are dropped, which could affect subsequent tests.

tests/cases/standalone/common/information_schema/region_peers.result (1)

4-4: Check consistency of affected rows messages.

The result file consistently reports 'Affected Rows: 0' for each table operation. Verify that these messages accurately reflect the operations performed, especially considering the creation and dropping of tables should affect rows.

Also applies to: 8-8, 16-16, 24-24, 36-36, 48-48, 52-52

Verification successful

Check consistency of affected rows messages.

The 'Affected Rows: 0' messages in the result file accurately reflect the operations performed in the region_peers.sql script. The creation and dropping of tables typically do not affect rows, and the SELECT query returns a count without affecting rows.

CREATE TABLE operations: Expected affected rows = 0

SELECT COUNT query: Expected affected rows = 0

DROP TABLE operations: Expected affected rows = 0

Thus, the 'Affected Rows: 0' messages are consistent with the SQL operations performed.

Scripts executed

The following scripts were executed for the analysis:

Script:
#!/bin/bash
# Description: Verify the affected rows messages in the result file.

# Test: Check the logs or output directly after running the SQL script to ensure it matches the expected 'Affected Rows: 0'.
echo "Check your database logs or direct output for consistency with affected rows."
Length of output: 164

Script:
#!/bin/bash
# Locate the SQL script associated with the result file.
fd -e sql region_peers
Length of output: 90

Script:
#!/bin/bash
# Display the contents of the SQL script to inspect the operations performed.
cat tests/cases/standalone/common/information_schema/region_peers.sql
Length of output: 938

fix: region_peers returns same region_id for multi logical tables

d12200f

poltao requested a review from a team as a code owner June 23, 2024 17:03

github-actions bot added the docs-not-required (This change does not impact docs.) label Jun 23, 2024

poltao commented Jun 23, 2024

src/catalog/src/information_schema/region_peers.rs (resolved)

coderabbitai bot reviewed Jun 23, 2024

src/catalog/src/information_schema/region_peers.rs (resolved)

fengjiachun approved these changes Jun 24, 2024

WenyXu reviewed Jun 24, 2024

src/catalog/src/information_schema/region_peers.rs (resolved)

poltao added 2 commits June 24, 2024 21:28

test: add sqlness test for information_schema.region_peers

54e0451

Merge branch 'main' into region-peers

69d6335

coderabbitai bot reviewed Jun 24, 2024

tests/cases/standalone/information_schema/region_peers.sql (outdated, resolved)

WenyXu reviewed Jun 24, 2024

tests/cases/distributed/information_schema/region_peers.result (outdated, resolved)

refactor: region_peers sqlness

c4f26cc

WenyXu approved these changes Jun 24, 2024

WenyXu enabled auto-merge June 24, 2024 14:02

WenyXu added this pull request to the merge queue Jun 24, 2024

coderabbitai bot reviewed Jun 24, 2024

Merged via the queue into GreptimeTeam:main with commit 77904ad Jun 24, 2024
51 checks passed

poltao deleted the region-peers branch June 24, 2024 14:40
