-
Notifications
You must be signed in to change notification settings - Fork 189
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
opt(torii-grpc): parallelize queries #2443
Conversation
WalkthroughOhayo, sensei! The recent changes enhance the Changes
Possibly related PRs
Suggested reviewers
Recent review detailsConfiguration used: .coderabbit.yaml Files ignored due to path filters (1)
Files selected for processing (1)
Files skipped from review as they are similar to previous changes (1)
TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
Documentation and Community
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Files selected for processing (1)
- crates/torii/grpc/src/server/mod.rs (4 hunks)
Additional comments not posted (3)
crates/torii/grpc/src/server/mod.rs (3)
24-24
: Ohayo sensei! Excellent addition of Rayon for parallel processingIncluding
rayon
will enhance performance when handling large datasets through parallelization.
343-346
: Ohayo sensei! Ensure thread safety when usingpar_iter()
withmap_row_to_entity
When applying
par_iter()
, please confirm thatmap_row_to_entity
is thread-safe and does not access shared mutable state without proper synchronization.
613-623
: Ohayo sensei! Ensure thread safety when parallelizing entity mappingThe use of
par_iter()
fordb_entities
is great for performance. Please verify thatmap_row_to_entity
and any data accessed within are thread-safe to prevent race conditions.
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #2443 +/- ##
==========================================
+ Coverage 68.21% 68.38% +0.16%
==========================================
Files 365 365
Lines 48018 47973 -45
==========================================
+ Hits 32757 32805 +48
+ Misses 15261 15168 -93 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Files ignored due to path filters (1)
Cargo.lock
is excluded by!**/*.lock
Files selected for processing (1)
- crates/torii/grpc/Cargo.toml (1 hunks)
@@ -13,6 +13,7 @@ futures.workspace = true | |||
num-traits.workspace = true | |||
parking_lot.workspace = true | |||
rayon.workspace = true | |||
itertools.workspace = true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ohayo, sensei! Potential issue with the itertools
dependency configuration
The line itertools.workspace = true
adds itertools
as a workspace dependency. However, itertools
is an external crate and should typically be specified with a version number. Using workspace = true
implies that itertools
is part of your local workspace, which might not be the case unless you have a local copy of the crate.
Recommend updating the dependency to include the appropriate version from crates.io:
-itertools.workspace = true
+itertools = "0.10" # Replace with the desired version
This ensures that the correct version of itertools
is fetched from crates.io.
Committable suggestion was skipped due to low confidence.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Files selected for processing (1)
- crates/torii/grpc/src/server/mod.rs (14 hunks)
Additional comments not posted (4)
crates/torii/grpc/src/server/mod.rs (4)
218-305
: Ohayo sensei! Efficient Batch Processing infetch_entities
The new
fetch_entities
function (lines 218-305) efficiently batches database queries and handles large datasets by grouping entities and chunking inserts. Excellent work on addressing SQLite's parameter limit and optimizing query performance.
873-886
: Ohayo sensei! Safe Regex Pattern Construction inbuild_keys_pattern
The
build_keys_pattern
function (lines 873-886) constructs regex patterns safely for matching keys. The logic properly handles empty inputs and ensures patterns are well-formed.
733-748
: Ohayo sensei! Consistent Function Parameter UpdatesThe functions
retrieve_entities
,query_by_hashed_keys
,query_by_keys
,query_by_member
, andquery_by_composite
have been updated to accepttable
,model_relation_table
, andentity_relation_column
parameters (lines 733-748, 756-763, 771-773, 782-784, 793-795). Ensure all calls to these functions pass the correct arguments to maintain consistency and prevent potential bugs.Also applies to: 756-763, 771-773, 782-784, 793-795
1076-1084
: Ohayo sensei! Updated Calls toretrieve_entities
The calls to
retrieve_entities
have been updated to include additional parameters for the table and relation columns (lines 1076-1084, 1127-1132). Verify that these changes are reflected whereverretrieve_entities
is called to ensure correct functionality.Run the following script to check for all
retrieve_entities
calls:Also applies to: 1127-1132
Verification successful
Ohayo sensei! It seems we need to take a closer look at those
retrieve_entities
calls. Let's dig a bit deeper to see the full picture!
Ohayo sensei! I've taken a deep dive into the
retrieve_entities
calls, and I've got some good news for you!
retrieve_entities
Changes Correctly ImplementedThe changes to
retrieve_entities
have been correctly implemented in the server-side code (crates/torii/grpc/src/server/mod.rs). The function now accepts the new parameters (table, model_relation_table, entity_relation_column, query) and is called with the appropriate arguments.However, it's important to note that the client-side calls (in crates/torii/client/src/client/mod.rs and crates/torii/grpc/src/client.rs) don't include these new parameters. This is actually correct because:
- The client-side calls should match the gRPC interface, which hasn't changed from the client's perspective.
- The new parameters are internal to the server implementation and don't affect the client-server communication.
So, no worries! The changes have been implemented where they should be, and the code is looking good, sensei!
Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Verify that all calls to `retrieve_entities` have been updated with the new parameters. # Test: Search for `retrieve_entities` calls. # Expect: All calls should include the new parameters. rg --type rust 'retrieve_entities\('Length of output: 606
Script:
#!/bin/bash # Description: Search for `retrieve_entities` calls with more context # Test: Search for `retrieve_entities` calls with 5 lines of context before and after. # Expect: Full function calls with all parameters visible. rg --type rust -C 5 'retrieve_entities\('Length of output: 5230
@@ -21,6 +21,7 @@ use proto::world::{ | |||
RetrieveEventsRequest, RetrieveEventsResponse, SubscribeModelsRequest, SubscribeModelsResponse, | |||
UpdateEntitiesSubscriptionRequest, | |||
}; | |||
use rayon::iter::{IntoParallelRefIterator, ParallelIterator}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ohayo sensei! Introducing Rayon Requires Caution in Async Context
The addition of Rayon imports (line 24) indicates the use of data parallelism. However, using Rayon within asynchronous functions can lead to unexpected behavior due to its blocking nature and potential thread pool conflicts with Tokio's async runtime. It's important to ensure that integrating Rayon doesn't cause deadlocks or hinder async performance.
Consider using async-aware concurrency patterns provided by Tokio or the futures
crate to achieve parallelism that's compatible with the async runtime.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
Outside diff range and nitpick comments (1)
crates/torii/grpc/src/server/mod.rs (1)
218-305
: Add unit tests for the newfetch_entities
functionThe newly added
fetch_entities
function significantly enhances data retrieval. To ensure its reliability and correctness, please consider adding unit tests that cover various scenarios, including edge cases and error handling.
Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Files selected for processing (1)
- crates/torii/grpc/src/server/mod.rs (16 hunks)
Additional comments not posted (4)
crates/torii/grpc/src/server/mod.rs (4)
24-24
: Ohayo sensei! Reminder about using Rayon in async contextAs previously noted, using Rayon within asynchronous functions can lead to unexpected behavior due to potential thread pool conflicts with Tokio's async runtime. Please ensure that integrating Rayon doesn't cause deadlocks or hinder async performance.
290-294
: Ohayo sensei! Reminder about using Rayon in async contextThe previous concerns regarding the use of
par_iter()
within an async function are still applicable. Leveraging Rayon can conflict with the async runtime's thread management.
Line range hint
345-357
: Ensure safe construction of SQL queries to prevent injection risksInterpolating variables like
{table}
and{model_relation_table}
directly into SQL queries can introduce SQL injection vulnerabilities if these variables contain unexpected values.[security]
To enhance security, validate that
table
andmodel_relation_table
are from a predefined list of table names or sanitize them appropriately before interpolation. Alternatively, consider using parameterized queries or query builders that safely handle identifiers.
604-612
: Ohayo sensei! Reminder about using Rayon in async contextAs previously mentioned, using
par_iter()
inside an async context may lead to undesirable outcomes. Please revisit this section to ensure compatibility with the async runtime.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice cleanup and refactoring, some minor comments.
Summary by CodeRabbit
New Features
itertools
library for enhanced collection manipulation capabilities.Bug Fixes