-
-
Notifications
You must be signed in to change notification settings - Fork 526
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Optimize JSON_REMOVE on IndexedJsonDocument
#8103
Conversation
@nicktobey DOLT
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overview:
The Go code involves JSON document manipulation, with functions handling insertion and removal of JSON elements. The recent changes introduce refactoring for improved modularity and readability.
Review:
-
Code Clarity and Modularity:
- The introduction of
lookupByLocation
andinsertIntoCursor
functions improves code modularity, making the main functions cleaner and more readable. - Refactoring the path lookup and insertion logic into separate functions allows for easier maintenance and potential reuse.
- The introduction of
-
Error Handling:
- Consistent error handling with returning
nil, err
ensures that any issues are propagated back to the caller for handling. - The refactoring maintains the existing error handling patterns, which is good for consistency.
- Consistent error handling with returning
-
Logic Flow:
- The changes maintain the logical flow of the original code while separating concerns. This improves both readability and potential debugging.
Improvements:
-
Documentation:
- Consider adding comments to the new functions (
lookupByLocation
andinsertIntoCursor
) to explain their purpose and parameters. This will help future developers understand the code faster. - Inline comments for key operations, such as path equivalence checks and cursor manipulations, can enhance understanding.
- Consider adding comments to the new functions (
-
Error Messaging:
- Enhance error messages where possible to provide more context. For example, include the path being processed in the error messages to aid in debugging.
-
Testing:
- Ensure thorough unit testing for the new functions (
lookupByLocation
andinsertIntoCursor
). Test various edge cases, such as non-existent paths and boundary conditions. - Validate performance implications, especially with large JSON documents, to ensure the refactoring doesn't introduce inefficiencies.
- Ensure thorough unit testing for the new functions (
-
Code Consistency:
- Verify that the style and naming conventions are consistent across the refactored and existing code. Consistency aids readability and maintainability.
Summary:
The refactor improves code modularity and readability, making the functions cleaner and more maintainable. Enhancing documentation, error messaging, and thorough testing will further improve the robustness and usability of the code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Implementation looks great. Just a couple minor suggestions on method documentation that could make the code a little easier for new eyes to read.
go/store/prolly/tree/json_cursor.go
Outdated
@@ -50,24 +50,25 @@ func getPreviousKey(ctx context.Context, cur *cursor) ([]byte, error) { | |||
|
|||
// newJsonCursor takes the root node of a prolly tree representing a JSON document, and creates a new JsonCursor for reading | |||
// JSON, starting at the specified location in the document. | |||
func newJsonCursor(ctx context.Context, ns NodeStore, root Node, startKey jsonLocation) (*JsonCursor, error) { | |||
func newJsonCursor(ctx context.Context, ns NodeStore, root Node, startKey jsonLocation, forRemoval bool) (*JsonCursor, bool, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(minor) Would be nice to give the return values names, or to document them (the boolean in particular) so that it's easier for future readers to quickly understand what the boolean return param indicates.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a comment and return variable names.
@@ -122,20 +123,20 @@ func (j *JsonCursor) isKeyInChunk(path jsonLocation) bool { | |||
return compareJsonLocations(path, nodeEndPosition) <= 0 | |||
} | |||
|
|||
func (j *JsonCursor) AdvanceToLocation(ctx context.Context, path jsonLocation) error { | |||
func (j *JsonCursor) AdvanceToLocation(ctx context.Context, path jsonLocation, forRemoval bool) (found bool, err error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some method docs would help readers quickly understand the params without having to read the implementation; especially since this method is exposed outside the package, it would be helpful to add docs. For example, it seems important for callers to know that if the third param is passed as true
, then the cursor position returned will be different. Currently, readers would have to read the implementation to figure out how to use this method correctly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a docstring.
Additional work is required for integration with DoltgreSQL. |
@nicktobey DOLT
|
This PR includes a new implementation of the
JSON_REMOVE
function that leverages the new indexed JSON storage format.For JSON documents that span multiple chunks, only the affected chunks need to be loaded and modified, allowing operations to scale with the size of the removed value instead of the size of the entire document.