-
Notifications
You must be signed in to change notification settings - Fork 97
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Add graph_model parameter to cognify api call #377
base: dev
Are you sure you want to change the base?
Conversation
WalkthroughThe changes in the Changes
Sequence DiagramsequenceDiagram
participant Client
participant Router
participant CognifyPayload
participant DynamicModel
Client->>Router: Send Cognify Request
Router->>CognifyPayload: Parse Payload
alt Graph Model Provided
CognifyPayload->>DynamicModel: Create Dynamic Model
DynamicModel-->>Router: Dynamic Model Instance
else No Graph Model
CognifyPayload-->>Router: graph_model = None
end
Router->>Router: Process Cognify Request
Poem
Tip CodeRabbit's docstrings feature is now available as part of our Early Access Program! Simply use the command Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Nitpick comments (1)
cognee/api/v1/cognify/routers/get_cognify_router.py (1)
28-36
: Consider separating model conversion logicThe endpoint currently handles both model conversion and business logic. Consider extracting the model conversion into a separate service or utility class for better maintainability and testability.
Suggested structure:
- Create a
GraphModelConverter
service class- Add specific exception types for different failure scenarios
- Move the conversion logic out of the route handler
Would you like me to provide a detailed implementation of this architectural improvement?
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
cognee/api/v1/cognify/routers/get_cognify_router.py
(3 hunks)
🔇 Additional comments (2)
cognee/api/v1/cognify/routers/get_cognify_router.py (2)
2-3
: LGTM! Import changes align with implementation needs
The addition of Any
type and create_model
supports the enhanced flexibility in graph model handling.
12-12
: Consider adding runtime type validation for graph_model
While changing to Optional[Any]
enables flexibility, it bypasses Pydantic's static type checking. Consider adding runtime validation to ensure the graph_model contains the expected schema structure.
def json_to_pydantic_model(name: str, json_schema: dict) -> BaseModel: | ||
""" | ||
Create a Pydantic model on the fly from JSON. | ||
""" | ||
return create_model(name, **{k: (type(v), ...) for k, v in json_schema.items()}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add schema validation and improve type mapping
The current implementation has several potential issues:
- No validation of the input schema structure
- Assumes all fields are required (...)
- Simple type inference might not handle complex nested types correctly
Consider this improved implementation:
def json_to_pydantic_model(name: str, json_schema: dict) -> BaseModel:
"""
Create a Pydantic model on the fly from JSON.
"""
- return create_model(name, **{k: (type(v), ...) for k, v in json_schema.items()})
+ def get_field_type(value: Any) -> tuple:
+ if isinstance(value, dict):
+ return (dict, ...)
+ if isinstance(value, list):
+ return (List[Any], ...)
+ return (type(value), None) # Make fields optional by default
+
+ if not isinstance(json_schema, dict):
+ raise ValueError("JSON schema must be a dictionary")
+
+ return create_model(
+ name,
+ **{k: get_field_type(v) for k, v in json_schema.items()}
+ )
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
def json_to_pydantic_model(name: str, json_schema: dict) -> BaseModel: | |
""" | |
Create a Pydantic model on the fly from JSON. | |
""" | |
return create_model(name, **{k: (type(v), ...) for k, v in json_schema.items()}) | |
def json_to_pydantic_model(name: str, json_schema: dict) -> BaseModel: | |
""" | |
Create a Pydantic model on the fly from JSON. | |
""" | |
def get_field_type(value: Any) -> tuple: | |
if isinstance(value, dict): | |
return (dict, ...) | |
if isinstance(value, list): | |
return (List[Any], ...) | |
return (type(value), None) # Make fields optional by default | |
if not isinstance(json_schema, dict): | |
raise ValueError("JSON schema must be a dictionary") | |
return create_model( | |
name, | |
**{k: get_field_type(v) for k, v in json_schema.items()} | |
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Worth merging?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@borisarzentar, yes, the suggested improvements provide additional schema validation and handle nested data structures more robustly. By making fields optional by default, they also reduce the chances of unexpected errors. Unless you foresee any compatibility issues, it seems worth merging.
try: | ||
# Dynamic conversion of `graph_model` to Pydantic | ||
if payload.graph_model: | ||
graph_model_schema = payload.graph_model | ||
GraphModelDynamic = json_to_pydantic_model("GraphModelDynamic", graph_model_schema) | ||
graph_model_instance = GraphModelDynamic(**graph_model_schema) | ||
print(graph_model_instance) | ||
else: | ||
graph_model_instance = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix control flow and remove debug code
Several issues need attention:
- Remove the print statement
- Fix indentation of the else clause
- Add error handling for model creation
- Restructure the nested try-except blocks
Apply this fix:
try:
# Dynamic conversion of `graph_model` to Pydantic
if payload.graph_model:
graph_model_schema = payload.graph_model
GraphModelDynamic = json_to_pydantic_model("GraphModelDynamic", graph_model_schema)
graph_model_instance = GraphModelDynamic(**graph_model_schema)
- print(graph_model_instance)
- else:
- graph_model_instance = None
- try:
- await cognee_cognify(payload.datasets, user, payload.graph_model)
- except Exception as error:
+ else:
+ graph_model_instance = None
+
+ await cognee_cognify(payload.datasets, user, graph_model_instance)
+ except ValueError as error:
+ return JSONResponse(
+ status_code=400,
+ content={"error": f"Invalid graph model schema: {str(error)}"}
+ )
+ except Exception as error:
return JSONResponse(
status_code=409,
content={"error": str(error)}
)
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
try: | |
# Dynamic conversion of `graph_model` to Pydantic | |
if payload.graph_model: | |
graph_model_schema = payload.graph_model | |
GraphModelDynamic = json_to_pydantic_model("GraphModelDynamic", graph_model_schema) | |
graph_model_instance = GraphModelDynamic(**graph_model_schema) | |
print(graph_model_instance) | |
else: | |
graph_model_instance = None | |
try: | |
# Dynamic conversion of `graph_model` to Pydantic | |
if payload.graph_model: | |
graph_model_schema = payload.graph_model | |
GraphModelDynamic = json_to_pydantic_model("GraphModelDynamic", graph_model_schema) | |
graph_model_instance = GraphModelDynamic(**graph_model_schema) | |
else: | |
graph_model_instance = None | |
await cognee_cognify(payload.datasets, user, graph_model_instance) | |
except ValueError as error: | |
return JSONResponse( | |
status_code=400, | |
content={"error": f"Invalid graph model schema: {str(error)}"} | |
) | |
except Exception as error: | |
return JSONResponse( | |
status_code=409, | |
content={"error": str(error)} | |
) |
Resolve syntax issue with cognify router Fix
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
♻️ Duplicate comments (2)
cognee/api/v1/cognify/routers/get_cognify_router.py (2)
17-21
:⚠️ Potential issuePrevious concerns about schema validation remain unaddressed
The current implementation still lacks proper schema validation and type mapping as mentioned in the previous review.
The implementation needs to:
- Validate input schema structure
- Handle optional fields
- Support complex nested types
- Add proper error handling
Please refer to the previous review comment for the suggested implementation.
31-40
:⚠️ Potential issueClean up implementation and improve error handling
Several issues need to be addressed:
- Remove debug print statement
- Fix control flow
- Add specific error handling for model creation
Please refer to the previous review comment for the detailed implementation fix.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
cognee/api/v1/cognify/routers/get_cognify_router.py
(2 hunks)
🔇 Additional comments (2)
cognee/api/v1/cognify/routers/get_cognify_router.py (2)
2-2
: LGTM! Required imports added for new functionality.
The addition of Any
type is necessary for the flexible graph_model type in CognifyPayloadDTO.
31-40
: Verify compatibility with existing implementations
Let's verify the impact of these changes on the existing codebase.
✅ Verification successful
Changes are compatible with existing implementations
The changes in the code are safe and compatible with the existing codebase for the following reasons:
- The default behavior is preserved by using
KnowledgeGraph
whengraph_model
is not provided. - The
cognee_cognify
function incognify_v2.py
already accepts agraph_model
parameter of typeBaseModel
, making it compatible with both the defaultKnowledgeGraph
and any dynamically created models. - The dynamic model creation follows Pydantic's schema, ensuring type safety and compatibility with the graph extraction pipeline.
- The existing graph model implementations in
data_models.py
andmodels.py
show that all graph-related models follow the PydanticBaseModel
pattern, which is consistent with the dynamic model approach.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Check for existing implementations and usages
# Find all files that import or use cognify endpoint
rg -l "cognify" --type python
# Find existing graph model implementations
rg -l "class.*Graph.*\(BaseModel\)" --type python
# Check for existing calls to cognee_cognify
rg "cognee_cognify\(" --type python -A 2
Length of output: 217
Script:
#!/bin/bash
# Let's try without the python type filter since it's not recognized
# Find all files that import or use cognify endpoint
rg -l "cognify"
# Find existing graph model implementations
rg -l "class.*Graph.*\(BaseModel\)"
# Check for existing calls to cognee_cognify
rg "cognee_cognify\(" -A 2
# Look for any potential graph model related files
fd "graph|model" --type f
# Check the content of the current file to understand the context better
cat "cognee/api/v1/cognify/routers/get_cognify_router.py"
Length of output: 5774
Script:
#!/bin/bash
# Let's check the existing graph model implementations and their structure
# Check KnowledgeGraph implementation
cat "cognee/shared/data_models.py"
# Check other potential graph model implementations
cat "cognee/tasks/graph/models.py"
# Look for any usage patterns of cognee_cognify
cat "cognee/api/v1/cognify/cognify_v2.py"
# Check test files for usage patterns
cat "cognee/tests/test_library.py"
Length of output: 20186
|
||
class CognifyPayloadDTO(BaseModel): | ||
datasets: List[str] | ||
graph_model: Optional[BaseModel] = None | ||
graph_model: Optional[Any] = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Consider using a more specific type than Any
Using Optional[Any]
reduces type safety and could lead to runtime errors. Consider using a Union type with specific expected types or creating a base interface that all graph models must implement.
- graph_model: Optional[Any] = None
+ graph_model: Optional[Union[dict, BaseModel]] = None
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
graph_model: Optional[Any] = None | |
graph_model: Optional[Union[dict, BaseModel]] = None |
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
♻️ Duplicate comments (3)
cognee/api/v1/cognify/routers/get_cognify_router.py (3)
14-14
:⚠️ Potential issueUse more specific types instead of Any
Using
Optional[Any]
significantly reduces type safety. A more specific type would prevent potential runtime errors and provide better IDE support.- graph_model: Optional[Any] = None + graph_model: Optional[Union[dict, BaseModel]] = None
17-21
:⚠️ Potential issueEnhance schema validation and type mapping
The current implementation lacks proper validation and type handling:
- No schema structure validation
- All fields are marked as required
- Complex nested types aren't properly handled
The previous suggestion for a more robust implementation is still valid and should be implemented.
31-40
:⚠️ Potential issueRemove debug code and fix control flow structure
The current implementation contains debug code and has suboptimal control flow structure.
Previous suggestions for fixing control flow and removing debug code should be implemented.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
cognee/api/v1/cognify/routers/get_cognify_router.py
(2 hunks)
🔇 Additional comments (1)
cognee/api/v1/cognify/routers/get_cognify_router.py (1)
2-2
: LGTM: Import changes are appropriate
The addition of Any
from typing is necessary for the new type annotation in CognifyPayloadDTO
.
# Dynamic conversion of `graph_model` to Pydantic | ||
if payload.graph_model: | ||
graph_model_schema = payload.graph_model | ||
GraphModelDynamic = json_to_pydantic_model("GraphModelDynamic", graph_model_schema) | ||
graph_model_instance = GraphModelDynamic(**graph_model_schema) | ||
print(graph_model_instance) | ||
else: | ||
graph_model_instance = KnowledgeGraph() | ||
|
||
await cognee_cognify(payload.datasets, user, graph_model_instance) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add proper validation and error handling
The current implementation lacks proper validation before dynamic model creation.
Add validation and specific error handling:
try:
if payload.graph_model:
+ if not isinstance(payload.graph_model, dict):
+ raise ValueError("graph_model must be a dictionary")
graph_model_schema = payload.graph_model
try:
GraphModelDynamic = json_to_pydantic_model("GraphModelDynamic", graph_model_schema)
graph_model_instance = GraphModelDynamic(**graph_model_schema)
except ValidationError as e:
return JSONResponse(
status_code=400,
content={"error": f"Invalid graph model schema: {str(e)}"}
)
else:
graph_model_instance = KnowledgeGraph()
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
# Dynamic conversion of `graph_model` to Pydantic | |
if payload.graph_model: | |
graph_model_schema = payload.graph_model | |
GraphModelDynamic = json_to_pydantic_model("GraphModelDynamic", graph_model_schema) | |
graph_model_instance = GraphModelDynamic(**graph_model_schema) | |
print(graph_model_instance) | |
else: | |
graph_model_instance = KnowledgeGraph() | |
await cognee_cognify(payload.datasets, user, graph_model_instance) | |
# Dynamic conversion of `graph_model` to Pydantic | |
try: | |
if payload.graph_model: | |
if not isinstance(payload.graph_model, dict): | |
raise ValueError("graph_model must be a dictionary") | |
graph_model_schema = payload.graph_model | |
try: | |
GraphModelDynamic = json_to_pydantic_model("GraphModelDynamic", graph_model_schema) | |
graph_model_instance = GraphModelDynamic(**graph_model_schema) | |
except ValidationError as e: | |
return JSONResponse( | |
status_code=400, | |
content={"error": f"Invalid graph model schema: {str(e)}"} | |
) | |
else: | |
graph_model_instance = KnowledgeGraph() | |
await cognee_cognify(payload.datasets, user, graph_model_instance) |
💡 Codebase verification
Custom graph models must include nodes and edges fields
The dynamic graph model must be compatible with the base KnowledgeGraph model structure which requires:
nodes: List[Node]
where each Node has: id, name, type, and descriptionedges: List[Edge]
where each Edge has: source_node_id, target_node_id, relationship_name, and optional properties
The code should validate that any custom graph_model schema includes these required fields before creating the dynamic model.
🔗 Analysis chain
Verify graph model compatibility
The code assumes any valid schema will work with cognee_cognify
.
Let's verify the expected graph model structure:
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Search for graph model usage patterns in cognee_cognify
ast-grep --pattern 'async def cognify($$$, graph_model: $_) {
$$$
}'
# Search for KnowledgeGraph usage to understand required interface
rg -A 5 "class KnowledgeGraph"
Length of output: 485
Script:
#!/bin/bash
# Let's check the Node and Edge models to understand the required structure
rg -A 5 "class Node" cognee/shared/data_models.py
rg -A 5 "class Edge" cognee/shared/data_models.py
# Let's also check how cognee_cognify uses the graph_model
ast-grep --pattern 'async def cognee_cognify($_, $_, graph_model: $_) {
$$$
}'
# Search for any validation or type checking of graph_model
rg -B2 -A5 "graph_model" cognee/api/v1/cognify/
Length of output: 6080
graph_model_schema = payload.graph_model | ||
GraphModelDynamic = json_to_pydantic_model("GraphModelDynamic", graph_model_schema) | ||
graph_model_instance = GraphModelDynamic(**graph_model_schema) | ||
print(graph_model_instance) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please remove the print statement
|
||
class CognifyPayloadDTO(BaseModel): | ||
datasets: List[str] | ||
graph_model: Optional[BaseModel] = None | ||
graph_model: Optional[Any] = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why changing to Any
? We assume that graph model will be a pydantic model.
def json_to_pydantic_model(name: str, json_schema: dict) -> BaseModel: | ||
""" | ||
Create a Pydantic model on the fly from JSON. | ||
""" | ||
return create_model(name, **{k: (type(v), ...) for k, v in json_schema.items()}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Worth merging?
@Vasilije1990 Please take care of this one. |
Summary by CodeRabbit
cognify
endpoint to support flexible data types for thegraph_model
attribute.cognify
function to return appropriate error messages.