Cog 793 metadata rework #460

Merged
merged 10 commits into dev on Jan 23, 2025
Conversation

@dexters1 (Collaborator) commented Jan 20, 2025

Description

  • Simplify Foreign Metadata handling in Cognee
  • Show Foreign Metadata in graphs as part of the Document node
  • Remove unused methods and files related to metadata ingestion

Screenshot showing foreign metadata in the graph property of the Document node (image attached to the PR).

Foreign metadata refers to metadata attached to documents that were not processed by Cognee, for example when LlamaIndex is used to handle documents.
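
For illustration, here is a minimal sketch (not the exact PR code) of how a LlamaIndex document's metadata ends up in the {"metadata": ..., "origin": ...} shape produced by the helper introduced in this PR. The llama_index import path and the metadata values are assumptions and may differ by version.

# Minimal sketch: shows the shape of the foreign/external metadata dict that is
# attached to the ingested Data record and surfaces on the Document node's graph properties.
from llama_index.core import Document  # older releases: `from llama_index import Document`

doc = Document(
    text="Cognee ingestion example.",
    metadata={"source": "example.pdf", "author": "hypothetical"},  # hypothetical values
)

# Any data item exposing a .dict() method is wrapped like this by the PR's helper:
foreign_metadata = {"metadata": doc.dict(), "origin": str(type(doc))}

print(foreign_metadata["origin"])  # e.g. "<class 'llama_index.core.schema.Document'>"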

DCO Affirmation

I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin

Summary by CodeRabbit

Release Notes

  • New Features

    • Enhanced data ingestion process with more flexible data type handling.
    • Introduced support for external metadata storage in data models.
  • Breaking Changes

    • Removed metadata-specific functions and relationships.
    • Replaced metadata_id with external_metadata across document types.
    • Simplified data ingestion workflow by removing metadata-specific methods.
  • Improvements

    • Updated error handling in the embedding engine.
    • Streamlined data chunking and processing logic.
    • Improved flexibility in data and document management.
  • Refactoring

    • Simplified import and task creation processes.
    • Removed deprecated metadata-related modules and functions.

coderabbitai bot (Contributor) commented Jan 20, 2025

Walkthrough

This pull request introduces significant changes to the data and metadata handling in the Cognee system. The modifications primarily involve removing the Metadata class, updating data models, and simplifying the data ingestion process. The changes remove explicit metadata management, replacing it with a more flexible external_metadata approach across various components of the system, including document processing, task ingestion, and database models.

Changes

File/Path Change Summary
cognee/api/v1/add/add_v2.py Replaced ingest_data_with_metadata import with ingest_data and updated task creation.
cognee/api/v1/cognify/code_graph_pipeline.py Updated task instantiation to use ingest_data.
cognee/infrastructure/databases/vector/embeddings/LiteLLMEmbeddingEngine.py Simplified error handling for BadRequestError.
cognee/modules/data/models/ Removed Metadata.py, added external_metadata to Data class, removed metadata_relationship.
cognee/modules/data/processing/document_types/Document.py Replaced metadata_id with external_metadata.
cognee/tasks/ingestion/ Removed metadata-related functions, updated ingest_data.py to accept generic data input and modified related function signatures.
Test files Updated document class instantiations to use external_metadata.
notebooks/cognee_llama_index.ipynb Replaced import of ingest_data_with_metadata with ingest_data and updated related function signatures.

Possibly related PRs

Suggested reviewers

  • Vasilije1990

Poem

🐰 Metadata's dance has come to an end,
External fields now where structures would bend
No more tables of rigid design
Just flexible data that smoothly align
A rabbit's code hop, light and free! 🌈

@dexters1 dexters1 self-assigned this Jan 20, 2025
Resolve issue with LlamaIndex notebook after refactor
@dexters1 dexters1 marked this pull request as ready for review January 20, 2025 17:06
@coderabbitai coderabbitai bot (Contributor) left a comment

Actionable comments posted: 10

🧹 Nitpick comments (5)
cognee/tasks/ingestion/ingest_data.py (2)

126-127: Consider combining permission grants into a single call

On lines 126-127, permissions are granted separately:

await give_permission_on_document(user, data_id, "read")
await give_permission_on_document(user, data_id, "write")

If the give_permission_on_document function supports multiple permissions at once, consider combining them to reduce the number of calls:

await give_permission_on_document(user, data_id, ["read", "write"])

134-135: Variable db_engine is unused after assignment

On line 134, db_engine is assigned but not used in the subsequent code:

db_engine = get_relational_engine()

If db_engine is not needed beyond this point, consider removing the assignment to clean up the code.

- db_engine = get_relational_engine()

Alternatively, if it is needed elsewhere, ensure it's utilized appropriately.

cognee/tasks/ingestion/save_data_item_to_storage.py (1)

7-7: Consider using a more specific type than Any.

While Any provides flexibility, consider creating a custom type or protocol for supported data items to maintain type safety and improve code documentation.

-from typing import Union, BinaryIO, Any
+from typing import Union, BinaryIO, Protocol
+
+class DataItem(Protocol):
+    """Protocol for supported data items in storage."""
+    def __str__(self) -> str: ...
+
+class LlamaIndexItem(DataItem):
+    """Type hint for LlamaIndex data items."""
+    pass

-async def save_data_item_to_storage(data_item: Union[BinaryIO, str, Any], dataset_name: str) -> str:
+async def save_data_item_to_storage(data_item: Union[BinaryIO, str, LlamaIndexItem], dataset_name: str) -> str:
cognee/tests/integration/documents/UnstructuredDocument_test.py (1)

42-42: Enhance test coverage for foreign_metadata.

The test currently uses empty strings for foreign_metadata across all document types. Consider adding test cases with:

  • Non-empty foreign metadata
  • Different metadata structures (e.g., JSON objects)
  • Edge cases (e.g., very large metadata)

Also applies to: 50-50, 58-58, 66-66
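
As a rough sketch of what the suggested non-empty cases could look like (the import path and constructor arguments mirror the existing tests in this PR and are assumptions that may need adjusting):

import json
import uuid

import pytest

# Hypothetical import path; adjust to wherever the document classes are exposed.
from cognee.modules.data.processing.document_types import UnstructuredDocument


@pytest.mark.parametrize(
    "foreign_metadata",
    [
        "",  # the empty case the tests currently cover
        json.dumps({"metadata": {"source": "llama_index"}, "origin": "Document"}),  # JSON payload
        json.dumps({"metadata": {"tags": ["tag"] * 500}, "origin": "bulk"}),  # larger payload
    ],
)
def test_unstructured_document_accepts_foreign_metadata(foreign_metadata):
    document = UnstructuredDocument(
        id=uuid.uuid4(),
        name="example.docx",
        raw_data_location="/tmp/example.docx",
        foreign_metadata=foreign_metadata,
        mime_type="",
    )
    # The value should round-trip unchanged on the document instance.
    assert document.foreign_metadata == foreign_metadata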

notebooks/cognee_llama_index.ipynb (1)

156-159: Document the new metadata handling approach.

Since this notebook serves as an example, it would be helpful to add documentation showing how Foreign Metadata from LlamaIndex Documents appears in the graph visualization. Consider adding a markdown cell after the graph rendering to explain this.

Add a markdown cell like this:

await add(documents)

# Use LLMs and cognee to create knowledge graph
await cognee.cognify()
+```
+
+## Understanding Foreign Metadata in the Graph
+
+When using LlamaIndex Documents, their metadata is now displayed as part of the Document node's properties in the graph. You can observe this in the visualization above, where each Document node includes the original LlamaIndex metadata.
+```
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between bbb8e89 and bd3a5a7.

📒 Files selected for processing (23)
  • cognee/api/v1/add/add_v2.py (2 hunks)
  • cognee/api/v1/cognify/code_graph_pipeline.py (2 hunks)
  • cognee/infrastructure/databases/vector/embeddings/LiteLLMEmbeddingEngine.py (1 hunks)
  • cognee/modules/chunking/TextChunker.py (1 hunks)
  • cognee/modules/data/models/Data.py (2 hunks)
  • cognee/modules/data/models/Metadata.py (0 hunks)
  • cognee/modules/data/operations/delete_metadata.py (0 hunks)
  • cognee/modules/data/operations/get_metadata.py (0 hunks)
  • cognee/modules/data/operations/write_metadata.py (0 hunks)
  • cognee/modules/data/processing/document_types/Document.py (1 hunks)
  • cognee/tasks/documents/classify_documents.py (2 hunks)
  • cognee/tasks/ingestion/__init__.py (1 hunks)
  • cognee/tasks/ingestion/ingest_data.py (1 hunks)
  • cognee/tasks/ingestion/ingest_data_with_metadata.py (0 hunks)
  • cognee/tasks/ingestion/save_data_item_to_storage.py (1 hunks)
  • cognee/tasks/ingestion/save_data_item_with_metadata_to_storage.py (0 hunks)
  • cognee/tasks/ingestion/save_data_to_storage.py (0 hunks)
  • cognee/tests/integration/documents/AudioDocument_test.py (1 hunks)
  • cognee/tests/integration/documents/ImageDocument_test.py (1 hunks)
  • cognee/tests/integration/documents/PdfDocument_test.py (1 hunks)
  • cognee/tests/integration/documents/TextDocument_test.py (1 hunks)
  • cognee/tests/integration/documents/UnstructuredDocument_test.py (1 hunks)
  • notebooks/cognee_llama_index.ipynb (3 hunks)
💤 Files with no reviewable changes (7)
  • cognee/modules/data/operations/get_metadata.py
  • cognee/tasks/ingestion/save_data_to_storage.py
  • cognee/modules/data/models/Metadata.py
  • cognee/modules/data/operations/delete_metadata.py
  • cognee/tasks/ingestion/save_data_item_with_metadata_to_storage.py
  • cognee/tasks/ingestion/ingest_data_with_metadata.py
  • cognee/modules/data/operations/write_metadata.py
⏰ Context from checks skipped due to timeout of 90000ms (5)
  • GitHub Check: test
  • GitHub Check: test
  • GitHub Check: test
  • GitHub Check: windows-latest
  • GitHub Check: docker-compose-test
🔇 Additional comments (12)
cognee/tasks/ingestion/ingest_data.py (2)

142-152: Ensure consistent dataset names for different database engines

The dataset names differ between SQLite and other databases:

  • SQLite uses "main" as the dataset_name.
  • Other databases use "public" as the dataset_name.

Confirm that this discrepancy is intentional and that downstream processes handle these dataset names correctly.


29-36: Improve warning message and ensure consistent metadata handling

In the get_foreign_metadata_dict function, there are a couple of issues to address:

  • Ellipsis in warning message: The warning message on line 34 includes an ellipsis (...), which may be unintentional and could confuse the reader.
  • Returning an empty dictionary: When the data_item does not have a dict method, the function returns an empty dictionary. This might lead to unexpected behavior downstream if not handled properly.

Apply this diff to improve the warning message and consider if returning an empty dictionary is appropriate:

 def get_foreign_metadata_dict(data_item: Union[BinaryIO, str, Any]) -> dict[str, Any]:
     if hasattr(data_item, "dict") and inspect.ismethod(getattr(data_item, "dict")):
         return {"metadata": data_item.dict(), "origin": str(type(data_item))}
     else:
-        warnings.warn(
-            f"Data of type {type(data_item)}... does not have dict method. Returning empty metadata."
-        )
+        warnings.warn(
+            f"Data of type {type(data_item)} does not have a 'dict' method. Returning empty metadata."
+        )
         return {}

Ensure that functions consuming foreign_metadata can handle cases where it's an empty dictionary.

✅ Verification successful

Warning message improvement is safe to apply, empty metadata handling is correct

The codebase is designed to handle empty metadata safely:

  • The foreign_metadata field is optional in the Document model
  • Empty metadata is wrapped in a tuple for additional safety
  • No evidence of code requiring non-empty metadata
🏁 Scripts executed

The following scripts were executed for the analysis:

Error: No shell scripts found


Script:

#!/bin/bash
# Find calls to get_foreign_metadata_dict and its usage context
ast-grep --pattern 'get_foreign_metadata_dict($_)'

# Search for direct references to the function
rg "get_foreign_metadata_dict" -A 3 -B 3

# Look for metadata dictionary access patterns
rg "foreign_metadata\[" -A 2 -B 2
rg "metadata.*dict" -A 2 -B 2

Length of output: 48549

cognee/tasks/ingestion/__init__.py (1)

2-2: Update import statements to reflect refactoring

The import of ingest_data replaces the previous imports of deprecated functions. This change aligns with the refactoring objectives to simplify the ingestion process.

cognee/modules/data/models/Data.py (1)

3-4: LGTM: Import changes align with new schema requirements.

The addition of JSON type import supports the new foreign_metadata column.

cognee/api/v1/add/add_v2.py (2)

25-25: LGTM: Task pipeline update is consistent.

The task list correctly reflects the simplified ingestion process.


5-5: Verify the impact of removing metadata-specific ingestion.

The change from ingest_data_with_metadata to ingest_data aligns with the metadata rework. However, we should verify:

  1. Handling of existing metadata during ingestion
  2. Impact on downstream processes expecting metadata

Let's check for dependencies and usages:

✅ Verification successful

Metadata handling is properly maintained in the new implementation

The consolidation of ingest_data_with_metadata into ingest_data is a safe refactoring that preserves all metadata functionality. The new implementation:

  • Captures and stores all file metadata (name, path, extension, mime_type, hash)
  • Preserves custom metadata through foreign_metadata field
  • Maintains backward compatibility with existing data structures
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for references to metadata ingestion
rg -g '!*.pyc' 'ingest_data_with_metadata'

# Check for potential impacts on downstream processes
rg -g '!*.pyc' -A 5 'metadata.*process|process.*metadata'

Length of output: 37394


Script:

#!/bin/bash
# Check implementation of ingest_data and metadata handling
ast-grep --pattern 'def ingest_data($$$)'

# Look for metadata handling patterns in ingestion module
rg -g '!*.pyc' -A 5 'metadata.*ingest|ingest.*metadata' cognee/tasks/ingestion/

Length of output: 125


Script:

#!/bin/bash
# Find ingestion implementation with broader pattern
fd . cognee/tasks/ingestion -t f -x cat {}

# Search for any metadata handling patterns
rg -g '!*.pyc' 'metadata' cognee/tasks/

Length of output: 17095

cognee/tests/integration/documents/AudioDocument_test.py (1)

32-32: Similar improvements needed as in TextDocument_test.py.

See the review comment in TextDocument_test.py about adding test cases for foreign_metadata handling.

cognee/tests/integration/documents/ImageDocument_test.py (1)

21-21: Similar improvements needed as in TextDocument_test.py.

See the review comment in TextDocument_test.py about adding test cases for foreign_metadata handling.

cognee/api/v1/cognify/code_graph_pipeline.py (1)

13-13: LGTM! Changes align with metadata simplification objective.

The transition from ingest_data_with_metadata to ingest_data is clean and maintains the existing task configuration structure.

Also applies to: 71-71

notebooks/cognee_llama_index.ipynb (3)

122-124: Cell metadata changes look good.

These changes reset the cell execution state, which is a good practice for notebooks that will be shared or committed to version control.


136-136: Verify backward compatibility of the new ingestion method.

The change from ingest_data_with_metadata to ingest_data aligns with the PR's goal of simplifying metadata handling. However, we should verify that this change maintains compatibility with existing LlamaIndex documents.

Let's check if there are any other notebooks or examples that might be affected:

✅ Verification successful

Import change is consistent and properly migrated

The change from ingest_data_with_metadata to ingest_data has been properly implemented across the codebase. The new function is consistently used in all relevant places, and there are no remaining references to the old implementation.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for any remaining references to the old ingestion method
rg "ingest_data_with_metadata" --type py --type jupyter

Length of output: 55


Script:

#!/bin/bash
# Check the implementation of ingest_data and its usages
ast-grep --pattern 'def ingest_data($$$)'

# Find other references to ingest_data to understand usage patterns
rg "ingest_data" -A 3 --type py --type jupyter

Length of output: 2465


156-156: Verify preservation of LlamaIndex Document metadata.

Since this PR changes how Foreign Metadata is handled, we should verify that the LlamaIndex Document metadata is properly preserved in the Document node's graph property.

Let's check the implementation:

✅ Verification successful

Metadata handling is properly implemented and preserved

The change from ingest_data_with_metadata to ingest_data is safe as the metadata from LlamaIndex Documents is automatically extracted and stored in the foreign_metadata field through the get_foreign_metadata_dict helper function.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check how LlamaIndex Document metadata is handled in the new implementation
ast-grep --pattern 'def ingest_data($$$):
  $$$
  # Look for metadata handling
  $$$
'

Length of output: 88


Script:

#!/bin/bash
# Find the ingest_data function definition
rg "def ingest_data" -A 10

# Also search for metadata handling near ingest_data
rg "ingest_data.*metadata|metadata.*ingest_data" -A 5

Length of output: 989


Script:

#!/bin/bash
# Get complete implementation of ingest_data function
cat cognee/tasks/ingestion/ingest_data.py

# Search for LlamaIndex Document handling
rg -l "from llama_index" 
rg "Document.*metadata" -A 5

Length of output: 44109

data_point.mime_type = file_metadata["mime_type"]
data_point.owner_id = user.id
data_point.content_hash = file_metadata["content_hash"]
data_point.foreign_metadata = (get_foreign_metadata_dict(data_item),)

⚠️ Potential issue

Assigning tuple to foreign_metadata may cause data type issues

On line 98, foreign_metadata is assigned as a tuple containing a dictionary:

data_point.foreign_metadata = (get_foreign_metadata_dict(data_item),)

This assignment creates a tuple with one element, which may not be the intended data type for foreign_metadata. Based on the DataPoint model, foreign_metadata is expected to be an optional string.

To fix this issue, remove the parentheses and comma to assign the dictionary directly:

- data_point.foreign_metadata = (get_foreign_metadata_dict(data_item),)
+ data_point.foreign_metadata = get_foreign_metadata_dict(data_item)

Or, if foreign_metadata should be a JSON string, serialize it:

+ import json
  ...
- data_point.foreign_metadata = (get_foreign_metadata_dict(data_item),)
+ data_point.foreign_metadata = json.dumps(get_foreign_metadata_dict(data_item))

mime_type=file_metadata["mime_type"],
owner_id=user.id,
content_hash=file_metadata["content_hash"],
foreign_metadata=get_foreign_metadata_dict(data_item),

⚠️ Potential issue

Ensure correct assignment of foreign_metadata when creating Data object

Similar to the previous issue, on line 109, foreign_metadata is assigned:

foreign_metadata=get_foreign_metadata_dict(data_item),

The trailing comma may inadvertently create a tuple, leading to potential data type issues.

Apply this diff to correct the assignment:

- foreign_metadata=get_foreign_metadata_dict(data_item),
+ foreign_metadata=get_foreign_metadata_dict(data_item)

Or serialize if necessary:

+ import json
  ...
- foreign_metadata=get_foreign_metadata_dict(data_item)
+ foreign_metadata=json.dumps(get_foreign_metadata_dict(data_item))

@@ -7,7 +7,7 @@
class Document(DataPoint):
name: str
raw_data_location: str
metadata_id: UUID
foreign_metadata: Optional[str]

⚠️ Potential issue

Potential type mismatch for foreign_metadata attribute

On line 10, foreign_metadata is defined as:

foreign_metadata: Optional[str]

However, elsewhere in the code (e.g., ingest_data.py), foreign_metadata is assigned a dictionary, which may cause type inconsistencies.

Consider changing the type annotation to accept a dictionary or ensure that foreign_metadata is always a string (e.g., by serializing the dictionary to JSON):

- foreign_metadata: Optional[str]
+ from typing import Optional, Dict, Any
+ foreign_metadata: Optional[Dict[str, Any]]

Or, if foreign_metadata should remain a string, ensure all assignments serialize the data properly:

foreign_metadata=json.dumps(get_foreign_metadata_dict(data_item))

@@ -21,6 +19,7 @@ class Data(Base):
raw_data_location = Column(String)
owner_id = Column(UUID, index=True)
content_hash = Column(String)
foreign_metadata = Column(JSON)

💡 Codebase verification

Consider adding schema validation for foreign_metadata - Confirmed

The codebase shows that foreign_metadata follows an implicit schema {"metadata": dict, "origin": str} but lacks validation. Current implementation silently accepts invalid data, which could lead to inconsistencies. Recommended changes:

  • Add schema validation using Pydantic or similar
  • Document the expected format in the model's docstring
  • Replace the warning with proper error handling for invalid metadata
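
A minimal sketch of what such validation could look like, assuming Pydantic and the implicit {"metadata": dict, "origin": str} shape described above (the class and function names are illustrative, not part of the PR):

from typing import Any, Dict

from pydantic import BaseModel, ValidationError


class ExternalMetadata(BaseModel):
    """Illustrative schema for the implicit {"metadata": dict, "origin": str} shape."""

    metadata: Dict[str, Any]
    origin: str


def validate_external_metadata(raw: dict) -> dict:
    """Validate and return the metadata dict instead of silently accepting bad data."""
    try:
        return ExternalMetadata(**raw).dict()  # .model_dump() on Pydantic v2
    except ValidationError as error:
        raise ValueError(f"Invalid external metadata: {error}") from error


# Usage sketch:
validated = validate_external_metadata({"metadata": {"source": "llama_index"}, "origin": "Document"})
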
🔗 Analysis chain

Consider adding schema validation for foreign_metadata.

While using JSON type provides flexibility, consider:

  1. Adding schema validation to ensure consistent metadata structure
  2. Documenting the expected format
  3. Adding migration strategy for existing metadata

Let's check for any existing schema validation or documentation:

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for schema validation or documentation
rg -g '!*.pyc' -A 5 'foreign_metadata.*schema|metadata.*schema'

# Check for migration files
fd -g 'migrations/*.py' -x cat {}

Length of output: 2493


Script:

#!/bin/bash
# Get the complete Data.py file
cat cognee/modules/data/models/Data.py

# Search for any validation logic or usage of foreign_metadata
rg -g '!*.pyc' -A 5 'foreign_metadata'

# Look for any documentation files
fd -e md -e rst -x grep -l -i "metadata" {} \;

Length of output: 9265

Comment on lines +8 to +13
if "llama_index" in str(type(data_item)):
# Dynamic import is used because the llama_index module is optional.
from .transform_data import get_data_from_llama_index

file_path = get_data_from_llama_index(data_item, dataset_name)


⚠️ Potential issue

Add error handling for llama_index integration.

The dynamic import and processing of llama_index data should include proper error handling:

  1. Import failures
  2. Data processing errors
  3. Invalid llama_index data

     if "llama_index" in str(type(data_item)):
-        # Dynamic import is used because the llama_index module is optional.
-        from .transform_data import get_data_from_llama_index
+        try:
+            # Dynamic import is used because the llama_index module is optional.
+            from .transform_data import get_data_from_llama_index
+        except ImportError as e:
+            raise IngestionError(f"LlamaIndex support not available: {str(e)}")
+        
+        try:
+            file_path = get_data_from_llama_index(data_item, dataset_name)
+        except Exception as e:
+            raise IngestionError(f"Failed to process LlamaIndex data: {str(e)}")
-        file_path = get_data_from_llama_index(data_item, dataset_name)

Committable suggestion skipped: line range outside the PR's diff.

@@ -20,7 +20,7 @@ def test_PdfDocument():
id=uuid.uuid4(),
name="Test document.pdf",
raw_data_location=test_file_path,
metadata_id=uuid.uuid4(),
foreign_metadata="",

🛠️ Refactor suggestion

Enhance test coverage for foreign_metadata scenarios.

Current test uses an empty string for foreign_metadata, but should include tests for:

  1. Valid JSON metadata
  2. Different metadata structures
  3. Edge cases (null, invalid JSON)

Example test cases to add:

def test_PdfDocument_with_metadata():
    metadata_cases = [
        {"source": "llama_index", "metadata": {"key": "value"}},
        {"source": "external", "tags": ["test", "pdf"]},
        None,
        {"invalid": float('inf')},  # Should handle JSON serialization edge cases
    ]
    
    for metadata in metadata_cases:
        document = PdfDocument(
            id=uuid.uuid4(),
            name="Test document.pdf",
            raw_data_location=test_file_path,
            foreign_metadata=metadata,
            mime_type="",
        )
        # Add assertions for metadata handling

@@ -32,7 +32,7 @@ def test_TextDocument(input_file, chunk_size):
id=uuid.uuid4(),
name=input_file,
raw_data_location=test_file_path,
metadata_id=uuid.uuid4(),
foreign_metadata="",

🛠️ Refactor suggestion

Add test cases for foreign_metadata handling.

The test currently initializes foreign_metadata with an empty string but doesn't verify how the document handles various foreign metadata scenarios. Consider adding test cases for:

  • Valid JSON metadata
  • Empty metadata
  • Invalid JSON metadata

Example test case:

@pytest.mark.parametrize(
    "foreign_metadata,expected_result",
    [
        ('{"source": "external", "tags": ["test"]}', True),
        ("", True),
        ("invalid json", False),
    ],
)
def test_TextDocument_foreign_metadata(foreign_metadata, expected_result):
    document = TextDocument(
        id=uuid.uuid4(),
        name="test.txt",
        raw_data_location="test_path",
        foreign_metadata=foreign_metadata,
        mime_type=""
    )
    assert hasattr(document, "foreign_metadata") == expected_result

@@ -1,4 +1,5 @@
from cognee.modules.data.models import Data
import json

🛠️ Refactor suggestion

Add error handling for JSON serialization and remove unnecessary indentation.

The current implementation has two potential issues:

  1. No error handling for JSON serialization failures
  2. Unnecessary overhead from pretty-printing JSON with indent=4

Consider this implementation:

 async def classify_documents(data_documents: list[Data]) -> list[Document]:
     documents = []
     for data_item in data_documents:
+        try:
+            metadata_json = json.dumps(data_item.foreign_metadata)
+        except (TypeError, ValueError) as e:
+            logger.error(f"Failed to serialize foreign_metadata for {data_item.id}: {e}")
+            metadata_json = "{}"
+
         document = EXTENSION_TO_DOCUMENT_CLASS[data_item.extension](
             id=data_item.id,
             title=f"{data_item.name}.{data_item.extension}",
             raw_data_location=data_item.raw_data_location,
             name=data_item.name,
             mime_type=data_item.mime_type,
-            foreign_metadata=json.dumps(data_item.foreign_metadata, indent=4),
+            foreign_metadata=metadata_json,
         )

Also applies to: 61-61

Comment on lines +96 to 99
_metadata={"index_fields": ["text"]},
)
except Exception as e:
print(e)

🛠️ Refactor suggestion

Improve error handling in DocumentChunk creation.

Replace print statements with proper logging:

-            except Exception as e:
-                print(e)
+            except Exception as e:
+                logger.error("Failed to create DocumentChunk: %s", str(e))

Also consider:

  1. Adding more specific exception types
  2. Including context information in the error message
  3. Potentially propagating the error up the stack
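
A hedged sketch of what items 1-3 could look like together; the logger, the helper signature, and the custom exception type are assumptions, not part of the PR:

import logging

logger = logging.getLogger(__name__)


class DocumentChunkError(Exception):
    """Hypothetical error type for chunk-creation failures."""


def build_chunk(chunk_factory, chunk_data: dict, document_id):
    try:
        return chunk_factory(**chunk_data)
    except (TypeError, ValueError) as error:  # more specific than a bare Exception
        # Include context so the failure can be traced back to its document,
        # then propagate instead of swallowing the error.
        logger.error("Failed to create DocumentChunk for document %s: %s", document_id, error)
        raise DocumentChunkError(f"Chunk creation failed for document {document_id}") from error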

Committable suggestion skipped: line range outside the PR's diff.

Comment on lines +98 to 100
except litellm.exceptions.BadRequestError:
raise EmbeddingException("Failed to index data points.")


🛠️ Refactor suggestion

Enhance error handling granularity.

The current implementation might miss important OpenAI-specific errors. Consider:

  1. Restoring specific handling for OpenAIError
  2. Adding specific error messages for different failure scenarios
-        except litellm.exceptions.BadRequestError:
-            raise EmbeddingException("Failed to index data points.")
+        except litellm.exceptions.BadRequestError as e:
+            raise EmbeddingException(f"Bad request while indexing data points: {str(e)}")
+        except litellm.llms.OpenAI.openai.OpenAIError as e:
+            raise EmbeddingException(f"OpenAI service error: {str(e)}")
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
except litellm.exceptions.BadRequestError:
raise EmbeddingException("Failed to index data points.")
except litellm.exceptions.BadRequestError as e:
raise EmbeddingException(f"Bad request while indexing data points: {str(e)}")
except litellm.llms.OpenAI.openai.OpenAIError as e:
raise EmbeddingException(f"OpenAI service error: {str(e)}")

@Vasilije1990 Vasilije1990 self-requested a review January 21, 2025 13:02
dexters1 and others added 3 commits January 21, 2025 18:20
@coderabbitai coderabbitai bot (Contributor) left a comment

Actionable comments posted: 2

♻️ Duplicate comments (1)
cognee/tasks/ingestion/ingest_data.py (1)

98-98: ⚠️ Potential issue

Fix tuple assignment to external_metadata.

The trailing comma creates an unintended tuple.

-data_point.external_metadata = (get_external_metadata_dict(data_item),)
+data_point.external_metadata = get_external_metadata_dict(data_item)
🧹 Nitpick comments (3)
cognee/tasks/ingestion/ingest_data.py (3)

1-18: Consider reorganizing imports for better readability.

Consider organizing imports into standard library, third-party, and local application imports with a blank line between each group:

 from typing import Any, List, Union, BinaryIO
 import inspect
 import warnings
 
 import dlt
 
 import cognee.modules.ingestion as ingestion
 from cognee.infrastructure.databases.relational import get_relational_engine
 from cognee.modules.data.methods import create_dataset
 from cognee.modules.data.models.DatasetData import DatasetData
 from cognee.modules.users.models import User
 from cognee.modules.users.permissions.methods import give_permission_on_document
 from cognee.shared.utils import send_telemetry
 from .get_dlt_destination import get_dlt_destination
 from .save_data_item_to_storage import save_data_item_to_storage

29-36: Enhance the helper function with documentation and improved error handling.

The function could benefit from:

  1. A docstring explaining its purpose and return value
  2. More specific type hints
  3. A more informative warning message
-def get_external_metadata_dict(data_item: Union[BinaryIO, str, Any]) -> dict[str, Any]:
+def get_external_metadata_dict(data_item: Any) -> dict[str, Any]:
+    """Extract metadata from a data item.
+    
+    Args:
+        data_item: Any object that might have a dict method for metadata extraction
+        
+    Returns:
+        dict: A dictionary containing metadata and origin information, or empty dict if extraction fails
+    """
     if hasattr(data_item, "dict") and inspect.ismethod(getattr(data_item, "dict")):
         return {"metadata": data_item.dict(), "origin": str(type(data_item))}
     else:
         warnings.warn(
-            f"Data of type {type(data_item)}... does not have dict method. Returning empty metadata."
+            f"Unable to extract metadata from {type(data_item).__name__}: object does not implement dict() method"
         )
         return {}

137-154: Add error handling for pipeline execution and extract configuration.

The pipeline configuration should be handled more robustly:

+def _get_pipeline_config(dialect_name: str) -> tuple[str, str]:
+    """Get pipeline configuration based on database dialect.
+    
+    Returns:
+        tuple: (dataset_name, table_name)
+    """
+    if dialect_name == "sqlite":
+        return "main", "file_metadata"
+    return "public", "file_metadata"

 if db_engine.engine.dialect.name == "sqlite":
-    # To use sqlite with dlt dataset_name must be set to "main".
-    # Sqlite doesn't support schemas
-    run_info = pipeline.run(
-        data_resources(file_paths, user),
-        table_name="file_metadata",
-        dataset_name="main",
-        write_disposition="merge",
-    )
-else:
-    # Data should be stored in the same schema to allow deduplication
-    run_info = pipeline.run(
-        data_resources(file_paths, user),
-        table_name="file_metadata",
-        dataset_name="public",
-        write_disposition="merge",
-    )
+    dataset_name, table_name = _get_pipeline_config(db_engine.engine.dialect.name)
+    try:
+        run_info = pipeline.run(
+            data_resources(file_paths, user),
+            table_name=table_name,
+            dataset_name=dataset_name,
+            write_disposition="merge",
+        )
+    except Exception as e:
+        raise RuntimeError(f"Pipeline execution failed: {str(e)}") from e
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 655ab0b and 40c0279.

📒 Files selected for processing (9)
  • cognee/modules/data/models/Data.py (2 hunks)
  • cognee/modules/data/processing/document_types/Document.py (1 hunks)
  • cognee/tasks/documents/classify_documents.py (2 hunks)
  • cognee/tasks/ingestion/ingest_data.py (1 hunks)
  • cognee/tests/integration/documents/AudioDocument_test.py (1 hunks)
  • cognee/tests/integration/documents/ImageDocument_test.py (1 hunks)
  • cognee/tests/integration/documents/PdfDocument_test.py (1 hunks)
  • cognee/tests/integration/documents/TextDocument_test.py (1 hunks)
  • cognee/tests/integration/documents/UnstructuredDocument_test.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (8)
  • cognee/modules/data/processing/document_types/Document.py
  • cognee/tests/integration/documents/ImageDocument_test.py
  • cognee/tests/integration/documents/AudioDocument_test.py
  • cognee/modules/data/models/Data.py
  • cognee/tests/integration/documents/TextDocument_test.py
  • cognee/tasks/documents/classify_documents.py
  • cognee/tests/integration/documents/UnstructuredDocument_test.py
  • cognee/tests/integration/documents/PdfDocument_test.py
⏰ Context from checks skipped due to timeout of 90000ms (19)
  • GitHub Check: run_multimedia_example_test / test
  • GitHub Check: run_notebook_test / test
  • GitHub Check: run_notebook_test / test
  • GitHub Check: run_notebook_test / test
  • GitHub Check: run_notebook_test / test
  • GitHub Check: test
  • GitHub Check: test
  • GitHub Check: run_simple_example_test / test
  • GitHub Check: run_dynamic_steps_example_test / test
  • GitHub Check: test
  • GitHub Check: test
  • GitHub Check: test
  • GitHub Check: test
  • GitHub Check: test
  • GitHub Check: windows-latest
  • GitHub Check: test
  • GitHub Check: test
  • GitHub Check: profiler
  • GitHub Check: docker-compose-test

Comment on lines +55 to +128
data = [data]

async with db_engine.get_async_session() as session:
# Read metadata stored with dlt
files_metadata = await db_engine.get_all_data_from_table(table_name, dataset_name)
for file_metadata in files_metadata:
from sqlalchemy import select
from cognee.modules.data.models import Data
file_paths = []

dataset = await create_dataset(dataset_name, user.id, session)
# Process data
for data_item in data:
file_path = await save_data_item_to_storage(data_item, dataset_name)

data = (
await session.execute(select(Data).filter(Data.id == UUID(file_metadata["id"])))
).scalar_one_or_none()
file_paths.append(file_path)

if data is not None:
data.name = file_metadata["name"]
data.raw_data_location = file_metadata["file_path"]
data.extension = file_metadata["extension"]
data.mime_type = file_metadata["mime_type"]
# Ingest data and add metadata
with open(file_path.replace("file://", ""), mode="rb") as file:
classified_data = ingestion.classify(file)

# data_id is the hash of file contents + owner id to avoid duplicate data
data_id = ingestion.identify(classified_data, user)

file_metadata = classified_data.get_metadata()

from sqlalchemy import select

from cognee.modules.data.models import Data

db_engine = get_relational_engine()

async with db_engine.get_async_session() as session:
dataset = await create_dataset(dataset_name, user.id, session)

# Check to see if data should be updated
data_point = (
await session.execute(select(Data).filter(Data.id == data_id))
).scalar_one_or_none()

if data_point is not None:
data_point.name = file_metadata["name"]
data_point.raw_data_location = file_metadata["file_path"]
data_point.extension = file_metadata["extension"]
data_point.mime_type = file_metadata["mime_type"]
data_point.owner_id = user.id
data_point.content_hash = file_metadata["content_hash"]
data_point.external_metadata = (get_external_metadata_dict(data_item),)
await session.merge(data_point)
else:
data_point = Data(
id=data_id,
name=file_metadata["name"],
raw_data_location=file_metadata["file_path"],
extension=file_metadata["extension"],
mime_type=file_metadata["mime_type"],
owner_id=user.id,
content_hash=file_metadata["content_hash"],
external_metadata=get_external_metadata_dict(data_item),
)

# Check if data is already in dataset
dataset_data = (
await session.execute(
select(DatasetData).filter(
DatasetData.data_id == data_id, DatasetData.dataset_id == dataset.id
)
)
).scalar_one_or_none()
# If data is not present in dataset add it
if dataset_data is None:
dataset.data.append(data_point)

await session.merge(data)
await session.commit()
else:
data = Data(
id=UUID(file_metadata["id"]),
name=file_metadata["name"],
raw_data_location=file_metadata["file_path"],
extension=file_metadata["extension"],
mime_type=file_metadata["mime_type"],
)

dataset.data.append(data)
await session.commit()

await give_permission_on_document(user, UUID(file_metadata["id"]), "read")
await give_permission_on_document(user, UUID(file_metadata["id"]), "write")
await give_permission_on_document(user, data_id, "read")
await give_permission_on_document(user, data_id, "write")
return file_paths

🛠️ Refactor suggestion

Refactor long function into smaller, focused functions.

The data_storing function is too complex and handles multiple responsibilities. Consider breaking it down into smaller functions:

  1. Extract database operations:
async def _update_data_point(data_point: Data, file_metadata: dict, user: User, data_item: Any) -> None:
    data_point.name = file_metadata["name"]
    data_point.raw_data_location = file_metadata["file_path"]
    data_point.extension = file_metadata["extension"]
    data_point.mime_type = file_metadata["mime_type"]
    data_point.owner_id = user.id
    data_point.content_hash = file_metadata["content_hash"]
    data_point.external_metadata = get_external_metadata_dict(data_item)

async def _create_data_point(data_id: str, file_metadata: dict, user: User, data_item: Any) -> Data:
    return Data(
        id=data_id,
        name=file_metadata["name"],
        raw_data_location=file_metadata["file_path"],
        extension=file_metadata["extension"],
        mime_type=file_metadata["mime_type"],
        owner_id=user.id,
        content_hash=file_metadata["content_hash"],
        external_metadata=get_external_metadata_dict(data_item)
    )
  2. Add error handling:
 async def data_storing(data: Any, dataset_name: str, user: User):
     if not isinstance(data, list):
         data = [data]

     file_paths = []

     for data_item in data:
-        file_path = await save_data_item_to_storage(data_item, dataset_name)
-        file_paths.append(file_path)
+        try:
+            file_path = await save_data_item_to_storage(data_item, dataset_name)
+            file_paths.append(file_path)
+        except Exception as e:
+            raise RuntimeError(f"Failed to save data item to storage: {str(e)}") from e

         async with db_engine.get_async_session() as session:
+            try:
                 dataset = await create_dataset(dataset_name, user.id, session)
                 # ... rest of the database operations ...
                 await session.commit()
+            except Exception as e:
+                await session.rollback()
+                raise RuntimeError(f"Database operation failed: {str(e)}") from e

Comment on lines +38 to 53
@dlt.resource(standalone=True, primary_key="id", merge_key="id")
async def data_resources(file_paths: List[str], user: User):
for file_path in file_paths:
with open(file_path.replace("file://", ""), mode="rb") as file:
classified_data = ingestion.classify(file)
data_id = ingestion.identify(classified_data)
data_id = ingestion.identify(classified_data, user)
file_metadata = classified_data.get_metadata()
yield {
"id": data_id,
"name": file_metadata["name"],
"file_path": file_metadata["file_path"],
"extension": file_metadata["extension"],
"mime_type": file_metadata["mime_type"],
"content_hash": file_metadata["content_hash"],
"owner_id": str(user.id),
}

🛠️ Refactor suggestion

Add error handling for file operations.

The function should handle potential file operation errors and validate inputs:

 @dlt.resource(standalone=True, primary_key="id", merge_key="id")
 async def data_resources(file_paths: List[str], user: User):
+    if not file_paths:
+        raise ValueError("file_paths cannot be empty")
+
     for file_path in file_paths:
-        with open(file_path.replace("file://", ""), mode="rb") as file:
-            classified_data = ingestion.classify(file)
-            data_id = ingestion.identify(classified_data, user)
-            file_metadata = classified_data.get_metadata()
-            yield {
-                "id": data_id,
-                "name": file_metadata["name"],
-                "file_path": file_metadata["file_path"],
-                "extension": file_metadata["extension"],
-                "mime_type": file_metadata["mime_type"],
-                "content_hash": file_metadata["content_hash"],
-                "owner_id": str(user.id),
-            }
+        try:
+            with open(file_path.replace("file://", ""), mode="rb") as file:
+                classified_data = ingestion.classify(file)
+                data_id = ingestion.identify(classified_data, user)
+                file_metadata = classified_data.get_metadata()
+                yield {
+                    "id": data_id,
+                    "name": file_metadata["name"],
+                    "file_path": file_metadata["file_path"],
+                    "extension": file_metadata["extension"],
+                    "mime_type": file_metadata["mime_type"],
+                    "content_hash": file_metadata["content_hash"],
+                    "owner_id": str(user.id),
+                }
+        except FileNotFoundError as e:
+            raise ValueError(f"File not found: {file_path}") from e
+        except Exception as e:
+            raise RuntimeError(f"Error processing file {file_path}: {str(e)}") from e
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
@dlt.resource(standalone=True, primary_key="id", merge_key="id")
async def data_resources(file_paths: List[str], user: User):
for file_path in file_paths:
with open(file_path.replace("file://", ""), mode="rb") as file:
classified_data = ingestion.classify(file)
data_id = ingestion.identify(classified_data)
data_id = ingestion.identify(classified_data, user)
file_metadata = classified_data.get_metadata()
yield {
"id": data_id,
"name": file_metadata["name"],
"file_path": file_metadata["file_path"],
"extension": file_metadata["extension"],
"mime_type": file_metadata["mime_type"],
"content_hash": file_metadata["content_hash"],
"owner_id": str(user.id),
}
@dlt.resource(standalone=True, primary_key="id", merge_key="id")
async def data_resources(file_paths: List[str], user: User):
if not file_paths:
raise ValueError("file_paths cannot be empty")
for file_path in file_paths:
try:
with open(file_path.replace("file://", ""), mode="rb") as file:
classified_data = ingestion.classify(file)
data_id = ingestion.identify(classified_data, user)
file_metadata = classified_data.get_metadata()
yield {
"id": data_id,
"name": file_metadata["name"],
"file_path": file_metadata["file_path"],
"extension": file_metadata["extension"],
"mime_type": file_metadata["mime_type"],
"content_hash": file_metadata["content_hash"],
"owner_id": str(user.id),
}
except FileNotFoundError as e:
raise ValueError(f"File not found: {file_path}") from e
except Exception as e:
raise RuntimeError(f"Error processing file {file_path}: {str(e)}") from e

@Vasilije1990 Vasilije1990 merged commit 90657a2 into dev Jan 23, 2025
28 checks passed
@Vasilije1990 Vasilije1990 deleted the COG-793-metadata-rework branch January 23, 2025 10:00