Integrate graphiti's temporal awareness functionality as Tasks #253

Merged
5 commits merged into main on Dec 4, 2024

Conversation

@alekszievr alekszievr commented Dec 4, 2024

Summary by CodeRabbit

  • New Features

    • Introduced asynchronous functions for building and searching graphs with temporal awareness.
    • Added an example script demonstrating how to utilize the new functionalities in the cognee library.
  • Bug Fixes

    • N/A
  • Documentation

    • N/A
  • Refactor

    • N/A
  • Style

    • N/A
  • Tests

    • N/A
  • Chores

    • N/A
  • Revert

    • N/A

coderabbitai bot commented Dec 4, 2024

Caution

Review failed

The pull request is closed.

Walkthrough

This pull request introduces several changes to the cognee library, specifically within the temporal_awareness module. It adds two asynchronous functions: build_graph_with_temporal_awareness and search_graph_with_temporal_awareness, which facilitate graph construction and querying. The __init__.py file is updated to enable access to these functions at the package level. Additionally, a new example script, graphiti_example.py, is provided to demonstrate the usage of these functions in an asynchronous context with a sample text list.

Changes

File | Change Summary
cognee/tasks/temporal_awareness/__init__.py | Added imports for build_graph_with_temporal_awareness and search_graph_with_temporal_awareness.
cognee/tasks/temporal_awareness/build_graph_with_temporal_awareness.py | Introduced asynchronous function build_graph_with_temporal_awareness(text_list) for building a graph.
cognee/tasks/temporal_awareness/search_graph_with_temporal_awareness.py | Introduced asynchronous function search_graph_with_temporal_awareness(graphiti, query) for searching a graph.
examples/python/graphiti_example.py | Added a new example script demonstrating the use of the new functions with asynchronous tasks.

Poem

In the realm of code where rabbits play,
New functions hop in, brightening the day.
With graphs that build and queries that seek,
Temporal awareness, oh so unique!
Let's dance with data, let the tasks unfold,
In the world of cognee, adventures untold! 🐇✨


📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 0d2a9e9 and b571fb5.

📒 Files selected for processing (1)
  • cognee/tasks/temporal_awareness/build_graph_with_temporal_awareness.py (1 hunks)

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 6

🧹 Outside diff range and nitpick comments (6)
examples/python/graphiti_example.py (4)

4-4: Remove unused import

The SearchType import is not used in the code.

-from cognee.api.v1.search import SearchType

9-13: Consider moving test data to a separate file

The example text data should be moved to a separate data file for better maintainability and reusability.

Create a new file examples/python/data/sample_texts.py:

KAMALA_HARRIS_TIMELINE = [
    "Kamala Harris is the Attorney General of California. She was previously "
    "the district attorney for San Francisco.",
    "As AG, Harris was in office from January 3, 2011 – January 3, 2017",
]

Then import and use it in the example:

-text_list = [
-    "Kamala Harris is the Attorney General of California. She was previously "
-    "the district attorney for San Francisco.",
-    "As AG, Harris was in office from January 3, 2011 – January 3, 2017",
-]
+from data.sample_texts import KAMALA_HARRIS_TIMELINE
+
+text_list = KAMALA_HARRIS_TIMELINE

24-25: Improve result handling and presentation

The results are printed without context, making it difficult to understand the output.

-    async for result in pipeline:
-        print(result)
+    try:
+        async for result in pipeline:
+            print(f"Search Result: {result}")
+    except Exception as e:
+        print(f"Pipeline execution failed: {e}")

28-29: Add proper script execution handling

The main execution block should handle keyboard interrupts and provide proper exit codes.

 if __name__ == '__main__':
-    asyncio.run(main())
+    try:
+        asyncio.run(main())
+    except KeyboardInterrupt:
+        print("\nOperation cancelled by user")
+        exit(1)
+    except Exception as e:
+        print(f"Execution failed: {e}")
+        exit(1)
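
A minimal sketch of the same idea, assuming the script should return a distinct exit status for an interrupt. The run helper and the placeholder main are hypothetical names, not part of the example script. It uses sys.exit rather than the interactive exit helper and writes errors to stderr.

```python
import asyncio
import sys


async def main() -> None:
    # Placeholder for the example's real pipeline work.
    await asyncio.sleep(0)


def run(entry=main) -> int:
    """Run the coroutine function and translate failures into an exit code."""
    try:
        asyncio.run(entry())
    except KeyboardInterrupt:
        print("\nOperation cancelled by user", file=sys.stderr)
        return 130  # conventional shell status for termination by SIGINT
    except Exception as exc:
        print(f"Execution failed: {exc}", file=sys.stderr)
        return 1
    return 0
```

In the script's main guard this would be called as sys.exit(run()), so the process exit status reflects the outcome.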
cognee/tasks/temporal_awareness/build_graph_with_temporal_awareness.py (2)

13-21: Add batch processing capability for better performance.

Consider implementing batch processing for multiple episodes to improve performance with large text lists.

+    BATCH_SIZE = 100
+    for i in range(0, len(text_list), BATCH_SIZE):
+        batch = text_list[i:i + BATCH_SIZE]
+        tasks = [
+            graphiti.add_episode(
+                name=f"episode_{i+j}",
+                episode_body=text,
+                source=EpisodeType.text,
+                source_description="input",
+                reference_time=datetime.now()
+            )
+            for j, text in enumerate(batch)
+        ]
+        await asyncio.gather(*tasks)
+        print(f"Added batch of {len(batch)} episodes...")
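
The batching suggestion above can be sketched in isolation as follows. FakeGraphClient is a hypothetical in-memory stand-in, not the Graphiti client, and its add_episode signature is simplified. Whether the real client tolerates concurrent add_episode calls is an assumption that would need to be verified before adopting this pattern.

```python
import asyncio
from datetime import datetime, timezone


class FakeGraphClient:
    """Hypothetical stand-in for a graph client; records episode names in memory."""

    def __init__(self) -> None:
        self.episodes: list[str] = []

    async def add_episode(self, name: str, episode_body: str, reference_time: datetime) -> None:
        await asyncio.sleep(0)  # simulate an I/O round trip
        self.episodes.append(name)


async def add_in_batches(client, text_list: list[str], batch_size: int = 100) -> int:
    """Add texts in fixed-size batches, awaiting each batch before starting the next."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    added = 0
    for start in range(0, len(text_list), batch_size):
        batch = text_list[start:start + batch_size]
        # Episodes within one batch run concurrently; batches run one after another.
        await asyncio.gather(*(
            client.add_episode(
                name=f"episode_{start + offset}",
                episode_body=text,
                reference_time=datetime.now(timezone.utc),
            )
            for offset, text in enumerate(batch)
        ))
        added += len(batch)
    return added
```

Awaiting each batch before the next one caps the number of in-flight requests at batch_size, which is the main reason to batch rather than gather everything at once.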

21-21: Improve logging mechanism.

Replace print statements with proper logging to enable better monitoring and debugging.

+import logging
+
+logger = logging.getLogger(__name__)
+
 async def build_graph_with_temporal_awareness(text_list: list[str]) -> Graphiti:
     # ... initialization code ...
-    print("Graph database initialized.")
+    logger.info("Graph database initialized")
     
     for i, text in enumerate(text_list):
         # ... episode addition code ...
-        print(f"Added text: {text[:35]}...")
+        logger.debug(f"Added episode_{i}: {text[:35]}...")
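
A small sketch of the logging variant, assuming a module-level logger. The logger name and both helper functions are hypothetical. It passes arguments to the logger instead of using an f-string, so the message is only formatted when the debug level is enabled, and it appends the ellipsis only when the text was actually truncated.

```python
import logging

logger = logging.getLogger("temporal_awareness_example")  # hypothetical logger name


def preview(text: str, limit: int = 35) -> str:
    """Return a short prefix of the text, adding an ellipsis only when truncated."""
    return text if len(text) <= limit else text[:limit] + "..."


def log_episode_added(index: int, text: str) -> None:
    # %-style arguments defer string formatting until the record is emitted.
    logger.debug("Added episode_%d: %s", index, preview(text))
```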
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between df8fc82 and 0d2a9e9.

📒 Files selected for processing (4)
  • cognee/tasks/temporal_awareness/__init__.py (1 hunks)
  • cognee/tasks/temporal_awareness/build_graph_with_temporal_awareness.py (1 hunks)
  • cognee/tasks/temporal_awareness/search_graph_with_temporal_awareness.py (1 hunks)
  • examples/python/graphiti_example.py (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • cognee/tasks/temporal_awareness/__init__.py
🔇 Additional comments (1)
cognee/tasks/temporal_awareness/build_graph_with_temporal_awareness.py (1)

1-5: LGTM! Imports are well-organized and necessary.

The imports are correctly separated between standard library and third-party packages.

Comment on lines +3 to +6
async def search_graph_with_temporal_awareness(graphiti, query):
    search_result = await graphiti.search(query)
    await graphiti.close()
    return search_result
🛠️ Refactor suggestion

Add error handling, type hints, and documentation

The function needs several improvements for robustness and maintainability:

Consider this enhanced implementation:

-async def search_graph_with_temporal_awareness(graphiti, query):
-    search_result = await graphiti.search(query)
-    await graphiti.close()
-    return search_result
+from typing import Any
+
+async def search_graph_with_temporal_awareness(graphiti: Any, query: str) -> Any:
+    """Search the temporal awareness graph with the given query.
+    
+    Args:
+        graphiti: The graphiti instance to use for searching
+        query: The search query string
+        
+    Returns:
+        The search result from graphiti
+        
+    Raises:
+        Exception: If the search operation fails
+    """
+    try:
+        search_result = await graphiti.search(query)
+        return search_result
+    except Exception as e:
+        raise Exception(f"Failed to search graph: {str(e)}") from e
+    finally:
+        try:
+            await graphiti.close()
+        except Exception:
+            # Log error but don't fail the operation
+            pass
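
A hedged alternative sketch of the cleanup pattern, using a hypothetical FakeClient so that it runs without a database. Unlike the suggestion above, it lets the original search exception propagate unchanged and logs a failed close instead of silently discarding it.

```python
import asyncio
import logging
from typing import Any

logger = logging.getLogger(__name__)


async def search_and_close(client: Any, query: str) -> Any:
    """Run one search, then close the client even if the search raised."""
    try:
        return await client.search(query)
    finally:
        try:
            await client.close()
        except Exception:
            # Record the failure; it must not mask the search result or error.
            logger.exception("Failed to close graph client")


class FakeClient:
    """Hypothetical in-memory client used only to exercise the sketch."""

    def __init__(self) -> None:
        self.closed = False

    async def search(self, query: str) -> list[str]:
        return [f"match for {query!r}"]

    async def close(self) -> None:
        self.closed = True
```

Closing inside the search function means the client cannot be reused for a second query, so a caller that needs several searches would have to own the close step itself.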

Comment on lines +17 to +20
tasks = [
    Task(build_graph_with_temporal_awareness, text_list=text_list),
    Task(search_graph_with_temporal_awareness, query='Who was the California Attorney General?')
]
⚠️ Potential issue

Improve task dependency management and error handling

The second task depends on the result of the first task, but this dependency isn't properly managed.

Consider this improved implementation:

-    tasks = [
-        Task(build_graph_with_temporal_awareness, text_list=text_list),
-        Task(search_graph_with_temporal_awareness, query='Who was the California Attorney General?')
-    ]
+    try:
+        graphiti = await build_graph_with_temporal_awareness(text_list=text_list)
+        tasks = [
+            Task(search_graph_with_temporal_awareness, 
+                 graphiti=graphiti, 
+                 query='Who was the California Attorney General?')
+        ]
+    except Exception as e:
+        print(f"Failed to build graph: {e}")
+        return
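
The dependency in question can be illustrated with a hypothetical sequential runner that forwards each step's return value to the next step. This is not cognee's Task or pipeline API. If the real pipeline runner forwards results in the same way (an assumption not confirmed in this thread), the original two-task list would already express the dependency.

```python
import asyncio
from typing import Any, Awaitable, Callable

Step = Callable[[Any], Awaitable[Any]]


async def run_in_sequence(initial: Any, steps: list[Step]) -> Any:
    """Feed each step's return value into the next step."""
    value = initial
    for step in steps:
        value = await step(value)
    return value


async def build(texts: list[str]) -> dict[str, list[str]]:
    # Stand-in for graph construction: returns a handle the next step needs.
    return {"episodes": list(texts)}


async def search(handle: dict[str, list[str]]) -> list[str]:
    # Stand-in for querying: depends on the handle produced by build().
    return [t for t in handle["episodes"] if "Attorney General" in t]
```

With this shape, asyncio.run(run_in_sequence(text_list, [build, search])) runs the build first and hands its result to the search.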

Committable suggestion skipped: line range outside the PR's diff.

Comment on lines 9 to 10
graphiti = Graphiti("bolt://localhost:7687", "neo4j", "pleaseletmein")
await graphiti.build_indices_and_constraints()
⚠️ Potential issue

Critical: Remove hardcoded credentials and add error handling.

Hardcoded credentials pose a security risk and should be moved to environment variables. Additionally, database operations should include proper error handling.

+import os

 async def build_graph_with_temporal_awareness(text_list: list[str]) -> Graphiti:
-    graphiti = Graphiti("bolt://localhost:7687", "neo4j", "pleaseletmein")
-    await graphiti.build_indices_and_constraints() 
+    try:
+        graphiti = Graphiti(
+            os.getenv("NEO4J_URI", "bolt://localhost:7687"),
+            os.getenv("NEO4J_USER", "neo4j"),
+            os.getenv("NEO4J_PASSWORD")
+        )
+        await graphiti.build_indices_and_constraints()
+    except Exception as e:
+        raise ConnectionError(f"Failed to initialize graph database: {e}") from e
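
A minimal sketch of reading the connection settings from the environment and failing fast when the password is missing. The variable names NEO4J_URI, NEO4J_USER and NEO4J_PASSWORD come from the suggestion above; the Neo4jSettings class and load_neo4j_settings function are hypothetical, and cognee's own configuration system may use different names.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Neo4jSettings:
    uri: str
    user: str
    password: str


def load_neo4j_settings(env: Mapping[str, str] = os.environ) -> Neo4jSettings:
    """Read connection settings, failing fast when the password is missing."""
    password = env.get("NEO4J_PASSWORD")
    if not password:
        raise RuntimeError("NEO4J_PASSWORD is not set")
    return Neo4jSettings(
        uri=env.get("NEO4J_URI", "bolt://localhost:7687"),
        user=env.get("NEO4J_USER", "neo4j"),
        password=password,
    )
```

Accepting the environment as a mapping keeps the function testable without modifying the process environment.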

from graphiti_core.nodes import EpisodeType


async def build_graph_with_temporal_awareness(text_list):
🛠️ Refactor suggestion

Add type hints and documentation.

The function lacks type hints and documentation, which are essential for maintainability and usability.

-async def build_graph_with_temporal_awareness(text_list):
+async def build_graph_with_temporal_awareness(text_list: list[str]) -> Graphiti:
+    """Build a temporal awareness graph from a list of texts.
+
+    Args:
+        text_list: List of text strings to be added as episodes.
+
+    Returns:
+        Graphiti: Initialized graphiti instance with added episodes.
+
+    Raises:
+        ConnectionError: If unable to connect to Neo4j database.
+    """

Committable suggestion skipped: line range outside the PR's diff.

Comment on lines +13 to +21
for i, text in enumerate(text_list):
    await graphiti.add_episode(
        name=f"episode_{i}",
        episode_body=text,
        source=EpisodeType.text,
        source_description="input",
        reference_time=datetime.now()
    )
    print(f"Added text: {text[:35]}...")
🛠️ Refactor suggestion

Improve temporal awareness implementation.

The current implementation uses the same timestamp (current time) for all episodes, which might not be accurate for temporal awareness. Consider accepting a reference time parameter or extracting temporal information from the text.

-async def build_graph_with_temporal_awareness(text_list: list[str]) -> Graphiti:
+async def build_graph_with_temporal_awareness(
+    text_list: list[str],
+    reference_times: list[datetime] | None = None
+) -> Graphiti:
     # ... initialization code ...
     
     for i, text in enumerate(text_list):
+        reference_time = (
+            reference_times[i] if reference_times 
+            else datetime.now()
+        )
         await graphiti.add_episode(
             name=f"episode_{i}",
             episode_body=text,
             source=EpisodeType.text,
             source_description="input",
-            reference_time=datetime.now()
+            reference_time=reference_time
         )
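
The pairing logic can be sketched on its own, under the assumption that callers pass one reference time per text. The helper name pair_with_reference_times is hypothetical. It validates the list lengths up front, since indexing reference_times[i] directly would raise IndexError on a short list, and it uses timezone-aware timestamps.

```python
from datetime import datetime, timezone
from typing import Optional


def pair_with_reference_times(
    text_list: list[str],
    reference_times: Optional[list[datetime]] = None,
) -> list[tuple[str, datetime]]:
    """Pair each text with its reference time, defaulting to one shared 'now'."""
    if reference_times is None:
        now = datetime.now(timezone.utc)
        return [(text, now) for text in text_list]
    if len(reference_times) != len(text_list):
        raise ValueError("reference_times must match text_list in length")
    return list(zip(text_list, reference_times))
```

For the sample data, the second text could be paired with datetime(2011, 1, 3, tzinfo=timezone.utc), the start date it mentions.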

Committable suggestion skipped: line range outside the PR's diff.

Comment on lines +3 to +4
from graphiti_core import Graphiti
from graphiti_core.nodes import EpisodeType
💡 Codebase verification

⚠️ Missing dependency specification for graphiti-core

The code correctly uses the Graphiti API with proper initialization and method calls (Graphiti() constructor, build_indices_and_constraints(), add_episode()), and the EpisodeType enum is used appropriately. However, the graphiti-core dependency is not specified in any requirements file, which could lead to installation issues.

  • Add graphiti-core to your project's requirements file to ensure proper dependency management
  • Consider externalizing the database connection parameters instead of hardcoding them
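
As a complement to declaring the dependency, a quick runtime check can report whether a distribution is installed. This is a sketch using the standard library's importlib.metadata; the helper name installed_version is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None
```

Calling installed_version("graphiti-core") would return None in an environment where the package has not been installed.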
🔗 Analysis chain

Verify Graphiti API usage.

Let's verify that we're using the correct Graphiti API methods and that all required dependencies are properly specified in the project's requirements.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify Graphiti dependency and API usage

# Check if graphiti-core is in requirements
fd -t f "requirements.*\.txt" -x grep -l "graphiti-core"

# Look for other Graphiti method usages to ensure consistency
rg "from graphiti_core import|from graphiti_core\." -A 3

Length of output: 609


Script:

#!/bin/bash
# Let's check the actual usage of Graphiti and EpisodeType in the file
ast-grep --pattern 'class $_ {
  $$$
  Graphiti.$_($$$)
  $$$
}'

# Also check direct Graphiti instantiation
ast-grep --pattern 'Graphiti($$$)'

# And check EpisodeType usage
ast-grep --pattern 'EpisodeType.$_'

# Let's also see the full content of the build_graph function to understand the context
rg -A 20 "async def build_graph_with_temporal_awareness" cognee/tasks/temporal_awareness/build_graph_with_temporal_awareness.py

Length of output: 1097

@Vasilije1990 Vasilije1990 left a comment

Please use variables from our system to pass the Neo4j credentials instead of hardcoding them.

@Vasilije1990 Vasilije1990 merged commit c4ad473 into main Dec 4, 2024
25 of 26 checks passed
@Vasilije1990 Vasilije1990 deleted the feat/COG-711-temporal-awareness-task branch December 4, 2024 19:50