
Shiny new LLMOps #110

Closed
wants to merge 10 commits into from

Conversation


@Vasilije1990 Vasilije1990 commented Jun 15, 2024

Summary by CodeRabbit

  • New Features

    • Introduced detailed documentation on the development of Cognee and its applications in AI.
    • Added search methods for evaluating datasets asynchronously.
  • Bug Fixes

    • Fixed active link for the blog post on knowledge graphs.
  • Dependencies

    • Updated dspy-ai to version 2.4.10.
    • Added deepeval and evals dependencies.
  • Enhancements

    • Improved configuration options for graph topology tasks.
    • Enhanced dataset management and evaluation processes.


coderabbitai bot commented Jun 15, 2024

Walkthrough

This release introduces significant enhancements across multiple components of the Cognee system, focusing on improvements in graph topology processing, dataset management, and the configuration setup. Key updates include the addition of new functions for asynchronous operations, adjustments to dependencies, and a new blog post detailing Cognee's development and its practical applications.

Changes

File Path / Summary

  • cognee/api/v1/cognify/cognify.py: Added imports, print statements, variable assignments, and function calls related to graph topology and configuration adjustments.
  • cognee/api/v1/config/config.py: Added static methods in the Config class to set graph topology task configurations.
  • cognee/infrastructure/databases/graph/config.py: Changed graph_topology_task to True, renamed graph_topology to graph_topology_model, and updated to_dict.
  • cognee/infrastructure/llm/prompts/extract_topology.txt: Updated instructions for extracting topology information from text.
  • cognee/modules/topology/infer_data_topology.py: Updated assignment of graph_topology to graph_topology_model.
  • evals/simple_rag_vs_cognee_eval.py: Major restructuring; added asynchronous tasks, new functions for dataset loading and evaluation, and logging.
  • pyproject.toml: Updated dependencies (deepeval, dspy-ai); added evals under notebook.
  • docs/blog/posts/where-do-knowledge-graphs-fit-and-where-they-dont.md: New blog post detailing Cognee's development and practical applications with Dynamo.fyi.
  • docs/blog/index.md: Updated the link for the blog post "Where do knowledge graphs fit, and where do they not? A case study with dynamo.fyi".

Poem

In code's grand waltz, changes align,
Graphs and tasks redefined.
With async strides and datasets vast,
Configs tuned, improvements last.
Dependencies now freshly sown,
Cognee's strengths have grown.
📚✨


Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai generate interesting stats about this repository and render them as a table.
    • @coderabbitai show all the console.log statements in this repository.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (invoked as PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai resolve to resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Additionally, you can add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 6

Outside diff range and nitpick comments (1)
cognee/api/v1/cognify/cognify.py (1)

Line range hint 241-241: Avoid using a bare except statement. It's a best practice to specify the exceptions you're catching to avoid masking unexpected errors.

- except:
+ except ExpectedExceptionType:
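As a sketch of what the narrowed handler could look like (the helper name and exception types below are illustrative, not the actual cognify code):

```python
import logging

logger = logging.getLogger(__name__)

def read_topology_file(path: str) -> str:
    """Illustrative helper: catch only the errors we expect instead of a bare except."""
    try:
        with open(path, encoding="utf-8") as handle:
            return handle.read()
    except (FileNotFoundError, PermissionError) as error:
        # Expected failure modes are logged and handled; anything
        # unexpected (e.g. KeyboardInterrupt, SystemExit) still propagates.
        logger.warning("Could not read %s: %s", path, error)
        return ""

print(read_topology_file("no_such_topology.txt"))  # → prints an empty string
```

The key difference from a bare `except:` is that control-flow exceptions and genuinely unexpected errors are no longer silently swallowed.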
Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 79d299e and 16139f7.

Files selected for processing (5)
  • cognee/api/v1/cognify/cognify.py (3 hunks)
  • cognee/api/v1/config/config.py (1 hunks)
  • cognee/infrastructure/databases/graph/config.py (2 hunks)
  • cognee/infrastructure/llm/prompts/extract_topology.txt (1 hunks)
  • cognee/modules/topology/infer_data_topology.py (1 hunks)
Additional context used
Ruff
cognee/api/v1/cognify/cognify.py

151-151: Local variable graph_topology is assigned to but never used (F841)

Remove assignment to unused variable graph_topology


241-241: Do not use bare except (E722)


306-306: Local variable dataset_name is assigned to but never used (F841)

Remove assignment to unused variable dataset_name


308-308: Local variable data_dir is assigned to but never used (F841)

Remove assignment to unused variable data_dir


328-328: Local variable graph is assigned to but never used (F841)

Remove assignment to unused variable graph

Additional comments not posted (4)
cognee/infrastructure/llm/prompts/extract_topology.txt (1)

4-6: The updated instructions clarify the process and ensure the correct usage of document IDs for node identification, aligning well with the intended graph topology extraction.

cognee/modules/topology/infer_data_topology.py (1)

10-10: Correctly updated the variable to match the new attribute name in the graph configuration. This ensures consistency across the codebase.

cognee/infrastructure/databases/graph/config.py (1)

22-23: The updates to the GraphConfig class are well-implemented, ensuring that the new graph_topology_model attribute and the graph_topology_task setting are correctly integrated and reflected in the configuration dictionary.

Also applies to: 42-42

cognee/api/v1/config/config.py (1)

135-149: The addition of new static methods for managing graph topology settings is well-executed. These methods enhance the configurability of the system and ensure greater flexibility in handling different graph topology scenarios.
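The shape of such static setters might look like the following (a hypothetical sketch based on the PR summary; the real Config class in cognee/api/v1/config/config.py will differ):

```python
class Config:
    """Hypothetical stand-in for cognee's Config class; method names
    follow the PR summary, not the actual source."""

    # Class-level settings shared by every caller of the static setters.
    graph_topology_task: bool = False
    infer_graph_topology: bool = False

    @staticmethod
    def set_graph_topology_task(enabled: bool) -> None:
        # A static method mutates the class attribute, so all later
        # reads of Config.graph_topology_task observe the new value.
        Config.graph_topology_task = enabled

    @staticmethod
    def set_infer_graph_topology(enabled: bool) -> None:
        Config.infer_graph_topology = enabled

Config.set_graph_topology_task(True)
print(Config.graph_topology_task)  # → True
```

One thing worth noting with this pattern: a line like `config.set_infer_graph_topology=True` (which appears later in this diff) rebinds the attribute to True instead of calling the method; `config.set_infer_graph_topology(True)` is presumably what was intended.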

Comment on lines +153 to +155
print(graph_config.infer_graph_topology)
print(graph_config.graph_topology_task)

@coderabbitai coderabbitai bot Jun 15, 2024

Multiple print statements for debugging purposes can clutter the production code. Consider using a more robust logging approach that can be easily toggled or configured.

- print(graph_config.infer_graph_topology)
- print(graph_config.graph_topology_task)
- print('infered topology added')
- print('provided topology added')
+ logger.debug(f"Infer graph topology: {graph_config.infer_graph_topology}")
+ logger.debug(f"Graph topology task: {graph_config.graph_topology_task}")
+ logger.debug("Inferred topology added")
+ logger.debug("Provided topology added")

Also applies to: 162-163, 168-169
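A minimal sketch of the logger setup that suggestion assumes (the logger name and the two flags are placeholders standing in for graph_config attributes, not actual cognee code):

```python
import logging

# Placeholder values standing in for graph_config attributes.
infer_graph_topology = True
graph_topology_task = False

logging.basicConfig(format="%(name)s %(levelname)s %(message)s", level=logging.DEBUG)
logger = logging.getLogger("cognee.cognify")

logger.debug("Infer graph topology: %s", infer_graph_topology)
logger.debug("Graph topology task: %s", graph_topology_task)

# Unlike print(), the debug output can be silenced in production with one line:
logging.getLogger("cognee").setLevel(logging.WARNING)
```

Because child loggers inherit the parent's level, raising "cognee" to WARNING disables the debug lines everywhere without touching the call sites.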

Contributor

@Vasilije1990 This is a good suggestion

Contributor

Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!

@@ -294,14 +303,30 @@

from cognee.api.v1.add import add

dataset_name = "explanations"
Contributor

Several local variables (dataset_name, data_dir, graph) are assigned but never used. Consider removing these if they are unnecessary.

- dataset_name = "explanations"
- data_dir = os.path.abspath("../../../.data")
- graph = await cognify("explanations")

Also applies to: 308-308, 328-328

Tools
Ruff

306-306: Local variable dataset_name is assigned to but never used (F841)

Remove assignment to unused variable dataset_name

await add([text], "example_dataset")

from cognee.api.v1.config.config import config
config.set_chunk_engine(ChunkEngine.LANGCHAIN_ENGINE )
config.set_chunk_strategy(ChunkStrategy.LANGCHAIN_CHARACTER)
config.embedding_engine = LiteLLMEmbeddingEngine()
config.set_chunk_engine(ChunkEngine.LANGCHAIN_ENGINE)
@coderabbitai coderabbitai bot Jun 15, 2024

Duplicate assignment to config.set_chunk_engine. If this is intentional, consider adding a comment explaining why; otherwise, it seems like an oversight.

- config.set_chunk_engine(ChunkEngine.LANGCHAIN_ENGINE)
Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
config.set_chunk_engine(ChunkEngine.LANGCHAIN_ENGINE)

Contributor

@Vasilije1990 This one

Contributor

Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!

config.set_infer_graph_topology=True

from cognee.api.v1.datasets.datasets import datasets
print(datasets.list_datasets())
@coderabbitai coderabbitai bot Jun 15, 2024

Printing the list of datasets might not be necessary in the production environment. Consider removing or replacing it with a debug log.

- print(datasets.list_datasets())
+ logger.debug("Available datasets: " + str(datasets.list_datasets()))
Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
print(datasets.list_datasets())
logger.debug("Available datasets: " + str(datasets.list_datasets()))


Contributor

Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!

@@ -149,15 +150,23 @@
graph_config = get_graph_config()
graph_topology = graph_config.graph_model
Contributor

Local variable graph_topology is assigned but never used. Consider removing it if it's not necessary.

- graph_topology = graph_config.graph_model
Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
graph_topology = graph_config.graph_model
Tools
Ruff

151-151: Local variable graph_topology is assigned to but never used (F841)

Remove assignment to unused variable graph_topology

@@ -1,4 +1,5 @@
import asyncio
import os
Contributor

Consider consolidating imports from the same module on a single line for better readability.

- import os
+ import os, asyncio

Committable suggestion was skipped due to low confidence.
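For reference, PEP 8 actually recommends separate lines for plain `import` statements rather than comma-chaining them; the conventional layout looks like this (an illustrative header, not the eval script's):

```python
# PEP 8 import order: standard library first, then third-party, then
# local application imports, each group separated by a blank line and
# one module per plain `import` statement.
import asyncio
import logging
import os

from typing import List  # "from" imports may name several objects on one line

def modules_loaded() -> List[str]:
    # Tiny check that the grouped imports above are all usable.
    return sorted(module.__name__ for module in (asyncio, logging, os))

print(modules_loaded())  # → ['asyncio', 'logging', 'os']
```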

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

Outside diff range and nitpick comments (1)
evals/simple_rag_vs_cognee_eval.py (1)

Line range hint 72-87: The function cognify_search_base_rag and cognify_search_graph are using dynamic imports which can be inefficient and harder to track. Consider moving these imports to the top of the file.

+ from cognee.api.v1.search.search import search
async def cognify_search_base_rag(content:str):
    base_config = get_base_config()
    cognee_directory_path = os.path.abspath(".cognee_system")
    base_config.system_root_directory = cognee_directory_path
    vector_engine = get_vector_engine()
    return_ = await vector_engine.search(collection_name="basic_rag", query_text=content, limit=10)
    print("results", return_)
    return return_

async def cognify_search_graph(content:str, search_type= 'SIMILARITY'):
-    from cognee.api.v1.search.search import search
    params = {'query': content}
    results = await search(search_type, params)
    print("results", results)
    return results
Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 16139f7 and 7ff87bb.

Files ignored due to path filters (1)
  • poetry.lock is excluded by !**/*.lock
Files selected for processing (2)
  • evals/simple_rag_vs_cognee_eval.py (5 hunks)
  • pyproject.toml (2 hunks)
Files skipped from review due to trivial changes (1)
  • pyproject.toml
Additional context used
Ruff
evals/simple_rag_vs_cognee_eval.py

12-12: Module level import not at top of file (E402)


13-13: Module level import not at top of file (E402)


14-14: Module level import not at top of file (E402)


17-17: Module level import not at top of file (E402)

Additional comments not posted (2)
evals/simple_rag_vs_cognee_eval.py (2)

127-131: The main function is correctly using async/await patterns, but ensure that all asynchronous calls are properly awaited to avoid runtime issues.
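A compact illustration of the point (the names here are made up; this is not the eval script itself):

```python
import asyncio

async def evaluate(dataset: str) -> str:
    # Stand-in for one asynchronous evaluation step.
    await asyncio.sleep(0)
    return f"evaluated {dataset}"

async def main() -> list:
    # Forgetting `await` here would leave unfinished coroutines and a
    # "coroutine was never awaited" RuntimeWarning; gather awaits both
    # concurrently and returns their results in order.
    return await asyncio.gather(evaluate("naive_rag"), evaluate("graph"))

print(asyncio.run(main()))  # → ['evaluated naive_rag', 'evaluated graph']
```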


4-10: Ensure that all imports are placed at the top of the file to comply with PEP 8 standards.

- import os
- from cognee.base_config import get_base_config
- from cognee.infrastructure.databases.vector import get_vector_engine
- import logging
+ # Imports should be moved to the top of the file

Likely invalid or redundant comment.

Comment on lines 58 to 69
async def run_cognify_task(dataset:str= 'test_datasets', dataset_name:str = 'initial_test' ):
    from cognee.api.v1.add import add
    from cognee.api.v1.prune import prune
    from cognee.api.v1.cognify.cognify import cognify

    await prune.prune_system()

-   await add("data://test_datasets", "initial_test")
-   graph = await cognify("initial_test")
+   await add(f"data://{dataset}", dataset_name)
+   graph = await cognify(dataset_name)

    return graph
Contributor

The function run_cognify_task seems to be well-defined for asynchronous operation. However, consider adding a docstring to improve maintainability and readability.

  async def run_cognify_task(dataset:str= 'test_datasets', dataset_name:str = 'initial_test' ):
+     """
+     Asynchronously runs the cognify task.
+
+     Parameters:
+         dataset (str): The dataset to use, defaults to 'test_datasets'.
+         dataset_name (str): The name of the dataset, defaults to 'initial_test'.
+
+     Returns:
+         graph: The result of the cognify operation.
+     """
      from cognee.api.v1.add import add
      from cognee.api.v1.prune import prune
      from cognee.api.v1.cognify.cognify import cognify

      await prune.prune_system()
      await add(f"data://{dataset}", dataset_name)
      graph = await cognify(dataset_name)
      return graph
Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

- async def run_cognify_task(dataset:str= 'test_datasets', dataset_name:str = 'initial_test' ):
-     from cognee.api.v1.add import add
-     from cognee.api.v1.prune import prune
-     from cognee.api.v1.cognify.cognify import cognify
-     await prune.prune_system()
-     await add("data://test_datasets", "initial_test")
-     graph = await cognify("initial_test")
-     await add(f"data://{dataset}", dataset_name)
-     graph = await cognify(dataset_name)
-     return graph
+ async def run_cognify_task(dataset: str = 'test_datasets', dataset_name: str = 'initial_test'):
+     """
+     Asynchronously runs the cognify task.
+
+     Parameters:
+         dataset (str): The dataset to use, defaults to 'test_datasets'.
+         dataset_name (str): The name of the dataset, defaults to 'initial_test'.
+
+     Returns:
+         graph: The result of the cognify operation.
+     """
+     from cognee.api.v1.add import add
+     from cognee.api.v1.prune import prune
+     from cognee.api.v1.cognify.cognify import cognify
+
+     await prune.prune_system()
+     await add(f"data://{dataset}", dataset_name)
+     graph = await cognify(dataset_name)
+     return graph

Comment on lines 94 to 115
async def convert_goldens_to_test_cases(test_cases_raw: List[LLMTestCase], context_type:str =None) -> List[LLMTestCase]:

    test_cases = []
    for case in test_cases_raw:
        if context_type == "naive_rag":
            context = await cognify_search_base_rag(case.input)
            print('naive rag', context)
        elif context_type == "graph":
            context = await cognify_search_graph(case.input)
            print('graph', context)
        else:
            context = case.context

        test_case = LLMTestCase(
            input=case.input,
            # Generate actual output using the 'input' and 'additional_metadata'
-           actual_output= str(get_answer(case.input, case.context).model_dump()['response']),
+           actual_output= str(get_answer(case.input, context).model_dump()['response']),
            expected_output=case.expected_output,
            context=case.context,
            retrieval_context=["retrieval_context"],
        )
        test_cases.append(test_case)
Contributor

Tip

Codebase Verification

The previous script did not find any try or except blocks within the convert_goldens_to_test_cases function, indicating a lack of error handling around asynchronous calls.

  • The convert_goldens_to_test_cases function lacks proper error handling for its asynchronous calls.
Analysis chain

The function convert_goldens_to_test_cases is handling different context types and converting them to test cases. It is crucial to ensure that the error handling is robust, especially since it relies on asynchronous calls which might fail.

Scripts executed

The following scripts were executed for the analysis:


Script:

#!/bin/bash
# Description: Verify the robustness of error handling in asynchronous calls within `convert_goldens_to_test_cases`.

# Test: Check for proper error handling in async functions. Expect: Proper try-except blocks around async calls.
rg --type python $'try { $$$ } catch { $$$ }' evals/simple_rag_vs_cognee_eval.py

Length of output: 195



Script:

#!/bin/bash
# Description: Verify the robustness of error handling in asynchronous calls within `convert_goldens_to_test_cases`.

# Test: Check for proper error handling in async functions. Expect: Proper try-except blocks around async calls.
rg 'try:' -A 3 evals/simple_rag_vs_cognee_eval.py

Length of output: 133



Script:

#!/bin/bash
# Description: Verify the robustness of error handling in asynchronous calls within `convert_goldens_to_test_cases`.

# Test: Check for proper error handling in async functions within the `convert_goldens_to_test_cases` function. Expect: Proper try-except blocks around async calls.
ast-grep --lang python --pattern $'def convert_goldens_to_test_cases($_, $_):\n    $$$' evals/simple_rag_vs_cognee_eval.py | rg 'try:|except:'

Length of output: 142

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 7ff87bb and f3a1d2e.

Files selected for processing (1)
  • evals/simple_rag_vs_cognee_eval.py (1 hunks)
Additional context used
Ruff
evals/simple_rag_vs_cognee_eval.py

174-174: Module level import not at top of file

(E402)

Additional comments not posted (9)
evals/simple_rag_vs_cognee_eval.py (9)

25-26: LGTM!


29-43: LGTM!


46-60: LGTM!


63-75: LGTM!


78-92: LGTM!


95-112: LGTM!


149-172: LGTM!


176-187: LGTM!


189-207: LGTM!

Comment on lines +1 to +20
import asyncio
import logging
import os
from typing import List, Type

from pydantic import BaseModel
from dotenv import load_dotenv

from typing import List, Type
from deepeval.dataset import EvaluationDataset
from deepeval.test_case import LLMTestCase
import dotenv
dotenv.load_dotenv()

from deepeval.metrics import HallucinationMetric, BaseMetric
from cognee.infrastructure.llm.get_llm_client import get_llm_client
from cognee.base_config import get_base_config
from cognee.infrastructure.databases.vector import get_vector_engine
from cognee.api.v1.add import add
from cognee.api.v1.prune import prune
from cognee.api.v1.cognify.cognify import cognify
from cognee.api.v1.search.search import search

dataset = EvaluationDataset()
dataset.add_test_cases_from_json_file(
    # file_path is the absolute path to your .json file
    file_path="./synthetic_data/20240519_185842.json",
    input_key_name="input",
    actual_output_key_name="actual_output",
    expected_output_key_name="expected_output",
    context_key_name="context"
)

print(dataset)
# from deepeval.synthesizer import Synthesizer
#
# synthesizer = Synthesizer(model="gpt-3.5-turbo")
#
# dataset = EvaluationDataset()
# dataset.generate_goldens_from_docs(
#     synthesizer=synthesizer,
#     document_paths=['natural_language_processing.txt', 'soldiers_home.pdf', 'trump.txt'],
#     max_goldens_per_document=10,
#     num_evolutions=5,
#     enable_breadth_evolve=True,
# )


print(dataset.goldens)
print(dataset)
load_dotenv()
Contributor

Fix the import order.

The import statement at line 174 should be moved to the top of the file to comply with PEP 8 guidelines.

- from typing import Type

Also applies to: 174-174

Comment on lines +115 to +146
-def convert_goldens_to_test_cases(test_cases_raw: List[LLMTestCase]) -> List[LLMTestCase]:
+async def convert_goldens_to_test_cases(test_cases_raw: List[LLMTestCase], context_type: str = None) -> List[LLMTestCase]:
+    """
+    Converts raw test cases into LLMTestCase objects with actual outputs generated.
+
+    Args:
+        test_cases_raw (List[LLMTestCase]): List of raw test cases.
+        context_type (str, optional): Type of context to use. Defaults to None.
+
+    Returns:
+        List[LLMTestCase]: List of LLMTestCase objects with actual outputs populated.
+    """
     test_cases = []
     for case in test_cases_raw:
+        if context_type == "naive_rag":
+            context = await cognify_search_base_rag(case.input)
+        elif context_type == "graph":
+            context = await cognify_search_graph(case.input)
+        else:
+            context = case.context
+
+        actual_output = str((await get_answer(case.input, context)).response)
+
         test_case = LLMTestCase(
             input=case.input,
-            # Generate actual output using the 'input' and 'additional_metadata'
-            actual_output= str(get_answer(case.input, case.context).model_dump()['response']),
+            actual_output=actual_output,
             expected_output=case.expected_output,
             context=case.context,
             retrieval_context=["retrieval_context"],
         )
         test_cases.append(test_case)

Contributor

Add error handling for asynchronous calls.

The function lacks proper error handling for its asynchronous calls. Ensure that all asynchronous calls are wrapped in try-except blocks.

+    try:
+        if context_type == "naive_rag":
+            context = await cognify_search_base_rag(case.input)
+        elif context_type == "graph":
+            context = await cognify_search_graph(case.input)
+        else:
+            context = case.context
+
+        actual_output = str((await get_answer(case.input, context)).response)
+    except Exception as error:
+        logger.error("Error converting test case: %s", error, exc_info=True)
+        raise error
Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

    test_cases = []
    for case in test_cases_raw:
        try:
            if context_type == "naive_rag":
                context = await cognify_search_base_rag(case.input)
            elif context_type == "graph":
                context = await cognify_search_graph(case.input)
            else:
                context = case.context

            actual_output = str((await get_answer(case.input, context)).response)
        except Exception as error:
            logger.error("Error converting test case: %s", error, exc_info=True)
            raise error

        test_case = LLMTestCase(
            input=case.input,
            actual_output=actual_output,
            expected_output=case.expected_output,
            context=case.context,
            retrieval_context=["retrieval_context"],
        )
        test_cases.append(test_case)

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between f3a1d2e and 2d09379.

Files ignored due to path filters (1)
  • poetry.lock is excluded by !**/*.lock
Files selected for processing (1)
  • pyproject.toml (3 hunks)
Additional comments not posted (2)
pyproject.toml (2)

76-76: Verify compatibility of the new deepeval dependency.

Ensure that adding deepeval version ^0.21.55 does not introduce any compatibility issues with the rest of the codebase.


59-59: Verify compatibility of the updated dspy-ai dependency.

Ensure that updating dspy-ai to version 2.4.10 does not introduce any compatibility issues with the rest of the codebase.

Verification successful

The update to dspy-ai version 2.4.10 is compatible with the existing codebase.

There are no breaking changes or significant updates in the new version that would affect the current usage patterns.

  • The changelog for version 2.4.10 does not indicate any breaking changes.
  • The search results show extensive usage of dspy in the codebase, and no compatibility issues are expected.
Scripts executed

The following scripts were executed for the analysis:


Script:

#!/bin/bash
# Description: Verify compatibility of the updated `dspy-ai` dependency.

# Test: Run the test suite. Expect: All tests to pass.
poetry install
poetry run pytest

Length of output: 184



Script:

#!/bin/bash
# Search for all import statements related to `dspy-ai` in the codebase.
rg --fixed-strings "import dspy-ai" --glob "*.py"

# Search for all usage instances of `dspy-ai` in the codebase.
rg --fixed-strings "dspy-ai." --glob "*.py"

Length of output: 95



Script:

#!/bin/bash
# Search for all import statements related to `dspy_ai` in the codebase (alternative import style).
rg --fixed-strings "import dspy_ai" --glob "*.py"

# Search for all usage instances of `dspy_ai` in the codebase (alternative usage pattern).
rg --fixed-strings "dspy_ai." --glob "*.py"

# Search for all import statements related to `dspy` in the codebase (potential module import).
rg --fixed-strings "import dspy" --glob "*.py"

# Search for all usage instances of `dspy` in the codebase (potential module usage).
rg --fixed-strings "dspy." --glob "*.py"

Length of output: 4542



Script:

#!/bin/bash
# Fetch the changelog or release notes for `dspy-ai` version `2.4.10`.
curl -s https://pypi.org/pypi/dspy-ai/2.4.10/json | jq -r '.info.description'

Length of output: 36729

@@ -149,15 +150,23 @@ async def process_batch(files_batch):
graph_config = get_graph_config()
graph_topology = graph_config.graph_model

print(graph_config.infer_graph_topology)
Let's delete these print lines.
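Rather than deleting the diagnostics outright, another option is to demote them to debug-level logging so they can be re-enabled without code changes. A sketch, where the attribute names mirror the diff above but the helper and logger setup are assumptions, not cognee's actual logging configuration:

```python
import logging

logger = logging.getLogger(__name__)

def log_topology_settings(graph_config):
    # Debug-level logging is silenced in production by default, unlike print().
    logger.debug("infer_graph_topology=%s",
                 getattr(graph_config, "infer_graph_topology", None))
    logger.debug("graph_topology_task=%s",
                 getattr(graph_config, "graph_topology_task", None))
```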

Comment on lines +153 to +155
print(graph_config.infer_graph_topology)
print(graph_config.graph_topology_task)

@Vasilije1990 This is a good suggestion

await add([text], "example_dataset")

from cognee.api.v1.config.config import config
config.set_chunk_engine(ChunkEngine.LANGCHAIN_ENGINE )
config.set_chunk_strategy(ChunkStrategy.LANGCHAIN_CHARACTER)
config.embedding_engine = LiteLLMEmbeddingEngine()
config.set_chunk_engine(ChunkEngine.LANGCHAIN_ENGINE)
@Vasilije1990 This one

config.set_infer_graph_topology=True

from cognee.api.v1.datasets.datasets import datasets
print(datasets.list_datasets())
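Note that `config.set_infer_graph_topology=True` in the excerpt above rebinds the setter to the value `True` instead of calling it, so the underlying flag never changes. A toy stand-in class (hypothetical, not cognee's actual `Config`) makes the difference visible:

```python
class GraphConfig:
    """Toy stand-in for a config object with a setter method."""
    def __init__(self):
        self.infer_graph_topology = False

    def set_infer_graph_topology(self, value: bool):
        self.infer_graph_topology = value

config = GraphConfig()

# Bug: this shadows the bound method with the value True
# and never touches infer_graph_topology.
config.set_infer_graph_topology = True
assert config.infer_graph_topology is False

# Intended usage: call the setter.
config = GraphConfig()
config.set_infer_graph_topology(True)
assert config.infer_graph_topology is True
```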
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

Outside diff range and nitpick comments (6)
docs/blog/posts/where-do-knowledge-graphs-fit-and-where-they-dont.md (6)

15-15: Remove trailing spaces.

There is an unnecessary trailing space at the end of this line.

- 
+

27-27: Remove multiple consecutive blank lines.

There are two consecutive blank lines here, which is unnecessary.

- 
- 
+ 
Tools
Markdownlint

27-27: Expected: 1; Actual: 2
Multiple consecutive blank lines

(MD012, no-multiple-blanks)


50-50: Remove trailing spaces.

There is an unnecessary trailing space at the end of this line.

- 
+
Tools
Markdownlint

50-50: Expected: 0 or 2; Actual: 1
Trailing spaces

(MD009, no-trailing-spaces)


68-68: Remove multiple consecutive blank lines.

There are two consecutive blank lines here, which is unnecessary.

- 
- 
+ 
Tools
Markdownlint

68-68: Expected: 1; Actual: 2
Multiple consecutive blank lines

(MD012, no-multiple-blanks)


78-78: Remove trailing spaces.

There is an unnecessary trailing space at the end of this line.

- 
+
Tools
Markdownlint

78-78: Expected: 0 or 2; Actual: 1
Trailing spaces

(MD009, no-trailing-spaces)


82-82: Remove multiple consecutive blank lines.

There are two consecutive blank lines here, which is unnecessary.

- 
- 
+ 
Tools
Markdownlint

82-82: Expected: 1; Actual: 2
Multiple consecutive blank lines

(MD012, no-multiple-blanks)
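The MD009/MD012 fixes above are mechanical enough to script. A small sketch that strips trailing whitespace and collapses runs of blank lines — deliberately naive, since it would also remove intentional two-trailing-space hard breaks and blank lines inside fenced code blocks, so a real tool such as `markdownlint --fix` is preferable:

```python
def fix_md009_md012(text: str) -> str:
    """Strip trailing spaces (MD009) and collapse blank-line runs (MD012)."""
    out = []
    for line in (l.rstrip() for l in text.split("\n")):
        if line == "" and out and out[-1] == "":
            continue  # keep at most one consecutive blank line
        out.append(line)
    return "\n".join(out)

print(repr(fix_md009_md012("a  \n\n\nb \n")))  # prints 'a\n\nb\n'
```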

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 2d09379 and f322a6c.

Files ignored due to path filters (4)
  • docs/blog/posts/where-do-knowledge-graphs-fit/cognee_results.png is excluded by !**/*.png
  • docs/blog/posts/where-do-knowledge-graphs-fit/dynamo_fyi_demo.jpg is excluded by !**/*.jpg
  • docs/blog/posts/where-do-knowledge-graphs-fit/initial_results.png is excluded by !**/*.png
  • docs/blog/posts/where-do-knowledge-graphs-fit/timeline.jpg is excluded by !**/*.jpg
Files selected for processing (1)
  • docs/blog/posts/where-do-knowledge-graphs-fit-and-where-they-dont.md (1 hunks)
Additional context used
LanguageTool
docs/blog/posts/where-do-knowledge-graphs-fit-and-where-they-dont.md

[uncategorized] ~52-~52: Possible missing comma found.
Context: ...ersonalization layer above the Firebase store where all customer data is located. How...

(AI_HYDRA_LEO_MISSING_COMMA)

Markdownlint
docs/blog/posts/where-do-knowledge-graphs-fit-and-where-they-dont.md

36-36: Expected: 0 or 2; Actual: 1
Trailing spaces

(MD009, no-trailing-spaces)


50-50: Expected: 0 or 2; Actual: 1
Trailing spaces

(MD009, no-trailing-spaces)


78-78: Expected: 0 or 2; Actual: 1
Trailing spaces

(MD009, no-trailing-spaces)


27-27: Expected: 1; Actual: 2
Multiple consecutive blank lines

(MD012, no-multiple-blanks)


68-68: Expected: 1; Actual: 2
Multiple consecutive blank lines

(MD012, no-multiple-blanks)


82-82: Expected: 1; Actual: 2
Multiple consecutive blank lines

(MD012, no-multiple-blanks)

Additional comments not posted (2)
docs/blog/posts/where-do-knowledge-graphs-fit-and-where-they-dont.md (2)

1-12: Front matter looks good.

The metadata is correctly formatted.


42-47: Section looks good.

The description of Dynamo's use case and the recommender system is clear and informative.


Upon arrival at Dynamo, we found a rudimentary agent support system connected to OpenAI with one templated query. Instead of building a GraphRAG, we needed to replicate the existing functionality and create an infrastructure layer for future evolution towards GraphRAG.

GraphRAG would provide a semantic and personalization layer above the Firebase store where all customer data is located. However, this would only be useful if we had a system to release our agent support assistant to production.
Add missing comma.

There is a possible missing comma in this sentence.

- layer above the Firebase store where all customer data is located
+ layer above the Firebase store, where all customer data is located
Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
GraphRAG would provide a semantic and personalization layer above the Firebase store where all customer data is located. However, this would only be useful if we had a system to release our agent support assistant to production.
GraphRAG would provide a semantic and personalization layer above the Firebase store, where all customer data is located. However, this would only be useful if we had a system to release our agent support assistant to production.
Tools
LanguageTool

[uncategorized] ~52-~52: Possible missing comma found.
Context: ...ersonalization layer above the Firebase store where all customer data is located. How...

(AI_HYDRA_LEO_MISSING_COMMA)


gitguardian bot commented Jul 3, 2024

⚠️ GitGuardian has uncovered 2 secrets following the scan of your pull request.

Please consider investigating the findings and remediating the incidents. Failure to do so may lead to compromising the associated services or software components.

🔎 Detected hardcoded secrets in your pull request
GitGuardian id GitGuardian status Secret Commit Filename
7122536 Triggered Generic High Entropy Secret d8c6e6f docker-compose.yml View secret
7122536 Triggered Generic High Entropy Secret d8c6e6f docker-compose.yml View secret
🛠 Guidelines to remediate hardcoded secrets
  1. Understand the implications of revoking this secret by investigating where it is used in your code.
  2. Replace and store your secrets safely. Learn here the best practices.
  3. Revoke and rotate these secrets.
  4. If possible, rewrite git history. Rewriting git history is not a trivial act. You might completely break other contributing developers' workflow and you risk accidentally deleting legitimate data.

To avoid such incidents in the future, consider


🦉 GitGuardian detects secrets in your source code to help developers and security teams secure the modern development process. You are seeing this because you or someone else with access to this repository has authorized GitGuardian to scan your pull request.
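GitGuardian's "high entropy" classification flags long tokens whose characters are close to uniformly distributed. A toy illustration of the idea — not GitGuardian's actual detector; the threshold, regex, and sample key below are made up:

```python
import math
import re

def shannon_entropy(s: str) -> float:
    """Bits of entropy per character of s."""
    probs = [s.count(c) / len(s) for c in set(s)]
    return -sum(p * math.log2(p) for p in probs)

def flag_high_entropy_values(compose_text: str, threshold: float = 4.0):
    """Flag keys whose values look like hardcoded secrets (long, high-entropy)."""
    hits = []
    for key, value in re.findall(r'(\w+)[:=]\s*"?([A-Za-z0-9+/=_\-]{20,})"?',
                                 compose_text):
        if shannon_entropy(value) > threshold:
            hits.append(key)
    return hits

sample = '''
services:
  api:
    environment:
      API_KEY: "q9X2mZ7vLk4RtY8wPbN3sD6fGhJ1aQeU"
      LOG_LEVEL: "debug"
'''
print(flag_high_entropy_values(sample))  # ['API_KEY']
```

The remediation itself is to move such values out of `docker-compose.yml` into an untracked `.env` file and reference them as `${API_KEY}` in the compose file.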

@coderabbitai coderabbitai bot left a comment
Actionable comments posted: 0

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between f322a6c and d8c6e6f.

Files ignored due to path filters (1)
  • poetry.lock is excluded by !**/*.lock
Files selected for processing (1)
  • pyproject.toml (2 hunks)
Files skipped from review as they are similar to previous changes (1)
  • pyproject.toml

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between d8c6e6f and 688d757.

Files selected for processing (1)
  • docs/blog/index.md (1 hunks)
Additional context used
Markdownlint
docs/blog/index.md

8-8: Expected: 1; Actual: 2
Multiple consecutive blank lines

(MD012, no-multiple-blanks)

@@ -8,7 +8,7 @@ The goal of the blog is to discuss broader topics around the cognee project, inc

1. [LLMOps stack + Graphs](posts/Shiny_new_LLMOps.md)

[//]: # (2. [Where do knowledge graphs fit, and where do they not? A case study with dynamo.fyi]&#40;posts/where-do-knowledge-graphs-fit.md&#41;)
2. [Where do knowledge graphs fit, and where do they not? A case study with dynamo.fyi](posts/wwhere-do-knowledge-graphs-fit-and-where-they-dont.md)
Verify the new blog post link.

The link to the new blog post "Where do knowledge graphs fit, and where do they not? A case study with dynamo.fyi" is correctly formatted. However, there is a typographical error in the filename wwhere-do-knowledge-graphs-fit-and-where-they-dont.md.

- 2. [Where do knowledge graphs fit, and where do they not? A case study with dynamo.fyi](posts/wwhere-do-knowledge-graphs-fit-and-where-they-dont.md)
+ 2. [Where do knowledge graphs fit, and where do they not? A case study with dynamo.fyi](posts/where-do-knowledge-graphs-fit-and-where-they-dont.md)
Committable suggestion

Suggested change
2. [Where do knowledge graphs fit, and where do they not? A case study with dynamo.fyi](posts/wwhere-do-knowledge-graphs-fit-and-where-they-dont.md)
2. [Where do knowledge graphs fit, and where do they not? A case study with dynamo.fyi](posts/where-do-knowledge-graphs-fit-and-where-they-dont.md)

@@ -8,7 +8,7 @@

Remove consecutive blank lines.

There are multiple consecutive blank lines which are not necessary and should be removed to adhere to markdown best practices.

- 8-8: Expected: 1; Actual: 2
+ 8: Expected: 1; Actual: 1

Committable suggestion was skipped due to low confidence.

Tools
Markdownlint

8-8: Expected: 1; Actual: 2
Multiple consecutive blank lines

(MD012, no-multiple-blanks)

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 688d757 and 77e06e2.

Files ignored due to path filters (1)
  • poetry.lock is excluded by !**/*.lock
Files selected for processing (1)
  • docs/blog/index.md (1 hunks)
Files skipped from review as they are similar to previous changes (1)
  • docs/blog/index.md

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 77e06e2 and a8ed752.

Files selected for processing (1)
  • docs/blog/posts/where-do-knowledge-graphs-fit-and-where-they-dont.md (1 hunks)
Files skipped from review as they are similar to previous changes (1)
  • docs/blog/posts/where-do-knowledge-graphs-fit-and-where-they-dont.md
