
Add mcp to cognee #370

Merged
merged 3 commits into from
Dec 14, 2024
57 changes: 57 additions & 0 deletions cognee-mcp/README.md
@@ -0,0 +1,57 @@
# cognee MCP server

An MCP server project

Create a boilerplate server:

```bash
uvx create-mcp-server
```

1. The command will ask you to name your server, e.g. mcp_cognee.
2. Answer "Y" to connect with Claude.

Then run:

```bash
cd mcp_cognee
uv sync --dev --all-extras
```

Activate the venv with

```bash
source .venv/bin/activate
```

This should already add the new server to your Claude config, but if not, add these lines manually:

```
"mcpcognee": {
"command": "uv",
"args": [
"--directory",
"/Users/your_username/mcp/mcp_cognee",
"run",
"mcpcognee"
],
"env": {
"ENV": "local",
"TOKENIZERS_PARALLELISM": "false",
"LLM_API_KEY": "add_your_api_key_here",
"GRAPH_DATABASE_PROVIDER": "neo4j",
"GRAPH_DATABASE_URL": "bolt://localhost:7687",
"GRAPH_DATABASE_USERNAME": "add_username_here",
"GRAPH_DATABASE_PASSWORD": "add_pwd_here",
"VECTOR_DB_PROVIDER": "lancedb",
"DB_PROVIDER": "sqlite",
"DB_NAME": "postgres"
}
}
```
Comment on lines +28 to +52
⚠️ Potential issue

Enhance security and configuration documentation

  1. Add proper code block identifier (json)
  2. Use environment variable placeholders instead of literal "add_your_api_key_here"
  3. Document each environment variable's purpose
````diff
-```
+```json
 "mcpcognee": {
   "command": "uv",
   "args": [
     "--directory",
-    "/Users/your_username/mcp/mcp_cognee",
+    "${MCP_COGNEE_PATH}",
     "run",
     "mcpcognee"
   ],
   "env": {
     "ENV": "local",
     "TOKENIZERS_PARALLELISM": "false",
-    "LLM_API_KEY": "add_your_api_key_here",
+    "LLM_API_KEY": "${LLM_API_KEY}",
     "GRAPH_DATABASE_PROVIDER": "neo4j",
     "GRAPH_DATABASE_URL": "bolt://localhost:7687",
-    "GRAPH_DATABASE_USERNAME": "add_username_here",
-    "GRAPH_DATABASE_PASSWORD": "add_pwd_here",
+    "GRAPH_DATABASE_USERNAME": "${GRAPH_DB_USER}",
+    "GRAPH_DATABASE_PASSWORD": "${GRAPH_DB_PASSWORD}",
     "VECTOR_DB_PROVIDER": "lancedb",
     "DB_PROVIDER": "sqlite",
     "DB_NAME": "postgres"
   }
 }
````
Consider adding a section documenting these environment variables:
```markdown
### Environment Variables

- `LLM_API_KEY`: Your Language Model API key
- `GRAPH_DATABASE_PROVIDER`: Graph database type (default: neo4j)
- `GRAPH_DATABASE_URL`: Connection URL for the graph database
...
```

Then edit the pyproject.toml in your new folder so that it includes the packages from the cognee requirements. Use the pyproject.toml from your cognee library as a reference, but match the syntax of the automatically generated pyproject.toml so that it stays compatible with uv.

Define the cognify tool in server.py.
Restart your Claude desktop.
Comment on lines +54 to +57
🛠️ Refactor suggestion

Provide more detailed setup instructions

The final setup steps need more specificity:

  1. Add an example of the required dependencies in pyproject.toml
  2. Provide a basic template or example for defining the cognify tool in server.py

Consider adding:

### Example pyproject.toml dependencies

```toml
[project]
dependencies = [
    "cognee>=0.1.0",
    # Add other required dependencies
]
```

### Defining the Cognify Tool

In server.py, implement the tool following this pattern:

```python
from cognee.tools import BaseTool

class CognifyTool(BaseTool):
    name = "cognify"
    description = "Tool for cognitive operations"
    # Add implementation details
```
<!-- This is an auto-generated comment by CodeRabbit -->

14 changes: 14 additions & 0 deletions cognee-mcp/mcpcognee/__init__.py
@@ -0,0 +1,14 @@
import asyncio

from . import server


def main():
    """Main entry point for the package."""
    asyncio.run(server.main())

# Optionally expose other important items at package level
__all__ = ['main', 'server']

if __name__ == "__main__":
    main()
4 changes: 4 additions & 0 deletions cognee-mcp/mcpcognee/__main__.py
@@ -0,0 +1,4 @@
from mcpcognee import main

# main() is synchronous and itself calls asyncio.run, so invoke it directly;
# asyncio.run(main()) would fail because main() does not return a coroutine.
main()
126 changes: 126 additions & 0 deletions cognee-mcp/mcpcognee/server.py
@@ -0,0 +1,126 @@
import importlib.util
import os
from contextlib import redirect_stderr, redirect_stdout

import cognee
import mcp.server.stdio
import mcp.types as types
from cognee.api.v1.search import SearchType
from cognee.shared.data_models import KnowledgeGraph
from mcp.server import NotificationOptions, Server
from mcp.server.models import InitializationOptions
from pydantic import AnyUrl, BaseModel

server = Server("mcpcognee")


def node_to_string(node):
    keys_to_keep = ["chunk_index", "topological_rank", "cut_type", "id", "text"]
    keyset = set(keys_to_keep) & node.keys()
    return "Node(" + " ".join([key + ": " + str(node[key]) + "," for key in keyset]) + ")"
Comment on lines +17 to +20
⚠️ Potential issue

Fix potential trailing comma in node_to_string output

The current implementation may result in an extra comma in the string representation of the node due to the trailing comma in the list comprehension.

Apply this diff to correct the issue:

```diff
-def node_to_string(node):
-    keys_to_keep = ["chunk_index", "topological_rank", "cut_type", "id", "text"]
-    keyset = set(keys_to_keep) & node.keys()
-    return "Node(" + " ".join([key + ": " + str(node[key]) + "," for key in keyset]) + ")"
+def node_to_string(node):
+    keys_to_keep = ["chunk_index", "topological_rank", "cut_type", "id", "text"]
+    keyset = set(keys_to_keep) & node.keys()
+    return "Node(" + ", ".join([f"{key}: {node[key]}" for key in keyset]) + ")"
```



def retrieved_edges_to_string(search_results):
    edge_strings = []
    for triplet in search_results:
        node1, edge, node2 = triplet
        relationship_type = edge["relationship_name"]
        edge_str = f"{node_to_string(node1)} {relationship_type} {node_to_string(node2)}"
        edge_strings.append(edge_str)
    return "\n".join(edge_strings)


def load_class(model_file, model_name):
    model_file = os.path.abspath(model_file)
    spec = importlib.util.spec_from_file_location("graph_model", model_file)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)

    model_class = getattr(module, model_name)

    return model_class
Comment on lines +33 to +41
⚠️ Potential issue

Address security risk with dynamic module loading

Dynamically loading modules using load_class without validation can lead to execution of untrusted code.

Consider validating the model_file path to ensure it's within an approved directory or use a whitelist of allowed modules.
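For instance, a minimal path check could look like this (the `approved_models` directory name and the helper `validate_model_path` are assumptions for illustration, not part of the PR):

```python
import os

# Hypothetical directory that holds vetted graph model files.
ALLOWED_MODEL_DIR = os.path.realpath("approved_models")


def validate_model_path(model_file: str) -> str:
    """Resolve model_file and reject anything outside the approved directory."""
    resolved = os.path.realpath(model_file)
    if not resolved.startswith(ALLOWED_MODEL_DIR + os.sep):
        raise ValueError(f"Model file outside approved directory: {model_file}")
    return resolved
```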



@server.list_tools()
async def handle_list_tools() -> list[types.Tool]:
    """
    List available tools.
    Each tool specifies its arguments using JSON Schema validation.
    """
    return [
        types.Tool(
            name="Cognify_and_search",
            description="Build knowledge graph from the input text and search in it.",
            inputSchema={
                "type": "object",
                "properties": {
                    "text": {"type": "string"},
                    "search_query": {"type": "string"},
                    "graph_model_file": {"type": "string"},
                    "graph_model_name": {"type": "string"},
                },
                "required": ["text", "search_query"],
            },
        )
    ]


@server.call_tool()
async def handle_call_tool(
    name: str, arguments: dict | None
) -> list[types.TextContent | types.ImageContent | types.EmbeddedResource]:
    """
    Handle tool execution requests.
    Tools can modify server state and notify clients of changes.
    """
    if name == "Cognify_and_search":
        with open(os.devnull, "w") as fnull:
            with redirect_stdout(fnull), redirect_stderr(fnull):
                await cognee.prune.prune_data()
                await cognee.prune.prune_system(metadata=True)

                if not arguments:
                    raise ValueError("Missing arguments")

                text = arguments.get("text")
                search_query = arguments.get("search_query")
                if ("graph_model_file" in arguments) and ("graph_model_name" in arguments):
                    model_file = arguments.get("graph_model_file")
                    model_name = arguments.get("graph_model_name")
                    graph_model = load_class(model_file, model_name)
                else:
                    graph_model = KnowledgeGraph

Comment on lines +87 to +93
🛠️ Refactor suggestion

Handle missing optional arguments gracefully

When optional arguments graph_model_file and graph_model_name are provided, but loading fails, the code does not handle potential exceptions.

Add exception handling around load_class to manage errors in loading the custom graph model.
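A sketch of one way to wrap the load (the helper name `load_graph_model` and the exact exception types caught are assumptions for illustration; `load_class` is reproduced from server.py):

```python
import importlib.util
import os


def load_class(model_file, model_name):
    # Same dynamic loader as in server.py, with a guard on the module spec.
    model_file = os.path.abspath(model_file)
    spec = importlib.util.spec_from_file_location("graph_model", model_file)
    if spec is None or spec.loader is None:
        raise ImportError(f"Cannot build a module spec for {model_file}")
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return getattr(module, model_name)


def load_graph_model(arguments, default):
    """Return the custom model class if both keys are present, else the default."""
    if "graph_model_file" in arguments and "graph_model_name" in arguments:
        try:
            return load_class(
                arguments["graph_model_file"], arguments["graph_model_name"]
            )
        except (OSError, ImportError, AttributeError, SyntaxError) as err:
            # Surface a tool-level error instead of an unhandled traceback.
            raise ValueError(f"Could not load custom graph model: {err}") from err
    return default
```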

                await cognee.add(text)
                await cognee.cognify(graph_model=graph_model)
                search_results = await cognee.search(
                    SearchType.INSIGHTS, query_text=search_query
                )

                results = retrieved_edges_to_string(search_results)

                return [
                    types.TextContent(
                        type="text",
                        text=results,
                    )
                ]
    else:
        raise ValueError(f"Unknown tool: {name}")
Comment on lines +76 to +109
🛠️ Refactor suggestion

Enhance argument validation in handle_call_tool

The function assumes that text and search_query are present in arguments but does not explicitly check for them.

Add explicit checks for required arguments:

```diff
 if not arguments:
     raise ValueError("Missing arguments")

+if "text" not in arguments or "search_query" not in arguments:
+    raise ValueError("Arguments 'text' and 'search_query' are required")
+
 text = arguments.get("text")
 search_query = arguments.get("search_query")
```



async def main():
    # Run the server using stdin/stdout streams
    async with mcp.server.stdio.stdio_server() as (read_stream, write_stream):
        await server.run(
            read_stream,
            write_stream,
            InitializationOptions(
                server_name="mcpcognee",
                server_version="0.1.0",
                capabilities=server.get_capabilities(
                    notification_options=NotificationOptions(),
                    experimental_capabilities={},
                ),
            ),
        )
94 changes: 94 additions & 0 deletions cognee-mcp/pyproject.toml
@@ -0,0 +1,94 @@
[project]
name = "mcpcognee"
version = "0.1.0"
description = "An MCP server project"
readme = "README.md"
requires-python = ">=3.11"
dependencies = [
"mcp>=1.1.1",
"openai==1.52.0",
"pydantic==2.8.2",
"python-dotenv==1.0.1",
"fastapi>=0.109.2,<0.110.0",
"uvicorn==0.22.0",
"requests==2.32.3",
"aiohttp==3.10.10",
"typing_extensions==4.12.2",
"nest_asyncio==1.6.0",
"numpy==1.26.4",
"datasets==3.1.0",
"falkordb==1.0.9", # Optional
"boto3>=1.26.125,<2.0.0",
"botocore>=1.35.54,<2.0.0",
"gunicorn>=20.1.0,<21.0.0",
"sqlalchemy==2.0.35",
"instructor==1.5.2",
"networkx>=3.2.1,<4.0.0",
"aiosqlite>=0.20.0,<0.21.0",
"pandas==2.0.3",
"filetype>=1.2.0,<2.0.0",
"nltk>=3.8.1,<4.0.0",
"dlt[sqlalchemy]>=1.4.1,<2.0.0",
"aiofiles>=23.2.1,<24.0.0",
"qdrant-client>=1.9.0,<2.0.0", # Optional
"graphistry>=0.33.5,<0.34.0",
"tenacity>=8.4.1,<9.0.0",
"weaviate-client==4.6.7", # Optional
"scikit-learn>=1.5.0,<2.0.0",
"pypdf>=4.1.0,<5.0.0",
"neo4j>=5.20.0,<6.0.0", # Optional
"jinja2>=3.1.3,<4.0.0",
"matplotlib>=3.8.3,<4.0.0",
"tiktoken==0.7.0",
"langchain_text_splitters==0.3.2", # Optional
"langsmith==0.1.139", # Optional
"langdetect==1.0.9",
"posthog>=3.5.0,<4.0.0", # Optional
"lancedb==0.15.0",
"litellm==1.49.1",
"groq==0.8.0", # Optional
"langfuse>=2.32.0,<3.0.0", # Optional
"pydantic-settings>=2.2.1,<3.0.0",
"anthropic>=0.26.1,<1.0.0",
"sentry-sdk[fastapi]>=2.9.0,<3.0.0",
"fastapi-users[sqlalchemy]", # Optional
"alembic>=1.13.3,<2.0.0",
"asyncpg==0.30.0", # Optional
"pgvector>=0.3.5,<0.4.0", # Optional
"psycopg2>=2.9.10,<3.0.0", # Optional
"llama-index-core>=0.11.22,<0.12.0", # Optional
"deepeval>=2.0.1,<3.0.0", # Optional
"transformers>=4.46.3,<5.0.0",
"pymilvus>=2.5.0,<3.0.0", # Optional
"unstructured[csv,doc,docx,epub,md,odt,org,ppt,pptx,rst,rtf,tsv,xlsx]>=0.16.10,<1.0.0", # Optional
"pytest>=7.4.0,<8.0.0",
"pytest-asyncio>=0.21.1,<0.22.0",
"coverage>=7.3.2,<8.0.0",
"mypy>=1.7.1,<2.0.0",
"deptry>=0.20.0,<0.21.0",
"debugpy==1.8.2",
"pylint>=3.0.3,<4.0.0",
"ruff>=0.2.2,<0.3.0",
"tweepy==4.14.0",
"gitpython>=3.1.43,<4.0.0",
"cognee",
]
Comment on lines +7 to +75
🛠️ Refactor suggestion

Remove unnecessary dependencies

The dependencies list includes many packages marked as optional or duplicates, which may not be required.

Review and clean up the dependencies to include only necessary packages.
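For example, a trimmed-down list might start from just the runtime essentials (a sketch; the exact set depends on which database and LLM providers the server actually uses):

```toml
[project]
name = "mcpcognee"
version = "0.1.0"
requires-python = ">=3.11"
dependencies = [
    "mcp>=1.1.1",
    "pydantic==2.8.2",
    "python-dotenv==1.0.1",
    "cognee",
    # Add provider-specific packages (neo4j, lancedb, ...) only as needed.
]
```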


[[project.authors]]
name = "Rita Aleksziev"
email = "[email protected]"

[build-system]
requires = [ "hatchling",]
build-backend = "hatchling.build"

[tool.uv.sources]
cognee = { path = "../../cognee" }

[dependency-groups]
dev = [
"cognee",
]

[project.scripts]
mcpcognee = "mcpcognee:main"