Add mcp to cognee #370
Conversation
Walkthrough

This pull request introduces a new MCP (Model Context Protocol) server project called "mcpcognee". The project establishes a modular structure with an entry point, a server module, and a comprehensive project configuration. The server is designed to handle tool listing and execution, with a focus on building knowledge graphs and performing searches. The implementation includes dynamic class loading, tool registration, and asynchronous request handling.
Actionable comments posted: 6
🧹 Outside diff range and nitpick comments (4)
cognee-mcp/mcpcognee/server.py (2)
77-80: Avoid suppressing all output during pruning operations

Redirecting both `stdout` and `stderr` to `os.devnull` may hide important error messages. Consider suppressing only standard output, or implement logging to capture errors.
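One way to keep errors visible, sketched below: discard stdout noise but buffer stderr and forward anything captured to a logger. `run_quietly` and its `operation` argument are hypothetical illustrations, not the server's actual API.

```python
import io
import logging
import os
from contextlib import redirect_stdout, redirect_stderr

logging.basicConfig(level=logging.WARNING)
logger = logging.getLogger("mcpcognee.prune")


def run_quietly(operation):
    """Discard stdout but capture stderr so errors still surface.

    `operation` stands in for the pruning calls; this wrapper is a
    sketch for illustration, not part of the project.
    """
    captured = io.StringIO()
    with open(os.devnull, "w") as fnull:
        with redirect_stdout(fnull), redirect_stderr(captured):
            operation()
    errors = captured.getvalue()
    if errors:
        # Forward suppressed error output instead of silently dropping it.
        logger.warning("stderr during pruning: %s", errors.strip())
    return errors
```

Because `redirect_stderr` swaps `sys.stderr` for the buffer, anything the pruning code writes there is recoverable after the fact.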
112-126: Ensure graceful shutdown in `main` function

Currently, exceptions within the server might not be properly handled, leading to unclean shutdowns. Consider wrapping the server run in a try-except block to handle exceptions and ensure resources are cleaned up.
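A minimal shape for that, with `run_server` and `cleanup_resources` as hypothetical stand-ins for the real server objects (the `events` list exists only to make the control flow observable):

```python
import asyncio

events = []


async def run_server():
    # Hypothetical stand-in for the real server loop; raises to simulate a crash.
    events.append("started")
    raise RuntimeError("simulated failure")


async def cleanup_resources():
    # Hypothetical stand-in for closing connections, flushing state, etc.
    events.append("cleaned up")


async def main():
    try:
        await run_server()
    except Exception as exc:
        events.append(f"handled: {exc}")
    finally:
        # Runs whether the server exits normally, raises, or is cancelled.
        await cleanup_resources()


asyncio.run(main())
```

The `finally` block is the key part: cleanup runs on every exit path, so a crash no longer leaves resources dangling.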
cognee-mcp/mcpcognee/__init__.py (2)
6-8: Clarify docstring in `main` function

The docstring could be more descriptive about the function's purpose and its asynchronous nature. Consider updating it:

```diff
 def main():
-    """Main entry point for the package."""
+    """Asynchronous main entry point for the mcpcognee server."""
     asyncio.run(server.main())
```
13-14: Avoid redundant main invocation

The conditional `if __name__ == "__main__":` may not be necessary in `__init__.py`, since `__main__.py` handles script execution. Consider removing these lines to prevent confusion.
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📒 Files selected for processing (4)

- cognee-mcp/mcpcognee/__init__.py (1 hunks)
- cognee-mcp/mcpcognee/__main__.py (1 hunks)
- cognee-mcp/mcpcognee/server.py (1 hunks)
- cognee-mcp/pyproject.toml (1 hunks)
🔇 Additional comments (3)
cognee-mcp/mcpcognee/server.py (1)
69-71: Type hint compatibility with Python versions

Using the union operator `|` for type hints is only supported in Python 3.10 and above. Since the pyproject.toml specifies `requires-python = ">=3.11"`, this is acceptable. However, ensure that all environments run the required Python version.
cognee-mcp/pyproject.toml (2)
6-6: Confirm Python version compatibility

`requires-python = ">=3.11"` specifies Python 3.11 or newer. Check the deployment environments to confirm Python 3.11+ is available.
94-94: Verify script entry point

Ensure that the script entry point `mcpcognee = "mcpcognee:main"` correctly references the `main` function. Run the following script to test the entry point:
✅ Verification successful

Entry point correctly references the main function. The script entry point `mcpcognee = "mcpcognee:main"` in pyproject.toml is correctly configured. The verification shows:

- `__main__.py` imports and calls the main function from the package
- `__init__.py` defines the main function that serves as the entry point
- The main function properly initializes the server by running the async `server.main()`
🏁 Scripts executed

The following scripts were executed for the analysis:

```shell
#!/bin/bash
# Description: Test the script entry point defined in pyproject.toml
# Attempt to execute the script
python -m mcpcognee
```

Length of output: 118

```shell
#!/bin/bash
# Check if the main function exists in the mcpcognee module
fd "main.py|__main__.py" --type f
# Check for any Python files that might contain the main function
ast-grep --pattern 'def main'
# Also check the content of the pyproject.toml to understand the project structure
cat pyproject.toml
```

Length of output: 3923

```shell
#!/bin/bash
# Check the content of __main__.py to verify the main function
cat cognee-mcp/mcpcognee/__main__.py
# Also check if there are any other potential entry points
rg "def main" --type py
```

Length of output: 1228

```shell
#!/bin/bash
# Check the content of mcpcognee/__init__.py to verify the main function implementation
cat cognee-mcp/mcpcognee/__init__.py
```

Length of output: 292
```python
def load_class(model_file, model_name):
    model_file = os.path.abspath(model_file)
    spec = importlib.util.spec_from_file_location("graph_model", model_file)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)

    model_class = getattr(module, model_name)

    return model_class
```
Address security risk with dynamic module loading

Dynamically loading modules using `load_class` without validation can lead to execution of untrusted code. Consider validating the `model_file` path to ensure it's within an approved directory, or use a whitelist of allowed modules.
```python
if ("graph_model_file" in arguments) and ("graph_model_name" in arguments):
    model_file = arguments.get("graph_model_file")
    model_name = arguments.get("graph_model_name")
    graph_model = load_class(model_file, model_name)
else:
    graph_model = KnowledgeGraph
```
🛠️ Refactor suggestion

Handle missing optional arguments gracefully

When the optional arguments `graph_model_file` and `graph_model_name` are provided but loading fails, the code does not handle potential exceptions. Add exception handling around `load_class` to manage errors in loading the custom graph model.
```python
def node_to_string(node):
    keys_to_keep = ["chunk_index", "topological_rank", "cut_type", "id", "text"]
    keyset = set(keys_to_keep) & node.keys()
    return "Node(" + " ".join([key + ": " + str(node[key]) + "," for key in keyset]) + ")"
```
Fix potential trailing comma in `node_to_string` output

The current implementation may produce an extra comma in the string representation of the node, due to the trailing comma in the list comprehension. Apply this diff to correct the issue:
```diff
-def node_to_string(node):
-    keys_to_keep = ["chunk_index", "topological_rank", "cut_type", "id", "text"]
-    keyset = set(keys_to_keep) & node.keys()
-    return "Node(" + " ".join([key + ": " + str(node[key]) + "," for key in keyset]) + ")"
+def node_to_string(node):
+    keys_to_keep = ["chunk_index", "topological_rank", "cut_type", "id", "text"]
+    keyset = set(keys_to_keep) & node.keys()
+    return "Node(" + ", ".join([f"{key}: {node[key]}" for key in keyset]) + ")"
```
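A standalone copy of the corrected function, for a quick sanity check of the output format (with a single key the set ordering is deterministic):

```python
def node_to_string(node):
    keys_to_keep = ["chunk_index", "topological_rank", "cut_type", "id", "text"]
    keyset = set(keys_to_keep) & node.keys()
    # ", ".join places separators between items only, so no trailing comma.
    return "Node(" + ", ".join(f"{key}: {node[key]}" for key in keyset) + ")"


print(node_to_string({"id": 42}))  # prints "Node(id: 42)"
```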
```python
if name == "Cognify_and_search":
    with open(os.devnull, "w") as fnull:
        with redirect_stdout(fnull), redirect_stderr(fnull):
            await cognee.prune.prune_data()
            await cognee.prune.prune_system(metadata=True)

            if not arguments:
                raise ValueError("Missing arguments")

            text = arguments.get("text")
            search_query = arguments.get("search_query")
            if ("graph_model_file" in arguments) and ("graph_model_name" in arguments):
                model_file = arguments.get("graph_model_file")
                model_name = arguments.get("graph_model_name")
                graph_model = load_class(model_file, model_name)
            else:
                graph_model = KnowledgeGraph

            await cognee.add(text)
            await cognee.cognify(graph_model=graph_model)
            search_results = await cognee.search(
                SearchType.INSIGHTS, query_text=search_query
            )

            results = retrieved_edges_to_string(search_results)

            return [
                types.TextContent(
                    type="text",
                    text=results,
                )
            ]
else:
    raise ValueError(f"Unknown tool: {name}")
```
🛠️ Refactor suggestion

Enhance argument validation in `handle_call_tool`

The function assumes that `text` and `search_query` are present in `arguments` but does not explicitly check for them. Add explicit checks for required arguments:
Add explicit checks for required arguments:
if not arguments:
raise ValueError("Missing arguments")
+ if "text" not in arguments or "search_query" not in arguments:
+ raise ValueError("Arguments 'text' and 'search_query' are required")
text = arguments.get("text")
search_query = arguments.get("search_query")
```toml
dependencies = [
    "mcp>=1.1.1",
    "openai==1.52.0",
    "pydantic==2.8.2",
    "python-dotenv==1.0.1",
    "fastapi>=0.109.2,<0.110.0",
    "uvicorn==0.22.0",
    "requests==2.32.3",
    "aiohttp==3.10.10",
    "typing_extensions==4.12.2",
    "nest_asyncio==1.6.0",
    "numpy==1.26.4",
    "datasets==3.1.0",
    "falkordb==1.0.9",  # Optional
    "boto3>=1.26.125,<2.0.0",
    "botocore>=1.35.54,<2.0.0",
    "gunicorn>=20.1.0,<21.0.0",
    "sqlalchemy==2.0.35",
    "instructor==1.5.2",
    "networkx>=3.2.1,<4.0.0",
    "aiosqlite>=0.20.0,<0.21.0",
    "pandas==2.0.3",
    "filetype>=1.2.0,<2.0.0",
    "nltk>=3.8.1,<4.0.0",
    "dlt[sqlalchemy]>=1.4.1,<2.0.0",
    "aiofiles>=23.2.1,<24.0.0",
    "qdrant-client>=1.9.0,<2.0.0",  # Optional
    "graphistry>=0.33.5,<0.34.0",
    "tenacity>=8.4.1,<9.0.0",
    "weaviate-client==4.6.7",  # Optional
    "scikit-learn>=1.5.0,<2.0.0",
    "pypdf>=4.1.0,<5.0.0",
    "neo4j>=5.20.0,<6.0.0",  # Optional
    "jinja2>=3.1.3,<4.0.0",
    "matplotlib>=3.8.3,<4.0.0",
    "tiktoken==0.7.0",
    "langchain_text_splitters==0.3.2",  # Optional
    "langsmith==0.1.139",  # Optional
    "langdetect==1.0.9",
    "posthog>=3.5.0,<4.0.0",  # Optional
    "lancedb==0.15.0",
    "litellm==1.49.1",
    "groq==0.8.0",  # Optional
    "langfuse>=2.32.0,<3.0.0",  # Optional
    "pydantic-settings>=2.2.1,<3.0.0",
    "anthropic>=0.26.1,<1.0.0",
    "sentry-sdk[fastapi]>=2.9.0,<3.0.0",
    "fastapi-users[sqlalchemy]",  # Optional
    "alembic>=1.13.3,<2.0.0",
    "asyncpg==0.30.0",  # Optional
    "pgvector>=0.3.5,<0.4.0",  # Optional
    "psycopg2>=2.9.10,<3.0.0",  # Optional
    "llama-index-core>=0.11.22,<0.12.0",  # Optional
    "deepeval>=2.0.1,<3.0.0",  # Optional
    "transformers>=4.46.3,<5.0.0",
    "pymilvus>=2.5.0,<3.0.0",  # Optional
    "unstructured[csv,doc,docx,epub,md,odt,org,ppt,pptx,rst,rtf,tsv,xlsx]>=0.16.10,<1.0.0",  # Optional
    "pytest>=7.4.0,<8.0.0",
    "pytest-asyncio>=0.21.1,<0.22.0",
    "coverage>=7.3.2,<8.0.0",
    "mypy>=1.7.1,<2.0.0",
    "deptry>=0.20.0,<0.21.0",
    "debugpy==1.8.2",
    "pylint>=3.0.3,<4.0.0",
    "ruff>=0.2.2,<0.3.0",
    "tweepy==4.14.0",
    "gitpython>=3.1.43,<4.0.0",
    "cognee",
]
```
🛠️ Refactor suggestion

Remove unnecessary dependencies

The dependencies list includes many packages that are marked as optional or duplicated, and may not be required. Review and clean up the dependencies to include only necessary packages.
Actionable comments posted: 2
🧹 Outside diff range and nitpick comments (3)
cognee-mcp/README.md (3)
1-4: Grammar fix needed; enhance project description

- Fix the grammar: "A MCP server" should be "An MCP server"
- Consider expanding the project description to cover what MCP (Model Context Protocol) is, its purpose within the cognee ecosystem, and key features or capabilities

```diff
-A MCP server project
+An MCP server project that exposes cognee's knowledge-graph tools to MCP clients.
```
7-20: Improve code block formatting and path examples

- Use `shell` or `bash` instead of `jsx` as the language for the command blocks (`uvx create-mcp-server`, and `cd mcp_cognee` / `uv sync --dev --all-extras`)
- Use a more obvious placeholder notation for the path
22-26: Add cross-platform activation instructions

Include activation commands for both Unix-like systems and Windows, and change the fence language from `jsx` to `shell` here as well:

```shell
source .venv/bin/activate   # Unix-like systems
# OR
.venv\Scripts\activate      # Windows
```

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3dc2d1e647d15e9709ad5437a0c80b9ed34cfef2 and 1ec24ee4074549ef78bb13619dc4b82783e0866a.

📒 Files selected for processing (1)

- cognee-mcp/README.md (1 hunks)
This should already add the new server to your Claude config, but if not, add these lines manually:

```json
"mcpcognee": {
  "command": "uv",
  "args": [
    "--directory",
    "/Users/your_username/mcp/mcp_cognee",
    "run",
    "mcpcognee"
  ],
  "env": {
    "ENV": "local",
    "TOKENIZERS_PARALLELISM": "false",
    "LLM_API_KEY": "add_your_api_key_here",
    "GRAPH_DATABASE_PROVIDER": "neo4j",
    "GRAPH_DATABASE_URL": "bolt://localhost:7687",
    "GRAPH_DATABASE_USERNAME": "add_username_here",
    "GRAPH_DATABASE_PASSWORD": "add_pwd_here",
    "VECTOR_DB_PROVIDER": "lancedb",
    "DB_PROVIDER": "sqlite",
    "DB_NAME": "postgres"
  }
}
```
Enhance security and configuration documentation

- Add a proper code block identifier (`json`)
- Use environment variable placeholders instead of literal "add_your_api_key_here"
- Document each environment variable's purpose
````diff
-```
+```json
 "mcpcognee": {
   "command": "uv",
   "args": [
     "--directory",
-    "/Users/your_username/mcp/mcp_cognee",
+    "${MCP_COGNEE_PATH}",
     "run",
     "mcpcognee"
   ],
   "env": {
     "ENV": "local",
     "TOKENIZERS_PARALLELISM": "false",
-    "LLM_API_KEY": "add_your_api_key_here",
+    "LLM_API_KEY": "${LLM_API_KEY}",
     "GRAPH_DATABASE_PROVIDER": "neo4j",
     "GRAPH_DATABASE_URL": "bolt://localhost:7687",
-    "GRAPH_DATABASE_USERNAME": "add_username_here",
-    "GRAPH_DATABASE_PASSWORD": "add_pwd_here",
+    "GRAPH_DATABASE_USERNAME": "${GRAPH_DB_USER}",
+    "GRAPH_DATABASE_PASSWORD": "${GRAPH_DB_PASSWORD}",
     "VECTOR_DB_PROVIDER": "lancedb",
     "DB_PROVIDER": "sqlite",
     "DB_NAME": "postgres"
   }
 }
````
Consider adding a section documenting these environment variables:

```markdown
### Environment Variables

- `LLM_API_KEY`: Your Language Model API key
- `GRAPH_DATABASE_PROVIDER`: Graph database type (default: neo4j)
- `GRAPH_DATABASE_URL`: Connection URL for the graph database
...
```
Then, edit the pyproject.toml in your new folder so that it includes packages from the cognee requirements. Use the pyproject.toml in your cognee library for this, but match the syntax of the automatically generated pyproject.toml so that it is compatible with uv.

Define the cognify tool in server.py, then restart your Claude desktop.
🛠️ Refactor suggestion

Provide more detailed setup instructions

The final setup steps need more specificity:

- Add an example of the required dependencies in pyproject.toml
- Provide a basic template or example for defining the cognify tool in server.py

Consider adding:
### Example pyproject.toml dependencies

```toml
[project]
dependencies = [
    "cognee>=0.1.0",
    # Add other required dependencies
]
```
### Defining the Cognify Tool

In `server.py`, implement the tool following this pattern:

```python
from cognee.tools import BaseTool

class CognifyTool(BaseTool):
    name = "cognify"
    description = "Tool for cognitive operations"
    # Add implementation details
```
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>