Merge branch 'dev' of https://github.com/topoteretes/cognee into dev
alekszievr committed Jan 6, 2025
2 parents 399faf9 + fe672ce commit a6dfff8
Showing 32 changed files with 5,421 additions and 253 deletions.
11 changes: 11 additions & 0 deletions .github/workflows/ruff_format.yaml
@@ -0,0 +1,11 @@
name: ruff format
on: [ pull_request ]

jobs:
  ruff:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: astral-sh/ruff-action@v2
        with:
          args: "format --check"
9 changes: 9 additions & 0 deletions .github/workflows/ruff_lint.yaml
@@ -0,0 +1,9 @@
name: ruff lint
on: [ pull_request ]

jobs:
  ruff:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: astral-sh/ruff-action@v2
20 changes: 20 additions & 0 deletions .pre-commit-config.yaml
@@ -0,0 +1,20 @@
# See https://pre-commit.com for more information
# See https://pre-commit.com/hooks.html for more hooks
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v3.2.0
    hooks:
      - id: trailing-whitespace
      - id: end-of-file-fixer
      - id: check-yaml
      - id: check-added-large-files
  - repo: https://github.com/astral-sh/ruff-pre-commit
    # Ruff version.
    rev: v0.8.3
    hooks:
      # Run the linter.
      - id: ruff
        types_or: [ python, pyi ]
      # Run the formatter.
      - id: ruff-format
        types_or: [ python, pyi ]
98 changes: 9 additions & 89 deletions README.md
@@ -17,6 +17,9 @@ Try it in a Google Colab <a href="https://colab.research.google.com/drive/1g-Qn

If you have questions, join our <a href="https://discord.gg/NQPKmU5CCg">Discord</a> community

<div align="center">
<img src="assets/cognee_benefits.png" alt="why cognee" width="80%" />
</div>

## 📦 Installation

@@ -193,93 +196,14 @@ if __name__ == '__main__':
When you run this script, you will see step-by-step messages in the console that help you trace the execution flow and understand what the script is doing at each stage.
A version of this example is here: `examples/python/simple_example.py`

### Create your own memory store
### Understand our architecture

The cognee framework consists of tasks that can be grouped into pipelines.
Each task is an independent piece of business logic that can be chained with other tasks to form a pipeline.
These tasks persist data into your memory store, enabling you to search for relevant context from past conversations, documents, or any other data you have stored.


### Example: Classify your documents

Here is an example of how it looks for a default cognify pipeline:

1. To prepare the data for the pipeline run, first we need to add it to our metastore and normalize it:

Start with:
```
text = """Natural language processing (NLP) is an interdisciplinary
subfield of computer science and information retrieval"""
await cognee.add(text) # Add a new piece of information
```

2. In the next step we create a task. A task can hold any business logic we need, but it should be encapsulated in a single function.

Here we show an example of a naive LLM classifier that takes a Pydantic model, analyzes each chunk, and then stores the data in both the graph and vector stores.
We provide only a snippet for reference; feel free to check out the full implementation in our repo.

```
async def chunk_naive_llm_classifier(
    data_chunks: list[DocumentChunk],
    classification_model: Type[BaseModel]
):
    # Extract classifications asynchronously
    chunk_classifications = await asyncio.gather(
        *(extract_categories(chunk.text, classification_model) for chunk in data_chunks)
    )
    # Collect classification data points using a set to avoid duplicates
    classification_data_points = {
        uuid5(NAMESPACE_OID, cls.label.type)
        for cls in chunk_classifications
    } | {
        uuid5(NAMESPACE_OID, subclass.value)
        for cls in chunk_classifications
        for subclass in cls.label.subclass
    }
    vector_engine = get_vector_engine()
    collection_name = "classification"
    # Define the payload schema
    class Keyword(BaseModel):
        uuid: str
        text: str
        chunk_id: str
        document_id: str
    # Ensure the collection exists and retrieve existing data points
    if not await vector_engine.has_collection(collection_name):
        await vector_engine.create_collection(collection_name, payload_schema=Keyword)
        existing_points_map = {}
    else:
        existing_points_map = {}
    return data_chunks
...
```

We have many tasks that can be used in your pipelines, and you can also create your own tasks to fit your business logic.


3. Once we have our tasks, it is time to group them into a pipeline.
This simplified snippet demonstrates how tasks can be added to a pipeline, and how they can pass the information forward from one to another.

```
Task(
    chunk_naive_llm_classifier,
    classification_model=cognee_config.classification_model,
)
pipeline = run_tasks(tasks, documents)
```

To see the working code, check cognee.api.v1.cognify default pipeline in our repo.
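
The idea of chaining tasks so that each one passes its result forward can be sketched in a few lines. This is a minimal illustration and not cognee's implementation: `SketchTask`, `run_sketch_pipeline`, and both toy task functions are hypothetical names written for this example, and the "classifier" is a plain keyword check rather than an LLM call. The only detail borrowed from the snippet above is the use of `uuid5(NAMESPACE_OID, ...)` to derive a stable ID from a label.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Any, Awaitable, Callable
from uuid import NAMESPACE_OID, UUID, uuid5


@dataclass
class SketchTask:
    """Hypothetical stand-in for a task: one async function plus fixed keyword arguments."""
    func: Callable[..., Awaitable[Any]]
    kwargs: dict[str, Any] = field(default_factory=dict)


async def run_sketch_pipeline(tasks: list[SketchTask], data: Any) -> Any:
    """Run tasks in order, feeding each task's return value into the next one."""
    for task in tasks:
        data = await task.func(data, **task.kwargs)
    return data


async def split_into_chunks(text: str, chunk_size: int) -> list[str]:
    """Split text into chunks of at most `chunk_size` words."""
    words = text.split()
    return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), chunk_size)]


async def label_chunks(chunks: list[str], keyword: str) -> list[tuple[UUID, str]]:
    """Toy classifier: tag each chunk with a deterministic ID derived from its label."""
    labelled = []
    for chunk in chunks:
        label = "match" if keyword in chunk.lower() else "other"
        # uuid5 is deterministic, so the same label always maps to the same ID.
        labelled.append((uuid5(NAMESPACE_OID, label), chunk))
    return labelled


def demo() -> list[tuple[UUID, str]]:
    tasks = [
        SketchTask(split_into_chunks, {"chunk_size": 4}),
        SketchTask(label_chunks, {"keyword": "language"}),
    ]
    text = "Natural language processing is an interdisciplinary subfield of computer science"
    return asyncio.run(run_sketch_pipeline(tasks, text))
```

Calling `demo()` returns one `(label_id, chunk)` pair per chunk. Because the IDs are derived from the label text, chunks with the same label share an ID, which is what makes set-based deduplication possible.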
<div align="center">
<img src="assets/cognee_diagram.png" alt="cognee concept diagram" width="50%" />
</div>


## Vector retrieval, Graphs and LLMs
@@ -338,11 +262,7 @@ pip install cognee

## Vector & Graph Databases Implementation State

<style>
table {
width: 100%;
}
</style>


| Name | Type | Current state | Known Issues |
|----------|--------------------|-------------------|--------------|
@@ -353,4 +273,4 @@ pip install cognee
| NetworkX | Graph | Stable &#x2705; | |
| FalkorDB | Vector/Graph | Unstable &#x274C; | |
| PGVector | Vector | Stable &#x2705; | |
| Milvus | Vector | Stable &#x2705; | |
| Milvus | Vector | Stable &#x2705; | |
Binary file removed assets/architecture.png
Binary file not shown.
Binary file added assets/cognee_benefits.png
Binary file added assets/cognee_diagram.png
78 changes: 53 additions & 25 deletions cognee-mcp/README.md
@@ -1,57 +1,85 @@
# cognee MCP server




### Installing Manually
A MCP server project
=======
1. Clone the [cognee](https://github.com/topoteretes/cognee) repo

Create a boilerplate server:

```jsx
uvx create-mcp-server
```

1. The command will ask you to name your server, e.g. mcp_cognee
2. Install dependencies

```
pip install uv
```
```
brew install postgresql
```

2. Answer “Y” to connect with Claude
Then run
```
brew install rust
```

```jsx
cd mcp_cognee
cd cognee-mcp
uv sync --dev --all-extras
```

Activate the venv with
3. Activate the venv with

```jsx
source .venv/bin/activate
```

This should already add the new server to your Claude config, but if not, add these lines manually:
4. Add the new server to your Claude config:

The file should be located in ~/Library/Application\ Support/Claude/.
Create claude_desktop_config.json in this folder if it doesn't exist.

```
"mcpcognee": {
"command": "uv",
"args": [
{
"mcpServers": {
"cognee": {
"command": "/Users/{user}/cognee/.venv/bin/uv",
"args": [
"--directory",
"/Users/your_username/mcp/mcp_cognee",
"/Users/{user}/cognee/cognee-mcp",
"run",
"mcpcognee"
"cognee"
],
"env": {
"ENV": "local",
"TOKENIZERS_PARALLELISM": "false",
"LLM_API_KEY": "add_your_api_key_here",
"GRAPH_DATABASE_PROVIDER": "neo4j",
"GRAPH_DATABASE_URL": "bolt://localhost:7687",
"GRAPH_DATABASE_USERNAME": "add_username_here",
"GRAPH_DATABASE_PASSWORD": "add_pwd_here",
"VECTOR_DB_PROVIDER": "lancedb",
"DB_PROVIDER": "sqlite",
"DB_NAME": "postgres"
"LLM_API_KEY": "sk-"
}
},
"filesystem": {
"command": "npx",
"args": [
"-y",
"@modelcontextprotocol/server-filesystem",
"/Users/{user}/Desktop",
"/Users/{user}/Projects"
]
}
}
}
```
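
If you prefer to script this step, the sketch below merges a `cognee` entry into the config file without dropping servers that are already registered. It assumes the layout shown above (a top-level `mcpServers` object). The helper names `add_mcp_server` and `cognee_entry` are made up for this example, and the repo location and API key are passed in as parameters rather than guessed.

```python
import json
from pathlib import Path


def add_mcp_server(config_path: Path, name: str, entry: dict) -> dict:
    """Merge one server entry under "mcpServers", keeping any existing entries."""
    text = config_path.read_text().strip() if config_path.exists() else ""
    config = json.loads(text) if text else {}
    config.setdefault("mcpServers", {})[name] = entry
    # Create the folder if needed, then write the merged config back.
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2))
    return config


def cognee_entry(repo_dir: Path, api_key: str) -> dict:
    """Build an entry shaped like the "cognee" block above.

    repo_dir is wherever the cognee repo was cloned (an assumption of this sketch).
    """
    return {
        "command": str(repo_dir / ".venv" / "bin" / "uv"),
        "args": ["--directory", str(repo_dir / "cognee-mcp"), "run", "cognee"],
        "env": {
            "ENV": "local",
            "TOKENIZERS_PARALLELISM": "false",
            "LLM_API_KEY": api_key,
        },
    }
```

A call would look like `add_mcp_server(config_path, "cognee", cognee_entry(repo_dir, api_key))`, where `config_path` points at `claude_desktop_config.json` in the folder mentioned above.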

Then, edit the pyproject.toml in your new folder so that it includes packages from the cognee requirements. Use the pyproject.toml in your cognee library for this, but match the syntax of the automatically generated pyproject.toml so that it is compatible with uv.
Restart your Claude desktop.

### Installing via Smithery

To install Cognee for Claude Desktop automatically via [Smithery](https://smithery.ai/server/cognee):

```bash
npx -y @smithery/cli install cognee --client claude
```

Define cognify tool in server.py
Restart your Claude desktop.
Restart your Claude desktop.
@@ -8,7 +8,7 @@ def main():
asyncio.run(server.main())

# Optionally expose other important items at package level
__all__ = ['main', 'server']
__all__ = ["main", "server"]

if __name__ == "__main__":
main()