Rename embedchain to mem0 and open sourcing code for long term memory #1474

Merged: 5 commits, Jul 12, 2024
1 change: 0 additions & 1 deletion .env.example

This file was deleted.

3 changes: 3 additions & 0 deletions .gitignore
@@ -179,3 +179,6 @@ notebooks/*.yaml

# cache db
*.db

# local directories for testing
eval/
20 changes: 0 additions & 20 deletions .pre-commit-config.yaml

This file was deleted.

60 changes: 18 additions & 42 deletions Makefile
@@ -1,56 +1,32 @@
# Variables
PYTHON := python3
PIP := $(PYTHON) -m pip
PROJECT_NAME := embedchain

# Targets
.PHONY: install format lint clean test ci_lint ci_test coverage

install:
poetry install

# TODO: use a more efficient way to install these packages
install_all:
poetry install --all-extras
poetry run pip install pinecone-text pinecone-client langchain-anthropic "unstructured[local-inference, all-docs]" ollama langchain_together==0.1.3 \
langchain_cohere==0.1.5 deepgram-sdk==3.2.7 langchain-huggingface psutil clarifai==10.0.1 flask==2.3.3 twilio==8.5.0 fastapi-poe==0.0.16 discord==2.3.2 \
slack-sdk==3.21.3 huggingface_hub==0.23.0 gitpython==3.1.38 yt_dlp==2023.11.14 PyGithub==1.59.1 feedparser==6.0.10 newspaper3k==0.2.8 listparser==0.19 \
modal==0.56.4329 dropbox==11.36.2 boto3==1.34.20 youtube-transcript-api==0.6.1 pytube==15.0.0 beautifulsoup4==4.12.3

install_es:
poetry install --extras elasticsearch

install_opensearch:
poetry install --extras opensearch
.PHONY: format sort lint

install_milvus:
poetry install --extras milvus

shell:
poetry shell
# Variables
RUFF_OPTIONS = --line-length 120
ISORT_OPTIONS = --profile black

py_shell:
poetry run python
# Default target
all: format sort lint

# Format code with ruff
format:
$(PYTHON) -m black .
$(PYTHON) -m isort .
poetry run ruff check . --fix $(RUFF_OPTIONS)

clean:
rm -rf dist build *.egg-info
# Sort imports with isort
sort:
poetry run isort . $(ISORT_OPTIONS)

# Lint code with ruff
lint:
poetry run ruff .
poetry run ruff check . $(RUFF_OPTIONS)

docs:
cd docs && mintlify dev

build:
poetry build

publish:
poetry publish

# for example: make test file=tests/test_factory.py
test:
poetry run pytest $(file)

coverage:
poetry run pytest --cov=$(PROJECT_NAME) --cov-report=xml
clean:
poetry run rm -rf dist
254 changes: 163 additions & 91 deletions README.md
@@ -1,125 +1,197 @@
<p align="center">
<img src="docs/logo/dark.svg" width="400px" alt="Embedchain Logo">
</p>

<p align="center">
<a href="https://pypi.org/project/embedchain/">
<img src="https://img.shields.io/pypi/v/embedchain" alt="PyPI">
</a>
<a href="https://pepy.tech/project/embedchain">
<img src="https://static.pepy.tech/badge/embedchain" alt="Downloads">
</a>
<a href="https://embedchain.ai/slack">
<img src="https://img.shields.io/badge/slack-embedchain-brightgreen.svg?logo=slack" alt="Slack">
</a>
<a href="https://embedchain.ai/discord">
<img src="https://dcbadge.vercel.app/api/server/6PzXDgEjG5?style=flat" alt="Discord">
</a>
<a href="https://twitter.com/embedchain">
<img src="https://img.shields.io/twitter/follow/embedchain" alt="Twitter">
</a>
<a href="https://colab.research.google.com/drive/138lMWhENGeEu7Q1-6lNbNTHGLZXBBz_B?usp=sharing">
<img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open in Colab">
</a>
<a href="https://codecov.io/gh/embedchain/embedchain">
<img src="https://codecov.io/gh/embedchain/embedchain/graph/badge.svg?token=EMRRHZXW1Q" alt="codecov">
</a>
</p>

<hr />

## What is Embedchain?

Embedchain is an Open Source Framework for personalizing LLM responses. It makes it easy to create and deploy personalized AI apps. At its core, Embedchain follows the design principle of being *"Conventional but Configurable"* to serve both software engineers and machine learning engineers.

Embedchain streamlines the creation of personalized LLM applications, offering a seamless process for managing various types of unstructured data. It efficiently segments data into manageable chunks, generates relevant embeddings, and stores them in a vector database for optimized retrieval. With a suite of diverse APIs, it enables users to extract contextual information, find precise answers, or engage in interactive chat conversations, all tailored to their own data.

## 🔧 Quick install

### Python API
# Mem0: Long-Term Memory for LLMs

Mem0 provides a smart, self-improving memory layer for Large Language Models, enabling personalized AI experiences across applications.

## Features

- Persistent memory for users, sessions, and agents
- Self-improving personalization
- Simple API for easy integration
- Cross-platform consistency

## Quick Start

### Installation


```bash
pip install embedchain
pip install mem0ai
```

## ✨ Live demo
## Usage

Check out the [Chat with PDF](https://embedchain.ai/demo/chat-pdf) live demo we created using Embedchain. You can find the source code [here](https://github.com/embedchain/embedchain/tree/main/examples/chat-pdf).
### Instantiate

## 🔍 Usage
```python
from mem0 import Memory

m = Memory()
```

<!-- Demo GIF or Image -->
<p align="center">
<img src="docs/images/cover.gif" width="900px" alt="Embedchain Demo">
</p>
If you want to use Qdrant in server mode, use the following method to instantiate.

For example, you can create an Elon Musk bot using the following code:
Run Qdrant first:

```bash
docker pull qdrant/qdrant

docker run -p 6333:6333 -p 6334:6334 \
    -v $(pwd)/qdrant_storage:/qdrant/storage:z \
    qdrant/qdrant
```

Then instantiate `Memory` against the Qdrant server:

```python
import os
from embedchain import App
from mem0 import Memory

config = {
    "vector_store": {
        "provider": "qdrant",
        "config": {
            "host": "localhost",
            "port": 6333,
        }
    },
}

# Create a bot instance
os.environ["OPENAI_API_KEY"] = "<YOUR_API_KEY>"
app = App()
m = Memory.from_config(config)
```

# Embed online resources
app.add("https://en.wikipedia.org/wiki/Elon_Musk")
app.add("https://www.forbes.com/profile/elon-musk")
### Store a Memory

# Query the app
app.query("How many companies does Elon Musk run and name those?")
# Answer: Elon Musk currently runs several companies. As of my knowledge, he is the CEO and lead designer of SpaceX, the CEO and product architect of Tesla, Inc., the CEO and founder of Neuralink, and the CEO and founder of The Boring Company. However, please note that this information may change over time, so it's always good to verify the latest updates.
```python
m.add("Likes to play cricket over weekend", user_id="alex", metadata={"foo": "bar"})
# Output:
# [
#     {
#         'id': 'm1',
#         'event': 'add',
#         'data': 'Likes to play cricket over weekend'
#     }
# ]

# Similarly, you can store a memory for an agent
m.add("Agent X is best travel agent in Paris", agent_id="agent-x", metadata={"type": "long-term"})
```

You can also try it in your browser with Google Colab:
### Retrieve all memories

[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/17ON1LPonnXAtLaZEebnOktstB_1cJJmh?usp=sharing)
#### 1. Get all memories
```python
m.get_all()
# Output:
# [
#     {
#         'id': 'm1',
#         'text': 'Likes to play cricket over weekend',
#         'metadata': {
#             'data': 'Likes to play cricket over weekend'
#         }
#     },
#     {
#         'id': 'm2',
#         'text': 'Agent X is best travel agent in Paris',
#         'metadata': {
#             'data': 'Agent X is best travel agent in Paris'
#         }
#     }
# ]

## 📖 Documentation
Comprehensive guides and API documentation are available to help you get the most out of Embedchain:
```
#### 2. Get memories for specific user

- [Introduction](https://docs.embedchain.ai/get-started/introduction#what-is-embedchain)
- [Getting Started](https://docs.embedchain.ai/get-started/quickstart)
- [Examples](https://docs.embedchain.ai/examples)
- [Supported data types](https://docs.embedchain.ai/components/data-sources/overview)
```python
m.get_all(user_id="alex")
```

## 🔗 Join the Community
#### 3. Get memories for specific agent

* Connect with fellow developers by joining our [Slack Community](https://embedchain.ai/slack) or [Discord Community](https://embedchain.ai/discord).
```python
m.get_all(agent_id="agent-x")
```

* Dive into [GitHub Discussions](https://github.com/embedchain/embedchain/discussions), ask questions, or share your experiences.
#### 4. Get memories for a user during an agent run

## 🤝 Schedule a 1-on-1 Session
```python
m.get_all(agent_id="agent-x", user_id="alex")
```
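The four `get_all` variants above differ only in which scope arguments are passed. As a rough illustration of that idea, here is a minimal in-memory sketch. It is not mem0's implementation: `ToyMemory` and `filter_memories` are hypothetical names, and the assumption that passing both `user_id` and `agent_id` narrows the result to records matching both is ours, not something the README states.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ToyMemory:
    """Hypothetical record used only for this sketch (not mem0's data model)."""
    id: str
    text: str
    user_id: Optional[str] = None
    agent_id: Optional[str] = None


def filter_memories(
    memories: List[ToyMemory],
    user_id: Optional[str] = None,
    agent_id: Optional[str] = None,
) -> List[ToyMemory]:
    """Return the memories that match every scope that was supplied.

    A scope left as None is ignored, so calling with no scopes returns everything.
    """
    return [
        m
        for m in memories
        if (user_id is None or m.user_id == user_id)
        and (agent_id is None or m.agent_id == agent_id)
    ]


store = [
    ToyMemory("m1", "Likes to play cricket over weekend", user_id="alex"),
    ToyMemory("m2", "Agent X is best travel agent in Paris", agent_id="agent-x"),
]

all_ids = [m.id for m in filter_memories(store)]
alex_ids = [m.id for m in filter_memories(store, user_id="alex")]
agent_ids = [m.id for m in filter_memories(store, agent_id="agent-x")]
```

Under these assumptions, a record stored with only a `user_id` would not be returned when an `agent_id` is also requested; the real library may behave differently.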

### Retrieve a Memory

Book a [1-on-1 Session](https://cal.com/taranjeetio/ec) with the founders, to discuss any issues, provide feedback, or explore how we can improve Embedchain for you.
```python
memory_id = "m1"
m.get(memory_id)
# Output:
# {
#     'id': 'm1',
#     'text': 'Likes to play cricket over weekend',
#     'metadata': {
#         'data': 'Likes to play cricket over weekend'
#     }
# }
```

## 🌐 Contributing
### Search for related memories

Contributions are welcome! Please check out the issues on the repository, and feel free to open a pull request.
For more information, please see the [contributing guidelines](CONTRIBUTING.md).
```python
m.search(query="What is my name", user_id="deshraj")
```

### Update a Memory

```python
m.update(memory_id="m1", data="Likes to play tennis")
```

For more reference, please go through [Development Guide](https://docs.embedchain.ai/contribution/dev) and [Documentation Guide](https://docs.embedchain.ai/contribution/docs).
### Get history of a Memory

<a href="https://github.com/embedchain/embedchain/graphs/contributors">
<img src="https://contrib.rocks/image?repo=embedchain/embedchain" />
</a>
```python
m.history(memory_id="m1")
# Output:
# [
#     {
#         'id': 'h1',
#         'memory_id': 'm1',
#         'prev_value': None,
#         'new_value': 'Likes to play cricket over weekend',
#         'event': 'add',
#         'timestamp': '2024-06-12 21:00:54.466687',
#         'is_deleted': 0
#     },
#     {
#         'id': 'h2',
#         'memory_id': 'm1',
#         'prev_value': 'Likes to play cricket over weekend',
#         'new_value': 'Likes to play tennis',
#         'event': 'update',
#         'timestamp': '2024-06-12 21:01:17.230943',
#         'is_deleted': 0
#     }
# ]
```
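The history output above reads like an append-only log: each `add` or `update` appends one entry holding the previous and new value. The sketch below reproduces that shape with a toy in-memory class. `ToyHistoryStore` is a hypothetical name and this is not how mem0 stores history internally; only the field names are taken from the sample output.

```python
from datetime import datetime
from typing import Dict, List, Optional


class ToyHistoryStore:
    """Hypothetical append-only change log, mirroring the fields shown above."""

    def __init__(self) -> None:
        self._values: Dict[str, str] = {}   # current value per memory id
        self._history: List[dict] = []      # one entry per change, oldest first

    def _record(self, memory_id: str, prev: Optional[str], new: str, event: str) -> None:
        self._history.append(
            {
                "id": f"h{len(self._history) + 1}",
                "memory_id": memory_id,
                "prev_value": prev,
                "new_value": new,
                "event": event,
                "timestamp": str(datetime.now()),
                "is_deleted": 0,
            }
        )

    def add(self, memory_id: str, data: str) -> None:
        self._values[memory_id] = data
        self._record(memory_id, None, data, "add")

    def update(self, memory_id: str, data: str) -> None:
        prev = self._values[memory_id]  # raises KeyError for an unknown id
        self._values[memory_id] = data
        self._record(memory_id, prev, data, "update")

    def history(self, memory_id: str) -> List[dict]:
        return [h for h in self._history if h["memory_id"] == memory_id]


toy = ToyHistoryStore()
toy.add("m1", "Likes to play cricket over weekend")
toy.update("m1", "Likes to play tennis")
entries = toy.history("m1")
```

Keeping the previous value in every entry means any earlier state can be recovered by reading the log, without storing separate snapshots.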

## Anonymous Telemetry
### Delete a Memory

We collect anonymous usage metrics to enhance our package's quality and user experience. This includes data like feature usage frequency and system info, but never personal details. The data helps us prioritize improvements and ensure compatibility. If you wish to opt-out, set the environment variable `EC_TELEMETRY=false`. We prioritize data security and don't share this data externally.
#### Delete specific memory

## Citation
```python
m.delete(memory_id="m1")
```

If you utilize this repository, please consider citing it with:
#### Delete memories for a user or agent

```python
m.delete_all(user_id="alex")
m.delete_all(agent_id="agent-x")
```
@misc{embedchain,
author = {Taranjeet Singh, Deshraj Yadav},
title = {Embedchain: The Open Source RAG Framework},
year = {2023},
publisher = {GitHub},
journal = {GitHub repository},
howpublished = {\url{https://github.com/embedchain/embedchain}},
}

#### Delete all Memories

```python
m.reset()
```

## License

[Apache License 2.0](https://www.apache.org/licenses/LICENSE-2.0)