fix: patch TEI error in load #725

Merged
cpacker merged 4 commits into main from patch-load
Dec 28, 2023


@cpacker cpacker commented Dec 28, 2023

Close #723

Please describe the purpose of this pull request.

Example command to test:

memgpt load directory --name test_load --input-dir memgpt/personas/examples --recursive

On main:

Works fine with OpenAI:

(pymemgpt-py3.10) (base) loaner@MacBook-Pro-5 MemGPT-2 % memgpt load directory --name test_load --input-dir memgpt/personas/examples --recursive
LLM is explicitly disabled. Using MockLLM.
LLM is explicitly disabled. Using MockLLM.
Parsing nodes: 100%|███████████████████████████████████████████████████████████████████████████████████| 7/7 [00:00<00:00, 30.85it/s]
Generating embeddings: 100%|███████████████████████████████████████████████████████████████████████| 100/100 [00:03<00:00, 31.43it/s]
100%|██████████████████████████████████████████████████████████████████████████████████████████| 100/100 [00:00<00:00, 609637.21it/s]
Generating embeddings: 0it [00:00, ?it/s]

Doesn't work with memgpt quickstart:

(pymemgpt-py3.10) (base) loaner@MacBook-Pro-5 MemGPT-2 % memgpt quickstart
📖 MemGPT configuration file updated!
🧠 model        -> ehartford/dolphin-2.5-mixtral-8x7b
🖥️  endpoint     -> https://api.memgpt.ai
⚡ Run "memgpt run" to create an agent with the new config.
(pymemgpt-py3.10) (base) loaner@MacBook-Pro-5 MemGPT-2 % memgpt load directory --name test_load2 --input-dir memgpt/personas/examples --recursive
LLM is explicitly disabled. Using MockLLM.
LLM is explicitly disabled. Using MockLLM.
Parsing nodes: 100%|███████████████████████████████████████████████████████████████████████████████████| 7/7 [00:00<00:00, 50.50it/s]
Generating embeddings: 100%|███████████████████████████████████████████████████████████████████████| 100/100 [00:10<00:00,  9.56it/s]
  0%|                                                                                                        | 0/100 [00:00<?, ?it/s]
    return callback(**use_params)  # type: ignore
  File "/Users/loaner/dev/MemGPT-2/memgpt/cli/cli_load.py", line 106, in load_directory
    store_docs(name, docs)
  File "/Users/loaner/dev/MemGPT-2/memgpt/cli/cli_load.py", line 48, in store_docs
    len(node.embedding) == config.embedding_dim
AssertionError: Expected embedding dimension 1536, got 4: {'object': 'list', 'data': [{'object': 'embedding', 'embedding': [0.0072218 ...
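
The "got 4" in the assertion above is consistent with `len()` being applied to the whole response dictionary rather than to the vector nested inside it. A minimal sketch of that effect, assuming an OpenAI-style payload: only the `object` and `data` keys are visible in the truncated log, so the `model` and `usage` keys and the shortened vector below are assumptions for illustration.

```python
# Hypothetical payload shaped like the truncated one in the log above.
# Only "object" and "data" appear there; "model" and "usage" are assumed.
response = {
    "object": "list",
    "data": [
        # Vector shortened to two values for illustration.
        {"object": "embedding", "embedding": [0.00722182, -0.028529543]},
    ],
    "model": "BAAI/bge-large-en-v1.5",
    "usage": {"prompt_tokens": 3, "total_tokens": 3},
}

# len() of a dict counts its top-level keys, not the vector's dimensions.
top_level_len = len(response)

# The vector itself sits one level down, under data[0]["embedding"].
vector = response["data"][0]["embedding"]
vector_len = len(vector)

print(top_level_len, vector_len)
```

Under these assumptions, the dimension check only becomes meaningful once the vector has been pulled out of the envelope.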

On PR (with patch):

% memgpt load directory --name test_load --input-dir memgpt/personas/examples --recursive
Parsing nodes: 100%|███████████████████████████████████████████████████████████████████████████████████| 7/7 [00:00<00:00, 51.74it/s]
Generating embeddings: 100%|███████████████████████████████████████████████████████████████████████| 100/100 [00:10<00:00,  9.25it/s]
  0%|                                                                                                        | 0/100 [00:00<?, ?it/s]
Traceback (most recent call last)
...
  File "/Users/loaner/dev/MemGPT-2/memgpt/cli/cli_load.py", line 121, in load_directory
    store_docs(name, docs)
  File "/Users/loaner/dev/MemGPT-2/memgpt/cli/cli_load.py", line 63, in store_docs
    len(node.embedding) == config.embedding_dim
AssertionError: Expected embedding dimension 1536, got 1024: [0.00722182, -0.028529543, 

config for reference:

[defaults]
preset = memgpt_chat
persona = sam_pov
human = basic

[model]
model = ehartford/dolphin-2.5-mixtral-8x7b
model_endpoint = https://api.memgpt.ai
model_endpoint_type = vllm
model_wrapper = chatml
context_window = 32768

[embedding]
embedding_endpoint_type = hugging-face
embedding_endpoint = https://embeddings.memgpt.ai
embedding_model = BAAI/bge-large-en-v1.5
embedding_dim = 1536
embedding_chunk_size = 300

[archival_storage]
type = local

[version]
memgpt_version = 0.2.10

[client]
anon_clientid = ...
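
The config above sets `embedding_dim = 1536`, while the second traceback reports vectors of length 1024 from the endpoint. A sketch of how such a mismatch could be caught before storing documents, using only the standard-library `configparser`; `check_embedding_dim` is a hypothetical helper for illustration, not a MemGPT function.

```python
import configparser

# Trimmed copy of the [embedding] section shown above.
CONFIG_TEXT = """
[embedding]
embedding_endpoint_type = hugging-face
embedding_model = BAAI/bge-large-en-v1.5
embedding_dim = 1536
"""


def check_embedding_dim(config_text: str, observed_dim: int) -> tuple[bool, str]:
    """Hypothetical helper: compare the configured dimension with an observed one."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    configured = parser.getint("embedding", "embedding_dim")
    if configured == observed_dim:
        return True, f"embedding_dim = {configured} matches"
    return False, (
        f"embedding_dim = {configured}, "
        f"but the endpoint returned {observed_dim} values"
    )


# 1024 is the vector length reported in the second traceback.
ok, message = check_embedding_dim(CONFIG_TEXT, observed_dim=1024)
print(ok, message)
```

A check like this would fail early with a message that names both numbers, instead of surfacing as an assertion deep inside `store_docs`.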

How to test

See above.

Have you tested this PR?

  • load works
  • insert banana / search banana test works (quickstart)
  • insert banana / search banana test works (quickstart --backend openai)


@sarahwooders sarahwooders left a comment


lgtm!

@cpacker cpacker merged commit a3e94ae into main Dec 28, 2023
7 checks passed
@cpacker cpacker deleted the patch-load branch December 28, 2023 06:09
norton120 pushed a commit to norton120/MemGPT that referenced this pull request Feb 15, 2024
* patch TEI error in load (now get different error)

* more hiding of MOCKLLM

* fix embedding dim

* refactored bandaid patches into custom embedding class return object patch
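
The last commit message describes moving the fix into the embedding class so that callers always receive a plain vector. One way that design could look, as a sketch only: `NormalizedEmbedder` and the `embed_fn` callable are hypothetical names, not the classes changed in this PR, and the batch-of-one shape `[[...]]` is an assumed second response format.

```python
from typing import Any, Callable, List


class NormalizedEmbedder:
    """Hypothetical wrapper that always returns a flat list of floats."""

    def __init__(self, embed_fn: Callable[[str], Any]):
        # embed_fn stands in for whatever client actually calls the endpoint.
        self._embed_fn = embed_fn

    def embed(self, text: str) -> List[float]:
        raw = self._embed_fn(text)
        # OpenAI-style envelope: {"data": [{"embedding": [...]}], ...}
        if isinstance(raw, dict) and "data" in raw:
            raw = raw["data"][0]["embedding"]
        # Assumed alternative shape: a batch of one, [[...]]
        if isinstance(raw, list) and len(raw) == 1 and isinstance(raw[0], list):
            raw = raw[0]
        if not isinstance(raw, list) or not all(
            isinstance(x, (int, float)) for x in raw
        ):
            raise TypeError(
                f"Unrecognized embedding response of type {type(raw).__name__}"
            )
        return [float(x) for x in raw]


# Usage with a stand-in backend that returns the enveloped form.
def fake_backend(text: str) -> dict:
    return {"object": "list", "data": [{"object": "embedding", "embedding": [0.1, 0.2]}]}


embedder = NormalizedEmbedder(fake_backend)
print(embedder.embed("hello"))
```

Keeping the normalization in one place means call sites such as the dimension check never need to know which response shape the backend produced.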
mattzh72 pushed a commit (same four commit messages as listed above) that referenced this pull request Oct 9, 2024

Successfully merging this pull request may close these issues.

memgpt load directory with hugging-face embeddings fails to parse embed response