
Add caching to BaseChatModel (issue #1644) #5089

Merged: 33 commits merged into langchain-ai:master on Jun 24, 2023

Conversation

UmerHA
Contributor

@UmerHA UmerHA commented May 22, 2023

Add caching to BaseChatModel

Fixes #1644

(Sidenote: While testing, I noticed we have multiple implementations of Fake LLMs, used for testing. I consolidated them.)

Who can review?

Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested:
Models

Twitter: @UmerHAdil | Discord: RicChilligerDude#7589
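For context on what the change enables, here is a minimal usage sketch (illustrative only; the model, prompt, and cache backend are placeholders, and OPENAI_API_KEY is assumed to be set):

import langchain
from langchain.cache import InMemoryCache
from langchain.chat_models import ChatOpenAI
from langchain.schema import HumanMessage

# Enable the global cache, then call the same chat model twice with the
# same messages; the second call should be served from the cache.
langchain.llm_cache = InMemoryCache()
chat = ChatOpenAI(temperature=0)

messages = [HumanMessage(content="Tell me a joke")]
print(chat(messages))  # cache miss: calls the API and stores the result
print(chat(messages))  # cache hit: should return the stored generation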

@dev2049 dev2049 added the 03 enhancement Enhancement of existing functionality label May 22, 2023
@kaikun213
Contributor

Any comments on this?
@hwchase17
@agola11

Would be great to have caching included!

@abdulzain6

Someone please take a look at this. Really need this :) Thanks

UmerHA added 2 commits May 29, 2023 15:32
- Resolved merge conflict
- Implemented general version of _combine_llm_outputs
- Cleaned up
@realjustinwu

Need this too.

@ielmansouri

I hope it's reviewed soon; we need caching for ChatModels!

@Rienkim

Rienkim commented Jun 4, 2023

import time

import langchain
from langchain.cache import SQLiteCache
from langchain.chat_models import ChatOpenAI
from langchain.schema import HumanMessage, SystemMessage

langchain.llm_cache = SQLiteCache(database_path=".langchain.db")
chat = ChatOpenAI(temperature=0, openai_api_key=get_openai_api_key())
messages = [
    SystemMessage(content="You are a helpful assistant that translates English to French."),
    HumanMessage(content="I love programming.")
]
start = time.time()
print(chat(messages))
print(f"first time = {time.time() - start}")
start = time.time()
print(chat(messages))
print(f"second time = {time.time() - start}")

I tested this code with this PR. The first request misses the cache, so it works. But the second request hits the cache, and an error occurs.

cls = <class 'langchain.schema.ChatGeneration'>
values = {'generation_info': None, 'text': "J'adore la programmation."}

    @root_validator
    def set_text(cls, values: Dict[str, Any]) -> Dict[str, Any]:
>       values["text"] = values["message"].content
E       KeyError: 'message'

With InMemoryCache, the test code works fine:

import time

import langchain
from langchain.cache import InMemoryCache
from langchain.chat_models import ChatOpenAI
from langchain.schema import HumanMessage, SystemMessage

langchain.llm_cache = InMemoryCache()
chat = ChatOpenAI(temperature=0, openai_api_key=get_openai_api_key())
messages = [
    SystemMessage(content="You are a helpful assistant that translates English to French."),
    HumanMessage(content="I love programming.")
]
start = time.time()
print(chat(messages))
print(f"first time = {time.time() - start}")
start = time.time()
print(chat(messages))
print(f"second time = {time.time() - start}")

My guess is that InMemoryCache is just a Python dictionary, so it saves the data as the ChatGeneration type. SQLiteCache, however, is a local database, so it saves the data as the Generation type. On a cache hit with SQLiteCache (and the other cache types), the loaded data is a Generation, not a ChatGeneration, so there is no "message" property on it.
A fast(?) solution would be to implement a separate langchain.chat_model cache that saves and loads the ChatGeneration type, but that would mean two cache code paths (llm and chat_model) for every cache implementation.
To solve this properly, BaseCache needs to be modified (I think), but that is complicated.
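To illustrate the diagnosis above (an explanatory sketch, not code from the PR): a text-only backend can only persist and rebuild plain Generation objects, so the message that ChatGeneration requires is lost on the round trip.

from langchain.schema import AIMessage, ChatGeneration, Generation

# A chat model produces ChatGenerations, which carry a full message.
gen = ChatGeneration(message=AIMessage(content="J'adore la programmation."))

# A backend that stores only text (like the SQL-backed caches) effectively
# persists just gen.text ...
cached_text = gen.text

# ... and on a hit rebuilds a plain Generation, which has no "message" field,
# so ChatGeneration's set_text validator fails with KeyError: 'message'.
restored = Generation(text=cached_text)
print(type(restored).__name__)  # Generation, not ChatGeneration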

@UmerHA
Contributor Author

UmerHA commented Jun 5, 2023

Hey @Rienkim, thanks for pointing that out! I'll take a look & add more tests that cover the different caching options.

ETA should be this week. In the meantime, I'll turn this PR into a draft.

@UmerHA UmerHA marked this pull request as draft June 5, 2023 11:25
@UmerHA
Contributor Author

UmerHA commented Jun 6, 2023

@Rienkim Fixed it & added more tests

@UmerHA UmerHA marked this pull request as ready for review June 6, 2023 22:11
@deepblue
Contributor

deepblue commented Jun 6, 2023

@UmerHA thank you for working on this.

I found that the _combine_llm_outputs implemented in this PR can be an issue with OpenAICallbackHandler: https://github.com/hwchase17/langchain/blob/master/langchain/callbacks/openai_info.py#LL99C42-L99C42
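For context (paraphrasing the linked handler, not quoting it): OpenAICallbackHandler reads token counts out of llm_output["token_usage"] as a flat dict of ints, so an llm_output that aggregates values into lists would break that arithmetic. A hypothetical helper to show the expected shape:

from typing import Optional

def total_tokens_from_llm_output(llm_output: Optional[dict]) -> int:
    # Mirrors the shape the callback handler expects: a dict of ints under
    # "token_usage". A list-of-dicts aggregation would not satisfy this.
    if not llm_output or "token_usage" not in llm_output:
        return 0
    return llm_output["token_usage"].get("total_tokens", 0)

print(total_tokens_from_llm_output({"token_usage": {"total_tokens": 42}}))  # 42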

@jakobsa

jakobsa commented Jun 14, 2023

@deepblue could you elaborate on your concerns? I am waiting for the feature :)

@deepblue
Contributor

Just realized I made an error in my code: I used the BaseChatModel._combine_llm_outputs from this PR instead of the default ChatOpenAI._combine_llm_outputs for testing. Retested with the right method and it's all clear. The original method doesn't affect our subclassed code. Apologies for the mix-up and the late response.

I tested and confirmed that it's working as expected

@pors
Contributor

pors commented Jun 15, 2023

@hwchase17
@agola11

Good to go?

@kaikun213
Contributor

@hwchase17
@agola11

Any update on this?


@@ -59,7 +110,24 @@ class Config:
        arbitrary_types_allowed = True

    def _combine_llm_outputs(self, llm_outputs: List[Optional[dict]]) -> dict:
-        return {}
+        """Combine general llm outputs by aggregating them into lists
Contributor

this seems separate? am going to revert this
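For anyone reading along, a rough sketch of the "aggregate into lists" behavior the docstring above describes (inferred from the docstring, not copied from the PR):

from collections import defaultdict
from typing import List, Optional

def combine_llm_outputs(llm_outputs: List[Optional[dict]]) -> dict:
    # Collect every value seen for each key across the per-call llm_output
    # dicts into a list, skipping calls that produced no llm_output.
    combined: defaultdict = defaultdict(list)
    for output in llm_outputs:
        if output:
            for key, value in output.items():
                combined[key].append(value)
    return dict(combined)

print(combine_llm_outputs([{"token_usage": {"total_tokens": 3}}, None]))
# {'token_usage': [{'total_tokens': 3}]}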

            )
        else:
            result = self._generate(messages, stop=stop, **kwargs)
            langchain.llm_cache.update(prompt, llm_string, result.generations)
Contributor

won't we be storing ChatGenerations if _generate creates ChatGenerations? in which case do we need to do extra parsing in 218
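To make the question concrete, here is a minimal, self-contained demo of the lookup/update contract the hunk relies on (the cache key below is illustrative; the PR derives prompt and llm_string itself). Whatever generation type update receives is what lookup hands back, which is why each backend's storage format matters:

import langchain
from langchain.cache import InMemoryCache
from langchain.schema import AIMessage, ChatGeneration

langchain.llm_cache = InMemoryCache()

# Illustrative cache key; the real one is built from the messages and model params.
prompt, llm_string = "[HumanMessage(content='hi')]", "chat-model-params"

print(langchain.llm_cache.lookup(prompt, llm_string))  # None: cache miss
langchain.llm_cache.update(
    prompt, llm_string, [ChatGeneration(message=AIMessage(content="hello"))]
)
print(langchain.llm_cache.lookup(prompt, llm_string))  # the stored ChatGeneration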

@@ -14,7 +14,7 @@ from langchain.cache import InMemoryCache
langchain.llm_cache = InMemoryCache()

# The first time, it is not yet in cache, so it should take longer
llm("Tell me a joke")
llm.predict("Tell me a joke")
Collaborator

Why is this changing?

Contributor

to use the consistent predict/predict_messages interface

@@ -163,7 +179,7 @@ def update(self, prompt: str, llm_string: str, return_val: RETURN_VAL_TYPE) -> None:
    def clear(self, **kwargs: Any) -> None:
        """Clear cache."""
        with Session(self.engine) as session:
-            session.execute(self.cache_schema.delete())
+            session.query(self.cache_schema).delete()
Collaborator

Why is this changing?

Contributor

yeah not sure, from previous pr, ill revert

@hwchase17 hwchase17 merged commit 068142f into langchain-ai:master Jun 24, 2023
@ghost

ghost commented Jul 11, 2023

Getting the below error when I use MomentoCache. @UmerHA please let me know if this is a bug or if I am doing anything wrong.

Edit: The error pops up for all calls after the cache has at least 1 key set.

@root_validator
    def set_text(cls, values: Dict[str, Any]) -> Dict[str, Any]:
>       values["text"] = values["message"].content
E       KeyError: 'message'

Code:

import langchain
from datetime import timedelta
from langchain.cache import MomentoCache

langchain.llm_cache = MomentoCache.from_client_params("langchain_momento", timedelta(days=1))

# Further code for constructing and calling the chain using ChatOpenAI

@UmerHA
Contributor Author

UmerHA commented Jul 11, 2023

Getting the below error when I use MomentoCache. @UmerHA please let me know if this is a bug or if I am doing anything wrong.

Edit: The error pops up for all calls after the cache has at least 1 key set.

@root_validator
    def set_text(cls, values: Dict[str, Any]) -> Dict[str, Any]:
>       values["text"] = values["message"].content
E       KeyError: 'message'

Code:

import langchain
from datetime import timedelta
from langchain.cache import MomentoCache

langchain.llm_cache = MomentoCache.from_client_params("langchain_momento", timedelta(days=1))

# Further code for constructing and calling the chain using ChatOpenAI

Can you post the full code, error message, and stack trace?

Successfully merging this pull request may close these issues.

Add caching support to BaseChatModel