Add caching to BaseChatModel (issue #1644) #5089
Conversation
…A/langchain into 1644-BaseChatModel-Caching
Any comments on this? Would be great to have caching included!
Someone please take a look at this. Really need this :) Thanks
- Resolved merge conflict
- Implemented general version of _combine_llm_outputs
- Cleaned up
Need this too.
I hope it's reviewed soon, we need caching for ChatModels!
import time

import langchain
from langchain.cache import InMemoryCache, SQLiteCache
from langchain.chat_models import ChatOpenAI
from langchain.schema import HumanMessage, SystemMessage

langchain.llm_cache = SQLiteCache(database_path=".langchain.db")
chat = ChatOpenAI(temperature=0, openai_api_key=get_openai_api_key())  # get_openai_api_key() is a helper defined elsewhere in my code
messages = [
    SystemMessage(content="You are a helpful assistant that translates English to French."),
    HumanMessage(content="I love programming."),
]
start = time.time()
print(chat(messages))
print(f"first time = {time.time() - start}")
start = time.time()
print(chat(messages))
print(f"second time = {time.time() - start}")

I tested this code with this PR. The first request misses the cache, so it works. But the second request hits the cache, and an error occurs.
With InMemoryCache, the test code works fine:

langchain.llm_cache = InMemoryCache()
chat = ChatOpenAI(temperature=0, openai_api_key=get_openai_api_key())
messages = [
    SystemMessage(content="You are a helpful assistant that translates English to French."),
    HumanMessage(content="I love programming."),
]
start = time.time()
print(chat(messages))
print(f"first time = {time.time() - start}")
start = time.time()
print(chat(messages))
print(f"second time = {time.time() - start}")

My guess is that InMemoryCache is just a Python dictionary, so it stores the data as ChatGeneration objects. However, SQLiteCache is a local database, so it stores the data as Generation objects. On a cache hit with SQLiteCache (and the other cache types), the loaded data is of type Generation, not ChatGeneration, so there is no "message" property on the loaded data.
Hey @Rienkim, thanks for pointing that out! I'll take a look & add tests that cover more of the caching options. ETA should be this week. In the meantime, I'll turn this PR into a draft.
@Rienkim Fixed it & added more tests
@UmerHA thank you for working on this. I found that
@deepblue could you elaborate on your concerns? I am waiting for the feature :)
Just realized I made an error in my code. I tested and confirmed that it's working as expected.
Good to go?
Any update on this?
langchain/chat_models/base.py
@@ -59,7 +110,24 @@ class Config:
        arbitrary_types_allowed = True

    def _combine_llm_outputs(self, llm_outputs: List[Optional[dict]]) -> dict:
        return {}
        """Combine general llm outputs by aggregating them into lists
this seems separate? am going to revert this
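For readers without the full diff, a rough sketch of what a list-aggregating _combine_llm_outputs could look like (my reconstruction from the docstring above, not the PR's actual code):

from typing import Any, Dict, List, Optional

def _combine_llm_outputs(self, llm_outputs: List[Optional[dict]]) -> dict:
    # Hypothetical generalization: collect each key's values across all outputs into a list.
    combined: Dict[str, List[Any]] = {}
    for output in llm_outputs:
        if not output:
            continue
        for key, value in output.items():
            combined.setdefault(key, []).append(value)
    return combined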
            )
        else:
            result = self._generate(messages, stop=stop, **kwargs)
            langchain.llm_cache.update(prompt, llm_string, result.generations)
Won't we be storing ChatGenerations if _generate creates ChatGenerations? In which case, do we need to do the extra parsing in line 218?
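As a side check, the round trip can be exercised in isolation; a small sketch (assumed usage, with stand-in strings for the serialized prompt and model params) showing that InMemoryCache returns exactly what was stored:

from langchain.cache import InMemoryCache
from langchain.schema import AIMessage, ChatGeneration

cache = InMemoryCache()
prompt = "[SystemMessage(...), HumanMessage(...)]"  # stand-in for the serialized messages
llm_string = "openai-chat temperature=0"            # stand-in for the serialized model params

# Store ChatGenerations and read them back: in memory, the type survives the round trip.
cache.update(prompt, llm_string, [ChatGeneration(message=AIMessage(content="Bonjour"))])
hit = cache.lookup(prompt, llm_string)
print(type(hit[0]).__name__)  # ChatGeneration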
@@ -14,7 +14,7 @@ from langchain.cache import InMemoryCache
langchain.llm_cache = InMemoryCache()

# The first time, it is not yet in cache, so it should take longer
llm("Tell me a joke")
llm.predict("Tell me a joke")
Why is this changing?
to use the consistent predict/predict_messages interface
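For context, the two call styles side by side (illustrative only; assumes an OpenAI key is configured in the environment):

from langchain.llms import OpenAI

llm = OpenAI(temperature=0)
llm("Tell me a joke")          # older __call__ style the example used before
llm.predict("Tell me a joke")  # predict style the example was switched to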
@@ -163,7 +179,7 @@ def update(self, prompt: str, llm_string: str, return_val: RETURN_VAL_TYPE) -> None:
    def clear(self, **kwargs: Any) -> None:
        """Clear cache."""
        with Session(self.engine) as session:
            session.execute(self.cache_schema.delete())
            session.query(self.cache_schema).delete()
Why is this changing?
Yeah, not sure; it's from a previous PR. I'll revert.
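For reference, a small standalone sketch (the table and engine here are mine, not langchain's actual cache schema) of the two SQLAlchemy delete styles being compared:

from sqlalchemy import Column, String, create_engine, delete
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()

class FakeCache(Base):  # stand-in for the real cache_schema
    __tablename__ = "fake_cache"
    prompt = Column(String, primary_key=True)

engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.execute(delete(FakeCache))  # core-style delete of all rows
    session.query(FakeCache).delete()   # ORM query-style delete of all rows
    session.commit()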
Getting the below error when I use MomentoCache. Edit: The error pops up for all calls after the cache has at least 1 key set.

@root_validator
def set_text(cls, values: Dict[str, Any]) -> Dict[str, Any]:
>       values["text"] = values["message"].content
E       KeyError: 'message'

Code:

import langchain
from datetime import timedelta
from langchain.cache import MomentoCache

langchain.llm_cache = MomentoCache.from_client_params("langchain_momento", timedelta(days=1))
# Further code for constructing and calling the chain using ChatOpenAI
Can you post the full code, error message, and stack trace?
Add caching to BaseChatModel
Fixes #1644
(Sidenote: While testing, I noticed we have multiple implementations of Fake LLMs used for testing. I consolidated them.)
Who can review?
Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested:
Models
Twitter: @UmerHAdil | Discord: RicChilligerDude#7589