Merge pull request #421 from magicyuan876/main
Optimize the embedding similarity cache mechanism: add an LLM similarity check and improve caching
LarFii authored Dec 9, 2024
2 parents 67c4acb + 779ed60 commit 2644a29
Showing 6 changed files with 205 additions and 276 deletions.
README.md (1 addition & 5 deletions)
@@ -596,11 +596,7 @@ if __name__ == "__main__":
 | **enable\_llm\_cache** | `bool` | If `TRUE`, stores LLM results in cache; repeated prompts return cached responses | `TRUE` |
 | **addon\_params** | `dict` | Additional parameters, e.g., `{"example_number": 1, "language": "Simplified Chinese"}`: sets example limit and output language | `example_number: all examples, language: English` |
 | **convert\_response\_to\_json\_func** | `callable` | Not used | `convert_response_to_json` |
-| **embedding\_cache\_config** | `dict` | Configuration for question-answer caching. Contains two parameters:
-- `enabled`: Boolean value to enable/disable caching functionality. When enabled, questions and answers will be cached.
-- `similarity_threshold`: Float value (0-1), similarity threshold. When a new question's similarity with a cached question exceeds this threshold, the cached answer will be returned directly without calling the LLM.
-
-Default: `{"enabled": False, "similarity_threshold": 0.95}` | `{"enabled": False, "similarity_threshold": 0.95}` |
+| **embedding\_cache\_config** | `dict` | Configuration for question-answer caching. Contains three parameters:<br>- `enabled`: Boolean value to enable/disable cache lookup functionality. When enabled, the system will check cached responses before generating new answers.<br>- `similarity_threshold`: Float value (0-1), similarity threshold. When a new question's similarity with a cached question exceeds this threshold, the cached answer will be returned directly without calling the LLM.<br>- `use_llm_check`: Boolean value to enable/disable LLM similarity verification. When enabled, LLM will be used as a secondary check to verify the similarity between questions before returning cached answers. | Default: `{"enabled": False, "similarity_threshold": 0.95, "use_llm_check": False}` |
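
For context, the option added above might be used like this; a minimal usage sketch, assuming only the constructor arguments visible in this diff (`"./rag_storage"` is a placeholder working directory, not from the PR):

```python
from lightrag import LightRAG

# Minimal usage sketch (not part of this diff): enable the question-answer
# cache and the new LLM similarity verification.
rag = LightRAG(
    working_dir="./rag_storage",
    embedding_cache_config={
        "enabled": True,               # check the cache before generating a new answer
        "similarity_threshold": 0.95,  # minimum similarity required for a cache hit
        "use_llm_check": True,         # secondary LLM check before returning a cached answer
    },
)
```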

## API Server Implementation

lightrag/lightrag.py (7 additions & 2 deletions)
@@ -87,7 +87,11 @@ class LightRAG:
     )
     # Default not to use embedding cache
     embedding_cache_config: dict = field(
-        default_factory=lambda: {"enabled": False, "similarity_threshold": 0.95}
+        default_factory=lambda: {
+            "enabled": False,
+            "similarity_threshold": 0.95,
+            "use_llm_check": False,
+        }
     )
     kv_storage: str = field(default="JsonKVStorage")
     vector_storage: str = field(default="NanoVectorDBStorage")
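
The field above only carries configuration; the lookup it controls works roughly as follows. A minimal sketch, assuming cosine similarity between the new question's embedding and cached question embeddings (the helper name and cache shape are illustrative, not LightRAG's internal API):

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def lookup_cached_answer(query_emb: np.ndarray, cache: list, threshold: float = 0.95):
    """Return the cached (question, answer) whose question embedding clears
    the similarity threshold, else None.

    `cache` holds (question, question_embedding, answer) tuples; this shape
    is illustrative only.
    """
    best_item, best_score = None, -1.0
    for question, emb, answer in cache:
        score = cosine_similarity(query_emb, emb)
        if score > best_score:
            best_item, best_score = (question, answer), score
    if best_item is not None and best_score >= threshold:
        return best_item  # cache hit: answer can be returned without calling the LLM
    return None
```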
@@ -174,7 +178,6 @@ def __post_init__(self):
             if self.enable_llm_cache
             else None
         )
-
         self.embedding_func = limit_async_func_call(self.embedding_func_max_async)(
             self.embedding_func
         )
@@ -481,6 +484,7 @@ async def aquery(self, query: str, param: QueryParam = QueryParam()):
                 self.text_chunks,
                 param,
                 asdict(self),
+                hashing_kv=self.llm_response_cache,
             )
         elif param.mode == "naive":
             response = await naive_query(
@@ -489,6 +493,7 @@ async def aquery(self, query: str, param: QueryParam = QueryParam()):
                 self.text_chunks,
                 param,
                 asdict(self),
+                hashing_kv=self.llm_response_cache,
             )
         else:
             raise ValueError(f"Unknown mode {param.mode}")
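
Per the README entry above, `use_llm_check` adds a secondary verification gate before a cached answer is returned. A rough sketch of that gate, where `llm_func` is any async prompt-to-string callable and the prompt wording is an assumption, not the exact code shipped in this PR:

```python
async def llm_similarity_check(llm_func, new_question: str, cached_question: str) -> bool:
    """Ask the LLM to confirm that two questions ask for the same information.

    `llm_func` and the prompt below are illustrative; they are not the exact
    code added by this PR.
    """
    prompt = (
        "Do the following two questions ask for the same information? "
        "Answer only 'yes' or 'no'.\n"
        f"Question 1: {new_question}\n"
        f"Question 2: {cached_question}"
    )
    response = await llm_func(prompt)
    return response.strip().lower().startswith("yes")
```

Only when such a check passes would the cached answer be returned; otherwise the query falls through to normal generation.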
