more query analysis docs (#18358)
hwchase17 authored Mar 2, 2024
1 parent f96dd57 commit bc768a1
Showing 10 changed files with 1,797 additions and 29 deletions.
190 changes: 190 additions & 0 deletions docs/docs/use_cases/query_analysis/how_to/constructing-filters.ipynb
@@ -0,0 +1,190 @@
{
"cells": [
{
"cell_type": "raw",
"id": "df7d42b9-58a6-434c-a2d7-0b61142f6d3e",
"metadata": {},
"source": [
"---\n",
"sidebar_position: 6\n",
"---"
]
},
{
"cell_type": "markdown",
"id": "f2195672-0cab-4967-ba8a-c6544635547d",
"metadata": {},
"source": [
"# Construct Filters\n",
"\n",
"Query analysis is often used to extract filters that are then passed to a retriever. One way to have the LLM represent these filters is as a Pydantic model. That model must then be converted into a filter the retriever accepts. \n",
"\n",
"This conversion can be done manually, but LangChain also provides \"Translators\" that translate from a common syntax into the filter format specific to each retriever. Here, we cover how to use those translators."
]
},
{
"cell_type": "code",
"execution_count": 13,
"id": "8ca446a0",
"metadata": {},
"outputs": [],
"source": [
"from typing import Optional\n",
"\n",
"from langchain.chains.query_constructor.ir import (\n",
"    Comparator,\n",
"    Comparison,\n",
"    Operation,\n",
"    Operator,\n",
"    StructuredQuery,\n",
")\n",
"from langchain.retrievers.self_query.chroma import ChromaTranslator\n",
"from langchain.retrievers.self_query.elasticsearch import ElasticsearchTranslator\n",
"from langchain_core.pydantic_v1 import BaseModel"
]
},
{
"cell_type": "markdown",
"id": "bc1302ff",
"metadata": {},
"source": [
"In this example, `start_year` and `author` are both attributes to filter on."
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "64055006",
"metadata": {},
"outputs": [],
"source": [
"class Search(BaseModel):\n",
"    query: str\n",
"    start_year: Optional[int]\n",
"    author: Optional[str]"
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "44eb6d98",
"metadata": {},
"outputs": [],
"source": [
"search_query = Search(query=\"RAG\", start_year=2022, author=\"LangChain\")"
]
},
{
"cell_type": "code",
"execution_count": 15,
"id": "e8ba6705",
"metadata": {},
"outputs": [],
"source": [
"def construct_comparisons(query: Search):\n",
"    comparisons = []\n",
"    if query.start_year is not None:\n",
"        comparisons.append(\n",
"            Comparison(\n",
"                comparator=Comparator.GT,\n",
"                attribute=\"start_year\",\n",
"                value=query.start_year,\n",
"            )\n",
"        )\n",
"    if query.author is not None:\n",
"        comparisons.append(\n",
"            Comparison(\n",
"                comparator=Comparator.EQ,\n",
"                attribute=\"author\",\n",
"                value=query.author,\n",
"            )\n",
"        )\n",
"    return comparisons"
]
},
{
"cell_type": "code",
"execution_count": 16,
"id": "6a79c9da",
"metadata": {},
"outputs": [],
"source": [
"comparisons = construct_comparisons(search_query)"
]
},
{
"cell_type": "code",
"execution_count": 17,
"id": "2d0e9689",
"metadata": {},
"outputs": [],
"source": [
"_filter = Operation(operator=Operator.AND, arguments=comparisons)"
]
},
{
"cell_type": "code",
"execution_count": 18,
"id": "e4c0b2ce",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'bool': {'must': [{'range': {'metadata.start_year': {'gt': 2022}}},\n",
" {'term': {'metadata.author.keyword': 'LangChain'}}]}}"
]
},
"execution_count": 18,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"ElasticsearchTranslator().visit_operation(_filter)"
]
},
{
"cell_type": "code",
"execution_count": 19,
"id": "d75455ae",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'$and': [{'start_year': {'$gt': 2022}}, {'author': {'$eq': 'LangChain'}}]}"
]
},
"execution_count": 19,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"ChromaTranslator().visit_operation(_filter)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.1"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
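Editorial sketch (not part of the commit): the notebook above builds a list of comparisons, wraps them in an AND operation, and lets a translator emit a backend-specific filter. The block below illustrates that idea in plain Python without importing LangChain. `SimpleComparison`, `SimpleOperation`, `build_filter`, `to_chroma_style`, and `to_elasticsearch_style` are hypothetical names invented for this illustration; they are not LangChain classes or functions, and the real translators support more comparators and operators than this sketch does.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


# Hypothetical stand-ins for the notebook's Comparison and Operation objects.
# These are NOT the LangChain classes, only a sketch of the same idea.
@dataclass
class SimpleComparison:
    comparator: str  # only "gt" and "eq" are handled in this sketch
    attribute: str
    value: Any


@dataclass
class SimpleOperation:
    operator: str  # only "and" is handled in this sketch
    arguments: List[SimpleComparison]


def build_filter(
    start_year: Optional[int], author: Optional[str]
) -> Optional[SimpleOperation]:
    """Add one comparison per populated field; return None if there is nothing to filter on."""
    comparisons = []
    if start_year is not None:
        comparisons.append(SimpleComparison("gt", "start_year", start_year))
    if author is not None:
        comparisons.append(SimpleComparison("eq", "author", author))
    if not comparisons:
        return None
    return SimpleOperation("and", comparisons)


def to_chroma_style(op: SimpleOperation) -> Dict[str, Any]:
    """Emit a filter in the '$'-prefixed shape shown in the Chroma output above."""
    return {
        "$" + op.operator: [
            {c.attribute: {"$" + c.comparator: c.value}} for c in op.arguments
        ]
    }


def to_elasticsearch_style(op: SimpleOperation) -> Dict[str, Any]:
    """Emit a filter in the bool/must shape shown in the Elasticsearch output above."""
    if op.operator != "and":
        raise ValueError("this sketch only maps 'and' to 'must'")
    clauses = []
    for c in op.arguments:
        if c.comparator == "eq":
            # Exact match on the keyword sub-field of the metadata attribute.
            clauses.append({"term": {f"metadata.{c.attribute}.keyword": c.value}})
        else:
            clauses.append({"range": {f"metadata.{c.attribute}": {c.comparator: c.value}}})
    return {"bool": {"must": clauses}}


common_filter = build_filter(start_year=2022, author="LangChain")
print(to_chroma_style(common_filter))
print(to_elasticsearch_style(common_filter))
```

One design choice in this sketch is that `build_filter` returns `None` when no field is populated, so a caller can skip filtering instead of passing an empty AND operation to a backend. Whether that is the right behavior for a given retriever is an assumption to check against its documentation.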
@@ -15,9 +15,9 @@
"id": "f2195672-0cab-4967-ba8a-c6544635547d",
"metadata": {},
"source": [
"# Adding examples to the prompt\n",
"# Add Examples to the Prompt\n",
"\n",
"As our query analysis becomes more complex, adding examples to the prompt can meaningfully improve performance.\n",
"As our query analysis becomes more complex, the LLM may struggle to understand how exactly it should respond in certain scenarios. In order to improve performance here, we can add examples to the prompt to guide the LLM.\n",
"\n",
"Let's take a look at how we can add examples for the LangChain YouTube video query analyzer we built in the [Quickstart](/docs/use_cases/query_analysis/quickstart)."
]
@@ -377,7 +377,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
"version": "3.10.1"
}
},
"nbformat": 4,