Skip to content

Commit

Permalink
community[patch]: Fixing incorrect base URLs for Azure Cognitive Sear…
Browse files Browse the repository at this point in the history
…ch Retriever (langchain-ai#19352)

This PR adds code to make sure that the correct base URL is being
created for the Azure Cognitive Search retriever. At the moment an
incorrect base URL is being generated. I think this is happening because
the original code was based on a depreciated API version. No
dependencies need to be added. I've also added more context to the test
doc strings.

I should also note that ACS is now Azure AI Search. I will open a
separate PR to make these changes as that would be a breaking change and
should potentially be discussed.

Twitter: @marlene_zw



- No new tests added, however the current ACS retriever tests are now
passing when I run them.
- Code was linted.

Co-authored-by: Bagatur <[email protected]>
  • Loading branch information
2 people authored and gkorland committed Mar 30, 2024
1 parent d8b35f2 commit e5e9515
Show file tree
Hide file tree
Showing 2 changed files with 24 additions and 8 deletions.
Original file line number Diff line number Diff line change
Expand Up @@ -28,7 +28,7 @@ class AzureCognitiveSearchRetriever(BaseRetriever):
api_key: str = ""
"""API Key. Both Admin and Query keys work, but for reading data it's
recommended to use a Query key."""
api_version: str = "2020-06-30"
api_version: str = "2023-11-01"
"""API version"""
aiosession: Optional[aiohttp.ClientSession] = None
"""ClientSession, in case we want to reuse connection for better performance."""
Expand Down Expand Up @@ -59,7 +59,14 @@ def _build_search_url(self, query: str) -> str:
url_suffix = get_from_env(
"", "AZURE_COGNITIVE_SEARCH_URL_SUFFIX", DEFAULT_URL_SUFFIX
)
base_url = f"https://{self.service_name}.{url_suffix}/"
if url_suffix in self.service_name and "https://" in self.service_name:
base_url = f"{self.service_name}/"
elif url_suffix in self.service_name and "https://" not in self.service_name:
base_url = f"https://{self.service_name}/"
elif url_suffix not in self.service_name and "https://" in self.service_name:
base_url = f"{self.service_name}.{url_suffix}/"
else:
base_url = self.service_name
endpoint_path = f"indexes/{self.index_name}/docs?api-version={self.api_version}"
top_param = f"&$top={self.top_k}" if self.top_k else ""
return base_url + endpoint_path + f"&search={query}" + top_param
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -7,22 +7,31 @@


def test_azure_cognitive_search_get_relevant_documents() -> None:
"""Test valid call to Azure Cognitive Search."""
"""Test valid call to Azure Cognitive Search.
In order to run this test, you should provide a service name, azure search api key
and an index_name as arguments for the AzureCognitiveSearchRetriever in both tests.
"""
retriever = AzureCognitiveSearchRetriever()
documents = retriever.get_relevant_documents("what is langchain")

documents = retriever.get_relevant_documents("what is langchain?")
for doc in documents:
assert isinstance(doc, Document)
assert doc.page_content

retriever = AzureCognitiveSearchRetriever(top_k=1)
documents = retriever.get_relevant_documents("what is langchain")
retriever = AzureCognitiveSearchRetriever()
documents = retriever.get_relevant_documents("what is langchain?")
assert len(documents) <= 1


async def test_azure_cognitive_search_aget_relevant_documents() -> None:
"""Test valid async call to Azure Cognitive Search."""
"""Test valid async call to Azure Cognitive Search.
In order to run this test, you should provide a service name, azure search api key
and an index_name as arguments for the AzureCognitiveSearchRetriever.
"""
retriever = AzureCognitiveSearchRetriever()
documents = await retriever.aget_relevant_documents("what is langchain")
documents = await retriever.aget_relevant_documents("what is langchain?")
for doc in documents:
assert isinstance(doc, Document)
assert doc.page_content

0 comments on commit e5e9515

Please sign in to comment.