async mongo document loader #4285

# Fix Homepage Typo ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested... not sure

@eyurtsev

) # Docs and code review fixes for Docugami DataLoader 1. I noticed a couple of hyperlinks that are not loading in the langchain docs (I guess need explicit anchor tags). Added those. 2. In code review @eyurtsev had a [suggestion](langchain-ai#4727 (comment)) to allow string paths. Turns out just updating the type works (I tested locally with string paths). # Pre-submission checks I ran `make lint` and `make tests` successfully. --------- Co-authored-by: Taqi Jaffri <[email protected]>

# Update deployments doc with langcorn API server API server example ```python from fastapi import FastAPI from langcorn import create_service app: FastAPI = create_service( "examples.ex1:chain", "examples.ex2:chain", "examples.ex3:chain", "examples.ex4:sequential_chain", "examples.ex5:conversation", "examples.ex6:conversation_with_summary", ) ``` More examples: https://github.com/msoedov/langcorn/tree/main/examples Co-authored-by: Dev 2049 <[email protected]>

…angchain-ai#2361) It's currently not possible to change the `TEMPLATE_TOOL_RESPONSE` prompt for ConversationalChatAgent, this PR changes that. --------- Co-authored-by: Dev 2049 <[email protected]>

Co-authored-by: Ali Mirlou <[email protected]>

# Add generic document loader * This PR adds a generic document loader which can assemble a loader from a blob loader and a parser * Adds a registry for parsers * Populate registry with a default mimetype based parser ## Expected changes - Parsing involves loading content via IO so can be sped up via: * Threading in sync * Async - The actual parsing logic may be computatinoally involved: may need to figure out to add multi-processing support - May want to add suffix based parser since suffixes are easier to specify in comparison to mime types ## Before submitting No notebooks yet, we first need to get a few of the basic parsers up (prior to advertising the interface)

# Add bs4 html parser * Some minor refactors * Extract the bs4 html parsing code from the bs html loader * Move some tests from integration tests to unit tests

Co-authored-by: BenSchZA <[email protected]>

Co-authored-by: Daniel Chalef <[email protected]> Co-authored-by: Daniel Chalef <[email protected]>

@hwchase17

…langchain-ai#4389) # Documentation for Azure OpenAI embeddings model - OPENAI_API_VERSION environment variable is needed for the endpoint - The constructor does not work with model, it works with deployment. I fixed it in the notebook. (This is my first contribution) ## Who can review? @hwchase17 @agola Co-authored-by: Harrison Chase <[email protected]>

# Added another helpful way for developers who want to set OpenAI API Key dynamically Previous methods like exporting environment variables are good for project-wide settings. But many use cases need to assign API keys dynamically, recently. ```python from langchain.llms import OpenAI llm = OpenAI(openai_api_key="OPENAI_API_KEY") ``` ## Before submitting ```bash export OPENAI_API_KEY="..." ``` Or, ```python import os os.environ["OPENAI_API_KEY"] = "..." ``` <hr> Thank you. Cheers, Bongsang

@hwchase17

#docs: text splitters improvements Changes are only in the Jupyter notebooks. - added links to the source packages and a short description of these packages - removed " Text Splitters" suffixes from the TOC elements (they made the list of the text splitters messy) - moved text splitters, based on the length function into a separate list. They can be mixed with any classes from the "Text Splitters", so it is a different classification. ## Who can review? @hwchase17 - project lead @eyurtsev @vowelparrot NOTE: please, check out the results of the `Python code` text splitter example (text_splitters/examples/python.ipynb). It looks suboptimal.

Co-authored-by: Jerry Luan <[email protected]>

Co-authored-by: Jiaxin Shan <[email protected]>

Co-authored-by: Matthias Samwald <[email protected]>

@eyurtsev

…angchain-ai#4926) # Load specific file types from Google Drive (issue langchain-ai#4878) Add the possibility to define what file types you want to load from Google Drive. ``` loader = GoogleDriveLoader( folder_id="1yucgL9WGgWZdM1TOuKkeghlPizuzMYb5", file_types=["document", "pdf"] recursive=False ) ``` Fixes #langchain-ai#4878 ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: DataLoaders - @eyurtsev Twitter: [@UmerHAdil](https://twitter.com/@UmerHAdil) | Discord: RicChilligerDude#7589 --------- Co-authored-by: UmerHA <[email protected]>

# API update: Engines -> Models see: https://community.openai.com/t/api-update-engines-models/18597 Co-authored-by: assert <[email protected]>

@eyurtsev

…exceptions (langchain-ai#4927) # TextLoader auto detect encoding and enhanced exception handling - Add an option to enable encoding detection on `TextLoader`. - The detection is done using `chardet` - The loading is done by trying all detected encodings by order of confidence or raise an exception otherwise. ### New Dependencies: - `chardet` Fixes langchain-ai#4479 ## Before submitting  ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: - @eyurtsev --------- Co-authored-by: blob42 <spike@w530>

@vowelparrot

# Fix bilibili api import error bilibili-api package is depracated and there is no sync module.   Fixes langchain-ai#2673 langchain-ai#2724 ## Before submitting  ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: @vowelparrot @liaokongVFX

@vowelparrot

…ngchain-ai#4542) # Add human message as input variable to chat agent prompt creation This PR adds human message and system message input to `CHAT_ZERO_SHOT_REACT_DESCRIPTION` agent, similar to [conversational chat agent](https://github.com/hwchase17/langchain/blob/7bcf238a1acf40aef21a5a198cf0e62d76f93c15/langchain/agents/conversational_chat/base.py#L64-L71). I met this issue trying to use `create_prompt` function when using the [BabyAGI agent with tools notebook](https://python.langchain.com/en/latest/use_cases/autonomous_agents/baby_agi_with_agent.html), since BabyAGI uses “task” instead of “input” input variable. For normal zero shot react agent this is fine because I can manually change the suffix to “{input}/n/n{agent_scratchpad}” just like the notebook, but I cannot do this with conversational chat agent, therefore blocking me to use BabyAGI with chat zero shot agent. I tested this in my own project [Chrome-GPT](https://github.com/richardyc/Chrome-GPT) and this fix worked. ## Request for review Agents / Tools / Toolkits - @vowelparrot

Co-authored-by: Dev 2049 <[email protected]>

this makes it so we dont throw errors when importing langchain when sqlalchemy==1.3.1 we dont really want to support 1.3.1 (seems like unneccessary maintance cost) BUT we would like it to not terribly error should someone decide to run on it

# Docs: compound ecosystem and integrations **Problem statement:** We have a big overlap between the References/Integrations and Ecosystem/LongChain Ecosystem pages. It confuses users. It creates a situation when new integration is added only on one of these pages, which creates even more confusion. - removed References/Integrations page (but move all its information into the individual integration pages - in the next PR). - renamed Ecosystem/LongChain Ecosystem into Integrations/Integrations. I like the Ecosystem term. It is more generic and semantically richer than the Integration term. But it mentally overloads users. The `integration` term is more concrete. UPDATE: after discussion, the Ecosystem is the term. Ecosystem/Integrations is the page (in place of Ecosystem/LongChain Ecosystem). As a result, a user gets a single place to start with the individual integration.

# Update GPT4ALL integration GPT4ALL have completely changed their bindings. They use a bit odd implementation that doesn't fit well into base.py and it will probably be changed again, so it's a temporary solution. Fixes langchain-ai#3839, langchain-ai#4628

_get_gptcache method keep creating new gptcache instance, here's the fix # Fix GPTCache cache_obj creation loop Fixes langchain-ai#4830 Co-authored-by: Dev 2049 <[email protected]>

@tylerhutcherson

cc @tylerhutcherson

# docs: updated `Supabase` notebook - the title of the notebook was inconsistent (included redundant "Vectorstore"). Removed this "Vectorstore" - added `Postgress` to the title. It is important. The `Postgres` name is much more popular than `Supabase`. - added description for the `Postrgress` - added more info to the `Supabase` description

…angchain-ai#4938) Correct typo in APIChain example notebook (Farenheit -> Fahrenheit)

Updated the docs from "An agent consists of three parts:" to "An agent consists of two parts:" since there are only two parts in the documentation

# docs: added `ecosystem/dependents` page Added `ecosystem/dependents` page. Can we propose a better page name?

# docs: vectorstores, different updates and fixes Multiple updates: - added/improved descriptions - fixed header levels - added headers - fixed headers

the output parser form chat conversational agent now raises `OutputParserException` like the rest. The `raise OutputParserExeption(...) from e` form also carries through the original error details on what went wrong. I added the `ValueError` as a base class to `OutputParserException` to avoid breaking code that was relying on `ValueError` as a way to catch exceptions from the agent. So catching ValuError still works. Not sure if this is a good idea though ?

# Zep Retriever - Vector Search Over Chat History with the Zep Long-term Memory Service More on Zep: https://github.com/getzep/zep Note: This PR is related to and relies on langchain-ai#4834. I did not want to modify the `pyproject.toml` file to add the `zep-python` dependency a second time. Co-authored-by: Daniel Chalef <[email protected]>

The Anthropic classes used `BaseLanguageModel.get_num_tokens` because of an issue with multiple inheritance. Fixed by moving the method from `_AnthropicCommon` to both its subclasses. This change will significantly speed up token counting for Anthropic users.

… above. (langchain-ai#2675) Co-authored-by: Dev 2049 <[email protected]> Co-authored-by: Davis Chase <[email protected]>

@hwchase17

…n-ai#4761) - simplify the validation check a little bit. - re-tested in jupyter notebook. Reviewer: @hwchase17

…hain-ai#4747) # Fixes syntax for setting Snowflake database search_path An error occurs when using a Snowflake database and providing a schema argument. I have updated the syntax to run a Snowflake specific query when the database dialect is 'snowflake'.

Co-authored-by: Jan Minar <[email protected]>

@hwchase17

# Add Spark SQL support * Add Spark SQL support. It can connect to Spark via building a local/remote SparkSession. * Include a notebook example I tried some complicated queries (window function, table joins), and the tool works well. Compared to the [Spark Dataframe agent](https://python.langchain.com/en/latest/modules/agents/toolkits/examples/spark.html), this tool is able to generate queries across multiple tables. --------- # Your PR Title (What it does)   Fixes # (issue) ## Before submitting  ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested:  --------- Co-authored-by: Gengliang Wang <[email protected]> Co-authored-by: Mike W <[email protected]> Co-authored-by: Eugene Yurtsev <[email protected]> Co-authored-by: UmerHA <[email protected]> Co-authored-by: 张城铭 <[email protected]> Co-authored-by: assert <[email protected]> Co-authored-by: blob42 <spike@w530> Co-authored-by: Yuekai Zhang <[email protected]> Co-authored-by: Richard He <[email protected]> Co-authored-by: Dev 2049 <[email protected]> Co-authored-by: Leonid Ganeline <[email protected]> Co-authored-by: Alexey Nominas <[email protected]> Co-authored-by: elBarkey <[email protected]> Co-authored-by: Davis Chase <[email protected]> Co-authored-by: Jeffrey D <[email protected]> Co-authored-by: so2liu <[email protected]> Co-authored-by: Viswanadh Rayavarapu <[email protected]> Co-authored-by: Chakib Ben Ziane <[email protected]> Co-authored-by: Daniel Chalef <[email protected]> Co-authored-by: Daniel Chalef <[email protected]> Co-authored-by: Jari Bakken <[email protected]> Co-authored-by: escafati <[email protected]>

This PR adds support for Databricks runtime and Databricks SQL by using [Databricks SQL Connector for Python](https://docs.databricks.com/dev-tools/python-sql-connector.html). As a cloud data platform, accessing Databricks requires a URL as follows `databricks://token:{api_token}@{hostname}?http_path={http_path}&catalog={catalog}&schema={schema}`. **The URL is **complicated** and it may take users a while to figure it out**. Since the fields `api_token`/`hostname`/`http_path` fields are known in the Databricks notebook, I am proposing a new method `from_databricks` to simplify the connection to Databricks. ## In Databricks Notebook After changes, Databricks users only need to specify the `catalog` and `schema` field when using langchain. <img width="881" alt="image" src="https://github.com/hwchase17/langchain/assets/1097932/984b4c57-4c2d-489d-b060-5f4918ef2f37"> ## In Jupyter Notebook The method can be used on the local setup as well: <img width="678" alt="image" src="https://github.com/hwchase17/langchain/assets/1097932/142e8805-a6ef-4919-b28e-9796ca31ef19">

@hwchase17

Fixed assumptions misspelling in the link mentioned below:- https://python.langchain.com/en/latest/modules/chains/examples/llm_summarization_checker.html ![image](https://github.com/hwchase17/langchain/assets/16189966/94cf2be0-b3d0-495b-98ad-e1f44331727e) Fix for Issue:- langchain-ai#4959 @hwchase17

@dev2049

# Added a YouTube Tutorial Added a LangChain tutorial playlist aimed at onboarding newcomers to LangChain and its use cases. I've shared the video in the #tutorials channel and it seemed to be well received. I think this could be useful to the greater community. ## Who can review? @dev2049

Typos in the OpenAPI agent Prompt.

@hwchase17

# Powerbi API wrapper bug fix + integration tests - Bug fix by removing `TYPE_CHECKING` in in utilities/powerbi.py - Added integration test for power bi api in utilities/test_powerbi_api.py - Added integration test for power bi agent in agent/test_powerbi_agent.py - Edited .env.examples to help set up power bi related environment variables - Updated demo notebook with working code in docs../examples/powerbi.ipynb - AzureOpenAI -> ChatOpenAI Notes: Chat models (gpt3.5, gpt4) are much more capable than davinci at writing DAX queries, so that is important to getting the agent to work properly. Interestingly, gpt3.5-turbo needed the examples=DEFAULT_FEWSHOT_EXAMPLES to write consistent DAX queries, so gpt4 seems necessary as the smart llm. Fixes langchain-ai#4325 ## Before submitting Azure-core and Azure-identity are necessary dependencies check integration tests with the following: `pytest tests/integration_tests/utilities/test_powerbi_api.py` `pytest tests/integration_tests/agent/test_powerbi_agent.py` You will need a power bi account with a dataset id + table name in order to test. See .env.examples for details. ## Who can review? @hwchase17 @vowelparrot --------- Co-authored-by: aditya-pethe <[email protected]>

# Remove autoreload in examples Remove the `autoreload` in examples since it is not necessary for most users: ``` %load_ext autoreload, %autoreload 2 ```

# Bug fixes in Redis - Vectorstore (Added the version of redis to the error message and removed the cls argument from a classmethod) Co-authored-by: Tyler Hutcherson <[email protected]>

Add the async version for the search with relevance score Co-authored-by: Dev 2049 <[email protected]>

if https://docs.github.com/en/actions/using-workflows/events-that-trigger-workflows#workflow_dispatch is to be believed this should make it possible to manually kick of test workflow, but i don't know much about these things

@dev2049

…gchain-ai#4982) # Adds "IN" metadata filter for pgvector to all checking for set presence PGVector currently supports metadata filters of the form: ``` {"filter": {"key": "value"}} ``` which will return documents where the "key" metadata field is equal to "value". This PR adds support for metadata filters of the form: ``` {"filter": {"key": { "IN" : ["list", "of", "values"]}}} ``` Other vector stores support this via an "$in" syntax. I chose to use "IN" to match postgres' syntax, though happy to switch. Tested locally with PGVector and ChatVectorDBChain. @dev2049 --------- Co-authored-by: [email protected] <[email protected]>

# Delete a useless "print"

# Change the logger message level The library is logging at `error` level a situation that is not an error. We noticed this error in our logs, but from our point of view it's an expected behavior and the log level should be `warning`.

# Improve Evernote Document Loader When exporting from Evernote you may export more than one note. Currently the Evernote loader concatenates the content of all notes in the export into a single document and only attaches the name of the export file as metadata on the document. This change ensures that each note is loaded as an independent document and all available metadata on the note e.g. author, title, created, updated are added as metadata on each document. It also uses an existing optional dependency of `html2text` instead of `pypandoc` to remove the need to download the pandoc application via `download_pandoc()` to be able to use the `pypandoc` python bindings. Fixes langchain-ai#4493 Co-authored-by: Mike McGarry <[email protected]> Co-authored-by: Dev 2049 <[email protected]>

Fix construction and add unit test.

# changed ValueError to ImportError in except Several places with this bug. ValueError does not catch ImportError.

- Higher accuracy on the responses - New redesigned UI - Pretty Sources: display the sources by title / sub-section instead of long URL. - Fixed Reset Button bugs and some other UI issues - Other tweaks

) # added instruction about pip install google-gerativeai added instruction about pip install google-gerativeai

# Update the GPTCache example Fixes langchain-ai#4757

…-ai#5008) This reverts commit 8c28ad6. Seems to be causing langchain-ai#5001

# Add self query translator for weaviate vectorstore Adds support for the EQ comparator and the AND/OR operators. Co-authored-by: Dominic Chan <[email protected]> Co-authored-by: Dev 2049 <[email protected]>

…chain-ai#4892) # Ensuring that users pass a single prompt when calling a LLM - This PR adds a check to the `__call__` method of the `BaseLLM` class to ensure that it is called with a single prompt - Raises a `ValueError` if users try to call a LLM with a list of prompt and instructs them to use the `generate` method instead ## Why this could be useful I stumbled across this by accident. I accidentally called the OpenAI LLM with a list of prompts instead of a single string and still got a result: ``` >>> from langchain.llms import OpenAI >>> llm = OpenAI() >>> llm(["Tell a joke"]*2) "\n\nQ: Why don't scientists trust atoms?\nA: Because they make up everything!" ``` It might be better to catch such a scenario preventing unnecessary costs and irritation for the user. ## Proposed behaviour ``` >>> from langchain.llms import OpenAI >>> llm = OpenAI() >>> llm(["Tell a joke"]*2) Traceback (most recent call last): File "<stdin>", line 1, in <module> File "/Users/marcus/Projects/langchain/langchain/llms/base.py", line 291, in __call__ raise ValueError( ValueError: Argument `prompt` is expected to be a single string, not a list. If you want to run the LLM on multiple prompts, use `generate` instead. ```

to the plus server

will add unit tests

…i#4630) # Streaming only final output of agent (langchain-ai#2483) As requested in issue langchain-ai#2483, this Callback allows to stream only the final output of an agent (ie not the intermediate steps). Fixes langchain-ai#2483 Co-authored-by: Dev 2049 <[email protected]>

# Fixes an annoying typo in docs   Fixes Annoying typo in docs - "Therefor" -> "Therefore". It's so annoying to read that I just had to make this PR.

@dev2049

# Add documentation for Databricks integration This is a follow-up of langchain-ai#4702 It documents the details of how to integrate Databricks using langchain. It also provides examples in a notebook. ## Who can review? @dev2049 @hwchase17 since you are aware of the context. We will promote the integration after this doc is ready. Thanks in advance!

# Corrected Misspelling in agents.rst Documentation  In the [documentation](https://python.langchain.com/en/latest/modules/agents.html) it says "in fact, it is often best to have an Action Agent be in **change** of the execution for the Plan and Execute agent." **Suggested Change:** I propose correcting change to charge. Fix for issue: langchain-ai#5039

Co-authored-by: Ayan Bandyopadhyay <[email protected]> Co-authored-by: Dev 2049 <[email protected]>

…chain-ai#4525) ### Submit Multiple Files to the Unstructured API Enables batching multiple files into a single Unstructured API requests. Support for requests with multiple files was added to both `UnstructuredAPIFileLoader` and `UnstructuredAPIFileIOLoader`. Note that if you submit multiple files in "single" mode, the result will be concatenated into a single document. We recommend using this feature in "elements" mode. ### Testing The following should load both documents, using two of the example docs from the integration tests folder. ```python from langchain.document_loaders import UnstructuredAPIFileLoader file_paths = ["examples/layout-parser-paper.pdf", "examples/whatsapp_chat.txt"] loader = UnstructuredAPIFileLoader( file_paths=file_paths, api_key="FAKE_API_KEY", strategy="fast", mode="elements", ) docs = loader.load() ```

Without the addition of 'in its original language', the condensing response, more often than not, outputs the rephrased question in English, even when the conversation is in another language. This question in English then transfers to the question in the retrieval prompt and the chatbot is stuck in English. I'm sometimes surprised that this does not happen more often, but apparently the GPT models are smart enough to understand that when the template contains Question: .... Answer: then the answer should be in in the language of the question.

@dev2049

# docs: `deployments` page moved into `ecosystem/` The `Deployments` page moved into the `Ecosystem/` group Small fixes: - `index` page: fixed order of items in the `Modules` list, in the `Use Cases` list - item `References/Installation` was lost in the `index` page (not on the Navbar!). Restored it. - added `|` marker in several places. NOTE: I also thought about moving the `Additional Resources/Gallery` page into the `Ecosystem` group but decided to leave it unchanged. Please, advise on this. ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: @dev2049

Extract the methods specific to running an LLM or Chain on a dataset to separate utility functions. This simplifies the client a bit and lets us separate concerns of LCP details from running examples (e.g., for evals)

Let user inspect the token ids in addition to getting th enumber of tokens --------- Co-authored-by: Zach Schillaci <[email protected]>

…#4997) Update to pull request langchain-ai#3215 Summary: 1) Improved the sanitization of query (using regex), by removing python command (since gpt-3.5-turbo sometimes assumes python console as a terminal, and runs python command first which causes error). Also sometimes 1 line python codes contain single backticks. 2) Added 7 new test cases. For more details, view the previous pull request. --------- Co-authored-by: Deepak S V <[email protected]>

Co-authored-by: Tomaz Bratanic <[email protected]> Co-authored-by: Dev 2049 <[email protected]>

@dev2049

…instead (langchain-ai#5015) tldr: The docarray [integration PR](langchain-ai#4483) introduced a pinned dependency to protobuf. This is a docarray dependency, not a langchain dependency. Since this is handled by the docarray dependencies, it is unnecessary here. Further, as a pinned dependency, this quickly leads to incompatibilities with application code that consumes the library. Much less with a heavily used library like protobuf. Detail: as we see in the [docarray integration](https://github.com/hwchase17/langchain/pull/4483/files#diff-50c86b7ed8ac2cf95bd48334961bf0530cdc77b5a56f852c5c61b89d735fd711R81-R83), the transitive dependencies of docarray were also listed as langchain dependencies. This is unnecessary as the docarray project has an appropriate [extras](https://github.com/docarray/docarray/blob/a01a05542d17264b8a164bec783633658deeedb8/pyproject.toml#L70). The docarray project also does not require this _pinned_ version of protobuf, rather [a minimum version](https://github.com/docarray/docarray/blob/a01a05542d17264b8a164bec783633658deeedb8/pyproject.toml#L41). So this pinned version was likely in error. To fix this, this PR reverts the explicit hnswlib and protobuf dependencies and adds the hnswlib extras install for docarray (which installs hnswlib and protobuf, as originally intended). Because version `0.32.0` of the docarray hnswlib extras added protobuf, we bump the docarray dependency from `^0.31.0` to `^0.32.0`. # revert docarray explicit transitive dependencies and use extras instead ## Who can review? @dev2049 -- reviewed the original PR @eyurtsev -- bumped the pinned protobuf dependency a few days ago --------- Co-authored-by: Dev 2049 <[email protected]>

@hwchase17

This is a highly optimized update to the pull request langchain-ai#3269 Summary: 1) Added ability to MRKL agent to self solve the ValueError(f"Could not parse LLM output: `{llm_output}`") error, whenever llm (especially gpt-3.5-turbo) does not follow the format of MRKL Agent, while returning "Action:" & "Action Input:". 2) The way I am solving this error is by responding back to the llm with the messages "Invalid Format: Missing 'Action:' after 'Thought:'" & "Invalid Format: Missing 'Action Input:' after 'Action:'" whenever Action: and Action Input: are not present in the llm output respectively. For a detailed explanation, look at the previous pull request. New Updates: 1) Since @hwchase17 , requested in the previous PR to communicate the self correction (error) message, using the OutputParserException, I have added new ability to the OutputParserException class to store the observation & previous llm_output in order to communicate it to the next Agent's prompt. This is done, without breaking/modifying any of the functionality OutputParserException previously performs (i.e. OutputParserException can be used in the same way as before, without passing any observation & previous llm_output too). --------- Co-authored-by: Deepak S V <[email protected]>

@dev2049

…ngchain-ai#5098) # Improve pinecone hybrid search retriever adding metadata support I simply remove the hardwiring of metadata to the existing implementation allowing one to pass `metadatas` attribute to the constructors and in `get_relevant_documents`. I also add one missing pip install to the accompanying notebook (I am not adding dependencies, they were pre-existing). First contribution, just hoping to help, feel free to critique :) my twitter username is `@andreliebschner` While looking at hybrid search I noticed langchain-ai#3043 and langchain-ai#1743. I think the former can be closed as following the example right now (even prior to my improvements) works just fine, the latter I think can be also closed safely, maybe pointing out the relevant classes and example. Should I reply those issues mentioning someone? @dev2049, @hwchase17 --------- Co-authored-by: Andreas Liebschner <[email protected]>

@dev2049

… authentication (langchain-ai#5058) Enhance the code to support SSL authentication for Elasticsearch when using the VectorStore module, as previous versions did not provide this capability. @dev2049 --------- Co-authored-by: caidong <[email protected]> Co-authored-by: Dev 2049 <[email protected]>

…ex (langchain-ai#5059) # Row-wise cosine similarity between two equal-width matrices and return the max top_k score and index, the score all greater than threshold_score. Co-authored-by: Dev 2049 <[email protected]>

…angchain-ai#5090) # PowerBI major refinement in working of tool and tweaks in the rest I've gained some experience with more complex sets and the earlier implementation had too many tries by the agent to create DAX, so refactored the code to run the LLM to create dax based on a question and then immediately run the same against the dataset, with retries and a prompt that includes the error for the retry. This works much better! Also did some other refactoring of the inner workings, making things clearer, more concise and faster.

langchain-ai#4933) # fix a bug in the add_texts method of Weaviate vector store that creats wrong embeddings The following is the original code in the `add_texts` method of the Weaviate vector store, from line 131 to 153, which contains a bug. The code here includes some extra explanations in the form of comments and some omissions. ```python for i, doc in enumerate(texts): # some code omitted if self._embedding is not None: # variable texts is a list of string and doc here is just a string. # list(doc) actually breaks up the string into characters. # so, embeddings[0] is just the embedding of the first character embeddings = self._embedding.embed_documents(list(doc)) batch.add_data_object( data_object=data_properties, class_name=self._index_name, uuid=_id, vector=embeddings[0], ) ``` To fix this bug, I pulled the embedding operation out of the for loop and embed all texts at once. Co-authored-by: Shawn91 <[email protected]> Co-authored-by: Dev 2049 <[email protected]>

…angchain-ai#5005) # Currently, only the dev images are updated

langchain-ai#5101) `from langchain.experimental.autonomous_agents.autogpt.agent import AutoGPT` results in an import error as AutoGPT is not defined in the __init__.py file https://python.langchain.com/en/latest/use_cases/autonomous_agents/marathon_times.html An Alternate, way would be to be directly update the import statement to be `from langchain.experimental import AutoGPT` Co-authored-by: Dev 2049 <[email protected]>

@vowelparrot

Added link option in _process_response   Fixes # (issue) In _process_response link provided correct answers while the snippet reply provided non working links @vowelparrot ## Before submitting  ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested:  --------- Co-authored-by: Dev 2049 <[email protected]>

# changed ValueError to ImportError Code cleaning. Fixed inconsistencies in ImportError handling. Sometimes it raises ImportError and sometime ValueError. I've changed all cases to the `raise ImportError` Also: - added installation instruction in the error message, where it missed; - fixed several installation instructions in the error message; - fixed several error handling in regards to the ImportError

@hwchase17

…angchain-ai#5045) # Assign `current_time` to `datetime.now()` if it `current_time is None` in `time_weighted_retriever` Fixes langchain-ai#4825 As implemented, `add_documents` in `TimeWeightedVectorStoreRetriever` assigns `doc.metadata["last_accessed_at"]` and `doc.metadata["created_at"]` to `datetime.datetime.now()` if `current_time` is not in `kwargs`. ```python def add_documents(self, documents: List[Document], **kwargs: Any) -> List[str]: """Add documents to vectorstore.""" current_time = kwargs.get("current_time", datetime.datetime.now()) # Avoid mutating input documents dup_docs = [deepcopy(d) for d in documents] for i, doc in enumerate(dup_docs): if "last_accessed_at" not in doc.metadata: doc.metadata["last_accessed_at"] = current_time if "created_at" not in doc.metadata: doc.metadata["created_at"] = current_time doc.metadata["buffer_idx"] = len(self.memory_stream) + i self.memory_stream.extend(dup_docs) return self.vectorstore.add_documents(dup_docs, **kwargs) ``` However, from the way `add_documents` is being called from `GenerativeAgentMemory`, `current_time` is set as a `kwarg`, but it is given a value of `None`: ```python def add_memory( self, memory_content: str, now: Optional[datetime] = None ) -> List[str]: """Add an observation or memory to the agent's memory.""" importance_score = self._score_memory_importance(memory_content) self.aggregate_importance += importance_score document = Document( page_content=memory_content, metadata={"importance": importance_score} ) result = self.memory_retriever.add_documents([document], current_time=now) ``` The default of `now` was set in langchain-ai#4658 to be None. The proposed fix is the following: ```python def add_documents(self, documents: List[Document], **kwargs: Any) -> List[str]: """Add documents to vectorstore.""" current_time = kwargs.get("current_time", datetime.datetime.now()) # `current_time` may exist in kwargs, but may still have the value of None. if current_time is None: current_time = datetime.datetime.now() ``` Alternatively, we could just set the default of `now` to be `datetime.datetime.now()` everywhere instead. Thoughts @hwchase17? If we still want to keep the default to be `None`, then this PR should fix the above issue. If we want to set the default to be `datetime.datetime.now()` instead, I can update this PR with that alternative fix. EDIT: seems like from langchain-ai#5018 it looks like we would prefer to keep the default to be `None`, in which case this PR should fix the error.

# Add Mastodon toots loader. Loader works either with public toots, or Mastodon app credentials. Toot text and user info is loaded. I've also added integration test for this new loader as it works with public data, and a notebook with example output run now. --------- Co-authored-by: Dev 2049 <[email protected]>

OpenLM is a zero-dependency OpenAI-compatible LLM provider that can call different inference endpoints directly via HTTP. It implements the OpenAI Completion class so that it can be used as a drop-in replacement for the OpenAI API. This changeset utilizes BaseOpenAI for minimal added code. --------- Co-authored-by: Dev 2049 <[email protected]>

Pass dataset name by name

…angchain-ai#5085) Implementation is similar to search_distance and where_filter # adds 'additional' support to Weaviate queries Co-authored-by: Dev 2049 <[email protected]>

@eyurtsev

…gchain-ai#5111) # Improve TextSplitter.split_documents, collect page_content and metadata in one iteration ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: @eyurtsev In the case where documents is a generator that can only be iterated once making this change is a huge help. Otherwise a silent issue happens where metadata is empty for all documents when documents is a generator. So we expand the argument from `List[Document]` to `Union[Iterable[Document], Sequence[Document]]` --------- Co-authored-by: Steven Tartakovsky <[email protected]>

@andrewelizondo

# Add a WhyLabs callback handler * Adds a simple WhyLabsCallbackHandler * Add required dependencies as optional * protect against missing modules with imports * Add docs/ecosystem basic example based on initial prototype from @andrewelizondo > this integration gathers privacy preserving telemetry on text with whylogs and sends stastical profiles to WhyLabs platform to monitoring these metrics over time. For more information on what WhyLabs is see: https://whylabs.ai After you run the notebook (if you have env variables set for the API Keys, org_id and dataset_id) you get something like this in WhyLabs: ![Screenshot (443)](https://github.com/hwchase17/langchain/assets/88007022/6bdb3e1c-4243-4ae8-b974-23a8bb12edac) Co-authored-by: Andre Elizondo <[email protected]> Co-authored-by: Dev 2049 <[email protected]>

langchain-ai#5012) # Add AzureCognitiveServicesToolkit to call Azure Cognitive Services API: achieve some multimodal capabilities This PR adds a toolkit named AzureCognitiveServicesToolkit which bundles the following tools: - AzureCogsImageAnalysisTool: calls Azure Cognitive Services image analysis API to extract caption, objects, tags, and text from images. - AzureCogsFormRecognizerTool: calls Azure Cognitive Services form recognizer API to extract text, tables, and key-value pairs from documents. - AzureCogsSpeech2TextTool: calls Azure Cognitive Services speech to text API to transcribe speech to text. - AzureCogsText2SpeechTool: calls Azure Cognitive Services text to speech API to synthesize text to speech. This toolkit can be used to process image, document, and audio inputs. --------- Co-authored-by: Dev 2049 <[email protected]>

…in-ai#5115) # Add link to Psychic from document loaders documentation page In my previous PR I forgot to update `document_loaders.rst` to link to `psychic.ipynb` to make it discoverable from the main documentation.

…an_input_llm.ipynb (langchain-ai#5118) # Fix typo + add wikipedia package installation part in human_input_llm.ipynb This PR 1. Fixes typo ("the the human input LLM"), 2. Addes wikipedia package installation part (in accordance with `WikipediaQueryRun` [documentation](https://python.langchain.com/en/latest/modules/agents/tools/examples/wikipedia.html)) in `human_input_llm.ipynb` (`docs/modules/models/llms/examples/human_input_llm.ipynb`)

# Allowing openAI fine-tuned models Very simple fix that checks whether a openAI `model_name` is a fine-tuned model when loading `context_size` and when computing call's cost in the `openai_callback`. Fixes langchain-ai#2887 --------- Co-authored-by: Dev 2049 <[email protected]>

Some LLM's will produce numbered lists with leading whitespace, i.e. in response to "What is the sum of 2 and 3?": ``` Plan: 1. Add 2 and 3. 2. Given the above steps taken, please respond to the users original question. ``` This commit updates the PlanningOutputParser regex to ignore leading whitespace before the step number, enabling it to correctly parse this format.

…sticsearch models (langchain-ai#3401) This PR introduces a new module, `elasticsearch_embeddings.py`, which provides a wrapper around Elasticsearch embedding models. The new ElasticsearchEmbeddings class allows users to generate embeddings for documents and query texts using a [model deployed in an Elasticsearch cluster](https://www.elastic.co/guide/en/machine-learning/current/ml-nlp-model-ref.html#ml-nlp-model-ref-text-embedding). ### Main features: 1. The ElasticsearchEmbeddings class initializes with an Elasticsearch connection object and a model_id, providing an interface to interact with the Elasticsearch ML client through [infer_trained_model](https://elasticsearch-py.readthedocs.io/en/v8.7.0/api.html?highlight=trained%20model%20infer#elasticsearch.client.MlClient.infer_trained_model) . 2. The `embed_documents()` method generates embeddings for a list of documents, and the `embed_query()` method generates an embedding for a single query text. 3. The class supports custom input text field names in case the deployed model expects a different field name than the default `text_field`. 4. The implementation is compatible with any model deployed in Elasticsearch that generates embeddings as output. ### Benefits: 1. Simplifies the process of generating embeddings using Elasticsearch models. 2. Provides a clean and intuitive interface to interact with the Elasticsearch ML client. 3. Allows users to easily integrate Elasticsearch-generated embeddings. Related issue langchain-ai#3400 --------- Co-authored-by: Dev 2049 <[email protected]>

Co-authored-by: Tyler Hutcherson <[email protected]> Co-authored-by: Dev 2049 <[email protected]>

@hwchase17

# Add MosaicML inference endpoints This PR adds support in langchain for MosaicML inference endpoints. We both serve a select few open source models, and allow customers to deploy their own models using our inference service. Docs are here (https://docs.mosaicml.com/en/latest/inference.html), and sign up form is here (https://forms.mosaicml.com/demo?utm_source=langchain). I'm not intimately familiar with the details of langchain, or the contribution process, so please let me know if there is anything that needs fixing or this is the wrong way to submit a new integration, thanks! I'm also not sure what the procedure is for integration tests. I have tested locally with my api key. ## Who can review? @hwchase17 --------- Co-authored-by: Harrison Chase <[email protected]>

# Check whether 'other' is empty before popping This PR could fix a potential 'popping empty set' error. Co-authored-by: Junlin Zhou <[email protected]>

@hwchase17

…4867) # Add async versions of predict() and predict_messages() langchain-ai#4615 introduced a unifying interface for "base" and "chat" LLM models via the new `predict()` and `predict_messages()` methods that allow both types of models to operate on string and message-based inputs, respectively. This PR adds async versions of the same (`apredict()` and `apredict_messages()`) that are identical except for their use of `agenerate()` in place of `generate()`, which means they repurpose all existing work on the async backend. ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: @hwchase17 (follows his work on langchain-ai#4615) @agola11 (async) --------- Co-authored-by: Harrison Chase <[email protected]>

@dev2049

…ever (langchain-ai#5155) # Same as PR langchain-ai#5045, but for async   Fixes langchain-ai#4825 I had forgotten to update the asynchronous counterpart `aadd_documents` with the bug fix from PR langchain-ai#5045, so this PR also fixes `aadd_documents` too. ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: @dev2049

@vowelparrot

# Docs: updated getting_started.md Just accommodating some unnecessary spaces in the example of "pass few shot examples to a prompt template". @vowelparrot

@hwchase17

langchain-ai#5154) # Clarification of the reference to the "get_text_legth" function in getting_started.md Reference to the function "get_text_legth" in the documentation did not make sense. Comment added for clarification. @hwchase17

@hwchase17

# DOCS added missed document_loader examples Added missed examples: `JSON`, `Open Document Format (ODT)`, `Wikipedia`, `tomarkdown`. Updated them to a consistent format. ## Who can review? @hwchase17 @dev2049

Closes langchain-ai#931. --------- Co-authored-by: Dev 2049 <[email protected]>

# Vectara Integration This PR provides integration with Vectara. Implemented here are: * langchain/vectorstore/vectara.py * tests/integration_tests/vectorstores/test_vectara.py * langchain/retrievers/vectara_retriever.py And two IPYNB notebooks to do more testing: * docs/modules/chains/index_examples/vectara_text_generation.ipynb * docs/modules/indexes/vectorstores/examples/vectara.ipynb --------- Co-authored-by: Dev 2049 <[email protected]>

# Beam Calls the Beam API wrapper to deploy and make subsequent calls to an instance of the gpt2 LLM in a cloud deployment. Requires installation of the Beam library and registration of Beam Client ID and Client Secret. Additional calls can then be made through the instance of the large language model in your code or by calling the Beam API. --------- Co-authored-by: Dev 2049 <[email protected]>

# Your PR Title (What it does) HuggingFace -> Hugging Face

Adding example usage for elasticsearch knn embeddings [per](langchain-ai#3401 (comment)) https://github.com/hwchase17/langchain/blob/master/langchain/embeddings/elasticsearch.py

Follow up of langchain-ai#5015 Thanks for catching this! Just a small PR to adjust couple of strings to these changes Signed-off-by: jupyterjazz <[email protected]>

Co-authored-by: thomas-yanxin <[email protected]> Co-authored-by: Dev 2049 <[email protected]>

@hwchase17

# Reuse `length_func` in `MapReduceDocumentsChain` Pretty straightforward refactor in `MapReduceDocumentsChain`. Reusing the local variable `length_func`, instead of the longer alternative `self.combine_document_chain.prompt_length`. @hwchase17

# Improve Cypher QA prompt The current QA prompt is optimized for networkX answer generation, which returns all the possible triples. However, Cypher search is a bit more focused and doesn't necessary return all the context information. Due to that reason, the model sometimes refuses to generate an answer even though the information is provided: ![Screenshot from 2023-05-24 08-36-23](https://github.com/hwchase17/langchain/assets/19948365/351cf9c1-2567-447c-91fd-284ae3fa1ccf) To fix this issue, I have updated the prompt. Interestingly, I tried many variations with less instructions and they didn't work properly. However, the current fix works nicely. ![Screenshot from 2023-05-24 08-37-25](https://github.com/hwchase17/langchain/assets/19948365/fc830603-e6ec-4a23-8a86-eaf572996014)

# Improve weaviate vectorstore docs

Co-authored-by: vempaliakhil96 <[email protected]>

Co-authored-by: Dev 2049 <[email protected]>

# OpanAI finetuned model giving zero tokens cost Very simple fix to the previously committed solution to allowing finetuned Openai models. Improves langchain-ai#5127 --------- Co-authored-by: Dev 2049 <[email protected]>

`vectorstore.PGVector`: The transactional boundary should be increased to cover the query itself Currently, within the `similarity_search_with_score_by_vector` the transactional boundary (created via the `Session` call) does not include the select query being made. This can result in un-intended consequences when interacting with the PGVector instance methods directly --------- Co-authored-by: Dev 2049 <[email protected]>

# Output parsing variation allowance for self-ask with search This change makes self-ask with search easier for Llama models to follow, as they tend toward returning 'Followup:' instead of 'Follow up:' despite an otherwise valid remaining output. Co-authored-by: Dev 2049 <[email protected]>

## Description The html structure of readthedocs can differ. Currently, the html tag is hardcoded in the reader, and unable to fit into some cases. This pr includes the following changes: 1. Replace `find_all` with `find` because we just want one tag. 2. Provide `custom_html_tag` to the loader. 3. Add tests for readthedoc loader 4. Refactor code ## Issues See more in langchain-ai#2609. The problem was not completely fixed in that pr. --------- Signed-off-by: byhsu <[email protected]> Co-authored-by: byhsu <[email protected]> Co-authored-by: Dev 2049 <[email protected]>

Create IUGU loader --------- Co-authored-by: Dev 2049 <[email protected]>

# Add Joplin document loader [Joplin](https://joplinapp.org/) is an open source note-taking app. Joplin has a [REST API](https://joplinapp.org/api/references/rest_api/) for accessing its local database. The proposed `JoplinLoader` uses the API to retrieve all notes in the database and their metadata. Joplin needs to be installed and running locally, and an access token is required. - The PR includes an integration test. - The PR includes an example notebook. --------- Co-authored-by: Dev 2049 <[email protected]>

…#5213)

Changes debug log to warning log when LC Tracer fails to instantiate

Example: ``` $ langchain plus start --expose ... $ langchain plus status The LangChainPlus server is currently running. Service Status Published Ports langchain-backend Up 40 seconds 1984 langchain-db Up 41 seconds 5433 langchain-frontend Up 40 seconds 80 ngrok Up 41 seconds 4040 To connect, set the following environment variables in your LangChain application: LANGCHAIN_TRACING_V2=true LANGCHAIN_ENDPOINT=https://5cef-70-23-89-158.ngrok.io $ langchain plus stop $ langchain plus status The LangChainPlus server is not running. $ langchain plus start The LangChainPlus server is currently running. Service Status Published Ports langchain-backend Up 5 seconds 1984 langchain-db Up 6 seconds 5433 langchain-frontend Up 5 seconds 80 To connect, set the following environment variables in your LangChain application: LANGCHAIN_TRACING_V2=true LANGCHAIN_ENDPOINT=http://localhost:1984 ```

Co-authored-by: Leonid Kuligin <[email protected]> Co-authored-by: Leonid Kuligin <[email protected]> Co-authored-by: sasha-gitg <[email protected]> Co-authored-by: Justin Flick <[email protected]> Co-authored-by: Justin Flick <[email protected]>

# fix a mistake in concepts.md ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested:

@agola11

…-ai#5214) Copies `GraphIndexCreator.from_text()` to make an async version called `GraphIndexCreator.afrom_text()`. This is (should be) a trivial change: it just adds a copy of `GraphIndexCreator.from_text()` which is async and awaits a call to `chain.apredict()` instead of `chain.predict()`. There is no unit test for GraphIndexCreator, and I did not create one, but this code works for me locally. @agola11 @hwchase17

@hwchase17

I found an API key for `serpapi_api_key` while reading the docs. It seems to have been modified very recently. Removed it in this PR @hwchase17 - project lead

@eyurtsev

…issue langchain-ai#5104) (langchain-ai#5220) # Change Default GoogleDriveLoader Behavior to not Load Trashed Files (issue langchain-ai#5104) Fixes langchain-ai#5104 If the previous behavior of loading files that used to live in the folder, but are now trashed, you can use the `load_trashed_files` parameter: ``` loader = GoogleDriveLoader( folder_id="1yucgL9WGgWZdM1TOuKkeghlPizuzMYb5", recursive=False, load_trashed_files=True ) ``` As not loading trashed files should be expected behavior, should we 1. even provide the `load_trashed_files` parameter? 2. add documentation? Feels most users will stick with default behavior ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: DataLoaders - @eyurtsev Twitter: [@nicholasliu77](https://twitter.com/nicholasliu77)

@Xmaster6y

…ai#5190) # Allow to specify ID when adding to the FAISS vectorstore This change allows unique IDs to be specified when adding documents / embeddings to a faiss vectorstore. - This reflects the current approach with the chroma vectorstore. - It allows rejection of inserts on duplicate IDs - will allow deletion / update by searching on deterministic ID (such as a hash). - If not specified, a random UUID is generated (as per previous behaviour, so non-breaking). This commit fixes langchain-ai#5065 and langchain-ai#3896 and should fix langchain-ai#2699 indirectly. I've tested adding and merging. Kindly tagging @Xmaster6y @dev2049 for review. --------- Co-authored-by: Ati Sharma <[email protected]> Co-authored-by: Harrison Chase <[email protected]>

# Bibtex integration Wrap bibtexparser to retrieve a list of docs from a bibtex file. * Get the metadata from the bibtex entries * `page_content` get from the local pdf referenced in the `file` field of the bibtex entry using `pymupdf` * If no valid pdf file, `page_content` set to the `abstract` field of the bibtex entry * Support Zotero flavour using regex to get the file path * Added usage example in `docs/modules/indexes/document_loaders/examples/bibtex.ipynb` --------- Co-authored-by: Sébastien M. Popoff <[email protected]> Co-authored-by: Dev 2049 <[email protected]>

- Add support for MiniMax embeddings Doc: [MiniMax embeddings](https://api.minimax.chat/document/guides/embeddings?id=6464722084cdc277dfaa966a) --------- Co-authored-by: Archon <[email protected]> Co-authored-by: Dev 2049 <[email protected]>

@hwchase17

# Add QnA with sources example   Fixes: see https://stackoverflow.com/questions/76207160/langchain-doesnt-work-with-weaviate-vector-database-getting-valueerror/76210017#76210017 ## Before submitting  ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested:  @dev2049

langchain-ai#5232) remove extra "\n" to ensure that the format of the description, example, and prompt&generation are completely consistent.

# Resolve error in StructuredOutputParser docs Documentation for `StructuredOutputParser` currently not reproducible, that is, `output_parser.parse(output)` raises an error because the LLM returns a response with an invalid format ```python _input = prompt.format_prompt(question="what's the capital of france") output = model(_input.to_string()) output # ? # # ```json # { # "answer": "Paris", # "source": "https://www.worldatlas.com/articles/what-is-the-capital-of-france.html" # } # ``` ``` Was fixed by adding a question mark to the prompt

…ai#5246) # Added the option of specifying a proxy for the OpenAI API Fixes langchain-ai#5243 Co-authored-by: Yves Maurer <>

@naveentatikonda

For most queries it's the `size` parameter that determines final number of documents to return. Since our abstractions refer to this as `k`, set this to be `k` everywhere instead of expecting a separate param. Would be great to have someone more familiar with OpenSearch validate that this is reasonable (e.g. that having `size` and what OpenSearch calls `k` be the same won't lead to any strange behavior). cc @naveentatikonda Closes langchain-ai#5212

@dev2049

Fixes a regression in JoplinLoader that was introduced during the code review (bad `page` wildcard in _get_note_url). ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: @dev2049 @leo-gan

# Docs: link custom agent page in getting started

zep-python's sync methods no longer need an asyncio wrapper. This was causing issues with FastAPI deployment. Zep also now supports putting and getting of arbitrary message metadata. Bump zep-python version to v0.30 Remove nest-asyncio from Zep example notebooks. Modify tests to include metadata. --------- Co-authored-by: Daniel Chalef <[email protected]> Co-authored-by: Daniel Chalef <[email protected]>

# Add C Transformers for GGML Models I created Python bindings for the GGML models: https://github.com/marella/ctransformers Currently it supports GPT-2, GPT-J, GPT-NeoX, LLaMA, MPT, etc. See [Supported Models](https://github.com/marella/ctransformers#supported-models). It provides a unified interface for all models: ```python from langchain.llms import CTransformers llm = CTransformers(model='/path/to/ggml-gpt-2.bin', model_type='gpt2') print(llm('AI is going to')) ``` It can be used with models hosted on the Hugging Face Hub: ```py llm = CTransformers(model='marella/gpt-2-ggml') ``` It supports streaming: ```py from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler llm = CTransformers(model='marella/gpt-2-ggml', callbacks=[StreamingStdOutCallbackHandler()]) ``` Please see [README](https://github.com/marella/ctransformers#readme) for more details. --------- Co-authored-by: Dev 2049 <[email protected]>

) Partially addresses: langchain-ai#4066

…5009) Add Multi-CSV/DF support in CSV and DataFrame Toolkits * CSV and DataFrame toolkits now accept list of CSVs/DFs * Add default prompts for many dataframes in `pandas_dataframe` toolkit Fixes langchain-ai#1958 Potentially fixes langchain-ai#4423 ## Testing * Add single and multi-dataframe integration tests for `pandas_dataframe` toolkit with permutations of `include_df_in_prompt` * Add single and multi-CSV integration tests for csv toolkit --------- Co-authored-by: Harrison Chase <[email protected]>

Causing lint issues if you have openai installed, annoying for local dev

…ai#5268) The current `HuggingFacePipeline.from_model_id` does not allow passing of pipeline arguments to the transformer pipeline. This PR enables adding important pipeline parameters like setting `max_new_tokens` for example. Previous to this PR it would be necessary to manually create the pipeline through huggingface transformers then handing it to langchain. For example instead of this ```py model_id = "gpt2" tokenizer = AutoTokenizer.from_pretrained(model_id) model = AutoModelForCausalLM.from_pretrained(model_id) pipe = pipeline( "text-generation", model=model, tokenizer=tokenizer, max_new_tokens=10 ) hf = HuggingFacePipeline(pipeline=pipe) ``` You can write this ```py hf = HuggingFacePipeline.from_model_id( model_id="gpt2", task="text-generation", pipeline_kwargs={"max_new_tokens": 10} ) ``` Co-authored-by: Dev 2049 <[email protected]>

@dev2049

# Your PR Title (What it does) Adding an if statement to deal with bigquery sql dialect. When I use bigquery dialect before, it failed while using SET search_path TO. So added a condition to set dataset as the schema parameter which is equivalent to SET search_path TO . I have tested and it works. ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: @dev2049

…er (langchain-ai#5221) # Add Momento as a standard cache and chat message history provider This PR adds Momento as a standard caching provider. Implements the interface, adds integration tests, and documentation. We also add Momento as a chat history message provider along with integration tests, and documentation. [Momento](https://www.gomomento.com/) is a fully serverless cache. Similar to S3 or DynamoDB, it requires zero configuration, infrastructure management, and is instantly available. Users sign up for free and get 50GB of data in/out for free every month. ## Before submitting ✅ We have added documentation, notebooks, and integration tests demonstrating usage. Co-authored-by: Dev 2049 <[email protected]>

# Fixed typo: 'ouput' to 'output' in all documentation In this instance, the typo 'ouput' was amended to 'output' in all occurrences within the documentation. There are no dependencies required for this change.

# Add twilio sms tool --------- Co-authored-by: Dev 2049 <[email protected]>

This PR adds LLM wrapper for Databricks. It supports two endpoint types: * serving endpoint * cluster driver proxy app An integration notebook is included to show how it works. Co-authored-by: Davis Chase <[email protected]> Co-authored-by: Gengliang Wang <[email protected]> Co-authored-by: Dev 2049 <[email protected]>

# Add example to LLMMath to help with power operator Add example to LLMMath that helps the model to interpret `^` as the power operator rather than the python xor operator.

# Update contribution guidelines and PR template This PR updates the contribution guidelines to include more information on how to handle optional dependencies. The PR template is updated to include a link to the contribution guidelines document.

# Fixed passing creds to VertexAI LLM Fixes langchain-ai#5279 It looks like we should drop a type annotation for Credentials. Co-authored-by: Leonid Kuligin <[email protected]>

@hwchase17

# Better docs for weaviate hybrid search   Fixes: NA ## Before submitting  ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested:  @dev2049

# Add instructions to pyproject.toml * Add instructions to pyproject.toml about how to handle optional dependencies. ## Before submitting ## Who can review? --------- Co-authored-by: Davis Chase <[email protected]> Co-authored-by: Zander Chase <[email protected]>

@dev2049

# docs: improve flow of llm caching notebook The notebook `llm_caching` demos various caching providers. In the previous version, there was setup common to all examples but under the `In Memory Caching` heading. If a user comes and only wants to try a particular example, they will run the common setup, then the cells for the specific provider they are interested in. Then they will get import and variable reference errors. This commit moves the common setup to the top to avoid this. ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: @dev2049

# Documentation typo fixes Fixes # (issue) Simple typos in the blockchain .ipynb documentation

# added a link to LangChain Handbook ## Who can review? Community members can review the PR once tests pass.

# Add Chainlit to deployment options Add [Chainlit](https://github.com/Chainlit/chainlit) as deployment options Used links to Github examples and Chainlit doc on the LangChain integration Co-authored-by: Dan Constantini <[email protected]>

@vowelparrot

…i#5331) Fixed the issue of blank Thoughts being printed in verbose when `handle_parsing_errors=True`, as below: Before Fix: ``` Observation: There are 38175 accounts available in the dataframe. Thought: Observation: Invalid or incomplete response Thought: Observation: Invalid or incomplete response Thought: ``` After Fix: ``` Observation: There are 38175 accounts available in the dataframe. Thought:AI: { "action": "Final Answer", "action_input": "There are 38175 accounts available in the dataframe." } Observation: Invalid Action or Action Input format Thought:AI: { "action": "Final Answer", "action_input": "The number of available accounts is 38175." } Observation: Invalid Action or Action Input format ``` @vowelparrot currently I have set the colour of thought to green (same as the colour when `handle_parsing_errors=False`). If you want to change the colour of this "_Exception" case to red or something else (when `handle_parsing_errors=True`), feel free to change it in line 789.

@hwchase17

…5320) # remove empty lines in GenerativeAgentMemory that cause InvalidRequestError in OpenAIEmbeddings   Let's say the text given to `GenerativeAgent._parse_list` is ``` text = """ Insight 1: <insight 1> Insight 2: <insight 2> """ ``` This creates an `openai.error.InvalidRequestError: [''] is not valid under any of the given schemas - 'input'` because `GenerativeAgent.add_memory()` tries to add an empty string to the vectorstore. This PR fixes the issue by removing the empty line between `Insight 1` and `Insight 2` ## Before submitting  ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested:  @hwchase17 @vowelparrot @dev2049

@dev2049

# Sample Notebook for DynamoDB Chat Message History @dev2049 Adding a sample notebook for the DynamoDB Chat Message History class.

# Added the ability to pass kwargs to cosmos client constructor The cosmos client has a ton of options that can be set, so allowing those to be passed to the constructor from the chat memory constructor with this PR.

@vowelparrot

# Support for shopping search in SerpApi ## Who can review? @vowelparrot

# Add SKLearnVectorStore This PR adds SKLearnVectorStore, a simply vector store based on NearestNeighbors implementations in the scikit-learn package. This provides a simple drop-in vector store implementation with minimal dependencies (scikit-learn is typically installed in a data scientist / ml engineer environment). The vector store can be persisted and loaded from json, bson and parquet format. SKLearnVectorStore has soft (dynamic) dependency on the scikit-learn, numpy and pandas packages. Persisting to bson requires the bson package, persisting to parquet requires the pyarrow package. ## Before submitting Integration tests are provided under `tests/integration_tests/vectorstores/test_sklearn.py` Sample usage notebook is provided under `docs/modules/indexes/vectorstores/examples/sklear.ipynb` Co-authored-by: Dev 2049 <[email protected]>

@dev2049

# Remove re-use of iter within add_embeddings causing error As reported in langchain-ai#5336 there is an issue currently involving the atempted re-use of an iterator within the FAISS vectorstore adapter Fixes # langchain-ai#5336 ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: VectorStores / Retrievers / Memory - @dev2049

@eyurtsev

# Add path validation to DirectoryLoader This PR introduces a minor adjustment to the DirectoryLoader by adding validation for the path argument. Previously, if the provided path didn't exist or wasn't a directory, DirectoryLoader would return an empty document list due to the behavior of the `glob` method. This could potentially cause confusion for users, as they might expect a file-loading error instead. So, I've added two validations to the load method of the DirectoryLoader: - Raise a FileNotFoundError if the provided path does not exist - Raise a ValueError if the provided path is not a directory Due to the relatively small scope of these changes, a new issue was not created. ## Before submitting  ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: @eyurtsev

@dev2049

…angchain-ai#5304) (langchain-ai#5306) # Fix: Handle empty documents in ContextualCompressionRetriever (Issue langchain-ai#5304) Fixes langchain-ai#5304 Prevent cohere.error.CohereAPIError caused by an empty list of documents by adding a condition to check if the input documents list is empty in the compress_documents method. If the list is empty, return an empty list immediately, avoiding the error and unnecessary processing. @dev2049 --------- Co-authored-by: Dev 2049 <[email protected]>

adds tests cases, consolidates a lot of PRs

We shouldn't be calling a constructor for a default value - should use default_factory instead. This is especially ad in this case since it requires an optional dependency and an API key to be set. Resolves langchain-ai#5361

# Updates PR template to request Twitter handle for shoutouts! Makes it easier for maintainers to show their appreciation 😄

@eyurtsev

# Fix lost mimetype when using Blob.from_data method The mimetype is lost due to a typo in the class attribue name Fixes # - (no issue opened but I can open one if needed) ## Changes * Fixed typo in name * Added unit-tests to validate the output Blob ## Review @eyurtsev

@agola11

# Add async support for (LLM) routing chains   Add asynchronous LLM calls support for the routing chains. More specifically: - Add async `aroute` function (i.e. async version of `route`) to the `RouterChain` which calls the routing LLM asynchronously - Implement the async `_acall` for the `LLMRouterChain` - Implement the async `_acall` function for `MultiRouteChain` which first calls asynchronously the routing chain with its new `aroute` function, and then calls asynchronously the relevant destination chain.  ## Who can review? - @agola11

@dev2049

…ai#5359) # Fix for `update_document` Function in Chroma ## Summary This pull request addresses an issue with the `update_document` function in the Chroma class, as described in [langchain-ai#5031](langchain-ai#5031 (comment)). The issue was identified as an `AttributeError` raised when calling `update_document` due to a missing corresponding method in the `Collection` object. This fix refactors the `update_document` method in `Chroma` to correctly interact with the `Collection` object. ## Changes 1. Fixed the `update_document` method in the `Chroma` class to correctly call methods on the `Collection` object. 2. Added the corresponding test `test_chroma_update_document` in `tests/integration_tests/vectorstores/test_chroma.py` to reflect the updated method call. 3. Added an example and explanation of how to use the `update_document` function in the Jupyter notebook tutorial for Chroma. ## Test Plan All existing tests pass after this change. In addition, the `test_chroma_update_document` test case now correctly checks the functionality of `update_document`, ensuring that the function works as expected and updates the content of documents correctly. ## Reviewers @dev2049 This fix will ensure that users are able to use the `update_document` function as expected, without encountering the previous `AttributeError`. This will enhance the usability and reliability of the Chroma class for all users. Thank you for considering this pull request. I look forward to your feedback and suggestions.

@hwchase17

# Update llamacpp demonstration notebook Add instructions to install with BLAS backend, and update the example of model usage. Fixes langchain-ai#5071. However, it is more like a prevention of similar issues in the future, not a fix, since there was no problem in the framework functionality ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: - @hwchase17 - @agola11

@hwchase17

# Removed deprecated llm attribute for load_chain Currently `load_chain` for some chain types expect `llm` attribute to be present but `llm` is deprecated attribute for those chains and might not be persisted during their `chain.save`. Fixes langchain-ai#5224 [(issue)](langchain-ai#5224) ## Who can review? @hwchase17 @dev2049 --------- Co-authored-by: imeckr <[email protected]>

Co-authored-by: Gavin S <[email protected]>

Fixes langchain-ai#5316 --------- Co-authored-by: Justin Flick <[email protected]> Co-authored-by: Harrison Chase <[email protected]>

@hwchase17

# Reformat the openai proxy setting as code Only affect the doc for openai Model - @hwchase17 - @agola11

Co-authored-by: Yessen Kanapin <[email protected]> Co-authored-by: Yessen Kanapin <[email protected]>

Co-authored-by: Daniel Whitenack <[email protected]>

@hwchase17

# Implemented appending arbitrary messages to the base chat message history, the in-memory and cosmos ones.  As discussed this is the alternative way instead of langchain-ai#4480, with a add_message method added that takes a BaseMessage as input, so that the user can control what is in the base message like kwargs.  Fixes # (issue) ## Before submitting  ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: @hwchase17 --------- Co-authored-by: Harrison Chase <[email protected]>

@hwchase17

# docs: ecosystem/integrations update 2 langchain-ai#5219 - part 1 The second part of this update (parts are independent of each other! no overlap): - added diffbot.md - updated confluence.ipynb; added confluence.md - updated college_confidential.md - updated openai.md - added blackboard.md - added bilibili.md - added azure_blob_storage.md - added azlyrics.md - added aws_s3.md ## Who can review? @hwchase17@agola11 @agola11 @vowelparrot @dev2049

@hwchase17

# docs: ecosystem/integrations update It is the first in a series of `ecosystem/integrations` updates. The ecosystem/integrations list is missing many integrations. I'm adding the missing integrations in a consistent format: 1. description of the integrated system 2. `Installation and Setup` section with 'pip install ...`, Key setup, and other necessary settings 3. Sections like `LLM`, `Text Embedding Models`, `Chat Models`... with links to correspondent examples and imports of the used classes. This PR keeps new docs, that are presented in the `docs/modules/models/text_embedding/examples` but missed in the `ecosystem/integrations`. The next PRs will cover the next example sections. Also updated `integrations.rst`: added the `Dependencies` section with a link to the packages used in LangChain. ## Who can review? @hwchase17 @eyurtsev @dev2049

Co-authored-by: Jacob Valdez <[email protected]> Co-authored-by: Jacob Valdez <[email protected]> Co-authored-by: Eugene Yurtsev <[email protected]>

@vowelparrot

# Add ToolException that a tool can throw This is an optional exception that tool throws when execution error occurs. When this exception is thrown, the agent will not stop working,but will handle the exception according to the handle_tool_error variable of the tool,and the processing result will be returned to the agent as observation,and printed in pink on the console.It can be used like this: ```python from langchain.schema import ToolException from langchain import LLMMathChain, SerpAPIWrapper, OpenAI from langchain.agents import AgentType, initialize_agent from langchain.chat_models import ChatOpenAI from langchain.tools import BaseTool, StructuredTool, Tool, tool from langchain.chat_models import ChatOpenAI llm = ChatOpenAI(temperature=0) llm_math_chain = LLMMathChain(llm=llm, verbose=True) class Error_tool: def run(self, s: str): raise ToolException('The current search tool is not available.') def handle_tool_error(error) -> str: return "The following errors occurred during tool execution:"+str(error) search_tool1 = Error_tool() search_tool2 = SerpAPIWrapper() tools = [ Tool.from_function( func=search_tool1.run, name="Search_tool1", description="useful for when you need to answer questions about current events.You should give priority to using it.", handle_tool_error=handle_tool_error, ), Tool.from_function( func=search_tool2.run, name="Search_tool2", description="useful for when you need to answer questions about current events", return_direct=True, ) ] agent = initialize_agent(tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True, handle_tool_errors=handle_tool_error) agent.run("Who is Leo DiCaprio's girlfriend? What is her current age raised to the 0.43 power?") ``` ![image](https://github.com/hwchase17/langchain/assets/32786500/51930410-b26e-4f85-a1e1-e6a6fb450ada) ## Who can review? - @vowelparrot --------- Co-authored-by: Dev 2049 <[email protected]>

adds support for keeping separators around when using recursive text splitter

# Added New Trello loader class and documentation Simple Loader on top of py-trello wrapper. With a board name you can pull cards and to do some field parameter tweaks on load operation. I included documentation and examples. Included unit test cases using patch and a fixture for py-trello client class. --------- Co-authored-by: Dev 2049 <[email protected]>

# Creates GitHubLoader (langchain-ai#5257) GitHubLoader is a DocumentLoader that loads issues and PRs from GitHub. Fixes langchain-ai#5257 --------- Co-authored-by: Dev 2049 <[email protected]>

Co-authored-by: Rithwik Ediga Lakhamsani <[email protected]> Co-authored-by: Dev 2049 <[email protected]>

Issue from: https://discord.com/channels/1038097195422978059/1069478035918688346/1112445980466483222

# Fix typo in LanceDB notebook filename

# Add MongoDBAtlasVectorSearch for the python library Fixes langchain-ai#5337 --------- Co-authored-by: Dev 2049 <[email protected]>

@eyurtsev

…ift, rust) (langchain-ai#5171) As the title says, I added more code splitters. The implementation is trivial, so i don't add separate tests for each splitter. Let me know if any concerns. Fixes # (issue) langchain-ai#5170 ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: @eyurtsev @hwchase17 --------- Signed-off-by: byhsu <[email protected]> Co-authored-by: byhsu <[email protected]>

# Fix for docstring in faiss.py vectorstore (load_local) The doctring should reflect that load_local loads something FROM the disk.

…-ai#5449) This removes duplicate code presumably introduced by a cut-and-paste error, spotted while reviewing the code in ```langchain/client/langchain.py```. The original code had back to back occurrences of the following code block: ``` response = self._get( path, params=params, ) raise_for_status_with_text(response) ```

# What does this PR do? Bring support of `encode_kwargs` for ` HuggingFaceInstructEmbeddings`, change the docstring example and add a test to illustrate with `normalize_embeddings`. Fixes langchain-ai#3605 (Similar to langchain-ai#3914) Use case: ```python from langchain.embeddings import HuggingFaceInstructEmbeddings model_name = "hkunlp/instructor-large" model_kwargs = {'device': 'cpu'} encode_kwargs = {'normalize_embeddings': True} hf = HuggingFaceInstructEmbeddings( model_name=model_name, model_kwargs=model_kwargs, encode_kwargs=encode_kwargs ) ```

@hwchase17

…#5432) # Handles the edge scenario in which the action input is a well formed SQL query which ends with a quoted column There may be a cleaner option here (or indeed other edge scenarios) but this seems to robustly determine if the action input is likely to be a well formed SQL query in which we don't want to arbitrarily trim off `"` characters Fixes langchain-ai#5423 ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Agents / Tools / Toolkits - @vowelparrot

@hwchase17

# docs cleaning Changed docs to consistent format (probably, we need an official doc integration template): - ClearML - added product descriptions; changed title/headers - Rebuff - added product descriptions; changed title/headers - WhyLabs - added product descriptions; changed title/headers - Docugami - changed title/headers/structure - Airbyte - fixed title - Wolfram Alpha - added descriptions, fixed title - OpenWeatherMap - - added product descriptions; changed title/headers - Unstructured - changed description ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: @hwchase17 @dev2049

@hwchase17

# Added Async _acall to FakeListLLM FakeListLLM is handy when unit testing apps built with langchain. This allows the use of FakeListLLM inside concurrent code with [asyncio](https://docs.python.org/3/library/asyncio.html). I also changed the pydocstring which was out of date. ## Who can review? @hwchase17 - project lead @agola11 - async

# Add batching to Qdrant Several people requested a batching mechanism while uploading data to Qdrant. It is important, as there are some limits for the maximum size of the request payload, and without batching implemented in Langchain, users need to implement it on their own. This PR exposes a new optional `batch_size` parameter, so all the documents/texts are loaded in batches of the expected size (64, by default). The integration tests of Qdrant are extended to cover two cases: 1. Documents are sent in separate batches. 2. All the documents are sent in a single request.

Update [psychicapi](https://pypi.org/project/psychicapi/) python package dependency to the latest version 0.5. The newest python package version addresses breaking changes in the Psychic http api.

# Add maximal relevance search to SKLearnVectorStore This PR implements the maximum relevance search in SKLearnVectorStore. Twitter handle: jtolgyesi (I submitted also the original implementation of SKLearnVectorStore) ## Before submitting Unit tests are included. Co-authored-by: Dev 2049 <[email protected]>

Co-authored-by: Dev 2049 <[email protected]>

…loader (langchain-ai#5466) # Adds ability to specify credentials when using Google BigQuery as a data loader Fixes langchain-ai#5465 . Adds ability to set credentials which must be of the `google.auth.credentials.Credentials` type. This argument is optional and will default to `None. Co-authored-by: Dev 2049 <[email protected]>

…the class BooleanOutputParser (langchain-ai#5397) when the LLMs output 'yes|no'，BooleanOutputParser can parse it to 'True|False', fix the ValueError in parse().  Fixes # (issue) langchain-ai#5396 langchain-ai#5396 --------- Co-authored-by: gaofeng27692 <[email protected]>

# Added support for modifying the number of threads in the GPT4All model I have added the capability to modify the number of threads used by the GPT4All model. This allows users to adjust the model's parallel processing capabilities based on their specific requirements. ## Changes Made - Updated the `validate_environment` method to set the number of threads for the GPT4All model using the `values["n_threads"]` parameter from the `GPT4All` class constructor. ## Context Useful in scenarios where users want to optimize the model's performance by leveraging multi-threading capabilities. Please note that the `n_threads` parameter was included in the `GPT4All` class constructor but was previously unused. This change ensures that the specified number of threads is utilized by the model . ## Dependencies There are no new dependencies introduced by this change. It only utilizes existing functionality provided by the GPT4All package. ## Testing Since this is a minor change testing is not required. --------- Co-authored-by: Dev 2049 <[email protected]>

# Allow for async use of SelfAskWithSearchChain Co-authored-by: Dev 2049 <[email protected]>

…bject (langchain-ai#5321) This PR adds a new method `from_es_connection` to the `ElasticsearchEmbeddings` class allowing users to use Elasticsearch clusters outside of Elastic Cloud. Users can create an Elasticsearch Client object and pass that to the new function. The returned object is identical to the one returned by calling `from_credentials` ``` # Create Elasticsearch connection es_connection = Elasticsearch( hosts=['https://es_cluster_url:port'], basic_auth=('user', 'password') ) # Instantiate ElasticsearchEmbeddings using es_connection embeddings = ElasticsearchEmbeddings.from_es_connection( model_id, es_connection, ) ``` I also added examples to the elasticsearch jupyter notebook Fixes # langchain-ai#5239 --------- Co-authored-by: Dev 2049 <[email protected]>

# SQLite-backed Entity Memory Following the initiative of langchain-ai#2397 I think it would be helpful to be able to persist Entity Memory on disk by default Co-authored-by: Dev 2049 <[email protected]>

Co-authored-by: David Revillas <[email protected]>

@dev2049

# Support Qdrant filters Qdrant has an [extensive filtering system](https://qdrant.tech/documentation/concepts/filtering/) with rich type support. This PR makes it possible to use the filters in Langchain by passing an additional param to both the `similarity_search_with_score` and `similarity_search` methods. ## Who can review? @dev2049 @hwchase17 --------- Co-authored-by: Dev 2049 <[email protected]>

Co-authored-by: Tom Piaggio <[email protected]> Co-authored-by: scafati98 <[email protected]> Co-authored-by: scafati98 <[email protected]> Co-authored-by: Dev 2049 <[email protected]>

…angchain into mongo-document-loader

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

async mongo document loader #4285

async mongo document loader #4285

Commits on May 17, 2023

Commits on May 18, 2023

Commits on May 19, 2023

Commits on May 20, 2023

Commits on May 21, 2023

Commits on May 22, 2023

Commits on May 23, 2023

Commits on May 24, 2023

Commits on May 25, 2023

Commits on May 26, 2023

Commits on May 27, 2023

Commits on May 28, 2023

Commits on May 29, 2023

Commits on May 30, 2023

Commits on May 31, 2023

Commits on Sep 13, 2023

Commits on Sep 14, 2023