async mongo document loader #4285

Closed
This pull request is big! We’re only showing the most recent 250 commits.

Commits on May 17, 2023

  1. fix homepage typo (langchain-ai#4883)

    # Fix Homepage Typo
    
    ## Who can review?
    
    Community members can review the PR once tests pass. Tag
    maintainers/contributors who might be interested... not sure
    cjcjameson authored May 17, 2023
    d6e0b9a
  2. Tiny code review and docs fix for Docugami DataLoader (langchain-ai#4877)
    
    # Docs and code review fixes for Docugami DataLoader
    
    1. I noticed a couple of hyperlinks that are not loading in the
    langchain docs (I guess need explicit anchor tags). Added those.
    2. In code review @eyurtsev had a
    [suggestion](langchain-ai#4727 (comment))
    to allow string paths. Turns out just updating the type works (I tested
    locally with string paths).
    
    # Pre-submission checks
    I ran `make lint` and `make tests` successfully.
    
    ---------
    
    Co-authored-by: Taqi Jaffri <[email protected]>
    tjaffri and Taqi Jaffri authored May 17, 2023
    ef8b5f6
  3. feat(Add FastAPI + Vercel deployment option): (langchain-ai#4520)

    # Update deployments doc with langcorn API server
    
    API server example 
    
    ```python
    from fastapi import FastAPI
    
    from langcorn import create_service
    
    app: FastAPI = create_service(
        "examples.ex1:chain",
        "examples.ex2:chain",
        "examples.ex3:chain",
        "examples.ex4:sequential_chain",
        "examples.ex5:conversation",
        "examples.ex6:conversation_with_summary",
    )
    
    ```
    More examples: https://github.com/msoedov/langcorn/tree/main/examples
    
    Co-authored-by: Dev 2049 <[email protected]>
    msoedov and dev2049 authored May 17, 2023
    4c3ab55
  4. 1ff7c95

Commits on May 18, 2023

  1. ConversationalChatAgent: Allow customizing TEMPLATE_TOOL_RESPONSE (langchain-ai#2361)
    
    It's currently not possible to change the `TEMPLATE_TOOL_RESPONSE`
    prompt for ConversationalChatAgent, this PR changes that.
    
    ---------
    
    Co-authored-by: Dev 2049 <[email protected]>
    FOLLGAD and dev2049 authored May 18, 2023
    5c9205d
  2. Faiss no avx2 (langchain-ai#4895)

    Co-authored-by: Ali Mirlou <[email protected]>
    dev2049 and AliMirlou authored May 18, 2023
    df0c33a
  3. Add a generic document loader (langchain-ai#4875)

    # Add generic document loader
    
    * This PR adds a generic document loader which can assemble a loader
    from a blob loader and a parser
    * Adds a registry for parsers
    * Populate registry with a default mimetype based parser
    
    
    ## Expected changes
    
    - Parsing involves loading content via IO so can be sped up via:
      * Threading in sync
      * Async  
    - The actual parsing logic may be computationally involved: may need to
    figure out how to add multi-processing support
    - May want to add suffix based parser since suffixes are easier to
    specify in comparison to mime types
    
    ## Before submitting
    
    No notebooks yet, we first need to get a few of the basic parsers up
    (prior to advertising the interface)
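
The design described above (a loader assembled from a blob source plus a parser looked up by mimetype) can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `Blob`, `PARSER_REGISTRY`, `parse_plain_text`, and `GenericLoaderSketch` are hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, Iterator, List


@dataclass
class Blob:
    """Raw content plus the mimetype used to pick a parser (hypothetical type)."""

    data: bytes
    mimetype: str


# A parser turns one blob into zero or more text documents.
Parser = Callable[[Blob], Iterator[str]]


def parse_plain_text(blob: Blob) -> Iterator[str]:
    """Default parser for this sketch: decode the bytes as UTF-8."""
    yield blob.data.decode("utf-8")


# Registry mapping mimetype -> parser, populated with one default entry.
PARSER_REGISTRY: Dict[str, Parser] = {"text/plain": parse_plain_text}


class GenericLoaderSketch:
    """Assembles a loader from a blob source and a mimetype-based parser lookup."""

    def __init__(self, blobs: Iterable[Blob], registry: Dict[str, Parser]) -> None:
        self.blobs = blobs
        self.registry = registry

    def lazy_load(self) -> Iterator[str]:
        for blob in self.blobs:
            parser = self.registry.get(blob.mimetype)
            if parser is None:
                raise ValueError(f"No parser registered for {blob.mimetype!r}")
            yield from parser(blob)


docs: List[str] = list(
    GenericLoaderSketch([Blob(b"hello", "text/plain")], PARSER_REGISTRY).lazy_load()
)
```

Keeping the blob source and the parser separate is what would let the IO step be threaded or made async later without touching the parsing logic.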
    eyurtsev authored May 18, 2023
    8e41143
  4. Add html parsers (langchain-ai#4874)

    # Add bs4 html parser
    
    * Some minor refactors
    * Extract the bs4 html parsing code from the bs html loader
    * Move some tests from integration tests to unit tests
    eyurtsev authored May 18, 2023
    0dc304c
  5. Cadlabs/python tool sanitization (langchain-ai#4754)

    Co-authored-by: BenSchZA <[email protected]>
    dev2049 and BenSchZA authored May 18, 2023
    e28bdf4
  6. Zep memory (langchain-ai#4898)

    Co-authored-by: Daniel Chalef <[email protected]>
    Co-authored-by: Daniel Chalef <[email protected]>
    3 people authored May 18, 2023
    8966f61
  7. a4ac006
  8. Fix AzureOpenAI embeddings documentation example. model -> deployment (langchain-ai#4389)
    
    # Documentation for Azure OpenAI embeddings model
    
    - OPENAI_API_VERSION environment variable is needed for the endpoint
    - The constructor does not work with model, it works with deployment.
    
    I fixed it in the notebook.
    
    (This is my first contribution)
    
    ## Who can review?
    
    @hwchase17 
    @agola
    
    Co-authored-by: Harrison Chase <[email protected]>
    IsmaelGSerrano and hwchase17 authored May 18, 2023
    41e2394
  9. Update getting_started.md (langchain-ai#4482)

    # Added another helpful way for developers who want to set OpenAI API
    Key dynamically
    
    Previous methods like exporting environment variables are good for
    project-wide settings.
    However, many recent use cases need to assign API keys dynamically.
    
    ```python
    from langchain.llms import OpenAI
    llm = OpenAI(openai_api_key="OPENAI_API_KEY")
    ```
    
    ## Before submitting
    ```bash
    export OPENAI_API_KEY="..."
    ```
    Or,
    ```python
    import os
    os.environ["OPENAI_API_KEY"] = "..."
    ```
    
    <hr>
    
    Thank you.
    Cheers,
    Bongsang
    bongsang authored May 18, 2023
    613bf9b
  10. docs: text splitters improvements (langchain-ai#4490)

    # docs: text splitters improvements
    
    Changes are only in the Jupyter notebooks.
    - added links to the source packages and a short description of these
    packages
    - removed " Text Splitters" suffixes from the TOC elements (they made
    the list of the text splitters messy)
    - moved the text splitters that are based on the length function into a
    separate list. They can be mixed with any classes from the "Text Splitters",
    so it is a different classification.
    
    ## Who can review?
            @hwchase17 - project lead
            @eyurtsev
            @vowelparrot
    
    NOTE: please, check out the results of the `Python code` text splitter
    example (text_splitters/examples/python.ipynb). It looks suboptimal.
    leo-gan authored May 18, 2023
    c998569
  11. Harrison/serper api bug (langchain-ai#4902)

    Co-authored-by: Jerry Luan <[email protected]>
    hwchase17 and luanjunyi authored May 18, 2023
    9e2227b
  12. Harrison/faiss norm (langchain-ai#4903)

    Co-authored-by: Jiaxin Shan <[email protected]>
    hwchase17 and Jeffwan authored May 18, 2023
    ba023d5
  13. 9165267
  14. Harrison/unified objectives (langchain-ai#4905)

    Co-authored-by: Matthias Samwald <[email protected]>
    hwchase17 and matthias-samwald authored May 18, 2023
    b8d4893
  15. dfbf45f
  16. Load specific file types from Google Drive (issue langchain-ai#4878) (langchain-ai#4926)
    
    # Load specific file types from Google Drive (issue langchain-ai#4878)
    Add the possibility to define what file types you want to load from
    Google Drive.
     
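    ```python
    from langchain.document_loaders import GoogleDriveLoader

    # Load only documents and PDFs from the folder, without recursing.
    loader = GoogleDriveLoader(
        folder_id="1yucgL9WGgWZdM1TOuKkeghlPizuzMYb5",
        file_types=["document", "pdf"],
        recursive=False,
    )
    ```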
    
    Fixes langchain-ai#4878
    
    ## Who can review?
    Community members can review the PR once tests pass. Tag
    maintainers/contributors who might be interested:
    DataLoaders
    - @eyurtsev
    
    Twitter: [@UmerHAdil](https://twitter.com/@UmerHAdil) | Discord:
    RicChilligerDude#7589
    
    ---------
    
    Co-authored-by: UmerHA <[email protected]>
    eyurtsev and UmerHA authored May 18, 2023
    c06a47a
  17. API update: Engines -> Models (langchain-ai#4915)

    # API update: Engines -> Models
    
    see: https://community.openai.com/t/api-update-engines-models/18597
    
    Co-authored-by: assert <[email protected]>
    assert6 and assert6 authored May 18, 2023
    8c28ad6
  18. feat langchain-ai#4479: TextLoader auto detect encoding and improved exceptions (langchain-ai#4927)
    
    # TextLoader auto detect encoding and enhanced exception handling
    
    - Add an option to enable encoding detection on `TextLoader`. 
    - The detection is done using `chardet`
    - Loading tries each detected encoding in order of confidence and raises
    an exception if none of them succeeds.
    
    ### New Dependencies:
    - `chardet`
    
    Fixes langchain-ai#4479 
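
The fallback strategy described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the PR's implementation: `decode_with_fallback` is a hypothetical helper, and the candidate list is hard-coded here, whereas in the PR the candidates come from `chardet`.

```python
from typing import List, Tuple


def decode_with_fallback(data: bytes, candidates: List[Tuple[str, float]]) -> str:
    """Try each (encoding, confidence) pair, highest confidence first."""
    ordered = sorted(candidates, key=lambda pair: pair[1], reverse=True)
    for encoding, _confidence in ordered:
        try:
            return data.decode(encoding)
        except (UnicodeDecodeError, LookupError):
            continue  # wrong guess or unknown codec: try the next candidate
    raise RuntimeError("Could not decode data with any detected encoding")


# The higher-confidence guess (utf-8) fails on this input, so latin-1 is used.
raw = "café".encode("latin-1")
text = decode_with_fallback(raw, [("latin-1", 0.6), ("utf-8", 0.9)])
```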
    
    ## Who can review?
    
    Community members can review the PR once tests pass. Tag
    maintainers/contributors who might be interested:
    
    - @eyurtsev
    
    ---------
    
    Co-authored-by: blob42 <spike@w530>
    eyurtsev and blob42 authored May 18, 2023
    e462028
  19. Fix bilibili (langchain-ai#4860)

    # Fix bilibili api import error
    
    The bilibili-api package is deprecated and there is no sync module.
    
    Fixes langchain-ai#2673 langchain-ai#2724 
    
    ## Who can review?
    
    Community members can review the PR once tests pass. Tag
    maintainers/contributors who might be interested:
    @vowelparrot  @liaokongVFX 
    
    yuekaizhang authored May 18, 2023
    1ed4228
  20. Add human message as input variable to chat agent prompt creation (langchain-ai#4542)
    
    # Add human message as input variable to chat agent prompt creation
    
    This PR adds human message and system message input to
    `CHAT_ZERO_SHOT_REACT_DESCRIPTION` agent, similar to [conversational
    chat
    agent](https://github.com/hwchase17/langchain/blob/7bcf238a1acf40aef21a5a198cf0e62d76f93c15/langchain/agents/conversational_chat/base.py#L64-L71).
    
    I ran into this issue when trying to use the `create_prompt` function with the
    [BabyAGI agent with tools
    notebook](https://python.langchain.com/en/latest/use_cases/autonomous_agents/baby_agi_with_agent.html),
    since BabyAGI uses a “task” input variable instead of “input”. For the normal
    zero shot react agent this is fine because I can manually change the
    suffix to “{input}\n\n{agent_scratchpad}” just like the notebook, but I
    cannot do this with the conversational chat agent, which blocks me from
    using BabyAGI with the chat zero shot agent.
    
    I tested this in my own project
    [Chrome-GPT](https://github.com/richardyc/Chrome-GPT) and this fix
    worked.
    
    ## Request for review
    Agents / Tools / Toolkits
    - @vowelparrot
    richardyc authored May 18, 2023
    7642f21
  21. add alias for model (langchain-ai#4553)

    Co-authored-by: Dev 2049 <[email protected]>
    hwchase17 and dev2049 authored May 18, 2023
    c9a362e
  22. dont error on sql import (langchain-ai#4647)

    This makes it so we don't throw errors when importing langchain when
    sqlalchemy==1.3.1.
    
    We don't really want to support 1.3.1 (it seems like an unnecessary
    maintenance cost), BUT we would like it to not error terribly should someone
    decide to run on it.
    hwchase17 authored May 18, 2023
    d5a0704
  23. docs: compound ecosystem and integrations (langchain-ai#4870)

    # Docs: compound ecosystem and integrations
    
    **Problem statement:** We have a big overlap between the
    References/Integrations and Ecosystem/LangChain Ecosystem pages. It
    confuses users. It creates a situation where a new integration is added
    to only one of these pages, which creates even more confusion.
    - removed the References/Integrations page (but all its information will
    be moved into the individual integration pages in the next PR).
    - renamed Ecosystem/LangChain Ecosystem to Integrations/Integrations.
    I like the Ecosystem term. It is more generic and semantically richer
    than the Integration term, but it mentally overloads users. The
    `integration` term is more concrete.
    UPDATE: after discussion, Ecosystem is the term.
    Ecosystem/Integrations is the page (in place of Ecosystem/LangChain
    Ecosystem).
    
    As a result, a user gets a single place to start with the individual
    integration.
    leo-gan authored May 18, 2023
    e2d7677
  24. Update GPT4ALL integration (langchain-ai#4567)

    # Update GPT4ALL integration
    
    GPT4ALL has completely changed its bindings. They use a somewhat odd
    implementation that doesn't fit well into base.py, and it will probably
    be changed again, so this is a temporary solution.
    
    Fixes langchain-ai#3839, langchain-ai#4628
    Chae4ek authored May 18, 2023
    c9e2a01
  25. FIX: GPTCache cache_obj creation loop (langchain-ai#4827)

    The `_get_gptcache` method keeps creating a new gptcache instance; here's the fix.
    
    # Fix GPTCache cache_obj creation loop
    
    Fixes langchain-ai#4830 
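
One common way to avoid re-creating a cache object on every lookup is to memoize it per key. The sketch below illustrates that pattern only; it is not the code from this PR. `CacheRegistrySketch`, its `factory` argument, and the choice of an `llm_string` key are assumptions made for this example.

```python
from typing import Any, Callable, Dict


class CacheRegistrySketch:
    """Creates one cache object per key and reuses it on later lookups."""

    def __init__(self, factory: Callable[[str], Any]) -> None:
        self._factory = factory
        self._caches: Dict[str, Any] = {}

    def get(self, llm_string: str) -> Any:
        # Only build a new cache object the first time a key is seen.
        if llm_string not in self._caches:
            self._caches[llm_string] = self._factory(llm_string)
        return self._caches[llm_string]


created = []  # records every factory call so reuse can be observed


def make_cache(key: str) -> dict:
    created.append(key)
    return {}


registry = CacheRegistrySketch(make_cache)
first = registry.get("model-a")
second = registry.get("model-a")
```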
    
    Co-authored-by: Dev 2049 <[email protected]>
    elBarkey and dev2049 authored May 18, 2023
    a8ded21
  26. 440b876
  27. 55baa0d
  28. docs supabase update (langchain-ai#4935)

    # docs: updated `Supabase` notebook
    
    - the title of the notebook was inconsistent (included redundant
    "Vectorstore"). Removed this "Vectorstore"
    - added `Postgres` to the title. It is important. The `Postgres` name
    is much more popular than `Supabase`.
    - added a description for `Postgres`
    - added more info to the `Supabase` description
    leo-gan authored May 18, 2023
    c75c077
  29. Correct typo in APIChain example notebook (Farenheit -> Fahrenheit) (langchain-ai#4938)
    
    Correct typo in APIChain example notebook (Farenheit -> Fahrenheit)
    verygoodsoftwarenotvirus authored May 18, 2023
    7e8e21c
  30. 3002c1d
  31. Update custom_multi_action_agent.ipynb (langchain-ai#4931)

    Updated the docs from 
    "An agent consists of three parts:" to 
    "An agent consists of two parts:" since there are only two parts in the
    documentation
    vishwa-rn authored May 18, 2023
    c9f963e
  32. docs: added ecosystem/dependents page (langchain-ai#4941)

    # docs: added `ecosystem/dependents` page
    
    Added `ecosystem/dependents` page. Can we propose a better page name?
    leo-gan authored May 18, 2023
    8f8593a
  33. docs: vectorstores, different updates and fixes (langchain-ai#4939)

    # docs: vectorstores, different updates and fixes
    
    Multiple updates:
    - added/improved descriptions
    - fixed header levels
    - added headers
    - fixed headers
    leo-gan authored May 18, 2023
    a9bb314
  34. Chatconv agent: output parser exception (langchain-ai#4923)

    The output parser from the chat conversational agent now raises
    `OutputParserException` like the rest.
    
    The `raise OutputParserException(...) from e` form also carries through
    the original error details on what went wrong.
    
    I added `ValueError` as a base class to `OutputParserException` to
    avoid breaking code that was relying on `ValueError` as a way to catch
    exceptions from the agent. So catching `ValueError` still works. Not sure
    if this is a good idea though?
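
The backward-compatibility argument above can be illustrated with a small standalone sketch. The class below is a simplified stand-in for the real `OutputParserException`, and `parse_action` is a hypothetical parser invented for this example.

```python
import json


class OutputParserException(ValueError):
    """Subclassing ValueError keeps existing `except ValueError` handlers working."""


def parse_action(text: str) -> dict:
    try:
        return json.loads(text)
    except json.JSONDecodeError as e:
        # `from e` chains the original error so its details are not lost.
        raise OutputParserException(f"Could not parse LLM output: {text}") from e


# Older calling code that only knows about ValueError still catches the error.
try:
    parse_action("not json")
except ValueError as err:
    caught = err
```

The original `JSONDecodeError` stays reachable through `caught.__cause__`, which is what the `from e` form provides.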
    blob42 authored May 18, 2023
    5525b70
  35. Zep Retriever - Vector Search Over Chat History (langchain-ai#4533)

    # Zep Retriever - Vector Search Over Chat History with the Zep Long-term
    Memory Service
    
    More on Zep: https://github.com/getzep/zep
    
    Note: This PR is related to and relies on
    langchain-ai#4834. I did not want to
    modify the `pyproject.toml` file to add the `zep-python` dependency a
    second time.
    
    Co-authored-by: Daniel Chalef <[email protected]>
    danielchalef and Daniel Chalef authored May 18, 2023
    c8c2276
  36. Fix get_num_tokens for Anthropic models (langchain-ai#4911)

    The Anthropic classes used `BaseLanguageModel.get_num_tokens` because of
    an issue with multiple inheritance. Fixed by moving the method from
    `_AnthropicCommon` to both its subclasses.
    
    This change will significantly speed up token counting for Anthropic
    users.
    jarib authored May 18, 2023
    3df2d83

Commits on May 19, 2023

  1. NIT: Instead of hardcoding k in each definition, define it as a param above. (langchain-ai#2675)
    
    Co-authored-by: Dev 2049 <[email protected]>
    Co-authored-by: Davis Chase <[email protected]>
    3 people authored May 19, 2023
    e027a38
  2. [nit] Simplify Spark Creation Validation Check A Little Bit (langchain-ai#4761)
    
    - simplify the validation check a little bit.
    - re-tested in jupyter notebook.
    
    Reviewer: @hwchase17
    skcoirz authored May 19, 2023
    db6f7ed
  3. Fix for syntax when setting search_path for Snowflake database (langchain-ai#4747)
    
    # Fixes syntax for setting Snowflake database search_path
    
    An error occurs when using a Snowflake database and providing a schema
    argument.
    I have updated the syntax to run a Snowflake specific query when the
    database dialect is 'snowflake'.
    aboland authored May 19, 2023
    c069732
  4. Harrison/spell executor (langchain-ai#4914)

    Co-authored-by: Jan Minar <[email protected]>
    hwchase17 and rdancer authored May 19, 2023
    5feb60f
  5. Add Spark SQL support (langchain-ai#4602) (langchain-ai#4956)

    # Add Spark SQL support 
    * Add Spark SQL support. It can connect to Spark via building a
    local/remote SparkSession.
    * Include a notebook example
    
    I tried some complicated queries (window function, table joins), and the
    tool works well.
    Compared to the [Spark Dataframe
    agent](https://python.langchain.com/en/latest/modules/agents/toolkits/examples/spark.html),
    this tool is able to generate queries across multiple tables.
    
    ---------
    
    
    Co-authored-by: Gengliang Wang <[email protected]>
    Co-authored-by: Mike W <[email protected]>
    Co-authored-by: Eugene Yurtsev <[email protected]>
    Co-authored-by: UmerHA <[email protected]>
    Co-authored-by: 张城铭 <[email protected]>
    Co-authored-by: assert <[email protected]>
    Co-authored-by: blob42 <spike@w530>
    Co-authored-by: Yuekai Zhang <[email protected]>
    Co-authored-by: Richard He <[email protected]>
    Co-authored-by: Dev 2049 <[email protected]>
    Co-authored-by: Leonid Ganeline <[email protected]>
    Co-authored-by: Alexey Nominas <[email protected]>
    Co-authored-by: elBarkey <[email protected]>
    Co-authored-by: Davis Chase <[email protected]>
    Co-authored-by: Jeffrey D <[email protected]>
    Co-authored-by: so2liu <[email protected]>
    Co-authored-by: Viswanadh Rayavarapu <[email protected]>
    Co-authored-by: Chakib Ben Ziane <[email protected]>
    Co-authored-by: Daniel Chalef <[email protected]>
    Co-authored-by: Daniel Chalef <[email protected]>
    Co-authored-by: Jari Bakken <[email protected]>
    Co-authored-by: escafati <[email protected]>
    23 people authored May 19, 2023
    88a3a56
  6. Support Databricks in SQLDatabase (langchain-ai#4702)

    This PR adds support for Databricks runtime and Databricks SQL by using
    [Databricks SQL Connector for
    Python](https://docs.databricks.com/dev-tools/python-sql-connector.html).
    As a cloud data platform, accessing Databricks requires a URL as follows
    
    `databricks://token:{api_token}@{hostname}?http_path={http_path}&catalog={catalog}&schema={schema}`.
    
    **The URL is complicated and it may take users a while to figure it
    out**. Since the `api_token`/`hostname`/`http_path` fields are
    known in the Databricks notebook, I am proposing a new method
    `from_databricks` to simplify the connection to Databricks.
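
To make the URL shape concrete, the template quoted above can be filled in with a small helper. This is only a sketch of the string format: `build_databricks_uri` is a hypothetical function, not the `from_databricks` method itself, and all values below are placeholders.

```python
def build_databricks_uri(
    api_token: str, hostname: str, http_path: str, catalog: str, schema: str
) -> str:
    """Fill in the connection URL template quoted in the commit message."""
    return (
        f"databricks://token:{api_token}@{hostname}"
        f"?http_path={http_path}&catalog={catalog}&schema={schema}"
    )


# Placeholder values only; none of these refer to a real workspace.
uri = build_databricks_uri(
    api_token="dapi-XXXX",
    hostname="example.cloud.databricks.com",
    http_path="/sql/1.0/warehouses/abc",
    catalog="main",
    schema="default",
)
```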
    
    ## In Databricks Notebook
    After the changes, Databricks users only need to specify the `catalog` and
    `schema` fields when using langchain.
    <img width="881" alt="image"
    src="https://github.com/hwchase17/langchain/assets/1097932/984b4c57-4c2d-489d-b060-5f4918ef2f37">
    
    ## In Jupyter Notebook
    The method can be used on the local setup as well:
    <img width="678" alt="image"
    src="https://github.com/hwchase17/langchain/assets/1097932/142e8805-a6ef-4919-b28e-9796ca31ef19">
    gengliangwang authored May 19, 2023
    bf5a3c6
  7. 13c3763
  8. Update tutorials.md (langchain-ai#4960)

    # Added a YouTube Tutorial
    
    Added a LangChain tutorial playlist aimed at onboarding newcomers to
    LangChain and its use cases.
    
    I've shared the video in the #tutorials channel and it seemed to be well
    received. I think this could be useful to the greater community.
    
    ## Who can review?
    
    @dev2049
    edrickdch authored May 19, 2023
    e80585b
  9. Update planner_prompt.py (langchain-ai#4967)

    Typos in the OpenAPI agent Prompt.
    vishwa-rn authored May 19, 2023
    e68dfa7
  10. power bi api wrapper integration tests & bug fix (langchain-ai#4983)

    # Powerbi API wrapper bug fix + integration tests
    
    - Bug fix by removing `TYPE_CHECKING` in utilities/powerbi.py
    - Added integration test for power bi api in
    utilities/test_powerbi_api.py
    - Added integration test for power bi agent in
    agent/test_powerbi_agent.py
    - Edited .env.examples to help set up power bi related environment
    variables
    - Updated demo notebook with working code in
    docs../examples/powerbi.ipynb - AzureOpenAI -> ChatOpenAI
    
    Notes: 
    
    Chat models (gpt3.5, gpt4) are much more capable than davinci at writing
    DAX queries, so that is important to getting the agent to work properly.
    Interestingly, gpt3.5-turbo needed the examples=DEFAULT_FEWSHOT_EXAMPLES
    to write consistent DAX queries, so gpt4 seems necessary as the smart
    llm.
    
    Fixes langchain-ai#4325
    
    ## Before submitting
    
    Azure-core and Azure-identity are necessary dependencies
    
    check integration tests with the following:
    `pytest tests/integration_tests/utilities/test_powerbi_api.py`
    `pytest tests/integration_tests/agent/test_powerbi_agent.py`
    
    You will need a power bi account with a dataset id + table name in order
    to test. See .env.examples for details.
    
    ## Who can review?
    @hwchase17
    @vowelparrot
    
    ---------
    
    Co-authored-by: aditya-pethe <[email protected]>
    eyurtsev and aditya-pethe authored May 19, 2023
    06e5244
  11. 2abf6b9
  12. Remove autoreload in examples (langchain-ai#4994)

    # Remove autoreload in examples
    Remove the `autoreload` in examples since it is not necessary for most
    users:
    ```
    %load_ext autoreload
    %autoreload 2
    ```
    gengliangwang authored May 19, 2023
    a87a252
  13. Bug fixes and error handling in Redis - Vectorstore (langchain-ai#4932)

    # Bug fixes in Redis - Vectorstore (Added the version of redis to the
    error message and removed the cls argument from a classmethod)
    
    
    Co-authored-by: Tyler Hutcherson <[email protected]>
    iamadhee and tylerhutcherson authored May 19, 2023
    616e9a9
  14. Add async search with relevance score (langchain-ai#4558)

    Add the async version for the search with relevance score
    
    Co-authored-by: Dev 2049 <[email protected]>
    jpzhangvincent and dev2049 authored May 19, 2023
    22d844d
  15. Make test gha workflow manually runnable (langchain-ai#4998)

    if https://docs.github.com/en/actions/using-workflows/events-that-trigger-workflows#workflow_dispatch
    is to be believed, this should make it possible to manually kick off the test
    workflow, but I don't know much about these things
    dev2049 authored May 19, 2023
    56cb77a
  16. Adds 'IN' metadata filter for pgvector for checking set presence (langchain-ai#4982)

    # Adds "IN" metadata filter for pgvector to allow checking for set
    presence
    
    PGVector currently supports metadata filters of the form:
    ```
    {"filter": {"key": "value"}}
    ```
    which will return documents where the "key" metadata field is equal to
    "value".
    
    This PR adds support for metadata filters of the form:
    ```
    {"filter": {"key": { "IN" : ["list", "of", "values"]}}}
    ```
    
    Other vector stores support this via an "$in" syntax. I chose to use
    "IN" to match postgres' syntax, though happy to switch.
    Tested locally with PGVector and ChatVectorDBChain.
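
The two filter shapes described above can be sketched in plain Python. This only illustrates the matching semantics (equality versus set membership) and is not PGVector's SQL implementation; `matches_filter` and the sample metadata are hypothetical.

```python
from typing import Any, Dict


def matches_filter(metadata: Dict[str, Any], flt: Dict[str, Any]) -> bool:
    """Return True if `metadata` satisfies every condition in `flt`.

    Hypothetical helper: a plain value means equality, and a dict of the
    form {"IN": [...]} means set membership.
    """
    for key, condition in flt.items():
        value = metadata.get(key)
        if isinstance(condition, dict) and "IN" in condition:
            if value not in condition["IN"]:
                return False
        elif value != condition:
            return False
    return True


# Made-up metadata records, used only for illustration.
docs = [
    {"source": "a.txt", "lang": "en"},
    {"source": "b.txt", "lang": "de"},
    {"source": "c.txt", "lang": "fr"},
]

equal = [d["source"] for d in docs if matches_filter(d, {"lang": "en"})]
member = [
    d["source"] for d in docs if matches_filter(d, {"lang": {"IN": ["en", "fr"]}})
]
```

Here `equal` keeps only the document whose `lang` is exactly `"en"`, while `member` keeps every document whose `lang` appears in the list.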
    
    
    @dev2049
    
    ---------
    
    Co-authored-by: [email protected] <[email protected]>
    eyurtsev and jadespanning authored May 19, 2023
    0ff5956
  17. Update python.py (langchain-ai#4971)

    # Delete a useless "print"
    pengwork authored May 19, 2023
    62d0a01
  18. PGVector logger message level (langchain-ai#4920)

    # Change the logger message level
    
    The library is logging at `error` level a situation that is not an
    error.
    We noticed this error in our logs, but from our point of view it's an
    expected behavior and the log level should be `warning`.
    jmtristancho authored May 19, 2023
    729e935
  19. feature/4493 Improve Evernote Document Loader (langchain-ai#4577)

    # Improve Evernote Document Loader
    
    When exporting from Evernote you may export more than one note.
    Currently the Evernote loader concatenates the content of all notes in
    the export into a single document and only attaches the name of the
    export file as metadata on the document.
    
    This change ensures that each note is loaded as an independent document
    and all available metadata on the note e.g. author, title, created,
    updated are added as metadata on each document.
    
    It also uses an existing optional dependency of `html2text` instead of
    `pypandoc` to remove the need to download the pandoc application via
    `download_pandoc()` to be able to use the `pypandoc` python bindings.
    
    Fixes langchain-ai#4493 
    
    Co-authored-by: Mike McGarry <[email protected]>
    Co-authored-by: Dev 2049 <[email protected]>
    3 people authored May 19, 2023
    ddd595f
  20. Fix graphql tool (langchain-ai#4984)

    Fix construction and add unit test.
    dev2049 authored May 19, 2023
    080eb1b
  21. changed ValueError to ImportError (langchain-ai#5006)

    # changed ValueError to ImportError in except
    
    Several places with this bug. ValueError does not catch ImportError.
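
A minimal sketch of why the exception type matters, assuming a helper that imports an optional dependency. The function names and the module name are hypothetical; the point is only that `except ValueError` lets an `ImportError` propagate.

```python
import importlib


def load_optional(module_name: str):
    """Import an optional dependency, returning None if it is missing."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        # ModuleNotFoundError is a subclass of ImportError, so it is caught here.
        return None


def load_optional_buggy(module_name: str):
    """Same helper with the wrong exception type in the `except` clause."""
    try:
        return importlib.import_module(module_name)
    except ValueError:
        # Never reached for a missing module: ImportError is not a ValueError.
        return None


missing = "surely_not_an_installed_module_xyz"  # hypothetical module name
ok = load_optional(missing)

try:
    load_optional_buggy(missing)
    leaked = False
except ImportError:
    # The buggy helper let the ImportError escape to the caller.
    leaked = True
```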
    leo-gan authored May 19, 2023
    2ab0e1d
  22. docs: Big Mendable Improvements (langchain-ai#4964)

    - Higher accuracy on the responses
    - New redesigned UI
    - Pretty Sources: display the sources by title / sub-section instead of
    long URL.
    - Fixed Reset Button bugs and some other UI issues
    - Other tweaks
    nickscamara authored May 19, 2023
    02632d5
  23. added instruction about pip install google-generativeai (langchain-ai#5004)

    # added instruction about pip install google-generativeai
    leo-gan authored May 19, 2023
    ddc2d4c
  24. Update the GPTCache example (langchain-ai#4985)

    # Update the GPTCache example
    
    Fixes langchain-ai#4757
    SimFG authored May 19, 2023
    f07b9fd
  25. 9928fb2
  26. Add self query translator for weaviate vectorstore (langchain-ai#4804)

    # Add self query translator for weaviate vectorstore
    
    Adds support for the EQ comparator and the AND/OR operators. 
    
    Co-authored-by: Dominic Chan <[email protected]>
    Co-authored-by: Dev 2049 <[email protected]>
    3 people authored May 19, 2023
    6c60251
  27. Check for single prompt in __call__ method of the BaseLLM class (langchain-ai#4892)

    # Ensuring that users pass a single prompt when calling an LLM

    - This PR adds a check to the `__call__` method of the `BaseLLM` class
    to ensure that it is called with a single prompt
    - Raises a `ValueError` if users try to call an LLM with a list of prompts
    and instructs them to use the `generate` method instead
    
    ## Why this could be useful
    
    I stumbled across this by accident. I accidentally called the OpenAI LLM
    with a list of prompts instead of a single string and still got a
    result:
    
    ```
    >>> from langchain.llms import OpenAI
    >>> llm = OpenAI()
    >>> llm(["Tell a joke"]*2)
    "\n\nQ: Why don't scientists trust atoms?\nA: Because they make up everything!"
    ```
    
    It might be better to catch such a scenario, to prevent unnecessary costs
    and irritation for the user.
    
    ## Proposed behaviour
    
    ```
    >>> from langchain.llms import OpenAI
    >>> llm = OpenAI()
    >>> llm(["Tell a joke"]*2)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/Users/marcus/Projects/langchain/langchain/llms/base.py", line 291, in __call__
        raise ValueError(
    ValueError: Argument `prompt` is expected to be a single string, not a list. If you want to run the LLM on multiple prompts, use `generate` instead.
    ```
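
The kind of guard described above might look like the following sketch. It is not the code from `langchain/llms/base.py`; `call_llm` and `generate` are hypothetical stand-ins that echo their input instead of calling a model.

```python
from typing import List


def generate(prompts: List[str]) -> List[str]:
    """Stand-in for a batch call; echoes each prompt instead of calling a model."""
    return [f"echo: {p}" for p in prompts]


def call_llm(prompt: str) -> str:
    """Accept exactly one prompt string and reject anything else."""
    if not isinstance(prompt, str):
        raise ValueError(
            "Argument `prompt` is expected to be a single string, not "
            f"{type(prompt).__name__}. Use `generate` for multiple prompts."
        )
    return generate([prompt])[0]


single = call_llm("Tell a joke")

try:
    call_llm(["Tell a joke"] * 2)  # type: ignore[arg-type]
    rejected = False
except ValueError:
    rejected = True
```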
    mwinterde authored May 19, 2023
    2aa3754

Commits on May 20, 2023

  1. Add logs command (langchain-ai#5007)

    to the plus server
    vowelparrot authored May 20, 2023
    27e63b9
  2. fix prompt saving (langchain-ai#4987)

    will add unit tests
    dev2049 authored May 20, 2023
    3bc0bf0
  3. Streaming only final output of agent (langchain-ai#2483) (langchain-ai#4630)

    # Streaming only final output of agent (langchain-ai#2483)
    As requested in issue langchain-ai#2483, this callback streams only the
    final output of an agent (i.e., not the intermediate steps).
    
    Fixes langchain-ai#2483
    
    Co-authored-by: Dev 2049 <[email protected]>
    UmerHA and dev2049 authored May 20, 2023
    7388248
  4. 9d1280d

Commits on May 21, 2023

  1. Fix annoying typo in docs (langchain-ai#5029)

    # Fixes an annoying typo in docs
    
    Fixes Annoying typo in docs - "Therefor" -> "Therefore". It's so
    annoying to read that I just had to make this PR.
    tornikeo authored May 21, 2023
    a6ef20d
  2. Add documentation for Databricks integration (langchain-ai#5013)

    # Add documentation for Databricks integration
    
    This is a follow-up of langchain-ai#4702
    It documents the details of how to integrate Databricks using langchain.
    It also provides examples in a notebook.
    
    
    ## Who can review?
    @dev2049 @hwchase17 since you are aware of the context. We will promote
    the integration after this doc is ready. Thanks in advance!
    gengliangwang authored May 21, 2023
    f9f08c4
  3. DOC: Misspelling in agents.rst documentation (langchain-ai#5038)

    # Corrected Misspelling in agents.rst Documentation
    
    In the
    [documentation](https://python.langchain.com/en/latest/modules/agents.html)
    it says "in fact, it is often best to have an Action Agent be in
    **change** of the execution for the Plan and Execute agent."

    **Suggested Change:** I propose correcting "change" to "charge".
    
    Fix for issue: langchain-ai#5039
    jeffzheng13 authored May 21, 2023
    424a573
  4. 8c661ba
  5. Harrison/psychic (langchain-ai#5063)

    Co-authored-by: Ayan Bandyopadhyay <[email protected]>
    Co-authored-by: Dev 2049 <[email protected]>
    3 people authored May 21, 2023
    b0431c6
  6. 6c25f86
  7. move docs

    hwchase17 committed May 21, 2023
    224f73e
  8. 0c3de0a

Commits on May 22, 2023

  1. feat: batch multiple files in a single Unstructured API request (langchain-ai#4525)

    ### Submit Multiple Files to the Unstructured API

    Enables batching multiple files into a single Unstructured API request.
    Support for requests with multiple files was added to both
    `UnstructuredAPIFileLoader` and `UnstructuredAPIFileIOLoader`. Note that
    if you submit multiple files in "single" mode, the result will be
    concatenated into a single document. We recommend using this feature in
    "elements" mode.
    
    ### Testing
    
    The following should load both documents, using two of the example docs
    from the integration tests folder.
    
    ```python
    from langchain.document_loaders import UnstructuredAPIFileLoader

    file_paths = ["examples/layout-parser-paper.pdf", "examples/whatsapp_chat.txt"]

    loader = UnstructuredAPIFileLoader(
        file_paths=file_paths,
        api_key="FAKE_API_KEY",
        strategy="fast",
        mode="elements",
    )
    docs = loader.load()
    ```
    MthwRobinson authored May 22, 2023
    bf3f554
  2. preserve language in conversation retrieval (langchain-ai#4969)

    Without the addition of 'in its original language', the condensing
    response, more often than not, outputs the rephrased question in
    English, even when the conversation is in another language. This
    question in English then transfers to the question in the retrieval
    prompt and the chatbot is stuck in English.
    
    I'm sometimes surprised that this does not happen more often, but
    apparently the GPT models are smart enough to understand that when the
    template contains
    
    Question: ....
    Answer:
    
    then the answer should be in the language of the question.
    hansvdam authored May 22, 2023
    a395ff7
  3. docs: Deployments page moved into Ecosystem/ (langchain-ai#4949)

    # docs: `deployments` page moved into `ecosystem/`
    
    The `Deployments` page moved into the `Ecosystem/` group
    
    Small fixes:
    - `index` page: fixed order of items in the `Modules` list, in the `Use
    Cases` list
    - item `References/Installation` was lost in the `index` page (not on
    the Navbar!). Restored it.
    - added `|` marker in several places.
    
    NOTE: I also thought about moving the `Additional Resources/Gallery`
    page into the `Ecosystem` group but decided to leave it unchanged.
    Please, advise on this.
    
    ## Who can review?
    
    Community members can review the PR once tests pass. Tag
    maintainers/contributors who might be interested:
    @dev2049
    leo-gan authored May 22, 2023
    443ebe2
  4. Separate Runner Functions from Client (langchain-ai#5079)

    Extract the methods specific to running an LLM or Chain on a dataset to
    separate utility functions.
    
    This simplifies the client a bit and lets us separate concerns of LCP
    details from running examples (e.g., for evals)
    vowelparrot authored May 22, 2023
    ef7d015
  5. Add 'get_token_ids' method (langchain-ai#4784)

    Lets users inspect the token ids in addition to getting the number of tokens
    
    ---------
    
    Co-authored-by: Zach Schillaci <[email protected]>
    vowelparrot and zachschillaci27 authored May 22, 2023
    785502e
  6. Improved query, print & exception handling in REPL Tool (langchain-ai#4997)
    
    Update to pull request langchain-ai#3215
    
    Summary:
    1) Improved the sanitization of the query (using regex) by removing a
    leading `python` command, since gpt-3.5-turbo sometimes treats the Python
    console as a terminal and runs the `python` command first, which causes an
    error. One-line Python snippets also sometimes contain single backticks.
    2) Added 7 new test cases.
    
    For more details, view the previous pull request.
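
A regex-based cleanup along the lines described in the summary could be sketched as follows. This is an illustration under assumptions, not necessarily the code in the PR: the function name `sanitize_input` and both patterns are chosen here for the example.

```python
import re


def sanitize_input(query: str) -> str:
    """Strip whitespace, backticks, and a leading `python` token from a query."""
    # Leading whitespace/backticks, an optional "python" (any case), then whitespace.
    query = re.sub(r"^(\s|`)*(?i:python)?\s*", "", query)
    # Trailing whitespace/backticks.
    query = re.sub(r"(\s|`)*$", "", query)
    return query


fenced = sanitize_input("```python\nprint(1 + 1)\n```")
inline = sanitize_input("`print('hi')`")
```

One limitation of this sketch: a query that legitimately begins with the word `python` (for example a variable named `python_version`) would lose that prefix, so a stricter pattern may be preferable in practice.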
    
    ---------
    
    Co-authored-by: Deepak S V <[email protected]>
    svdeepak99 and svdeepak99 authored May 22, 2023
    49ca027
  7. Harrison/neo4j (langchain-ai#5078)

    Co-authored-by: Tomaz Bratanic <[email protected]>
    Co-authored-by: Dev 2049 <[email protected]>
    3 people authored May 22, 2023
    10ba201
  8. Bump 177 (langchain-ai#5095)

    dev2049 authored May 22, 2023
    fcd88bc
  9. fix: revert docarray explicit transitive dependencies and use extras instead (langchain-ai#5015)
    
    tldr: The docarray [integration
    PR](langchain-ai#4483) introduced a
    pinned dependency to protobuf. This is a docarray dependency, not a
    langchain dependency. Since this is handled by the docarray
    dependencies, it is unnecessary here.
    
    Further, as a pinned dependency, this quickly leads to incompatibilities
    with application code that consumes the library, all the more so with a
    heavily used library like protobuf.

    Detail: as we see in the [docarray
    integration](https://github.com/hwchase17/langchain/pull/4483/files#diff-50c86b7ed8ac2cf95bd48334961bf0530cdc77b5a56f852c5c61b89d735fd711R81-R83),
    the transitive dependencies of docarray were also listed as langchain
    dependencies. This is unnecessary as the docarray project has an
    appropriate
    [extras](https://github.com/docarray/docarray/blob/a01a05542d17264b8a164bec783633658deeedb8/pyproject.toml#L70).
    The docarray project also does not require this _pinned_ version of
    protobuf, rather [a minimum
    version](https://github.com/docarray/docarray/blob/a01a05542d17264b8a164bec783633658deeedb8/pyproject.toml#L41).
    So this pinned version was likely in error.
    
    To fix this, this PR reverts the explicit hnswlib and protobuf
    dependencies and adds the hnswlib extras install for docarray (which
    installs hnswlib and protobuf, as originally intended). Because version
    `0.32.0`
    of the docarray hnswlib extras added protobuf, we bump the docarray
    dependency from `^0.31.0` to `^0.32.0`.
    
    # revert docarray explicit transitive dependencies and use extras
    instead
    
    ## Who can review?
    
    @dev2049 -- reviewed the original PR
    @eyurtsev -- bumped the pinned protobuf dependency a few days ago
    
    ---------
    
    Co-authored-by: Dev 2049 <[email protected]>
    malandis and dev2049 authored May 22, 2023
    6eacd88
  10. Improving Resilience of MRKL Agent (langchain-ai#5014)

    This is a highly optimized update to the pull request
    langchain-ai#3269
    
    Summary:
    1) Added the ability for the MRKL agent to recover on its own from the
    ValueError(f"Could not parse LLM output: `{llm_output}`") error, which
    occurs whenever the LLM (especially gpt-3.5-turbo) does not follow the
    MRKL Agent format when returning "Action:" & "Action Input:".
    2) The error is handled by responding to the LLM with the messages
    "Invalid Format: Missing 'Action:' after 'Thought:'" and
    "Invalid Format: Missing 'Action Input:' after 'Action:'" whenever
    "Action:" or "Action Input:", respectively, is missing from the LLM
    output.
    
    For a detailed explanation, look at the previous pull request.
    
    New Updates:
    1) Since @hwchase17 requested in the previous PR that the
    self-correction (error) message be communicated using the
    OutputParserException, I have added the ability for the
    OutputParserException class to store the observation and the previous
    llm_output so they can be passed to the agent's next prompt. This is
    done without breaking or modifying any existing OutputParserException
    functionality (i.e., OutputParserException can still be used as before,
    without passing an observation or previous llm_output).
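
The idea of an exception that optionally carries feedback for the next prompt can be sketched as below. `ParserError` and `parse` are hypothetical stand-ins written for this example; they are not LangChain's `OutputParserException` or the MRKL output parser.

```python
from typing import Optional


class ParserError(ValueError):
    """Parsing failure that can carry feedback for the next prompt.

    Hypothetical stand-in for the exception described above; both extra
    fields are optional so existing `raise ParserError("msg")` calls still work.
    """

    def __init__(
        self,
        message: str,
        observation: Optional[str] = None,
        llm_output: Optional[str] = None,
    ) -> None:
        super().__init__(message)
        self.observation = observation
        self.llm_output = llm_output


def parse(llm_output: str) -> str:
    """Return the action name, or raise with a corrective observation."""
    if "Action:" not in llm_output:
        raise ParserError(
            f"Could not parse LLM output: `{llm_output}`",
            observation="Invalid Format: Missing 'Action:' after 'Thought:'",
            llm_output=llm_output,
        )
    # Take the rest of the line that follows the first "Action:" marker.
    return llm_output.split("Action:", 1)[1].strip().split("\n", 1)[0]


try:
    parse("Thought: I should search.")
    feedback = None
except ParserError as err:
    # The caller can forward this observation to the model as a correction.
    feedback = err.observation

action = parse("Thought: look it up\nAction: Search\nAction Input: weather")
plain = ParserError("old-style usage")
```

Keeping the new fields optional is what preserves backward compatibility: code that raises or catches the exception without them behaves as before.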
    
    ---------
    
    Co-authored-by: Deepak S V <[email protected]>
    svdeepak99 and svdeepak99 authored May 22, 2023
    5cd1210
  11. Improve pinecone hybrid search retriever adding metadata support (langchain-ai#5098)
    
    # Improve pinecone hybrid search retriever adding metadata support
    
    I simply remove the hardwiring of metadata to the existing
    implementation allowing one to pass `metadatas` attribute to the
    constructors and in `get_relevant_documents`. I also add one missing pip
    install to the accompanying notebook (I am not adding dependencies, they
    were pre-existing).
    
    First contribution, just hoping to help, feel free to critique :) 
    my twitter username is `@andreliebschner`
    
    While looking at hybrid search I noticed langchain-ai#3043 and langchain-ai#1743. I think the
    former can be closed, as following the example right now (even prior to
    my improvements) works just fine. The latter can, I think, also be closed
    safely, perhaps after pointing out the relevant classes and example. Should I
    reply to those issues, mentioning someone?
    
    @dev2049, @hwchase17
    
    ---------
    
    Co-authored-by: Andreas Liebschner <[email protected]>
    lbsnrs and lbsnrs authored May 22, 2023
    44dc959
  12. Add the usage of SSL certificates for Elasticsearch and user password authentication (langchain-ai#5058)
    
    Enhance the code to support SSL authentication for Elasticsearch when
    using the VectorStore module, as previous versions did not provide this
    capability.
    @dev2049
    
    ---------
    
    Co-authored-by: caidong <[email protected]>
    Co-authored-by: Dev 2049 <[email protected]>
    3 people authored May 22, 2023
    039f8f1
  13. add get_top_k_cosine_similarity method to get max top k score and index (langchain-ai#5059)

    # Row-wise cosine similarity between two equal-width matrices, returning
    the top_k highest scores and their indices, with all scores greater than
    threshold_score.
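
The computation described in the heading can be sketched in pure Python. This is an assumption-laden illustration, not the method added by the PR: the names `cosine` and `top_k_cosine`, the parameter names, and the return shape (index pairs plus scores) are all chosen for this example.

```python
import math
from typing import List, Optional, Sequence, Tuple


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def top_k_cosine(
    X: List[Sequence[float]],
    Y: List[Sequence[float]],
    top_k: int = 5,
    score_threshold: Optional[float] = None,
) -> Tuple[List[Tuple[int, int]], List[float]]:
    """Return (row-in-X, row-in-Y) index pairs and their scores, best first."""
    scored = [
        (cosine(x, y), (i, j)) for i, x in enumerate(X) for j, y in enumerate(Y)
    ]
    if score_threshold is not None:
        # Keep only pairs whose score is strictly greater than the threshold.
        scored = [item for item in scored if item[0] > score_threshold]
    scored.sort(key=lambda item: item[0], reverse=True)
    best = scored[:top_k]
    return [idx for _, idx in best], [score for score, _ in best]


X = [[1.0, 0.0], [0.0, 1.0]]
Y = [[1.0, 0.0], [1.0, 1.0]]
pairs, scores = top_k_cosine(X, Y, top_k=2, score_threshold=0.5)
```

A NumPy version would compute the full similarity matrix in one matrix product; the nested loop here trades speed for having no dependencies.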
    
    Co-authored-by: Dev 2049 <[email protected]>
    hwaking and dev2049 authored May 22, 2023
    e57ebf3
  14. PowerBI major refinement in working of tool and tweaks in the rest (langchain-ai#5090)
    
    # PowerBI major refinement in working of tool and tweaks in the rest
    
    I've gained some experience with more complex sets, and the earlier
    implementation needed too many tries by the agent to create DAX, so I
    refactored the code to run the LLM to create DAX based on a question and
    then immediately run the result against the dataset, with retries and a
    prompt that includes the error for the retry. This works much better!
    
    Also did some other refactoring of the inner workings, making things
    clearer, more concise and faster.
    eavanvalkenburg authored May 22, 2023
    1cb04f2
  15. fix: add_texts method of Weaviate vector store creates wrong embeddings (langchain-ai#4933)

    # fix a bug in the add_texts method of Weaviate vector store that creates
    wrong embeddings
    
    The following is the original code in the `add_texts` method of the
    Weaviate vector store, from line 131 to 153, which contains a bug. The
    code here includes some extra explanations in the form of comments and
    some omissions.
    
    ```python
                for i, doc in enumerate(texts):
    
                    # some code omitted
    
                    if self._embedding is not None:
                        # variable texts is a list of string and doc here is just a string. 
                        # list(doc) actually breaks up the string into characters.
                        # so, embeddings[0] is just the embedding of the first character
                        embeddings = self._embedding.embed_documents(list(doc))
                        batch.add_data_object(
                            data_object=data_properties,
                            class_name=self._index_name,
                            uuid=_id,
                            vector=embeddings[0],
                        )
    ```
    
    To fix this bug, I pulled the embedding operation out of the for loop
    and embed all texts at once.
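
The effect of `list(doc)` and of the fix can be reproduced with a toy embedder. `CharCountEmbedder` is a hypothetical stand-in that returns each text's length as a one-dimensional vector; it only serves to make the difference visible.

```python
from typing import List


class CharCountEmbedder:
    """Toy embedder (hypothetical): one 1-d vector per input text, its length."""

    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        return [[float(len(t))] for t in texts]


embedder = CharCountEmbedder()
texts = ["hello world", "hi"]

# Buggy pattern: list(doc) splits the string into characters, so
# embeddings[0] is the embedding of the first character only.
buggy = [embedder.embed_documents(list(doc))[0] for doc in texts]

# Fixed pattern: embed every text once, outside the loop, then index by position.
embeddings = embedder.embed_documents(texts)
fixed = [embeddings[i] for i, _ in enumerate(texts)]
```

With the buggy pattern every document gets the vector of a single character, while the fixed pattern yields one vector per full text and issues a single embedding call.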
    
    Co-authored-by: Shawn91 <[email protected]>
    Co-authored-by: Dev 2049 <[email protected]>
    3 people authored May 22, 2023
    9e64946
  16. update langchainplus client and docker file to reflect port changes (langchain-ai#5005)
    
    # Currently, only the dev images are updated
    agola11 authored May 22, 2023
    467ca6f
  17. Fixed import error for AutoGPT e.g. from langchain.experimental.auton… (langchain-ai#5101)
    
    `from langchain.experimental.autonomous_agents.autogpt.agent import
    AutoGPT` results in an import error as AutoGPT is not defined in the
    __init__.py file
    
    https://python.langchain.com/en/latest/use_cases/autonomous_agents/marathon_times.html
    
    An alternative would be to update the import statement directly to
    `from langchain.experimental import AutoGPT`
    
    Co-authored-by: Dev 2049 <[email protected]>
    ankitarya1019 and dev2049 authored May 22, 2023
    5b2b436
  18. Update serpapi.py (langchain-ai#4947)

    Added link option in `_process_response`

    In `_process_response`, "link" provided correct answers while the
    "snippet" reply provided non-working links. An `elif` branch for the link
    was therefore added before the snippet branch.
    
    @vowelparrot 
    
    ---------
    
    Co-authored-by: Dev 2049 <[email protected]>
    venetisgr and dev2049 authored May 22, 2023
    5e47c64
  19. changed ValueError to ImportError (langchain-ai#5103)

    # changed ValueError to ImportError
    
    Code cleaning.
    Fixed inconsistencies in ImportError handling: sometimes the code raises
    ImportError and sometimes ValueError.
    I've changed all cases to `raise ImportError`.
    Also:
    - added installation instructions to the error message where they were missing;
    - fixed several installation instructions in the error messages;
    - fixed several error-handling issues related to ImportError
    leo-gan authored May 22, 2023
    c28cc0f
  20. fix: assign current_time to datetime.now() if current_time is None (langchain-ai#5045)

    # Assign `current_time` to `datetime.now()` if `current_time` is `None`
    in `time_weighted_retriever`
    
    Fixes langchain-ai#4825 
    
    As implemented, `add_documents` in `TimeWeightedVectorStoreRetriever`
    assigns `doc.metadata["last_accessed_at"]` and
    `doc.metadata["created_at"]` to `datetime.datetime.now()` if
    `current_time` is not in `kwargs`.
    ```python
        def add_documents(self, documents: List[Document], **kwargs: Any) -> List[str]:
            """Add documents to vectorstore."""
            current_time = kwargs.get("current_time", datetime.datetime.now())
            # Avoid mutating input documents
            dup_docs = [deepcopy(d) for d in documents]
            for i, doc in enumerate(dup_docs):
                if "last_accessed_at" not in doc.metadata:
                    doc.metadata["last_accessed_at"] = current_time
                if "created_at" not in doc.metadata:
                    doc.metadata["created_at"] = current_time
                doc.metadata["buffer_idx"] = len(self.memory_stream) + i
            self.memory_stream.extend(dup_docs)
            return self.vectorstore.add_documents(dup_docs, **kwargs)
    ``` 
    However, from the way `add_documents` is being called from
    `GenerativeAgentMemory`, `current_time` is set as a `kwarg`, but it is
    given a value of `None`:
    ```python
        def add_memory(
            self, memory_content: str, now: Optional[datetime] = None
        ) -> List[str]:
            """Add an observation or memory to the agent's memory."""
            importance_score = self._score_memory_importance(memory_content)
            self.aggregate_importance += importance_score
            document = Document(
                page_content=memory_content, metadata={"importance": importance_score}
            )
            result = self.memory_retriever.add_documents([document], current_time=now)
    ```
    The default of `now` was set in langchain-ai#4658 to be None. The proposed fix is
    the following:
    ```python
        def add_documents(self, documents: List[Document], **kwargs: Any) -> List[str]:
            """Add documents to vectorstore."""
            current_time = kwargs.get("current_time", datetime.datetime.now())
            # `current_time` may exist in kwargs, but may still have the value of None.
            if current_time is None:
                current_time = datetime.datetime.now()
    ```
    Alternatively, we could just set the default of `now` to be
    `datetime.datetime.now()` everywhere instead. Thoughts @hwchase17? If we
    still want to keep the default to be `None`, then this PR should fix the
    above issue. If we want to set the default to be
    `datetime.datetime.now()` instead, I can update this PR with that
    alternative fix. EDIT: judging from langchain-ai#5018, we would prefer to
    keep the default as `None`, in which case this PR should fix the error.
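
The pitfall described above can be reproduced without LangChain. The following is a minimal sketch, assuming two hypothetical helpers (`resolve_time_buggy` and `resolve_time_fixed`) that are not part of the library. `dict.get` falls back to its default only when the key is absent, so an explicit `current_time=None` passes through unchanged.

```python
import datetime
from typing import Any, Optional


def resolve_time_buggy(**kwargs: Any) -> Optional[datetime.datetime]:
    # The default is used only when "current_time" is missing from kwargs.
    return kwargs.get("current_time", datetime.datetime.now())


def resolve_time_fixed(**kwargs: Any) -> datetime.datetime:
    current_time = kwargs.get("current_time")
    # Covers both a missing key and an explicit None.
    if current_time is None:
        current_time = datetime.datetime.now()
    return current_time
```

Calling `resolve_time_buggy(current_time=None)` returns `None`, while `resolve_time_fixed(current_time=None)` returns a timestamp.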
    mbchang authored May 22, 2023
    e173e03
  21. Add Mastodon toots loader (langchain-ai#5036)

    # Add Mastodon toots loader.
    
    The loader works either with public toots or with Mastodon app credentials.
    Toot text and user info are loaded.
    
    I've also added an integration test for this new loader, since it works with
    public data, and a notebook with example output from a fresh run.
    
    ---------
    
    Co-authored-by: Dev 2049 <[email protected]>
    imrehg and dev2049 authored May 22, 2023
    69de33e

Commits on May 23, 2023

  1. Add OpenLM LLM multi-provider (langchain-ai#4993)

    OpenLM is a zero-dependency OpenAI-compatible LLM provider that can call
    different inference endpoints directly via HTTP. It implements the
    OpenAI Completion class so that it can be used as a drop-in replacement
    for the OpenAI API. This changeset utilizes BaseOpenAI for minimal added
    code.
    
    ---------
    
    Co-authored-by: Dev 2049 <[email protected]>
    r2d4 and dev2049 authored May 23, 2023
    de6a401
  2. Pass Dataset Name by Name not Position (langchain-ai#5108)

    Pass dataset name by name
    vowelparrot authored May 23, 2023
    87bba2e
  3. Fixes issue langchain-ai#5072 - adds additional support to Weaviate (l…

    …angchain-ai#5085)
    
    Implementation is similar to search_distance and where_filter
    
    # adds 'additional' support to Weaviate queries
    
    Co-authored-by: Dev 2049 <[email protected]>
    jettro and dev2049 authored May 23, 2023
    b950022
  4. Improve efficiency of TextSplitter.split_documents, iterate once (lan…

    …gchain-ai#5111)
    
    # Improve TextSplitter.split_documents, collect page_content and
    metadata in one iteration
    
    ## Who can review?
    
    Community members can review the PR once tests pass. Tag
    maintainers/contributors who might be interested:
    
    @eyurtsev When `documents` is a generator that can only be iterated
    once, this change is a huge help. Otherwise a silent failure occurs:
    metadata is empty for all documents. So we widen the argument type from
    `List[Document]` to `Union[Iterable[Document], Sequence[Document]]`
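
The silent failure mentioned above can be illustrated with a small sketch. `Doc`, `collect_two_passes`, `collect_one_pass`, and `make_docs` are hypothetical names invented for this example; they only mimic the shape of the problem and are not the library's code.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Iterator, List, Tuple


@dataclass
class Doc:
    # Hypothetical stand-in for a LangChain Document.
    page_content: str
    metadata: Dict[str, str] = field(default_factory=dict)


def collect_two_passes(docs: Iterable[Doc]) -> Tuple[List[str], List[dict]]:
    texts = [d.page_content for d in docs]
    # A generator is already exhausted here, so this list comes back empty.
    metadatas = [d.metadata for d in docs]
    return texts, metadatas


def collect_one_pass(docs: Iterable[Doc]) -> Tuple[List[str], List[dict]]:
    texts: List[str] = []
    metadatas: List[dict] = []
    # A single loop gathers both fields, so generators work too.
    for d in docs:
        texts.append(d.page_content)
        metadatas.append(d.metadata)
    return texts, metadatas


def make_docs() -> Iterator[Doc]:
    yield Doc("a", {"source": "1"})
    yield Doc("b", {"source": "2"})
```

With a list as input, both helpers return the same result. With `make_docs()` as input, only the single-pass version keeps the metadata.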
    
    ---------
    
    Co-authored-by: Steven Tartakovsky <[email protected]>
    eyurtsev and startakovsky authored May 23, 2023
    d56313a
  5. WhyLabs callback (langchain-ai#4906)

    # Add a WhyLabs callback handler
    
    * Adds a simple WhyLabsCallbackHandler
    * Add required dependencies as optional
    * protect against missing modules with imports
    * Add docs/ecosystem basic example
    
    based on initial prototype from @andrewelizondo
    
    > This integration gathers privacy-preserving telemetry on text with
    whylogs and sends statistical profiles to the WhyLabs platform to monitor
    these metrics over time. For more information on what WhyLabs is, see:
    https://whylabs.ai
    
    After you run the notebook (if you have env variables set for the API
    Keys, org_id and dataset_id) you get something like this in WhyLabs:
    ![Screenshot
    (443)](https://github.com/hwchase17/langchain/assets/88007022/6bdb3e1c-4243-4ae8-b974-23a8bb12edac)
    
    Co-authored-by: Andre Elizondo <[email protected]>
    Co-authored-by: Dev 2049 <[email protected]>
    3 people authored May 23, 2023
    d4fd589
  6. Add AzureCognitiveServicesToolkit to call Azure Cognitive Services API (

    langchain-ai#5012)
    
    # Add AzureCognitiveServicesToolkit to call Azure Cognitive Services
    API: achieve some multimodal capabilities
    
    This PR adds a toolkit named AzureCognitiveServicesToolkit which bundles
    the following tools:
    - AzureCogsImageAnalysisTool: calls Azure Cognitive Services image
    analysis API to extract caption, objects, tags, and text from images.
    - AzureCogsFormRecognizerTool: calls Azure Cognitive Services form
    recognizer API to extract text, tables, and key-value pairs from
    documents.
    - AzureCogsSpeech2TextTool: calls Azure Cognitive Services speech to
    text API to transcribe speech to text.
    - AzureCogsText2SpeechTool: calls Azure Cognitive Services text to
    speech API to synthesize text to speech.
    
    This toolkit can be used to process image, document, and audio inputs.
    ---------
    
    Co-authored-by: Dev 2049 <[email protected]>
    whiskyboy and dev2049 authored May 23, 2023
    d7f807b
  7. Add link to Psychic from document loaders documentation page (langcha…

    …in-ai#5115)
    
    # Add link to Psychic from document loaders documentation page
    
    In my previous PR I forgot to update `document_loaders.rst` to link to
    `psychic.ipynb` to make it discoverable from the main documentation.
    Ayan-Bandyopadhyay authored May 23, 2023
    5c87dbf
  8. bump 178 (langchain-ai#5130)

    dev2049 authored May 23, 2023
    753f4cf
  9. docs: fix minor typo + add wikipedia package installation part in hum…

    …an_input_llm.ipynb (langchain-ai#5118)
    
    # Fix typo + add wikipedia package installation part in
    human_input_llm.ipynb
    This PR
    1. Fixes typo ("the the human input LLM"),
    2. Adds wikipedia package installation part (in accordance with
    `WikipediaQueryRun`
    [documentation](https://python.langchain.com/en/latest/modules/agents/tools/examples/wikipedia.html))
    
    in `human_input_llm.ipynb`
    (`docs/modules/models/llms/examples/human_input_llm.ipynb`)
    amicus-veritatis authored May 23, 2023
    7a75bb2
  10. solving langchain-ai#2887 (langchain-ai#5127)

    # Allowing openAI fine-tuned models
    Very simple fix that checks whether an OpenAI `model_name` is a
    fine-tuned model when loading `context_size` and when computing the
    call's cost in the `openai_callback`.
    
    Fixes langchain-ai#2887 
    ---------
    
    Co-authored-by: Dev 2049 <[email protected]>
    tommasodelorenzo and dev2049 authored May 23, 2023
    5002f3a
  11. Improve PlanningOutputParser whitespace handling (langchain-ai#5143)

    Some LLMs will produce numbered lists with leading whitespace, e.g. in
    response to "What is the sum of 2 and 3?":
    ```
    Plan:
      1. Add 2 and 3.
      2. Given the above steps taken, please respond to the users original question.
    ```
    This commit updates the PlanningOutputParser regex to ignore leading
    whitespace before the step number, enabling it to correctly parse this
    format.
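
As a rough illustration of the idea, the sketch below splits a plan into steps while tolerating indentation before the step number. The pattern `STEP_SPLIT` and the helper `split_steps` are assumptions made for this example; they are not the parser's actual regex or API.

```python
import re
from typing import List

# Illustrative pattern, not the parser's actual regex:
# a newline, optional spaces or tabs, a step number, a dot, one space.
STEP_SPLIT = re.compile(r"\n[ \t]*\d+\. ")


def split_steps(plan_text: str) -> List[str]:
    # The first chunk is the "Plan:" preamble, so it is dropped.
    return [s.strip() for s in STEP_SPLIT.split(plan_text)[1:]]
```

Without the `[ \t]*` part, the indented steps in the example output above would not be recognised as step boundaries.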
    TMRolle authored May 23, 2023
    754b513
  12. Add ElasticsearchEmbeddings class for generating embeddings using Ela…

    …sticsearch models (langchain-ai#3401)
    
    This PR introduces a new module, `elasticsearch_embeddings.py`, which
    provides a wrapper around Elasticsearch embedding models. The new
    ElasticsearchEmbeddings class allows users to generate embeddings for
    documents and query texts using a [model deployed in an Elasticsearch
    cluster](https://www.elastic.co/guide/en/machine-learning/current/ml-nlp-model-ref.html#ml-nlp-model-ref-text-embedding).
    
    ### Main features:
    
    1. The ElasticsearchEmbeddings class initializes with an Elasticsearch
    connection object and a model_id, providing an interface to interact
    with the Elasticsearch ML client through
    [infer_trained_model](https://elasticsearch-py.readthedocs.io/en/v8.7.0/api.html?highlight=trained%20model%20infer#elasticsearch.client.MlClient.infer_trained_model)
    .
    2. The `embed_documents()` method generates embeddings for a list of
    documents, and the `embed_query()` method generates an embedding for a
    single query text.
    3. The class supports custom input text field names in case the deployed
    model expects a different field name than the default `text_field`.
    4. The implementation is compatible with any model deployed in
    Elasticsearch that generates embeddings as output.
    
    ### Benefits:
    
    1. Simplifies the process of generating embeddings using Elasticsearch
    models.
    2. Provides a clean and intuitive interface to interact with the
    Elasticsearch ML client.
    3. Allows users to easily integrate Elasticsearch-generated embeddings.
    
    Related issue langchain-ai#3400
    
    ---------
    
    Co-authored-by: Dev 2049 <[email protected]>
    jeffvestal and dev2049 authored May 23, 2023
    0b542a9
  13. Adding Weather Loader (langchain-ai#5056)

    
    Co-authored-by: Tyler Hutcherson <[email protected]>
    Co-authored-by: Dev 2049 <[email protected]>
    3 people authored May 23, 2023
    68f0d45
  14. Add MosaicML inference endpoints (langchain-ai#4607)

    # Add MosaicML inference endpoints
    This PR adds support in langchain for MosaicML inference endpoints. We
    both serve a select few open source models, and allow customers to
    deploy their own models using our inference service. Docs are here
    (https://docs.mosaicml.com/en/latest/inference.html), and sign up form
    is here (https://forms.mosaicml.com/demo?utm_source=langchain). I'm not
    intimately familiar with the details of langchain, or the contribution
    process, so please let me know if there is anything that needs fixing or
    this is the wrong way to submit a new integration, thanks!
    
    I'm also not sure what the procedure is for integration tests. I have
    tested locally with my api key.
    
    ## Who can review?
    @hwchase17
    
    ---------
    
    Co-authored-by: Harrison Chase <[email protected]>
    dakinggg and hwchase17 authored May 23, 2023
    de6e6c7
  15. Empty check before pop (langchain-ai#4929)

    # Check whether 'other' is empty before popping
    
    This PR could fix a potential 'popping empty set' error.
    
    Co-authored-by: Junlin Zhou <[email protected]>
    edwardzjl and edwardzjl authored May 23, 2023
    9242998

Commits on May 24, 2023

  1. Add async versions of predict() and predict_messages() (langchain-ai#…

    …4867)
    
    # Add async versions of predict() and predict_messages()
    
    langchain-ai#4615 introduced a unifying interface for "base" and "chat" LLM models
    via the new `predict()` and `predict_messages()` methods that allow both
    types of models to operate on string and message-based inputs,
    respectively.
    
    This PR adds async versions of the same (`apredict()` and
    `apredict_messages()`) that are identical except for their use of
    `agenerate()` in place of `generate()`, which means they repurpose all
    existing work on the async backend.
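
The pairing described above can be sketched with a toy class. `EchoModel` is a hypothetical stand-in written for this example and does not reflect the real base classes; it only shows an async method mirroring its sync counterpart by awaiting `agenerate()` instead of calling `generate()`.

```python
import asyncio
from typing import List


class EchoModel:
    """Hypothetical model used only to illustrate the sync/async pairing."""

    def generate(self, prompts: List[str]) -> List[str]:
        return [p.upper() for p in prompts]

    async def agenerate(self, prompts: List[str]) -> List[str]:
        # Stand-in for a non-blocking backend call.
        await asyncio.sleep(0)
        return [p.upper() for p in prompts]

    def predict(self, text: str) -> str:
        return self.generate([text])[0]

    async def apredict(self, text: str) -> str:
        # Same body as predict(), except that it awaits agenerate().
        return (await self.agenerate([text]))[0]
```

A caller outside an event loop could run the async variant with `asyncio.run(EchoModel().apredict("hi"))`.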
    
    
    ## Who can review?
    
    Community members can review the PR once tests pass. Tag
    maintainers/contributors who might be interested:
            @hwchase17 (follows his work on langchain-ai#4615)
            @agola11 (async)
    
    ---------
    
    Co-authored-by: Harrison Chase <[email protected]>
    jlowin and hwchase17 authored May 24, 2023
    925dd3e
  2. fix: fix current_time=Now bug for aadd_documents in TimeWeightedRetri…

    …ever (langchain-ai#5155)
    
    # Same as PR langchain-ai#5045, but for async
    
    Fixes langchain-ai#4825 
    
    I had forgotten to update the asynchronous counterpart `aadd_documents`
    with the bug fix from PR langchain-ai#5045, so this PR also fixes `aadd_documents`
    too.
    
    ## Who can review?
    
    Community members can review the PR once tests pass. Tag
    maintainers/contributors who might be interested:
    @dev2049
    
    mbchang authored May 24, 2023
    b1b7f35
  3. Docs: updated getting_started.md (langchain-ai#5151)

    # Docs: updated getting_started.md
    
    Just accommodating some unnecessary spaces in the example of "pass few
    shot examples to a prompt template".
    
    @vowelparrot
    DanQuin authored May 24, 2023
    de4ef24
  4. Clarification of the reference to the "get_text_legth" function in ge… (

    langchain-ai#5154)
    
    # Clarification of the reference to the "get_text_legth" function in
    getting_started.md
    
    Reference to the function "get_text_legth" in the documentation did not
    make sense. Comment added for clarification.
    
    @hwchase17
    DanQuin authored May 24, 2023
    c111134
  5. docs: added missed document_loaders examples (langchain-ai#5150)

    # DOCS added missed document_loader examples
    
    Added missed examples: `JSON`, `Open Document Format (ODT)`,
    `Wikipedia`, `tomarkdown`.
    Updated them to a consistent format.
    
    ## Who can review?
    
    @hwchase17 
    @dev2049
    leo-gan authored May 24, 2023
    3392948
  6. Add Typesense vector store (langchain-ai#1674)

    Closes langchain-ai#931.
    
    ---------
    
    Co-authored-by: Dev 2049 <[email protected]>
    jasonbosco and dev2049 authored May 24, 2023
    9c4b43b
  7. Vectara (langchain-ai#5069)

    # Vectara Integration
    
    This PR provides integration with Vectara. Implemented here are:
    * langchain/vectorstore/vectara.py
    * tests/integration_tests/vectorstores/test_vectara.py
    * langchain/retrievers/vectara_retriever.py
    And two IPYNB notebooks to do more testing:
    * docs/modules/chains/index_examples/vectara_text_generation.ipynb
    * docs/modules/indexes/vectorstores/examples/vectara.ipynb
    
    ---------
    
    Co-authored-by: Dev 2049 <[email protected]>
    ofermend and dev2049 authored May 24, 2023
    c81fb88
  8. Beam (langchain-ai#4996)

    # Beam
    
    Calls the Beam API wrapper to deploy and make subsequent calls to an
    instance of the gpt2 LLM in a cloud deployment. Requires installation of
    the Beam library and registration of Beam Client ID and Client Secret.
    Additional calls can then be made through the instance of the large
    language model in your code or by calling the Beam API.
    
    ---------
    
    Co-authored-by: Dev 2049 <[email protected]>
    NolanTrem and dev2049 authored May 24, 2023
    faa2665
  9. Update rellm_experimental.ipynb (langchain-ai#5189)

    # Your PR Title (What it does)
    
    HuggingFace -> Hugging Face
    eltociear authored May 24, 2023
    fff21a0
  10. cf19a2a
  11. adjust docarray docstrings (langchain-ai#5185)

    Follow up of langchain-ai#5015
    
    Thanks for catching this! 
    
    Just a small PR to adjust a couple of strings to these changes
    
    Signed-off-by: jupyterjazz <[email protected]>
    jupyterjazz authored May 24, 2023
    47e4ee4
  12. bump 179 (langchain-ai#5200)

    dev2049 authored May 24, 2023
    2d5588c
  13. Harrison/modelscope (langchain-ai#5156)

    Co-authored-by: thomas-yanxin <[email protected]>
    Co-authored-by: Dev 2049 <[email protected]>
    3 people authored May 24, 2023
    11c26eb
  14. Reuse length_func in MapReduceDocumentsChain (langchain-ai#5181)

    # Reuse `length_func` in `MapReduceDocumentsChain`
    
    Pretty straightforward refactor in `MapReduceDocumentsChain`. Reusing
    the local variable `length_func`, instead of the longer alternative
    `self.combine_document_chain.prompt_length`.
    
    @hwchase17
    zachschillaci27 authored May 24, 2023
    aa14e22
  15. Update Cypher QA prompt (langchain-ai#5173)

    # Improve Cypher QA prompt
    
    The current QA prompt is optimized for networkX answer generation, which
    returns all the possible triples.
    However, Cypher search is a bit more focused and doesn't necessarily
    return all the context information.
    Due to that reason, the model sometimes refuses to generate an answer
    even though the information is provided:
    
    ![Screenshot from 2023-05-24
    08-36-23](https://github.com/hwchase17/langchain/assets/19948365/351cf9c1-2567-447c-91fd-284ae3fa1ccf)
    
    
    To fix this issue, I have updated the prompt. Interestingly, I tried
    many variations with fewer instructions and they didn't work properly.
    However, the current fix works nicely.
    ![Screenshot from 2023-05-24
    08-37-25](https://github.com/hwchase17/langchain/assets/19948365/fc830603-e6ec-4a23-8a86-eaf572996014)
    tomasonjo authored May 24, 2023
    fd866d1
  16. Improve weaviate vectorstore docs (langchain-ai#5201)

    # Improve weaviate vectorstore docs
    hsm207 authored May 24, 2023
    b00c77d
  17. tfidf retriever (langchain-ai#5114)

    Co-authored-by: vempaliakhil96 <[email protected]>
    dev2049 and vempaliakhil96 authored May 24, 2023
    2b2176a
  18. standardize json parsing (langchain-ai#5168)

    Co-authored-by: Dev 2049 <[email protected]>
    hwchase17 and dev2049 authored May 24, 2023
    94cf391
  19. fixing total cost finetuned model giving zero (langchain-ai#5144)

    # OpenAI finetuned model giving zero tokens cost
    
    Very simple fix to the previously committed solution for allowing
    finetuned OpenAI models.
    
    Improves langchain-ai#5127 
    
    ---------
    
    Co-authored-by: Dev 2049 <[email protected]>
    tommasodelorenzo and dev2049 authored May 24, 2023
    52714ce
  20. Fixes scope of query Session in PGVector (langchain-ai#5194)

    `vectorstore.PGVector`: The transactional boundary should be increased
    to cover the query itself
    
    Currently, within the `similarity_search_with_score_by_vector` the
    transactional boundary (created via the `Session` call) does not include
    the select query being made.
    
    This can result in unintended consequences when interacting with the
    PGVector instance methods directly.
    
    
    ---------
    
    Co-authored-by: Dev 2049 <[email protected]>
    Matt Wells and dev2049 authored May 24, 2023
    c173bf1
  21. Output parsing variation allowance (langchain-ai#5178)

    # Output parsing variation allowance for self-ask with search
    
    This change makes self-ask with search easier for Llama models to
    follow, as they tend toward returning 'Followup:' instead of 'Follow
    up:' despite an otherwise valid remaining output.
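
One way to picture the allowance is a parser that accepts either spelling of the prefix. The sketch below is illustrative only: `FOLLOWUP_PREFIXES` and `extract_followup` are hypothetical names, and the real output parser may be structured differently.

```python
from typing import Optional, Tuple

# Illustrative prefixes; the parser's real constants may differ.
FOLLOWUP_PREFIXES: Tuple[str, ...] = ("Follow up:", "Followup:")


def extract_followup(llm_output: str) -> Optional[str]:
    # Only the last line of the model output is inspected.
    last_line = llm_output.strip().split("\n")[-1]
    for prefix in FOLLOWUP_PREFIXES:
        if last_line.startswith(prefix):
            return last_line[len(prefix):].strip()
    return None
```

Both `"Follow up: ..."` and `"Followup: ..."` then yield the same follow-up question, and any other last line yields `None`.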
    
    
    Co-authored-by: Dev 2049 <[email protected]>
    dibrale and dev2049 authored May 24, 2023
    d8eed60
  22. Allow readthedoc loader to pass custom html tag (langchain-ai#5175)

    ## Description
    
    The HTML structure of readthedocs pages can differ. Currently, the HTML
    tag is hardcoded in the reader, which does not fit some cases. This PR
    includes the following changes:
    
    1. Replace `find_all` with `find` because we just want one tag.
    2. Provide `custom_html_tag` to the loader.
    3. Add tests for readthedoc loader
    4. Refactor code
    
    ## Issues
    
    See more in langchain-ai#2609. The
    problem was not completely fixed in that pr.
    ---------
    
    Signed-off-by: byhsu <[email protected]>
    Co-authored-by: byhsu <[email protected]>
    Co-authored-by: Dev 2049 <[email protected]>
    3 people authored May 24, 2023
    f0730c6
  23. Add Iugu document loader (langchain-ai#5162)

    Create IUGU loader
    ---------
    
    Co-authored-by: Dev 2049 <[email protected]>
    rasiqueira and dev2049 authored May 24, 2023
    f10be07
  24. Add Joplin document loader (langchain-ai#5153)

    # Add Joplin document loader
    
    [Joplin](https://joplinapp.org/) is an open source note-taking app.
    
    Joplin has a [REST API](https://joplinapp.org/api/references/rest_api/)
    for accessing its local database. The proposed `JoplinLoader` uses the
    API to retrieve all notes in the database and their metadata. Joplin
    needs to be installed and running locally, and an access token is
    required.
    
    - The PR includes an integration test.
    - The PR includes an example notebook.
    
    ---------
    
    Co-authored-by: Dev 2049 <[email protected]>
    alondmnt and dev2049 authored May 24, 2023
    44abe92
  25. nit (langchain-ai#5208)

    dev2049 authored May 24, 2023
    dcee893
  26. b7fcb35
  27. Log warning (langchain-ai#5192)

    Changes debug log to warning log when LC Tracer fails to instantiate
    vowelparrot authored May 24, 2023
    66113c2
  28. e76e68b
  29. Add 'status' command to get server status (langchain-ai#5197)

    Example:
    
    
    ```
    $ langchain plus start --expose
    ...
    $ langchain plus status
    The LangChainPlus server is currently running.
    
    Service             Status         Published Ports
    langchain-backend   Up 40 seconds  1984
    langchain-db        Up 41 seconds  5433
    langchain-frontend  Up 40 seconds  80
    ngrok               Up 41 seconds  4040
    
    To connect, set the following environment variables in your LangChain application:
    LANGCHAIN_TRACING_V2=true
    LANGCHAIN_ENDPOINT=https://5cef-70-23-89-158.ngrok.io
    
    $ langchain plus stop
    $ langchain plus status
    The LangChainPlus server is not running.
    $ langchain plus start
    The LangChainPlus server is currently running.
    
    Service             Status        Published Ports
    langchain-backend   Up 5 seconds  1984
    langchain-db        Up 6 seconds  5433
    langchain-frontend  Up 5 seconds  80
    
    To connect, set the following environment variables in your LangChain application:
    LANGCHAIN_TRACING_V2=true
    LANGCHAIN_ENDPOINT=http://localhost:1984
    ```
    vowelparrot authored May 24, 2023
    e6c4571
  30. Harrison/vertex (langchain-ai#5049)

    Co-authored-by: Leonid Kuligin <[email protected]>
    Co-authored-by: Leonid Kuligin <[email protected]>
    Co-authored-by: sasha-gitg <[email protected]>
    Co-authored-by: Justin Flick <[email protected]>
    Co-authored-by: Justin Flick <[email protected]>
    6 people authored May 24, 2023
    a775aa6

Commits on May 25, 2023

  1. fix a mistake in concepts.md (langchain-ai#5222)

    # fix a mistake in concepts.md
    
    
    ## Who can review?
    
    Community members can review the PR once tests pass. Tag
    maintainers/contributors who might be interested:
    leo-gan authored May 25, 2023
    2ad29f4
  2. Create async copy of from_text() inside GraphIndexCreator. (langchain…

    …-ai#5214)
    
    Copies `GraphIndexCreator.from_text()` to make an async version called
    `GraphIndexCreator.afrom_text()`.
    
    This is (should be) a trivial change: it just adds a copy of
    `GraphIndexCreator.from_text()` which is async and awaits a call to
    `chain.apredict()` instead of `chain.predict()`. There is no unit test
    for GraphIndexCreator, and I did not create one, but this code works for
    me locally.
    
    @agola11 @hwchase17
    maspotts authored May 25, 2023
    95c9aa1
  3. Remove API key from docs (langchain-ai#5223)

    I found an API key for `serpapi_api_key` while reading the docs. It
    seems to have been modified very recently. Removed it in this PR
    @hwchase17 - project lead
    kbressem authored May 25, 2023
    eff31a3
  4. Change Default GoogleDriveLoader Behavior to not Load Trashed Files (…

    …issue langchain-ai#5104) (langchain-ai#5220)
    
    # Change Default GoogleDriveLoader Behavior to not Load Trashed Files
    (issue langchain-ai#5104)
    
    Fixes langchain-ai#5104
    
    If you need the previous behavior of loading files that used to live in
    the folder but are now trashed, you can use the `load_trashed_files`
    parameter:
    
    ```
    loader = GoogleDriveLoader(
        folder_id="1yucgL9WGgWZdM1TOuKkeghlPizuzMYb5",
        recursive=False,
        load_trashed_files=True
    )
    ```
    
    As not loading trashed files should be expected behavior, should we
    1. even provide the `load_trashed_files` parameter?
    2. add documentation? Feels most users will stick with default behavior
    
    ## Who can review?
    
    Community members can review the PR once tests pass. Tag
    maintainers/contributors who might be interested:
    
    DataLoaders
    - @eyurtsev
    
    Twitter: [@nicholasliu77](https://twitter.com/nicholasliu77)
    NickL77 authored May 25, 2023
    f0ea093
  5. Allow to specify ID when adding to the FAISS vectorstore. (langchain-…

    …ai#5190)
    
    # Allow to specify ID when adding to the FAISS vectorstore
    
    This change allows unique IDs to be specified when adding documents /
    embeddings to a faiss vectorstore.
    
    - This reflects the current approach with the chroma vectorstore.
    - It allows rejection of inserts on duplicate IDs
    - It will allow deletion / update by searching on a deterministic ID (such
    as a hash).
    - If not specified, a random UUID is generated (as per previous
    behaviour, so non-breaking).
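
The ID handling listed above can be sketched with a toy in-memory store. `TinyStore` and its `add_texts` method are hypothetical and only illustrate the behavior (optional IDs, duplicate rejection, UUID fallback); they are not the FAISS wrapper's implementation.

```python
import hashlib
import uuid
from typing import Dict, List, Optional


class TinyStore:
    """Hypothetical in-memory store; illustrates ID handling only."""

    def __init__(self) -> None:
        self.docs: Dict[str, str] = {}

    def add_texts(self, texts: List[str], ids: Optional[List[str]] = None) -> List[str]:
        # Fall back to random UUIDs when the caller supplies no IDs.
        if ids is None:
            ids = [str(uuid.uuid4()) for _ in texts]
        if len(ids) != len(texts):
            raise ValueError("ids and texts must have the same length")
        # Reject IDs repeated within the batch or already present in the store.
        if len(set(ids)) != len(ids) or set(ids) & set(self.docs):
            raise ValueError("duplicate ids are not allowed")
        self.docs.update(zip(ids, texts))
        return ids


store = TinyStore()
# A deterministic ID (a content hash) makes later lookup or update possible.
doc_id = hashlib.sha256(b"hello").hexdigest()
store.add_texts(["hello"], ids=[doc_id])
```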
    
    This commit fixes langchain-ai#5065 and langchain-ai#3896 and should fix langchain-ai#2699 indirectly. I've
    tested adding and merging.
    
    Kindly tagging @Xmaster6y @dev2049 for review.
    
    ---------
    
    Co-authored-by: Ati Sharma <[email protected]>
    Co-authored-by: Harrison Chase <[email protected]>
    3 people authored May 25, 2023
    40b086d
  6. Bibtex integration for document loader and retriever (langchain-ai#5137)

    # Bibtex integration
    
    Wrap bibtexparser to retrieve a list of docs from a bibtex file.
    * Get the metadata from the bibtex entries
    * `page_content` is taken from the local PDF referenced in the `file` field
    of the bibtex entry, using `pymupdf`
    * If there is no valid PDF file, `page_content` is set to the `abstract`
    field of the bibtex entry
    * Support Zotero flavour using regex to get the file path
    * Added usage example in
    `docs/modules/indexes/document_loaders/examples/bibtex.ipynb`
    ---------
    
    Co-authored-by: Sébastien M. Popoff <[email protected]>
    Co-authored-by: Dev 2049 <[email protected]>
    3 people authored May 25, 2023
    5cfa72a
  7. Add MiniMax embeddings (langchain-ai#5174)

    - Add support for MiniMax embeddings
    
    Doc: [MiniMax
    embeddings](https://api.minimax.chat/document/guides/embeddings?id=6464722084cdc277dfaa966a)
    
    ---------
    
    Co-authored-by: Archon <[email protected]>
    Co-authored-by: Dev 2049 <[email protected]>
    3 people authored May 25, 2023
    5cdd9ab
  8. Weaviate: Add QnA with sources example (langchain-ai#5247)

    # Add QnA with sources example 
    
    Fixes: see
    https://stackoverflow.com/questions/76207160/langchain-doesnt-work-with-weaviate-vector-database-getting-valueerror/76210017#76210017
    
    ## Before submitting
    
    ## Who can review?
    
    Community members can review the PR once tests pass. Tag
    maintainers/contributors who might be interested:
    
    @dev2049
    hsm207 authored May 25, 2023
    09e246f
  9. 9e57be4
  10. bump 180 (langchain-ai#5248)

    dev2049 authored May 25, 2023
    15b17f9
  11. remove extra "\n" to ensure that the format of the description, examp… (

    langchain-ai#5232)
    
    remove extra "\n" to ensure that the format of the description, example,
    and prompt&generation are completely consistent.
    pengqu123 authored May 25, 2023
    c7e2151
  12. Resolve error in StructuredOutputParser docs (langchain-ai#5240)

    # Resolve error in StructuredOutputParser docs
    
    Documentation for `StructuredOutputParser` is currently not reproducible;
    that is, `output_parser.parse(output)` raises an error because the LLM
    returns a response with an invalid format:
    
    ```python
    _input = prompt.format_prompt(question="what's the capital of france")
    output = model(_input.to_string())
    
    output
    
    # ?
    #
    # ```json
    # {
    # 	"answer": "Paris",
    # 	"source": "https://www.worldatlas.com/articles/what-is-the-capital-of-france.html"
    # }
    # ```
    ```
    
    This was fixed by adding a question mark to the prompt.
    mwinterde authored May 25, 2023
    9c0cb90
  13. Added the option of specifying a proxy for the OpenAI API (langchain-…

    …ai#5246)
    
    # Added the option of specifying a proxy for the OpenAI API
    
    Fixes langchain-ai#5243
    
    Co-authored-by: Yves Maurer <>
    ymaurer authored May 25, 2023
    88ed8e1
  14. OpenSearch top k parameter fix (langchain-ai#5216)

    For most queries it's the `size` parameter that determines final number
    of documents to return. Since our abstractions refer to this as `k`, set
    this to be `k` everywhere instead of expecting a separate param. Would
    be great to have someone more familiar with OpenSearch validate that
    this is reasonable (e.g. that having `size` and what OpenSearch calls
    `k` be the same won't lead to any strange behavior). cc @naveentatikonda
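
For reviewers less familiar with the request shape, here is a minimal sketch of an approximate k-NN search body in which `size` and the k-NN `k` are set to the same value. The helper name `knn_query` and the default field name `vector_field` are assumptions for illustration; this is not necessarily the exact body the wrapper builds.

```python
from typing import Any, Dict, List


def knn_query(vector: List[float], k: int = 4, vector_field: str = "vector_field") -> Dict[str, Any]:
    """Build a k-NN search body where `size` mirrors `k`."""
    return {
        # `size` caps how many hits the search returns overall.
        "size": k,
        # `k` is the number of neighbors requested from the k-NN query itself.
        "query": {"knn": {vector_field: {"vector": vector, "k": k}}},
    }


body = knn_query([0.1, 0.2, 0.3], k=3)
```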
    
    Closes langchain-ai#5212
    dev2049 authored May 25, 2023
    3be9ba1
  15. Fixed regression in JoplinLoader's get note url (langchain-ai#5265)

    Fixes a regression in JoplinLoader that was introduced during the code
    review (bad `page` wildcard in _get_note_url).
    
    ## Who can review?
    
    Community members can review the PR once tests pass. Tag
    maintainers/contributors who might be interested:
    
    @dev2049
    @leo-gan
    alondmnt authored May 25, 2023
    d3cd21c
  16. Docs link custom agent page in getting started (langchain-ai#5250)

    # Docs: link custom agent page in getting started
    JanilsWoerst authored May 25, 2023
    5525602
  17. Zep sdk version (langchain-ai#5267)

    zep-python's sync methods no longer need an asyncio wrapper. This was
    causing issues with FastAPI deployment.
    Zep also now supports putting and getting of arbitrary message metadata.
    
    Bump zep-python version to v0.30
    
    Remove nest-asyncio from Zep example notebooks.
    
    Modify tests to include metadata.
    
    ---------
    
    Co-authored-by: Daniel Chalef <[email protected]>
    Co-authored-by: Daniel Chalef <[email protected]>
    3 people authored May 25, 2023
    ca88b25
  18. Add C Transformers for GGML Models (langchain-ai#5218)

    # Add C Transformers for GGML Models
    I created Python bindings for the GGML models:
    https://github.com/marella/ctransformers
    
    Currently it supports GPT-2, GPT-J, GPT-NeoX, LLaMA, MPT, etc. See
    [Supported
    Models](https://github.com/marella/ctransformers#supported-models).
    
    
    It provides a unified interface for all models:
    
    ```python
    from langchain.llms import CTransformers
    
    llm = CTransformers(model='/path/to/ggml-gpt-2.bin', model_type='gpt2')
    
    print(llm('AI is going to'))
    ```
    
    It can be used with models hosted on the Hugging Face Hub:
    
    ```py
    llm = CTransformers(model='marella/gpt-2-ggml')
    ```
    
    It supports streaming:
    
    ```py
    from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
    
    llm = CTransformers(model='marella/gpt-2-ggml', callbacks=[StreamingStdOutCallbackHandler()])
    ```
    
    Please see [README](https://github.com/marella/ctransformers#readme) for
    more details.
    ---------
    
    Co-authored-by: Dev 2049 <[email protected]>
    marella and dev2049 authored May 25, 2023
    b398862
  19. 3223a97
  20. Add Multi-CSV/DF support in CSV and DataFrame Toolkits (langchain-ai#…

    …5009)
    
    Add Multi-CSV/DF support in CSV and DataFrame Toolkits
    * CSV and DataFrame toolkits now accept list of CSVs/DFs
    * Add default prompts for many dataframes in `pandas_dataframe` toolkit
    
    Fixes langchain-ai#1958
    Potentially fixes langchain-ai#4423
    
    ## Testing
    * Add single and multi-dataframe integration tests for
    `pandas_dataframe` toolkit with permutations of `include_df_in_prompt`
    * Add single and multi-CSV integration tests for csv toolkit
    ---------
    
    Co-authored-by: Harrison Chase <[email protected]>
    NickL77 and hwchase17 authored May 25, 2023
    7652d2a
  21. OpenAI lint (langchain-ai#5273)

    Causing lint issues if you have openai installed, annoying for local dev
    dev2049 authored May 25, 2023
    f01dfe8

Commits on May 26, 2023

  1. Added pipeline args to HuggingFacePipeline.from_model_id (langchain-…

    …ai#5268)
    
    The current `HuggingFacePipeline.from_model_id` does not allow passing
    pipeline arguments to the transformers pipeline.
    This PR enables passing important pipeline parameters, for example
    `max_new_tokens`.
    Before this PR, it was necessary to create the pipeline manually
    through Hugging Face transformers and then hand it to LangChain.
    
    For example, instead of this:
    ```py
    from langchain.llms import HuggingFacePipeline
    from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

    model_id = "gpt2"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)
    pipe = pipeline(
        "text-generation", model=model, tokenizer=tokenizer, max_new_tokens=10
    )
    hf = HuggingFacePipeline(pipeline=pipe)
    ```
    you can write this:
    ```py
    hf = HuggingFacePipeline.from_model_id(
        model_id="gpt2", task="text-generation", pipeline_kwargs={"max_new_tokens": 10}
    )
    ```
    
    
    Co-authored-by: Dev 2049 <[email protected]>
    solomspd and dev2049 authored May 26, 2023
    2ef5579
  2. Support bigquery dialect - SQL (langchain-ai#5261)

    Adds an if statement to handle the BigQuery SQL dialect. Previously,
    using the BigQuery dialect failed on `SET search_path TO`. This change
    adds a condition that sets the dataset as the schema parameter, which is
    equivalent to `SET search_path TO`. I have tested it and it works.
    
    
    ## Who can review?
    
    Community members can review the PR once tests pass. Tag
    maintainers/contributors who might be interested:
    @dev2049
    HassanOuda authored May 26, 2023
    56ad56c
  3. feat: add Momento as a standard cache and chat message history provid…

    …er (langchain-ai#5221)
    
    # Add Momento as a standard cache and chat message history provider
    
    This PR adds Momento as a standard caching provider. Implements the
    interface, adds integration tests, and documentation. We also add
    Momento as a chat history message provider along with integration tests,
    and documentation.
    
    [Momento](https://www.gomomento.com/) is a fully serverless cache.
    Similar to S3 or DynamoDB, it requires zero configuration or
    infrastructure management and is instantly available. Users sign up for
    free and get 50GB of data in/out for free every month.
    
    ## Before submitting
    
    ✅ We have added documentation, notebooks, and integration tests
    demonstrating usage.
    
    Co-authored-by: Dev 2049 <[email protected]>
    malandis and dev2049 authored May 26, 2023
    7047a2c
  4. Fixed typo: 'ouput' to 'output' in all documentation (langchain-ai#5272)

    # Fixed typo: 'ouput' to 'output' in all documentation
    
    In this instance, the typo 'ouput' was amended to 'output' in all
    occurrences within the documentation. There are no dependencies required
    for this change.
    deepblue authored May 26, 2023
    a0281f5
  5. Tedma4/twilio tool (langchain-ai#5136)

    # Add twilio sms tool
    
    ---------
    
    Co-authored-by: Dev 2049 <[email protected]>
    tedma4 and dev2049 authored May 26, 2023
    1cb6498
  6. LLM wrapper for Databricks (langchain-ai#5142)

    This PR adds LLM wrapper for Databricks. It supports two endpoint types:
    * serving endpoint
    * cluster driver proxy app
    
    An integration notebook is included to show how it works.
    
    
    Co-authored-by: Davis Chase <[email protected]>
    Co-authored-by: Gengliang Wang <[email protected]>
    Co-authored-by: Dev 2049 <[email protected]>
    4 people authored May 26, 2023
    aec642f
  7. Add an example to make the prompt more robust (langchain-ai#5291)

    # Add example to LLMMath to help with power operator
    
    Add an example to LLMMath that helps the model interpret `^` as the power operator rather than the Python XOR operator.
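
The ambiguity is easy to reproduce in plain Python. The helper `caret_to_power` below is a hypothetical illustration of the intended reading, not code from this commit.

```python
xor_result = 2 ^ 3     # bitwise XOR: 0b10 ^ 0b11 == 0b01
power_result = 2 ** 3  # exponentiation


def caret_to_power(expression: str) -> str:
    """Rewrite a caret written as 'power' into Python's `**` operator."""
    return expression.replace("^", "**")


rewritten = caret_to_power("37593^(1/5)")
print(xor_result, power_result, rewritten)
```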
    pengqu123 authored May 26, 2023
    d481d88
  8. Update CONTRIBUTION guidelines and PR Template (langchain-ai#5140)

    # Update contribution guidelines and PR template
    
    This PR updates the contribution guidelines to include more information
    on how to handle optional dependencies. 
    
    The PR template is updated to include a link to the contribution guidelines document.
    eyurtsev authored May 26, 2023
    a669abf
  9. Fixed passing creds to VertexAI LLM (langchain-ai#5297)

    # Fixed passing creds to VertexAI LLM
    
    Fixes langchain-ai#5279
    
    It looks like we should drop a type annotation for Credentials.
    
    Co-authored-by: Leonid Kuligin <[email protected]>
    lkuligin and Leonid Kuligin authored May 26, 2023
    aa3c7b3
  10. bump 181 (langchain-ai#5302)

    dev2049 authored May 26, 2023
    641303a
  11. Better docs for weaviate hybrid search (langchain-ai#5290)

    # Better docs for weaviate hybrid search
    
    Fixes: NA
    
    ## Before submitting
    
    ## Who can review?
    
    Community members can review the PR once tests pass. Tag
    maintainers/contributors who might be interested:
    
    @dev2049
    hsm207 authored May 26, 2023
    58e95cd
  12. Add instructions to pyproject.toml (langchain-ai#5138)

    # Add instructions to pyproject.toml
    
    * Add instructions to pyproject.toml about how to handle optional
    dependencies.
    
    ## Before submitting
    
    
    ## Who can review?
    
    ---------
    
    Co-authored-by: Davis Chase <[email protected]>
    Co-authored-by: Zander Chase <[email protected]>
    3 people authored May 26, 2023
    0a8d6bc
  13. docs: improve flow of llm caching notebook (langchain-ai#5309)

    # docs: improve flow of llm caching notebook
    
    The notebook `llm_caching` demos various caching providers. In the
    previous version, there was setup common to all examples but under the
    `In Memory Caching` heading.
    
    If a user comes and only wants to try a particular example, they will
    run the common setup, then the cells for the specific provider they are
    interested in. Then they will get import and variable reference errors.
    This commit moves the common setup to the top to avoid this.
    
    ## Who can review?
    
    Community members can review the PR once tests pass. Tag
    maintainers/contributors who might be interested:
    
    @dev2049
    malandis authored May 26, 2023
    f75f0db

Commits on May 27, 2023

  1. Fix typos (langchain-ai#5323)

    # Documentation typo fixes
    
    Simple typos in the blockchain .ipynb documentation
    russellpwirtz authored May 27, 2023
    6e974b5

Commits on May 28, 2023

  1. docs: added link to LangChain Handbook (langchain-ai#5311)

    # added a link to LangChain Handbook
    
    ## Who can review?
    
    Community members can review the PR once tests pass.
    leo-gan authored May 28, 2023
    465a970
  2. 179ddbe
  3. 5292e85
  4. Add Chainlit to deployment options (langchain-ai#5314)

    # Add Chainlit to deployment options
    
    Add [Chainlit](https://github.com/Chainlit/chainlit) as a deployment
    option.
    Uses links to GitHub examples and the Chainlit docs on the LangChain
    integration.
    
    Co-authored-by: Dan Constantini <[email protected]>
    constantinidan and Dan Constantini authored May 28, 2023
    c49c6ac
  5. Fixing blank thoughts in verbose for "_Exception" Action (langchain-a…

    …i#5331)
    
    Fixed the issue of blank Thoughts being printed in verbose when
    `handle_parsing_errors=True`, as below:
    
    Before Fix:
    ```
    Observation: There are 38175 accounts available in the dataframe.
    Thought:
    Observation: Invalid or incomplete response
    Thought:
    Observation: Invalid or incomplete response
    Thought:
    ```
    
    After Fix:
    ```
    Observation: There are 38175 accounts available in the dataframe.
    Thought:AI: {
        "action": "Final Answer",
        "action_input": "There are 38175 accounts available in the dataframe."
    }
    Observation: Invalid Action or Action Input format
    Thought:AI: {
        "action": "Final Answer",
        "action_input": "The number of available accounts is 38175."
    }
    Observation: Invalid Action or Action Input format
    ```
    
    @vowelparrot currently I have set the colour of thought to green (same
    as the colour when `handle_parsing_errors=False`). If you want to change
    the colour of this "_Exception" case to red or something else (when
    `handle_parsing_errors=True`), feel free to change it in line 789.
    svdeepak99 authored May 28, 2023
    c6e5d90
  6. fix: remove empty lines that cause InvalidRequestError (langchain-ai#…

    …5320)
    
    # remove empty lines in GenerativeAgentMemory that cause
    InvalidRequestError in OpenAIEmbeddings
    
    Let's say the text given to `GenerativeAgent._parse_list` is
    ```
    text = """
    Insight 1: <insight 1>
    
    Insight 2: <insight 2>
    """
    ```
    This creates an `openai.error.InvalidRequestError: [''] is not valid
    under any of the given schemas - 'input'` because
    `GenerativeAgent.add_memory()` tries to add an empty string to the
    vectorstore.
    
    This PR fixes the issue by removing the empty line between `Insight 1`
    and `Insight 2`
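
A minimal sketch of the idea, assuming a line-based parser: blank lines are dropped before anything is handed to the vectorstore. The function `parse_list` here is a hypothetical illustration and may differ from the actual `_parse_list` implementation.

```python
import re
from typing import List


def parse_list(text: str) -> List[str]:
    """Split text into lines, skipping blank ones and any leading numbering."""
    lines = [line for line in text.strip().split("\n") if line.strip()]
    # Strip an optional leading "1. " style prefix from each remaining line.
    return [re.sub(r"^\s*\d+\.\s*", "", line).strip() for line in lines]


text = """
Insight 1: <insight 1>

Insight 2: <insight 2>
"""
insights = parse_list(text)
```

With the blank line filtered out, no empty string is left to be embedded.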
    
    ## Before submitting
    
    ## Who can review?
    
    Community members can review the PR once tests pass. Tag
    maintainers/contributors who might be interested:
    
    @hwchase17
    @vowelparrot
    @dev2049
    mbchang authored May 28, 2023
    f079cdf
  7. Sample Notebook for DynamoDB Chat Message History (langchain-ai#5351)

    # Sample Notebook for DynamoDB Chat Message History
    
    @dev2049
    
    Adding a sample notebook for the DynamoDB Chat Message History class.
    
    KBB99 authored May 28, 2023
    881dfe8
  8. added cosmos kwargs option (langchain-ai#5292)

    # Added the ability to pass kwargs to cosmos client constructor
    
    The cosmos client has a ton of options that can be set, so this PR
    allows those to be passed to the client constructor from the chat
    memory constructor.
    eavanvalkenburg authored May 28, 2023
    1daa706
  9. feat: support for shopping search in SerpApi (langchain-ai#5259)

    # Support for shopping search in SerpApi
    
    ## Who can review?
    @vowelparrot
    aymenfurter authored May 28, 2023
    e274295
  10. Add SKLearnVectorStore (langchain-ai#5305)

    # Add SKLearnVectorStore
    
    This PR adds SKLearnVectorStore, a simple vector store based on
    NearestNeighbors implementations in the scikit-learn package. This
    provides a simple drop-in vector store implementation with minimal
    dependencies (scikit-learn is typically installed in a data scientist /
    ml engineer environment). The vector store can be persisted and loaded
    from json, bson and parquet format.
    
    SKLearnVectorStore has a soft (dynamic) dependency on the scikit-learn,
    numpy and pandas packages. Persisting to bson requires the bson package,
    persisting to parquet requires the pyarrow package.
    
    ## Before submitting
    
    Integration tests are provided under
    `tests/integration_tests/vectorstores/test_sklearn.py`
    
    Sample usage notebook is provided under
    `docs/modules/indexes/vectorstores/examples/sklear.ipynb`
    
    Co-authored-by: Dev 2049 <[email protected]>
    mrtj and dev2049 authored May 28, 2023
    5f45523
  11. bump 182 (langchain-ai#5364)

    dev2049 authored May 28, 2023
    b705f26
  12. Fixes iter error in FAISS add_embeddings call (langchain-ai#5367)

    # Remove re-use of iter within add_embeddings causing error
    
    As reported in langchain-ai#5336, there
    is currently an issue involving the attempted re-use of an iterator
    within the FAISS vectorstore adapter.
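
The general pitfall can be shown with a plain `zip` object. This is a sketch of the failure mode under the assumption that a one-shot iterator was consumed twice; it is not the adapter's actual code.

```python
texts = ["a", "b"]
embeddings = [[0.1, 0.2], [0.3, 0.4]]

pairs = zip(texts, embeddings)       # a one-shot iterator
first_pass = [t for t, _ in pairs]   # consumes the iterator
second_pass = [e for _, e in pairs]  # already exhausted, so this is empty

# Materializing the pairs once makes them safe to traverse repeatedly.
pairs_list = list(zip(texts, embeddings))
texts_again = [t for t, _ in pairs_list]
embeddings_again = [e for _, e in pairs_list]
```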
    
    Fixes langchain-ai#5336
    
    ## Who can review?
    
    Community members can review the PR once tests pass. Tag
    maintainers/contributors who might be interested:
    
      VectorStores / Retrievers / Memory
      - @dev2049
    Matt Wells authored May 28, 2023
    9a5c9df
  13. b692797
  14. ad7f4c0
  15. Add path validation to DirectoryLoader (langchain-ai#5327)

    # Add path validation to DirectoryLoader
    
    This PR introduces a minor adjustment to the DirectoryLoader by adding
    validation for the path argument. Previously, if the provided path
    didn't exist or wasn't a directory, DirectoryLoader would return an
    empty document list due to the behavior of the `glob` method. This could
    potentially cause confusion for users, as they might expect a
    file-loading error instead.
    
    So, I've added two validations to the load method of the
    DirectoryLoader:
    
    - Raise a FileNotFoundError if the provided path does not exist
    - Raise a ValueError if the provided path is not a directory
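
The two checks above can be sketched with `pathlib`. The function name `validate_directory` and the error messages are hypothetical; the loader's own wording may differ.

```python
from pathlib import Path


def validate_directory(path: str) -> Path:
    """Return the path as a `Path` if it is an existing directory, else raise."""
    p = Path(path)
    if not p.exists():
        raise FileNotFoundError(f"Directory not found: '{path}'")
    if not p.is_dir():
        raise ValueError(f"Expected directory, got file: '{path}'")
    return p
```

Failing early like this replaces the silent empty result that `glob` would otherwise produce.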
    
    Due to the relatively small scope of these changes, a new issue was not
    created.
    
    ## Before submitting
    
    ## Who can review?
    
    Community members can review the PR once tests pass. Tag
    maintainers/contributors who might be interested:
    
    @eyurtsev
    os1ma authored May 28, 2023
    1366d07
  16. Fix: Handle empty documents in ContextualCompressionRetriever (Issue l…

    …angchain-ai#5304) (langchain-ai#5306)
    
    # Fix: Handle empty documents in ContextualCompressionRetriever (Issue
    langchain-ai#5304)
    
    Fixes langchain-ai#5304 
    
    Prevent cohere.error.CohereAPIError caused by an empty list of documents
    by adding a condition to check if the input documents list is empty in
    the compress_documents method. If the list is empty, return an empty
    list immediately, avoiding the error and unnecessary processing.
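
A rough sketch of the early-return guard described above. The function signature and the `rerank` callable are assumptions made for illustration; they stand in for the remote reranking call and are not the retriever's real API.

```python
from typing import Callable, List, Sequence


def compress_documents(
    documents: Sequence[str],
    query: str,
    rerank: Callable[[Sequence[str], str], List[str]],
) -> List[str]:
    """Return early on empty input so the backend is never called with no documents."""
    if len(documents) == 0:
        return []
    # `rerank` is a hypothetical stand-in for the remote API call.
    return rerank(documents, query)
```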
    
    @dev2049
    
    ---------
    
    Co-authored-by: Dev 2049 <[email protected]>
    hanguofeng and dev2049 authored May 28, 2023
    99a1e3f

Commits on May 29, 2023

  1. handle json parsing errors (langchain-ai#5371)

    Adds test cases and consolidates a lot of PRs.
    hwchase17 authored May 29, 2023
    6df90ad
  2. Use Default Factory (langchain-ai#5380)

    We shouldn't be calling a constructor for a default value; we should use
    default_factory instead. This is especially bad in this case since it
    requires an optional dependency and an API key to be set.
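
To illustrate the difference, here is a small sketch using the standard library's `dataclasses` as a stand-in (the project's own models may use a different field helper). `ExpensiveClient` and `Tool` are hypothetical names.

```python
from dataclasses import dataclass, field


class ExpensiveClient:
    """Hypothetical client whose constructor has side effects (e.g. needs an API key)."""

    instances = 0

    def __init__(self) -> None:
        ExpensiveClient.instances += 1


@dataclass
class Tool:
    # The factory runs only when a Tool is instantiated without a client,
    # not when the class body is evaluated at import time.
    client: ExpensiveClient = field(default_factory=ExpensiveClient)
```

With a plain `client = ExpensiveClient()` default, the constructor would run once at class-definition time, even for users who never create a `Tool`.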
     
    Resolves langchain-ai#5361
    vowelparrot authored May 29, 2023
    14099f1
  3. Update PR template with Twitter handle request (langchain-ai#5382)

    # Updates PR template to request Twitter handle for shoutouts!
    
    Makes it easier for maintainers to show their appreciation 😄
    jacoblee93 authored May 29, 2023
    f77f271
  4. fix: Blob.from_data mimetype is lost (langchain-ai#5395)

    # Fix lost mimetype when using Blob.from_data method
    
    The mimetype is lost due to a typo in the class attribute name
    
    Fixes # - (no issue opened but I can open one if needed)
    
    ## Changes
    
    * Fixed typo in name
    * Added unit-tests to validate the output Blob
    
    
    ## Review
    @eyurtsev
    Digma authored May 29, 2023
    8b7721e
  5. Add async support to routing chains (langchain-ai#5373)

    # Add async support for (LLM) routing chains
    
    Add asynchronous LLM calls support for the routing chains. More
    specifically:
    - Add async `aroute` function (i.e. async version of `route`) to the
    `RouterChain` which calls the routing LLM asynchronously
    - Implement the async `_acall` for the `LLMRouterChain`
    - Implement the async `_acall` function for `MultiRouteChain` which
    first calls asynchronously the routing chain with its new `aroute`
    function, and then calls asynchronously the relevant destination chain.
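
The flow described above (await the router, then await the selected destination) might look roughly like this with plain `asyncio`. Everything here is a simplified stand-in: `aroute` mimics the routing LLM call with a trivial rule, and `math_chain`, `default_chain`, and `acall_multi_route` are hypothetical names rather than the library's classes.

```python
import asyncio
from typing import Awaitable, Callable, Dict

AsyncChain = Callable[[str], Awaitable[str]]


async def aroute(text: str) -> str:
    """Stand-in for the async routing LLM call: pick a destination name."""
    await asyncio.sleep(0)  # simulates awaiting the LLM
    return "math" if any(ch.isdigit() for ch in text) else "default"


async def math_chain(text: str) -> str:
    return f"math: {text}"


async def default_chain(text: str) -> str:
    return f"default: {text}"


async def acall_multi_route(
    text: str, destinations: Dict[str, AsyncChain], default: AsyncChain
) -> str:
    # First await the routing decision, then await the chosen destination.
    name = await aroute(text)
    chain = destinations.get(name, default)
    return await chain(text)
```

A call such as `asyncio.run(acall_multi_route("2+2", {"math": math_chain}, default_chain))` exercises both awaits in sequence.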
    
    ## Who can review?
    
    - @agola11
    amaudruz authored May 29, 2023
    e455ba4
  6. Fix update_document function, add test and documentation. (langchain-…

    …ai#5359)
    
    # Fix for `update_document` Function in Chroma
    
    ## Summary
    This pull request addresses an issue with the `update_document` function
    in the Chroma class, as described in
    [langchain-ai#5031](langchain-ai#5031 (comment)).
    The issue was identified as an `AttributeError` raised when calling
    `update_document` due to a missing corresponding method in the
    `Collection` object. This fix refactors the `update_document` method in
    `Chroma` to correctly interact with the `Collection` object.
    
    ## Changes
    1. Fixed the `update_document` method in the `Chroma` class to correctly
    call methods on the `Collection` object.
    2. Added the corresponding test `test_chroma_update_document` in
    `tests/integration_tests/vectorstores/test_chroma.py` to reflect the
    updated method call.
    3. Added an example and explanation of how to use the `update_document`
    function in the Jupyter notebook tutorial for Chroma.
    
    ## Test Plan
    All existing tests pass after this change. In addition, the
    `test_chroma_update_document` test case now correctly checks the
    functionality of `update_document`, ensuring that the function works as
    expected and updates the content of documents correctly.
    
    ## Reviewers
    @dev2049
    
    This fix will ensure that users are able to use the `update_document`
    function as expected, without encountering the previous
    `AttributeError`. This will enhance the usability and reliability of the
    Chroma class for all users.
    
    Thank you for considering this pull request. I look forward to your
    feedback and suggestions.
    martinholecekmax authored May 29, 2023
    44b48d9
  7. Update llamacpp demonstration notebook (langchain-ai#5344)

    # Update llamacpp demonstration notebook
    
    Add instructions to install with BLAS backend, and update the example of
    model usage.
    
    Fixes langchain-ai#5071. However, this is more a way to prevent similar
    issues in the future than a fix, since there was no problem with the
    framework's functionality.
    
    ## Who can review?
    
    Community members can review the PR once tests pass. Tag
    maintainers/contributors who might be interested:
    
    - @hwchase17 
    - @agola11
    sadaisystems authored May 29, 2023
    f6615ca
  8. Removed deprecated llm attribute for load_chain (langchain-ai#5343)

    # Removed deprecated llm attribute for load_chain
    
    Currently, `load_chain` expects the `llm` attribute to be present for
    some chain types, but `llm` is a deprecated attribute for those chains
    and might not be persisted by `chain.save`.
    
    Fixes langchain-ai#5224
    [(issue)](langchain-ai#5224)
    
    ## Who can review?
    @hwchase17
    @dev2049
    
    ---------
    
    Co-authored-by: imeckr <[email protected]>
    imeckr and imeckr authored May 29, 2023
    642ae83
  9. Harrison/llamacpp (langchain-ai#5402)

    Co-authored-by: Gavin S <[email protected]>
    hwchase17 and s7726 authored May 29, 2023
    3e16468
  10. Add pagination for Vertex AI embeddings (langchain-ai#5325)

    Fixes langchain-ai#5316
    
    ---------
    
    Co-authored-by: Justin Flick <[email protected]>
    Co-authored-by: Harrison Chase <[email protected]>
    3 people authored May 29, 2023
    c09f8e4
  11. Reformat openai proxy setting as code (langchain-ai#5330)

    # Reformat the openai proxy setting as code
    
    
      Only affect the doc for openai Model
      - @hwchase17
      - @agola11
    sevendark authored May 29, 2023
    100d665
  12. Harrison/deep infra (langchain-ai#5403)

    Co-authored-by: Yessen Kanapin <[email protected]>
    Co-authored-by: Yessen Kanapin <[email protected]>
    3 people authored May 29, 2023
    416c8b1
  13. Harrison/prediction guard update (langchain-ai#5404)

    Co-authored-by: Daniel Whitenack <[email protected]>
    hwchase17 and dwhitena authored May 29, 2023
    d6fb25c
  14. Implemented appending arbitrary messages (langchain-ai#5293)

    # Implemented appending arbitrary messages to the base chat message
    history, the in-memory and cosmos ones.
    
    As discussed, this is the alternative to langchain-ai#4480: an
    `add_message` method that takes a `BaseMessage` as input, so that the
    user can control what is in the base message, such as kwargs.
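
As a rough sketch of the idea, the convenience methods can delegate to a single `add_message` entry point that accepts a full message object. `Message` and `InMemoryHistory` below are simplified, hypothetical stand-ins, not the library's `BaseMessage` or history classes.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Message:
    """Simplified stand-in for a base message with arbitrary extra kwargs."""

    content: str
    role: str
    additional_kwargs: Dict[str, Any] = field(default_factory=dict)


class InMemoryHistory:
    """Hypothetical in-memory chat history."""

    def __init__(self) -> None:
        self.messages: List[Message] = []

    def add_message(self, message: Message) -> None:
        # Single entry point: callers control every field of the message.
        self.messages.append(message)

    def add_user_message(self, content: str) -> None:
        self.add_message(Message(content=content, role="human"))

    def add_ai_message(self, content: str) -> None:
        self.add_message(Message(content=content, role="ai"))
```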
    
    ## Who can review?
    
    Community members can review the PR once tests pass. Tag
    maintainers/contributors who might be interested:
    
    @hwchase17
    
    ---------
    
    Co-authored-by: Harrison Chase <[email protected]>
    eavanvalkenburg and hwchase17 authored May 29, 2023
    ccb6238
  15. docs: ecosystem/integrations update 2 (langchain-ai#5282)

    # docs: ecosystem/integrations update 2
    
    langchain-ai#5219 - part 1 
    The second part of this update (parts are independent of each other! no
    overlap):
    
    - added diffbot.md
    - updated confluence.ipynb; added confluence.md
    - updated college_confidential.md
    - updated openai.md
    - added blackboard.md
    - added bilibili.md
    - added azure_blob_storage.md
    - added azlyrics.md
    - added aws_s3.md
    
    ## Who can review?
    
    @hwchase17
    @agola11
    @vowelparrot
    @dev2049
    leo-gan authored May 29, 2023
    a359819
  16. docs: ecosystem/integrations update 1 (langchain-ai#5219)

    # docs: ecosystem/integrations update
    
    It is the first in a series of `ecosystem/integrations` updates.
    
    The ecosystem/integrations list is missing many integrations.
    I'm adding the missing integrations in a consistent format: 
    1. description of the integrated system
    2. `Installation and Setup` section with `pip install ...`, key setup,
    and other necessary settings
    3. Sections like `LLM`, `Text Embedding Models`, `Chat Models`... with
    links to corresponding examples and imports of the used classes.
    
    This PR adds docs for integrations that are present in
    `docs/modules/models/text_embedding/examples` but missing from
    `ecosystem/integrations`. Subsequent PRs will cover the other example
    sections.
    
    Also updated `integrations.rst`: added the `Dependencies` section with a
    link to the packages used in LangChain.
    
    ## Who can review?
    
    @hwchase17
    @eyurtsev
    @dev2049
    leo-gan authored May 29, 2023
    1837caa
  17. Harrison/datetime parser (langchain-ai#4693)

    Co-authored-by: Jacob Valdez <[email protected]>
    Co-authored-by: Jacob Valdez <[email protected]>
    Co-authored-by: Eugene Yurtsev <[email protected]>
    4 people authored May 29, 2023
    2da8c48
  18. cce731c
  19. Add ToolException that a tool can throw. (langchain-ai#5050)

    # Add ToolException that a tool can throw
    This is an optional exception that a tool throws when an execution error
    occurs.
    When this exception is thrown, the agent does not stop working, but
    handles the exception according to the tool's handle_tool_error variable.
    The processing result is returned to the agent as an observation and
    printed in pink on the console. It can be used like this:
    ```python
    from langchain.schema import ToolException
    from langchain import LLMMathChain, SerpAPIWrapper, OpenAI
    from langchain.agents import AgentType, initialize_agent
    from langchain.chat_models import ChatOpenAI
    from langchain.tools import BaseTool, StructuredTool, Tool, tool
    
    llm = ChatOpenAI(temperature=0)
    llm_math_chain = LLMMathChain(llm=llm, verbose=True)
    
    class Error_tool:
        def run(self, s: str):
            raise ToolException('The current search tool is not available.')
        
    def handle_tool_error(error) -> str:
        return "The following errors occurred during tool execution:"+str(error)
    
    search_tool1 = Error_tool()
    search_tool2 = SerpAPIWrapper()
    tools = [
        Tool.from_function(
            func=search_tool1.run,
            name="Search_tool1",
            description="useful for when you need to answer questions about current events.You should give priority to using it.",
            handle_tool_error=handle_tool_error,
        ),
        Tool.from_function(
            func=search_tool2.run,
            name="Search_tool2",
            description="useful for when you need to answer questions about current events",
            return_direct=True,
        )
    ]
    agent = initialize_agent(tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True,
                             handle_tool_errors=handle_tool_error)
    agent.run("Who is Leo DiCaprio's girlfriend? What is her current age raised to the 0.43 power?")
    ```
    
    ![image](https://github.com/hwchase17/langchain/assets/32786500/51930410-b26e-4f85-a1e1-e6a6fb450ada)
    
    ## Who can review?
    - @vowelparrot
    
    ---------
    
    Co-authored-by: Dev 2049 <[email protected]>
    xming521 and dev2049 authored May 29, 2023
    cf5803e
  20. Harrison/text splitter (langchain-ai#5417)

    adds support for keeping separators around when using recursive text
    splitter
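
One way to keep separators when splitting is to use a capturing group in `re.split` and glue each separator back onto the chunk that follows it. This is only a sketch of the general technique; `split_keep_separator` is a hypothetical helper, and the splitter in the PR may attach separators differently.

```python
import re
from typing import List


def split_keep_separator(text: str, separator: str) -> List[str]:
    """Split on `separator` but keep it at the start of the following chunk."""
    # The capturing group makes re.split return the separators as well:
    # [chunk, sep, chunk, sep, chunk, ...]
    parts = re.split(f"({re.escape(separator)})", text)
    chunks = [parts[0]] + [
        parts[i] + parts[i + 1] for i in range(1, len(parts) - 1, 2)
    ]
    # Drop empty chunks (e.g. when the text starts with the separator).
    return [c for c in chunks if c]
```

Because nothing is discarded, joining the chunks reproduces the original text.
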
    hwchase17 authored May 29, 2023
    72f99ff

Commits on May 30, 2023

  1. New Trello document loader (langchain-ai#4767)

    # Added New Trello loader class and documentation
    
    Simple loader on top of the py-trello wrapper.
    With a board name, you can pull cards and tweak some field parameters
    on the load operation.
    I included documentation and examples.
    Included unit test cases using patch and a fixture for the py-trello
    client class.
    
    ---------
    
    Co-authored-by: Dev 2049 <[email protected]>
    GMartin-dev and dev2049 authored May 30, 2023
    0b3e0dd
  2. DocumentLoader for GitHub (langchain-ai#5408)

    # Creates GitHubLoader (langchain-ai#5257)
    
    GitHubLoader is a DocumentLoader that loads issues and PRs from GitHub.
    
    Fixes langchain-ai#5257
    
    ---------
    
    Co-authored-by: Dev 2049 <[email protected]>
    UmerHA and dev2049 authored May 30, 2023
    8259f9b
  3. Harrison/spark reader (langchain-ai#5405)

    Co-authored-by: Rithwik Ediga Lakhamsani <[email protected]>
    Co-authored-by: Dev 2049 <[email protected]>
    3 people authored May 30, 2023
    760632b
  4. 26ff185
  5. Rename and fix typo in lancedb (langchain-ai#5425)

    # Fix typo in LanceDB notebook filename
    eddyxu authored May 30, 2023
    ee57054
  6. c4b502a
  7. adding MongoDBAtlasVectorSearch (langchain-ai#5338)

    # Add MongoDBAtlasVectorSearch for the python library
    
    Fixes langchain-ai#5337
    ---------
    
    Co-authored-by: Dev 2049 <[email protected]>
    P-E-B and dev2049 authored May 30, 2023
    a61b7f7
  8. Add more code splitters (go, rst, js, java, cpp, scala, ruby, php, sw…

    …ift, rust) (langchain-ai#5171)
    
    As the title says, I added more code splitters.
    The implementation is trivial, so I did not add separate tests for each
    splitter.
    Let me know if you have any concerns.
    
    Fixes langchain-ai#5170
    
    ## Who can review?
    
    Community members can review the PR once tests pass. Tag
    maintainers/contributors who might be interested:
    @eyurtsev @hwchase17
    
    ---------
    
    Signed-off-by: byhsu <[email protected]>
    Co-authored-by: byhsu <[email protected]>
    ByronHsu and ByronHsu authored May 30, 2023
    9d658aa
  9. bump 185 (langchain-ai#5442)

    dev2049 authored May 30, 2023
    64b4165
  10. fix (langchain-ai#5457)

    dev2049 authored May 30, 2023
    2649b63
  11. bump 186 (langchain-ai#5459)

    dev2049 authored May 30, 2023
    4379bd4
  12. Fixed docstring in faiss.py for load_local (langchain-ai#5440)

    # Fix for docstring in faiss.py vectorstore (load_local)
    
    The docstring should reflect that load_local loads something FROM the
    disk.
    luckyduck authored May 30, 2023
    0d3a9d4
  13. Removes duplicated call from langchain/client/langchain.py (langchain…

    …-ai#5449)
    
    This removes duplicate code presumably introduced by a cut-and-paste
    error, spotted while reviewing the code in
    `langchain/client/langchain.py`. The original code had back-to-back
    occurrences of the following code block:
    
    ```
            response = self._get(
                path,
                params=params,
            )
            raise_for_status_with_text(response)
    ```
    patrickkeane authored May 30, 2023
    e09afb4
  14. encoding_kwargs for InstructEmbeddings (langchain-ai#5450)

    # What does this PR do?
    
    Bring support of `encode_kwargs` for `HuggingFaceInstructEmbeddings`,
    change the docstring example and add a test to illustrate with
    `normalize_embeddings`.
    
    Fixes langchain-ai#3605
    (Similar to langchain-ai#3914)
    
    Use case:
    ```python
    from langchain.embeddings import HuggingFaceInstructEmbeddings
    
    model_name = "hkunlp/instructor-large"
    model_kwargs = {'device': 'cpu'}
    encode_kwargs = {'normalize_embeddings': True}
    hf = HuggingFaceInstructEmbeddings(
        model_name=model_name,
        model_kwargs=model_kwargs,
        encode_kwargs=encode_kwargs
    )
    ```
    Xmaster6y authored May 30, 2023
    c1807d8
  15. MRKL output parser no longer breaks well formed queries (langchain-ai…

    …#5432)
    
    # Handles the edge scenario in which the action input is a well formed
    SQL query which ends with a quoted column
    
    There may be a cleaner option here (or indeed other edge scenarios) but
    this seems to robustly determine if the action input is likely to be a
    well formed SQL query in which we don't want to arbitrarily trim off `"`
    characters
    
    Fixes langchain-ai#5423
    
    ## Who can review?
    
    Community members can review the PR once tests pass. Tag
    maintainers/contributors who might be interested:
    
    For a quicker response, figure out the right person to tag with @
    
      @hwchase17 - project lead
    
      Agents / Tools / Toolkits
      - @vowelparrot
    Matt Wells authored May 30, 2023
    1d861dc
  16. docs: cleaning (langchain-ai#5413)

    # docs cleaning
    
    Changed docs to consistent format (probably, we need an official doc
    integration template):
    - ClearML - added product descriptions; changed title/headers
    - Rebuff - added product descriptions; changed title/headers
    - WhyLabs - added product descriptions; changed title/headers
    - Docugami - changed title/headers/structure
    - Airbyte - fixed title
    - Wolfram Alpha - added descriptions, fixed title
    - OpenWeatherMap - added product descriptions; changed title/headers
    - Unstructured - changed description
    
    ## Who can review?
    
    Community members can review the PR once tests pass. Tag
    maintainers/contributors who might be interested:
    
    @hwchase17
    @dev2049
    leo-gan authored May 30, 2023
    1f11f80
  17. Added async _acall to FakeListLLM (langchain-ai#5439)

    # Added Async _acall to FakeListLLM
    
    FakeListLLM is handy when unit testing apps built with langchain. This
    allows the use of FakeListLLM inside concurrent code with
    [asyncio](https://docs.python.org/3/library/asyncio.html).
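
A simplified sketch of what an async-capable fake LLM can look like. `SimpleFakeListLLM` is a hypothetical stand-in written for illustration and does not reproduce the real class's base class or signatures.

```python
import asyncio
from typing import List


class SimpleFakeListLLM:
    """Hypothetical fake LLM that replays canned responses in order."""

    def __init__(self, responses: List[str]) -> None:
        self.responses = responses
        self.i = 0

    def _call(self, prompt: str) -> str:
        # Ignore the prompt and return the next canned response.
        response = self.responses[self.i]
        self.i += 1
        return response

    async def _acall(self, prompt: str) -> str:
        # No real I/O is needed, so the async variant reuses the sync path.
        return self._call(prompt)
```

In a test, `await llm._acall("any prompt")` can then be used inside coroutines without touching the network.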
    
    I also updated the docstring, which was out of date.
    
    ## Who can review?
    
    @hwchase17 - project lead
    @agola11 - async
    camille-vanhoffelen authored May 30, 2023
    80e133f
  18. Feat: Add batching to Qdrant (langchain-ai#5443)

    # Add batching to Qdrant
    
    Several people requested a batching mechanism while uploading data to
    Qdrant. It is important, as there are some limits for the maximum size
    of the request payload, and without batching implemented in Langchain,
    users need to implement it on their own. This PR exposes a new optional
    `batch_size` parameter, so all the documents/texts are loaded in batches
    of the expected size (64, by default).
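
The batching idea can be sketched with a small generator. `iter_batches` is a hypothetical helper name; only the `batch_size` parameter and its default of 64 come from the description above.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def iter_batches(items: Iterable[T], batch_size: int = 64) -> Iterator[List[T]]:
    """Yield consecutive lists of at most `batch_size` items."""
    if batch_size < 1:
        raise ValueError("batch_size must be a positive integer")
    iterator = iter(items)
    # islice pulls up to batch_size items per round; an empty list ends the loop.
    while batch := list(islice(iterator, batch_size)):
        yield batch
```

Each yielded batch would then be sent as one upload request, which keeps every payload under the size limit.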
    
    The integration tests of Qdrant are extended to cover two cases:
    1. Documents are sent in separate batches.
    2. All the documents are sent in a single request.
    kacperlukawski authored May 30, 2023
    f93d256
  19. Update psychicapi version (langchain-ai#5471)

    Update [psychicapi](https://pypi.org/project/psychicapi/) Python package
    dependency to the latest version 0.5. The newest Python package version
    addresses breaking changes in the Psychic HTTP API.
    Ayan-Bandyopadhyay authored May 30, 2023
    8181f9e
  20. Add maximal relevance search to SKLearnVectorStore (langchain-ai#5430)

    # Add maximal relevance search to SKLearnVectorStore
    
    This PR implements the maximum relevance search in SKLearnVectorStore. 
    
    Twitter handle: jtolgyesi (I also submitted the original implementation
    of SKLearnVectorStore)
    
    ## Before submitting
    
    Unit tests are included.
    
    Co-authored-by: Dev 2049 <[email protected]>
    mrtj and dev2049 authored May 30, 2023
    1111f18
  21. add simple test for imports (langchain-ai#5461)

    Co-authored-by: Dev 2049 <[email protected]>
    hwchase17 and dev2049 authored May 30, 2023
    eab4b4c
  22. Ability to specify credentials when using Google BigQuery as a data …

    …loader (langchain-ai#5466)
    
    # Adds ability to specify credentials when using Google BigQuery as a
    data loader
    
    Fixes langchain-ai#5465. Adds the ability to set credentials, which must
    be of the `google.auth.credentials.Credentials` type. This argument is
    optional and will default to `None`.
    
    Co-authored-by: Dev 2049 <[email protected]>
    nsheils and dev2049 authored May 30, 2023
    199cc70
  23. convert the parameter 'text' to uppercase in the function 'parse' of …

    …the class BooleanOutputParser (langchain-ai#5397)
    
    When the LLM outputs 'yes|no', BooleanOutputParser can now parse it to
    'True|False'. This fixes the ValueError in parse().
    <!--
    When the BooleanOutputParser is used in chain_filter.py and the LLM
    outputs 'yes|no', the function 'parse' throws a ValueError.
    -->
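
A minimal sketch of case-insensitive boolean parsing along these lines. `parse_boolean` is a hypothetical function, and the `"YES"`/`"NO"` defaults are assumptions; the point is only that upper-casing the input lets `yes` and `no` match.

```python
def parse_boolean(text: str, true_val: str = "YES", false_val: str = "NO") -> bool:
    """Parse an LLM answer into a bool, ignoring case and surrounding whitespace."""
    # Normalizing to uppercase means "yes", "Yes" and "YES" are all accepted.
    cleaned = text.strip().upper()
    if cleaned == true_val:
        return True
    if cleaned == false_val:
        return False
    raise ValueError(
        f"Expected {true_val} or {false_val} (case-insensitive), got {text!r}"
    )
```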
    
    Fixes langchain-ai#5396
    
    ---------
    
    Co-authored-by: gaofeng27692 <[email protected]>
    ARSblithe212 and gaofeng27692 authored May 30, 2023
    e31705b
  24. added n_threads functionality for gpt4all (langchain-ai#5427)

    # Added support for modifying the number of threads in the GPT4All model
    
    I have added the capability to modify the number of threads used by the
    GPT4All model. This allows users to adjust the model's parallel
    processing capabilities based on their specific requirements.
    
    ## Changes Made
    - Updated the `validate_environment` method to set the number of threads
    for the GPT4All model using the `values["n_threads"]` parameter from the
    `GPT4All` class constructor.
    
    ## Context
    Useful in scenarios where users want to optimize the model's performance
    by leveraging multi-threading capabilities.
    Please note that the `n_threads` parameter was included in the `GPT4All`
    class constructor but was previously unused. This change ensures that
    the specified number of threads is utilized by the model.
    
    ## Dependencies
    There are no new dependencies introduced by this change. It only
    utilizes existing functionality provided by the GPT4All package.
    
    ## Testing
    Since this is a minor change testing is not required.
    
    ---------
    
    Co-authored-by: Dev 2049 <[email protected]>
    Vokturz and dev2049 authored May 30, 2023
    8121e04

Commits on May 31, 2023

  1. Allow for async use of SelfAskWithSearchChain (langchain-ai#5394)

    # Allow for async use of SelfAskWithSearchChain
    
    
    Co-authored-by: Dev 2049 <[email protected]>
    pors and dev2049 authored May 31, 2023
    0a44bfd
  2. Allow ElasticsearchEmbeddings to create a connection with ES Client o…

    …bject (langchain-ai#5321)
    
    This PR adds a new method `from_es_connection` to the
    `ElasticsearchEmbeddings` class allowing users to use Elasticsearch
    clusters outside of Elastic Cloud.
    
    Users can create an Elasticsearch Client object and pass that to the new
    function.
    The returned object is identical to the one returned by calling
    `from_credentials`
    
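    ```
    from elasticsearch import Elasticsearch
    from langchain.embeddings import ElasticsearchEmbeddings

    # Placeholder: ID of the embedding model deployed in the cluster
    model_id = "your_model_id"

    # Create Elasticsearch connection
    es_connection = Elasticsearch(
        hosts=['https://es_cluster_url:port'],
        basic_auth=('user', 'password')
    )

    # Instantiate ElasticsearchEmbeddings using es_connection
    embeddings = ElasticsearchEmbeddings.from_es_connection(
        model_id,
        es_connection,
    )
    ```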
    
    I also added examples to the elasticsearch jupyter notebook
    
    Fixes langchain-ai#5239
    
    ---------
    
    Co-authored-by: Dev 2049 <[email protected]>
    jeffvestal and dev2049 authored May 31, 2023
    46e181a
  3. SQLite-backed Entity Memory (langchain-ai#5129)

    # SQLite-backed Entity Memory
    
    Following the initiative of
    langchain-ai#2397 I think it would be
    helpful to be able to persist Entity Memory on disk by default
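
As an illustration of persisting entity summaries on disk, a key-value store over the standard library's `sqlite3` might look like the sketch below. The class name, table name, and method names are assumptions made for this example and are not taken from the PR.

```python
import sqlite3
from typing import Optional


class SQLiteKeyValueStore:
    """Hypothetical entity store: one row per entity name."""

    def __init__(self, db_path: str = ":memory:") -> None:
        # Pass a file path instead of ":memory:" to persist across runs.
        self.conn = sqlite3.connect(db_path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS entities (key TEXT PRIMARY KEY, value TEXT)"
        )

    def set(self, key: str, value: str) -> None:
        # The connection context manager commits on success.
        with self.conn:
            self.conn.execute(
                "INSERT OR REPLACE INTO entities (key, value) VALUES (?, ?)",
                (key, value),
            )

    def get(self, key: str, default: Optional[str] = None) -> Optional[str]:
        row = self.conn.execute(
            "SELECT value FROM entities WHERE key = ?", (key,)
        ).fetchone()
        return row[0] if row else default
```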
    
    Co-authored-by: Dev 2049 <[email protected]>
    JoseHervas and dev2049 authored May 31, 2023
    ce8b7a2
  4. 1671c2a
  5. Harrison/html splitter (langchain-ai#5468)

    Co-authored-by: David Revillas <[email protected]>
    hwchase17 and r3v1 authored May 31, 2023
    f72bb96
  6. Feature: Qdrant filters supports (langchain-ai#5446)

    # Support Qdrant filters
    
    Qdrant has an [extensive filtering
    system](https://qdrant.tech/documentation/concepts/filtering/) with rich
    type support. This PR makes it possible to use the filters in Langchain
    by passing an additional param to both the
    `similarity_search_with_score` and `similarity_search` methods.
    
    ## Who can review?
    
    @dev2049 @hwchase17
    
    ---------
    
    Co-authored-by: Dev 2049 <[email protected]>
    kacperlukawski and dev2049 authored May 31, 2023
    8bcaca4
  7. Add matching engine vectorstore (langchain-ai#3350)

    Co-authored-by: Tom Piaggio <[email protected]>
    Co-authored-by: scafati98 <[email protected]>
    Co-authored-by: scafati98 <[email protected]>
    Co-authored-by: Dev 2049 <[email protected]>
    5 people authored May 31, 2023
    470b282
  8. b39c069

Commits on Sep 13, 2023

  1. 272c63c

Commits on Sep 14, 2023

  1. 922e147