
feat: sentence transformers #517

Draft: jamescalam wants to merge 2 commits into main

Conversation

@jamescalam (Member) commented Jan 21, 2025

PR Type

Enhancement, Documentation, Tests


Description

  • Introduced STEncoder and LocalEncoder for local dense embeddings.

  • Refactored encoders to use TorchAbstractDenseEncoder for shared PyTorch logic.

  • Added a new introduction notebook for local encoder usage.

  • Updated dependencies to support local and sentence-transformers functionality.


Changes walkthrough 📝

Relevant files
Enhancement
7 files
__init__.py
Added STEncoder and LocalEncoder to encoder imports and
initialization.
+8/-0     
clip.py
Refactored `CLIPEncoder` to use `TorchAbstractDenseEncoder`.
+4/-23   
local.py
Introduced `LocalEncoder` extending `STEncoder` for local embeddings.
+10/-0   
sentence_transformers.py
Added `STEncoder` for sentence-transformers-based dense embeddings.
+47/-0   
torch.py
Created `TorchAbstractDenseEncoder` for shared PyTorch functionality.
+33/-0   
vit.py
Refactored `VitEncoder` to use `TorchAbstractDenseEncoder`.
+6/-17   
schema.py
Added new encoder types `LOCAL` and `SENTENCE_TRANSFORMERS`.
+2/-0     
Documentation
1 file
00a-introduction-local.ipynb
Added a notebook demonstrating `LocalEncoder` usage.         
+313/-0 
Dependencies
1 file
pyproject.toml
Updated dependencies for local and sentence-transformers support.
+2/-1     

💡 PR-Agent usage: Comment /help "your question" on any pull request to receive relevant information


PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

⏱️ Estimated effort to review: 4 🔵🔵🔵🔵⚪
🧪 No relevant tests
🔒 No security concerns identified
⚡ Recommended focus areas for review

Possible Issue

The normalization of embeddings in the STEncoder class (xd = xd / np.linalg.norm(xd, axis=0)) looks incorrect: with axis=0, each embedding dimension is normalized across the batch of documents rather than each document embedding being normalized to unit length. This should be reviewed and validated.

    normalize_embeddings: bool = True,
) -> list[list[float]]:
    # compute document embeddings `xd`
    xd = self._model.encode(docs, batch_size=batch_size)
    if normalize_embeddings:
        # TODO not sure if required
        xd = xd / np.linalg.norm(xd, axis=0)
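To see concretely why axis=0 is suspect here, the following standalone sketch (using made-up embedding values, not the PR's actual data) compares the two axes on a small batch of row-vector embeddings:

```python
import numpy as np

# Hypothetical batch of 3 document embeddings, each of dimension 4.
xd = np.array([
    [3.0, 4.0, 0.0, 0.0],
    [0.0, 0.0, 6.0, 8.0],
    [1.0, 0.0, 0.0, 0.0],
])

# axis=0 divides each *dimension* by its norm across the batch,
# so individual document embeddings are no longer unit vectors.
wrong = xd / np.linalg.norm(xd, axis=0)

# axis=1 with keepdims=True normalizes each *document* to unit length,
# which is the usual convention before cosine-similarity comparisons.
right = xd / np.linalg.norm(xd, axis=1, keepdims=True)

print(np.linalg.norm(right, axis=1))  # each row is now (numerically) unit length
```

Note that the axis=0 variant also makes results depend on which other documents happen to be in the batch, which is another reason to prefer per-row normalization.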
Dependency Management

The _initialize_torch method in TorchAbstractDenseEncoder raises an ImportError if PyTorch is not installed. This could cause runtime issues if dependencies are not properly managed. Consider adding a more robust dependency check or fallback mechanism.

def _initialize_torch(self):
    try:
        import torch
    except ImportError:
        raise ImportError(
            f"Please install PyTorch to use {self.__class__.__name__}. "
            "You can install it with: `pip install semantic-router[local]`"
        )
Documentation Clarity

The notebook introduces the LocalEncoder but does not provide sufficient explanation of its advantages or limitations compared to other encoders. This could confuse users unfamiliar with the library.

{
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "K7NsuSPNf3px"
      },
      "source": [
        "[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/aurelio-labs/semantic-router/blob/main/docs/00-introduction.ipynb) [![Open nbviewer](https://raw.githubusercontent.com/pinecone-io/examples/master/assets/nbviewer-shield.svg)](https://nbviewer.org/github/aurelio-labs/semantic-router/blob/main/docs/00-introduction.ipynb)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "Am2hmLzTf3py"
      },
      "source": [
        "# Semantic Router Intro"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "k1nRRAbYf3py"
      },
      "source": [
        "The Semantic Router library can be used as a super-fast decision-making layer on top of LLMs. Rather than waiting on a slow agent to decide what to do, we can use the magic of semantic vector space to make routes, cutting decision-making time down from seconds to milliseconds.\n",
        "\n",
        "In this notebook we will be introducing the library (as done in the `00-introduction.ipynb` notebook) but using the `LocalEncoder` class, allowing us to run the library locally without the need for any APIs or external services."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "NggrMQP2f3py"
      },
      "source": [
        "## Getting Started"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "9zP-l_T7f3py"
      },
      "source": [
        "We start by installing the library:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "4YI81tu0f3pz"
      },
      "outputs": [],
      "source": [
        "!pip install -qU \"semantic-router==0.1.0.dev6[local]\""
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "HfB8252ff3pz"
      },
      "source": [
        "We start by defining a dictionary mapping routes to example phrases that should trigger those routes."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "lslfqYOEf3pz",
        "outputId": "c13e3e77-310c-4b86-e291-4b6005d698bd"
      },
      "outputs": [
        {
          "name": "stderr",
          "output_type": "stream",
          "text": [
            "c:\\Users\\Siraj\\Documents\\Personal\\Work\\Aurelio\\Virtual Environments\\semantic_router_3\\Lib\\site-packages\\tqdm\\auto.py:21: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html\n",
            "  from .autonotebook import tqdm as notebook_tqdm\n"
          ]
        }
      ],
      "source": [
        "from semantic_router import Route\n",
        "\n",
        "politics = Route(\n",
        "    name=\"politics\",\n",
        "    utterances=[\n",
        "        \"isn't politics the best thing ever\",\n",
        "        \"why don't you tell me about your political opinions\",\n",
        "        \"don't you just love the president\",\n",
        "        \"don't you just hate the president\",\n",
        "        \"they're going to destroy this country!\",\n",
        "        \"they will save the country!\",\n",
        "    ],\n",
        ")"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "WYLHUDa1f3p0"
      },
      "source": [
        "Let's define another for good measure:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "LAdY1jdxf3p0"
      },
      "outputs": [],
      "source": [
        "chitchat = Route(\n",
        "    name=\"chitchat\",\n",
        "    utterances=[\n",
        "        \"how's the weather today?\",\n",
        "        \"how are things going?\",\n",
        "        \"lovely weather today\",\n",
        "        \"the weather is horrendous\",\n",
        "        \"let's go to the chippy\",\n",
        "    ],\n",
        ")\n",
        "\n",
        "routes = [politics, chitchat]"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "ReN59ieGf3p0"
      },
      "source": [
        "Now we initialize our encoder. Under the hood we're using the `sentence-transformers` library, which supports loading encoders from the Hugging Face Hub. We'll be using Nvidia's [nvidia/NV-Embed-v2](https://huggingface.co/nvidia/NV-Embed-v2) encoder."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "MF47W_Sof3p2"
      },
      "outputs": [],
      "source": [
        "from semantic_router.encoders import LocalEncoder\n",
        "\n",
        "encoder = LocalEncoder(name=\"nvidia/NV-Embed-v2\")"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "lYuLO0l9f3p3"
      },
      "source": [
        "Now we define the `Router`. When called, the router will consume text (a query) and output the category (`Route`) it belongs to — to initialize a `Router` we need our `encoder` model and a list of `routes`."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "dh1U8IDOf3p3",
        "outputId": "872810da-956a-47af-a91f-217ce351a88b"
      },
      "outputs": [
        {
          "name": "stderr",
          "output_type": "stream",
          "text": [
            "\u001b[32m2024-05-07 15:02:46 INFO semantic_router.utils.logger local\u001b[0m\n"
          ]
        }
      ],
      "source": [
        "from semantic_router.routers import SemanticRouter\n",
        "\n",
        "sr = SemanticRouter(encoder=encoder, routes=routes, auto_sync=\"local\")"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "Xj32uEF-f3p3"
      },
      "source": [
        "Now we can test it:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "fIXOjRp9f3p3",
        "outputId": "8b9b5746-ae7c-43bb-d84f-5fa7c30e423e"
      },
      "outputs": [
        {
          "data": {
            "text/plain": [
              "RouteChoice(name='politics', function_call=None, similarity_score=None)"
            ]
          },
          "execution_count": 6,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "sr(\"don't you love politics?\")"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "0UN2mKvjf3p4",
        "outputId": "062f9499-7db3-49d2-81ef-e7d5dc9a88f6"
      },
      "outputs": [
        {
          "data": {
            "text/plain": [
              "RouteChoice(name='chitchat', function_call=None, similarity_score=None)"
            ]
          },
          "execution_count": 7,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "sr(\"how's the weather today?\")"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "NHZWZKoTf3p4"
      },
      "source": [
        "Both are classified accurately. What if we send a query that is unrelated to our existing `Route` objects?"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "0WnvGJByf3p4",
        "outputId": "4496e9b2-7cd8-4466-fe1a-3e6f5cf30b0d"
      },
      "outputs": [
        {
          "data": {
            "text/plain": [
              "RouteChoice(name=None, function_call=None, similarity_score=None)"
            ]
          },
          "execution_count": 8,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "sr(\"I'm interested in learning about llama 2\")"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "With this we see `None` is returned, i.e. no routes were matched."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "---"
      ]
    }
  ],
  "metadata": {
    "colab": {
      "provenance": []
    },
    "kernelspec": {
      "display_name": "decision-layer",
      "language": "python",
      "name": "python3"
    },
    "language_info": {
      "codemirror_mode": {
        "name": "ipython",
        "version": 3
      },
      "file_extension": ".py",
      "mimetype": "text/x-python",
      "name": "python",
      "nbconvert_exporter": "python",
      "pygments_lexer": "ipython3",
      "version": "3.11.4"
    }
  },
  "nbformat": 4,
  "nbformat_minor": 0
}


PR Code Suggestions ✨

Explore these optional code suggestions:

Category | Suggestion | Score
Possible issue
Fix embedding normalization axis

Ensure that the normalization of embeddings in the call method is performed
along the correct axis, as using axis=0 may lead to incorrect results depending on
the shape of xd. Typically, normalization should be done along axis=1 for row-wise
normalization.

semantic_router/encoders/sentence_transformers.py [46]

-xd = xd / np.linalg.norm(xd, axis=0)
+xd = xd / np.linalg.norm(xd, axis=1, keepdims=True)
Suggestion importance[1-10]: 9

Why: The suggestion correctly identifies a potential issue with the normalization axis in the __call__ method. Changing the axis to 1 ensures row-wise normalization, which is the standard approach for embedding vectors. This fix improves the correctness of the code.

Ensure torch is initialized safely

Add a check in _get_device to ensure that the torch module is properly initialized
before accessing its attributes like cuda or backends.mps, to prevent potential
runtime errors if _initialize_torch fails.

semantic_router/encoders/torch.py [27]

-elif self._torch.cuda.is_available():
+elif self._torch and self._torch.cuda.is_available():
Suggestion importance[1-10]: 8

Why: Adding a check to ensure _torch is initialized before accessing its attributes is a good safeguard against runtime errors. This suggestion enhances the robustness of the _get_device method.
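A minimal sketch of the guarded device-selection pattern the reviewer suggests, assuming the encoder caches the imported module on a `self._torch` attribute. The class and attribute names here are illustrative, not semantic-router's actual API:

```python
class TorchAbstractDenseEncoderSketch:
    """Illustrative sketch of guarded device selection; not the real class."""

    def __init__(self):
        self._torch = None
        try:
            import torch
            self._torch = torch
        except ImportError:
            pass  # the real code raises with an install hint instead

    def _get_device(self) -> str:
        # Guard against self._torch being None before touching attributes,
        # so a failed import degrades to CPU rather than raising AttributeError.
        if self._torch is None:
            return "cpu"
        if self._torch.cuda.is_available():
            return "cuda"
        # getattr guards against older torch builds without the mps backend.
        mps = getattr(self._torch.backends, "mps", None)
        if mps is not None and mps.is_available():
            return "mps"
        return "cpu"
```

With this shape, every attribute access on the torch module sits behind the `None` check, which is the property the suggestion is after.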

General
Prevent redundant torch initialization

Ensure that _initialize_torch is called only once during initialization to avoid
redundant imports or potential performance issues, as it is currently invoked
multiple times in the class.

semantic_router/encoders/clip.py [61]

-torch = self._initialize_torch()
+if not hasattr(self, '_torch'):
+    self._torch = self._initialize_torch()
+torch = self._torch
Suggestion importance[1-10]: 7

Why: The suggestion to avoid redundant calls to _initialize_torch is valid and improves performance by ensuring the initialization is done only once. However, the performance impact may not be significant, so the improvement is moderate.
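One way to memoize the import, along the lines of the `hasattr` check the suggestion sketches, looks roughly like this (illustrative names; not the actual semantic-router implementation):

```python
class LazyTorchMixin:
    """Sketch of a memoized lazy torch import; names are illustrative."""

    def _initialize_torch(self):
        # Cache the module on first use so repeated calls return the same
        # object instead of re-running the import machinery each time.
        if not hasattr(self, "_torch"):
            try:
                import torch
            except ImportError as e:
                raise ImportError(
                    f"Please install PyTorch to use {type(self).__name__}."
                ) from e
            self._torch = torch
        return self._torch
```

Python itself caches imports in `sys.modules`, so the win is small, but memoizing also keeps the error-handling path in one place.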


@jamescalam jamescalam linked an issue Jan 21, 2025 that may be closed by this pull request
@jamescalam jamescalam marked this pull request as draft January 21, 2025 09:59
@jamescalam jamescalam self-assigned this Jan 21, 2025
Development

Successfully merging this pull request may close these issues.

Sentence transformer support
1 participant