Refactor Import of `HuggingFaceEndpointEmbeddings` to Avoid Unnecessary large `pytorch` Dependency #24482

nkratzke · 2024-07-21T17:03:42Z


Checked

I searched existing ideas and did not find a similar one
I added a very descriptive title
I've clearly described the feature request and motivation for it

Feature request

Description

Importing HuggingFaceEndpointEmbeddings from langchain_huggingface.embeddings currently requires installing the complete langchain-huggingface package. This package pulls in the pytorch library as a dependency, which can grow container images to about 6 GB. This is problematic for use cases that only need remote embedding API access and never run a model locally.

Proposed Solution

Refactor the HuggingFaceEndpointEmbeddings module so that it can be imported and used without the need for pytorch and other heavy dependencies. This could be achieved by:

Splitting the langchain-huggingface package into smaller, more focused modules.
Creating a lightweight version of the HuggingFaceEndpointEmbeddings class that only includes the necessary components for remote embedding API access.
Using optional dependencies or extras to include pytorch only when necessary for local model execution.
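
The third option above (pytorch as an optional dependency) is often paired with a lazy import, so that the heavy module is only loaded when a local-model code path actually runs. The following is a minimal sketch of that pattern using only the standard library. The helper names `torch_available` and `require_optional` and the extras name `local` are hypothetical and chosen for illustration; they are not part of langchain-huggingface.

```python
import importlib
import importlib.util
from types import ModuleType


def torch_available() -> bool:
    """Return True if the optional `torch` package is importable."""
    return importlib.util.find_spec("torch") is not None


def require_optional(module_name: str, extra: str) -> ModuleType:
    """Import an optional module on demand.

    If the module is missing, raise an ImportError that names the
    (hypothetical) extra the user would need to install.
    """
    if importlib.util.find_spec(module_name) is None:
        raise ImportError(
            f"`{module_name}` is required for this feature. "
            f"Install it with `pip install 'langchain-huggingface[{extra}]'`."
        )
    return importlib.import_module(module_name)


class LocalEmbedder:
    """Hypothetical local-model class: torch is imported on first use only."""

    def __init__(self) -> None:
        self._torch = None  # nothing heavy is imported at construction time

    def _load_torch(self) -> ModuleType:
        if self._torch is None:
            self._torch = require_optional("torch", extra="local")
        return self._torch
```

With this layout, code that only talks to a remote endpoint never touches `_load_torch`, so importing the package would not require pytorch to be installed.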

Benefits

Reduced Container Image Size: By removing the unnecessary pytorch dependency, the size of container images can be significantly reduced, making deployments faster and more efficient.
Improved Performance: Smaller container images can lead to quicker startup times and lower memory usage.
Flexibility: Users who only need remote embedding API access will not be forced to install and manage the heavyweight pytorch library.

Additional Context

This change is particularly important for users who operate in environments with strict resource limitations or those who prioritize lightweight and efficient deployments.

Example

Current import statement:

from langchain_huggingface.embeddings import HuggingFaceEndpointEmbeddings

Proposed import statement after refactoring:

from langchain_huggingface.embeddings import HuggingFaceEndpointEmbeddings

but with an optional dependency:

pip install langchain-huggingface[endpoint]

Impact

This change will help in making the langchain-huggingface package more modular and user-friendly, especially for those who rely solely on remote services for embedding tasks.

Thank you for considering this proposal. I believe it will greatly enhance the usability and efficiency of the langchain-huggingface package.

Motivation

I wrote a containerized Retrieval Augmented Generation (RAG) app that needs only HuggingFaceEndpointEmbeddings as a dependency. The container image size was about 200 MB. After adding the langchain-huggingface dependency, the image size exploded to about 6 GB! I think this is mainly because of the required pytorch dependency, which is not needed for this remote embedding endpoint use case.

Proposal (If applicable)

No response

morgandiverrez · 2024-09-03T08:27:04Z

mujtabajalil · 2024-11-29T11:41:43Z


As a workaround, you can copy the code below into your own Python file and import the class from there instead:

from (your py file name) import HuggingFaceEndpointEmbeddings

Make sure you have langchain_core and huggingface_hub installed.

import json
import os
from typing import Any, List, Optional

from langchain_core.embeddings import Embeddings
from langchain_core.utils import from_env
from pydantic import BaseModel, ConfigDict, Field, model_validator
from typing_extensions import Self

DEFAULT_MODEL = "sentence-transformers/all-mpnet-base-v2"
VALID_TASKS = ("feature-extraction",)

class HuggingFaceEndpointEmbeddings(BaseModel, Embeddings):

    client: Any = None  #: :meta private:
    async_client: Any = None  #: :meta private:
    model: Optional[str] = None
    """Model name to use."""
    repo_id: Optional[str] = None
    """Huggingfacehub repository id, for backward compatibility."""
    task: Optional[str] = "feature-extraction"
    """Task to call the model with."""
    model_kwargs: Optional[dict] = None
    """Keyword arguments to pass to the model."""

    huggingfacehub_api_token: Optional[str] = Field(
        default_factory=from_env("HUGGINGFACEHUB_API_TOKEN", default=None)
    )

    model_config = ConfigDict(
        extra="forbid",
        protected_namespaces=(),
    )

    @model_validator(mode="after")
    def validate_environment(self) -> Self:
        """Validate that api key and python package exists in environment."""
        huggingfacehub_api_token = self.huggingfacehub_api_token or os.getenv(
            "HF_TOKEN"
        )

        try:
            from huggingface_hub import (  # type: ignore[import]
                AsyncInferenceClient,
                InferenceClient,
            )

            if self.model:
                self.repo_id = self.model
            elif self.repo_id:
                self.model = self.repo_id
            else:
                self.model = DEFAULT_MODEL
                self.repo_id = DEFAULT_MODEL

            client = InferenceClient(
                model=self.model,
                token=huggingfacehub_api_token,
            )

            async_client = AsyncInferenceClient(
                model=self.model,
                token=huggingfacehub_api_token,
            )

            if self.task not in VALID_TASKS:
                raise ValueError(
                    f"Got invalid task {self.task}, "
                    f"currently only {VALID_TASKS} are supported"
                )
            self.client = client
            self.async_client = async_client

        except ImportError:
            raise ImportError(
                "Could not import huggingface_hub python package. "
                "Please install it with `pip install huggingface_hub`."
            )
        return self

    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        """Call out to HuggingFaceHub's embedding endpoint for embedding search docs.

        Args:
            texts: The list of texts to embed.

        Returns:
            List of embeddings, one for each text.
        """
        # replace newlines, which can negatively affect performance.
        texts = [text.replace("\n", " ") for text in texts]
        _model_kwargs = self.model_kwargs or {}
        #  api doc: https://huggingface.github.io/text-embeddings-inference/#/Text%20Embeddings%20Inference/embed
        responses = self.client.post(
            json={"inputs": texts, **_model_kwargs}, task=self.task
        )
        return json.loads(responses.decode())

    async def aembed_documents(self, texts: List[str]) -> List[List[float]]:
        """Async Call to HuggingFaceHub's embedding endpoint for embedding search docs.

        Args:
            texts: The list of texts to embed.

        Returns:
            List of embeddings, one for each text.
        """
        # replace newlines, which can negatively affect performance.
        texts = [text.replace("\n", " ") for text in texts]
        _model_kwargs = self.model_kwargs or {}
        responses = await self.async_client.post(
            json={"inputs": texts, "parameters": _model_kwargs}, task=self.task
        )
        return json.loads(responses.decode())

    def embed_query(self, text: str) -> List[float]:
        """Call out to HuggingFaceHub's embedding endpoint for embedding query text.

        Args:
            text: The text to embed.

        Returns:
            Embeddings for the text.
        """
        response = self.embed_documents([text])[0]
        return response

    async def aembed_query(self, text: str) -> List[float]:
        """Async Call to HuggingFaceHub's embedding endpoint for embedding query text.

        Args:
            text: The text to embed.

        Returns:
            Embeddings for the text.
        """
        response = (await self.aembed_documents([text]))[0]
        return response

To use it:

embedding = HuggingFaceEndpointEmbeddings(
    model="snowflake/snowflake-arctic-embed-m",
    task="feature-extraction",
    huggingfacehub_api_token=HUGGINGFACE_API_TOKEN,
)
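
The request and response handling in `embed_documents` above can be tried offline, without a token or network access, by swapping in a stub client. The sketch below is an assumption-laden illustration: `FakeInferenceClient` is a hypothetical stand-in that only mimics the `post(json=..., task=...)` call shape used by the class, the standalone `embed_documents` function copies the method's logic, and the `truncate` keyword is just an example value for `model_kwargs`.

```python
import json as json_module
from typing import Any, Dict, List, Optional


class FakeInferenceClient:
    """Hypothetical stub exposing the same `post(json=..., task=...)` shape."""

    def __init__(self) -> None:
        self.last_payload: Optional[Dict[str, Any]] = None
        self.last_task: Optional[str] = None

    def post(self, *, json: Dict[str, Any], task: str) -> bytes:
        # Record what was sent so the caller can inspect it afterwards.
        self.last_payload = json
        self.last_task = task
        # Return one dummy 2-dimensional vector per input text.
        vectors = [[float(len(text)), 0.0] for text in json["inputs"]]
        return json_module.dumps(vectors).encode()


def embed_documents(
    client: Any,
    texts: List[str],
    model_kwargs: Optional[Dict[str, Any]] = None,
    task: str = "feature-extraction",
) -> List[List[float]]:
    """Mirror of the sync method: clean texts, post them, decode JSON bytes."""
    texts = [text.replace("\n", " ") for text in texts]
    responses = client.post(
        json={"inputs": texts, **(model_kwargs or {})}, task=task
    )
    return json_module.loads(responses.decode())


if __name__ == "__main__":
    stub = FakeInferenceClient()
    print(embed_documents(stub, ["hello\nworld", "abc"], {"truncate": True}))
    print(stub.last_payload)
```

The recorded payload shows that newlines are replaced before sending and that the sync path merges `model_kwargs` into the top level of the request body, whereas the async method in the class nests them under `"parameters"`.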

