Performance: Embeddings are recalculated on each startup with PostgresIndex despite existing in database #459

ChristianWeyer · 2024-11-06T16:37:49Z

Hi all,

OK, I am sure I am holding it the wrong way and doing something silly. So, here goes...

I'm using semantic-router with PostgresIndex and notice that embeddings are being recalculated on each startup, even though they should already exist in the Postgres database from previous runs. This makes startup slow, especially with our larger set of utterances.

Current behavior:

First run: Embeddings are calculated and stored in Postgres (expected)
Subsequent runs: Embeddings are recalculated again instead of being reused from Postgres (unexpected)

Example code:

from semantic_router import Route
from semantic_router.encoders import HuggingFaceEncoder
from semantic_router.layer import RouteLayer
from semantic_router.index.postgres import PostgresIndex

# Initialize Postgres connection
postgres_index = PostgresIndex(
    connection_string="postgresql://user:pass@host:5432/db",
    # Must match the encoder's output size; BAAI/bge-small-en-v1.5 produces 384-dimensional vectors
    dimensions=384
)

# Our routes with significant number of utterances
routes = [
    Route(
        name="availability",
        utterances=[...],  # ~100 utterances (50 German + 50 English variants)
    ),
    Route(
        name="readme",
        utterances=[...],  # ~250 utterances (mix of German and English)
    ),
    Route(
        name="bad",
        utterances=[...],  # 4 utterances
    )
]

# Initialize encoder
encoder = HuggingFaceEncoder(
    name="BAAI/bge-small-en-v1.5",
    model_kwargs={"trust_remote_code": True},
    tokenizer_kwargs={"trust_remote_code": True}
)

# Create route layer - this recalculates embeddings even though they exist in Postgres
rl = RouteLayer(encoder=encoder, routes=routes, index=postgres_index)

Questions:

Is this the right approach and understanding?
Is there a way to check if embeddings already exist in Postgres before initializing the encoder?
How can we reuse existing embeddings from Postgres without recalculating them on each startup?
What's the recommended pattern for caching embeddings with PostgresIndex when dealing with a larger number of utterances (~350 in our case)?
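
To make the second question concrete, here is a minimal sketch of the kind of check meant here. It assumes the index can hand back the (route, utterance) pairs it already stores. `find_missing_pairs` and the sample data are hypothetical names made up for illustration and are not part of semantic-router.

```python
from typing import Dict, Iterable, List, Set, Tuple

Pair = Tuple[str, str]  # (route name, utterance)


def find_missing_pairs(
    routes: Dict[str, List[str]],
    stored_pairs: Iterable[Pair],
) -> List[Pair]:
    """Return the (route, utterance) pairs not yet stored, in input order."""
    stored: Set[Pair] = set(stored_pairs)
    missing: List[Pair] = []
    for name, utterances in routes.items():
        for utterance in utterances:
            pair = (name, utterance)
            if pair not in stored:
                missing.append(pair)
                stored.add(pair)  # also skips duplicates within the input
    return missing


# Hypothetical data standing in for what the database already holds.
stored_pairs = [("availability", "Are you free tomorrow?"), ("bad", "nonsense")]
routes = {
    "availability": ["Are you free tomorrow?", "Hast du morgen Zeit?"],
    "bad": ["nonsense"],
}

missing = find_missing_pairs(routes, stored_pairs)
print(missing)
```

Only the pairs returned by such a check would need to go through the encoder; everything else could be left untouched in Postgres.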

Environment:

semantic-router version: 0.0.72
Python version: 3.9+
Database: PostgreSQL
Scale: ~350 total utterances across routes

The recalculation is particularly noticeable with our larger set of utterances. We'd like to leverage Postgres's persistence to avoid this overhead on subsequent runs.

Any guidance on the proper way to handle embedding persistence and reuse would be greatly appreciated.

Thanks.

ChristianWeyer (Author) · 2024-11-07T10:21:01Z

Maybe @jamescalam or @itsfrankjames can enlighten me? 🙂

ChristianWeyer (Author) · 2024-11-11T19:12:13Z

FYI: I created an optimized route layer that solves my issue. It only calculates embeddings for (route, utterance) pairs that are not already in the index.

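import logging
import time

from semantic_router.layer import RouteLayer

logger = logging.getLogger(__name__)


class OptimizedRouteLayer(RouteLayer):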
    """
    An optimized version of RouteLayer that efficiently handles route initialization
    by reusing existing embeddings and using batch operations.
    """
    def _add_routes(self, routes):
        if not routes:
            return
            
        # Get existing routes from the index
        existing_routes = {(route, utterance) for route, utterance in self.index.get_routes()}
        
        # Separate new routes that need embeddings
        new_routes = []
        new_utterances = []
        new_function_schemas = []
        new_metadata = []
        
        for route in routes:
            for utterance in route.utterances:
                if (route.name, utterance) not in existing_routes:
                    new_routes.append(route.name)
                    new_utterances.append(utterance)
                    new_function_schemas.append(
                        # Truthiness check also covers an empty list, where [0] would raise IndexError
                        route.function_schemas[0] if route.function_schemas else {}
                    )
                    new_metadata.append(route.metadata if route.metadata else {})

        # Only calculate embeddings for new routes
        if new_utterances:
            logger.debug(f"Calculating embeddings for {len(new_utterances)} new utterances")
            start = time.time()
            embedded_utterances = self.encoder(new_utterances)
            logger.debug(f"Embeddings calculated in {(time.time() - start) * 1000:.2f} ms")
            try:
                # Batch insertion into the index
                start = time.time()
                self.index.add(
                    embeddings=embedded_utterances,
                    routes=new_routes,
                    utterances=new_utterances,
                    function_schemas=new_function_schemas,
                    metadata_list=new_metadata,
                )
                logger.debug(f"Vectors added to index in {(time.time() - start) * 1000:.2f} ms")
            except Exception as e:
                logger.error(f"Failed to add routes to the index: {e}")
                raise
        else:
            logger.debug("All routes already exist in the index")

        # Update the local routes list
        self.routes = routes
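
The effect of this approach can be checked without a database or a real model. The sketch below simulates three startups against a persistent index and counts how many texts reach the encoder each time. `InMemoryIndex`, `CountingEncoder`, and `sync_routes` are hypothetical stand-ins written for this illustration. They only mimic the `get_routes()` / `add()` shape used above and are not semantic-router classes.

```python
from typing import Dict, List, Tuple


class InMemoryIndex:
    """Hypothetical stand-in for a persistent index such as PostgresIndex."""

    def __init__(self) -> None:
        self.rows: List[Tuple[str, str, List[float]]] = []

    def get_routes(self) -> List[Tuple[str, str]]:
        return [(route, utterance) for route, utterance, _ in self.rows]

    def add(self, embeddings, routes, utterances) -> None:
        self.rows.extend(zip(routes, utterances, embeddings))


class CountingEncoder:
    """Fake encoder that records how many texts it was asked to embed."""

    def __init__(self) -> None:
        self.encoded = 0

    def __call__(self, docs: List[str]) -> List[List[float]]:
        self.encoded += len(docs)
        return [[float(len(doc))] for doc in docs]  # dummy 1-d "embedding"


def sync_routes(index, encoder, routes: Dict[str, List[str]]) -> int:
    """Embed and store only the pairs missing from the index; return their count."""
    existing = set(index.get_routes())
    new = [
        (name, utterance)
        for name, utterances in routes.items()
        for utterance in utterances
        if (name, utterance) not in existing
    ]
    if new:
        names = [name for name, _ in new]
        texts = [utterance for _, utterance in new]
        index.add(encoder(texts), names, texts)
    return len(new)


index = InMemoryIndex()  # survives across the simulated startups
routes = {"availability": ["Are you free?", "Hast du Zeit?"], "bad": ["nonsense"]}

first = CountingEncoder()
sync_routes(index, first, routes)   # first startup: everything is new

second = CountingEncoder()
sync_routes(index, second, routes)  # second startup: nothing to encode

routes["bad"].append("gibberish")
third = CountingEncoder()
sync_routes(index, third, routes)   # third startup: only the added utterance

print(first.encoded, second.encoded, third.encoded)
```

One design consequence worth noting: like the class above, this sketch never removes pairs that are still in the index but no longer appear in the routes, so deleted utterances would have to be cleaned up separately.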
