Superlinked - The Vector Computer

Experiment in a notebook | Run in production | Use-cases | Supported VDBs | Resources

Superlinked is a compute framework for your information retrieval and feature engineering systems, focused on turning complex (structured+unstructured) data into ultra-modal vector embeddings within your RAG, Search, Recommendations and Analytics stack. Integrate Superlinked into your machine learning stack for custom model performance with pre-trained model convenience.

If you like what we do, give us a star! ⭐

Visit Superlinked for more information about the company behind this product and our other initiatives.

Features

Embed structured and unstructured data (Text | Number | Category | Time | Event)
Combine encoders to build a custom model (notebook)
Add a custom encoder (notebook)
Update your vectors with behavioral events & relationships (notebook)
Use query-time weights (notebook)
Query with natural language (notebook)
Filter your results (notebook)
Export vectors for analysis (notebook)

You can check a full list of our features or head to our reference section for more information.

Use-cases

Dive deeper with our notebooks into how each use-case benefits from the Superlinked framework.

RAG: HR Knowledgebase
Semantic Search: Movies, Business News
Recommendation Systems: E-commerce
Analytics: User Acquisition, Keyword expansion

You can check a full list of examples here.

Experiment in a notebook

This example combines Text and Numerical encoders so that a natural language query, parsed by an LLM, returns correctly ranked results.

Install the superlinked library

%pip install superlinked

Run the example:

The first run takes slightly longer because it has to download the embedding model.

import json

from superlinked.framework.common.embedding.number_embedding import Mode
from superlinked.framework.common.nlq.open_ai import OpenAIClientConfig
from superlinked.framework.common.parser.dataframe_parser import DataFrameParser
from superlinked.framework.common.schema.schema import schema
from superlinked.framework.common.schema.schema_object import Integer, String
from superlinked.framework.common.schema.id_schema_object import IdField
from superlinked.framework.dsl.space.number_space import NumberSpace
from superlinked.framework.dsl.space.text_similarity_space import TextSimilaritySpace
from superlinked.framework.dsl.index.index import Index
from superlinked.framework.dsl.query.param import Param
from superlinked.framework.dsl.query.query import Query
from superlinked.framework.dsl.source.in_memory_source import InMemorySource
from superlinked.framework.dsl.executor.in_memory.in_memory_executor import (
    InMemoryExecutor,
)

@schema
class Review:
    id: IdField
    review_text: String
    rating: Integer


review = Review()

review_text_space = TextSimilaritySpace(
    text=review.review_text, model="Alibaba-NLP/gte-large-en-v1.5"
)
rating_maximizer_space = NumberSpace(
    number=review.rating, min_value=1, max_value=5, mode=Mode.MAXIMUM
)
index = Index([review_text_space, rating_maximizer_space], fields=[review.rating])

# fill this with your API key - this will drive param extraction
openai_config = OpenAIClientConfig(
    api_key="YOUR_OPENAI_API_KEY", model="gpt-4o"
)

# it is possible now to add descriptions to a `Param` to aid the parsing of information from natural language queries.
text_similar_param = Param(
    "query_text",
    description="The text in the user's query that is used to search in the reviews' body. Extract info that does not apply to other spaces or params.",
)

# Define your query using dynamic parameters for query text and weights.
# we will have our LLM fill them based on our natural language query
query = (
    Query(
        index,
        weights={
            review_text_space: Param("review_text_weight"),
            rating_maximizer_space: Param("rating_maximizer_weight"),
        },
    )
    .find(review)
    .similar(
        review_text_space.text,
        text_similar_param,
    )
    .limit(Param("limit"))
    .with_natural_query(Param("natural_query"), openai_config)
)

# Run the app.
source: InMemorySource = InMemorySource(review)
executor = InMemoryExecutor(sources=[source], indices=[index])
app = executor.run()

# Define a small in-line example dataset.
data = [
    {"id": 1, "review_text": "Useless product", "rating": 1},
    {"id": 2, "review_text": "Great product I am so happy!", "rating": 5},
    {"id": 3, "review_text": "Mediocre stuff fits the purpose", "rating": 3},
]

# Ingest data to the framework.
source.put(data)

result = app.query(query, natural_query="Show me the best product", limit=1)

# examine the extracted parameters from your query
print(json.dumps(result.knn_params, indent=2))
# the result is the 5 star rated product
result.to_pandas()
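
To build intuition for why the rating weight changes which review comes first, here is a minimal, framework-free sketch. It is not Superlinked's implementation: the 2-D text vectors, the quarter-circle rating encoding, and the `rank` helper are all hypothetical stand-ins chosen so the arithmetic is easy to follow. The only idea carried over from the example above is that a text-similarity score and a "prefer the maximum rating" score are blended using query-time weights.

```python
import math

# Hypothetical 2-D stand-ins for text embeddings (unit length); a real model
# such as the one used above produces much larger vectors.
TOY_TEXT_VECTORS = {1: [1.0, 0.0], 2: [0.6, 0.8], 3: [0.8, 0.6]}
RATINGS = {1: 1, 2: 5, 3: 3}


def encode_rating(rating, min_value=1, max_value=5):
    """Illustrative encoding: place the rating on a quarter circle."""
    scaled = (rating - min_value) / (max_value - min_value)
    angle = scaled * math.pi / 2
    return [math.sin(angle), math.cos(angle)]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def rank(text_query, text_weight, rating_weight):
    """Return item ids ordered by the weighted sum of both similarity scores."""
    rating_target = encode_rating(5)  # "maximum" mode: aim at the top rating
    scores = {}
    for item_id, text_vector in TOY_TEXT_VECTORS.items():
        text_score = dot(text_query, text_vector)
        rating_score = dot(rating_target, encode_rating(RATINGS[item_id]))
        scores[item_id] = text_weight * text_score + rating_weight * rating_score
    return sorted(scores, key=scores.get, reverse=True)


text_only = rank([1.0, 0.0], text_weight=1.0, rating_weight=0.0)
balanced = rank([1.0, 0.0], text_weight=1.0, rating_weight=1.0)
print("text only:", text_only)
print("text + rating:", balanced)
```

With the rating weight at zero the ordering follows text similarity alone; once the rating weight is non-zero, the 5-star review (id 2) moves to the top, which mirrors the behavior of the example query.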

Run in production

Superlinked Server allows you to leverage the power of Superlinked in deployable projects. With a single script, you can deploy a Superlinked-powered app instance that creates REST endpoints and connects to external Vector Databases. This makes it an ideal solution for those seeking an easy-to-deploy environment for their Superlinked projects.

If you are interested in learning more about running at scale, book a demo for early access to our managed cloud.

Supported VDBs

We have started partnering with vector database providers to allow you to use Superlinked with your VDB of choice. If you are unsure which VDB to choose, check out our Vector DB Comparison.

Missing your favorite VDB? Tell us which vector database we should support next!

Reference

Describe your data using Python classes with the @schema decorator.
Describe your vector embeddings from building blocks with Spaces.
Combine your embeddings into a queryable Index.
Define your search with dynamic parameters and weights as a Query.
Load your data using a Source.
Define your transformations with a Parser (e.g.: from pd.DataFrame).
Run your configuration with an Executor.

You can check all references here.

Logging

Contextual information, such as the process ID and package scope, is automatically included in log messages. Personally Identifiable Information (PII) is filtered out by default but can be exposed by setting the SUPERLINKED_EXPOSE_PII environment variable to true.
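
As a small sketch, the variable can be set from Python before the framework is used. The variable name and the value `true` come from the description above; that it needs to be set before Superlinked is imported is an assumption, made here to be on the safe side.

```python
import os

# Assumption: set this before importing superlinked so the logging setup sees it.
os.environ["SUPERLINKED_EXPOSE_PII"] = "true"

print(os.environ["SUPERLINKED_EXPOSE_PII"])
```

Setting the variable in the shell or in a deployment configuration should work equally well; only enable it in environments where logging PII is acceptable.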

Resources

Vector DB Comparison: Open-source collaborative comparison of vector databases by Superlinked.
Vector Hub: A free and open-source learning hub for people interested in adding vector retrieval to their ML stack.

Support

If you encounter any challenges during your experiments, feel free to create an issue, request a feature, or start a discussion. Please group your feedback into separate issues and discussions by topic. Thank you for your feedback!
