Fix links and re-type auto-ibis meta-data
blythed committed Dec 2, 2023
1 parent 78f8255 commit e5c5186
Showing 17 changed files with 400 additions and 66 deletions.
14 changes: 7 additions & 7 deletions README.md
@@ -54,14 +54,14 @@ SuperDuperDB eliminates the need for complex MLOps pipelines and specialized vec


### Key Features:
-- **[Integration of AI with your existing data infrastructure](https://docs.superduperdb.com/docs/docs/apply_models):** Integrate any AI models and APIs with your databases in a single scalable deployment, without the need for additional pre-processing steps, ETL or boilerplate code.
-- **[Streaming Inference](https://docs.superduperdb.com/docs/docs/daemonizing_models_with_listeners):** Have your models compute outputs automatically and immediately as new data arrives, keeping your deployment always up-to-date.
-- **[Scalable Model Training](https://docs.superduperdb.com/docs/docs/training_models):** Train AI models on large, diverse datasets simply by querying your training data. Ensured optimal performance via in-build computational optimizations.
-- **[Model Chaining](https://docs.superduperdb.com/docs/docs/linking_interdependent_models)**: Easily setup complex workflows by connecting models and APIs to work together in an interdependent and sequential manner.
-- **[Simple, but Extendable Interface](https://docs.superduperdb.com/docs/docs/procedural_vs_declarative_api)**: Add and leverage any function, program, script or algorithm from the Python ecosystem to enhance your workflows and applications. Drill down on any layer as deep as it gets, up until the inner workings of your models while operating SuperDuperDB with simple Python commands.
-- **[Difficult Data-Types](https://docs.superduperdb.com/docs/docs/encoding_special_data_types)**: Work directly with images, video, audio in your datastore, and any type which can be encoded as `bytes` in Python.
+- **[Integration of AI with your existing data infrastructure](https://docs.superduperdb.com/docs/docs/walkthrough/apply_models):** Integrate any AI models and APIs with your databases in a single scalable deployment, without the need for additional pre-processing steps, ETL or boilerplate code.
+- **[Streaming Inference](https://docs.superduperdb.com/docs/docs/walkthrough/daemonizing_models_with_listeners):** Have your models compute outputs automatically and immediately as new data arrives, keeping your deployment always up-to-date.
+- **[Scalable Model Training](https://docs.superduperdb.com/docs/docs/walkthrough/training_models):** Train AI models on large, diverse datasets simply by querying your training data. Ensured optimal performance via in-build computational optimizations.
+- **[Model Chaining](https://docs.superduperdb.com/docs/docs/walkthrough/linking_interdependent_models/)**: Easily setup complex workflows by connecting models and APIs to work together in an interdependent and sequential manner.
+- **[Simple, but Extendable Interface](https://docs.superduperdb.com/docs/docs/fundamentals/procedural_vs_declarative_api)**: Add and leverage any function, program, script or algorithm from the Python ecosystem to enhance your workflows and applications. Drill down on any layer as deep as it gets, up until the inner workings of your models while operating SuperDuperDB with simple Python commands.
+- **[Difficult Data-Types](https://docs.superduperdb.com/docs/docs/walkthrough/encoding_special_data_types/)**: Work directly with images, video, audio in your datastore, and any type which can be encoded as `bytes` in Python.
 - **[Feature Storing](https://docs.superduperdb.com/docs/docs/mongodb_query_API#inserts):** Turn your database into a centralized repository for storing and managing inputs and outputs of AI models of arbitrary data-types, making them available in a structured format and known environment.
-- **[Vector Search](https://docs.superduperdb.com/docs/docs/vector_search):** No need for duplicating and migrating your data to additional specialized vector databases - turn your existing battle-tested datastore into a fully-fledged multi-modal vector-search database, including easy generation of vector embeddings and vector indexes of your data with preferred models and APIs.
+- **[Vector Search](https://docs.superduperdb.com/docs/docs/walkthrough/vector_search):** No need for duplicating and migrating your data to additional specialized vector databases - turn your existing battle-tested datastore into a fully-fledged multi-modal vector-search database, including easy generation of vector embeddings and vector indexes of your data with preferred models and APIs.

### Why opt for SuperDuperDB?
|| With SuperDuperDB | Without |
28 changes: 17 additions & 11 deletions deploy/app_template/app_template.py
@@ -14,17 +14,23 @@

@app.get("/")
def show():
-    return {"models": db.show('model'), 'listeners': db.show('listener'), 'vector_indexes': db.show('vector_index')}
+    return {
+        "models": db.show('model'),
+        'listeners': db.show('listener'),
+        'vector_indexes': db.show('vector_index'),
+    }
 
 
 @app.get("/search")
 def search(input: str):
-    results = sorted(list(
-        collection
-        .like(Document({'<key>': input}), vector_index='<index-name>', n=20)
-        .find({}, {'_id': 0}),
-        key=lambda x: -x['score'],
-    ))
+    results = sorted(
+        list(
+            collection.like(
+                Document({'<key>': input}), vector_index='<index-name>', n=20
+            ).find({}, {'_id': 0}),
+            key=lambda x: -x['score'],
+        )
+    )
     return {'results': results}
 
 
@@ -35,10 +41,10 @@ def predict(input: str):
         model_name='<model-name>',
         input=input,
         context_select=(
-            collection
-            .like(Document({'<key>': input}), vector_index='<index-name>', n=num_results)
-            .find()
+            collection.like(
+                Document({'<key>': input}), vector_index='<index-name>', n=num_results
+            ).find()
         ),
         context_key='txt',
     )
-    return {'prediction': output}
+    return {'prediction': output}
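
In the `search` handler above, `key=` sits inside the `list(...)` call in both the old and the reformatted version, so it is handed to `list` rather than to `sorted`; Python's `list` accepts no keyword arguments and raises `TypeError` when given one. A minimal sketch of what appears to be the intended ordering follows. The `hits` data and the `rank_by_score` helper are hypothetical stand-ins, not part of the template, and no live `collection` query is involved.

```python
# Hypothetical stand-ins for documents returned by a vector-search query;
# only the 'score' field matters for the ordering.
hits = [
    {'txt': 'b', 'score': 0.42},
    {'txt': 'a', 'score': 0.91},
    {'txt': 'c', 'score': 0.17},
]


def rank_by_score(documents):
    """Return the documents ordered from highest to lowest score."""
    # `key` is an argument of `sorted`, not of `list`.
    return sorted(list(documents), key=lambda x: -x['score'])


results = rank_by_score(hits)
print([r['txt'] for r in results])
```

Negating the score keeps the lambda from the template; `sorted(..., key=lambda x: x['score'], reverse=True)` would be an equivalent spelling.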
1 change: 1 addition & 0 deletions deploy/testenv/preload.py
@@ -1,2 +1,3 @@
import sys

sys.path.append('./')
4 changes: 2 additions & 2 deletions docs/hr/content/docs/data_integrations/mongodb.md
@@ -55,7 +55,7 @@ db.execute(
)
```

-Read more about vector-search [here](../fundamentals/25_vector_search.mdx).
+Read more about vector-search [here](../fundamentals/vector_search_algorithm.md).

## Deletes

@@ -68,4 +68,4 @@ db.execute(collection.delete_many({}))
Aggregates are exactly as in `pymongo`, with the exception that a `$vectorSearch` stage may be
fed with an additional field `'like': Document({...})`, which plays the same role as in selects.

-Read more about this in [the vector-search section](../fundamentals/25_vector_search.mdx).
+Read more about this in [the vector-search section](../walkthrough/vector_search).
2 changes: 1 addition & 1 deletion docs/hr/content/docs/fundamentals/component_abstraction.md
@@ -56,7 +56,7 @@ instances.

### `Stack`

-A `Stack` is a way of connecting diverse and interoperating sets of functionality. See [here](../walkthrough/28_creating_stacks_of_functionality.md) for more details.
+A `Stack` is a way of connecting diverse and interoperating sets of functionality. See [here](../walkthrough/creating_stacks_of_functionality) for more details.

## Activating components

5 changes: 3 additions & 2 deletions docs/hr/content/docs/fundamentals/component_versioning.md
@@ -4,7 +4,7 @@ sidebar_position: 26

# Component versioning

-Whenever a `Component` is created (see [here](../fundamentals/09_component_abstraction.md) for overview of `Component` classes),
+Whenever a `Component` is created (see [here](../fundamentals/component_abstraction.md) for overview of `Component` classes),
information about that `Component` is saved in the `db.metadata` store.

All components come with an `.identifier` attribute, which is a unique identifying string for that `Component` instance.
@@ -47,4 +47,5 @@ When one adds the `VectorIndex` with `db.add(vector_index)`,
the sub-components are also versioned, if a version has not already
been assigned to those components in the same session.

-Read more about `VectorIndex` and vector-searches [here](../fundamentals/25_vector_search.mdx).
+Read more about `VectorIndex` and vector-searches [here](../walkthrough/vector_search.md).
+
8 changes: 4 additions & 4 deletions docs/hr/content/docs/fundamentals/datalayer_overview.md
@@ -44,9 +44,9 @@ The databackend typically connects to your database (although `superduperdb` als
and dispatches queries written in a query API which is compatible with that databackend, but which also includes additional aspects
specific to `superduperdb`.

-Read more [here](../walkthrough/11_supported_query_APIs.md).
+Read more [here](../data_integrations/supported_query_APIs.md).

-The databackend is configured by setting the URI `CFG.databackend` in the [configuration system](../walkthrough/01_configuration.md).
+The databackend is configured by setting the URI `CFG.databackend` in the [configuration system](../setup/configuration.md).

We support the same databackends as supported by the [`ibis` project](https://ibis-project.org/):

@@ -168,7 +168,7 @@ Here are the key methods which you'll use again and again:

### `db.execute`

-This method executes a query. For an overview of how this works see [here](../walkthrough/11_supported_query_APIs.md).
+This method executes a query. For an overview of how this works see [here](../data_integrations/supported_query_APIs.md).

### `db.add`

@@ -196,4 +196,4 @@ Validate your components (mostly models)

### `db.predict`

-Infer predictions from models hosted by `superduperdb`. Read more about this and about models [here](../fundamentals/21_apply_models.mdx).
+Infer predictions from models hosted by `superduperdb`. Read more about this and about models [here](../walkthrough/apply_models.md).
Original file line number Diff line number Diff line change
@@ -54,4 +54,4 @@ db.add(
)
```

-Read more about the `VectorIndex` concept [here](25_vector_search.mdx).
+Read more about the `VectorIndex` concept [here](../walkthrough/vector_search.md).
Original file line number Diff line number Diff line change
@@ -50,4 +50,4 @@ The most similar `ids` are retrieved. The `select` part of the query is then tra
a similar query which searches within the retrieved `ids`. The full set of results are returned
to the client.

-Read [here](../walkthrough/vector_search.mdx) about setting up and detailed usage of vector-search.
+Read [here](../walkthrough/vector_search.md) about setting up and detailed usage of vector-search.
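
The two-stage flow described in this hunk (retrieve the most similar `ids` first, then run the `select` restricted to those `ids`) can be sketched in plain Python. This is an illustration only, not SuperDuperDB's implementation: the in-memory `VECTORS` and `DOCUMENTS` tables, the cosine scoring, and the helper names `most_similar_ids` and `select_within` are all assumptions made for the example.

```python
import math

# Hypothetical in-memory stand-ins for a vector index and a document store.
VECTORS = {1: [1.0, 0.0], 2: [0.8, 0.6], 3: [0.0, 1.0], 4: [-1.0, 0.0]}
DOCUMENTS = {
    1: {'_id': 1, 'lang': 'en'},
    2: {'_id': 2, 'lang': 'de'},
    3: {'_id': 3, 'lang': 'en'},
    4: {'_id': 4, 'lang': 'en'},
}


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def most_similar_ids(query_vector, n):
    """Stage 1: return the ids of the n vectors closest to the query."""
    ranked = sorted(
        VECTORS,
        key=lambda i: cosine(query_vector, VECTORS[i]),
        reverse=True,
    )
    return ranked[:n]


def select_within(ids, predicate):
    """Stage 2: apply the select filter only to the retrieved ids."""
    return [DOCUMENTS[i] for i in ids if predicate(DOCUMENTS[i])]


ids = most_similar_ids([1.0, 0.1], n=3)
results = select_within(ids, lambda doc: doc['lang'] == 'en')
print(ids, [doc['_id'] for doc in results])
```

Because the filter runs after the nearest-neighbour step, fewer than `n` documents can come back when some of the nearest `ids` do not match the select.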
12 changes: 6 additions & 6 deletions docs/hr/content/docs/get_started/quickstart.md
@@ -12,30 +12,30 @@ Follow these steps to quickly get started:
hosted there, can be executed directly in the environment, but can also be cloned in [their original form in the open source repo](https://github.com/SuperDuperDB/superduperdb/tree/main/examples), and executed locally.
These notebooks are also described on this documentation website in the [use-cases section](/docs/use-cases).

-1. **Get setup**
+2. **Get setup**

Follow [the installation guide](./installation.md) and check the [minimum-working example](./minimum_working_example.md)
to set up your environment. For more detailed configuration, read the detailed [setup](/docs/category/setup) section.

-1. **Dive into the documentation**
+3. **Dive into the documentation**

Refer to our comprehensive [`README.md`](https://github.com/superDuperDB/) for a high-level overview of SuperDuperDB. The long-form documentation you are reading now provides deeper insights into features, usage, and best practices.

-1. **Explore the community apps examples**
+4. **Explore the community apps examples**

Visit our [`community apps example`](https://github.com/superDuperDB/superduper-community-apps) repository to explore more examples of how SuperDuperDB can enhance your experience. Learn from real-world use cases and implementations.

-1. **Grasp the fundamentals**
+5. **Grasp the fundamentals**

Read through the [`Fundamentals`](../fundamentals/glossary) section to gain a solid understanding of SuperDuperDB's architecture and refer to the [`API References`](https://docs.superduperdb.com/apidocs/source/superduperdb.html) for detailed information on API usage.

-1. **Engage with the Community**
+6. **Engage with the Community**

If you encounter challenges, join our [Slack Channels](https://join.slack.com/t/superduperdb/shared_invite/zt-1zuojj0k0-RjAYBs1TDsvEa7yaFGa6QA) for assistance. Report bugs and share feature requests [by raising an issue](https://github.com/SuperDuperDB/superduperdb/issues). Our community is here to support you.

You are welcome to join the conversation on our [discussions forum](https://github.com/SuperDuperDB/superduperdb/discussions) and follow our open-source roadmap [here](https://github.com/orgs/SuperDuperDB/projects/1/views/10).

-1. **Contribute and Share**
+7. **Contribute and Share**

Contribute to the SuperDuperDB community by sharing your solutions and experiences.
Help us grow by promoting SuperDuperDB to your peers and the wider world. Your involvement is valuable to us! Don't forget to give us a star ⭐!
Original file line number Diff line number Diff line change
@@ -33,8 +33,8 @@ There are several gradations of a more productionized deployment.
In the most distributed case we have:

- A `jupyter` environment running in its own process
-- A [distributed **Dask** cluster](31_non_blocking_dask_jobs.md), with scheduler and workers configured to work with `superduperdb`
-- A [**change-data-capture** service](32_change_data_capture.md)
-- A [**vector-search** service](33_vector_comparison_service.md), which finds similar vectors, given an input vector
+- A [distributed **Dask** cluster](non_blocking_dask_jobs.md), with scheduler and workers configured to work with `superduperdb`
+- A [**change-data-capture** service](change_data_capture.md)
+- A [**vector-search** service](vector_comparison_service.md), which finds similar vectors, given an input vector

In the remainder of this section we describe the use of each of these services
2 changes: 1 addition & 1 deletion docs/hr/content/docs/setup/connecting.md
@@ -41,4 +41,4 @@ db = superduper('mongodb://localhost:27018', CFG=CFG)
```

The `db` object is an instance of `superduperdb.base.datalayer.Datalayer`.
-The `Datalayer` class handles AI models and communicates with the databackend and associated components. Read more [here](07_datalayer_overview.md).
+The `Datalayer` class handles AI models and communicates with the databackend and associated components. Read more [here](../fundamentals/datalayer_overview.md).
2 changes: 1 addition & 1 deletion docs/hr/content/docs/walkthrough/apply_models.md
@@ -77,7 +77,7 @@ my_model.predict(
Under-the-hood, this call creates a `Listener` which is deployed on
the query passed to the `.predict` call.

-Read more about the `Listener` abstraction [here](22_daemonizing_models_with_listeners.md)
+Read more about the `Listener` abstraction [here](daemonizing_models_with_listeners.md)

### Activating models for vector-search with `create_vector_index=True`

2 changes: 1 addition & 1 deletion docs/hr/content/docs/walkthrough/vector_search.md
@@ -60,7 +60,7 @@ SuperDuperDB supports queries via:
- `pymongo`
- `ibis`

-Read more about this [here](../walkthrough/11_supported_query_APIs.md).
+Read more about this [here](../data_integrations/supported_query_APIs.md).

In order to use vector-search in a query, one combines these APIs with the `.like` operator.
