Fix links and re-type auto-ibis meta-data

superduper-io · Dec 2, 2023 · e5c5186
1 parent 78f8255
commit e5c5186
Showing 17 changed files with 400 additions and 66 deletions.
diff --git a/README.md b/README.md
@@ -54,14 +54,14 @@ SuperDuperDB eliminates the need for complex MLOps pipelines and specialized vec
 
 
 ### Key Features:
-- **[Integration of AI with your existing data infrastructure](https://docs.superduperdb.com/docs/docs/apply_models):** Integrate any AI models and APIs with your databases in a single scalable deployment, without the need for additional pre-processing steps, ETL or boilerplate code.
-- **[Streaming Inference](https://docs.superduperdb.com/docs/docs/daemonizing_models_with_listeners):** Have your models compute outputs automatically and immediately as new data arrives, keeping your deployment always up-to-date.
-- **[Scalable Model Training](https://docs.superduperdb.com/docs/docs/training_models):** Train AI models on large, diverse datasets simply by querying your training data. Ensured optimal performance via in-build computational optimizations.
-- **[Model Chaining](https://docs.superduperdb.com/docs/docs/linking_interdependent_models)**: Easily setup complex workflows by connecting models and APIs to work together in an interdependent and sequential manner.
-- **[Simple, but Extendable Interface](https://docs.superduperdb.com/docs/docs/procedural_vs_declarative_api)**: Add and leverage any function, program, script or algorithm from the Python ecosystem to enhance your workflows and applications. Drill down on any layer as deep as it gets, up until the inner workings of your models while operating SuperDuperDB with simple Python commands.
-- **[Difficult Data-Types](https://docs.superduperdb.com/docs/docs/encoding_special_data_types)**: Work directly with images, video, audio in your datastore, and any type which can be encoded as `bytes` in Python.
+- **[Integration of AI with your existing data infrastructure](https://docs.superduperdb.com/docs/docs/walkthrough/apply_models):** Integrate any AI models and APIs with your databases in a single scalable deployment, without the need for additional pre-processing steps, ETL or boilerplate code.
+- **[Streaming Inference](https://docs.superduperdb.com/docs/docs/walkthrough/daemonizing_models_with_listeners):** Have your models compute outputs automatically and immediately as new data arrives, keeping your deployment always up-to-date.
+- **[Scalable Model Training](https://docs.superduperdb.com/docs/docs/walkthrough/training_models):** Train AI models on large, diverse datasets simply by querying your training data. Ensured optimal performance via in-build computational optimizations.
+- **[Model Chaining](https://docs.superduperdb.com/docs/docs/walkthrough/linking_interdependent_models/)**: Easily setup complex workflows by connecting models and APIs to work together in an interdependent and sequential manner.
+- **[Simple, but Extendable Interface](https://docs.superduperdb.com/docs/docs/fundamentals/procedural_vs_declarative_api)**: Add and leverage any function, program, script or algorithm from the Python ecosystem to enhance your workflows and applications. Drill down on any layer as deep as it gets, up until the inner workings of your models while operating SuperDuperDB with simple Python commands.
+- **[Difficult Data-Types](https://docs.superduperdb.com/docs/docs/walkthrough/encoding_special_data_types/)**: Work directly with images, video, audio in your datastore, and any type which can be encoded as `bytes` in Python.
 - **[Feature Storing](https://docs.superduperdb.com/docs/docs/mongodb_query_API#inserts):** Turn your database into a centralized repository for storing and managing inputs and outputs of AI models of arbitrary data-types, making them available in a structured format and known environment.
-- **[Vector Search](https://docs.superduperdb.com/docs/docs/vector_search):** No need for duplicating and migrating your data to additional specialized vector databases - turn your existing battle-tested datastore into a fully-fledged multi-modal vector-search database, including easy generation of vector embeddings and vector indexes of your data with preferred models and APIs.
+- **[Vector Search](https://docs.superduperdb.com/docs/docs/walkthrough/vector_search):** No need for duplicating and migrating your data to additional specialized vector databases - turn your existing battle-tested datastore into a fully-fledged multi-modal vector-search database, including easy generation of vector embeddings and vector indexes of your data with preferred models and APIs.
 
 ### Why opt for SuperDuperDB?
 || With SuperDuperDB | Without |

diff --git a/deploy/app_template/app_template.py b/deploy/app_template/app_template.py
@@ -14,17 +14,23 @@
 
 @app.get("/")
 def show():
-    return {"models": db.show('model'), 'listeners': db.show('listener'), 'vector_indexes': db.show('vector_index')}
+    return {
+        "models": db.show('model'),
+        'listeners': db.show('listener'),
+        'vector_indexes': db.show('vector_index'),
+    }
 
 
 @app.get("/search")
 def search(input: str):
-    results = sorted(list(
-        collection
-            .like(Document({'<key>': input}), vector_index='<index-name>', n=20)
-            .find({}, {'_id': 0}),
-        key=lambda x: -x['score'],
-    ))
+    results = sorted(
+        list(
+            collection.like(
+                Document({'<key>': input}), vector_index='<index-name>', n=20
+            ).find({}, {'_id': 0}),
+            key=lambda x: -x['score'],
+        )
+    )
     return {'results': results}
 
 
@@ -35,10 +41,10 @@ def predict(input: str):
         model_name='<model-name>',
         input=input,
         context_select=(
-            collection
-                .like(Document({'<key>': input}), vector_index='<index-name>', n=num_results)
-                .find()
+            collection.like(
+                Document({'<key>': input}), vector_index='<index-name>', n=num_results
+            ).find()
         ),
         context_key='txt',
     )
-    return {'prediction': output}
+    return {'prediction': output}
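
Review note on the `/search` handler in `deploy/app_template/app_template.py`: in both the old and the reformatted version, `key=lambda x: -x['score']` sits inside the `list(...)` call rather than the `sorted(...)` call. Since `list()` accepts no keyword arguments, the handler would likely fail with a `TypeError` before any sorting happens. The sketch below reproduces this with plain dictionaries standing in for the query results; the `hits` data is made up for illustration and no `superduperdb` API is used.

```python
# Stand-in for the documents returned by the `.like(...).find(...)` query.
# The field values are invented for this illustration.
hits = [
    {"txt": "a", "score": 0.2},
    {"txt": "b", "score": 0.9},
    {"txt": "c", "score": 0.5},
]

# Shape used in the hunk: `key=` is passed to list(), which rejects keywords.
try:
    sorted(list(iter(hits), key=lambda x: -x["score"]))
    misplaced_key_error = None
except TypeError as exc:
    misplaced_key_error = exc

# Moving `key=` out to sorted() gives results in descending score order.
results = sorted(list(iter(hits)), key=lambda x: -x["score"])
```

Applied to the template, this would mean closing `list(...)` right after `.find({}, {'_id': 0})` and passing `key=` as the second argument of `sorted(...)`.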
diff --git a/deploy/testenv/preload.py b/deploy/testenv/preload.py
@@ -1,2 +1,3 @@
 import sys
+
 sys.path.append('./')
diff --git a/docs/hr/content/docs/data_integrations/mongodb.md b/docs/hr/content/docs/data_integrations/mongodb.md
@@ -55,7 +55,7 @@ db.execute(
 )
 ```
 
-Read more about vector-search [here](../fundamentals/25_vector_search.mdx).
+Read more about vector-search [here](../fundamentals/vector_search_algorithm.md).
 
 ## Deletes
 
@@ -68,4 +68,4 @@ db.execute(collection.delete_many({}))
 Aggregates are exactly as in `pymongo`, with the exception that a `$vectorSearch` stage may be
 fed with an additional field `'like': Document({...})`, which plays the same role as in selects.
 
-Read more about this in [the vector-search section](../fundamentals/25_vector_search.mdx).
+Read more about this in [the vector-search section](../walkthrough/vector_search).
diff --git a/docs/hr/content/docs/fundamentals/component_abstraction.md b/docs/hr/content/docs/fundamentals/component_abstraction.md
@@ -56,7 +56,7 @@ instances.
 
 ### `Stack`
 
-A `Stack` is a way of connecting diverse and interoperating sets of functionality. See [here](../walkthrough/28_creating_stacks_of_functionality.md) for more details.
+A `Stack` is a way of connecting diverse and interoperating sets of functionality. See [here](../walkthrough/creating_stacks_of_functionality) for more details.
 
 ## Activating components
 

diff --git a/docs/hr/content/docs/fundamentals/component_versioning.md b/docs/hr/content/docs/fundamentals/component_versioning.md
@@ -4,7 +4,7 @@ sidebar_position: 26
 
 # Component versioning
 
-Whenever a `Component` is created (see [here](../fundamentals/09_component_abstraction.md) for overview of `Component` classes),
+Whenever a `Component` is created (see [here](../fundamentals/component_abstraction.md) for overview of `Component` classes),
 information about that `Component` is saved in the `db.metadata` store.
 
 All components come with attributes `.identifier` which is a unique identifying string for that `Component` instance.
@@ -47,4 +47,5 @@ When one adds the `VectorIndex` with `db.add(vector_index)`,
 the sub-components are also versioned, if a version has not already 
 been assigned to those components in the same session.
 
-Read more about `VectorIndex` and vector-searches [here](../fundamentals/25_vector_search.mdx).
+Read more about `VectorIndex` and vector-searches [here](../walkthrough/vector_search.md).
+
diff --git a/docs/hr/content/docs/fundamentals/datalayer_overview.md b/docs/hr/content/docs/fundamentals/datalayer_overview.md
@@ -44,9 +44,9 @@ The databackend typically connects to your database (although `superduperdb` als
 and dispatches queries written in an query API which is compatible with that databackend, but which also includes additional aspects
 specific to `superduperdb`.
 
-Read more [here](../walkthrough/11_supported_query_APIs.md).
+Read more [here](../data_integrations/supported_query_APIs.md).
 
-The databackend is configured by setting the URI `CFG.databackend` in the [configuration system](../walkthrough/01_configuration.md).
+The databackend is configured by setting the URI `CFG.databackend` in the [configuration system](../setup/configuration.md).
 
 We support the same databackends as supported by the [`ibis` project](https://ibis-project.org/):
 
@@ -168,7 +168,7 @@ Here are the key methods which you'll use again and again:
 
 ### `db.execute`
 
-This method executes a query. For an overview of how this works see [here](../walkthrough/11_supported_query_APIs.md).
+This method executes a query. For an overview of how this works see [here](../data_integrations/supported_query_APIs.md).
 
 ### `db.add`
 
@@ -196,4 +196,4 @@ Validate your components (mostly models)
 
 ### `db.predict`
 
-Infer predictions from models hosted by `superduperdb`. Read more about this and about models [here](../fundamentals/21_apply_models.mdx).
+Infer predictions from models hosted by `superduperdb`. Read more about this and about models [here](../walkthrough/apply_models.md).
diff --git a/docs/hr/content/docs/fundamentals/procedural_vs_declarative_api.md b/docs/hr/content/docs/fundamentals/procedural_vs_declarative_api.md
@@ -54,4 +54,4 @@ db.add(
 )
 ```
 
-Read more about the `VectorIndex` concept [here](25_vector_search.mdx).
+Read more about the `VectorIndex` concept [here](../walkthrough/vector_search.md).
diff --git a/docs/hr/content/docs/fundamentals/vector_search_algorithm.md b/docs/hr/content/docs/fundamentals/vector_search_algorithm.md
@@ -50,4 +50,4 @@ The most similar `ids` are retrieved. The `select` part of the query is then tra
 a similar query which searches within the retrieved `ids`. The full set of results are returned
 to the client.
 
-Read [here](../walkthrough/vector_search.mdx) about setting up and detailed usage of vector-search.
+Read [here](../walkthrough/vector_search.md) about setting up and detailed usage of vector-search.
diff --git a/docs/hr/content/docs/get_started/quickstart.md b/docs/hr/content/docs/get_started/quickstart.md
@@ -12,30 +12,30 @@ Follow these steps to quickly get started:
     hosted there, can be executed directly in the environment, but can also be cloned in [their original form in the open source repo](https://github.com/SuperDuperDB/superduperdb/tree/main/examples), and executed locally.
     These notebooks are also described on this documentation website in the [use-cases section](/docs/use-cases).
 
-1. **Get setup**
+2. **Get setup**
 
     Follow [the installation guide](./installation.md) and check the [minimum-working example](./minimum_working_example.md)
     to set-up your environment. For more detailed configuration, read the detailed [setup](/docs/category/setup) section.
 
-1. **Dive into the documentation**
+3. **Dive into the documentation**
 
     Refer to our comprehensive [`README.md`](https://github.com/superDuperDB/) for a high level of SuperDuperDB. The long-form documentation you are reading now provides deeper insights into features, usage, and best practices.
 
-1. **Explore the community apps examples**
+4. **Explore the community apps examples**
 
     Visit our [`community apps example`](https://github.com/superDuperDB/superduper-community-apps) repository to explore more examples of how SuperDuperDB can enhance your experience. Learn from real-world use cases and implementations.
 
-1. **Grasp the fundamentals**
+5. **Grasp the fundamentals**
 
     Read through the [`Fundamentals`](../fundamentals/glossary) section to gain a solid understanding of SuperDuperDB's architecture and refer to the [`API References`](https://docs.superduperdb.com/apidocs/source/superduperdb.html) for detailed information on API usage.
 
-1. **Engage with the Community**
+6. **Engage with the Community**
 
     If you encounter challenges, join our [Slack Channels](https://join.slack.com/t/superduperdb/shared_invite/zt-1zuojj0k0-RjAYBs1TDsvEa7yaFGa6QA) for assistance. Report bugs and share feature requests [by raising an issue]((https://github.com/SuperDuperDB/superduperdb/issues).). Our community is here to support you.
 
     You are welcome to join the conversation on our [discussions forum](https://github.com/SuperDuperDB/superduperdb/discussions) and follow our open-source roadmap [here](https://github.com/orgs/SuperDuperDB/projects/1/views/10).
 
-1. **Contribute and Share**
+7. **Contribute and Share**
 
     Contribute to the SuperDuperDB community by sharing your solutions and experiences. 
     Help us grow by promoting SuperDuperDB to your peers and the wider world. Your involvement is valuable to us! Don't forget to give us a star ⭐!

diff --git a/docs/hr/content/docs/production/developer_vs_production_mode.md b/docs/hr/content/docs/production/developer_vs_production_mode.md
@@ -33,8 +33,8 @@ There are several gradations of a more productionized deployment.
 In the most distributed case we have:
 
 - A `jupyter` environment running in its own process
-- A [distributed **Dask** cluster](31_non_blocking_dask_jobs.md), with scheduler and workers configured to work with `superduperdb`
-- A [**change-data-capture** service](32_change_data_capture.md)
-- A [**vector-search** service](33_vector_comparison_service.md), which finds similar vectors, given an input vector
+- A [distributed **Dask** cluster](non_blocking_dask_jobs.md), with scheduler and workers configured to work with `superduperdb`
+- A [**change-data-capture** service](change_data_capture.md)
+- A [**vector-search** service](vector_comparison_service.md), which finds similar vectors, given an input vector
 
 In the remainder of this section we describe the use of each of these services
diff --git a/docs/hr/content/docs/setup/connecting.md b/docs/hr/content/docs/setup/connecting.md
@@ -41,4 +41,4 @@ db = superduper('mongodb://localhost:27018', CFG=CFG)
 ```
 
 The `db` object is an instance of `superduperdb.base.datalayer.Datalayer`.
-The `Datalayer` class handles AI models and communicates with the databackend and associated components. Read more [here](07_datalayer_overview.md).
+The `Datalayer` class handles AI models and communicates with the databackend and associated components. Read more [here](../fundamentals/datalayer_overview.md).
diff --git a/docs/hr/content/docs/walkthrough/apply_models.md b/docs/hr/content/docs/walkthrough/apply_models.md
@@ -77,7 +77,7 @@ my_model.predict(
 Under-the-hood, this call creates a `Listener` which is deployed on 
 the query passed to the `.predict` call.
 
-Read more about the `Listener` abstraction [here](22_daemonizing_models_with_listeners.md)
+Read more about the `Listener` abstraction [here](daemonizing_models_with_listeners.md)
 
 ### Activating models for vector-search with `create_vector_index=True`
 

diff --git a/docs/hr/content/docs/walkthrough/vector_search.md b/docs/hr/content/docs/walkthrough/vector_search.md
@@ -60,7 +60,7 @@ SuperDuperDB supports queries via:
 - `pymongo`
 - `ibis`
 
-Read more about this [here](../walkthrough/11_supported_query_APIs.md).
+Read more about this [here](../data_integrations/supported_query_APIs.md).
 
 In order to use vector-search in a query, one combines these APIs with the `.like` operator.