diff --git a/.gitmodules b/.gitmodules index e69de29bb..18b5bd727 100644 --- a/.gitmodules +++ b/.gitmodules @@ -0,0 +1,3 @@ +[submodule "examples"] + path = examples + url = https://github.com/SuperDuperDB/superduper-community-apps diff --git a/README.md b/README.md index aef415752..cd4c6584c 100644 --- a/README.md +++ b/README.md @@ -254,58 +254,58 @@ Also find use-cases and apps built by the community in the [superduper-community
- + - + - +
- Text-To-Image Search + Text-To-Image Search - Text-To-Video Search + Text-To-Video Search - Question the Docs + Question the Docs
- + - + - +
- Semantic Search Engine + Semantic Search Engine - Classical Machine Learning + Classical Machine Learning - Cross-Framework Transfer Learning + Cross-Framework Transfer Learning
diff --git a/examples b/examples new file mode 160000 index 000000000..a908a70b4 --- /dev/null +++ b/examples @@ -0,0 +1 @@ +Subproject commit a908a70b422ad15ced59127e8a9d2e973cef995f diff --git a/examples/mnist_torch.ipynb b/examples/mnist_torch.ipynb deleted file mode 100644 index e5a119f18..000000000 --- a/examples/mnist_torch.ipynb +++ /dev/null @@ -1,418 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "id": "4b24af19", - "metadata": {}, - "source": [ - "# Training and Maintaining MNIST Predictions with SuperDuperDB" - ] - }, - { - "cell_type": "markdown", - "id": "8905783f", - "metadata": {}, - "source": [ - "## Introduction\n", - "\n", - "This notebook outlines the process of implementing a classic machine learning classification task - MNIST handwritten digit recognition, using a convolutional neural network. However, we introduce a unique twist by performing the task in a database using SuperDuperDB." - ] - }, - { - "cell_type": "markdown", - "source": [ - "## Prerequisites\n", - "\n", - "Before diving into the implementation, ensure that you have the necessary libraries installed by running the following commands:" - ], - "metadata": { - "collapsed": false - }, - "id": "95f897a45b2a02cc" - }, - { - "cell_type": "code", - "execution_count": null, - "id": "a9897997-dee8-4947-9327-b96fe06a5a2c", - "metadata": {}, - "outputs": [], - "source": [ - "!pip install superduperdb\n", - "!pip install torch torchvision matplotlib" - ] - }, - { - "cell_type": "markdown", - "id": "e3812091", - "metadata": {}, - "source": [ - "## Connect to datastore \n", - "\n", - "First, we need to establish a connection to a MongoDB datastore via SuperDuperDB. You can configure the `MongoDB_URI` based on your specific setup. \n", - "Here are some examples of MongoDB URIs:\n", - "\n", - "* For testing (default connection): `mongomock://test`\n", - "* Local MongoDB instance: `mongodb://localhost:27017`\n", - "* MongoDB with authentication: `mongodb://superduper:superduper@mongodb:27017/documents`\n", - "* MongoDB Atlas: `mongodb+srv://:@/`" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "a28adbce", - "metadata": {}, - "outputs": [], - "source": [ - "from superduperdb import superduper\n", - "from superduperdb.backends.mongodb import Collection\n", - "import os\n", - "\n", - "mongodb_uri = os.getenv(\"MONGODB_URI\",\"mongomock://test\")\n", - "db = superduper(mongodb_uri)\n", - "\n", - "# Create a collection for MNIST\n", - "mnist_collection = Collection('mnist')" - ] - }, - { - "cell_type": "markdown", - "id": "6233e891", - "metadata": {}, - "source": [ - "\n", - "## Load Dataset\n", - "\n", - "After connecting to MongoDB, we add the MNIST dataset. SuperDuperDB excels at handling \"difficult\" data types, and we achieve this using an `Encoder`, which works in tandem with the `Document` wrappers. Together, they enable Python dictionaries containing non-JSONable or bytes objects to be inserted into the underlying data infrastructure. 
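This `Encoder`-`Document` pairing can be sanity-checked without touching the database. A minimal sketch (assuming `Document.unpack` behaves the same on a locally constructed document as on the query results shown below):

```python
from superduperdb import Document
from superduperdb.ext.pillow import pil_image
import PIL.Image

# Wrap a non-JSONable object (a PIL image) so it can be stored as bytes
doc = Document({'img': pil_image(PIL.Image.new('L', (28, 28))), 'class': 3})

# unpack() recovers the original Python objects from the wrapped record
print(type(doc.unpack()['img']))  # PIL.Image.Image
```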
\n" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "bf0934cc", - "metadata": {}, - "outputs": [], - "source": [ - "import torchvision\n", - "from superduperdb.ext.pillow import pil_image\n", - "from superduperdb import Document\n", - "from superduperdb.backends.mongodb import Collection\n", - "\n", - "import random\n", - "\n", - "# Load MNIST images as Python objects using the Python Imaging Library.\n", - "mnist_data = list(torchvision.datasets.MNIST(root='./data', download=True))\n", - "document_list = [Document({'img': pil_image(x[0]), 'class': x[1]}) for x in mnist_data]\n", - "\n", - "# Shuffle the data and select a subset of 1000 documents\n", - "random.shuffle(document_list)\n", - "data = document_list[:1000]\n", - "\n", - "# Insert the selected data into the mnist_collection\n", - "db.execute(\n", - " mnist_collection.insert_many(data[:-100]), # Insert all but the last 100 documents\n", - " encoders=(pil_image,) # Encode images using the Pillow library.\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "c5341135", - "metadata": {}, - "source": [ - "Now that the images and their classes are inserted into the database, we can query the data in its original format. Particularly, we can use the `PIL.Image` instances to inspect the data." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "a36f9c3b", - "metadata": {}, - "outputs": [], - "source": [ - "# Get and display one of the images\n", - "r = db.execute(mnist_collection.find_one())\n", - "r.unpack()['img']" - ] - }, - { - "cell_type": "markdown", - "id": "1413d4c5", - "metadata": {}, - "source": [ - "## Build Model" - ] - }, - { - "cell_type": "markdown", - "id": "68fde8bb", - "metadata": {}, - "source": [ - "Next, we create our machine learning model. SuperDuperDB supports various frameworks out of the box, and in this case, we are using PyTorch, which is well-suited for computer vision tasks. 
In this example, we combine torch with torchvision.\n", - "\n", - "We create `postprocess` and `preprocess` functions to handle the communication with the SuperDuperDB `Datalayer`, and then wrap the model, preprocessing, and postprocessing to create a native SuperDuperDB handler.\n" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "cfb425e1", - "metadata": {}, - "outputs": [], - "source": [ - "import torch\n", - "\n", - "class LeNet5(torch.nn.Module):\n", - " def __init__(self, num_classes):\n", - " super().__init__()\n", - " self.layer1 = torch.nn.Sequential(\n", - " torch.nn.Conv2d(1, 6, kernel_size=5, stride=1, padding=0),\n", - " torch.nn.BatchNorm2d(6),\n", - " torch.nn.ReLU(),\n", - " torch.nn.MaxPool2d(kernel_size=2, stride=2))\n", - " self.layer2 = torch.nn.Sequential(\n", - " torch.nn.Conv2d(6, 16, kernel_size=5, stride=1, padding=0),\n", - " torch.nn.BatchNorm2d(16),\n", - " torch.nn.ReLU(),\n", - " torch.nn.MaxPool2d(kernel_size=2, stride=2))\n", - " self.fc = torch.nn.Linear(400, 120)\n", - " self.relu = torch.nn.ReLU()\n", - " self.fc1 = torch.nn.Linear(120, 84)\n", - " self.relu1 = torch.nn.ReLU()\n", - " self.fc2 = torch.nn.Linear(84, num_classes)\n", - "\n", - " def forward(self, x):\n", - " out = self.layer1(x)\n", - " out = self.layer2(out)\n", - " out = out.reshape(out.size(0), -1)\n", - " out = self.fc(out)\n", - " out = self.relu(out)\n", - " out = self.fc1(out)\n", - " out = self.relu1(out)\n", - " out = self.fc2(out)\n", - " return out\n", - "\n", - " \n", - "def postprocess(x):\n", - " return int(x.topk(1)[1].item())\n", - "\n", - "\n", - "def preprocess(x):\n", - " return torchvision.transforms.Compose([\n", - " torchvision.transforms.Resize((32, 32)),\n", - " torchvision.transforms.ToTensor(),\n", - " torchvision.transforms.Normalize(mean=(0.1307,), std=(0.3081,))]\n", - " )(x)\n", - "\n", - "\n", - "# Create and insert a SuperDuperDB model into the database\n", - "model = superduper(LeNet5(10), preprocess=preprocess, postprocess=postprocess, preferred_devices=('cpu',))\n", - "db.add(model)" - ] - }, - { - "cell_type": "markdown", - "id": "dcf0457e", - "metadata": {}, - "source": [ - "## Train Model\n", - "\n", - "Now we are ready to \"train\" or \"fit\" the model. Trainable models in SuperDuperDB come with an sklearn-like `.fit` method. 
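Before training, the `preprocess`/`postprocess` pair defined above can be checked in isolation. A quick sketch (the blank image and random logits are stand-ins introduced here, not part of the original notebook):

```python
import torch
import PIL.Image

img = PIL.Image.new('L', (28, 28))   # dummy MNIST-sized grayscale image
x = preprocess(img)                  # resize -> tensor -> normalize
assert x.shape == (1, 32, 32)        # one channel, 32x32, as LeNet5 expects

logits = torch.randn(10)             # stand-in for one row of model output
print(postprocess(logits))           # an integer class label in [0, 9]
```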
\n" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "e7c610c1", - "metadata": {}, - "outputs": [], - "source": [ - "from torch.nn.functional import cross_entropy\n", - "\n", - "from superduperdb import Metric\n", - "from superduperdb import Dataset\n", - "from superduperdb.ext.torch.model import TorchTrainerConfiguration\n", - "\n", - "# Fit the model to the training data\n", - "job = model.fit(\n", - " X='img', # Feature matrix used as input data \n", - " y='class', # Target variable for training\n", - " db=db, # Database used for data retrieval\n", - " select=mnist_collection.find(), # Select the dataset\n", - " configuration=TorchTrainerConfiguration(\n", - " identifier='my_configuration',\n", - " objective=cross_entropy,\n", - " loader_kwargs={'batch_size': 10},\n", - " max_iterations=10,\n", - " validation_interval=5,\n", - " ),\n", - " metrics=[Metric(identifier='acc', object=lambda x, y: sum([xx == yy for xx, yy in zip(x, y)]) / len(x))],\n", - " validation_sets=[\n", - " Dataset(\n", - " identifier='my_valid',\n", - " select=Collection('mnist').find({'_fold': 'valid'}),\n", - " )\n", - " ],\n", - " distributed=False,\n", - ")" - ] - }, - { - "cell_type": "markdown", - "source": [ - "## Monitoring Training Efficiency\n", - "You can monitor the training efficiency with visualization tools like Matplotlib:" - ], - "metadata": { - "collapsed": false - }, - "id": "fdf5cccb2fe0b97b" - }, - { - "cell_type": "code", - "execution_count": null, - "id": "200d3be1", - "metadata": {}, - "outputs": [], - "source": [ - "from matplotlib import pyplot as plt\n", - "\n", - "# Load the model from the database\n", - "model = db.load('model', model.identifier)\n", - "\n", - "# Plot the accuracy values\n", - "plt.plot(model.metric_values['my_valid/acc'])\n", - "plt.show()" - ] - }, - { - "cell_type": "markdown", - "id": "0199b952", - "metadata": {}, - "source": [ - "\n", - "## On-the-fly Predictions\n", - "Once the model is trained, you can use it to continuously predict on new data as it arrives. This is set up by enabling a `listener` for the database (without loading all the data client-side). The listen toggle activates the model to make predictions on incoming data changes.\n" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "f0e53249", - "metadata": {}, - "outputs": [], - "source": [ - "model.predict(\n", - " X='img', # Input feature \n", - " db=db, # Database used for data retrieval\n", - " select=mnist_collection.find(), # Select the dataset\n", - " listen=True, # Continuous predictions on incoming data \n", - " max_chunk_size=100, # Number of predictions to return at once\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "7daae786", - "metadata": {}, - "source": [ - "We can see that predictions are available in `_outputs.img.lenet5`." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "bc71a143", - "metadata": {}, - "outputs": [], - "source": [ - "r = db.execute(mnist_collection.find_one({'_fold': 'valid'}))\n", - "r.unpack()" - ] - }, - { - "cell_type": "markdown", - "id": "7a78a2a1", - "metadata": {}, - "source": [ - "## Verification\n", - "\n", - "The activated models can be seen here:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "outputs": [], - "source": [ - "db.show('listener')" - ], - "metadata": { - "collapsed": false - }, - "id": "2a5308f4a158c931" - }, - { - "cell_type": "markdown", - "source": [ - "We can verify that the model is activated by inserting the rest of the data:" - ], - "metadata": { - "collapsed": false - }, - "id": "dee36a804224cbb6" - }, - { - "cell_type": "code", - "execution_count": null, - "id": "c1aa56d0", - "metadata": {}, - "outputs": [], - "source": [ - "for r in data[-100:]:\n", - " r['update'] = True\n", - "\n", - "db.execute(mnist_collection.insert_many(data[-100:]))" - ] - }, - { - "cell_type": "markdown", - "id": "9eb48a30", - "metadata": {}, - "source": [ - "You can see that the inserted data are now also populated with predictions:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "d8161983", - "metadata": {}, - "outputs": [], - "source": [ - "db.execute(mnist_collection.find_one({'update': True}))['_outputs']" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3 (ipykernel)", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.11.6" - } - }, - "nbformat": 4, - "nbformat_minor": 5 -} diff --git a/examples/multimodal_image_search_clip.ipynb b/examples/multimodal_image_search_clip.ipynb deleted file mode 100644 index 68d7d6885..000000000 --- a/examples/multimodal_image_search_clip.ipynb +++ /dev/null @@ -1,352 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "id": "238520e0", - "metadata": {}, - "source": [ - "# Multimodal Search Using CLIP" - ] - }, - { - "cell_type": "markdown", - "id": "a3590f0e", - "metadata": {}, - "source": [ - "## Introduction\n", - "\n", - "This notebook showcases the capabilities of SuperDuperDB for performing multimodal searches using the `VectorIndex`. SuperDuperDB's flexibility enables users and developers to integrate various models into the system and use them for vectorizing diverse queries during search and inference. In this demonstration, we leverage the [CLIP multimodal architecture](https://openai.com/research/clip)." - ] - }, - { - "cell_type": "markdown", - "id": "40272d6a2681c8e8", - "metadata": { - "collapsed": false, - "jupyter": { - "outputs_hidden": false - } - }, - "source": [ - "## Prerequisites\n", - "\n", - "Before diving into the implementation, ensure that you have the necessary libraries installed by running the following commands:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "5ebe1497", - "metadata": {}, - "outputs": [], - "source": [ - "!pip install superduperdb\n", - "!pip install ipython openai-clip\n", - "!pip install -U datasets" - ] - }, - { - "cell_type": "markdown", - "id": "b2f94ae8", - "metadata": {}, - "source": [ - "## Connect to datastore \n", - "\n", - "First, we need to establish a connection to a MongoDB datastore via SuperDuperDB. 
You can configure the `MongoDB_URI` based on your specific setup. \n", - "Here are some examples of MongoDB URIs:\n", - "\n", - "* For testing (default connection): `mongomock://test`\n", - "* Local MongoDB instance: `mongodb://localhost:27017`\n", - "* MongoDB with authentication: `mongodb://superduper:superduper@mongodb:27017/documents`\n", - "* MongoDB Atlas: `mongodb+srv://:@/`" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "2b5ef986", - "metadata": {}, - "outputs": [], - "source": [ - "import os\n", - "from superduperdb import superduper\n", - "from superduperdb.backends.mongodb import Collection\n", - "\n", - "mongodb_uri = os.getenv(\"MONGODB_URI\", \"mongomock://test\")\n", - "\n", - "# Super-Duper your Database!\n", - "db = superduper(mongodb_uri, artifact_store='filesystem://./models/')\n", - "\n", - "collection = Collection('multimodal')" - ] - }, - { - "cell_type": "markdown", - "id": "6cd6d6b0", - "metadata": {}, - "source": [ - "## Load Dataset \n", - "\n", - "To make this notebook easily executable and interactive, we'll work with a small sample of images from the [COCO dataset](https://cocodataset.org/). The processes demonstrated here can be applied to larger datasets with higher-resolution images as well. For such use-cases, however, it's advisable to use a machine with a GPU, otherwise there'll be some significant thumb twiddling to do.\n", - "\n", - "To insert images into the database, we utilize the `Encoder`-`Document` framework, which allows saving Python class instances as blobs in the `Datalayer` and retrieving them as Python objects. To this end, SuperDuperDB contains pre-configured support for `PIL.Image` instances. This simplifies the integration of Python AI models with the datalayer. 
It's also possible to create your own encoders.\n" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "5f0f14fb-8e79-4bc6-88af-1a800aecb8db", - "metadata": {}, - "outputs": [], - "source": [ - "!curl -O https://superduperdb-public.s3.eu-west-1.amazonaws.com/coco_sample.zip\n", - "!unzip coco_sample.zip" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "e41e6faa-6b83-46d8-ab37-6de6fd346ee7", - "metadata": {}, - "outputs": [], - "source": [ - "from superduperdb import Document\n", - "from superduperdb.ext.pillow import pil_image as i\n", - "import glob\n", - "import random\n", - "\n", - "images = glob.glob('images_small/*.jpg')\n", - "documents = [Document({'image': i(uri=f'file://{img}')}) for img in images][:500]" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "1b3a63bf-9e1f-4266-823a-7a2208937e01", - "metadata": {}, - "outputs": [], - "source": [ - "documents[1]" - ] - }, - { - "cell_type": "markdown", - "id": "c9c7e282", - "metadata": {}, - "source": [ - "The wrapped Python dictionaries may be inserted directly into the `Datalayer`:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "c32a91a5", - "metadata": {}, - "outputs": [], - "source": [ - "db.execute(collection.insert_many(documents), encoders=(i,))" - ] - }, - { - "cell_type": "markdown", - "id": "10d37264", - "metadata": {}, - "source": [ - "You can verify that the images are correctly stored as follows:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "7282a0fe", - "metadata": {}, - "outputs": [], - "source": [ - "x = db.execute(collection.find_one()).unpack()['image']\n", - "display(x.resize((300, 300 * int(x.size[1] / x.size[0]))))" - ] - }, - { - "cell_type": "markdown", - "id": "dab27b50", - "metadata": {}, - "source": [ - "## Build Models\n", - "\n", - "Now, let's prepare the CLIP model for multimodal search, which involves two components: `text encoding` and `visual encoding`. After adding both components, you can perform searches using both images and text to find matching items:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "916792d3", - "metadata": {}, - "outputs": [], - "source": [ - "import clip\n", - "from superduperdb import vector\n", - "from superduperdb.ext.torch import TorchModel\n", - "\n", - "# Load the CLIP model\n", - "model, preprocess = clip.load(\"RN50\", device='cpu')\n", - "\n", - "# Define a vector\n", - "e = vector(shape=(1024,))\n", - "\n", - "# Create a TorchModel for text encoding\n", - "text_model = TorchModel(\n", - " identifier='clip_text',\n", - " object=model,\n", - " preprocess=lambda x: clip.tokenize(x)[0],\n", - " postprocess=lambda x: x.tolist(),\n", - " encoder=e,\n", - " forward_method='encode_text', \n", - ")\n", - "\n", - "# Create a TorchModel for visual encoding\n", - "visual_model = TorchModel(\n", - " identifier='clip_image',\n", - " object=model.visual, \n", - " preprocess=preprocess,\n", - " postprocess=lambda x: x.tolist(),\n", - " encoder=e,\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "b716bcb2", - "metadata": {}, - "source": [ - "## Create a Vector-Search Index\n", - "\n", - "Let's create the index for vector-based searching. We'll register both models with the index simultaneously, but specify that the `visual_model` will be responsible for creating the vectors in the database (`indexing_listener`). 
The `compatible_listener` specifies how an alternative model can be used to search the vectors, enabling multimodal search with models expecting different types of indexes." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "c4e0302c", - "metadata": {}, - "outputs": [], - "source": [ - "from superduperdb import VectorIndex\n", - "from superduperdb import Listener\n", - "\n", - "# Create a VectorIndex and add it to the database\n", - "db.add(\n", - " VectorIndex(\n", - " 'my-index',\n", - " indexing_listener=Listener(\n", - " model=visual_model,\n", - " key='image',\n", - " select=collection.find(),\n", - " predict_kwargs={'batch_size': 10},\n", - " ),\n", - " compatible_listener=Listener(\n", - " model=text_model,\n", - " key='text',\n", - " active=False,\n", - " select=None,\n", - " )\n", - " )\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "18971a6d", - "metadata": {}, - "source": [ - "## Search Images Using Text\n", - "\n", - "Now we can demonstrate searching for images using text queries:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "ab994b5e", - "metadata": {}, - "outputs": [], - "source": [ - "from IPython.display import display\n", - "from superduperdb import Document\n", - "\n", - "query_string = 'sports'\n", - "\n", - "out = db.execute(\n", - " collection.like(Document({'text': query_string}), vector_index='my-index', n=3).find({})\n", - ")\n", - "\n", - "# Display the images from the search results\n", - "for r in out:\n", - " x = r['image'].x\n", - " display(x.resize((300, int(300 * x.size[1] / x.size[0]))))" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "b5e3ac22-044f-4675-976a-68ff9b59efe9", - "metadata": {}, - "outputs": [], - "source": [ - "img = db.execute(collection.find_one({}))['image']\n", - "img.x" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "c8569e4f-74f2-4ee5-9674-7829b2fcc62b", - "metadata": {}, - "outputs": [], - "source": [ - "cur = db.execute(\n", - " collection.like(Document({'image': img}), vector_index='my-index', n=3).find({})\n", - ")\n", - "\n", - "for r in cur:\n", - " x = r['image'].x\n", - " display(x.resize((300, int(300 * x.size[1] / x.size[0]))))" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "806a445f1dfacd90", - "metadata": { - "collapsed": false, - "jupyter": { - "outputs_hidden": false - } - }, - "outputs": [], - "source": [] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3 (ipykernel)", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.11.5" - } - }, - "nbformat": 4, - "nbformat_minor": 5 -} diff --git a/examples/question_the_docs.ipynb b/examples/question_the_docs.ipynb deleted file mode 100644 index 1ad4c65b0..000000000 --- a/examples/question_the_docs.ipynb +++ /dev/null @@ -1,450 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "id": "c042ddbb-c2c9-46ed-b36c-c965c0d7ff5b", - "metadata": {}, - "source": [ - "# Building Q&A Assistant Using Mongo and OpenAI" - ] - }, - { - "cell_type": "markdown", - "id": "7e6fbce6-fec9-47af-8701-99721eedec50", - "metadata": {}, - "source": [ - "## Introduction\n", - "\n", - "This notebook is designed to demonstrate how to implement a document Question-and-Answer (Q&A) task using SuperDuperDB in 
conjunction with OpenAI and MongoDB. It provides a step-by-step guide and explanation of each component involved in the process.\n" - ] - }, - { - "cell_type": "markdown", - "source": [ - "## Prerequisites\n", - "\n", - "Before diving into the implementation, ensure that you have the necessary libraries installed by running the following commands:" - ], - "metadata": { - "collapsed": false - }, - "id": "f98f1c7ae8e02278" - }, - { - "cell_type": "code", - "execution_count": null, - "id": "6858da67-597d-4d98-ae4a-41003bb569f4", - "metadata": {}, - "outputs": [], - "source": [ - "!pip install superduperdb\n", - "!pip install ipython openai==0.27.6" - ] - }, - { - "cell_type": "markdown", - "id": "e3befb73", - "metadata": {}, - "source": [ - "Additionally, ensure that you have set your openai API key as an environment variable. You can uncomment the following code and add your API key:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "f5bcdade-f988-4464-bfcf-806245031bb3", - "metadata": {}, - "outputs": [], - "source": [ - "import os\n", - "\n", - "#os.environ['OPENAI_API_KEY'] = 'sk-...'\n", - "\n", - "if 'OPENAI_API_KEY' not in os.environ:\n", - " raise Exception('Environment variable \"OPENAI_API_KEY\" not set')" - ] - }, - { - "cell_type": "markdown", - "source": [ - "## Connect to datastore \n", - "\n", - "First, we need to establish a connection to a MongoDB datastore via SuperDuperDB. You can configure the `MongoDB_URI` based on your specific setup. \n", - "Here are some examples of MongoDB URIs:\n", - "\n", - "* For testing (default connection): `mongomock://test`\n", - "* Local MongoDB instance: `mongodb://localhost:27017`\n", - "* MongoDB with authentication: `mongodb://superduper:superduper@mongodb:27017/documents`\n", - "* MongoDB Atlas: `mongodb+srv://:@/`" - ], - "metadata": { - "collapsed": false - }, - "id": "85c1a0f7572c43ba" - }, - { - "cell_type": "code", - "execution_count": null, - "id": "f42c42cc-af6a-4712-a993-d9c921693819", - "metadata": {}, - "outputs": [], - "source": [ - "from superduperdb import superduper\n", - "from superduperdb.backends.mongodb import Collection\n", - "import os\n", - "\n", - "mongodb_uri = os.getenv(\"MONGODB_URI\",\"mongomock://test\")\n", - "db = superduper(mongodb_uri)\n", - "\n", - "collection = Collection('questiondocs')" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "3ce857b0-738c-4d7f-bee0-f709c6fc5ddf", - "metadata": {}, - "outputs": [], - "source": [ - "db.metadata" - ] - }, - { - "cell_type": "markdown", - "source": [ - "## Load Dataset \n", - "\n", - "In this example we use the internal textual data from the `superduperdb` project's API documentation. The goal is to create a chatbot that can provide information about the project. You can either load the data from your local project or use the provided data. 
\n", - "\n", - "If you have the SuperDuperDB project locally and want to load the latest version of the API, uncomment the following cell:" - ], - "metadata": { - "collapsed": false - }, - "id": "737497f7d5032bf" - }, - { - "cell_type": "code", - "execution_count": null, - "id": "d72a2a52-964f-456e-88b6-040965f5ed1e", - "metadata": {}, - "outputs": [], - "source": [ - "# import glob\n", - "\n", - "# ROOT = '../docs/hr/content/docs/'\n", - "\n", - "# STRIDE = 3 # stride in numbers of lines\n", - "# WINDOW = 25 # length of window in numbers of lines\n", - "\n", - "# files = sorted(glob.glob(f'{ROOT}/*.md') + glob.glob(f'{ROOT}/*.mdx'))\n", - "\n", - "# content = sum([open(file).read().split('\\n') for file in files], [])\n", - "# chunks = ['\\n'.join(content[i: i + WINDOW]) for i in range(0, len(content), STRIDE)]" - ] - }, - { - "cell_type": "markdown", - "source": [ - "Otherwise, you can load the data from an external source. The chunks of text contain code snippets and explanations, which will be used to build the document Q&A chatbot. " - ], - "metadata": { - "collapsed": false - }, - "id": "c9803aef243ad58c" - }, - { - "cell_type": "code", - "execution_count": null, - "id": "a20bb184-d45b-4647-b3c3-7043db9a3239", - "metadata": {}, - "outputs": [], - "source": [ - "from IPython.display import *\n", - "\n", - "Markdown(chunks[20])" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "e587e284-0876-4464-a977-ac97a9070787", - "metadata": {}, - "outputs": [], - "source": [ - "!curl -O https://superduperdb-public.s3.eu-west-1.amazonaws.com/superduperdb_docs.json\n", - "\n", - "import json\n", - "from IPython.display import Markdown\n", - "\n", - "with open('superduperdb_docs.json') as f:\n", - " chunks = json.load(f)" - ] - }, - { - "cell_type": "markdown", - "id": "4f8c4636-88c6-42a4-b471-41be7c20680f", - "metadata": {}, - "source": [ - "You can see that the chunks of text contain bits of code, and explanations, \n", - "which can become useful in building a document Q&A chatbot." - ] - }, - { - "cell_type": "markdown", - "id": "0370732b-0c55-4672-b6be-0830f9a3a755", - "metadata": {}, - "source": [ - "As usual we insert the data. The `Document` wrapper allows `superduperdb` to handle records with special data types such as images,\n", - "video, and custom data-types." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "a7208ef2-c035-43b9-a624-ade42a06ed09", - "metadata": {}, - "outputs": [], - "source": [ - "from superduperdb import Document\n", - "\n", - "db.execute(collection.insert_many([Document({'txt': chunk}) for chunk in chunks]))" - ] - }, - { - "cell_type": "markdown", - "id": "4b299b6f-37ae-46d7-b064-7d368d98d68a", - "metadata": {}, - "source": [ - "## Create a Vector-Search Index\n", - "\n", - "To enable question-answering over your documents, we need to setup a standard `superduperdb` vector-search index using `openai` (although there are many options\n", - "here: `torch`, `sentence_transformers`, `transformers`, ...)" - ] - }, - { - "cell_type": "markdown", - "id": "7930b9a1-1483-4106-873c-d85a3920c64e", - "metadata": {}, - "source": [ - "A `Model` is a wrapper around a self-built or ecosystem model, such as `torch`, `transformers`, `openai`." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "56905f2e-485e-4179-8585-34eac26c0751", - "metadata": {}, - "outputs": [], - "source": [ - "from superduperdb.ext.openai import OpenAIEmbedding\n", - "\n", - "model = OpenAIEmbedding(model='text-embedding-ada-002')" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "6bb05a78-263e-4e6f-b429-8e51dbb932b8", - "metadata": {}, - "outputs": [], - "source": [ - "model.predict('This is a test', one=True)" - ] - }, - { - "cell_type": "markdown", - "id": "4331b81b-c257-4353-aab4-8f601bef78de", - "metadata": {}, - "source": [ - "A `Listener` \"deploys\" a `Model` to \"listen\" to incoming data, and compute outputs, which are saved in the database, via `db`." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "c1625dab-6438-494b-b74d-efb58bfc8610", - "metadata": {}, - "outputs": [], - "source": [ - "from superduperdb import Listener\n", - "\n", - "listener = Listener(model=model, key='txt', select=collection.find())" - ] - }, - { - "cell_type": "markdown", - "id": "591dad80-3788-441b-96db-a5bf23a16979", - "metadata": {}, - "source": [ - "A `VectorIndex` wraps a `Listener`, making its outputs searchable." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "1aa132d0-e6a2-46f6-9eb8-13fbce90ff11", - "metadata": {}, - "outputs": [], - "source": [ - "from superduperdb import VectorIndex\n", - "\n", - "db.add(\n", - " VectorIndex(identifier='my-index', indexing_listener=listener)\n", - ")" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "7fde5b17-9d71-4535-aaf6-85f4fa9910e4", - "metadata": {}, - "outputs": [], - "source": [ - "db.execute(collection.find_one())" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "92948823-0d18-4e1b-b103-f226d6b09e52", - "metadata": {}, - "outputs": [], - "source": [ - "from superduperdb.backends.mongodb import Collection\n", - "from superduperdb import Document as D\n", - "from IPython.display import *\n", - "\n", - "query = 'Code snippet how to create a `VectorIndex` with a torchvision model'\n", - "\n", - "result = db.execute(\n", - " collection\n", - " .like(D({'txt': query}), vector_index='my-index', n=5)\n", - " .find()\n", - ")\n", - "\n", - "display(Markdown('---'))\n", - "\n", - "for r in result:\n", - " display(Markdown(r['txt']))\n", - " display(Markdown('---'))" - ] - }, - { - "cell_type": "markdown", - "source": [ - "## Create a Chat-Completion Component\n", - "\n", - "In this step, a chat-completion component is created and added to the system. 
This component is essential for the Q&A functionality:" - ], - "metadata": { - "collapsed": false - }, - "id": "e0922a0dc623d7bf" - }, - { - "cell_type": "code", - "execution_count": null, - "id": "abfa4df6-73ac-4d46-8047-011648e24958", - "metadata": {}, - "outputs": [], - "source": [ - "from superduperdb.ext.openai import OpenAIChatCompletion\n", - "\n", - "chat = OpenAIChatCompletion(\n", - " model='gpt-3.5-turbo',\n", - " prompt=(\n", - " 'Use the following description and code-snippets about SuperDuperDB to answer this question about SuperDuperDB\\n'\n", - " 'Do not use any other information you might have learned about other Python packages\\n'\n", - " 'Only base your answer on the code-snippets retrieved\\n'\n", - " '{context}\\n\\n'\n", - " 'Here\\'s the question:\\n'\n", - " ),\n", - ")\n", - "\n", - "db.add(chat)\n", - "\n", - "print(db.show('model'))" - ] - }, - { - "cell_type": "markdown", - "id": "696ac7bb-eaaf-4bec-9561-603b3c98a736", - "metadata": {}, - "source": [ - "## Ask Questions to Your Docs\n", - "\n", - "Finally, you can ask questions about the documents. You can target specific queries and use the power of MongoDB for vector-search and filtering rules. Here's an example of asking a question:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "fc4a0f6c-9e24-47aa-bc73-7cc4507e94ff", - "metadata": {}, - "outputs": [], - "source": [ - "from superduperdb import Document\n", - "from IPython.display import Markdown\n", - "\n", - "# Define the search parameters\n", - "search_term = 'Can you give me a code-snippet to set up a `VectorIndex`?'\n", - "num_results = 5\n", - "\n", - "output, context = db.predict(\n", - " model_name='gpt-3.5-turbo',\n", - " input=search_term,\n", - " context_select=(\n", - " collection\n", - " .like(Document({'txt': search_term}), vector_index='my-index', n=num_results)\n", - " .find()\n", - " ),\n", - " context_key='txt',\n", - ")\n", - "\n", - "Markdown(output.content)" - ] - }, - { - "cell_type": "markdown", - "id": "b3d1fe16-78d7-4c8d-9991-1086cc9e51bb", - "metadata": {}, - "source": [ - "Reset the demo" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "ab688589-5180-4a78-8fc3-8d3ddaf11e37", - "metadata": {}, - "outputs": [], - "source": [ - "db.remove('vector_index', 'my-index', force=True)\n", - "db.remove('listener', 'text-embedding-ada-002/txt', force=True)\n", - "db.remove('model', 'text-embedding-ada-002', force=True)" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3 (ipykernel)", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.11.6" - } - }, - "nbformat": 4, - "nbformat_minor": 5 -} diff --git a/examples/sql-example.ipynb b/examples/sql-example.ipynb deleted file mode 100644 index c1dad4d70..000000000 --- a/examples/sql-example.ipynb +++ /dev/null @@ -1,339 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "id": "f0e29fef", - "metadata": {}, - "source": [ - "# End-2-end example using SQL databases\n", - "\n", - "SuperDuperDB allows users to connect to a MongoDB database, or any one of a range of SQL databases, i.e. 
from this selection:\n", - "\n", - "- MongoDB\n", - "- PostgreSQL\n", - "- SQLite\n", - "- DuckDB\n", - "- BigQuery\n", - "- ClickHouse\n", - "- DataFusion\n", - "- Druid\n", - "- Impala\n", - "- MSSQL\n", - "- MySQL\n", - "- Oracle\n", - "- pandas\n", - "- Polars\n", - "- PySpark\n", - "- Snowflake\n", - "- Trino\n", - "\n", - "In this example we showcase how to implement multimodal vector-search with DuckDB.\n", - "This is a simple extension of multimodal vector-search with MongoDB, which is \n", - "just slightly easier to set up (see [here](https://docs.superduperdb.com/docs/use_cases/items/multimodal_image_search_clip)).\n", - "Everything we do here applies equally to any of the above supported SQL databases, as well as to tabular data formats on disk, such as `pandas`." - ] - }, - { - "cell_type": "markdown", - "id": "ff1db9c6", - "metadata": {}, - "source": [ - "## Prerequisites\n", - "\n", - "Before working on this use-case, make sure that you've installed the software requirements:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "38d752ab", - "metadata": {}, - "outputs": [], - "source": [ - "!pip install superduperdb[demo]" - ] - }, - { - "cell_type": "markdown", - "id": "dfde8264", - "metadata": {}, - "source": [ - "## Connect to datastore\n", - "\n", - "The first step in any `superduperdb` workflow is to connect to your datastore.\n", - "In order to connect to a different datastore, add a different `URI`, e.g. `postgres://...`." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "b8e7ef91-9eda-4fbd-b34f-b49b5411fc47", - "metadata": {}, - "outputs": [], - "source": [ - "import os\n", - "from superduperdb import superduper\n", - "\n", - "os.makedirs('.superduperdb', exist_ok=True)\n", - "db = superduper('duckdb://.superduperdb/test.ddb')" - ] - }, - { - "cell_type": "markdown", - "id": "b8794451", - "metadata": {}, - "source": [ - "## Load dataset\n", - "\n", - "Once connected, add some data to the datastore:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "b9d2b073-38b0-4d29-aa65-e568f19e7852", - "metadata": {}, - "outputs": [], - "source": [ - "!curl -O https://superduperdb-public.s3.eu-west-1.amazonaws.com/coco_sample.zip\n", - "!curl -O https://superduperdb-public.s3.eu-west-1.amazonaws.com/captions_tiny.json\n", - "!unzip coco_sample.zip\n", - "!mkdir -p data/coco\n", - "!mv images_small data/coco/images" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "ec5d36d3-7e74-4c87-92c2-ed1586330858", - "metadata": {}, - "outputs": [], - "source": [ - "import json\n", - "import pandas\n", - "import PIL.Image\n", - "\n", - "with open('captions_tiny.json') as f:\n", - " data = json.load(f)[:500]\n", - " \n", - "data = pandas.DataFrame([\n", - " {\n", - " 'image': r['image']['_content']['path'], \n", - " 'captions': r['captions']\n", - " } for r in data \n", - "])\n", - "data['id'] = pandas.Series(data.index).apply(str)\n", - "images_df = data[['id', 'image']].copy()\n", - "\n", - "images_df['image'] = images_df['image'].apply(PIL.Image.open)\n", - "captions_df = data[['id', 'captions']].explode('captions')" - ] - }, - { - "cell_type": "markdown", - "id": "b43cd7d2", - "metadata": {}, - "source": [ - "## Define schema\n", - "\n", - "This use-case requires a table with images and a table with text. \n", - "SuperDuperDB extends standard SQL functionality by allowing developers to define\n", - "their own data-types via the `Encoder` abstraction."
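For data-types without pre-configured support, the same mechanism can be extended. A hedged sketch of a custom encoder (assuming the `Encoder` constructor that `pil_image` and `vector` are built on; the exact import path may vary between versions):

```python
import pickle
from superduperdb import Encoder

# A custom data-type: any picklable Python object stored as raw bytes
pickled_object = Encoder(
    identifier='pickled-object',
    encoder=lambda x: pickle.dumps(x),   # object -> bytes on insert
    decoder=lambda b: pickle.loads(b),   # bytes -> object on retrieval
)
```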
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "2d9483f3-78c5-47df-9fa2-4cd070282791", - "metadata": {}, - "outputs": [], - "source": [ - "from superduperdb.backends.ibis.query import Table\n", - "from superduperdb.backends.ibis.field_types import dtype\n", - "from superduperdb.ext.pillow import pil_image\n", - "from superduperdb import Schema\n", - "\n", - "captions = Table(\n", - " 'captions', \n", - " primary_id='id',\n", - " schema=Schema(\n", - " 'captions-schema',\n", - " fields={'id': dtype(str), 'captions': dtype(str)},\n", - " )\n", - ")\n", - "\n", - "images = Table(\n", - " 'images', \n", - " primary_id='id',\n", - " schema=Schema(\n", - " 'images-schema',\n", - " fields={'id': dtype(str), 'image': pil_image},\n", - " )\n", - ")\n", - "\n", - "db.add(captions)\n", - "db.add(images)" - ] - }, - { - "cell_type": "markdown", - "id": "115b2c14", - "metadata": {}, - "source": [ - "## Add data to the datastore" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "93cce29b-dd04-47d2-bdfc-fe3780e06ddb", - "metadata": {}, - "outputs": [], - "source": [ - "_ = db.execute(images.insert(images_df))\n", - "_ = db.execute(captions.insert(captions_df))" - ] - }, - { - "cell_type": "markdown", - "id": "def10282", - "metadata": {}, - "source": [ - "## Build SuperDuperDB `Model` instances\n", - "\n", - "This use-case uses the `superduperdb.ext.torch` extension. \n", - "Both models output `torch` tensors, which are encoded with `tensor`:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "d87ed43d-6f90-46c1-8851-6050ae21a051", - "metadata": {}, - "outputs": [], - "source": [ - "import clip\n", - "import torch\n", - "from superduperdb.ext.torch import TorchModel, tensor\n", - "\n", - "# Load the CLIP model\n", - "model, preprocess = clip.load(\"RN50\", device='cpu')\n", - "\n", - "# Define a tensor type\n", - "t = tensor(torch.float, shape=(1024,))\n", - "\n", - "# Create a TorchModel for text encoding\n", - "text_model = TorchModel(\n", - " identifier='clip_text',\n", - " object=model,\n", - " preprocess=lambda x: clip.tokenize(x)[0],\n", - " encoder=t,\n", - " forward_method='encode_text', \n", - ")\n", - "\n", - "# Create a TorchModel for visual encoding\n", - "visual_model = TorchModel(\n", - " identifier='clip_image',\n", - " object=model.visual, \n", - " preprocess=preprocess,\n", - " encoder=t,\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "89c5c236", - "metadata": {}, - "source": [ - "## Create a Vector-Search Index\n", - "\n", - "Let's define a multi-modal search index on the basis of the models defined above.\n", - "The `visual_model` is applied to the images to make the `images` table searchable."
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "fb8aef2c-484f-41a9-9956-d80b9c58eaa8", - "metadata": {}, - "outputs": [], - "source": [ - "from superduperdb import VectorIndex, Listener\n", - "\n", - "db.add(\n", - " VectorIndex(\n", - " 'my-index',\n", - " indexing_listener=Listener(\n", - " model=visual_model,\n", - " key='image',\n", - " select=images,\n", - " ),\n", - " compatible_listener=Listener(\n", - " model=text_model,\n", - " key='captions',\n", - " active=False,\n", - " select=None,\n", - " )\n", - " )\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "4d8b9e84", - "metadata": {}, - "source": [ - "## Search Images Using Text\n", - "\n", - "Now we can demonstrate searching for images using text queries:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "1731a574-921a-4cba-a65c-26bff9fb9c8c", - "metadata": {}, - "outputs": [], - "source": [ - "from superduperdb import Document\n", - "\n", - "res = db.execute(\n", - " images\n", - " .like(Document({'captions': 'dog catches frisbee'}), vector_index='my-index', n=10)\n", - " .limit(10)\n", - ")" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "72522031-0af8-452a-bbd1-b27dede55154", - "metadata": {}, - "outputs": [], - "source": [ - "res[3]['image'].x" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3 (ipykernel)", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.11.6" - } - }, - "nbformat": 4, - "nbformat_minor": 5 -} diff --git a/examples/transfer_learning.ipynb b/examples/transfer_learning.ipynb deleted file mode 100644 index 1aeea0dcb..000000000 --- a/examples/transfer_learning.ipynb +++ /dev/null @@ -1,267 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "source": [ - "# Transfer Learning with Sentence Transformers and Scikit-Learn" - ], - "metadata": { - "collapsed": false - }, - "id": "fe6fd0ab0e1ad844" - }, - { - "cell_type": "markdown", - "source": [ - "## Introduction\n", - "\n", - "In this notebook, we will explore the process of transfer learning using SuperDuperDB. We will demonstrate how to connect to a MongoDB datastore, load a dataset, create a SuperDuperDB model based on Sentence Transformers, train a downstream model using Scikit-Learn, and apply the trained model to the database. Transfer learning is a powerful technique that can be used in various applications, such as vector search and downstream learning tasks." 
- ], - "metadata": { - "collapsed": false - }, - "id": "8dcde44d942793ff" - }, - { - "cell_type": "markdown", - "source": [ - "## Prerequisites\n", - "\n", - "Before diving into the implementation, ensure that you have the necessary libraries installed by running the following commands:" - ], - "metadata": { - "collapsed": false - }, - "id": "1809feca8a8dca5a" - }, - { - "cell_type": "code", - "execution_count": null, - "outputs": [], - "source": [ - "!pip install superduperdb\n", - "!pip install ipython numpy datasets sentence-transformers" - ], - "metadata": { - "collapsed": false - }, - "id": "94f3219ad932a327" - }, - { - "cell_type": "markdown", - "id": "6bc151f6", - "metadata": {}, - "source": [ - "## Connect to datastore " - ] - }, - { - "cell_type": "markdown", - "source": [ - "First, we need to establish a connection to a MongoDB datastore via SuperDuperDB. You can configure the `MongoDB_URI` based on your specific setup. \n", - "Here are some examples of MongoDB URIs:\n", - "\n", - "* For testing (default connection): `mongomock://test`\n", - "* Local MongoDB instance: `mongodb://localhost:27017`\n", - "* MongoDB with authentication: `mongodb://superduper:superduper@mongodb:27017/documents`\n", - "* MongoDB Atlas: `mongodb+srv://:@/`" - ], - "metadata": { - "collapsed": false - }, - "id": "5379007991707d17" - }, - { - "cell_type": "code", - "execution_count": null, - "id": "44f8ef76", - "metadata": {}, - "outputs": [], - "source": [ - "from superduperdb import superduper\n", - "from superduperdb.backends.mongodb import Collection\n", - "import os\n", - "\n", - "mongodb_uri = os.getenv(\"MONGODB_URI\",\"mongomock://test\")\n", - "db = superduper(mongodb_uri)\n", - "\n", - "collection = Collection('transfer')" - ] - }, - { - "cell_type": "markdown", - "id": "97fede97", - "metadata": {}, - "source": [ - "## Load Dataset\n", - "\n", - "Transfer learning can be applied to any data that can be processed with SuperDuperDB models.\n", - "For our example, we will use a labeled textual dataset with sentiment analysis. We'll load a subset of the IMDb dataset." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "8bb65106", - "metadata": {}, - "outputs": [], - "source": [ - "import numpy\n", - "from datasets import load_dataset\n", - "from superduperdb import Document as D\n", - "\n", - "data = load_dataset(\"imdb\")\n", - "\n", - "N_DATAPOINTS = 500 # Increase for higher quality\n", - "\n", - "train_data = [\n", - " D({'_fold': 'train', **data['train'][int(i)]}) \n", - " for i in numpy.random.permutation(len(data['train']))\n", - "][:N_DATAPOINTS]\n", - "\n", - "valid_data = [\n", - " D({'_fold': 'valid', **data['test'][int(i)]}) \n", - " for i in numpy.random.permutation(len(data['test']))\n", - "][:N_DATAPOINTS // 10]\n", - "\n", - "db.execute(collection.insert_many(train_data))" - ] - }, - { - "cell_type": "markdown", - "id": "00a92214", - "metadata": {}, - "source": [ - "## Run Model\n", - "\n", - "We'll create a SuperDuperDB model based on the `sentence_transformers` library. This demonstrates that you don't necessarily need a native SuperDuperDB integration with a model library to leverage its power. We configure the `Model wrapper` to work with the `SentenceTransformer class`. After configuration, we can link the model to a collection and daemonize the model with the `listen=True` keyword." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "fef91c74", - "metadata": {}, - "outputs": [], - "source": [ - "from superduperdb import Model\n", - "import sentence_transformers\n", - "from superduperdb.ext.numpy import array\n", - "\n", - "m = Model(\n", - " identifier='all-MiniLM-L6-v2',\n", - " object=sentence_transformers.SentenceTransformer('all-MiniLM-L6-v2'),\n", - " encoder=array('float32', shape=(384,)),\n", - " predict_method='encode',\n", - " batch_predict=True,\n", - ")\n", - "\n", - "m.predict(\n", - " X='text',\n", - " db=db,\n", - " select=collection.find(),\n", - " listen=True\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "68fefc17", - "metadata": {}, - "source": [ - "## Train Downstream Model\n", - "Now that we've created and added the model that computes features for the `\"text\"`, we can train a downstream model using Scikit-Learn." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "8c2faeeb", - "metadata": {}, - "outputs": [], - "source": [ - "from sklearn.svm import SVC\n", - "\n", - "model = superduper(\n", - " SVC(gamma='scale', class_weight='balanced', C=100, verbose=True),\n", - " postprocess=lambda x: int(x)\n", - ")\n", - "\n", - "model.fit(\n", - " X='text',\n", - " y='label',\n", - " db=db,\n", - " select=collection.find().featurize({'text': 'all-MiniLM-L6-v2'}),\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "d1e1f164", - "metadata": {}, - "source": [ - "## Run Downstream Model\n", - "\n", - "With the model trained, we can now apply it to the database. " - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "eee16436", - "metadata": {}, - "outputs": [], - "source": [ - "model.predict(\n", - " X='text',\n", - " db=db,\n", - " select=collection.find().featurize({'text': 'all-MiniLM-L6-v2'}),\n", - " listen=True,\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "67b156c1", - "metadata": {}, - "source": [ - "## Verification\n", - "\n", - "To verify that the process has worked, we can sample a few records to inspect the sanity of the predictions." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "76958a1e", - "metadata": {}, - "outputs": [], - "source": [ - "r = next(db.execute(collection.aggregate([{'$sample': {'size': 1}}])))\n", - "print(r['text'][:100])\n", - "print(r['_outputs']['text']['svc'])" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3 (ipykernel)", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.11.6" - } - }, - "nbformat": 4, - "nbformat_minor": 5 -} diff --git a/examples/vector_search.ipynb b/examples/vector_search.ipynb deleted file mode 100644 index cf4172a6e..000000000 --- a/examples/vector_search.ipynb +++ /dev/null @@ -1,324 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "id": "0d352545-a8c6-45ad-8359-c9b6edd2b7d2", - "metadata": {}, - "source": [ - "# Vector-search with SuperDuperDB\n", - "\n", - "## Introduction\n", - "This notebook provides a detailed guide on performing vector search using SuperDuperDB. Vector search is a powerful technique for searching and retrieving documents based on their similarity to a query vector. In this guide, we will demonstrate how to set up SuperDuperDB for vector search and use it to search a dataset of documents." 
- ] - }, - { - "cell_type": "markdown", - "id": "f283b5675bea4619", - "metadata": { - "collapsed": false, - "jupyter": { - "outputs_hidden": false - } - }, - "source": [ - "## Prerequisites\n", - "\n", - "Before diving into the implementation, ensure that you have the necessary libraries installed by running the following commands:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "c1f9e69e-75f4-42f9-a48d-b1f68f02646d", - "metadata": {}, - "outputs": [], - "source": [ - "!pip install superduperdb\n", - "!pip install ipython" - ] - }, - { - "cell_type": "markdown", - "id": "f79d7ef8-46eb-4210-8d96-a09648314e37", - "metadata": {}, - "source": [ - "Additionally, ensure that you have set your OpenAI API key as an environment variable. You can uncomment the following code and add your API key:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "a1c8e68c-045f-44b8-bfbf-4c9dff5cf30c", - "metadata": {}, - "outputs": [], - "source": [ - "import os\n", - "\n", - "#os.environ['OPENAI_API_KEY'] = 'sk-...'\n", - "\n", - "if 'OPENAI_API_KEY' not in os.environ:\n", - " raise Exception('You need to set an OpenAI key as environment variable: \"export OPENAI_API_KEY=sk-...\"')" - ] - }, - { - "cell_type": "markdown", - "id": "4db1c2b4-e0b3-420f-ba1c-bd49655bff2b", - "metadata": {}, - "source": [ - "## Connect to datastore \n", - "\n", - "First, we need to establish a connection to a MongoDB datastore via SuperDuperDB. You can configure the `MongoDB_URI` based on your specific setup. \n", - "Here are some examples of MongoDB URIs:\n", - "\n", - "* For testing (default connection): `mongomock://test`\n", - "* Local MongoDB instance: `mongodb://localhost:27017`\n", - "* MongoDB with authentication: `mongodb://superduper:superduper@mongodb:27017/documents`\n", - "* MongoDB Atlas: `mongodb+srv://:@/`" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "8e097557-7c50-4442-9e38-1df8a9d8f211", - "metadata": {}, - "outputs": [], - "source": [ - "from superduperdb import superduper\n", - "from superduperdb.backends.mongodb import Collection\n", - "import os\n", - "\n", - "mongodb_uri = os.getenv(\"MONGODB_URI\",\"mongomock://test\")\n", - "db = superduper(mongodb_uri, artifact_store='filesystem://./data/')\n", - "\n", - "doc_collection = Collection('documents')" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "f41b3a35-760e-49aa-8387-6a5efb990ea5", - "metadata": {}, - "outputs": [], - "source": [ - "db.metadata" - ] - }, - { - "cell_type": "markdown", - "id": "6bb5ee2b-f0bb-4660-961d-fdf98833f33d", - "metadata": {}, - "source": [ - "## Load Dataset \n", - "\n", - "We have prepared a dataset, which is the inline documentation of the pymongo API. 
Let's load this dataset:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "049b4122-b2c9-4ca5-be3c-df788912ce34", - "metadata": {}, - "outputs": [], - "source": [ - "!curl -O https://superduperdb-public.s3.eu-west-1.amazonaws.com/pymongo.json\n", - "\n", - "import json\n", - "\n", - "with open('pymongo.json') as f:\n", - " data = json.load(f)" - ] - }, - { - "cell_type": "markdown", - "id": "420ef3662c07d91e", - "metadata": { - "collapsed": false, - "jupyter": { - "outputs_hidden": false - } - }, - "source": [ - "As usual, we insert the data:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "468ec3dc-fa1f-4c23-b569-456b8900b72c", - "metadata": {}, - "outputs": [], - "source": [ - "from superduperdb import Document\n", - "\n", - "db.execute(doc_collection.insert_many([Document(r) for r in data]))" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "aba4aa66-2aec-4986-b263-c510788bf478", - "metadata": {}, - "outputs": [], - "source": [ - "db.execute(Collection('documents').find_one())" - ] - }, - { - "cell_type": "markdown", - "id": "e7f78f2d-86c0-463f-8eb1-630cd65d48ef", - "metadata": {}, - "source": [ - "## Create Vectors\n", - "\n", - "In the remainder of the notebook, you can choose between using the `openai` or `sentence_transformers` libraries to perform vector search. After instantiating the model wrappers, the rest of the notebook remains identical.\n", - "\n", - "For OpenAI vectors:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "a0a30873-b7fc-4ec5-ace9-f3d4ca01bab2", - "metadata": {}, - "outputs": [], - "source": [ - "from superduperdb.ext.openai.model import OpenAIEmbedding\n", - "\n", - "model = OpenAIEmbedding(model='text-embedding-ada-002')" - ] - }, - { - "cell_type": "markdown", - "id": "7e8d1d264dd7ba1b", - "metadata": { - "collapsed": false, - "jupyter": { - "outputs_hidden": false - } - }, - "source": [ - "For Sentence-Transformers vectors, uncomment the following section:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "a14c5c5f-c770-4a94-884c-3705f1d0a627", - "metadata": {}, - "outputs": [], - "source": [ - "#import sentence_transformers\n", - "#from superduperdb import Model, vector\n", - "\n", - "#model = Model(\n", - "# identifier='all-MiniLM-L6-v2', \n", - "# object=sentence_transformers.SentenceTransformer('all-MiniLM-L6-v2'),\n", - "# encoder=vector(shape=(384,)),\n", - "# predict_method='encode', # Specify the prediction method\n", - "# postprocess=lambda x: x.tolist(), # Define postprocessing function\n", - "# batch_predict=True, # Generate predictions for a set of observations all at once \n", - "#)" - ] - }, - { - "cell_type": "markdown", - "id": "b546308d-45f2-4605-8778-7aca46fe3c7c", - "metadata": {}, - "source": [ - "## Index Vectors\n", - "\n", - "Now we can configure the Atlas vector-search index. This command saves and sets up a model to `listen` to a particular subfield (or the whole document) for new text, converts it on the fly to vectors, and then indexes these vectors using Atlas vector-search." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "46ce7a59-cdd2-46e5-a218-77ef73df7a95", - "metadata": {}, - "outputs": [], - "source": [ - "from superduperdb import Listener, VectorIndex\n", - "\n", - "db.add(\n", - " VectorIndex(\n", - " identifier=f'pymongo-docs-{model.identifier}',\n", - " indexing_listener=Listener(\n", - " select=doc_collection.find(),\n", - " key='value',\n", - " model=model,\n", - " predict_kwargs={'max_chunk_size': 1000},\n", - " ),\n", - " )\n", - ")\n", - "\n", - "db.show('vector_index')" - ] - }, - { - "cell_type": "markdown", - "id": "8aea59ff7e8ef67c", - "metadata": { - "collapsed": false, - "jupyter": { - "outputs_hidden": false - } - }, - "source": [ - "## Perform Vector Search\n", - "\n", - "Now that the index is set up, we can use it in a query. SuperDuperDB provides some syntactic sugar for the `aggregate` search pipelines, which can be helpful. It also handles all the conversion of inputs to vectors under the hood." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "f0184fb1-10ae-4488-9e93-c56b5fcd9ac2", - "metadata": {}, - "outputs": [], - "source": [ - "from superduperdb import Document\n", - "from IPython.display import *\n", - "\n", - "# Define the search parameters\n", - "search_term = 'Query the database'\n", - "num_results = 5\n", - "\n", - "# Execute the query\n", - "result = db.execute(doc_collection\n", - " .like(Document({'value': search_term}), vector_index=f'pymongo-docs-{model.identifier}', n=num_results)\n", - " .find()\n", - ")\n", - "\n", - "# Display a horizontal line\n", - "display(Markdown('---'))\n", - "\n", - "# Iterate through the query results and display them\n", - "for r in result:\n", - " display(Markdown(f'### `{r[\"parent\"] + \".\" if r[\"parent\"] else \"\"}{r[\"res\"]}`'))\n", - " display(Markdown(r['value']))\n", - " display(Markdown('---'))" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3 (ipykernel)", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.11.5" - } - }, - "nbformat": 4, - "nbformat_minor": 5 -} diff --git a/examples/video_search.ipynb b/examples/video_search.ipynb deleted file mode 100644 index f405d5288..000000000 --- a/examples/video_search.ipynb +++ /dev/null @@ -1,549 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "id": "a58575c6-59c4-4289-869f-f5a1ac7e021c", - "metadata": {}, - "source": [ - "# Search within videos with text\n", - "\n", - "## Introduction\n", - "This notebook outlines the process of searching for specific textual information within videos and retrieving relevant video segments. To accomplish this, we utilize various libraries and techniques, such as:\n", - "* clip: A library for vision and language understanding.\n", - "* PIL: Python Imaging Library for image processing.\n", - "* torch: The PyTorch library for deep learning." 
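As a brief aside, here is a minimal standalone sketch of the CLIP mechanics this notebook builds on: an image and a text prompt are embedded into the same vector space and compared with cosine similarity. The file name `frame.jpg` and the prompt are placeholders, not part of the original notebook.

```python
import clip
import torch
from PIL import Image

# Load the same CLIP variant used later in this notebook
model, preprocess = clip.load("RN50", device="cpu")

image = preprocess(Image.open("frame.jpg")).unsqueeze(0)  # placeholder frame
text = clip.tokenize(["some ducks swimming in a pond"])   # placeholder prompt

with torch.no_grad():
    image_features = model.encode_image(image)
    text_features = model.encode_text(text)

# Higher cosine similarity means the frame matches the text better
print(torch.cosine_similarity(image_features, text_features).item())
```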
- ] - }, - { - "cell_type": "markdown", - "id": "6eec562900dd0cff", - "metadata": { - "collapsed": false, - "jupyter": { - "outputs_hidden": false - } - }, - "source": [ - "## Prerequisites\n", - "\n", - "Before diving into the implementation, ensure that you have the necessary libraries installed by running the following commands:" - ] - }, - { - "cell_type": "code", - "execution_count": 5, - "id": "ab56c57e-fa04-43dd-9670-ade9b5c6d4ac", - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Requirement already satisfied: ipython in /Users/dodo/SuperDuperDB/superduperdb/.venv/lib/python3.11/site-packages (8.17.2)\n", - "Collecting opencv-python\n", - " Obtaining dependency information for opencv-python from https://files.pythonhosted.org/packages/05/58/7ee92b21cb98689cbe28c69e3cf8ee51f261bfb6bc904ae578736d22d2e7/opencv_python-4.8.1.78-cp37-abi3-macosx_10_16_x86_64.whl.metadata\n", - " Using cached opencv_python-4.8.1.78-cp37-abi3-macosx_10_16_x86_64.whl.metadata (19 kB)\n", - "Requirement already satisfied: pillow in /Users/dodo/SuperDuperDB/superduperdb/.venv/lib/python3.11/site-packages (10.1.0)\n", - "Collecting openai-clip\n", - " Using cached openai_clip-1.0.1-py3-none-any.whl\n", - "Requirement already satisfied: decorator in /Users/dodo/SuperDuperDB/superduperdb/.venv/lib/python3.11/site-packages (from ipython) (5.1.1)\n", - "Requirement already satisfied: jedi>=0.16 in /Users/dodo/SuperDuperDB/superduperdb/.venv/lib/python3.11/site-packages (from ipython) (0.19.1)\n", - "Requirement already satisfied: matplotlib-inline in /Users/dodo/SuperDuperDB/superduperdb/.venv/lib/python3.11/site-packages (from ipython) (0.1.6)\n", - "Requirement already satisfied: prompt-toolkit!=3.0.37,<3.1.0,>=3.0.30 in /Users/dodo/SuperDuperDB/superduperdb/.venv/lib/python3.11/site-packages (from ipython) (3.0.39)\n", - "Requirement already satisfied: pygments>=2.4.0 in /Users/dodo/SuperDuperDB/superduperdb/.venv/lib/python3.11/site-packages (from ipython) (2.16.1)\n", - "Requirement already satisfied: stack-data in /Users/dodo/SuperDuperDB/superduperdb/.venv/lib/python3.11/site-packages (from ipython) (0.6.3)\n", - "Requirement already satisfied: traitlets>=5 in /Users/dodo/SuperDuperDB/superduperdb/.venv/lib/python3.11/site-packages (from ipython) (5.13.0)\n", - "Requirement already satisfied: pexpect>4.3 in /Users/dodo/SuperDuperDB/superduperdb/.venv/lib/python3.11/site-packages (from ipython) (4.8.0)\n", - "Requirement already satisfied: appnope in /Users/dodo/SuperDuperDB/superduperdb/.venv/lib/python3.11/site-packages (from ipython) (0.1.3)\n", - "Requirement already satisfied: numpy>=1.21.2 in /Users/dodo/SuperDuperDB/superduperdb/.venv/lib/python3.11/site-packages (from opencv-python) (1.26.1)\n", - "Requirement already satisfied: ftfy in /Users/dodo/SuperDuperDB/superduperdb/.venv/lib/python3.11/site-packages (from openai-clip) (6.1.1)\n", - "Requirement already satisfied: regex in /Users/dodo/SuperDuperDB/superduperdb/.venv/lib/python3.11/site-packages (from openai-clip) (2023.10.3)\n", - "Requirement already satisfied: tqdm in /Users/dodo/SuperDuperDB/superduperdb/.venv/lib/python3.11/site-packages (from openai-clip) (4.66.1)\n", - "Requirement already satisfied: parso<0.9.0,>=0.8.3 in /Users/dodo/SuperDuperDB/superduperdb/.venv/lib/python3.11/site-packages (from jedi>=0.16->ipython) (0.8.3)\n", - "Requirement already satisfied: ptyprocess>=0.5 in /Users/dodo/SuperDuperDB/superduperdb/.venv/lib/python3.11/site-packages (from pexpect>4.3->ipython) 
(0.7.0)\n", - "Requirement already satisfied: wcwidth in /Users/dodo/SuperDuperDB/superduperdb/.venv/lib/python3.11/site-packages (from prompt-toolkit!=3.0.37,<3.1.0,>=3.0.30->ipython) (0.2.9)\n", - "Requirement already satisfied: executing>=1.2.0 in /Users/dodo/SuperDuperDB/superduperdb/.venv/lib/python3.11/site-packages (from stack-data->ipython) (2.0.1)\n", - "Requirement already satisfied: asttokens>=2.1.0 in /Users/dodo/SuperDuperDB/superduperdb/.venv/lib/python3.11/site-packages (from stack-data->ipython) (2.4.1)\n", - "Requirement already satisfied: pure-eval in /Users/dodo/SuperDuperDB/superduperdb/.venv/lib/python3.11/site-packages (from stack-data->ipython) (0.2.2)\n", - "Requirement already satisfied: six>=1.12.0 in /Users/dodo/SuperDuperDB/superduperdb/.venv/lib/python3.11/site-packages (from asttokens>=2.1.0->stack-data->ipython) (1.16.0)\n", - "Using cached opencv_python-4.8.1.78-cp37-abi3-macosx_10_16_x86_64.whl (54.7 MB)\n", - "Installing collected packages: opencv-python, openai-clip\n", - "Successfully installed openai-clip-1.0.1 opencv-python-4.8.1.78\n", - "\n", - "\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m A new release of pip is available: \u001b[0m\u001b[31;49m23.2.1\u001b[0m\u001b[39;49m -> \u001b[0m\u001b[32;49m23.3.1\u001b[0m\n", - "\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m To update, run: \u001b[0m\u001b[32;49mpip install --upgrade pip\u001b[0m\n" - ] - } - ], - "source": [ - "# !pip install superduperdb\n", - "!pip install ipython opencv-python pillow openai-clip" - ] - }, - { - "cell_type": "markdown", - "id": "f559fff0-df68-473a-94a2-afe39e4d5577", - "metadata": {}, - "source": [ - "## Connect to datastore \n", - "\n", - "First, we need to establish a connection to a MongoDB datastore via SuperDuperDB. You can configure the `MongoDB_URI` based on your specific setup. \n", - "Here are some examples of MongoDB URIs:\n", - "\n", - "* For testing (default connection): `mongomock://test`\n", - "* Local MongoDB instance: `mongodb://localhost:27017`\n", - "* MongoDB with authentication: `mongodb://superduper:superduper@mongodb:27017/documents`\n", - "* MongoDB Atlas: `mongodb+srv://:@/`" - ] - }, - { - "cell_type": "code", - "execution_count": 1, - "id": "99de0e3d-8918-4fc4-a45b-0a58b70793c6", - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "\u001b[32m 2023-Nov-14 13:53:39.83\u001b[0m| \u001b[32m\u001b[1mSUCCESS \u001b[0m | \u001b[36mDuncans-MacBook-Pro.local\u001b[0m| \u001b[36msuperduperdb.base.build\u001b[0m:\u001b[36m69 \u001b[0m | \u001b[32m\u001b[1mInitializing DataBackend Client: mongomock.MongoClient('localhost', 27017)\u001b[0m\n" - ] - } - ], - "source": [ - "from superduperdb import superduper\n", - "from superduperdb.backends.mongodb import Collection\n", - "from superduperdb import CFG\n", - "import os\n", - "\n", - "CFG.downloads.hybrid = True\n", - "CFG.downloads.root = './'\n", - "\n", - "mongodb_uri = os.getenv(\"MONGODB_URI\",\"mongomock://test\")\n", - "db = superduper(mongodb_uri, artifact_store='filesystem://./data/')\n", - "\n", - "video_collection = Collection('videos')" - ] - }, - { - "cell_type": "markdown", - "id": "1e53ce4113115246", - "metadata": { - "collapsed": false, - "jupyter": { - "outputs_hidden": false - } - }, - "source": [ - "## Load Dataset\n", - "\n", - "We'll begin by configuring a video encoder." 
- ] - }, - { - "cell_type": "code", - "execution_count": 2, - "id": "ebac4921-5c83-4ba7-b793-67f5f90d42ec", - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "[]" - ] - }, - "execution_count": 2, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "from superduperdb import Encoder\n", - "\n", - "vid_enc = Encoder(\n", - " identifier='video_on_file',\n", - " load_hybrid=False,\n", - ")\n", - "\n", - "db.add(vid_enc)" - ] - }, - { - "cell_type": "markdown", - "id": "bf1cef0e-21ac-4291-b2c8-41065717ee67", - "metadata": {}, - "source": [ - "Now, let's retrieve a sample video from the internet and insert it into our collection." - ] - }, - { - "cell_type": "code", - "execution_count": 3, - "id": "ee6335cb-960d-4239-be6e-501d52b88026", - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "\u001b[32m 2023-Nov-14 13:53:42.29\u001b[0m| \u001b[1mINFO \u001b[0m | \u001b[36mDuncans-MacBook-Pro.local\u001b[0m| \u001b[36msuperduperdb.misc.download\u001b[0m:\u001b[36m358 \u001b[0m | \u001b[1mfound 1 uris\u001b[0m\n", - "\u001b[32m 2023-Nov-14 13:53:42.54\u001b[0m| \u001b[1mINFO \u001b[0m | \u001b[36mDuncans-MacBook-Pro.local\u001b[0m| \u001b[36msuperduperdb.misc.download\u001b[0m:\u001b[36m125 \u001b[0m | \u001b[1mnumber of workers 0\u001b[0m\n" - ] - }, - { - "name": "stderr", - "output_type": "stream", - "text": [ - "100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 1.10it/s]\n" - ] - }, - { - "data": { - "text/plain": [ - "[Document({'video': Encodable(encoder=Encoder(identifier='video_on_file', decoder=, encoder=, shape=None, version=0, load_hybrid=False), x=None, uri='https://superduperdb-public.s3.eu-west-1.amazonaws.com/animals_excerpt.mp4'), '_fold': 'train', '_id': ObjectId('65536dd6b0e451df3a649bc4')})]" - ] - }, - "execution_count": 3, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "from superduperdb.base.document import Document\n", - "\n", - "db.execute(video_collection.insert_one(\n", - " Document({'video': vid_enc(uri='https://superduperdb-public.s3.eu-west-1.amazonaws.com/animals_excerpt.mp4')})\n", - " )\n", - ")\n", - "\n", - "# Display the list of videos in the collection\n", - "list(db.execute(Collection('videos').find()))" - ] - }, - { - "cell_type": "markdown", - "id": "441fe6d6a9dee06b", - "metadata": { - "collapsed": false, - "jupyter": { - "outputs_hidden": false - } - }, - "source": [ - "## Register Encoders\n", - "\n", - "Next, we'll create encoders for processing videos and extracting frames. This encoder will help us convert videos into individual frames." 
- ] - }, - { - "cell_type": "code", - "execution_count": 6, - "id": "2af2d178-9ff2-496d-8293-e5aee3f12a19", - "metadata": {}, - "outputs": [], - "source": [ - "import cv2\n", - "import tqdm\n", - "from PIL import Image\n", - "from superduperdb.ext.pillow import pil_image\n", - "from superduperdb import Model, Schema\n", - "\n", - "\n", - "def video2images(video_file):\n", - " sample_freq = 10\n", - " cap = cv2.VideoCapture(video_file)\n", - "\n", - " frame_count = 0\n", - "\n", - " fps = cap.get(cv2.CAP_PROP_FPS)\n", - " print(fps)\n", - " extracted_frames = []\n", - " progress = tqdm.tqdm()\n", - "\n", - " while True:\n", - " ret, frame = cap.read()\n", - " if not ret:\n", - " break\n", - " current_timestamp = frame_count // fps\n", - " \n", - " if frame_count % sample_freq == 0:\n", - " extracted_frames.append({\n", - " 'image': Image.fromarray(frame[:,:,::-1]),\n", - " 'current_timestamp': current_timestamp,\n", - " })\n", - " frame_count += 1 \n", - " progress.update(1)\n", - " \n", - " cap.release()\n", - " cv2.destroyAllWindows()\n", - " return extracted_frames\n", - "\n", - "\n", - "video2images = Model(\n", - " identifier='video2images',\n", - " object=video2images,\n", - " flatten=True,\n", - " model_update_kwargs={'document_embedded': False},\n", - " output_schema=Schema(identifier='myschema', fields={'image': pil_image})\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "19a28dbe-dcec-4c6b-bd2d-72dbd48daf39", - "metadata": {}, - "source": [ - "We'll also set up a listener to continuously download video URLs and save the best frames into another collection." - ] - }, - { - "cell_type": "code", - "execution_count": 7, - "id": "a30d093b-03d3-4bdb-aa8b-46ff974d1995", - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "\u001b[32m 2023-Nov-14 13:54:27.17\u001b[0m| \u001b[1mINFO \u001b[0m | \u001b[36mDuncans-MacBook-Pro.local\u001b[0m| \u001b[36msuperduperdb.components.model\u001b[0m:\u001b[36m207 \u001b[0m | \u001b[1mAdding model video2images to db\u001b[0m\n", - "\u001b[32m 2023-Nov-14 13:54:27.17\u001b[0m| \u001b[1mINFO \u001b[0m | \u001b[36mDuncans-MacBook-Pro.local\u001b[0m| \u001b[36msuperduperdb.components.model\u001b[0m:\u001b[36m210 \u001b[0m | \u001b[1mDone.\u001b[0m\n" - ] - }, - { - "name": "stderr", - "output_type": "stream", - "text": [ - "1it [00:00, 1916.08it/s]\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "30.0\n" - ] - }, - { - "name": "stderr", - "output_type": "stream", - "text": [ - "900it [00:00, 1844.03it/s]\n" - ] - }, - { - "ename": "InvalidDocument", - "evalue": "documents must have only string keys, key was 0", - "output_type": "error", - "traceback": [ - "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", - "\u001b[0;31mInvalidDocument\u001b[0m Traceback (most recent call last)", - "Cell \u001b[0;32mIn[7], line 3\u001b[0m\n\u001b[1;32m 1\u001b[0m \u001b[38;5;28;01mfrom\u001b[39;00m \u001b[38;5;21;01msuperduperdb\u001b[39;00m \u001b[38;5;28;01mimport\u001b[39;00m Listener\n\u001b[0;32m----> 3\u001b[0m \u001b[43mdb\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43madd\u001b[49m\u001b[43m(\u001b[49m\n\u001b[1;32m 4\u001b[0m \u001b[43m \u001b[49m\u001b[43mListener\u001b[49m\u001b[43m(\u001b[49m\n\u001b[1;32m 5\u001b[0m \u001b[43m \u001b[49m\u001b[43mmodel\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mvideo2images\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 6\u001b[0m \u001b[43m 
\u001b[49m\u001b[43mselect\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mvideo_collection\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mfind\u001b[49m\u001b[43m(\u001b[49m\u001b[43m)\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 7\u001b[0m \u001b[43m \u001b[49m\u001b[43mkey\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;124;43m'\u001b[39;49m\u001b[38;5;124;43mvideo\u001b[39;49m\u001b[38;5;124;43m'\u001b[39;49m\u001b[43m,\u001b[49m\n\u001b[1;32m 8\u001b[0m \u001b[43m \u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 9\u001b[0m \u001b[43m)\u001b[49m\n\u001b[1;32m 11\u001b[0m db\u001b[38;5;241m.\u001b[39mexecute(Collection(\u001b[38;5;124m'\u001b[39m\u001b[38;5;124m_outputs.video.video2images\u001b[39m\u001b[38;5;124m'\u001b[39m)\u001b[38;5;241m.\u001b[39mfind_one())\u001b[38;5;241m.\u001b[39munpack()[\u001b[38;5;124m'\u001b[39m\u001b[38;5;124m_outputs\u001b[39m\u001b[38;5;124m'\u001b[39m][\u001b[38;5;124m'\u001b[39m\u001b[38;5;124mvideo\u001b[39m\u001b[38;5;124m'\u001b[39m][\u001b[38;5;124m'\u001b[39m\u001b[38;5;124mvideo2images\u001b[39m\u001b[38;5;124m'\u001b[39m][\u001b[38;5;124m'\u001b[39m\u001b[38;5;124mimage\u001b[39m\u001b[38;5;124m'\u001b[39m]\n", - "File \u001b[0;32m~/SuperDuperDB/superduperdb/superduperdb/base/datalayer.py:491\u001b[0m, in \u001b[0;36mDatalayer.add\u001b[0;34m(self, object, dependencies)\u001b[0m\n\u001b[1;32m 483\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[38;5;28mtype\u001b[39m(\u001b[38;5;28mobject\u001b[39m)(\n\u001b[1;32m 484\u001b[0m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39m_add(\n\u001b[1;32m 485\u001b[0m \u001b[38;5;28mobject\u001b[39m\u001b[38;5;241m=\u001b[39mcomponent,\n\u001b[0;32m (...)\u001b[0m\n\u001b[1;32m 488\u001b[0m \u001b[38;5;28;01mfor\u001b[39;00m component \u001b[38;5;129;01min\u001b[39;00m \u001b[38;5;28mobject\u001b[39m\n\u001b[1;32m 489\u001b[0m )\n\u001b[1;32m 490\u001b[0m \u001b[38;5;28;01melif\u001b[39;00m \u001b[38;5;28misinstance\u001b[39m(\u001b[38;5;28mobject\u001b[39m, Component):\n\u001b[0;32m--> 491\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43m_add\u001b[49m\u001b[43m(\u001b[49m\u001b[38;5;28;43mobject\u001b[39;49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;28;43mobject\u001b[39;49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mdependencies\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mdependencies\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 492\u001b[0m \u001b[38;5;28;01melse\u001b[39;00m:\n\u001b[1;32m 493\u001b[0m \u001b[38;5;28;01mraise\u001b[39;00m \u001b[38;5;167;01mValueError\u001b[39;00m(\n\u001b[1;32m 494\u001b[0m \u001b[38;5;124m'\u001b[39m\u001b[38;5;124mobject should be a sequence of `Component` or `Component`\u001b[39m\u001b[38;5;124m'\u001b[39m\n\u001b[1;32m 495\u001b[0m )\n", - "File \u001b[0;32m~/SuperDuperDB/superduperdb/superduperdb/base/datalayer.py:855\u001b[0m, in \u001b[0;36mDatalayer._add\u001b[0;34m(self, object, dependencies, serialized, parent)\u001b[0m\n\u001b[1;32m 851\u001b[0m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mmetadata\u001b[38;5;241m.\u001b[39mcreate_parent_child(parent, \u001b[38;5;28mobject\u001b[39m\u001b[38;5;241m.\u001b[39munique_id)\n\u001b[1;32m 852\u001b[0m \u001b[38;5;28mobject\u001b[39m\u001b[38;5;241m.\u001b[39mon_load(\n\u001b[1;32m 853\u001b[0m \u001b[38;5;28mself\u001b[39m\n\u001b[1;32m 854\u001b[0m ) \u001b[38;5;66;03m# TODO do I really need to call this here? 
Could be handled by `.on_create`?\u001b[39;00m\n\u001b[0;32m--> 855\u001b[0m jobs \u001b[38;5;241m=\u001b[39m \u001b[38;5;28;43mobject\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mschedule_jobs\u001b[49m\u001b[43m(\u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mdependencies\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mdependencies\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 856\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m jobs\n", - "File \u001b[0;32m~/SuperDuperDB/superduperdb/superduperdb/components/listener.py:113\u001b[0m, in \u001b[0;36mListener.schedule_jobs\u001b[0;34m(self, database, dependencies, distributed, verbose)\u001b[0m\n\u001b[1;32m 110\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m ()\n\u001b[1;32m 112\u001b[0m \u001b[38;5;28;01massert\u001b[39;00m \u001b[38;5;129;01mnot\u001b[39;00m \u001b[38;5;28misinstance\u001b[39m(\u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mmodel, \u001b[38;5;28mstr\u001b[39m)\n\u001b[0;32m--> 113\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mmodel\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mpredict\u001b[49m\u001b[43m(\u001b[49m\n\u001b[1;32m 114\u001b[0m \u001b[43m \u001b[49m\u001b[43mX\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mkey\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 115\u001b[0m \u001b[43m \u001b[49m\u001b[43mdb\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mdatabase\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 116\u001b[0m \u001b[43m \u001b[49m\u001b[43mselect\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mselect\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 117\u001b[0m \u001b[43m \u001b[49m\u001b[43mdistributed\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mdistributed\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 118\u001b[0m \u001b[43m \u001b[49m\u001b[43mdependencies\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mdependencies\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 119\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[43m(\u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mpredict_kwargs\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;129;43;01mor\u001b[39;49;00m\u001b[43m \u001b[49m\u001b[43m{\u001b[49m\u001b[43m}\u001b[49m\u001b[43m)\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 120\u001b[0m \u001b[43m\u001b[49m\u001b[43m)\u001b[49m\n", - "File \u001b[0;32m~/SuperDuperDB/superduperdb/superduperdb/components/model.py:238\u001b[0m, in \u001b[0;36mPredictMixin.predict\u001b[0;34m(self, X, db, select, distributed, ids, max_chunk_size, dependencies, listen, one, context, in_memory, overwrite, **kwargs)\u001b[0m\n\u001b[1;32m 236\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m select \u001b[38;5;129;01mis\u001b[39;00m \u001b[38;5;129;01mnot\u001b[39;00m \u001b[38;5;28;01mNone\u001b[39;00m \u001b[38;5;129;01mand\u001b[39;00m ids \u001b[38;5;129;01mis\u001b[39;00m \u001b[38;5;28;01mNone\u001b[39;00m:\n\u001b[1;32m 237\u001b[0m \u001b[38;5;28;01massert\u001b[39;00m db \u001b[38;5;129;01mis\u001b[39;00m \u001b[38;5;129;01mnot\u001b[39;00m \u001b[38;5;28;01mNone\u001b[39;00m\n\u001b[0;32m--> 238\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m 
\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43m_predict_with_select\u001b[49m\u001b[43m(\u001b[49m\n\u001b[1;32m 239\u001b[0m \u001b[43m \u001b[49m\u001b[43mX\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mX\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 240\u001b[0m \u001b[43m \u001b[49m\u001b[43mselect\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mselect\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 241\u001b[0m \u001b[43m \u001b[49m\u001b[43mdb\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mdb\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 242\u001b[0m \u001b[43m \u001b[49m\u001b[43min_memory\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43min_memory\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 243\u001b[0m \u001b[43m \u001b[49m\u001b[43mmax_chunk_size\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mmax_chunk_size\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 244\u001b[0m \u001b[43m \u001b[49m\u001b[43moverwrite\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43moverwrite\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 245\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[43mkwargs\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 246\u001b[0m \u001b[43m \u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 247\u001b[0m \u001b[38;5;28;01melif\u001b[39;00m select \u001b[38;5;129;01mis\u001b[39;00m \u001b[38;5;129;01mnot\u001b[39;00m \u001b[38;5;28;01mNone\u001b[39;00m \u001b[38;5;129;01mand\u001b[39;00m ids \u001b[38;5;129;01mis\u001b[39;00m \u001b[38;5;129;01mnot\u001b[39;00m \u001b[38;5;28;01mNone\u001b[39;00m:\n\u001b[1;32m 248\u001b[0m \u001b[38;5;28;01massert\u001b[39;00m db \u001b[38;5;129;01mis\u001b[39;00m \u001b[38;5;129;01mnot\u001b[39;00m \u001b[38;5;28;01mNone\u001b[39;00m\n", - "File \u001b[0;32m~/SuperDuperDB/superduperdb/superduperdb/components/model.py:319\u001b[0m, in \u001b[0;36mPredictMixin._predict_with_select\u001b[0;34m(self, X, select, db, max_chunk_size, in_memory, overwrite, **kwargs)\u001b[0m\n\u001b[1;32m 316\u001b[0m \u001b[38;5;28;01mfor\u001b[39;00m r \u001b[38;5;129;01min\u001b[39;00m tqdm\u001b[38;5;241m.\u001b[39mtqdm(db\u001b[38;5;241m.\u001b[39mexecute(query)):\n\u001b[1;32m 317\u001b[0m ids\u001b[38;5;241m.\u001b[39mappend(\u001b[38;5;28mstr\u001b[39m(r[db\u001b[38;5;241m.\u001b[39mdatabackend\u001b[38;5;241m.\u001b[39mid_field]))\n\u001b[0;32m--> 319\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43m_predict_with_select_and_ids\u001b[49m\u001b[43m(\u001b[49m\n\u001b[1;32m 320\u001b[0m \u001b[43m \u001b[49m\u001b[43mX\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mX\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 321\u001b[0m \u001b[43m \u001b[49m\u001b[43mdb\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mdb\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 322\u001b[0m \u001b[43m \u001b[49m\u001b[43mids\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mids\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 323\u001b[0m \u001b[43m \u001b[49m\u001b[43mselect\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mselect\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 324\u001b[0m \u001b[43m \u001b[49m\u001b[43mmax_chunk_size\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mmax_chunk_size\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 325\u001b[0m \u001b[43m \u001b[49m\u001b[43min_memory\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43min_memory\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 
326\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[43mkwargs\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 327\u001b[0m \u001b[43m\u001b[49m\u001b[43m)\u001b[49m\n", - "File \u001b[0;32m~/SuperDuperDB/superduperdb/superduperdb/components/model.py:395\u001b[0m, in \u001b[0;36mPredictMixin._predict_with_select_and_ids\u001b[0;34m(self, X, db, select, ids, in_memory, max_chunk_size, **kwargs)\u001b[0m\n\u001b[1;32m 391\u001b[0m outputs \u001b[38;5;241m=\u001b[39m encoded_ouputs \u001b[38;5;28;01mif\u001b[39;00m encoded_ouputs \u001b[38;5;28;01melse\u001b[39;00m outputs\n\u001b[1;32m 393\u001b[0m \u001b[38;5;28;01massert\u001b[39;00m \u001b[38;5;28misinstance\u001b[39m(\u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mversion, \u001b[38;5;28mint\u001b[39m)\n\u001b[0;32m--> 395\u001b[0m \u001b[43mselect\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mmodel_update\u001b[49m\u001b[43m(\u001b[49m\n\u001b[1;32m 396\u001b[0m \u001b[43m \u001b[49m\u001b[43mdb\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mdb\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 397\u001b[0m \u001b[43m \u001b[49m\u001b[43mmodel\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43midentifier\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 398\u001b[0m \u001b[43m \u001b[49m\u001b[43moutputs\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43moutputs\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 399\u001b[0m \u001b[43m \u001b[49m\u001b[43mkey\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mX\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 400\u001b[0m \u001b[43m \u001b[49m\u001b[43mversion\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mversion\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 401\u001b[0m \u001b[43m \u001b[49m\u001b[43mids\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mids\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 402\u001b[0m \u001b[43m \u001b[49m\u001b[43mflatten\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mflatten\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 403\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mmodel_update_kwargs\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 404\u001b[0m \u001b[43m\u001b[49m\u001b[43m)\u001b[49m\n", - "File \u001b[0;32m~/SuperDuperDB/superduperdb/superduperdb/backends/base/query.py:54\u001b[0m, in \u001b[0;36mSelect.model_update\u001b[0;34m(self, db, ids, key, model, version, outputs, **kwargs)\u001b[0m\n\u001b[1;32m 44\u001b[0m \u001b[38;5;28;01mdef\u001b[39;00m \u001b[38;5;21mmodel_update\u001b[39m(\u001b[38;5;28mself\u001b[39m, db, ids: t\u001b[38;5;241m.\u001b[39mSequence[t\u001b[38;5;241m.\u001b[39mAnnotated], key: \u001b[38;5;28mstr\u001b[39m, model: \u001b[38;5;28mstr\u001b[39m, version: \u001b[38;5;28mint\u001b[39m, outputs: t\u001b[38;5;241m.\u001b[39mSequence[t\u001b[38;5;241m.\u001b[39mAny], \u001b[38;5;241m*\u001b[39m\u001b[38;5;241m*\u001b[39mkwargs):\n\u001b[1;32m 45\u001b[0m \u001b[38;5;250m \u001b[39m\u001b[38;5;124;03m\"\"\"\u001b[39;00m\n\u001b[1;32m 46\u001b[0m \u001b[38;5;124;03m Update model outputs for a set of ids.\u001b[39;00m\n\u001b[1;32m 47\u001b[0m \n\u001b[0;32m (...)\u001b[0m\n\u001b[1;32m 52\u001b[0m \u001b[38;5;124;03m :param outputs: The 
outputs to update\u001b[39;00m\n\u001b[1;32m 53\u001b[0m \u001b[38;5;124;03m \"\"\"\u001b[39;00m\n\u001b[0;32m---> 54\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mtable_or_collection\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mmodel_update\u001b[49m\u001b[43m(\u001b[49m\n\u001b[1;32m 55\u001b[0m \u001b[43m \u001b[49m\u001b[43mdb\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mdb\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 56\u001b[0m \u001b[43m \u001b[49m\u001b[43mids\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mids\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 57\u001b[0m \u001b[43m \u001b[49m\u001b[43mkey\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mkey\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 58\u001b[0m \u001b[43m \u001b[49m\u001b[43mmodel\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mmodel\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 59\u001b[0m \u001b[43m \u001b[49m\u001b[43moutputs\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43moutputs\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 60\u001b[0m \u001b[43m \u001b[49m\u001b[43mversion\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mversion\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 61\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[43mkwargs\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 62\u001b[0m \u001b[43m \u001b[49m\u001b[43m)\u001b[49m\n", - "File \u001b[0;32m~/SuperDuperDB/superduperdb/superduperdb/backends/mongodb/query.py:732\u001b[0m, in \u001b[0;36mCollection.model_update\u001b[0;34m(self, db, ids, key, model, version, outputs, document_embedded, flatten, **kwargs)\u001b[0m\n\u001b[1;32m 730\u001b[0m collection_name \u001b[38;5;241m=\u001b[39m \u001b[38;5;124mf\u001b[39m\u001b[38;5;124m'\u001b[39m\u001b[38;5;124m_outputs.\u001b[39m\u001b[38;5;132;01m{\u001b[39;00mkey\u001b[38;5;132;01m}\u001b[39;00m\u001b[38;5;124m.\u001b[39m\u001b[38;5;132;01m{\u001b[39;00mmodel\u001b[38;5;132;01m}\u001b[39;00m\u001b[38;5;124m'\u001b[39m\n\u001b[1;32m 731\u001b[0m collection \u001b[38;5;241m=\u001b[39m db\u001b[38;5;241m.\u001b[39mdatabackend\u001b[38;5;241m.\u001b[39mget_table_or_collection(collection_name)\n\u001b[0;32m--> 732\u001b[0m \u001b[43mcollection\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mbulk_write\u001b[49m\u001b[43m(\u001b[49m\u001b[43mbulk_docs\u001b[49m\u001b[43m)\u001b[49m\n", - "File \u001b[0;32m~/SuperDuperDB/superduperdb/.venv/lib/python3.11/site-packages/mongomock/collection.py:1823\u001b[0m, in \u001b[0;36mCollection.bulk_write\u001b[0;34m(self, requests, ordered, bypass_document_validation, session)\u001b[0m\n\u001b[1;32m 1821\u001b[0m \u001b[38;5;28;01mfor\u001b[39;00m operation \u001b[38;5;129;01min\u001b[39;00m requests:\n\u001b[1;32m 1822\u001b[0m operation\u001b[38;5;241m.\u001b[39m_add_to_bulk(bulk)\n\u001b[0;32m-> 1823\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m BulkWriteResult(\u001b[43mbulk\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mexecute\u001b[49m\u001b[43m(\u001b[49m\u001b[43m)\u001b[49m, \u001b[38;5;28;01mTrue\u001b[39;00m)\n", - "File \u001b[0;32m~/SuperDuperDB/superduperdb/.venv/lib/python3.11/site-packages/mongomock/collection.py:314\u001b[0m, in \u001b[0;36mBulkOperationBuilder.execute\u001b[0;34m(self, write_concern)\u001b[0m\n\u001b[1;32m 312\u001b[0m exec_name \u001b[38;5;241m=\u001b[39m execute_func\u001b[38;5;241m.\u001b[39m\u001b[38;5;18m__name__\u001b[39m\n\u001b[1;32m 313\u001b[0m 
\u001b[38;5;28;01mtry\u001b[39;00m:\n\u001b[0;32m--> 314\u001b[0m op_result \u001b[38;5;241m=\u001b[39m \u001b[43mexecute_func\u001b[49m\u001b[43m(\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 315\u001b[0m \u001b[38;5;28;01mexcept\u001b[39;00m WriteError \u001b[38;5;28;01mas\u001b[39;00m error:\n\u001b[1;32m 316\u001b[0m result[\u001b[38;5;124m'\u001b[39m\u001b[38;5;124mwriteErrors\u001b[39m\u001b[38;5;124m'\u001b[39m]\u001b[38;5;241m.\u001b[39mappend({\n\u001b[1;32m 317\u001b[0m \u001b[38;5;124m'\u001b[39m\u001b[38;5;124mindex\u001b[39m\u001b[38;5;124m'\u001b[39m: index,\n\u001b[1;32m 318\u001b[0m \u001b[38;5;124m'\u001b[39m\u001b[38;5;124mcode\u001b[39m\u001b[38;5;124m'\u001b[39m: error\u001b[38;5;241m.\u001b[39mcode,\n\u001b[1;32m 319\u001b[0m \u001b[38;5;124m'\u001b[39m\u001b[38;5;124merrmsg\u001b[39m\u001b[38;5;124m'\u001b[39m: \u001b[38;5;28mstr\u001b[39m(error),\n\u001b[1;32m 320\u001b[0m })\n", - "File \u001b[0;32m~/SuperDuperDB/superduperdb/.venv/lib/python3.11/site-packages/mongomock/collection.py:273\u001b[0m, in \u001b[0;36mBulkOperationBuilder.insert..exec_insert\u001b[0;34m()\u001b[0m\n\u001b[1;32m 272\u001b[0m \u001b[38;5;28;01mdef\u001b[39;00m \u001b[38;5;21mexec_insert\u001b[39m():\n\u001b[0;32m--> 273\u001b[0m \u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mcollection\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43minsert_one\u001b[49m\u001b[43m(\u001b[49m\n\u001b[1;32m 274\u001b[0m \u001b[43m \u001b[49m\u001b[43mdoc\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mbypass_document_validation\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43m_bypass_document_validation\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 275\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m {\u001b[38;5;124m'\u001b[39m\u001b[38;5;124mnInserted\u001b[39m\u001b[38;5;124m'\u001b[39m: \u001b[38;5;241m1\u001b[39m}\n", - "File \u001b[0;32m~/SuperDuperDB/superduperdb/.venv/lib/python3.11/site-packages/mongomock/collection.py:454\u001b[0m, in \u001b[0;36mCollection.insert_one\u001b[0;34m(self, document, bypass_document_validation, session)\u001b[0m\n\u001b[1;32m 452\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m \u001b[38;5;129;01mnot\u001b[39;00m bypass_document_validation:\n\u001b[1;32m 453\u001b[0m validate_is_mutable_mapping(\u001b[38;5;124m'\u001b[39m\u001b[38;5;124mdocument\u001b[39m\u001b[38;5;124m'\u001b[39m, document)\n\u001b[0;32m--> 454\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m InsertOneResult(\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43m_insert\u001b[49m\u001b[43m(\u001b[49m\u001b[43mdocument\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43msession\u001b[49m\u001b[43m)\u001b[49m, acknowledged\u001b[38;5;241m=\u001b[39m\u001b[38;5;28;01mTrue\u001b[39;00m)\n", - "File \u001b[0;32m~/SuperDuperDB/superduperdb/.venv/lib/python3.11/site-packages/mongomock/collection.py:505\u001b[0m, in \u001b[0;36mCollection._insert\u001b[0;34m(self, data, session, ordered)\u001b[0m\n\u001b[1;32m 501\u001b[0m \u001b[38;5;28;01mraise\u001b[39;00m \u001b[38;5;167;01mValueError\u001b[39;00m(\u001b[38;5;124m'\u001b[39m\u001b[38;5;124mDocument keys must be strings\u001b[39m\u001b[38;5;124m'\u001b[39m)\n\u001b[1;32m 503\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m BSON:\n\u001b[1;32m 504\u001b[0m \u001b[38;5;66;03m# bson validation\u001b[39;00m\n\u001b[0;32m--> 505\u001b[0m 
\u001b[43mBSON\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mencode\u001b[49m\u001b[43m(\u001b[49m\u001b[43mdata\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mcheck_keys\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;28;43;01mTrue\u001b[39;49;00m\u001b[43m)\u001b[49m\n\u001b[1;32m 507\u001b[0m \u001b[38;5;66;03m# Like pymongo, we should fill the _id in the inserted dict (odd behavior,\u001b[39;00m\n\u001b[1;32m 508\u001b[0m \u001b[38;5;66;03m# but we need to stick to it), so we must patch in-place the data dict\u001b[39;00m\n\u001b[1;32m 509\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m \u001b[38;5;124m'\u001b[39m\u001b[38;5;124m_id\u001b[39m\u001b[38;5;124m'\u001b[39m \u001b[38;5;129;01mnot\u001b[39;00m \u001b[38;5;129;01min\u001b[39;00m data:\n", - "File \u001b[0;32m~/SuperDuperDB/superduperdb/.venv/lib/python3.11/site-packages/bson/__init__.py:1428\u001b[0m, in \u001b[0;36mBSON.encode\u001b[0;34m(cls, document, check_keys, codec_options)\u001b[0m\n\u001b[1;32m 1401\u001b[0m \u001b[38;5;129m@classmethod\u001b[39m\n\u001b[1;32m 1402\u001b[0m \u001b[38;5;28;01mdef\u001b[39;00m \u001b[38;5;21mencode\u001b[39m(\n\u001b[1;32m 1403\u001b[0m \u001b[38;5;28mcls\u001b[39m: Type[BSON],\n\u001b[0;32m (...)\u001b[0m\n\u001b[1;32m 1406\u001b[0m codec_options: CodecOptions[Any] \u001b[38;5;241m=\u001b[39m DEFAULT_CODEC_OPTIONS,\n\u001b[1;32m 1407\u001b[0m ) \u001b[38;5;241m-\u001b[39m\u001b[38;5;241m>\u001b[39m BSON:\n\u001b[1;32m 1408\u001b[0m \u001b[38;5;250m \u001b[39m\u001b[38;5;124;03m\"\"\"Encode a document to a new :class:`BSON` instance.\u001b[39;00m\n\u001b[1;32m 1409\u001b[0m \n\u001b[1;32m 1410\u001b[0m \u001b[38;5;124;03m A document can be any mapping type (like :class:`dict`).\u001b[39;00m\n\u001b[0;32m (...)\u001b[0m\n\u001b[1;32m 1426\u001b[0m \u001b[38;5;124;03m Replaced `uuid_subtype` option with `codec_options`.\u001b[39;00m\n\u001b[1;32m 1427\u001b[0m \u001b[38;5;124;03m \"\"\"\u001b[39;00m\n\u001b[0;32m-> 1428\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[38;5;28mcls\u001b[39m(\u001b[43mencode\u001b[49m\u001b[43m(\u001b[49m\u001b[43mdocument\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mcheck_keys\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mcodec_options\u001b[49m\u001b[43m)\u001b[49m)\n", - "File \u001b[0;32m~/SuperDuperDB/superduperdb/.venv/lib/python3.11/site-packages/bson/__init__.py:1042\u001b[0m, in \u001b[0;36mencode\u001b[0;34m(document, check_keys, codec_options)\u001b[0m\n\u001b[1;32m 1039\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m \u001b[38;5;129;01mnot\u001b[39;00m \u001b[38;5;28misinstance\u001b[39m(codec_options, CodecOptions):\n\u001b[1;32m 1040\u001b[0m \u001b[38;5;28;01mraise\u001b[39;00m _CODEC_OPTIONS_TYPE_ERROR\n\u001b[0;32m-> 1042\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[43m_dict_to_bson\u001b[49m\u001b[43m(\u001b[49m\u001b[43mdocument\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mcheck_keys\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mcodec_options\u001b[49m\u001b[43m)\u001b[49m\n", - "\u001b[0;31mInvalidDocument\u001b[0m: documents must have only string keys, key was 0" - ] - } - ], - "source": [ - "from superduperdb import Listener\n", - "\n", - "db.add(\n", - " Listener(\n", - " model=video2images,\n", - " select=video_collection.find(),\n", - " key='video',\n", - " )\n", - ")\n", - "\n", - "db.execute(Collection('_outputs.video.video2images').find_one()).unpack()['_outputs']['video']['video2images']['image']" - ] - }, - { - "cell_type": "markdown", 
- "id": "8ef3c353-fcc4-4f23-892b-c8a3796f952c", - "metadata": {}, - "source": [ - "## Create CLIP model\n", - "Now, we'll create a model for the CLIP (Contrastive Language-Image Pre-training) model, which will be used for visual and textual analysis." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "bd7329ea-75d1-4275-b754-1a977e76161a", - "metadata": {}, - "outputs": [], - "source": [ - "import clip\n", - "from superduperdb import vector\n", - "from superduperdb.ext.torch import TorchModel\n", - "\n", - "model, preprocess = clip.load(\"RN50\", device='cpu')\n", - "t = vector(shape=(1024,))\n", - "\n", - "visual_model = TorchModel(\n", - " identifier='clip_image',\n", - " preprocess=preprocess,\n", - " object=model.visual,\n", - " encoder=t,\n", - " postprocess=lambda x: x.tolist(),\n", - ")\n", - "\n", - "text_model = TorchModel(\n", - " identifier='clip_text',\n", - " object=model,\n", - " preprocess=lambda x: clip.tokenize(x)[0],\n", - " forward_method='encode_text',\n", - " encoder=t,\n", - " device='cpu',\n", - " preferred_devices=None,\n", - " postprocess=lambda x: x.tolist(),\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "dfa470d9-35d0-4c53-a5d0-afba5456320a", - "metadata": {}, - "source": [ - "## Create VectorIndex\n", - "\n", - "We will set up a VectorIndex to index and search the video frames based on both visual and textual content. This involves creating an indexing listener for visual data and a compatible listener for textual data." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "475d27c7-81e0-47ae-a02b-b1df4332002c", - "metadata": {}, - "outputs": [], - "source": [ - "from superduperdb import Listener, VectorIndex\n", - "from superduperdb.backends.mongodb import Collection\n", - "\n", - "db.add(\n", - " VectorIndex(\n", - " identifier='video_search_index',\n", - " indexing_listener=Listener(\n", - " model=visual_model,\n", - " key='_outputs.video.video2images.image',\n", - " select=Collection('_outputs.video.video2images').find(),\n", - " ),\n", - " compatible_listener=Listener(\n", - " model=text_model,\n", - " key='text',\n", - " select=None,\n", - " active=False\n", - " )\n", - " )\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "aeb7df08-c1ec-45db-8716-79db58ad6502", - "metadata": {}, - "source": [ - "## Query a text against saved frames." - ] - }, - { - "cell_type": "markdown", - "id": "95c48d0c-4f7a-4c32-a2e3-3f8d8985733a", - "metadata": {}, - "source": [ - "Now, let's search for something that happened during the video:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "31ba463f-97ae-4f83-890e-c852f9818e63", - "metadata": {}, - "outputs": [], - "source": [ - "# Define the search parameters\n", - "search_term = 'Some ducks'\n", - "num_results = 1\n", - "\n", - "\n", - "r = next(db.execute(\n", - " Collection('_outputs.video.video2images').like(Document({'text': search_term}), vector_index='video_search_index', n=num_results).find()\n", - "))\n", - "\n", - "search_timestamp = r['_outputs']['video']['video2images']['current_timestamp']\n", - "\n", - "# Get the back reference to the original video\n", - "video = db.execute(Collection('videos').find_one({'_id': r['_source']}))" - ] - }, - { - "cell_type": "markdown", - "id": "78fc11ff-dafc-4525-88a5-327ed547b89e", - "metadata": {}, - "source": [ - "## Start the video from the resultant timestamp:\n", - "\n", - "Finally, we can display and play the video starting from the timestamp where the searched text is found." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "eeda6711-15a4-465e-903d-ed0a1d0db672", - "metadata": {}, - "outputs": [], - "source": [ - "from IPython.display import display, HTML\n", - "\n", - "video_html = f\"\"\"\n", - "\n", - "\n", - "\"\"\"\n", - "\n", - "display(HTML(video_html))" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3 (ipykernel)", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.11.6" - } - }, - "nbformat": 4, - "nbformat_minor": 5 -} diff --git a/examples/voice_memos.ipynb b/examples/voice_memos.ipynb deleted file mode 100644 index 2b5079174..000000000 --- a/examples/voice_memos.ipynb +++ /dev/null @@ -1,371 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "id": "ae374483", - "metadata": {}, - "source": [ - "# Cataloguing voice-memos for a self-managed personal assistant" - ] - }, - { - "cell_type": "markdown", - "id": "cc4fa500665eccb9", - "metadata": {}, - "source": [ - "## Introduction\n", - "\n", - "Discover the magic of SuperDuperDB as we seamlessly integrate models across different data modalities, such as audio and text. Experience the creation of highly sophisticated data-based applications with minimal boilerplate code.\n", - "\n", - "### Objectives:\n", - "\n", - "1. Maintain a database of audio recordings\n", - "2. Index the content of these audio recordings\n", - "3. Search and interrogate the content of these audio recordings\n", - "\n", - "### Our approach involves:\n", - "\n", - "* Utilizing a transformers model by Facebook's AI team to transcribe audio to text.\n", - "* Employing an OpenAI vectorization model to index the transcribed text.\n", - "* Harnessing the OpenAI ChatGPT model in conjunction with relevant recordings to query the audio database." - ] - }, - { - "cell_type": "markdown", - "id": "ecf9f0ec45cb1f3", - "metadata": {}, - "source": [ - "## Prerequisites\n", - "\n", - "Before diving into the implementation, ensure that you have the necessary libraries installed by running the following commands:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "dce1a857", - "metadata": { - "tags": [] - }, - "outputs": [], - "source": [ - "!pip install superduperdb\n", - "!pip install transformers soundfile torchaudio librosa openai\n", - "!pip install -U datasets" - ] - }, - { - "cell_type": "markdown", - "id": "0d02e472-8395-435c-b46d-6a5158ef67fb", - "metadata": { - "tags": [] - }, - "source": [ - "Additionally, ensure that you have set your OpenAI API key as an environment variable. You can uncomment the following code and add your API key:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "94262bf76c630b10", - "metadata": { - "collapsed": false, - "tags": [] - }, - "outputs": [], - "source": [ - "import os\n", - "\n", - "#os.environ['OPENAI_API_KEY'] = 'sk-XXXX'\n", - "\n", - "if 'OPENAI_API_KEY' not in os.environ:\n", - " raise Exception('Environment variable \"OPENAI_API_KEY\" not set')" - ] - }, - { - "cell_type": "markdown", - "id": "32971b8afdf76fe5", - "metadata": {}, - "source": [ - "## Connect to datastore \n", - "\n", - "First, we need to establish a connection to a MongoDB datastore via SuperDuperDB. You can configure the `MongoDB_URI` based on your specific setup. 
\n", - "Here are some examples of MongoDB URIs:\n", - "\n", - "* For testing (default connection): `mongomock://test`\n", - "* Local MongoDB instance: `mongodb://localhost:27017`\n", - "* MongoDB with authentication: `mongodb://superduper:superduper@mongodb:27017/documents`\n", - "* MongoDB Atlas: `mongodb+srv://:@/`" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "b84da3f2ef58e401", - "metadata": { - "collapsed": false, - "tags": [] - }, - "outputs": [], - "source": [ - "from superduperdb import superduper\n", - "from superduperdb.backends.mongodb import Collection\n", - "import os\n", - "\n", - "mongodb_uri = os.getenv(\"MONGODB_URI\",\"mongomock://test\")\n", - "db = superduper(mongodb_uri)\n", - "\n", - "# Create a collection for Voice memos\n", - "voice_collection = Collection('voice-memos')" - ] - }, - { - "cell_type": "markdown", - "id": "d13d051e8f5f6f", - "metadata": {}, - "source": [ - "\n", - "## Load Dataset\n", - "\n", - "In this example se use `LibriSpeech` as our voice recording dataset. It is a corpus of approximately 1000 hours of read English speech. The same functionality could be accomplised using any audio, in particular audio hosted on the web, or in an `s3` bucket. For instance, if you have a repository of audio of conference calls, or memos, this may be indexed in the same way. " - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "10ab7114", - "metadata": { - "tags": [] - }, - "outputs": [], - "source": [ - "from datasets import load_dataset\n", - "from superduperdb.ext.numpy import array\n", - "from superduperdb import Document\n", - "\n", - "data = load_dataset(\"hf-internal-testing/librispeech_asr_demo\", \"clean\", split=\"validation\")\n", - "\n", - "# Using an `Encoder`, we may add the audio data directly to a MongoDB collection:\n", - "enc = array('float64', shape=(None,))\n", - "\n", - "db.add(enc)\n", - "\n", - "db.execute(voice_collection.insert_many([\n", - " Document({'audio': enc(r['audio']['array'])}) for r in data\n", - "]))" - ] - }, - { - "cell_type": "markdown", - "id": "721f31f4626881e0", - "metadata": {}, - "source": [ - "## Install Pre-Trained Model (LibreSpeech) into Database\n", - "\n", - "Apply a pretrained `transformers` model to the data: " - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "222284f7", - "metadata": { - "tags": [] - }, - "outputs": [], - "source": [ - "from transformers import Speech2TextProcessor, Speech2TextForConditionalGeneration\n", - "from superduperdb.ext.transformers import Pipeline\n", - "\n", - "model = Speech2TextForConditionalGeneration.from_pretrained(\"facebook/s2t-small-librispeech-asr\")\n", - "processor = Speech2TextProcessor.from_pretrained(\"facebook/s2t-small-librispeech-asr\")\n", - "\n", - "SAMPLING_RATE = 16000\n", - "\n", - "transcriber = Pipeline(\n", - " identifier='transcription',\n", - " object=model,\n", - " preprocess=processor,\n", - " preprocess_kwargs={'sampling_rate': SAMPLING_RATE, 'return_tensors': 'pt', 'padding': True},\n", - " postprocess=lambda x: processor.batch_decode(x, skip_special_tokens=True),\n", - " predict_method='generate',\n", - " preprocess_type='other',\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "ed83a8b084844292", - "metadata": {}, - "source": [ - "# Run Predictions on All Recordings in the Collection\n", - "Apply the `Pipeline` to all audio recordings:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "573dccc4", - "metadata": { - "tags": [] - }, - "outputs": [], - 
"source": [ - "transcriber.predict(X='audio', db=db, select=voice_collection.find(), max_chunk_size=10)" - ] - }, - { - "cell_type": "markdown", - "id": "64a6cbd8e3d429d9", - "metadata": {}, - "source": [ - "## Ask Questions to Your Voice Assistant\n", - "\n", - "Ask questions to your voice assistant, targeting specific queries and utilizing the power of MongoDB for vector-search and filtering rules:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "3aedc03c", - "metadata": { - "tags": [] - }, - "outputs": [], - "source": [ - "from superduperdb import VectorIndex, Listener\n", - "from superduperdb.ext.openai import OpenAIEmbedding\n", - "\n", - "db.add(\n", - " VectorIndex(\n", - " identifier='my-index',\n", - " indexing_listener=Listener(\n", - " model=OpenAIEmbedding(model='text-embedding-ada-002'),\n", - " key='_outputs.audio.transcription',\n", - " select=voice_collection.find(),\n", - " ),\n", - " )\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "2f92b56f", - "metadata": {}, - "source": [ - "Let's confirm this has worked, by searching for the `royal cavern`." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "7d2e3e56", - "metadata": { - "tags": [] - }, - "outputs": [], - "source": [ - "# Define the search parameters\n", - "search_term = 'royal cavern'\n", - "num_results = 2\n", - "\n", - "list(db.execute(\n", - " voice_collection.like(\n", - " {'_outputs.audio.transcription': search_term},\n", - " n=num_results,\n", - " vector_index='my-index',\n", - " ).find({}, {'_outputs.audio.transcription': 1})\n", - "))" - ] - }, - { - "cell_type": "markdown", - "id": "6068514b31268846", - "metadata": {}, - "source": [ - "## Enrich it with Chat-Completion \n", - "\n", - "Connect the previous steps with the gpt-3.5.turbo, a chat-completion model on OpenAI. The plan is to seed the completions with the most relevant audio recordings, as judged by their textual transcriptions. These transcriptions are retrieved using the previously configured `VectorIndex`. " - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "99e206af", - "metadata": { - "tags": [] - }, - "outputs": [], - "source": [ - "from superduperdb.ext.openai import OpenAIChatCompletion\n", - "\n", - "chat = OpenAIChatCompletion(\n", - " model='gpt-3.5-turbo',\n", - " prompt=(\n", - " 'Use the following facts to answer this question\\n'\n", - " '{context}\\n\\n'\n", - " 'Here\\'s the question:\\n'\n", - " ),\n", - ")\n", - "\n", - "db.add(chat)\n", - "\n", - "print(db.show('model'))" - ] - }, - { - "cell_type": "markdown", - "id": "9cb623c4", - "metadata": {}, - "source": [ - "## Full Voice-Assistant Experience\n", - "\n", - "Test the full model by asking a question about a specific fact mentioned in the audio recordings. 
The model will retrieve the most relevant recordings and use them to formulate its answer:\n" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "60d7f0af-6305-4c8c-be65-4b75ec7dbf50", - "metadata": { - "tags": [] - }, - "outputs": [], - "source": [ - "from superduperdb import Document\n", - "\n", - "q = 'Is anything really Greek?'\n", - "\n", - "print(db.predict(\n", - " model_name='gpt-3.5-turbo',\n", - " input=q,\n", - " context_select=voice_collection.like(\n", - " Document({'_outputs.audio.transcription': q}), vector_index='my-index'\n", - " ).find(),\n", - " context_key='_outputs.audio.transcription',\n", - ")[0].content)" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3 (ipykernel)", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.11.5" - } - }, - "nbformat": 4, - "nbformat_minor": 5 -}
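One loose end from the voice-memos walkthrough above: the text mentions combining vector search with MongoDB filtering rules, but the cells only show a bare `.like(...).find()`. A hedged sketch of such a combination follows, reusing the `_fold` train/valid marker that SuperDuperDB adds to inserted documents (treating it as a filter here is purely illustrative, not part of the original notebooks):

```python
# Illustrative only: restrict the vector-search hits with an ordinary MongoDB
# filter passed to .find(); '_fold' is the split marker SuperDuperDB adds
# to every inserted document.
results = list(db.execute(
    voice_collection
    .like(
        {'_outputs.audio.transcription': 'royal cavern'},
        n=5,
        vector_index='my-index',
    )
    .find({'_fold': 'train'}, {'_outputs.audio.transcription': 1})
))
```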