Merge pull request #86 from getdozer/mrunmay/docs-update
docs: update
chubei authored Jan 22, 2024
2 parents 733b472 + b44a90a commit e4f0bf2
Showing 10 changed files with 94 additions and 157 deletions.
19 changes: 7 additions & 12 deletions docs/configuration/api-endpoints.md
@@ -3,23 +3,18 @@ The endpoint configuration defines how Dozer should expose gRPC/REST endpoints.

```yaml
endpoints:
-  - name: trips_cache
-    path: /trips
-    table_name: trips_cache
-    index:
-      ...
-    conflict_resolution:
-      ...
+  - table_name: trips_cache
+    kind: !Api
+    path: /trips
```
### Parameters
| Name | Type | Description |
|-----------------------|--------------|-------------------------------------------------------------------------------------------------------------------------------------|
-| `name`                | String | The designated name of the endpoint.                                                               |
-| `path`                | String | Determines the route or path for the REST endpoint.                                                |
-| `table_name`          | String | Identifies the name of the table in the source or in the SQL that this endpoint is set to expose.  |
-| [`index`](#indexes)   | Object | An optional section that describes the index configuration for this endpoint, specifying primary and secondary indexes and whether to skip default configurations. |
-| [`conflict_resolution`](#conflicts-resolution) | Object | An optional section that outlines the strategies to handle potential data conflicts for this endpoint. |
+| `table_name`          | String | Identifies the name of the table in the source or in the SQL that this endpoint is set to expose.  |
+| `kind`                | String | Determines the sink used for the endpoint, for example `!Dummy`, `!Aerospike`, or `!Snowflake`.    |
+| `path`                | String | Determines the route or path for the REST endpoint.                                                |


## Indexes
The `index` section of the endpoint configuration in Dozer determines how indexing is managed for the exposed endpoint. Appropriate indexing ensures quick data retrieval and can greatly improve query performance.
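For comparison with the `!Api` example above, the getting-started pages changed later in this commit use a sink-only endpoint under the new schema. A minimal sketch, assuming the `!Dummy` sink and that `path` only applies to REST-style `!Api` endpoints:

```yaml
# Sketch only: a non-REST endpoint under the new table_name/kind schema.
endpoints:
  - table_name: trips_cache
    kind: !Dummy
```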
1 change: 0 additions & 1 deletion docs/getting_started.md
@@ -9,7 +9,6 @@ Dozer is available in two flavours: an Open Source Core version and a Cloud vers
- [Connecting to data sources](getting_started/core/connecting-to-sources)
- [Adding transformations](getting_started/core/adding-transformations)
- [Querying data](getting_started/core/querying-data)
-- [Monitoring your application](getting_started/core/monitoring-your-application)

## Dozer Cloud

17 changes: 9 additions & 8 deletions docs/getting_started/cloud/adding-transformations.mdx
@@ -41,16 +41,17 @@ To expose the result of this query as an API we will also need to add an additio
```yaml
endpoints:
-  - name: ticker_analysis
-    path: /analysis/ticker
-    table_name: ticker_analysis
+  - table_name: ticker_analysis
+    kind: !Dummy

-  - name: daily_analysis
-    path: /analysis/daily
-    table_name: daily_analysis
+  - table_name: daily_analysis
+    kind: !Dummy

-  - name: highest_daily_close
-    path: /analysis/highest_daily_close
+  - table_name: highest_daily_close
+    kind: !Dummy

+  - table_name: lowest_daily_close
+    kind: !Dummy
```
5 changes: 2 additions & 3 deletions docs/getting_started/cloud/connecting-to-sources.mdx
@@ -40,9 +40,8 @@ connections:
extension: .csv
name: s3
endpoints:
-  - name: stocks
-    table_name: stocks
-    path: /stocks
+  - table_name: stocks
+    kind: !Dummy
```
5 changes: 2 additions & 3 deletions docs/getting_started/core/adding-transformations.mdx
@@ -26,9 +26,8 @@ To expose the result of this query as an API we will also need to add an additio
```yaml
endpoints:
-  - name: avg_fares
-    path: /avg_fares
-    table_name: avg_fares
+  - table_name: avg_fares
+    kind: !Dummy
```
<Tabs groupId="tool">
16 changes: 6 additions & 10 deletions docs/getting_started/core/connecting-to-sources.mdx
@@ -47,9 +47,8 @@ sources:
connection: local_storage

endpoints:
-  - name: trips
-    path: /trips
-    table_name: trips
+  - table_name: trips
+    kind: !Dummy
```
Now download some sample trip data and copy it to the `data/trips` directory:
@@ -162,14 +161,11 @@ sources:
connection: pg

endpoints:
-  - name: trips
-    path: /trips
-    table_name: trips
-
-  - name: zones
-    path: /zoness
-    table_name: zones
+  - table_name: trips
+    kind: !Dummy
+
+  - table_name: zones
+    kind: !Dummy
```
<Tabs groupId="tool">
106 changes: 0 additions & 106 deletions docs/getting_started/core/monitoring-your-application.md

This file was deleted.

File renamed without changes.
56 changes: 56 additions & 0 deletions docs/udfs/onnx.md
@@ -0,0 +1,56 @@
# ONNX

ONNX, or Open Neural Network Exchange, is an open-source format designed to represent machine learning models. It provides a standardized way to describe models so that they can be easily exchanged between different deep learning frameworks. ONNX is supported by various frameworks such as PyTorch, TensorFlow, Microsoft Cognitive Toolkit (CNTK), and others, allowing interoperability and flexibility in deploying models across different platforms.

Dozer supports ONNX models and allows you to deploy them as APIs, so you can use your models in production without writing any additional code. For instance, a pre-trained model can predict the probability of a particular event or score records, such as computing a customer credit score.

## Configuration

Add the following blocks to your YAML configuration to register an ONNX model.

```yaml
sql: |
  SELECT torch_jit(col1, col2) INTO output FROM input;
```
```yaml
udfs:
  - name: torch_jit
    config: !Onnx
      path: ./model.onnx
```
`torch_jit` is the function that runs the ONNX model with `col1, col2` as input, returning the result in the `output` column.

### Parameters

| **Parameter Name** | **Type** | **Description** |
|--------------------|----------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| `path` | String | Path to the ONNX model. |
## Running the ONNX Model

### Prerequisites

- Enable the ONNX feature when building Dozer:
```bash
cargo install --path dozer-cli --features onnx --locked
```
- Install the ONNX runtime:
```bash
pip install onnxruntime
```

Run the app to start ingesting data into Dozer:

```bash
dozer run app
```
### Using ONNX on Dozer Cloud

```bash
dozer cloud deploy -c dozer-config.yaml -c model.onnx
```
## Trying it out
To test an ONNX sample, clone the `dozer-samples` GitHub repository and follow the steps described [here](https://github.com/getdozer/dozer-samples/tree/main/usecases/onnx).
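Putting the pieces together, a single configuration that registers the UDF, runs the SQL, and exposes the result can be sketched as below. The source connection is omitted, and the `output` table name paired with a `!Dummy` sink (matching the endpoint schema used elsewhere in this commit) is an illustrative assumption rather than part of the sample above.

```yaml
# Sketch only: combines the sql and udfs blocks above with an endpoint entry.
# The source connection providing the `input` table is assumed to be defined elsewhere.
sql: |
  SELECT torch_jit(col1, col2) INTO output FROM input;

udfs:
  - name: torch_jit
    config: !Onnx
      path: ./model.onnx

endpoints:
  - table_name: output
    kind: !Dummy
```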
26 changes: 12 additions & 14 deletions sidebars.js
@@ -128,8 +128,18 @@ const sidebars = {
        'transforming-data/windowing'
      ]
    },
-
-
+    {
+      type: 'category',
+      label: 'User Defined Functions',
+      link: {
+        type: 'generated-index',
+        title: 'User Defined Functions',
+      },
+      items: [
+        'udfs/lambda-functions',
+        'udfs/onnx',
+      ]
+    },
    {
      type: 'category',
      label: 'Accessing Data',
@@ -145,18 +155,6 @@ const sidebars = {
        'accessing-data/authorization'
      ]
    },
-    //'lambda-functions',
-    // {
-    //   type: 'category',
-    //   label: 'Deployment',
-    //   link: {
-    //     type: 'doc',
-    //     id: 'deployment',
-    //   },
-    //   items: []
-    // }
-
-
],
};
