Standardize Spark Gem format #450

Merged · 3 commits · Nov 27, 2024
2 changes: 1 addition & 1 deletion docs/Spark/gems/custom/delta-table-operations.md
@@ -1,6 +1,6 @@
---
sidebar_position: 4
-title: Delta Table Operations
+title: DeltaTableOperations
id: delta-ops
description: Gem that encompasses some of the import side operations of Delta
tags:
4 changes: 2 additions & 2 deletions docs/Spark/gems/custom/file-operation.md
@@ -1,14 +1,14 @@
---
sidebar_position: 3
-title: File Operation
+title: FileOperation
id: file-operations
description: Perform file operations on different file systems
tags:
- file
- dbfs
---

-Helps perform file operations like `copy` and `move` on different file systems
+Helps perform file operations like `copy` and `move` on different file systems.
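
For comparison, the same operations can be hand-written. A minimal sketch, assuming a Databricks environment where `dbutils` is available; the paths are placeholders:

```python
# Copy a file between locations on DBFS; dbutils is provided by Databricks.
dbutils.fs.cp("dbfs:/raw/orders.csv", "dbfs:/staging/orders.csv")

# Move (copy + delete) a file to an archive folder.
dbutils.fs.mv("dbfs:/staging/orders.csv", "dbfs:/archive/orders.csv")
```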

## Parameters

2 changes: 1 addition & 1 deletion docs/Spark/gems/custom/rest-api-enrich.md
@@ -1,6 +1,6 @@
---
sidebar_position: 5
-title: Rest API Enrich
+title: RestAPIEnrich
id: rest-api-enrich
description: Enrich DataFrame with content from rest API response based on configuration
tags:
4 changes: 2 additions & 2 deletions docs/Spark/gems/custom/sql-statement.md
@@ -1,6 +1,6 @@
---
sidebar_position: 1
-title: SQL Statement
+title: SQLStatement
id: sql-statement
description: Create DataFrames based on custom SQL queries
tags:
@@ -9,7 +9,7 @@ tags:
- custom
---

-Create one or more DataFrame(s) based on provided SQL queries to run against one or more input DataFrame(s).
+Create one or more DataFrame(s) based on provided SQL queries to run against one or more input DataFrames.
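
For comparison, a hand-written PySpark equivalent might look like the following sketch; the view name `in0`, the input DataFrame, and the query are placeholders:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Expose the input DataFrame to SQL as a temporary view.
in0.createOrReplaceTempView("in0")

# Each query produces one output DataFrame.
out0 = spark.sql("SELECT customer_id, SUM(amount) AS total FROM in0 GROUP BY customer_id")
```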

### Parameters

4 changes: 2 additions & 2 deletions docs/Spark/gems/join-split/compare-columns.md
@@ -1,6 +1,6 @@
---
sidebar_position: 4
-title: Compare Columns
+title: CompareColumns
id: compare-columns
description: Compare columns between two dataframes
tags:
@@ -10,7 +10,7 @@ tags:
- compare-columns
---

-Compare columns between two DataFrame based on the key id columns defined
+The CompareColumns Gem lets you compare columns between two DataFrames based on the key id columns defined.
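
Roughly equivalent hand-written PySpark, assuming hypothetical inputs `in0` and `in1` keyed by an `id` column, with a single compared column `amount`:

```python
from pyspark.sql import functions as F

# Join both inputs on the key column, then flag value mismatches per column.
joined = in0.alias("a").join(in1.alias("b"), on="id", how="outer")

compared = joined.select(
    "id",
    F.col("a.amount").alias("amount_left"),
    F.col("b.amount").alias("amount_right"),
    (F.col("a.amount") == F.col("b.amount")).alias("amount_matches"),
)
```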

## Parameters

4 changes: 2 additions & 2 deletions docs/Spark/gems/join-split/row-distributor.md
@@ -1,6 +1,6 @@
---
sidebar_position: 3
-title: Row Distributor
+title: RowDistributor
id: row-distributor
description: Create multiple DataFrames based on filter conditions
tags:
@@ -10,7 +10,7 @@ tags:
- row distributor
---

-Create multiple DataFrames based on provided filter conditions from an input DataFrame.
+Use the RowDistributor Gem to create multiple DataFrames based on provided filter conditions from an input DataFrame.

This is useful for cases where rows from the input DataFrame need to be distributed into multiple DataFrames in different ways for downstream Gems.
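
A rough hand-written equivalent, with hypothetical filter conditions on a `country` column of an input `in0`:

```python
from pyspark.sql import functions as F

# Each filter condition yields its own output DataFrame from the same input.
out0 = in0.filter(F.col("country") == "US")
out1 = in0.filter(F.col("country") == "CA")
out2 = in0.filter(~F.col("country").isin("US", "CA"))
```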

16 changes: 8 additions & 8 deletions docs/Spark/gems/machine-learning/ml-pinecone-lookup.md
@@ -1,6 +1,6 @@
---
sidebar_position: 3
-title: Pinecone Lookup
+title: PineconeLookup
id: ml-pinecone-lookup
description: Lookup a vector embedding from a Pinecone Database
tags: [generative-ai, machine-learning, llm, pinecone, openai]
@@ -14,7 +14,7 @@ tags: [generative-ai, machine-learning, llm, pinecone, openai]

<br />

-The Pinecone Lookup Gem identifies content that is similar to a provided vector embedding. The Gem calls the Pinecone API and returns a set of IDs with highest similarity to the provided embedding.
+The PineconeLookup Gem identifies content that is similar to a provided vector embedding. The Gem calls the Pinecone API and returns a set of IDs with highest similarity to the provided embedding.
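
Under the hood this corresponds to a Pinecone similarity query. A minimal sketch, assuming the classic `pinecone-client` Python package; the API key, environment, index name, and embedding are all placeholders:

```python
import pinecone

pinecone.init(api_key="YOUR_API_KEY", environment="us-east-1-aws")
index = pinecone.Index("my-index")  # hypothetical index name

query_embedding = [0.12, -0.45, 0.33]  # placeholder; real embeddings have many more dimensions

# Return the 3 IDs most similar to the query embedding.
result = index.query(vector=query_embedding, top_k=3)
# result["matches"] -> e.g. [{"id": "web-223", "score": 0.84}, ...]
```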

- [**Parameters:**](https://docs.prophecy.io/Spark/gems/machine-learning/ml-pinecone-lookup#gem-parameters) Configure the parameters needed to call the Pinecone API.

@@ -40,15 +40,15 @@ Hardcoding the Pinecone credential is not recommended. Selecting this option cou

#### Properties

-Pinecone DB uses indexing to map the vectors to a data structure that will enable faster searching. The Pinecone Lookup Gem searches through a Pinecone index to identify embeddings with similarity to the input embedding. Enter the Pinecone **[(4) Index name](https://docs.prophecy.io/Spark/gems/machine-learning/ml-pinecone-lookup#faq)** which you’d like to use for looking up embeddings.
+Pinecone DB uses indexing to map the vectors to a data structure that will enable faster searching. The PineconeLookup Gem searches through a Pinecone index to identify embeddings with similarity to the input embedding. Enter the Pinecone **[(4) Index name](https://docs.prophecy.io/Spark/gems/machine-learning/ml-pinecone-lookup#faq)** which you’d like to use for looking up embeddings.

-Select one of the Gem’s input columns with vector embeddings as the **(5) Vector column** to send to Pinecone’s API. The column [must](https://docs.prophecy.io/Spark/gems/machine-learning/ml-pinecone-lookup#input) be compatible with the Pinecone Index. To change the column’s datatype and properties, [configure](https://docs.prophecy.io/Spark/gems/machine-learning/ml-pinecone-lookup#faq) the Gem(s) preceding the Pinecone Lookup Gem.
+Select one of the Gem’s input columns with vector embeddings as the **(5) Vector column** to send to Pinecone’s API. The column [must](https://docs.prophecy.io/Spark/gems/machine-learning/ml-pinecone-lookup#input) be compatible with the Pinecone Index. To change the column’s datatype and properties, [configure](https://docs.prophecy.io/Spark/gems/machine-learning/ml-pinecone-lookup#faq) the Gem(s) preceding the PineconeLookup Gem.

Pinecone’s API can return multiple results. Depending on the use case, select the desired **(6) Number of results** sorted by similarity score. The result with highest similarity to the user’s text question will be listed first.

### Input

-Pinecone Lookup requires a model_embedding column as input. Use one of Prophecy's Machine Learning Gems to provide the model_embedding. For example, the OpenAI Gem can precede the Pinecone Lookup Gem in the Pipeline. The OpenAI Gem, configured to `Compute a text embedding`, will output an openai_embedding column. This is a suitable input for the Pinecone Lookup Gem.
+PineconeLookup requires a model_embedding column as input. Use one of Prophecy's Machine Learning Gems to provide the model_embedding. For example, the OpenAI Gem can precede the PineconeLookup Gem in the Pipeline. The OpenAI Gem, configured to `Compute a text embedding`, will output an openai_embedding column. This is a suitable input for the PineconeLookup Gem.

| Column | Description | Required |
| --------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -------- |
@@ -61,9 +61,9 @@ The output Dataset contains the pinecone_matches and pinecone_error columns. For
| Column | Description |
| ---------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| pinecone_matches | array - an array of several content IDs and their scores. Example: `[{"id":"web-223","score":0.8437653},{"id":"web-224","score":0.8403446}, ...{"id":"web-237","score":0.82916564}]` |
-| pinecone_error | string - this column is provided to show any error message returned from Pinecone’s API; helpful for troubleshooting errors related to the Pinecone Lookup Gem. |
+| pinecone_error | string - this column is provided to show any error message returned from Pinecone’s API; helpful for troubleshooting errors related to the PineconeLookup Gem. |

-Prophecy converts the visual design into Spark code available on the Prophecy user's Git repository. Find the Spark code for the Pinecone Lookup Gem below.
+Prophecy converts the visual design into Spark code available on the Prophecy user's Git repository. Find the Spark code for the PineconeLookup Gem below.

````mdx-code-block
import Tabs from '@theme/Tabs';
@@ -105,7 +105,7 @@ def vector_lookup(Spark: SparkSession, in0: DataFrame) -> DataFrame:

#### Troubleshooting

-To troubleshoot the Gem preceding Pinecone Lookup, open the data preview output from the previous Gem. For example if the embedding structure is incorrect then try adjusting the previous Gem, run, and view that Gem’s output data preview.
+To troubleshoot the Gem preceding PineconeLookup, open the data preview output from the previous Gem. For example if the embedding structure is incorrect then try adjusting the previous Gem, run, and view that Gem’s output data preview.

#### Creating a Pinecone Index

2 changes: 1 addition & 1 deletion docs/Spark/gems/machine-learning/ml-text-processing.md
@@ -1,6 +1,6 @@
---
sidebar_position: 1
-title: Text Processing
+title: TextProcessing
id: ml-text-processing
description: Text processing to prepare data to submit to a foundational model API.
tags:
2 changes: 1 addition & 1 deletion docs/Spark/gems/subgraph/basicSubgraph.md
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
---
sidebar_position: 1
-title: Basic Subgraph
+title: Basic subgraph
id: basic-subgraph
description: Basic Subgraph, Group your Gems in reusable Parent Gems.
tags:
14 changes: 7 additions & 7 deletions docs/Spark/gems/subgraph/tableIterator.md
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
---
sidebar_position: 2
-title: Table Iterator
+title: TableIterator
id: table-iterator
description: Loop over each row of an input Dataframe
tags:
@@ -9,21 +9,21 @@ tags:
- iterator
---

-Table Iterator allows you to iterate over one or more Gems for each row of the first input DataFrame.
+TableIterator allows you to iterate over one or more Gems for each row of the first input DataFrame.
Let's see how to create a Basic Loop that loops over a Metadata Table and, for each row of the table, runs the Gems inside the Subgraph.
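
Conceptually, the loop behaves like the following sketch; `metadata_df`, the `table_name` column, and `run_subgraph` are placeholders, with `run_subgraph` standing in for the Gems inside the Subgraph:

```python
# Run the Subgraph body once per row of the first input DataFrame.
for row in metadata_df.collect():
    table_name = row["table_name"]  # any column of the row can be referenced inside the loop
    run_subgraph(table_name)        # stand-in for the Gems inside the Subgraph
```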

-## Creating a Table Iterator Gem
+## Creating a TableIterator Gem

First, add the Input Gem that you want to iterate over. For this, simply use an existing Dataset or create a new [Source Gem](/docs/Spark/gems/source-target/source-target.md) pointing to your Metadata table.
You can run this Source Gem to preview the data your loop will iterate over.

-Now, Drag and Drop the **(1) Table Iterator** Gem from the Subgraph menu, and connect it to the above created Source Gem.
+Now, Drag and Drop the **(1) TableIterator** Gem from the Subgraph menu, and connect it to the above created Source Gem.

![Create_table_iterator](img/Create_table_iterator.png)

-## Configure the Table Iterator
+## Configure the TableIterator

-Open the Table Iterator Gem, and click on **(1) Configure** to open the Settings dialog.
+Open the TableIterator Gem, and click on **(1) Configure** to open the Settings dialog.
Here, on the left side panel, you can edit the **(2) Name** of your Gem and check the **(3) Input Schema** for your DataFrame on which the loop will iterate.

On the right side, you can define your Iterator Settings, and any other Subgraph Configs you want to use in the Subgraph.
@@ -70,7 +70,7 @@ Click on the **(2) Iteration** button, and it will open up the Iterations table

## Adding Inputs and Outputs to TableIterator

-For a Table Iterator Gem, the first input port is for your DataFrame on which you want to Iterate Over.
+For a TableIterator Gem, the first input port is for your DataFrame on which you want to Iterate Over.
You can **(1) Add** more Inputs or switch to the **(2) Output** tab to add more Outputs as needed. These extra inputs do not change between iterations.
Also, the output will be a union of the outputs of all iterations. You can **(3) Delete** any port by hovering over it.
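
As a sketch, the union of iteration outputs can be pictured like this, with `run_subgraph` again a placeholder for the Subgraph body:

```python
from functools import reduce

# Collect one output DataFrame per iteration, then union them all.
outputs = [run_subgraph(row) for row in metadata_df.collect()]
result = reduce(lambda a, b: a.unionByName(b), outputs)
```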

6 changes: 3 additions & 3 deletions docs/Spark/gems/transform/bulk-column-expressions.md
@@ -1,6 +1,6 @@
---
sidebar_position: 11
-title: Bulk Column Expressions
+title: BulkColumnExpressions
id: bulk-column-expressions
description: Change the data type of multiple columns at once.
tags:
@@ -9,7 +9,7 @@ tags:
- columns
---

-The Bulk Column Expressions Gem primarily lets you cast or change the data type of multiple columns at once. It provides additional functionality, including:
+The BulkColumnExpressions Gem primarily lets you cast or change the data type of multiple columns at once. It provides additional functionality, including:

- Adding a prefix or suffix to selected columns.
- Applying a custom expression to selected columns.
@@ -28,7 +28,7 @@ The Bulk Column Expressions Gem primarily lets you cast or change the data type

Assume you have some columns in a table that represent zero-based indices and are stored as long data types. You want them to represent one-based indices and be stored as integers to optimize memory use.

-Using the Bulk Column Expressions Gem, you can:
+Using the BulkColumnExpressions Gem, you can:

- Filter your columns by long data types.
- Select the columns you wish to transform.
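
A hand-written PySpark sketch of the example above (zero-based `bigint` index columns shifted to one-based integers); `df` is a placeholder:

```python
from pyspark.sql import functions as F

# Find the long (bigint) columns, then shift to one-based and downcast.
index_cols = [name for name, dtype in df.dtypes if dtype == "bigint"]
for name in index_cols:
    df = df.withColumn(name, (F.col(name) + 1).cast("int"))
```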
4 changes: 2 additions & 2 deletions docs/Spark/gems/transform/bulk-column-rename.md
@@ -1,6 +1,6 @@
---
sidebar_position: 10
-title: Bulk Column Rename
+title: BulkColumnRename
id: bulk-column-rename
description: Rename multiple columns in your Dataset in a systematic way.
tags:
@@ -9,7 +9,7 @@ tags:
- columns
---

-Use the Bulk Column Rename Gem to rename multiple columns in your Dataset in a systematic way.
+Use the BulkColumnRename Gem to rename multiple columns in your Dataset in a systematic way.
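
A minimal hand-written equivalent of one such systematic rename, adding a hypothetical `src_` prefix to every column of a placeholder `df`:

```python
# Rename all columns at once by rebuilding the DataFrame with new names.
renamed = df.toDF(*[f"src_{name}" for name in df.columns])
```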

## Parameters

6 changes: 3 additions & 3 deletions docs/Spark/gems/transform/data-cleansing.md
@@ -1,6 +1,6 @@
---
sidebar_position: 12
-title: Data Cleansing
+title: DataCleansing
id: data-cleansing
description: Standardize data formats and address missing or null values in the data.
tags:
@@ -9,7 +9,7 @@ tags:
- format
---

-Use the Data Cleansing Gem to standardize data formats and address missing or null values in the data.
+Use the DataCleansing Gem to standardize data formats and address missing or null values in the data.

## Parameters

@@ -22,6 +22,6 @@ Use the Data Cleansing Gem to standardize data formats and address missing or nu

## Example

-Assume you have a table that includes customer feedback on individual orders. In this scenario, some customers may not provide feedback, resulting in null values in the data. You can use the Data Cleansing Gem to replace null values with the string `NA`.
+Assume you have a table that includes customer feedback on individual orders. In this scenario, some customers may not provide feedback, resulting in null values in the data. You can use the DataCleansing Gem to replace null values with the string `NA`.
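
The hand-written replacement for this example is a one-liner; the `feedback` column name is hypothetical:

```python
# Replace nulls in the feedback column with the string "NA".
cleansed = df.fillna({"feedback": "NA"})
```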

![Replace null with string](./img/replace-null-with-string.png)
8 changes: 4 additions & 4 deletions docs/Spark/gems/transform/dynamic-select.md
@@ -1,6 +1,6 @@
---
sidebar_position: 13
-title: Dynamic Select
+title: DynamicSelect
id: dynamic-select
description: Dynamically filter columns of your dataset based on a set of conditions.
tags:
@@ -9,11 +9,11 @@ tags:
- dynamic
---

-Use the Dynamic Select Gem to dynamically filter columns of your Dataset based on a set of conditions.
+Use the DynamicSelect Gem to dynamically filter columns of your Dataset based on a set of conditions.

## Configuration

-There are two ways to configure the Dynamic Select.
+There are two ways to configure the DynamicSelect.

| Configuration | Description |
| --------------------- | --------------------------------------------------------------------------------------------- |
@@ -22,7 +22,7 @@ There are two ways to configure the Dynamic Select.

## Examples

-You’ll use Dynamic Select when you want to avoid hard-coding your choice of columns. In other words, rather than define each column to keep in your Pipeline, you let the system automatically choose the columns based on certain conditions or rules.
+You’ll use DynamicSelect when you want to avoid hard-coding your choice of columns. In other words, rather than define each column to keep in your Pipeline, you let the system automatically choose the columns based on certain conditions or rules.
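
In plain PySpark, that kind of rule-based selection looks roughly like this sketch, here keeping every column that is not a date or timestamp; `df` is a placeholder:

```python
# Choose columns by data type instead of hard-coding their names.
keep = [name for name, dtype in df.dtypes if dtype not in ("date", "timestamp")]
df = df.select(*keep)
```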

### Remove date columns using field type

12 changes: 6 additions & 6 deletions docs/Spark/gems/transform/flattenschema.md
@@ -1,6 +1,6 @@
---
sidebar_position: 5
-title: Flatten Schema
+title: FlattenSchema
id: flatten-schema
description: Flatten nested data
tags:
@@ -10,7 +10,7 @@ tags:
- flatten
---

-When processing raw data it can be useful to flatten complex data types like `Struct`s and `Array`s into simpler, flatter schemas.
+When processing raw data it can be useful to flatten complex data types like structures and arrays into simpler, flatter schemas.

![The FlattenSchema gem](./img/flatten_gem.png)

@@ -26,19 +26,19 @@ And the data looks like so:

![Input data](./img/flatten_input_interim.png)

-We want to extract `count`, and all of the columns from the `struct`s in `events` into a flattened schema.
+We want to extract `count` from _result_ and all of the columns from _events_ into a flattened schema.
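
In hand-written PySpark, flattening typically combines `explode` for arrays with struct-field selection. A sketch, assuming a `result` struct holding `count` and an `events` array of structs, as in the example above:

```python
from pyspark.sql import functions as F

# One output row per array element; struct fields become top-level columns.
flat = (
    df
    .withColumn("event", F.explode("events"))
    .select(F.col("result.count").alias("count"), "event.*")
)
```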

## The Expressions

-Having added a `FlattenSchema` Gem to your Pipeline, all you need to do is click the column names you wish to extract and they'll be added to the `Expressions` section. Once added you can change the `Target Column` for a given row to change the name of the Column in the output.
+Having added a FlattenSchema Gem to your Pipeline, all you need to do is click the column names you wish to extract and they'll be added to the **Expressions** section. Then, you can change the values in the **Target Column** to change the name of output columns.

![Adding Expressions](./img/flatten_add_exp.gif)

-The `Columns Delimiter` dropdown allows you to control how the names of the new columns are derived. Currently dashes and underscores are supported.
+The **Columns Delimiter** dropdown allows you to control how the names of the new columns are derived. Currently dashes and underscores are supported.

## The Output

-If we check the `Output` tab in the Gem, you'll see the schema that we've created using the selected columns.
+If we check the **Output** tab in the Gem, you'll see the schema that we've created using the selected columns.

![Output schema](./img/flatten_output.png)

2 changes: 1 addition & 1 deletion docs/Spark/gems/transform/order-by.md
@@ -1,6 +1,6 @@
---
sidebar_position: 3
-title: Order By
+title: OrderBy
id: order-by
description: Sort your data based on one or more Columns
tags:
8 changes: 4 additions & 4 deletions docs/Spark/gems/transform/schema-transform.md
@@ -1,6 +1,6 @@
---
sidebar_position: 5
-title: Schema Transform
+title: SchemaTransform
id: schema-transform
description: Add, Edit, Rename or Drop Columns
tags:
@@ -80,19 +80,19 @@ object transform {

## Advanced Import

-The Advanced Import feature allows you to bulk import statements that are structured similarly to CSV/TSV files. This can be useful if you have your expressions/transformation logic in another format and just want to quickly configure a `Schema Transform` Gem based on existing logic.
+The Advanced Import feature allows you to bulk import statements that are structured similarly to CSV/TSV files. This can be useful if you have your expressions/transformation logic in another format and just want to quickly configure a SchemaTransform Gem based on existing logic.

### Using Advanced Import

-1. Click the `Advanced` button in the `Schema Transform` Gem UI
+1. Click the **Advanced** button in the SchemaTransform Gem UI

![Advanced import toggle](./img/schematransform_advanced_1.png)

2. Enter the expressions into the text area using the format as described below:

![Advanced import mode](./img/schematransform_advanced_2.png)

-3. Use the button at the top (labeled `Expressions`) to switch back to the expressions view. This will translate the expressions from the CSV format to the table format and will show any errors detected.
+3. Use the button at the top (labeled **Expressions**) to switch back to the expressions view. This will translate the expressions from the CSV format to the table format and will show any errors detected.

### Format

4 changes: 2 additions & 2 deletions docs/Spark/gems/transform/set-operation.md
@@ -1,6 +1,6 @@
---
sidebar_position: 8
-title: Set Operation
+title: SetOperation
id: set-operation
description: Union, Intersect and Difference
tags:
@@ -11,7 +11,7 @@ tags:
- difference
---

-Allows you to perform addition or subtraction of rows from DataFrames with identical schemas and different data.
+Use the SetOperation Gem to perform addition or subtraction of rows from DataFrames with identical schemas and different data.
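
The corresponding hand-written PySpark operations, for two placeholder inputs `in0` and `in1` with identical schemas:

```python
# Union keeps all rows; intersect keeps common rows; subtract removes in1's rows.
union_df     = in0.unionByName(in1)
intersect_df = in0.intersect(in1)
subtract_df  = in0.subtract(in1)
```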

### Parameters
