SimpleDataLabsInc · alexanderahn · Nov 26, 2024 · Nov 12, 2024 · Nov 14, 2024 · Nov 26, 2024
diff --git a/.../SQL/gems/Transformations/_category_.json → docs/SQL/gems/transform/_category_.json b/.../SQL/gems/Transformations/_category_.json → docs/SQL/gems/transform/_category_.json
@@ -1,5 +1,5 @@
 {
-  "label": "Transformations",
+  "label": "Transform",
   "position": 1,
   "collapsible": true,
   "collapsed": true

diff --git a/docs/SQL/gems/Transformations/aggregate.md → docs/SQL/gems/transform/aggregate.md b/docs/SQL/gems/Transformations/aggregate.md → docs/SQL/gems/transform/aggregate.md
diff --git a/docs/SQL/gems/transform/deduplicate.md b/docs/SQL/gems/transform/deduplicate.md
@@ -0,0 +1,98 @@
+---
+title: Deduplicate
+id: deduplicate
+description: Remove rows with duplicate values of specified columns
+sidebar_position: 3
+tags:
+  - gems
+  - dedupe
+  - distinct
+  - unique
+---
+
+Removes rows with duplicate values of specified columns.
+
+## Parameters
+
+| Parameter              | Description                                                                                                                                                                                                                                                                                                                  | Required |
+| :--------------------- | :--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | :------- |
+| Source                 | Input source                                                                                                                                                                                                                                                                                                                 | True     |
+| Row to keep            | - `Distinct Rows`: Keeps all distinct rows. This is equivalent to performing a `select distinct` operation <br/>- `Unique Only`: Keeps rows that don't have duplicates <br/>- `First`: Keeps first occurrence of the duplicate row <br/>- `Last`: Keeps last occurrence of the duplicate row <br/>Default is `Distinct Rows` | True     |
+| Deduplicate On Columns | Columns to consider while removing duplicate rows (not required for `Distinct Rows`)                                                                                                                                                                                                                                         | True     |
+
+## Row to keep options
+
+As described in the Parameters table, the Deduplicate Gem offers four **Row to keep** options.
+
+![Deduplicate row to keep](./img/deduplicate_row_to_keep.png)
+
+In the Code view, you can see that the Deduplicate Gem contains `SELECT DISTINCT *` when using the `Distinct Rows` option.
+
+![Deduplicate code view](./img/deduplicate_code_view.png)
+
+## Example
+
+Suppose you're deduplicating the following table.
+
+| First_Name | Last_Name | Type  | Contact           |
+| :--------- | :-------- | :---- | :---------------- |
+| John       | Doe       | phone | 123-456-7890      |
+| John       | Doe       | phone | 123-456-7890      |
+| John       | Doe       | phone | 123-456-7890      |
+| Alice      | Johnson   | phone | 246-135-0987      |
+| Alice      | Johnson   | phone | 246-135-0987      |
+| Alice      | Johnson   | email | [email protected] |
+| Alice      | Johnson   | email | [email protected] |
+| Bob        | Smith     | email | [email protected]     |
+
+For `Distinct Rows`, the interim data will show the following:
+
+| First_Name | Last_Name | Type  | Contact           |
+| :--------- | :-------- | :---- | :---------------- |
+| John       | Doe       | phone | 123-456-7890      |
+| Alice      | Johnson   | phone | 246-135-0987      |
+| Alice      | Johnson   | email | [email protected] |
+| Bob        | Smith     | email | [email protected]     |
+
+The `First` and `Last` options work similarly to `Distinct Rows`, but they keep the first and last occurrence of each duplicated row, respectively.
+
+For `Unique Only`, the interim data will look like the following:
+
+| First_Name | Last_Name | Type  | Contact       |
+| :--------- | :-------- | :---- | :------------ |
+| Bob        | Smith     | email | [email protected] |
+
+Only one row remains, because every other row has at least one duplicate.
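+
+As a rough sketch of these two results in plain SQL (not necessarily the code the Gem generates), assuming the example table is stored under the hypothetical name `contacts` and using Snowflake's `QUALIFY` clause:
+
+```sql
+-- Distinct Rows: collapse identical rows into one
+SELECT DISTINCT *
+FROM contacts;
+
+-- Unique Only: keep rows whose full set of values occurs exactly once
+SELECT *
+FROM contacts
+QUALIFY COUNT(*) OVER (
+  PARTITION BY First_Name, Last_Name, Type, Contact
+) = 1;
+```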
+
+---
+
+To deduplicate on a subset of columns, add `First_Name` and `Last_Name` to **Deduplicate On Columns**.
+
+For `Distinct Rows`, the interim data will show the following:
+
+| First_Name | Last_Name |
+| :--------- | :-------- |
+| John       | Doe       |
+| Alice      | Johnson   |
+| Bob        | Smith     |
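+
+In SQL terms, this result corresponds roughly to selecting only the chosen columns with `DISTINCT`. A minimal sketch, assuming the hypothetical table name `contacts` (the Gem's generated code may differ):
+
+```sql
+-- Distinct Rows on First_Name and Last_Name
+SELECT DISTINCT First_Name, Last_Name
+FROM contacts;
+```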
+
+:::note
+
+For `First`, `Last`, and `Unique Only`, the interim data contains all columns, regardless of which columns you added to Deduplicate On Columns.
+
+For `First` and `Last`, the interim data will look like the following:
+
+| First_Name | Last_Name | Type  | Contact           |
+| :--------- | :-------- | :---- | :---------------- |
+| John       | Doe       | phone | 123-456-7890      |
+| Alice      | Johnson   | phone | 246-135-0987      |
+| Alice      | Johnson   | email | [email protected] |
+| Bob        | Smith     | email | [email protected]     |
+
+For `Unique Only`, the interim data will look like the following:
+
+| First_Name | Last_Name | Type  | Contact       |
+| :--------- | :-------- | :---- | :------------ |
+| Bob        | Smith     | email | [email protected] |
+
+:::
diff --git a/docs/SQL/gems/transform/flattenschema.md b/docs/SQL/gems/transform/flattenschema.md
@@ -0,0 +1,68 @@
+---
+title: Flatten Schema
+id: flattenschema
+description: Flatten nested data
+sidebar_position: 4
+tags:
+  - gems
+  - schema
+  - explode
+  - flatten
+---
+
+When processing raw data, it can be useful to flatten complex data types like `Struct`s and `Array`s into simpler, flatter schemas. This allows you to preserve all schemas, and not just the first one. You can use FlattenSchema with Snowflake Models.
+
+![The FlattenSchema gem](./img/flatten_gem.png)
+
+## The Input
+
+FlattenSchema works on Snowflake sources that have nested columns that you'd like to extract into a flat schema.
+
+For example, consider the following input schema:
+
+![Input schema](./img/flatten_input.png)
+
+The data looks like this:
+
+![Input data](./img/flatten_input_interim.png)
+
+We want to extract `contact` and all of the columns from the `struct`s in `content` into a flattened schema.
+
+## The Expressions
+
+After adding a `FlattenSchema` Gem to your Model, click the column names you want to extract. They are added to the `Expressions` section.
+
+:::tip
+
+You can click to add all columns, which makes all nested leaf-level values of an object visible as columns.
+
+:::
+
+Once a column is added, you can edit its `Output Column` value to rename the column in the output.
+
+![Adding expressions](./img/flatten_add_exp.png)
+
+## The Output
+
+If you check the `Output` tab in the Gem, you'll see the schema created from the selected columns.
+
+And here's what the output data looks like:
+
+![Output interim](./img/flatten_output_interim.png)
+
+The nested contact information has been flattened so that you have individual rows for each content type.
+
+## Advanced settings
+
+If you're familiar with Snowflake's `FLATTEN` table function, you can use the advanced settings to customize the optional column arguments.
+
+To use the advanced settings, hover over a column, and click the dropdown arrow.
+
+![Advanced settings](./img/flatten_advanced_settings.png)
+
+You can customize the following options:
+
+- Path to the element: The path to the element within the variant data structure that you want to flatten.
+- Flatten all elements recursively: If set to `false`, only the element mentioned in the path is expanded. If set to `true`, all sub-elements are expanded recursively. This is set to `false` by default.
+- Preserve rows with missing fields: If set to `false`, rows with missing fields are omitted from the output. If set to `true`, rows with missing fields are generated with `null` in the key, index, and value columns. This is set to `false` by default.
+- Datatype that needs to be flattened: The data type that you want to flatten. You can choose `Object`, `Array`, or `Both`. This is set to `Both` by default.
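+
+These options appear to correspond to the optional arguments of Snowflake's `FLATTEN` function (`PATH`, `RECURSIVE`, `OUTER`, and `MODE`). That mapping is an assumption and is not confirmed by the Gem's generated code here. A minimal sketch, assuming a hypothetical source table `raw_contacts` with the `contact` and `content` columns from the example:
+
+```sql
+SELECT
+  t.contact,
+  f.key,    -- field name when the flattened element is an object
+  f.index,  -- position when the flattened element is an array
+  f.value   -- the extracted element
+FROM raw_contacts AS t,
+  LATERAL FLATTEN(
+    INPUT => t.content,
+    -- PATH => '<path>' would target a nested element (omitted here)
+    RECURSIVE => FALSE,  -- Flatten all elements recursively
+    OUTER => FALSE,      -- Preserve rows with missing fields
+    MODE => 'BOTH'       -- Datatype that needs to be flattened
+  ) AS f;
+```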
diff --git a/...sformations/img/deduplicate_code_view.png → ...s/transform/img/deduplicate_code_view.png b/...sformations/img/deduplicate_code_view.png → ...s/transform/img/deduplicate_code_view.png
diff --git a/...ormations/img/deduplicate_row_to_keep.png → ...transform/img/deduplicate_row_to_keep.png b/...ormations/img/deduplicate_row_to_keep.png → ...transform/img/deduplicate_row_to_keep.png
diff --git a/...s/transformations/img/flatten_add_exp.png → ...QL/gems/transform/img/flatten_add_exp.png b/...s/transformations/img/flatten_add_exp.png → ...QL/gems/transform/img/flatten_add_exp.png
diff --git a/...mations/img/flatten_advanced_settings.png → ...ansform/img/flatten_advanced_settings.png b/...mations/img/flatten_advanced_settings.png → ...ansform/img/flatten_advanced_settings.png
diff --git a/.../gems/transformations/img/flatten_gem.png → docs/SQL/gems/transform/img/flatten_gem.png b/.../gems/transformations/img/flatten_gem.png → docs/SQL/gems/transform/img/flatten_gem.png
diff --git a/...ems/transformations/img/flatten_input.png → .../SQL/gems/transform/img/flatten_input.png b/...ems/transformations/img/flatten_input.png → .../SQL/gems/transform/img/flatten_input.png
diff --git a/...sformations/img/flatten_input_interim.png → ...s/transform/img/flatten_input_interim.png b/...sformations/img/flatten_input_interim.png → ...s/transform/img/flatten_input_interim.png
diff --git a/...formations/img/flatten_output_interim.png → .../transform/img/flatten_output_interim.png b/...formations/img/flatten_output_interim.png → .../transform/img/flatten_output_interim.png
diff --git a/...ms/Transformations/sql-transformations.md → docs/SQL/gems/transform/transform.md b/...ms/Transformations/sql-transformations.md → docs/SQL/gems/transform/transform.md
@@ -1,6 +1,6 @@
 ---
-title: SQL Transformations
-id: sql-transformations
+title: Transform
+id: transform
 description: Data transformation steps in SQL
 sidebar_position: 1
 tags:

diff --git a/docs/getting-started/getting-started-sql-snowflake.md b/docs/getting-started/getting-started-sql-snowflake.md
@@ -251,7 +251,7 @@ Here we create a `customers_nations` model that’s going to enrich our customer
 
 The `customers_nations` model is stored as a `.sql` file on Git. The table or view defined by the model is stored on the SQL warehouse, database, and schema defined in the attached Fabric.
 
-Suggestions are provided each step of the way. If Copilot's suggestions aren't exactly what you need, just select and configure the Gems as desired. Click [here](../SQL/gems/joins.md) for details on configuring joins or [here](../SQL/gems/transformations/sql-aggregate) for aggregations.
+Suggestions are provided each step of the way. If Copilot's suggestions aren't exactly what you need, just select and configure the Gems as desired. Click [here](../SQL/gems/joins.md) for details on configuring joins or [here](../SQL/gems/transform/aggregate.md) for aggregations.
 
 ### 4.5 Interactively Test