A Data Product must already exist so that the new components can be attached to it.
This section includes the basic information that any Component of Witboost must have:
- Name: Required name used for display purposes on your Data Product
- Fully Qualified Name: The fully qualified name of the Workload. This field is optional; if you leave it empty, the system generates it for you.
- Description: A short description to help others understand what this Workload is for.
- Domain: The Domain of the Data Product this Workload belongs to. Be sure to choose it correctly, as it is a fundamental part of the Workload and cannot be changed afterwards.
- Data Product: The Data Product this Workload belongs to. Be sure to choose the right one.
- Identifier: Unique ID for this new entity inside the domain. You don't need to fill in this field; it is filled in automatically.
- Development Group: Development group of this Data Product. You don't need to fill in this field; it is filled in automatically.
- Depends On: Optional. Use this field if you want your Workload to depend on other components of the Data Product.
Example:
Field name | Example value |
---|---|
Name | Snowflake Vaccinations SQL Workload |
Description | Uploads the transformation SQL file to calculate vaccination values |
Domain | domain:healthcare |
Data Product | system:healthcare.vaccinationsdp.0 |
Identifier | Will look something like this: healthcare.vaccinationsdp.0.snowflake-vaccinations-sql-workload |
Development Group | Might look something like this: group:datameshplatform. Depends on the Data Product development group |
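
The example values above suggest how the Identifier relates to the other fields. The following is a minimal sketch, assuming the identifier is the Data Product reference without its `system:` prefix, followed by a lowercased, hyphen-separated form of the component Name. The function names `slugify` and `component_identifier` are hypothetical and only illustrate the pattern seen in the example table; the actual generation logic in Witboost may differ.

```python
import re


def slugify(name: str) -> str:
    """Lowercase the name and join its alphanumeric runs with hyphens."""
    return "-".join(re.findall(r"[a-z0-9]+", name.lower()))


def component_identifier(data_product_ref: str, component_name: str) -> str:
    """Build an identifier shaped like the one in the example table.

    Assumption: the reference looks like "system:<domain>.<name>.<majorVersion>".
    """
    # Keep only the part after the last ":" (drops the "system:" kind prefix).
    _, _, data_product_id = data_product_ref.rpartition(":")
    return f"{data_product_id}.{slugify(component_name)}"


if __name__ == "__main__":
    print(
        component_identifier(
            "system:healthcare.vaccinationsdp.0",
            "Snowflake Vaccinations SQL Workload",
        )
    )
```

With the example values from the table, this sketch produces the same string shown in the Identifier row.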
Snowflake SQL file name: The name of the .sql file located in the artifacts/ folder of the component. You can choose any name for your .sql file, but it must be saved in the artifacts/ folder for the component to recognize it.
Be aware that we provide a default SQL file based on the default values used for the components in this tutorial. If, at any point in the tutorial, you use your own custom values for the Snowflake Storage Database and Schema, the Airbyte Dataset name, or the Output Port Table Name, you should modify the SQL script to reflect these changes as soon as the repository is created at the end of this form (the CI will take care of the rest!).
You can always upload your own SQL script to the repository with your own transformation of the data. If you want to use this component with MWAA to read from the Output Port of an external Data Product or from an existing table, you can create an MWAA component and, inside its Python DAG file, write the Operator(s) required to trigger the SQL transformation defined in the SQL file of this component.
Example:
Field name | Example value |
---|---|
Source name | snowflake.sql |
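
Since the SQL file must live in the artifacts/ folder, it can help to check this locally before pushing changes to the repository. The following is a minimal sketch, assuming you have a local checkout of the component repository; `find_sql_artifact` and `component_root` are hypothetical names used only for illustration and are not part of Witboost.

```python
from pathlib import Path


def find_sql_artifact(component_root: Path, sql_file_name: str) -> Path:
    """Return the path of the SQL file inside artifacts/, or raise if it is missing."""
    # The form expects a bare file name such as "snowflake.sql", not a path.
    if Path(sql_file_name).name != sql_file_name:
        raise ValueError(f"expected a file name, not a path: {sql_file_name!r}")
    if not sql_file_name.lower().endswith(".sql"):
        raise ValueError(f"expected a .sql file: {sql_file_name!r}")

    candidate = component_root / "artifacts" / sql_file_name
    if not candidate.is_file():
        raise FileNotFoundError(f"{candidate} not found; save the file in artifacts/")
    return candidate


if __name__ == "__main__":
    # Assumes the script is run from the root of the component repository.
    print(find_sql_artifact(Path.cwd(), "snowflake.sql"))
```

A check like this only confirms the file location and extension; it does not validate the SQL itself.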
Be aware that we followed certain rules for naming the specifics inside the catalog-info file (for example, adding suffixes to schemaName and appending majorVersion to it), regardless of whether the value is a default or a custom one. If at any point you have doubts about these names, please refer to the catalog-info file of Snowflake Storage, as it is the starting point of the tutorial.