Prerequisites

A Data Product must already exist so that the new component can be attached to it.

Creation Wizard

The Creation Wizard allows you to create a new Snowflake Storage Area.

Component metadata

This section covers the basic information that any component must have.

  • Name: Required. The name of the component.
  • Fully Qualified Name: Fully qualified name of the component.
  • Description: Required. Help others understand what this component is for. What data will it store?
  • Domain: Required. Domain of the Data Product this component belongs to. Be sure to choose it correctly as otherwise you won't find your Data Product below.
  • Data Product: Required. Data Product this component belongs to. Be sure to choose the right one as it cannot be changed.
  • Identifier: Autogenerated from the information above. A unique identifier for the component. It will not be editable after creation and is a string composed of [a-zA-Z] separated by any of [-_].
  • Development Group: Automatically selected from the Data Product metadata. Data Product development group.
  • Depends On: A component could depend on other components in the same Data Product. This information will be used to deploy the components in such an order that their dependencies already exist.
  • Tags: Tags for the component.
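
The deployment ordering described under Depends On can be illustrated with a topological sort. This is only a sketch of the idea, not the platform's actual implementation; the component names are hypothetical and `deployment_order` is a helper introduced here for illustration.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def deployment_order(depends_on: dict[str, list[str]]) -> list[str]:
    """Return component names ordered so that dependencies come first.

    `depends_on` maps each component to the components it depends on,
    which is the "node -> predecessors" shape TopologicalSorter expects.
    A circular dependency raises graphlib.CycleError.
    """
    return list(TopologicalSorter(depends_on).static_order())


# Hypothetical components of a single Data Product.
components = {
    "snowflake-vaccinations-storage": [],
    "vaccinations-workload": ["snowflake-vaccinations-storage"],
    "vaccinations-output-port": ["vaccinations-workload"],
}

order = deployment_order(components)
print(order)
```

With this input, the storage component is listed before the workload that depends on it, and the workload before the output port.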

Example:

| Field name | Example value |
| --- | --- |
| Name | Snowflake Vaccinations Storage |
| Description | Contains data on COVID-19 Vaccinations |
| Domain | domain:healthcare |
| Data Product | system:healthcare.vaccinationsdp.0 |
| Identifier | Will look something like this: healthcare.vaccinationsdp.0.snowflake-vaccinations-storage. Depends on the name you gave to the component and the Data Product it belongs to. |
| Development Group | Will look something like this: group:datameshplatform. Depends on the Data Product development group. |
| Depends On | |
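
To make the relationship between the Name, the Data Product, and the autogenerated Identifier concrete, here is a minimal sketch that reproduces the example value above. The exact generation rule is not documented here, so the slug logic (lowercasing, replacing non-alphanumeric runs with hyphens) and the function name `component_identifier` are assumptions.

```python
import re


def component_identifier(data_product: str, name: str) -> str:
    """Derive an identifier like the one in the example (assumed rule)."""
    # Drop the kind prefix ("system:") from the Data Product reference.
    dp_id = data_product.split(":", 1)[-1]
    # Lowercase the name and collapse non-alphanumeric runs into hyphens.
    slug = re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")
    return f"{dp_id}.{slug}"


identifier = component_identifier(
    "system:healthcare.vaccinationsdp.0",
    "Snowflake Vaccinations Storage",
)
print(identifier)
# healthcare.vaccinationsdp.0.snowflake-vaccinations-storage
```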

Snowflake deployment Information

This section covers specific information related to where the Storage Area is located on Snowflake.

  • Database: Name of the database in Snowflake. If not provided, the default value (in this case, domain name) will be assigned during the creation.
  • Schema: Name of the schema inside the Snowflake database specified above. If not provided, the default value (in this case, dpname_dpversion) will be assigned during the creation.
  • Table Name: Required. Name of the table that will be created inside the Snowflake schema specified above.

Example:

| Field name | Example value |
| --- | --- |
| Database | HEALTHCARE |
| Schema | vaccinationsdp |
| Table Name | vaccinations_clean |
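
The defaulting rules for Database and Schema can be sketched as follows. This is an illustration only: `resolve_location` is a hypothetical helper, and any case normalization the platform may apply to the resulting names is not covered.

```python
from typing import Optional


def resolve_location(
    domain: str,
    dp_name: str,
    dp_version: str,
    database: Optional[str] = None,
    schema: Optional[str] = None,
) -> tuple[str, str]:
    """Return (database, schema), applying the documented defaults."""
    # Database defaults to the domain name when left empty.
    resolved_db = database or domain
    # Schema defaults to "<dpname>_<dpversion>" when left empty.
    resolved_schema = schema or f"{dp_name}_{dp_version}"
    return resolved_db, resolved_schema


# Nothing provided: both defaults apply.
defaults = resolve_location("healthcare", "vaccinationsdp", "0")
# Explicit values, as in the example above, are kept as given.
explicit = resolve_location(
    "healthcare", "vaccinationsdp", "0",
    database="HEALTHCARE", schema="vaccinationsdp",
)
print(defaults, explicit)
```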

Table schema

The schema for the Storage Area, i.e. the columns of the table.

  • Name: Required. Name of the column inside the table.
  • Description: Description for the column.
  • Data Type: Required. Data Type of the column.
  • Constraint: Type of constraint defined on the column.
  • Length: Length of a TEXT column, i.e. the maximum number of characters.
  • Precision: Total number of digits allowed (for the NUMBER Data Type). Minimum allowed precision is 1.
  • Scale: Number of digits allowed to the right of the decimal point (for the NUMBER Data Type). Minimum allowed scale is 0.
  • Business Terms: Pick any number of Business Terms to apply to the column as tags.
  • PII: Check the box to mark the column as PII-relevant with a tag.

The following data types are currently supported; they are the ones most widely used in our use cases:

  • TEXT
  • NUMBER
  • DATE
  • BOOLEAN

The list can be extended if further data types are needed in the future.
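
The column rules above can be expressed as a small validation routine. This is a sketch under stated assumptions: the `Column` class and `validate_column` function are hypothetical, and the rule that Length applies only to TEXT columns is inferred from the field description rather than confirmed behaviour.

```python
from dataclasses import dataclass
from typing import Optional

# Data types listed as supported in this document.
SUPPORTED_TYPES = {"TEXT", "NUMBER", "DATE", "BOOLEAN"}


@dataclass
class Column:
    name: str
    data_type: str
    constraint: Optional[str] = None
    length: Optional[int] = None
    precision: Optional[int] = None
    scale: Optional[int] = None


def validate_column(col: Column) -> list[str]:
    """Return a list of human-readable problems (empty if the column is valid)."""
    errors = []
    if not col.name:
        errors.append("Name is required")
    if col.data_type not in SUPPORTED_TYPES:
        errors.append(f"Unsupported data type: {col.data_type}")
    # Assumption: Length is meaningful only for TEXT columns.
    if col.length is not None and col.data_type != "TEXT":
        errors.append("Length applies only to TEXT columns")
    if col.precision is not None and col.precision < 1:
        errors.append("Precision must be at least 1")
    if col.scale is not None and col.scale < 0:
        errors.append("Scale must be at least 0")
    return errors


print(validate_column(Column("location_key", "TEXT", "PRIMARY_KEY", length=16777216)))
print(validate_column(Column("doses", "NUMBER", precision=0, scale=-1)))
```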

Example:

| Name | Data Type | Constraint | Length | Precision | Scale |
| --- | --- | --- | --- | --- | --- |
| date | DATE | PRIMARY_KEY | - | - | - |
| location_key | TEXT | PRIMARY_KEY | 16777216 | - | - |
| new_persons_vaccinated | NUMBER | - | - | 38 | 0 |
| new_persons_fully_vaccinated | NUMBER | - | - | 38 | 0 |
| new_vaccine_doses_administered | NUMBER | - | - | 38 | 0 |
| cumulative_persons_vaccinated | NUMBER | - | - | 38 | 0 |
| cumulative_persons_fully_vaccinated | NUMBER | - | - | 38 | 0 |
| cumulative_vaccine_doses_administered | NUMBER | - | - | 38 | 0 |

If multiple columns are marked as PRIMARY_KEY, a composite primary key will be created.
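
To show what a composite primary key means in practice, the sketch below builds a `CREATE TABLE` statement from a column list shaped like the example above. The SQL that the component actually emits at deployment is not shown in this document, so the statement form, the helper names, and the omission of identifier quoting are all assumptions.

```python
def column_type(col: dict) -> str:
    """Render the SQL type for one column definition."""
    if col["type"] == "TEXT" and col.get("length"):
        return f"TEXT({col['length']})"
    if col["type"] == "NUMBER" and col.get("precision") is not None:
        return f"NUMBER({col['precision']}, {col.get('scale', 0)})"
    return col["type"]


def create_table_sql(database: str, schema: str, table: str, columns: list[dict]) -> str:
    """Build a CREATE TABLE statement; PRIMARY_KEY columns form one composite key."""
    lines = [f"  {c['name']} {column_type(c)}" for c in columns]
    pk = [c["name"] for c in columns if c.get("constraint") == "PRIMARY_KEY"]
    if pk:
        # All columns marked PRIMARY_KEY go into a single table-level constraint.
        lines.append(f"  PRIMARY KEY ({', '.join(pk)})")
    body = ",\n".join(lines)
    return f"CREATE TABLE IF NOT EXISTS {database}.{schema}.{table} (\n{body}\n)"


# A subset of the example columns above.
columns = [
    {"name": "date", "type": "DATE", "constraint": "PRIMARY_KEY"},
    {"name": "location_key", "type": "TEXT", "constraint": "PRIMARY_KEY", "length": 16777216},
    {"name": "new_persons_vaccinated", "type": "NUMBER", "precision": 38, "scale": 0},
]

sql = create_table_sql("HEALTHCARE", "vaccinationsdp", "vaccinations_clean", columns)
print(sql)
```

Because both `date` and `location_key` are marked PRIMARY_KEY, the generated statement contains a single `PRIMARY KEY (date, location_key)` clause rather than two separate keys.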

Creation

After this step, the system will show you a summary of the information provided. If you notice any mistakes, you can go back and edit it; otherwise, you can go ahead and create the component.

After clicking on "Create", the component registration will start. If no errors occur, it will go through three phases (Fetching, Publishing and Registering) and then show you links to the newly created repository inside GitLab and to the new Data Product component in the Builder Catalog.

When deploying the Data Product, deployment of this component will create the Snowflake table inside the specified database and schema.

Edit Wizard

The Edit Wizard allows you to edit most information about the component after you have created it. The sections are the same as in the Creation Wizard, so you can refer to the documentation above, but some fields will be locked because they cannot be changed after creation.