diff --git a/specification/appendix/appendix.mdpp b/specification/appendix/appendix.mdpp index 510b27ceb..3f9ea0cbd 100644 --- a/specification/appendix/appendix.mdpp +++ b/specification/appendix/appendix.mdpp @@ -4,3 +4,4 @@ !INCLUDE "grouping_constructs_for_resources_and_or_services.md",1 !INCLUDE "origination_of_cost_data.md",1 +!INCLUDE "examples/examples.mdpp",1 diff --git a/specification/appendix/examples/examples.mdpp b/specification/appendix/examples/examples.mdpp new file mode 100644 index 000000000..edf59c82a --- /dev/null +++ b/specification/appendix/examples/examples.mdpp @@ -0,0 +1,5 @@ +# Examples + +*This section is non-normative.* + +!INCLUDE "metadata/metadata_examples.mdpp",1 diff --git a/specification/appendix/examples/metadata/adding_new_columns_example.md b/specification/appendix/examples/metadata/adding_new_columns_example.md new file mode 100644 index 000000000..ede72f5c8 --- /dev/null +++ b/specification/appendix/examples/metadata/adding_new_columns_example.md @@ -0,0 +1,78 @@ +# Adding New Columns + +## Scenario + +ACME has decided to add additional columns to their FOCUS data export. The new columns are x_awesome_column1, x_awesome_column2, and x_awesome_column3. The provider creates a new schema object to represent the new schema; this schema object has a unique SchemaId. Subsequent data exports that use the new schema include the new schema's SchemaId as a reference to their corresponding schema object. 
+ +## Supplied Metadata + +## Location for the new schema object + +`/FOCUS/metadata/schemas/schema-23456-abcde-23456-abcde-23456.json` + +## Content for the new schema object + +```json + { + "SchemaId": "23456-abcde-23456-abcde-23456", + "FocusVersion": "1.0", + "CreationDate": "2024-02-02T12:01:03.083z", + "ColumnDefinition": [ + { + "ColumnName": "BillingAccountId", + "DataType": "STRING", + "StringMaxLength": 64, + "StringEncoding": "UTF-8" + }, + { + "ColumnName": "BillingAccountName", + "DataType": "STRING", + "StringMaxLength": 64, + "StringEncoding": "UTF-8" + }, + { + "ColumnName": "ChargePeriodStart", + "DataType": "DATETIME" + }, + { + "ColumnName": "ChargePeriodEnd", + "DataType": "DATETIME" + }, + { + "ColumnName": "BilledCost", + "DataType": "DECIMAL", + "NumericPrecision": 20, + "NumberScale": 10 + }, + { + "ColumnName": "EffectiveCost", + "DataType": "DECIMAL", + "NumericPrecision": 20, + "NumberScale": 10 + }, + { + "ColumnName": "Tags", + "DataType": "JSON", + "ProviderTagPrefixes": ["awecorp", "ac"] + }, + { + "ColumnName": "x_awesome_column1", + "DataType": "STRING", + "StringMaxLength": 64, + "StringEncoding": "UTF-8" + }, + { + "ColumnName": "x_awesome_column2", + "DataType": "DATETIME" + }, + { + "ColumnName": "x_awesome_column3", + "DataType": "STRING", + "StringMaxLength": 64, + "StringEncoding": "UTF-8" + } + ] +} +``` + +For an example of how ACME ensures the schema metadata reference requirement is met see: [Schema Metadata to FOCUS Data Reference](../schema_metadata_reference_example.md) diff --git a/specification/appendix/examples/metadata/changing_column_metadata_example.md b/specification/appendix/examples/metadata/changing_column_metadata_example.md new file mode 100644 index 000000000..85d65e9f9 --- /dev/null +++ b/specification/appendix/examples/metadata/changing_column_metadata_example.md @@ -0,0 +1,72 @@ +# Changing a Column's Metadata Example + +## Scenario + +ACME has decided to change the datatype of column x_awesome_column1 
from a string to a number. ACME creates a new schema object with the modification to x_awesome_column1. + +## Supplied Metadata + +## Location for the new schema object + +`/FOCUS/metadata/schemas/schema-67891-abcde-67891-abcde-67891.json` + +## Content for the new schema object + +```json + { + "SchemaId": "67891-abcde-67891-abcde-67891", + "FocusVersion": "1.0", + "CreationDate": "2024-06-02T12:01:03.083z", + "ColumnDefinition": [ + { + "ColumnName": "BillingAccountId", + "DataType": "STRING", + "StringMaxLength": 64, + "StringEncoding": "UTF-8" + }, + { + "ColumnName": "BillingAccountName", + "DataType": "STRING", + "StringMaxLength": 64, + "StringEncoding": "UTF-8" + }, + { + "ColumnName": "ChargePeriodStart", + "DataType": "DATETIME" + }, + { + "ColumnName": "ChargePeriodEnd", + "DataType": "DATETIME" + }, + { + "ColumnName": "BilledCost", + "DataType": "DECIMAL", + "NumericPrecision": 20, + "NumberScale": 10 + }, + { + "ColumnName": "EffectiveCost", + "DataType": "DECIMAL", + "NumericPrecision": 20, + "NumberScale": 10 + }, + { + "ColumnName": "Tags", + "DataType": "JSON", + "ProviderTagPrefixes": ["acme", "ac"] + }, + { + "ColumnName": "x_awesome_column1", + "DataType": "DECIMAL", + "NumericPrecision": 20, + "NumberScale": 10 + }, + { + "ColumnName": "x_awesome_column2", + "DataType": "DATETIME" + } + ] +} +``` + +For an example of how ACME ensures the schema metadata reference requirement is met see: [Schema Metadata to FOCUS Data Reference](schema_metadata_reference_example.md) diff --git a/specification/appendix/examples/metadata/correcting_schema_errors_example.md b/specification/appendix/examples/metadata/correcting_schema_errors_example.md new file mode 100644 index 000000000..16d166c3b --- /dev/null +++ b/specification/appendix/examples/metadata/correcting_schema_errors_example.md @@ -0,0 +1,72 @@ +# Provider has an error in their schema metadata + +## Scenario + +ACME has discovered that while their export includes the column x_awesome_column3, the 
schema metadata does not include this column. In this case, the provider fixes the metadata in the existing schema object and does not need to create a new schema object. Reference metadata remains the same. + +## Supplied Metadata + +## Location of the schema object + +`/FOCUS/metadata/schemas/schema-34567-abcde-34567-abcde-34567.json` + +## Content of the schema object + +```json + { + "SchemaId": "34567-abcde-34567-abcde-34567", + "FocusVersion": "1.0", + "CreationDate": "2024-03-02T12:01:03.083z", + "ColumnDefinition": [ + { + "ColumnName": "BillingAccountId", + "DataType": "STRING", + "StringMaxLength": 64, + "StringEncoding": "UTF-8" + }, + { + "ColumnName": "BillingAccountName", + "DataType": "STRING", + "StringMaxLength": 64, + "StringEncoding": "UTF-8" + }, + { + "ColumnName": "ChargePeriodStart", + "DataType": "DATETIME" + }, + { + "ColumnName": "ChargePeriodEnd", + "DataType": "DATETIME" + }, + { + "ColumnName": "BilledCost", + "DataType": "DECIMAL", + "NumericPrecision": 20, + "NumberScale": 10 + }, + { + "ColumnName": "EffectiveCost", + "DataType": "DECIMAL", + "NumericPrecision": 20, + "NumberScale": 10 + }, + { + "ColumnName": "Tags", + "DataType": "JSON", + "ProviderTagPrefixes": ["acme", "ac"] + }, + { + "ColumnName": "x_awesome_column1", + "DataType": "STRING", + "StringMaxLength": 64, + "StringEncoding": "UTF-8" + }, + { + "ColumnName": "x_awesome_column2", + "DataType": "STRING", + "StringMaxLength": 64, + "StringEncoding": "UTF-8" + } + ] +} +``` diff --git a/specification/appendix/examples/metadata/data_generator_example.md b/specification/appendix/examples/metadata/data_generator_example.md new file mode 100644 index 000000000..932e52bb4 --- /dev/null +++ b/specification/appendix/examples/metadata/data_generator_example.md @@ -0,0 +1,20 @@ +# Data Generator Metadata + +## Scenario + +Acme provides metadata about the data generator. They provide this via the Data Generator schema object. 
+ +## Supplied Metadata + +## Location of Data Generator Metadata File + +`/FOCUS/metadata/data_generator.json` + +## Content of Data Generator Metadata File + +```json +{ + "DataGenerator": "Acme" +} +``` + diff --git a/specification/appendix/examples/metadata/focus_version_changed_example.md b/specification/appendix/examples/metadata/focus_version_changed_example.md new file mode 100644 index 000000000..31a28b8ce --- /dev/null +++ b/specification/appendix/examples/metadata/focus_version_changed_example.md @@ -0,0 +1,72 @@ +# FOCUS Version Changed + +## Scenario + +ACME's previous exports used FOCUS Version 1.0. They are now going to adopt FOCUS Version 1.1. It is required that they create a new schema metadata object when using a new FOCUS Version, regardless of schema changes. In this example, the adoption of the new FOCUS Version doesn't include column changes. This is to illustrate that FOCUS Version changes are independent of column changes; however, this scenario is unlikely. 
+ +## Supplied Metadata + +## Location of the new schema object + +`/FOCUS/metadata/schemas/schema-45678-abcde-45678-abcde-45678.json` + +## Content of the schema object + +```json + { + "SchemaId": "45678-abcde-45678-abcde-45678", + "FocusVersion": "1.1", + "CreationDate": "2024-04-02T12:01:03.083z", + "ColumnDefinition": [ + { + "ColumnName": "BillingAccountId", + "DataType": "STRING", + "StringMaxLength": 64, + "StringEncoding": "UTF-8" + }, + { + "ColumnName": "BillingAccountName", + "DataType": "STRING", + "StringMaxLength": 64, + "StringEncoding": "UTF-8" + }, + { + "ColumnName": "ChargePeriodStart", + "DataType": "DATETIME" + }, + { + "ColumnName": "ChargePeriodEnd", + "DataType": "DATETIME" + }, + { + "ColumnName": "BilledCost", + "DataType": "DECIMAL", + "NumericPrecision": 20, + "NumberScale": 10 + }, + { + "ColumnName": "EffectiveCost", + "DataType": "DECIMAL", + "NumericPrecision": 20, + "NumberScale": 10 + }, + { + "ColumnName": "Tags", + "DataType": "JSON", + "ProviderTagPrefixes": ["acme", "ac"] + }, + { + "ColumnName": "x_awesome_column1", + "DataType": "STRING", + "StringMaxLength": 64, + "StringEncoding": "UTF-8" + }, + { + "ColumnName": "x_awesome_column2", + "DataType": "DATETIME" + } + ] +} +``` + +For an example of how ACME ensures the schema metadata reference requirement is met see: [Schema Metadata to FOCUS Data Reference](schema_metadata_reference_example.md) diff --git a/specification/appendix/examples/metadata/focus_version_changed_with_provider_version_used_example.md b/specification/appendix/examples/metadata/focus_version_changed_with_provider_version_used_example.md new file mode 100644 index 000000000..bf166cfa3 --- /dev/null +++ b/specification/appendix/examples/metadata/focus_version_changed_with_provider_version_used_example.md @@ -0,0 +1,140 @@ +# FOCUS Version Changed by Provider Using Provider Version + +## Scenario + +ACME uses Provider Version, and their previous exports used FOCUS Version 1.0. 
Their current Provider Version is 2.2. They are now going to adopt FOCUS Version 1.1. Because it is required that they update their Provider Version when using a new FOCUS Version, they create a new schema object designating that both have changed. In this example, the adoption of the new FOCUS Version doesn't include additional columns. This is to illustrate that Provider Version changes are independent of column changes; however, this scenario is unlikely. + +The provider creates a new schema object to represent the new schema. The provider includes both the new FOCUS Version and Provider Version in the schema object. + +## Supplied Metadata + +## Location of the previous schema object + +`/FOCUS/metadata/schemas/schema-34567-abcde-34567-abcde-34567.json` + +## Content of the previous schema object + +```json + { + "SchemaId": "34567-abcde-34567-abcde-34567", + "FocusVersion": "1.0", + "ProviderVersion": "2.2", + "CreationDate": "2024-04-02T12:01:03.083z", + "ColumnDefinition": [ + { + "ColumnName": "BillingAccountId", + "DataType": "STRING", + "StringMaxLength": 64, + "StringEncoding": "UTF-8" + }, + { + "ColumnName": "BillingAccountName", + "DataType": "STRING", + "StringMaxLength": 64, + "StringEncoding": "UTF-8" + }, + { + "ColumnName": "ChargePeriodStart", + "DataType": "DATETIME" + }, + { + "ColumnName": "ChargePeriodEnd", + "DataType": "DATETIME" + }, + { + "ColumnName": "BilledCost", + "DataType": "DECIMAL", + "NumericPrecision": 20, + "NumberScale": 10 + }, + { + "ColumnName": "EffectiveCost", + "DataType": "DECIMAL", + "NumericPrecision": 20, + "NumberScale": 10 + }, + { + "ColumnName": "Tags", + "DataType": "JSON", + "ProviderTagPrefixes": ["acme", "ac"] + }, + { + "ColumnName": "x_awesome_column1", + "DataType": "STRING", + "StringMaxLength": 64, + "StringEncoding": "UTF-8" + }, + { + "ColumnName": "x_awesome_column2", + "DataType": "DATETIME" + } + ] +} +``` + +## Location of the new schema object + 
+`/FOCUS/metadata/schemas/schema-45678-abcde-45678-abcde-45678.json` + +## Content of the new schema object + +```json + { + "SchemaId": "45678-abcde-45678-abcde-45678", + "FocusVersion": "1.1", + "ProviderVersion": "2.3", + "name": "New Columns", + "CreationDate": "2024-04-02T12:01:03.083z", + "ColumnDefinition": [ + { + "ColumnName": "BillingAccountId", + "DataType": "STRING", + "StringMaxLength": 64, + "StringEncoding": "UTF-8" + }, + { + "ColumnName": "BillingAccountName", + "DataType": "STRING", + "StringMaxLength": 64, + "StringEncoding": "UTF-8" + }, + { + "ColumnName": "ChargePeriodStart", + "DataType": "DATETIME" + }, + { + "ColumnName": "ChargePeriodEnd", + "DataType": "DATETIME" + }, + { + "ColumnName": "BilledCost", + "DataType": "DECIMAL", + "NumericPrecision": 20, + "NumberScale": 10 + }, + { + "ColumnName": "EffectiveCost", + "DataType": "DECIMAL", + "NumericPrecision": 20, + "NumberScale": 10 + }, + { + "ColumnName": "Tags", + "DataType": "JSON", + "ProviderTagPrefixes": ["acme", "ac"] + }, + { + "ColumnName": "x_awesome_column1", + "DataType": "STRING", + "StringMaxLength": 64, + "StringEncoding": "UTF-8" + }, + { + "ColumnName": "x_awesome_column2", + "DataType": "DATETIME" + } + ] +} +``` + +For an example of how ACME ensures the schema metadata reference requirement is met see: [Schema Metadata to FOCUS Data Reference](schema_metadata_reference_example.md) diff --git a/specification/appendix/examples/metadata/metadata_examples.mdpp b/specification/appendix/examples/metadata/metadata_examples.mdpp new file mode 100644 index 000000000..58d75735d --- /dev/null +++ b/specification/appendix/examples/metadata/metadata_examples.mdpp @@ -0,0 +1,24 @@ +# Metadata Examples + +The following is an example metadata JSON structure provided by a hypothetical FOCUS data provider called ACME. This example illustrates an example of how a provider can supply the required reference between the FOCUS data and the schema metadata. 
Provider implementations will vary on how the metadata is stored and retrieved; however, the provider's chosen metadata delivery approach should be able to support the structure represented in this example. + +## Scenario + +In this example, the provider supports delivery of FOCUS data via file export to a data storage system. The provider delivers data every 12 hours. + +## Example Data Structure + +* export root location: `/FOCUS` +* metadata location: `/FOCUS/metadata` +* focus data location: `/FOCUS/data` + +!INCLUDE "data_generator_example.md",1 +!INCLUDE "schema_metadata_example.md",1 +!INCLUDE "schema_metadata_reference_example.md",1 +!INCLUDE "adding_new_columns_example.md",1 +!INCLUDE "removing_columns_example.md",1 +!INCLUDE "changing_column_metadata_example.md",1 +!INCLUDE "correcting_schema_errors_example.md",1 +!INCLUDE "focus_version_changed_example.md",1 +!INCLUDE "focus_version_changed_with_provider_version_used_example.md",1 +!INCLUDE "provider_version_changed_example.md",1 diff --git a/specification/appendix/examples/metadata/provider_version_changed_example.md b/specification/appendix/examples/metadata/provider_version_changed_example.md new file mode 100644 index 000000000..530e56ca7 --- /dev/null +++ b/specification/appendix/examples/metadata/provider_version_changed_example.md @@ -0,0 +1,76 @@ +# Provider Version Changed + +## Scenario + +ACME uses Provider Version and has changed its approach to creating FOCUS data. The change does not adopt a new FOCUS Version or change the included columns, but it does affect values in the data. This is to illustrate that Provider Version changes are independent of column changes; however, Provider Version changes may include column changes. + +The provider creates a new schema object to represent the new schema. The provider includes the unchanged FOCUS Version and the new Provider Version in the schema object. 
+ +## Supplied Metadata + +## Location of the new schema object + + +`/FOCUS/metadata/schemas/schema-56789-abcde-56789-abcde-56789.json` + +## Content of the new schema object + +```json + { + "SchemaId": "56789-abcde-56789-abcde-56789", + "FocusVersion": "1.1", + "ProviderVersion": "2.4", + "CreationDate": "2024-05-02T12:01:03.083z", + "ColumnDefinition": [ + { + "ColumnName": "BillingAccountId", + "DataType": "STRING", + "StringMaxLength": 64, + "StringEncoding": "UTF-8" + }, + { + "ColumnName": "BillingAccountName", + "DataType": "STRING", + "StringMaxLength": 64, + "StringEncoding": "UTF-8" + }, + { + "ColumnName": "ChargePeriodStart", + "DataType": "DATETIME" + }, + { + "ColumnName": "ChargePeriodEnd", + "DataType": "DATETIME" + }, + { + "ColumnName": "BilledCost", + "DataType": "DECIMAL", + "NumericPrecision": 20, + "NumberScale": 10 + }, + { + "ColumnName": "EffectiveCost", + "DataType": "DECIMAL", + "NumericPrecision": 20, + "NumberScale": 10 + }, + { + "ColumnName": "Tags", + "DataType": "JSON", + "ProviderTagPrefixes": ["acme", "ac"] + }, + { + "ColumnName": "x_awesome_column1", + "DataType": "STRING", + "StringMaxLength": 64, + "StringEncoding": "UTF-8" + }, + { + "ColumnName": "x_awesome_column2", + "DataType": "DATETIME" + } + ] +} +``` + +For an example of how ACME ensures the schema metadata reference requirement is met see: [Schema Metadata to FOCUS Data Reference](schema_metadata_reference_example.md) diff --git a/specification/appendix/examples/metadata/removing_columns_example.md b/specification/appendix/examples/metadata/removing_columns_example.md new file mode 100644 index 000000000..b9f47e31e --- /dev/null +++ b/specification/appendix/examples/metadata/removing_columns_example.md @@ -0,0 +1,72 @@ +# Removing Columns + +## Scenario + +ACME has decided to remove columns from their FOCUS data export. The column removed is x_awesome_column3. 
The provider creates a new schema object to represent the new schema, this schema object has a unique SchemaId. + +## Supplied Metadata + +## Location for the new schema object + +`/FOCUS/metadata/schemas/schema-34567-abcde-34567-abcde-34567.json` + +## Content for the new schema object + +```json + { + "SchemaId": "34567-abcde-34567-abcde-34567", + "FocusVersion": "1.0", + "CreationDate": "2024-03-02T12:01:03.083z", + "ColumnDefinition": [ + { + "ColumnName": "BillingAccountId", + "DataType": "STRING", + "StringMaxLength": 64, + "StringEncoding": "UTF-8" + }, + { + "ColumnName": "BillingAccountName", + "DataType": "STRING", + "StringMaxLength": 64, + "StringEncoding": "UTF-8" + }, + { + "ColumnName": "ChargePeriodStart", + "DataType": "DATETIME" + }, + { + "ColumnName": "ChargePeriodEnd", + "DataType": "DATETIME" + }, + { + "ColumnName": "BilledCost", + "DataType": "DECIMAL", + "NumericPrecision": 20, + "NumberScale": 10 + }, + { + "ColumnName": "EffectiveCost", + "DataType": "DECIMAL", + "NumericPrecision": 20, + "NumberScale": 10 + }, + { + "ColumnName": "Tags", + "DataType": "JSON", + "ProviderTagPrefixes": ["acme", "ac"] + }, + { + "ColumnName": "x_awesome_column1", + "DataType": "STRING", + "StringMaxLength": 64, + "StringEncoding": "UTF-8" + }, + { + "ColumnName": "x_awesome_column2", + "DataType": "DATETIME" + } + ] +} +``` + +For an example of how ACME ensures the schema metadata reference requirement is met see: [Schema Metadata to FOCUS Data Reference](schema_metadata_reference_example.md) diff --git a/specification/appendix/examples/metadata/schema_metadata_example.md b/specification/appendix/examples/metadata/schema_metadata_example.md new file mode 100644 index 000000000..6a52bda73 --- /dev/null +++ b/specification/appendix/examples/metadata/schema_metadata_example.md @@ -0,0 +1,61 @@ +# Schema Metadata Example + +## Scenario + +ACME has only provided one schema for their provided FOCUS data. 
ACME provides a directory of schemas and each schema is a single file. ACME provides a file representing the schema for the data they provide. + +## Supplied Metadata + +## Location of the schema object + +`/FOCUS/metadata/schemas/schema-1234-abcde-12345-abcde-12345.json` + +## Content of the schema object + +```json +{ + "SchemaId": "1234-abcde-12345-abcde-12345", + "FocusVersion": "1.0", + "CreationDate": "2024-01-01T12:01:03.083z", + "ColumnDefinition": [ + { + "ColumnName": "BillingAccountId", + "DataType": "STRING", + "StringMaxLength": 64, + "StringEncoding": "UTF-8" + }, + { + "ColumnName": "BillingAccountName", + "DataType": "STRING", + "StringMaxLength": 64, + "StringEncoding": "UTF-8" + }, + { + "ColumnName": "ChargePeriodStart", + "DataType": "DATETIME" + }, + { + "ColumnName": "ChargePeriodEnd", + "DataType": "DATETIME" + }, + { + "ColumnName": "BilledCost", + "DataType": "DECIMAL", + "NumericPrecision": 20, + "NumberScale": 10 + }, + { + "ColumnName": "EffectiveCost", + "DataType": "DECIMAL", + "NumericPrecision": 20, + "NumberScale": 10 + }, + { + "ColumnName": "Tags", + "DataType": "JSON", + "ProviderTagPrefixes": ["acme", "ac"] + } + ] +} +``` + diff --git a/specification/appendix/examples/metadata/schema_metadata_reference_example.md b/specification/appendix/examples/metadata/schema_metadata_reference_example.md new file mode 100644 index 000000000..9edab82c1 --- /dev/null +++ b/specification/appendix/examples/metadata/schema_metadata_reference_example.md @@ -0,0 +1,78 @@ +# Schema Metadata to FOCUS Data Reference + +## Scenario + +ACME makes a change to the schema of their data exports. For each FOCUS data export, ACME includes a metadata reference to the schema object. Because multiple files are provided, ACME has elected to include a metadata file that includes the FOCUS schema reference that applies to the data export files. They therefore include the new schema id in their export metadata file. 
+ +## Supplied Metadata + +## Location of the existing schema metadata reference file + +`/FOCUS/data/export1-metadata.json` + +## Content for the existing export metadata object + +```json +{ + "SchemaId":"1234-abcde-12345-abcde-12345", + "data_location": + [ + { + "filepath": "/FOCUS/data/export1/export1-part1.csv", + "total_bytes": 9010387, + "total_rows": 4450 + }, + { + "filepath": "/FOCUS/data/export1/export1-part2.csv", + "total_bytes": 9010387, + "total_rows": 4450 + }, + { + "filepath": "/FOCUS/data/export1/export1-part3.csv", + "total_bytes": 9010387, + "total_rows": 4450 + }, + { + "filepath": "/FOCUS/data/export1/export1-part4.csv", + "total_bytes": 9010387, + "total_rows": 4450 + } + ] +} +``` + +## Location for the new export metadata object + +`/FOCUS/data/export2-metadata.json` + +## Content for the new export metadata object + +```json +{ + "SchemaId":"23456-abcde-23456-abcde-23456", + "data_location": + [ + { + "filepath": "/FOCUS/data/export2/export2-part1.csv", + "total_bytes": 9010387, + "total_rows": 4450 + }, + { + "filepath": "/FOCUS/data/export2/export2-part2.csv", + "total_bytes": 9010387, + "total_rows": 4450 + }, + { + "filepath": "/FOCUS/data/export2/export2-part3.csv", + "total_bytes": 9010387, + "total_rows": 4450 + }, + { + "filepath": "/FOCUS/data/export2/export2-part4.csv", + "total_bytes": 9010387, + "total_rows": 4450 + } + ] +} +``` + diff --git a/specification/metadata/data_generator/data_generator.mdpp b/specification/metadata/data_generator/data_generator.mdpp index 80b36aa5c..2526c9bc2 100644 --- a/specification/metadata/data_generator/data_generator.mdpp +++ b/specification/metadata/data_generator/data_generator.mdpp @@ -2,4 +2,12 @@ The FOCUS metadata about the generator of the FOCUS data. +## Requirements + +The FOCUS Data Generator metadata MUST be provided. This metadata MUST be of type Object and MUST NOT contain null values. 
+ +## Schema Example + +For an example of the FOCUS Data Generator metadata please refer to: [Data Generator Example](#data_generator_example) + !INCLUDE "datagenerator.md",1 diff --git a/specification/metadata/data_generator/datagenerator.md b/specification/metadata/data_generator/datagenerator.md index a731c5999..abc636b46 100644 --- a/specification/metadata/data_generator/datagenerator.md +++ b/specification/metadata/data_generator/datagenerator.md @@ -1,8 +1,8 @@ # Data Generator -Human readable name of the entity that is generating the data. +Human-readable name of the entity that is generating the data. -The DataGenerator MUST be provided in the metadata. DataGenerator MUST be of type String and MUST NOT contain null values. The DataGenerator SHOULD be easily associated with the provider who generated the FOCUS dataset. +The DataGenerator MUST be provided in the metadata. DataGenerator MUST be of type String and MUST NOT be null. The DataGenerator SHOULD be easily associated with the provider who generated the FOCUS dataset. ## Metadata ID @@ -12,6 +12,15 @@ DataGenerator Data Generator +## Content constraints + +| Constraint | Value | +|:----------------|:-----------------| +| Feature level | Mandatory | +| Allows nulls | False | +| Data type | String | +| Value format | \ | + ## Introduced (version) 1.0 diff --git a/specification/metadata/metadata.mdpp b/specification/metadata/metadata.mdpp index 4b1b22fe0..46a776168 100644 --- a/specification/metadata/metadata.mdpp +++ b/specification/metadata/metadata.mdpp @@ -1,6 +1,8 @@ # Metadata -The FOCUS specification defines a metadata structure that is to be supplied by data providers to facilitate practitioners use of FOCUS data. This meta data includes general information about the data generator and the schema of the FOCUS dataset. FOCUS Metadata SHOULD be provided in a format that is accessible programmatically, such as: a file, website, api, table. 
+The FOCUS specification defines a metadata structure to be supplied by data providers to facilitate practitioners' use of FOCUS data. This metadata includes general information about the data generator and the schema of the FOCUS dataset. + +FOCUS Metadata SHOULD be provided in a format that is accessible programmatically, such as a file, website, API, or table. Providers SHOULD provide documentation on their implementation of the FOCUS metadata. !INCLUDE "data_generator/data_generator.mdpp",1 -!INCLUDE "schema/schema.mdpp",1 +!INCLUDE "schema/schema.mdpp",1 \ No newline at end of file diff --git a/specification/metadata/schema/column_definition/column_definition.mdpp b/specification/metadata/schema/column_definition/column_definition.mdpp index 75971504e..18a2f8490 100644 --- a/specification/metadata/schema/column_definition/column_definition.mdpp +++ b/specification/metadata/schema/column_definition/column_definition.mdpp @@ -2,6 +2,10 @@ The FOCUS metadata schema column definition provides a list of the columns present in the FOCUS dataset along with metadata about the columns. +## Requirements + +This metadata MUST be present in the FOCUS metadata schema. This metadata MUST be of type Object and MUST NOT contain null values. 
+ !INCLUDE "columnname.md",1 !INCLUDE "datatype.md",1 !INCLUDE "numericprecision.md",1 diff --git a/specification/metadata/schema/column_definition/columnname.md b/specification/metadata/schema/column_definition/columnname.md index cc9ff728b..bd6b40f96 100644 --- a/specification/metadata/schema/column_definition/columnname.md +++ b/specification/metadata/schema/column_definition/columnname.md @@ -12,6 +12,15 @@ ColumnName Column Name +## Content constraints + +| Constraint | Value | +|:----------------|:-----------------| +| Feature level | Mandatory | +| Allows nulls | False | +| Data type | String | +| Value format | \ | + ## Introduced (version) 1.0 diff --git a/specification/metadata/schema/column_definition/datatype.md b/specification/metadata/schema/column_definition/datatype.md index bd099443d..8a33adbca 100644 --- a/specification/metadata/schema/column_definition/datatype.md +++ b/specification/metadata/schema/column_definition/datatype.md @@ -12,6 +12,15 @@ DataType Data Type +## Content constraints + +| Constraint | Value | +|:----------------|:-----------------| +| Feature level | Mandatory | +| Allows nulls | False | +| Data type | String | +| Value format | \ | + ## Introduced (version) 1.0 diff --git a/specification/metadata/schema/column_definition/numberscale.md b/specification/metadata/schema/column_definition/numberscale.md index 4602e7058..e45bdbd1b 100644 --- a/specification/metadata/schema/column_definition/numberscale.md +++ b/specification/metadata/schema/column_definition/numberscale.md @@ -12,6 +12,15 @@ NumberScale Number Scale +## Content constraints + +| Constraint | Value | +|:--------------|:---------------------------------| +| Feature level | Conditional | +| Allows nulls | False | +| Data type | Integer | +| Value format | [Numeric Format](#numericformat) | + ## Introduced (version) 1.0 \ No newline at end of file diff --git a/specification/metadata/schema/column_definition/numericprecision.md 
b/specification/metadata/schema/column_definition/numericprecision.md index 1ac35d95c..999c73c46 100644 --- a/specification/metadata/schema/column_definition/numericprecision.md +++ b/specification/metadata/schema/column_definition/numericprecision.md @@ -12,6 +12,15 @@ NumericPrecision Numeric Precision +## Content constraints + +| Constraint | Value | +|:--------------|:---------------------------------| +| Feature level | Conditional | +| Allows nulls | False | +| Data type | Integer | +| Value format | [Numeric Format](#numericformat) | + ## Introduced (version) 1.0 diff --git a/specification/metadata/schema/column_definition/providertagprefixes.md b/specification/metadata/schema/column_definition/providertagprefixes.md index d6267ffae..ca8ddb4b6 100644 --- a/specification/metadata/schema/column_definition/providertagprefixes.md +++ b/specification/metadata/schema/column_definition/providertagprefixes.md @@ -12,6 +12,15 @@ ProviderTagPrefixes Provider Tag Prefixes +## Content constraints + +| Constraint | Value | +|:--------------|:------------------------------------| +| Feature level | Conditional | +| Allows nulls | False | +| Data type | Array | +| Value format | STRING datatype values in the array | + ## Introduced (version) 1.0 diff --git a/specification/metadata/schema/column_definition/stringencoding.md b/specification/metadata/schema/column_definition/stringencoding.md index cb9f870a1..49f7fba22 100644 --- a/specification/metadata/schema/column_definition/stringencoding.md +++ b/specification/metadata/schema/column_definition/stringencoding.md @@ -12,6 +12,15 @@ StringEncoding StringEncoding +## Content constraints + +| Constraint | Value | +|:----------------|:-----------------| +| Feature level | Conditional | +| Allows nulls | False | +| Data type | String | +| Value format | \ | + ## Introduced (version) 1.0 diff --git a/specification/metadata/schema/column_definition/stringmaxlength.md 
b/specification/metadata/schema/column_definition/stringmaxlength.md index 69a74e3e0..4a0299995 100644 --- a/specification/metadata/schema/column_definition/stringmaxlength.md +++ b/specification/metadata/schema/column_definition/stringmaxlength.md @@ -12,6 +12,15 @@ StringMaxLength String Max Length +## Content constraints + +| Constraint | Value | +|:--------------|:---------------------------------| +| Feature level | Conditional | +| Allows nulls | False | +| Data type | Integer | +| Value format | [Numeric Format](#numericformat) | + ## Introduced (version) 1.0 diff --git a/specification/metadata/schema/creationdate.md b/specification/metadata/schema/creationdate.md index 6c5ed6bc1..77cbc92fd 100644 --- a/specification/metadata/schema/creationdate.md +++ b/specification/metadata/schema/creationdate.md @@ -12,6 +12,15 @@ CreationDate Creation Date +## Content constraints + +| Constraint | Value | +|:--------------|:------------------------------------------| +| Feature level | Mandatory | +| Allows nulls | False | +| Data type | Date/Time | +| Value format | [Date/Time Format](#date/timeformat) | + ## Introduced (version) 1.0 diff --git a/specification/metadata/schema/focusversion.md b/specification/metadata/schema/focusversion.md index c650665bd..323b54428 100644 --- a/specification/metadata/schema/focusversion.md +++ b/specification/metadata/schema/focusversion.md @@ -2,7 +2,7 @@ The version of FOCUS utilized for building the dataset. -The FocusVersion MUST be provided in the metadata. FocusVersion MUST be of type String and MUST NOT contain null values. FOCUSVersion MUST match one of the published versions of the FOCUS specification. FocusVersion MUST match the version of the FOCUS specification that the FOCUS dataset conforms to. +The FocusVersion MUST be provided in the metadata. FocusVersion MUST be of type String and MUST NOT contain null values. FocusVersion MUST match one of the published versions of the FOCUS specification. 
FocusVersion MUST match the version of the FOCUS specification that the FOCUS dataset conforms to. ## Metadata ID @@ -12,6 +12,15 @@ FocusVersion FOCUS Version +## Content constraints + +| Constraint | Value | +|:--------------|:-----------------------------------------| +| Feature level | Mandatory | +| Allows nulls | False | +| Data type | STRING | +| Value format | Must align with a published FocusVersion | + ## Introduced (version) 1.0 diff --git a/specification/metadata/schema/providerversion.md b/specification/metadata/schema/providerversion.md new file mode 100644 index 000000000..bcf3ebe6d --- /dev/null +++ b/specification/metadata/schema/providerversion.md @@ -0,0 +1,26 @@ +# Provider Version + +ProviderVersion MAY be supplied to declare the version of the logic by which the FOCUS dataset was generated; it is separate from FocusVersion. ProviderVersion allows the provider to specify changes that may not result in a structural change in the data. It is suggested that ProviderVersion follow a versioning approach such as [SemVer](https://semver.org). + +ProviderVersion MUST be of type String and MUST NOT contain null values. If FocusVersion is changed, ProviderVersion MUST also be changed. The provider MUST document what changes are present in the ProviderVersion. + +## Metadata ID + +ProviderVersion + +## Metadata Name + +Provider Version + +## Content constraints + +| Constraint | Value | +|:--------------|:-----------------| +| Feature level | Optional | +| Allows nulls | False | +| Data type | STRING | +| Value format | \ | + +## Introduced (version) + +1.1 diff --git a/specification/metadata/schema/schema.mdpp b/specification/metadata/schema/schema.mdpp index b27448e13..7d211016c 100644 --- a/specification/metadata/schema/schema.mdpp +++ b/specification/metadata/schema/schema.mdpp @@ -1,8 +1,38 @@ # Schema -Each FOCUS dataset must have a metadata about the schema associated with it.
The schema metadata provides information about the structure of the data provided. +The schema metadata object and its contents provide information about the structure of the data provided. + +## Requirements + +### Reference to FOCUS Data + +FOCUS data artifacts, whether they are data files, data streams, or data tables, MUST provide a clear reference to the schema of the data. This reference MUST be retrievable without inspection of the contents of the FOCUS data within the data artifact. For some delivery mechanisms such as database tables, the provider may rely on the schema functionality of the providing system. + +It is recommended that the schema reference be provided as an external reference rather than included in full as metadata accompanying the data artifact. This makes it easier to understand when changes to the schema of the FOCUS datasets occur. + +### Schema Metadata Creation + +Should the provider change the structure of the supplied FOCUS data artifact, a new schema metadata object MUST be supplied. +These scenarios include, but are not limited to: + +* [Adding a new column](#adding_new_columns_example) +* [Removing a column](#removing_columns_example) +* [Renaming a column](#renaming_columns_example) +* [Changing column metadata](#changing_column_metadata_example) +* [FOCUS Version is changed](#focus_version_change_example) +* [Provider Version is changed](#provider_version_change_example) +* [Correcting schema metadata errors](#correcting_schema_metadata_errors) + +### Schema Metadata Updates + +Should there be an error where the schema metadata object does not match the schema of the FOCUS data artifact, the provider MUST update the schema metadata object to match the schema of the FOCUS data artifact. This is to ensure that the schema metadata object is always accurate.
+ +## Schema Example + +For an example of the FOCUS schema metadata, please refer to: [Schema Metadata Example](#schema-metadata-examples) !INCLUDE "schemaid.md",1 !INCLUDE "creationdate.md",1 !INCLUDE "focusversion.md",1 +!INCLUDE "providerversion.md",1 !INCLUDE "column_definition/column_definition.mdpp",1 diff --git a/specification/metadata/schema/schemaid.md b/specification/metadata/schema/schemaid.md index d55872250..c1cf64388 100644 --- a/specification/metadata/schema/schemaid.md +++ b/specification/metadata/schema/schemaid.md @@ -2,7 +2,7 @@ The Schema ID provides the reference item to associate which Schema was used for the generation of a FOCUS Dataset. -The SchemaId MUST be present in the metadata. The SchemaId MUST be of String. It is RECOMMENDED for SchemaId to be a Universally Unique Identifier (UUID) or [SemVer](https://semver.org) version. +The SchemaId MUST be present in the metadata. The SchemaId MUST be of type String. It is RECOMMENDED for SchemaId to be a Globally Unique Identifier (GUID).
## Metadata ID @@ -12,6 +12,15 @@ SchemaId Schema ID +## Content constraints + +| Constraint | Value | +|:--------------|:----------------------| +| Feature level | Mandatory | +| Allows nulls | False | +| Data type | STRING | +| Value format | Recommend GUID String | + ## Introduced (version) 1.0 diff --git a/supporting_content/metadata/schema_creation_scenarios.md b/supporting_content/metadata/schema_creation_scenarios.md new file mode 100644 index 000000000..bbe6a39de --- /dev/null +++ b/supporting_content/metadata/schema_creation_scenarios.md @@ -0,0 +1,13 @@ +# Schema Creation Scenarios + +## The following is a list of scenarios and their schema update requirements + + +| Scenario | Requires New Schema object | Requires Change in Provider Version | Requires Change in FOCUS Version | +|:---------------------------------------------------------------------------------------------------------|:----------------------------|:------------------------------------|:-------------------------------| +| Provider uses a new FOCUS version when they supply a provider version | Y | | Y |N | +| Provider is changing the way they generate the data that doesn't affect the FOCUS version or the columns | Y | Y | N | +| Addition of Column | Y | N | N | +| Removal of Columns | Y | N | N | +| Change of FOCUS Version | Y | Y | Y | +| Correction of schema metadata that is not correct | N | N | N | \ No newline at end of file
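
The content constraints tables added in this change can be sketched as a small check over a parsed schema metadata object. This is only an illustration, not part of the specification: the function name `validate_schema` is hypothetical, the JSON key names are taken from the example schema objects elsewhere in this change, and it assumes that "Conditional" properties are validated only when present and that CreationDate is an ISO 8601 string.

```python
from datetime import datetime

# Top-level properties: name -> (mandatory?, expected Python type).
TOP_LEVEL = {
    "SchemaId": (True, str),
    "CreationDate": (True, str),
    "FocusVersion": (True, str),
    "ProviderVersion": (False, str),
}

# Conditional column-definition properties (assumption: checked only when present).
COLUMN_LEVEL = {
    "NumericPrecision": int,
    "StringMaxLength": int,
    "StringEncoding": str,
    "ProviderTagPrefixes": list,
}


def _is_type(value, expected):
    # bool is a subclass of int in Python, so reject it explicitly for integers.
    if expected is int:
        return isinstance(value, int) and not isinstance(value, bool)
    return isinstance(value, expected)


def validate_schema(schema):
    """Return a list of constraint violations (empty when none are found)."""
    errors = []
    for name, (mandatory, expected) in TOP_LEVEL.items():
        if name not in schema:
            if mandatory:
                errors.append(f"{name} is mandatory but missing")
            continue
        if not _is_type(schema[name], expected):
            errors.append(f"{name} must be a non-null {expected.__name__}")

    created = schema.get("CreationDate")
    if isinstance(created, str):
        # Normalise a trailing Z/z (UTC) so datetime.fromisoformat accepts it.
        text = created[:-1] + "+00:00" if created[-1:] in ("Z", "z") else created
        try:
            datetime.fromisoformat(text)
        except ValueError:
            errors.append("CreationDate is not an ISO 8601 date/time")

    for column in schema.get("ColumnDefinition", []):
        label = column.get("ColumnName", "<unnamed column>")
        for name, expected in COLUMN_LEVEL.items():
            if name in column and not _is_type(column[name], expected):
                errors.append(f"{label}.{name} must be a non-null {expected.__name__}")
        prefixes = column.get("ProviderTagPrefixes")
        if isinstance(prefixes, list) and not all(isinstance(p, str) for p in prefixes):
            errors.append(f"{label}.ProviderTagPrefixes must contain only strings")
    return errors
```

A provider could run such a check on each schema JSON file before publishing it, for example `validate_schema(json.load(open(path)))`, and treat a non-empty result as a failure. The check does not verify that FocusVersion matches a published version, since that list is not part of the schema object.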