Official SDF Utils Library

This SDF library contains the standard utilities available in the default namespace in any SDF workspace. As such, macros defined in this workspace do not need to be prefixed with the workspace name, they can be referenced directly.

Here's an example:

WITH date_spine AS (
  {{ sdf_utils.date_spine(
          datepart = 'day',
          start_date_raw = "2019-01-01",    
          end_date_raw = "2019-12-31"
    )
  }}
)

select * from date_spine;

You'll find documentation for each specific macro below. If no Dialect section is specified, this means the implementation is dialect-agnostic and should work across all supported dialects.

Note: SDF is still < v1, as such certain scenarios may result in unexpected behavior. Please follow the contributing guidelines and create an issue in this repo if you find any bugs or issues.

For an in-depth guide on how to use Jinja macros in SDF, please see the Jinja section of our official docs

SDF Utils

Test Name
`date_spine()`
`generate_date_values()`
`generate_integer_values(condition)`
`group_by(condition)`
`generate_surrogate_key()`

Date Spine

Source Code

This macro generates a SQL SELECT statement that produces a single column of dates. The dates are generated for each day in the range from the 'from' date to the 'to' date, where from is inclusive and to is not inclusive.

Parameters: - datepart: The date part to increment. This can be one of the following: 'day', 'week', 'month', 'quarter', 'year'. Others like hour, minute, second are dialect-specific and supported in Snowflake for example. - start_date: The start date of the range, as a SQL Date object. The best dialect-agnostic way is to pass "CAST('2019-01-01' AS DATE)" for example. - end_date: The noninclusive end date of the range, as a SQL Date object. The best dialect-agnostic way is to pass "CAST('2019-01-01' AS DATE)" for example. - start_date_raw: The start date of the range, as a string in 'YYYY-MM-DD' format. - end_date_raw: The end date of the range, as a string in 'YYYY-MM-DD' format.

Returns: A SQL SELECT statement that produces a table with a single column of dates.

Usage:

WITH date_spine AS (
    {{ 
        sdf_utils.date_spine(
            datepart = 'day',
            start_date = "CAST('2019-01-01' AS DATE)",    
            end_date = "CAST('2019-12-31' AS DATE)"
        )
    }}
)

Alternatively, using the raw parameters:

WITH date_spine AS (
    {{ 
        sdf_utils.date_spine(
            datepart = 'day',
            start_date_raw = "2019-01-01",    
            end_date_raw = "2019-12-31"
        )
    }}
)

Output:

WITH date_spine AS (
    WITH
    intervals_list AS (
        SELECT 
            ARRAY_GENERATE_RANGE(
                0, 
                DATEDIFF(day, CAST('2019-01-01' AS DATE), CAST('2019-12-31' AS DATE)) + 1
            ) AS i
    )

    SELECT 
        i.VALUE AS i,
        DATEADD(day, i.VALUE, CAST('2019-01-01' AS DATE)) AS date_spine
    FROM intervals_list l, LATERAL FLATTEN(input => l.i) i
)

Dialects:

Dialect	Supported
Snowflake	✅
BigQuery	❌
Redshift	❌
Trino	❌
Postgres	❌

Generate Date Values

Source Code

Generates a SQL VALUES clause with a list of date strings. The date strings are generated for each day in the range from the 'from' date to the 'to' date, inclusive.

Parameters: - from: The start date of the range, as a string in 'YYYY-MM-DD' format. - to: The end date of the range, as a string in 'YYYY-MM-DD' format.

Returns: A string representing a SQL VALUES clause with a list of date strings.

Usage:

{{ sdf_utils.generate_date_values('2022-01-01', '2022-01-03') }}

Output:

(VALUES ('2022-01-01'), ('2022-01-02'), ('2022-01-03'))

Generate Integer Values

Source Code

Generates a SQL VALUES clause with a list of incrementing integers. The 'from' int to the 'to' int are inclusive.

Parameters: - from: The starting integer of the range - to: The ending integer of the range

Returns: A string representing a SQL VALUES clause with a list of incrementing integers.

Usage:

{{ sdf_utils.generate_integer_values(1,5) }}

Output:

(VALUES 1,2,3,4,5)

Group By

Source Code

Builds a group by statement for fields 1...N

Parameters: - n: The number of fields to group by

Returns: A string representing a SQL GROUP BY clause with a list of incrementing integers.

Usage:

{{ sdf_utils.group_by(5) }}

Output:

group by 1,2,3,4,5

Generate Surrogate Key

Source Code

Generates a surrogate key for a given list of fields. The surrogate key is a unique identifier for each record in a table. It is generated by concatenating the values of the given fields and applying the MD5 hash function to the result. The fields are cast to varchar type before concatenation. If a field value is null, it is replaced with the string 'jinja_surrogate_key_null'.

Parameters: - fields: A list of field names for which the surrogate key should be generated.

Returns: A string representing a SQL GROUP BY clause with a list of incrementing integers.

Usage:

SELECT
    {{ sdf_utils.generate_surrogate_key(['column1', 'column2', 'column3']) }} AS surrogate_key,

Output:

SELECT
    md5(
        coalesce(cast(column1 as varchar),'_jinja_surrogate_key_null_')
         || '-' || coalesce(cast(column2 as varchar),'_jinja_surrogate_key_null_')
         || '-' || coalesce(cast(column3 as varchar),'_jinja_surrogate_key_null_')) AS surrogate_key,

Why the concatenated dash (|| '-' ||) in the output?

Without the delimiter, the inputs firstName = 'John Smith' and lastName = '', and firstName = 'John ' and lastName = 'Smith' would generate the same hash with generate_surrogate_key(["firstName", "lastName"]). The delimiter prevents this collision

Thanks @sophieml from Obie for the delimiter improvement ^ <3

Name		Name	Last commit message	Last commit date
Latest commit History 19 Commits
.github/workflows		.github/workflows
macros		macros
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
workspace.sdf.yml		workspace.sdf.yml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Official SDF Utils Library

SDF Utils

Date Spine

Generate Date Values

Generate Integer Values

Group By

Generate Surrogate Key

About

Releases

Packages

Contributors 2

Languages

License

sdf-labs/utils

Folders and files

Latest commit

History

Repository files navigation

Official SDF Utils Library

SDF Utils

Date Spine

Generate Date Values

Generate Integer Values

Group By

Generate Surrogate Key

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Languages

Packages