Merge pull request #63 from fivetran/release/v0.8.0-official

v0.8.0 Official Release
fivetran · Feb 16, 2022 · e5c3d3a
2 parents c3a5f05 + ac4ffed
commit e5c3d3a
Showing 25 changed files with 376 additions and 69 deletions.
diff --git a/.circleci/config.yml b/.circleci/config.yml
@@ -30,8 +30,9 @@ jobs:
             cd integration_tests
             dbt deps
             dbt seed --target redshift --full-refresh
+            dbt run --vars '{using_schedules: false, using_domain_names: false, using_user_tags: false, using_ticket_form_history: false, using_organization_tags: false}' --target redshift --full-refresh
             dbt run --target redshift --full-refresh
-            dbt run --vars '{using_schedules: false, using_domain_names: false, using_user_tags: false, using_ticket_form_history: false, using_organization_tags: false}' --target redshift
+            dbt run --target redshift
             dbt test --target redshift
       - run:
           name: "Run Tests - Postgres"
@@ -41,8 +42,9 @@ jobs:
             cd integration_tests
             dbt deps
             dbt seed --target postgres --full-refresh
+            dbt run --vars '{using_schedules: false, using_domain_names: false, using_user_tags: false, using_ticket_form_history: false, using_organization_tags: false}' --target postgres --full-refresh
             dbt run --target postgres --full-refresh
-            dbt run --vars '{using_schedules: false, using_domain_names: false, using_user_tags: false, using_ticket_form_history: false, using_organization_tags: false}' --target postgres
+            dbt run --target postgres
             dbt test --target postgres
       - run:
           name: "Run Tests - Snowflake"
@@ -52,8 +54,9 @@ jobs:
             cd integration_tests
             dbt deps
             dbt seed --target snowflake --full-refresh
+            dbt run --vars '{using_schedules: false, using_domain_names: false, using_user_tags: false, using_ticket_form_history: false, using_organization_tags: false}' --target snowflake --full-refresh
             dbt run --target snowflake --full-refresh
-            dbt run --vars '{using_schedules: false, using_domain_names: false, using_user_tags: false, using_ticket_form_history: false, using_organization_tags: false}' --target snowflake
+            dbt run --target snowflake
             dbt test --target snowflake
       - run:
           name: "Run Tests - BigQuery"
@@ -66,8 +69,9 @@ jobs:
             cd integration_tests
             dbt deps
             dbt seed --target bigquery --full-refresh
+            dbt run --vars '{using_schedules: false, using_domain_names: false, using_user_tags: false, using_ticket_form_history: false, using_organization_tags: false}' --target bigquery --full-refresh
             dbt run --target bigquery --full-refresh
-            dbt run --vars '{using_schedules: false, using_domain_names: false, using_user_tags: false, using_ticket_form_history: false, using_organization_tags: false}' --target bigquery
+            dbt run --target bigquery
             dbt test --target bigquery
       - save_cache:
           key: deps2-{{ .Branch }}

diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -1,3 +1,15 @@
+# dbt_zendesk v0.8.0
+# 🚨 Breaking Changes 🚨
+- The logic used to generate the `zendesk__ticket_backlog` model was updated to more accurately map backlog changes to tickets. As the underlying `zendesk__ticket_field_history` model is incremental, we recommend a `--full-refresh` after installing this latest version of the package. ([#61](https://github.com/fivetran/dbt_zendesk/pull/61))
+# Features
+- Addition of the [DECISIONLOG.md](https://github.com/fivetran/dbt_zendesk/blob/main/DECISIONLOG.md). This file contains detailed explanations for the opinionated transformation logic found within this dbt package. ([#59](https://github.com/fivetran/dbt_zendesk/pull/59))
+# Bug Fixes
+- Added logic required to account for the `first_reply_time` when the first comment is an internal comment and there are no previous external comments applied to the ticket. ([#59](https://github.com/fivetran/dbt_zendesk/pull/59))
+- For those using schedules, the package now incorporates Daylight Saving Time and uses the proper timezone offsets when calculating UTC timestamps. Business minute metrics are calculated more accurately as a result; previously the package did not account for daylight time and used only the standard time offsets. ([#62](https://github.com/fivetran/dbt_zendesk/issues/62))
+
+## Under the Hood
+- Updated the incremental logic within `int_zendesk__field_history_scd` to include an additional partition for `ticket_id`. This allows for a more accurate generation of ticket backlog records. ([#61](https://github.com/fivetran/dbt_zendesk/pull/61))
+- Corrected the spelling of the partition field within the cte in `int_zendesk__field_history_scd` to be `partition` as opposed to `patition`. ([#61](https://github.com/fivetran/dbt_zendesk/pull/61))
 # dbt_zendesk v0.8.0-b1
 🎉 dbt v1.0.0 Compatibility Pre Release 🎉 An official dbt v1.0.0 compatible version of the package will be released once existing feature/bug PRs are merged.
 ## 🚨 Breaking Changes 🚨

diff --git a/DECISIONLOG.md b/DECISIONLOG.md
@@ -1,10 +1,114 @@
-### Zendesk Backlog Tickets
+# Decision Log
 
+
+## Zendesk Backlog Tickets
 - You may find some discrepancies between what Zendesk reports and what our model reports for the total number of backlog tickets on a given day. After investigating, we found this is due to Zendesk taking a snapshot of each day sometime in the 23rd hour, as stated in their [article](https://support.zendesk.com/hc/en-us/articles/4408819342490-Why-does-the-Backlog-dataset-only-show-the-Backlog-recorded-Hour-as-23-).
- 
+
 ```
 Because backlog data is captured on a per-day basis, it cannot be segmented hourly. The Backlog recorded - Hour is listed as 23 because data is captured daily between 11 pm, 12 am, or 1 am depending on factors like Daylight Saving Time (DST). 
 For more information, see the article: Analyzing your ticket backlog history with Explore.
 ```
 
 - While Zendesk doesn't segment their backlog data per hour, we always try to model our data starting at a finer granularity. This means we start by taking the _hour_ from the timestamp field in the Zendesk source tables and then roll it up to _day_. Therefore, there will be edge cases where tickets updated near the end of the day may fall into different statuses, depending on whether you're looking at the Zendesk Backlog dashboard or our model outputs.
+
+## Business Time Metrics
+When developing this package we noticed Zendesk reported ticket response times in business minutes based on the last schedule applied to the ticket. However, we felt this is not an accurate representation of the true ticket elapsed time in business minutes. Therefore, we made the opinionated decision to apply logic within our transformations to calculate the cumulative elapsed time in business minutes of a ticket across **all** schedules to which the ticket was assigned during its lifetime.
+
+Below is a quick explanation of how this is calculated within the dbt package for **first_reply_time_business_minutes** as well as how this differs from Zendesk's logic:
+> Note: While this is an example of `first_reply_time_business_minutes`, the logic is the same for other business minute metrics.
+
+- A ticket (`941606`) is created on `2020-09-29 17:01:38 UTC` and first solved at `2020-10-01 15:03:44 UTC`.
+- When the ticket was created it was assigned the schedule `Level 1 Chicago`
+  - The schedule intervals are expressed as the number of minutes since the start of the week.
+  - Sunday is considered the start of the week.
+- The `Level 1 Chicago` schedule can be interpreted as the following:
+
+| **start_time_utc** | **end_time_utc**  | 
+| ------------------ | ----------------- |
+| 720  | 1560  |
+| 2160 | 3000  |
+| 3600 | 4440  |
+| 5040 | 5880  |
+| 6480 | 7320  |
+| 7920 | 8760  |
+| 9360 | 10200 |
+
+- Looking closer into the ticket, we also see another schedule `Level 2 San Francisco` was assigned to the ticket on `2020-09-30 19:01:25 UTC`
+- The `Level 2 San Francisco` schedule can be interpreted as the following:
+
+| **start_time_utc** | **end_time_utc**  | 
+| ------------------ | ----------------- |
+| 2340 | 2910 |
+| 3780 | 4350 |
+| 5220 | 5790 |
+| 6660 | 7230 |
+| 8100 | 8670 |
+
+- Now that we know the ticket had two schedules, let's see the comments exchanged within this ticket to capture when the `first_reply_time` was recorded.
+
+| **ticket_id** | **field_name** | **is_public** | **commenter_role** | **valid_starting_at** |
+| ------------- | -------------- | ------------- | ------------------ | --------------------- |
+| 941606 | comment | TRUE | external_comment | 2020-09-29 17:01:38 UTC |
+| 941606 | comment | FALSE | internal_comment | 2020-09-30 19:01:25 UTC |
+| 941606 | comment | TRUE | internal_comment | 2020-09-30 19:01:46 UTC |
+| 941606 | comment | TRUE | internal_comment | 2020-10-01 15:03:44 UTC |
+
+- Seeing the comments made to the ticket, we understand that the customer commented on the ticket at `2020-09-29 17:01:38 UTC` and the first **public** internal comment was made at `2020-09-30 19:01:46 UTC`.
+- Comparing the two schedules associated with this ticket, we can see that the `Level 1 Chicago` schedule was set for almost the entire duration of the ticket before the first reply, whereas the `Level 2 San Francisco` schedule was only set for 21 seconds.
+  - Regardless, we will be using both schedules in the calculation of the `first_reply_time_business_minutes`.
+- Now that we have the schedules, the schedule intervals, and the `first_reply_time`, we can calculate the total elapsed `first_reply_time_business_minutes`. But first, let's convert the UTC timestamps to the Zendesk-esque intervals expressed within the schedules:
+> The `Interval Results` are calculated via: `(Full Days From Sunday * 24 * 60) + (Hours * 60) + Minutes`
+
+| **Action** | **Timestamp** | **Full Days from Sunday** | **Hours** | **Minutes** | **Interval Result** |
+| ---------- | ------------- | ------------------------- | --------- | ----------- | ------------------- |
+| Ticket Created and Schedule set to Level 1 Chicago | `Tuesday, September 29, 2020 at 5:01:38 PM` | 2 | 17 | 2 | 3902 |
+| Schedule changed to Level 2 San Francisco | `Wednesday, September 30, 2020 at 7:01:25 PM` | 3 | 19 | 1.25 | 5461.25 |
+| First Public Internal Comment | `Wednesday, September 30, 2020 at 7:01:46 PM` | 3 | 19 | 1.46 | 5461.46 |
+
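As a minimal sketch of the interval formula quoted above, the following Python snippet converts a UTC timestamp into minutes since Sunday 00:00 UTC. The function name `minutes_since_sunday` is hypothetical and not part of the package, which implements its logic in SQL. It returns true fractional minutes, so its values differ slightly from the table, which appears to round the first row to a whole minute and to write seconds after the decimal point in the other rows (for example, `1.25` for 1 minute 25 seconds).

```python
from datetime import datetime, timezone


def minutes_since_sunday(ts: datetime) -> float:
    """Minutes elapsed since the most recent Sunday 00:00 (hypothetical helper).

    Mirrors: (Full Days From Sunday * 24 * 60) + (Hours * 60) + Minutes
    """
    # datetime.weekday() uses Monday=0 ... Sunday=6; shift so that Sunday=0.
    full_days_from_sunday = (ts.weekday() + 1) % 7
    # Seconds are kept as a true fraction of a minute.
    minutes = ts.minute + ts.second / 60
    return full_days_from_sunday * 24 * 60 + ts.hour * 60 + minutes


# Timestamps taken from the example ticket above.
created = datetime(2020, 9, 29, 17, 1, 38, tzinfo=timezone.utc)
schedule_changed = datetime(2020, 9, 30, 19, 1, 25, tzinfo=timezone.utc)

print(round(minutes_since_sunday(created)))        # nearest whole minute
print(int(minutes_since_sunday(schedule_changed)))  # whole minutes only
```

Shifting `weekday()` by one keeps Sunday as the start of the week, matching the schedule convention described earlier in this example.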
+- With the Interval Results obtained above, we can see where these overlap within the schedules.
+
+**Level 1 Chicago**
+> Overlap was from 3902 to 5461.25 and falls within two intervals
+
+| **start_time_utc** | **end_time_utc**  | 
+| ------------------ | ----------------- |
+| 720  | 1560  |
+| 2160 | 3000  |
+| >**3600**<  | >**4440**<  |
+| >**5040**< | >**5880**<  |
+| 6480 | 7320  |
+| 7920 | 8760  |
+| 9360 | 10200 |
+
+**Level 2 San Francisco**
+> Only overlap was from 5461.25 to 5461.46 and falls within one interval
+
+| **start_time_utc** | **end_time_utc**  | 
+| ------------------ | ----------------- |
+| 2340 | 2910 |
+| 3780 | 4350 |
+| >**5220**< | >**5790**< |
+| 6660 | 7230 |
+| 8100 | 8670 |
+
+- Now let's figure out the overlapping duration
+
+| **Schedule** | **Schedule start_time_utc** | **Schedule end_time_utc**  | **Ticket Start** | **Ticket End** | **Difference** | 
+|----| ------------------ | -----------------| ------------------ | -----------------| ------------------ |
+| `Level 1 Chicago` | 3600 | >**4440**< | >**3902**< | 5461.25 | 538 |
+| `Level 1 Chicago` | >**5040**< | 5880 | 3902 | >**5461.25**< | 421.25 |
+| `Level 2 San Francisco` | >**5220**< (We use **5461.25** to account for overlap) | 5790 | 5461.25 | >**5461.46**< | 0.21 |
+
+- Adding the differences above we arrive at a total `first_reply_time_business_minutes` of 959.46 minutes.
+
+- So how does Zendesk calculate this?
+  - Instead of taking into account the various schedules used by the ticket, Zendesk will instead use the **last** schedule applied to the ticket to record the duration in business minutes.
+- Therefore, in the example above Zendesk will **only** use the `Level 2 San Francisco` schedule when calculating the `first_reply_time_business_minutes` for ticket `941606`.
+  - Below is an example of how Zendesk calculates this:
+
+| **Schedule** | **Schedule start_time_utc** | **Schedule end_time_utc**  | **Ticket Start** | **Ticket End** | **Difference** | 
+|----| ------------------ | -----------------| ------------------ | -----------------| ------------------ |
+| `Level 2 San Francisco` | 3780 | >**4350**<  | 3902 | 5461.46 | 448 |
+| `Level 2 San Francisco` | >**5220**<  | 5790 | 3902 | 5461.46 | 241.46 |
+
+- Adding the differences above we arrive at a total `first_reply_time_business_minutes` of 689.46 minutes.
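The two totals above can be reproduced with a short overlap sum. This is a sketch in Python, not the package's SQL implementation: `business_minutes` and the variable names are hypothetical, the interval values are copied from the tables above, and it assumes the ticket window stays within a single week.

```python
def business_minutes(start, end, intervals):
    """Sum the overlap between the window [start, end] and each schedule interval."""
    total = 0.0
    for interval_start, interval_end in intervals:
        # Overlap is the clipped window length; negative means no overlap.
        overlap = min(end, interval_end) - max(start, interval_start)
        total += max(0.0, overlap)
    return total


# Schedule intervals in minutes since the start of the week (Sunday).
LEVEL_1_CHICAGO = [(720, 1560), (2160, 3000), (3600, 4440), (5040, 5880),
                   (6480, 7320), (7920, 8760), (9360, 10200)]
LEVEL_2_SAN_FRANCISCO = [(2340, 2910), (3780, 4350), (5220, 5790),
                         (6660, 7230), (8100, 8670)]

# Interval Results from the example ticket.
created, schedule_changed, first_reply = 3902, 5461.25, 5461.46

# Package approach: each schedule counts only while it was assigned.
package_total = (
    business_minutes(created, schedule_changed, LEVEL_1_CHICAGO)
    + business_minutes(schedule_changed, first_reply, LEVEL_2_SAN_FRANCISCO)
)

# Zendesk approach: the last schedule is applied to the whole window.
zendesk_total = business_minutes(created, first_reply, LEVEL_2_SAN_FRANCISCO)

print(round(package_total, 2), round(zendesk_total, 2))
```

Splitting the window at the schedule change is what produces the difference: the package counts the Chicago hours that elapsed before the switch, while the last-schedule approach measures the whole window against San Francisco hours only.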
diff --git a/README.md b/README.md
@@ -30,7 +30,7 @@ Include in your `packages.yml` to stay up to date with the latest release!
 ```yaml
 packages:
   - package: fivetran/zendesk
-    version: 0.8.0-b1
+    version: [">=0.8.0", "<0.9.0"]
 ```
 ## Package Maintenance
 The Fivetran team maintaining this package **only** maintains the latest version. We highly recommend you keep your `packages.yml` updated with the [dbt hub latest version](https://hub.getdbt.com/fivetran/zendesk/latest/). You may refer to the [CHANGELOG](/CHANGELOG.md) and release notes for more information on changes across versions.
@@ -137,7 +137,9 @@ vars:
   zendesk:
     ticket_field_history_timeframe_years: integer_number_of_years # default = 50 (everything)
 ```
-
+## Opinionated Modelling Decisions
+### Business Time Metrics Logic
+This dbt package takes an opinionated stance on how business time metrics are calculated: it takes **all** schedules into account when calculating the business time duration, whereas the Zendesk UI logic takes into account **only** the latest schedule assigned to the ticket. For a deeper explanation of the logic used by default in the dbt package, see the [DECISIONLOG](/DECISIONLOG.md).
 ## Database support
 This package is compatible with BigQuery, Snowflake, Redshift and Postgres.
 

diff --git a/dbt_project.yml b/dbt_project.yml
@@ -43,6 +43,8 @@ vars:
     ticket_tag: "{{ ref('stg_zendesk__ticket_tag') }}"
     user_tag: "{{ ref('stg_zendesk__user_tag') }}"
     user: "{{ ref('stg_zendesk__user') }}"
+    daylight_time: "{{ ref('stg_zendesk__daylight_time') }}"
+    time_zone: "{{ ref('stg_zendesk__time_zone') }}"
     using_schedules: true
     using_domain_names: true
     using_user_tags: true

diff --git a/docs/catalog.json b/docs/catalog.json