Properly handle recurring downtimes definitions #1092

armcburney · 2021-06-01T15:04:02Z

Overview

This PR implements a workaround for handling recurring downtimes with the Datadog Terraform provider. Recurring downtimes differ from regular 'one-off' downtimes in that each subsequent recurrence is scheduled as a new downtime derived from the previous one. Conceptually, we can think of this as a "linked list" of downtimes in which each downtime is scheduled from the definition of the previous recurrence. All fields are copied over from the previous downtime, except for fields such as start and end, which are calculated from the recurrence attribute.
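The "linked list" idea above can be sketched roughly as follows. This is only an illustration under stated assumptions: the `Downtime` and `Recurrence` types, the `nextRecurrence` helper, and the child ID 1338 are hypothetical and are not the provider's or the API's actual types, and a fixed 86400-second day ignores time zone effects such as DST.

```go
package main

import "fmt"

// Downtime is a simplified, hypothetical model of a downtime record.
type Downtime struct {
	ID       int
	ParentID int   // 0 for the original parent (assumption for this sketch)
	Start    int64 // Unix seconds
	End      int64 // Unix seconds
	Message  string
}

// Recurrence is a hypothetical daily recurrence; PeriodDays is the repeat interval.
type Recurrence struct {
	PeriodDays int64
}

const secondsPerDay = 86400

// nextRecurrence copies every field from prev, then assigns a new ID,
// links back to prev, and shifts start/end by the recurrence period.
func nextRecurrence(prev Downtime, r Recurrence, newID int) Downtime {
	next := prev
	next.ID = newID
	next.ParentID = prev.ID
	shift := r.PeriodDays * secondsPerDay
	next.Start = prev.Start + shift
	next.End = prev.End + shift
	return next
}

func main() {
	parent := Downtime{ID: 1337, Start: 1620929400, End: 1620929700, Message: "test this out"}
	child := nextRecurrence(parent, Recurrence{PeriodDays: 1}, 1338)
	fmt.Printf("%+v\n", child)
}
```

With a one-day period, the shifted boundaries match the start and end values that appear in the plan output under Caveat One.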

The Datadog Terraform provider keeps a reference to the original parent downtime ID. Since subsequent recurrences are scheduled as new downtimes (each with a new ID), updates made in the UI/API to the current 'child' downtime were previously not recognized when comparing the downtime's state with what we store in Terraform. Moreover, after downtimes expire, we delete them from our database after a certain period of time. This behavior was recently changed so that we don't delete an expired downtime if it is the first downtime in the recurrence chain (i.e., the original parent downtime).

Instead, we now return that downtime with a new active_child field in our GET /api/v1/downtime API, which we use to compare state with Terraform. This way, updates from the UI/API are propagated back to Terraform. Additionally, when making updates through Terraform, we call the PUT /api/v1/downtime/{downtime_id} endpoint with the active_child definition on the active_child's ID, so that changes from Terraform are made to the currently active recurrence downtime.
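A minimal sketch of that selection logic, assuming a trimmed-down response shape: the `Downtime` struct, the `effective` and `updatePath` helpers, and the child ID 1338 are hypothetical names for illustration, not the provider's actual code or the generated API client types.

```go
package main

import "fmt"

// Downtime is a hypothetical, trimmed-down view of a downtime returned by the API.
type Downtime struct {
	ID          int
	Message     string
	ActiveChild *Downtime // nil when the downtime has no active child
}

// effective returns the definition to compare against Terraform state:
// the active child when one exists, otherwise the downtime itself.
func effective(d Downtime) Downtime {
	if d.ActiveChild != nil {
		return *d.ActiveChild
	}
	return d
}

// updatePath builds the PUT path so that updates target the active child's ID
// while Terraform keeps tracking the original parent ID in state.
func updatePath(d Downtime) string {
	return fmt.Sprintf("/api/v1/downtime/%d", effective(d).ID)
}

func main() {
	parent := Downtime{
		ID:          1337,
		Message:     "test this out",
		ActiveChild: &Downtime{ID: 1338, Message: "test this out edit ui"},
	}
	fmt.Println(updatePath(parent))
	fmt.Println(effective(parent).Message)
}
```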

Caveats

Caveat One

UPDATE: We don't check the start/end boundaries for changes if the recurring downtime is a child, which prevents superfluous diffs every time the downtime is rescheduled. The downside of this approach is that if the start/end values are changed in the UI on a child recurring downtime, the diff will not be picked up by Terraform. We plan to iterate on this solution to address this shortcoming, but we feel the benefits of having Terraform handle the child/parent references make this worth merging.
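The behavior described in this update can be sketched as below. This is an illustration only: `Downtime` and `changedFields` are hypothetical and reduced to three fields, and the real provider works on Terraform resource data rather than plain structs.

```go
package main

import "fmt"

// Downtime is a hypothetical model holding only the fields relevant to this sketch.
type Downtime struct {
	Start   int64
	End     int64
	Message string
}

// changedFields lists the attributes that differ between Terraform state and the
// remote definition. When the remote definition is an active child, start and end
// are skipped, because they shift on every reschedule.
func changedFields(state, remote Downtime, remoteIsActiveChild bool) []string {
	var changed []string
	if state.Message != remote.Message {
		changed = append(changed, "message")
	}
	if !remoteIsActiveChild {
		if state.Start != remote.Start {
			changed = append(changed, "start")
		}
		if state.End != remote.End {
			changed = append(changed, "end")
		}
	}
	return changed
}

func main() {
	state := Downtime{Start: 1620929400, End: 1620929700, Message: "test this out"}
	child := Downtime{Start: 1621015800, End: 1621016100, Message: "test this out edit ui"}
	fmt.Println(changedFields(state, child, true))
}
```

The trade-off is visible in the sketch: a genuine start/end edit on the child is skipped along with the rescheduling shift.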

~~Since new downtimes are scheduled each time a recurrence is rescheduled, fields like start and end will perpetually differ after the first schedule when running terraform plan/terraform apply.~~

$ terraform plan
datadog_downtime.recurring_downtime: Refreshing state... [id=1337]

An execution plan has been generated and is shown below.
Resource actions are indicated with the following symbols:
  ~ update in-place

Terraform will perform the following actions:

  # datadog_downtime.recurring_downtime will be updated in-place
  ~ resource "datadog_downtime" "recurring_downtime" {
      ~ end             = 1621016100 -> 1620929700
        id              = "1337"
      ~ message         = "test this out edit ui" -> "test this out"
      ~ start           = 1621015800 -> 1620929400
        # (6 unchanged attributes hidden)

        # (1 unchanged block hidden)
    }

Plan: 0 to add, 1 to change, 0 to destroy.

I recommend we update the documentation around recurring downtimes to document this problem and recommend an approach to mitigating it (i.e., using ignore_changes to ignore the start and end updates).

resource "datadog_downtime" "recurring_downtime" {
  scope        = ["*"]
  start        = 1620929400
  end          = 1620929700
  timezone     = "EST"
  monitor_tags = ["github:armcburney"]
  message      = "test this out"

  recurrence {
    type   = "days"
    period = 1
  }

  lifecycle {
    ignore_changes = [
      start,
      end
    ]
  }
}

Caveat Two

Recurring downtimes created before 2021-05-13 using Terraform will need to be deleted and recreated for the references to work with the newest version of the Terraform provider. All recurring downtimes created after 2021-05-13 will have the updated parent/child references, ensuring they’ll work as expected with the latest version of the provider. We apologize for this inconvenience.

Fixes


datadog/resource_datadog_downtime.go

phillip-dd · 2021-06-07T13:41:30Z

thanks @armcburney! This makes sense to me, except for this part:

fields like start and end will perpetually differ after the first schedule

From a customer perspective, I don't think this is what we want: the provider should not show a diff when this is working as expected. Also, if customers use ignore_changes, they won't be able to see any true diffs when the recurrence is updated in the UI.

Some other options:

- ignore start/end explicitly in the provider
- always compare start/end on the parent downtime (this would only catch changes if the original parent was changed)
- compare duration: e.g. end-start
- something else?
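For illustration, the "compare duration" option could look roughly like the sketch below. The `sameDuration` helper is hypothetical and is not part of the provider; it only shows that the length of the window stays stable across reschedules even though the absolute start/end values shift.

```go
package main

import "fmt"

// sameDuration reports whether two downtime windows have the same length.
// All arguments are Unix timestamps in seconds.
func sameDuration(stateStart, stateEnd, remoteStart, remoteEnd int64) bool {
	return stateEnd-stateStart == remoteEnd-remoteStart
}

func main() {
	// Values taken from the plan output in the PR description: both windows are 300 seconds long.
	fmt.Println(sameDuration(1620929400, 1620929700, 1621015800, 1621016100))
}
```

A comparison like this would still miss a UI change that moves the window without resizing it.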


phillip-dd

👍 couple of questions, but this looks good to me. Definitely an improvement!

phillip-dd · 2021-06-07T12:35:07Z

GNUmakefile

 	gotestsum --hide-summary skipped --format testname --debug --packages $(TEST) -- $(TESTARGS) -timeout=30s

 # Run acceptance tests (this runs integration CRUD tests through the terraform test framework)
-testacc: get-test-deps fmtcheck lint
+testacc: get-test-deps fmtcheck


I saw there was a Slack conversation about this. @therve, can you confirm this is what the recommendation was?

Yes we'll come back to fix it later.

phillip-dd · 2021-06-11T01:16:34Z

docs/resources/downtime.md

@@ -50,6 +50,7 @@ resource "datadog_downtime" "foo" {

 ### Optional

+- **active_child_id** (Number) The id corresponding to the downtime object definition of the active child for the original parent recurring downtime. This field will only exist on recurring downtimes.


If we include this, I think it should actually be read-only.

Bumping this: is this part auto-generated or a copy/paste?

The doc is auto-generated. To avoid this, we should remove the line below, since the attribute is read-only:

terraform-provider-datadog/datadog/resource_datadog_downtime.go

Line 253 in f06954f

Optional: true,

datadog/resource_datadog_downtime.go


phillip-dd

lgtm - we may need to update the PR title for the change log

therve · 2021-06-17T13:45:22Z

/azp run

azure-pipelines · 2021-06-17T13:45:35Z

Azure Pipelines successfully started running 1 pipeline(s).

armcburney added 6 commits May 13, 2021 12:44

[MA-2231] Properly handle recurring downtimes definitions in terraform.

d5dd8a8

[MA-2231] Fix getID types for active_child_id case.

1c5cbc1

[MA-2231] Fix updates for active child downtimes.

40435c8

[MA-2231] Don't set the ID we store in state when we update the active child recurrence downtime.

0f9c443

Merge remote-tracking branch 'origin/master' into armcburney/recurring_downtimes

264c792

[MA-2231] Fix build after merging origin/master.

ea0614d

armcburney commented Jun 1, 2021


armcburney added 2 commits June 1, 2021 12:14

[MA-2231] Remove deprecated golint check.

e1a6de8

[MA-2231] Commit autogenerated documentation.

a544f70

armcburney marked this pull request as ready for review June 2, 2021 11:59

armcburney requested review from a team as code owners June 2, 2021 11:59

armcburney added 3 commits June 9, 2021 16:35

[MA-1784] Ignore start/end changes if the recurring downtime is the active_child.

4c7a5b8

Merge remote-tracking branch 'origin/master' into armcburney/recurring_downtimes

73aceb4

[MA-1784] Use d.GetOk() interface rather than checking nil.

1bee4d9

phillip-dd reviewed Jun 11, 2021


[MA-1784] Bugfixes for not being able to change start/end on child recurring downtimes.

f06954f

phillip-dd previously approved these changes Jun 15, 2021


therve changed the title ~~[MA-2231] Properly handle recurring downtimes definitions in terraform.~~ Properly handle recurring downtimes definitions Jun 17, 2021

Remove optional

77e0093

therve dismissed phillip-dd’s stale review via 77e0093 June 17, 2021 13:44

therve approved these changes Jun 17, 2021


therve enabled auto-merge (squash) June 17, 2021 13:45

therve merged commit 8a594bb into master Jun 17, 2021

therve deleted the armcburney/recurring_downtimes branch June 17, 2021 14:02

NBParis mentioned this pull request Jun 18, 2021

Downtime recreated after recurrence #391

Closed

NBParis linked an issue Jun 18, 2021 that may be closed by this pull request

Downtime recreated after recurrence #391

Closed

NBParis mentioned this pull request Jun 18, 2021

Feature request: Please support parent_id on recurring Downtimes #109

Closed


phillip-dd commented Jun 7, 2021

phillip-dd left a comment

phillip-dd Jun 7, 2021

therve Jun 11, 2021

phillip-dd Jun 11, 2021

phillip-dd Jun 15, 2021

skarimo Jun 15, 2021

phillip-dd left a comment

therve commented Jun 17, 2021

azure-pipelines bot commented Jun 17, 2021
