False drift with nomad variables containing newlines #476

Closed
optiz0r opened this issue Aug 16, 2024 · 9 comments · Fixed by hashicorp/nomad#24423

optiz0r commented Aug 16, 2024

Hi there,

Thank you for opening an issue. Please note that we try to keep the Terraform issue tracker reserved for bug reports and feature requests. For general usage questions, please see: https://www.terraform.io/community.html.

Terraform Version

Terraform v1.5.1
on linux_amd64
+ provider registry.terraform.io/hashicorp/nomad v2.3.0

Only tested on 1.5.1, but newer versions are very likely to behave the same, since the relevant logic lives only in Nomad core and this provider.

Nomad Version

Nomad v1.8.3+ent
BuildDate 2024-08-13T07:52:39Z
Revision 82fa712be0e7c1e07d6d630e0583c188347411ee

Provider Configuration

data "vault_nomad_access_token" "nomad_token" {
  backend = "nomad"
  role    = "admin"
}

provider "nomad" {
  secret_id = local.is_remote_run ? data.vault_nomad_access_token.nomad_token.secret_id : null
  region    = "london"
}

Environment Variables

NOMAD_ADDR=https://...:4646

Affected Resource(s)

Please list the resources as a list, for example:

  • nomad_job

Terraform Configuration Files

resource "nomad_job" "foo" {
  jobspec = file("${path.module}/foo.nomad.hcl")
  hcl2 {
    vars    = {
      example = <<-EOT
        This is an example multi
        line variable
        EOT
    }
  }
}

Debug Output

Omitted, includes sensitive data

Expected Behavior

If the example variable has not changed, no drift should be detected and no change should be made.

Actual Behavior

Since Nomad 1.8.2, Nomad re-encodes the newlines in variables received in JobSubmission so that the internal variables file is well formed, and the job once stopped through the UI can be started again.

This provider retrieves the modified variable definitions from the Nomad API and then compares them to the unmodified content of the example variable in the Terraform code. Since they no longer match, this is reported as drift, and the nomad_job resource is refreshed.

This doesn't appear to actually cause Nomad to interrupt the running job, but it does cause false reporting of changes.
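
To make the mismatch concrete, here is a sketch of the two forms that end up being compared. The second form is an assumption based on the description above (newlines re-encoded as `\n` escapes inside a quoted string); it was not captured from a real API response. The two fragments are alternatives and are not meant to be loaded together.

```hcl
# Form 1: what the Terraform configuration supplies. The heredoc evaluates
# to a string containing literal newline characters.
example = <<-EOT
  This is an example multi
  line variable
  EOT

# Form 2 (assumed): what Nomad 1.8.2+ is described as storing in the job's
# variable definitions, with each newline written as an escape sequence.
example="This is an example multi\nline variable\n"
```

Both forms evaluate to the same string value, so a comparison of the raw text would report a difference even though nothing has changed.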

Steps to Reproduce

  1. terraform apply
  2. terraform apply

Important Factoids

It's unclear whether this is a fault in Nomad core itself (i.e. the values returned by the read API should be unmodified, to match what Terraform would have already submitted) or a fault in the provider (i.e. it should pre-encode the newlines before submission, or ignore changes in newline encoding when diffing).

References

Author

optiz0r commented Oct 18, 2024

Actually it's worse than just newlines. The HCL file is also not handling quotes properly.

Input file:

blah blah
some_string="foobar"
some_other_string="foo&bar"
settings_file="blah blah\nsome_string="foobar"\nsome_other_string="foo&bar""

Attempting to restart a job such as the above fails with an error that bitwise AND is not supported. It's not possible to restart a job which has been stopped if one of the input variables contained quotes and other symbols, which is a repeat of #23560.
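
For comparison, here is a sketch of how the same value would need to be written for the variables file to parse, assuming standard HCL string escaping (a backslash before each embedded double quote). It illustrates the expected encoding only, not what any fix actually produces.

```hcl
# Unescaped form reported above (does not parse). The quote before foobar
# ends the string early, so foo&bar lands outside any string literal and
# the & is read as an operator:
#
#   settings_file="blah blah\nsome_string="foobar"\nsome_other_string="foo&bar""

# Escaped form (assumed target encoding). Embedded quotes carry a backslash,
# so the whole value stays inside a single string literal.
settings_file = "blah blah\nsome_string=\"foobar\"\nsome_other_string=\"foo&bar\""
```

Reading `foo&bar` as an expression rather than as string content is consistent with the reported error about bitwise AND.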

@mmcquillan mmcquillan added the hcc/jira Admin - internal label Oct 18, 2024
Member

tgross commented Oct 18, 2024

@optiz0r the bug fix you linked to, hashicorp/nomad#23560, landed in Nomad 1.8.2 (ref changelog), but looking at the releases for the provider, that API didn't land until v2.3.1. Provider v2.3.0 has the 1.8.0 API.

Can you verify you're seeing this with a more recent version of the provider?

Author

optiz0r commented Oct 18, 2024

I was running on 2.3.0 of the provider, but having just tested, I am seeing the same behaviour on 2.3.1 and 2.4.0.

On reflection I think the quoting escape is actually a problem in the hashicorp/nomad#23560 implementation (improper encoding of user input), rather than a problem with the provider. The drift relating to encoded newlines might fall either in nomad core or this provider.

Member

tgross commented Oct 18, 2024

Ok, thanks @optiz0r, we'll take a look. (Leaving a note that this is being tracked internally as https://hashicorp.atlassian.net/browse/NET-11421)

@Juanadelacuesta Juanadelacuesta self-assigned this Nov 4, 2024
Member

Juanadelacuesta commented Nov 4, 2024

Hi @optiz0r I have been unsuccessfully trying to reproduce this bug with the following configuration:
Nomad provider: 2.4.0
Nomad: v1.9.1

Terraform file:

provider "nomad" {
  secret_id = local.is_remote_run ? data.vault_nomad_access_token.nomad_token.secret_id : null
}

data "vault_nomad_access_token" "nomad_token" {
  backend = "nomad"
  role    = "role-name"
}

locals {
  is_remote_run = true
}

resource "nomad_job" "foo" {
  jobspec = file("${path.module}/foo.nomad.hcl")
  hcl2 {
    vars    = {
      example = <<-EOT
        
        This is an example multi

        line changed
        EOT
    }
  }
}

Nomad job:

variable "example" {
  type = string
}

job "example" {
  group "group" {
    network {
      port "www" {
        to = 8007
      }
    }

    task "task" {

      identity {
        env  = true
        file = true
      }

      driver = "docker"
      config {
        image   = "busybox:1"
        command = "httpd"
        args    = ["-v", "-f", "-p", "${NOMAD_PORT_www}", "-h", "/${NOMAD_ALLOC_DIR}/"]
        ports   = ["www"]
      }

      template {
        data        = "<html>hello, world</html>"
        destination = "${NOMAD_ALLOC_DIR}/index.html"
      }
      
      template {
        data = var.example
        destination = "local/count-api.txt"
      }
      

      resources {
        cpu    = 100
        memory = 1000
      }
    }
  }
}

I have run terraform apply multiple times, interacted with the job from the UI, and changed the variable value, and I'm still unable to trigger a false change. Is there anything else in your configuration that is not being taken into account here?

Author

optiz0r commented Nov 6, 2024

Hi @Juanadelacuesta ,

The principal difference is that we're also seeding the hcl2 variables via the file() function.

resource "nomad_job" "foo" {
  jobspec = file("${path.module}/foo.nomad.hcl")
  hcl2 {
    vars    = {
      settings_file = file("${path.module}/settings_file.conf")
    }
  }
}

Under Nomad 1.8.3, using the example inputs from my previous comment, the HCL Variable Values textarea on the Definition tab in the UI shows:

settings_file="blah blah\nsome_string="foobar"\nsome_other_string="foo&bar""

The quotes in the output of the file() call are not escaped at any point, and are embedded literally into the HCL variable values file. This then prevents a stopped job from being started again, because the HCL variable values file fails to parse.

I see the same with both the previous and current version of terraform-provider-nomad.

Member

Juanadelacuesta commented Nov 6, 2024

Hello @optiz0r, thank you for the pointer. I will revisit it and hopefully we will solve it this time around.

@Juanadelacuesta
Member

@optiz0r I see the error now, let's start working on fixing it!

Author

optiz0r commented Dec 19, 2024

Thanks for the fix. Is it possible this could be backported to 1.8+ent LTS?
