Grafana Provider Error - "Set the auth and url provider attributes" #960

Open
Will282 opened this issue Jun 29, 2023 · 24 comments

@Will282

Will282 commented Jun 29, 2023

Terraform Version

  • Terraform: 1.5.2
  • Terraform Grafana Provider: 1.42.0
  • Grafana: 9.4

Affected Resource(s)

Error raised by Grafana Provider directly

provider "grafana" {
  url  = "https://${module.core_infra.grafana_workspace_endpoint}"
  auth = module.core_infra.grafana_api_key
}

Where core_infra is a module which instantiates an Amazon Managed Grafana instance using terraform-aws-modules/managed-service-grafana

Resources being deployed:

  • grafana_data_source
  • grafana_folder
  • grafana_dashboard

Terraform Configuration Files

Working on an example I can share.

Debug Output

Working on an example I can share the full output from.
An example error is as below:

{
  "@level": "error",
  "@message": "Error: the Grafana client is required for `grafana_folder`. Set the auth and url provider attributes",
  "@module": "terraform.ui",
  "@timestamp": "2023-06-29T13:23:12.126952Z",
  "diagnostic": {
    "severity": "error",
    "summary": "the Grafana client is required for `grafana_folder`. Set the auth and url provider attributes",
    "detail": "",
    "address": "module.usecase_module_one.module.generic_grafana_module.grafana_folder.usecase_folder",
    "range": {
      "filename": ".terraform/modules/usecase_module_one.generic_grafana_module/grafana.tf",
      "start": { "line": 25, "column": 44, "byte": 917 },
      "end": { "line": 25, "column": 45, "byte": 918 }
    },
    "snippet": {
      "context": "resource \"grafana_folder\" \"usecase_folder\"",
      "code": "resource \"grafana_folder\" \"usecase_folder\" {",
      "start_line": 25,
      "highlight_start_offset": 43,
      "highlight_end_offset": 44,
      "values": []
    }
  },
  "type": "diagnostic"
}

Panic Output

N/A

Expected Behavior

We use "usecase" modules and have the following structure in our Terraform workspace:
Workspace

  • core_infra module (contains Amazon Managed Grafana)
  • Grafana Provider (using output from core_infra)
  • usecase_module_one
    • generic_grafana_module (contains Grafana resources grafana_data_source & grafana_folder, grafana_dashboard)
  • usecase_module_two
    • generic_grafana_module (contains Grafana resources grafana_data_source & grafana_folder, grafana_dashboard)

The Grafana Provider is created using the output of the core_infra module, specifically module.core_infra.grafana_workspace_endpoint and module.core_infra.grafana_api_key to configure the Provider with the "url" and "auth" parameters.

We then add a new "usecase_module" using the same underlying "generic_grafana_module" as follows:

Current Workspace

  • core_infra module
  • grafana provider
  • usecase_module_one
    • generic_grafana_module
  • usecase_module_two
    • generic_grafana_module
  • usecase_module_three
    • generic_grafana_module

This should update existing infra as required and add the grafana resources for "usecase_module_three"

Actual Behavior

When running the plan and apply to add "usecase_module_three" to our environment we get a Grafana Provider error only on resources related to "usecase_module_one" and "usecase_module_two". It successfully plans for the "usecase_module_three" deployment.

An example error is as below:

{
  "@level": "error",
  "@message": "Error: the Grafana client is required for `grafana_folder`. Set the auth and url provider attributes",
  "@module": "terraform.ui",
  "@timestamp": "2023-06-29T13:23:12.126952Z",
  "diagnostic": {
    "severity": "error",
    "summary": "the Grafana client is required for `grafana_folder`. Set the auth and url provider attributes",
    "detail": "",
    "address": "module.usecase_module_one.module.generic_grafana_module.grafana_folder.usecase_folder",
    "range": {
      "filename": ".terraform/modules/usecase_module_one.generic_grafana_module/grafana.tf",
      "start": { "line": 25, "column": 44, "byte": 917 },
      "end": { "line": 25, "column": 45, "byte": 918 }
    },
    "snippet": {
      "context": "resource \"grafana_folder\" \"usecase_folder\"",
      "code": "resource \"grafana_folder\" \"usecase_folder\" {",
      "start_line": 25,
      "highlight_start_offset": 43,
      "highlight_end_offset": 44,
      "values": []
    }
  },
  "type": "diagnostic"
}

We get an error like this for each Grafana resource in "usecase_module_one" and "usecase_module_two"

Steps to Reproduce

  1. Define modules as per the structure in "Expected Behaviour" with only "core_infra", "usecase_module_one", "usecase_module_two"
  2. Deploy this with terraform plan & terraform apply
  3. Add "usecase_module_three" to the terraform code
  4. terraform plan - Will fail as per behaviour in "Actual Behaviour"

Important Factoids

N/A

References

None

Will282 added the bug label Jun 29, 2023
@julienduchesne
Member

If you inspect your state, are both the module.core_infra attributes you're using set?

@Will282
Author

Will282 commented Jun 30, 2023

Hi @julienduchesne,

I checked the state file and could find the values referenced by those core_infra outputs in the infrastructure. I.e. I could find the module.core_infra.module.managed_grafana.endpoint which is mapped in the core_infra/outputs.tf as module.core_infra.grafana_workspace_endpoint.

Just to be 100% sure, I stored the module.core_infra.grafana_workspace_endpoint and module.core_infra.grafana_api_key outputs as terraform_data resources, which I then referenced in the Grafana Provider as below:

provider "grafana" {
  url  = terraform_data.workspace_endpoint_url.output
  auth = terraform_data.workspace_key.output
}

I could clearly see these terraform_data values in the state file, see an extract below:

    {
      "mode": "managed",
      "type": "terraform_data",
      "name": "workspace_endpoint_url",
      "provider": "provider[\"terraform.io/builtin/terraform\"]",
      "instances": [
        {
          "schema_version": 0,
          "attributes": {
            "id": "62eb70b9-a57a-af7e-67c4-048d26451738",
            "input": {
              "value": "https://g-28jd182fk9.grafana-workspace.us-east-1.amazonaws.com",
              "type": "string"
            },
            "output": {
              "value": "https://g-28jd182fk9.grafana-workspace.us-east-1.amazonaws.com",
              "type": "string"
            },
            "triggers_replace": null
          },
          "sensitive_attributes": [],
          "dependencies": [
            "module.core_infra.aws_iam_role.grafana_service_role",
            "module.core_infra.module.managed_grafana.aws_grafana_workspace.this",
            "module.core_infra.module.managed_grafana.aws_iam_role.this",
            "module.core_infra.module.managed_grafana.aws_security_group.this",
            "module.core_infra.module.managed_grafana.data.aws_iam_policy_document.assume",
            "module.core_infra.module.managed_grafana.data.aws_partition.current",
            "module.core_infra.module.managed_grafana.data.aws_subnet.this"
          ]
        }
      ]
    },

Still had the same Provider error.

@Will282
Author

Will282 commented Jun 30, 2023

Another thing to note is that I also tried a plan & apply using the existing TF (without "usecase_module_three" being added) and the plan fails in the same manner as in the original issue comment.

If I hardcode a dummy Grafana workspace URL and API Key,

provider "grafana" {
  url  = "https://grafana.example.com"
  auth = "somekey"
}

Then the plan fails with an error

│ Error: Get "https://grafana.example.com/api/folders?limit=1000&page=1": dial tcp: lookup grafana.example.com on 10.184.0.2:53: no such host
│ 
│   with module.usecase_module_one.module.inference_infra.grafana_folder.usecase_folder,
│   on .terraform/modules/usecase_module_one.inference_infra/grafana.tf line 25, in resource "grafana_folder" "usecase_folder":
│   25: resource "grafana_folder" "usecase_folder" { 

@Will282
Author

Will282 commented Jun 30, 2023

A further note: I retried the plan & apply with the existing TF (no changes) and hardcoded a different dummy Grafana workspace URL and API key, and then the plan passed?!

Specifically I used:

provider "grafana" {
  url  = "https://example.com"
  auth = "somekey"
}

@julienduchesne
Member

A plan of an unapplied resource will not do any remote calls

@Will282
Author

Will282 commented Jul 7, 2023

Thanks @julienduchesne. If you read the message before that one, then it is making calls during the plan phase as the provider is erroring saying it's unable to reach the Grafana API endpoint.

@julienduchesne
Member

> Thanks @julienduchesne. If you read the message before that one, then it is making calls during the plan phase as the provider is erroring saying it's unable to reach the Grafana API endpoint.

Yes. If it's doing a remote call during a plan, it means it's doing a refresh of a resource that was previously applied

@beeradb

beeradb commented Sep 20, 2023

I'm also having this issue. I have three environments that are managed via the same code using terraform workspaces. Two of them fail with this same error, and the other one succeeds. All three were created at similar times and should have valid states, as they are managed by TF cloud and all had passing runs on their last apply before this issue popped up.

The code for our deployments is heavily influenced by the example docs.

Here are the relevant resources:


# Declaring the first provider to be only used for creating the cloud-stack
provider "grafana" {
  alias = "first"
}

# Declaring the second provider to be used for creating resources in Grafana
provider "grafana" {
  alias = "second"
  url   = grafana_cloud_stack.target_env.url
  auth  = grafana_api_key.importer.key
}


resource "grafana_cloud_stack" "target_env" {
  provider    = grafana.first
  name        = var.environment_name
  slug        = "<redacted>${var.environment_name}"
  region_slug = "us" # Example “us”,”eu” etc
  url         = "https://${var.environment_name}.<redacted>"
}


# Creating an API key in Grafana instance to be used for creating resources in Grafana instance
resource "grafana_api_key" "importer" {
  cloud_stack_slug = grafana_cloud_stack.target_env.slug
  name             = "importer"
  role             = "Admin"
}

resource "grafana_folder" "target_folder" {
  provider = grafana.second
  title    = "target folder"
}

I was originally using version 1.28.0, but after encountering this, I tried upgrading to 1.43.0 and still have the same issue.

If I put API keys directly into the "auth" fields of my providers, a plan at least works.

@beeradb

beeradb commented Sep 21, 2023

Update: if I invoke terraform with the -refresh=false flag my apply works.

So to recap:

  • This is an existing deployment which has worked without issue previously.
  • When trying to generate a plan I receive an "Error: the Grafana client is required for `grafana_folder`. Set the auth and url provider attributes" error.
  • I can run terraform apply -refresh=false successfully.
  • I can also run terraform plan -refresh=false successfully.
  • Even after a successful apply, I continue to get auth errors any time I do not use the -refresh=false flag.

@fentonfentonfenton

fentonfentonfenton commented Oct 21, 2023

Getting this too. From grafana_folder

Need to test whether you can use the provider fine as long as you don't create any folders.

The -refresh=false workaround won't work long term for us, as we can't use that in CI.

@fentonfentonfenton

Also seems to be present when using the grafana_folder data source

@fentonfentonfenton

fentonfentonfenton commented Nov 13, 2023

@julienduchesne hey! let me know if I can help you out with this - it's causing us a fair amount of hell in CI/CD

@julienduchesne
Member

julienduchesne commented Nov 16, 2023

This issue is hard to remediate because in all the cases I've managed to reproduce, it's always that either the auth or URL are missing (as the message says). If I removed the error, you'd instead get a 401 error.

Here's an example: Folders and dashboards are managed by a service account token. That token is removed Grafana side. Terraform, on read, removes the token from state and so there's no auth anymore. The error triggers.

An ugly fix could be setting a depends_on condition for all resources that depend on previous resources for auth. For example, folders and dashboards would have a depends_on condition on the service account token resource that creates the auth used in their provider
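
A minimal sketch of that depends_on idea, reusing the resource names from the configuration posted earlier in this thread (grafana_api_key.importer supplies the auth for the aliased provider grafana.second). It assumes those resources are declared in the same root module, and it only expresses ordering between resources; it does not change how Terraform evaluates the provider block itself, so it may not help in every case.

```hcl
# Provider whose auth comes from a resource managed in the same configuration
# (names taken from the example earlier in this thread).
provider "grafana" {
  alias = "second"
  url   = grafana_cloud_stack.target_env.url
  auth  = grafana_api_key.importer.key
}

resource "grafana_folder" "target_folder" {
  provider = grafana.second
  title    = "target folder"

  # Explicit dependency on the resource that creates the credential
  # used by the provider instance above.
  depends_on = [grafana_api_key.importer]
}
```

The same depends_on entry would have to be repeated on every resource and data source that uses grafana.second, which is why it is described as an ugly fix.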

@fentonfentonfenton

fentonfentonfenton commented Nov 28, 2023

Hi @julienduchesne - gave depends_on a go but i still get this error

Why would the service account token be removed from inside Grafana? (We don't touch them...) I do see that my token is an expired one, so maybe that could be the root cause?

@beeradb

beeradb commented Nov 28, 2023

Could this be an issue with API keys migrating to service accounts?

I'm using Grafana cloud and can confirm that in some environments the "importer" key we create still shows up under API keys, but in other deployments the API key tab is gone, and there is only a service accounts tab. In environments that only have the "service accounts" tab, it looks like the previous "importer" API key was upgraded.

Perhaps the root cause of this issue is provisioning an API key and then later doing the in-browser upgrade to service accounts? Once that latter step has happened, the deployments break since the API keys no longer exist?

@fentonfentonfenton

We've only ever used service accounts. (only deployed this infra recently)

@julienduchesne
Member

julienduchesne commented Nov 30, 2023

This issue is essentially https://discuss.hashicorp.com/t/depends-on-in-providers/42632.

Github issues:

Not sure there's anything we can do here. If a resource is being planned by a provider instance for which the auth is not in the state anymore (for any of many Terraform reasons), it is going to fail because Terraform provides an empty string for the auth

Users can get around that in a few ways:

  • Using multiple projects for their definitions. One project/dir has the service account and the other project/dir has the resources being applied with that SA. The token can be read across projects through outputs or with an orchestration system like terragrunt
  • Doing targeted (-target) plans and applies of the SA (and tokens) before doing a full terraform plan
  • Using terraform plan -refresh=false and terraform apply -refresh=false but that gets rid of one of the main features of TF which is drift reconciliation
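
As a rough illustration of the first option (separate projects), the sketch below splits the credential and the Grafana resources into two directories. The directory names, output names and the local backend are assumptions made for the example, not something prescribed in this thread; the resource names are borrowed from the configuration posted earlier.

```hcl
# --- auth/outputs.tf (hypothetical first project: owns the credential) ---
output "grafana_url" {
  value = grafana_cloud_stack.target_env.url
}

output "grafana_auth" {
  value     = grafana_api_key.importer.key
  sensitive = true
}

# --- resources/main.tf (hypothetical second project: uses the credential) ---
# Reads the outputs of the first project from its state file.
# The "local" backend and the path are assumptions; adjust to your backend.
data "terraform_remote_state" "auth" {
  backend = "local"

  config = {
    path = "../auth/terraform.tfstate"
  }
}

provider "grafana" {
  url  = data.terraform_remote_state.auth.outputs.grafana_url
  auth = data.terraform_remote_state.auth.outputs.grafana_auth
}

resource "grafana_folder" "target_folder" {
  title = "target folder"
}
```

With this layout the first project is applied on its own before the second, so the provider in the second project is configured from values that already exist in applied state rather than from a resource planned in the same run.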

@fentonfentonfenton

OK. Is it possible that the token expiring is what takes it out of the state? I can't think of a reason that it'd leave our state otherwise.

If it is that, then I would imagine there is a case for the Grafana provider to handle the has_expired boolean and force recreation of the token, rather than removing it from the state.

julienduchesne removed their assignment Feb 6, 2024
@NickAdolf

I'm getting this on a brand new, un-applied workspace with no data lookups. I'm not making sense of this.

I create a grafana_team & grafana_folder and then get this error when trying grafana_folder_permissions_item

@julienduchesne
Member

> I'm getting this on a brand new, un-applied workspace with no data lookups. I'm not making sense of this.

> I create a grafana_team & grafana_folder and then get this error when trying grafana_folder_permissions_item

This is a different one @NickAdolf. Here it is: #1485. It will be fixed in next release. Sorry about that!

@AlexandreCassagne

We are also facing this issue. It seems to be blocked by hashicorp/terraform#2430, but are there other workarounds?

@AzySir

AzySir commented Jun 21, 2024

This issue is persisting for me on

data "grafana_dashboard" "this" {
  uid = var.uid
}

Duologic removed the bug label Oct 16, 2024
@thiduzz

thiduzz commented Dec 6, 2024

Was having the same issue until I solved it by adding an alias to the provider and referencing it in the resource:

provider "grafana" {
  alias = "bare-metal" # <--- ADD THIS
  url   = "https://YOUR_ENDPOINT"
  auth  = var.grafana_service_account_token # generated manually via Grafana UI
}

resource "grafana_data_source" "loki" {
  provider   = grafana.bare-metal # <--- ADD THIS
  is_default = true
  type       = "loki"
  name       = "Loki"
  url        = "http://loki-gateway.monitoring.svc.cluster.local"

  lifecycle {
    ignore_changes = [json_data_encoded, http_headers]
  }
}

@DominicBortmes

resolution/workaround:
One comment (from one of the maintainers) clarifies that the root cause is indeed Terraform's lack of support for dependencies on provider attributes.

Before the issue arose, the team had created a service account, a token and a Grafana dashboard (Amazon Managed Grafana) without problems. (I guess on the first plan and apply, the Grafana provider may behave differently, or be more robust against the missing provider dependency, since no resources exist yet?) The issue arose once the token expired after 30 days (as per configuration).

Resolution steps:

  1. Upgrade the Grafana provider (to v3.15.3) and the FOSS module we use (terraform-aws-modules/managed-service-grafana/aws, to 2.2.0) to the latest versions.
  2. Remove the relevant stale resources (the expired service account/token and downstream dependent resources), both from the infrastructure and from the TF state (direct removal from the state).
  3. Make two consecutive plan and apply cycles: the first to get a new service account and token, the second for the provider to use the new credential to create any Grafana provider resources.

The issue likely arises after the configured expiry time of the token associated with the service account. Repeating steps 2 and 3 should help you recover again. Unfortunately, you may have to deal with data loss of your Grafana resources, or try to integrate your backup/restore procedures into the above steps.

(It's important to note that in this scenario we had a quite large monolithic root stack that had the eks, helm, aws and grafana providers active. Carving out just the Grafana resources and modules into their own root stack would have been too invasive for the current design and CI/CD implementation.)
