Move all dashboards to GitOps (#175)
* Typo

* Remove Grafana provider

* Temp: move dashboards to gitOps

* Move external labels to resource attributes

* Avoid DDoS with using 0.0.0.0

* Pre-commit

* Transition in two steps

Will need to remove provider in a separate version to provide a transition path as removing this will break terraform and leave orphans in the state

* Move patterns' dashboards creation to gitOps

Standardize config objects for patterns as well

* Pre-commit

* Create AMP dashboard from external source with Grafana provider

* Fix deprecated option

* Fix Flux requirements

* Run pre-commit

* Update example with operator

* Cleanup examples

* Update multicluster example

* Update multicluster example

* Drop dead variable

* Update docs

* Change GitOps branch name

* Update docs

* Replacing Secrets Manager to SSM to store Grafana API Key (#178)

* Fixing SSM

* Fixing SSM

* Replacing Secrets Manager with SSM

* Replacing Secrets Manager with SSM

* Update architecture diagram

* Update architecture diagram

* Update README.md

* Update index.md

* Fixing Grafana Operator Version

* Fix multicluster example

* Update docs

---------

Co-authored-by: Ela AWS <[email protected]>
Co-authored-by: Elamaran Shanmugam <[email protected]>
3 people authored Jun 12, 2023
1 parent c5e4c0c commit fa38a90
Showing 46 changed files with 361 additions and 4,460 deletions.
26 changes: 4 additions & 22 deletions README.md
@@ -17,7 +17,8 @@ your custom applications.
You also can monitor your Amazon Managed Service for Prometheus workspaces ingestion,
costs, active series with [this module](./modules/managed-prometheus-monitoring).

<img width="1501" alt="image" src="docs/images/dark-o11y-accelerator-amp-xray.png">
![image](https://github.com/aws-observability/terraform-aws-observability-accelerator/assets/10175027/e83f8709-f754-4192-90f2-e3de96d2e26c)


## Documentation

@@ -33,15 +34,6 @@ visit the [Amazon EKS cluster monitoring documentation](https://aws-observabilit
The sections below demonstrate how you can leverage AWS Observability Accelerator
to enable monitoring to an existing EKS cluster.

### v2.x changes

v2+ releases introduces couple of breaking changes compared to previous versions:

- `modules/workloads/infra` module moves to `modules/eks-monitoring`
- All EKS configuration options moves from the base module to the `eks-monitoring` module
- All EKS workload modules `modules/workloads/{java,nginx}` merge into `eks-monitoring` as configuration options (patterns), see [examples](./examples) to provide a more complete visibility
- All examples have been updated to reflect these changes
- Introducing GitOps for Grafana contents (Dashboards, Folders and Data sources) with [Grafana Operator](https://github.com/grafana-operator/grafana-operator) and [Flux CD](https://fluxcd.io/)

### Base Module

@@ -161,14 +153,13 @@ If you are interested in contributing, see the
| <a name="requirement_terraform"></a> [terraform](#requirement\_terraform) | >= 1.1.0 |
| <a name="requirement_aws"></a> [aws](#requirement\_aws) | >= 4.0.0 |
| <a name="requirement_awscc"></a> [awscc](#requirement\_awscc) | >= 0.24.0 |
| <a name="requirement_grafana"></a> [grafana](#requirement\_grafana) | 1.25.0 |
| <a name="requirement_grafana"></a> [grafana](#requirement\_grafana) | >= 1.25.0 |

## Providers

| Name | Version |
|------|---------|
| <a name="provider_aws"></a> [aws](#provider\_aws) | >= 4.0.0 |
| <a name="provider_grafana"></a> [grafana](#provider\_grafana) | 1.25.0 |

## Modules

@@ -180,8 +171,6 @@ No modules.
|------|------|
| [aws_prometheus_alert_manager_definition.this](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/prometheus_alert_manager_definition) | resource |
| [aws_prometheus_workspace.this](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/prometheus_workspace) | resource |
| [grafana_data_source.amp](https://registry.terraform.io/providers/grafana/grafana/1.25.0/docs/resources/data_source) | resource |
| [grafana_folder.this](https://registry.terraform.io/providers/grafana/grafana/1.25.0/docs/resources/folder) | resource |
| [aws_grafana_workspace.this](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/grafana_workspace) | data source |
| [aws_region.current](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/region) | data source |

@@ -190,12 +179,10 @@ No modules.
| Name | Description | Type | Default | Required |
|------|-------------|------|---------|:--------:|
| <a name="input_aws_region"></a> [aws\_region](#input\_aws\_region) | AWS Region | `string` | n/a | yes |
| <a name="input_create_dashboard_folder"></a> [create\_dashboard\_folder](#input\_create\_dashboard\_folder) | Boolean flag to enable Amazon Managed Grafana folder and dashboards | `bool` | `true` | no |
| <a name="input_create_prometheus_data_source"></a> [create\_prometheus\_data\_source](#input\_create\_prometheus\_data\_source) | Boolean flag to enable Amazon Managed Grafana datasource | `bool` | `true` | no |
| <a name="input_enable_alertmanager"></a> [enable\_alertmanager](#input\_enable\_alertmanager) | Creates Amazon Managed Service for Prometheus AlertManager for all workloads | `bool` | `false` | no |
| <a name="input_enable_managed_prometheus"></a> [enable\_managed\_prometheus](#input\_enable\_managed\_prometheus) | Creates a new Amazon Managed Service for Prometheus Workspace | `bool` | `true` | no |
| <a name="input_grafana_api_key"></a> [grafana\_api\_key](#input\_grafana\_api\_key) | Grafana API key for the Amazon Managed Grafana workspace | `string` | n/a | yes |
| <a name="input_managed_grafana_workspace_id"></a> [managed\_grafana\_workspace\_id](#input\_managed\_grafana\_workspace\_id) | Amazon Managed Grafana Workspace ID | `string` | `""` | no |
| <a name="input_managed_grafana_workspace_id"></a> [managed\_grafana\_workspace\_id](#input\_managed\_grafana\_workspace\_id) | Amazon Managed Grafana Workspace ID | `string` | n/a | yes |
| <a name="input_managed_prometheus_workspace_id"></a> [managed\_prometheus\_workspace\_id](#input\_managed\_prometheus\_workspace\_id) | Amazon Managed Service for Prometheus Workspace ID | `string` | `""` | no |
| <a name="input_managed_prometheus_workspace_region"></a> [managed\_prometheus\_workspace\_region](#input\_managed\_prometheus\_workspace\_region) | Region where Amazon Managed Service for Prometheus is deployed | `string` | `null` | no |
| <a name="input_tags"></a> [tags](#input\_tags) | Additional tags (e.g. `map('BusinessUnit`,`XYZ`) | `map(string)` | `{}` | no |
@@ -205,15 +192,10 @@ No modules.
| Name | Description |
|------|-------------|
| <a name="output_aws_region"></a> [aws\_region](#output\_aws\_region) | AWS Region |
| <a name="output_grafana_dashboard_folder_created"></a> [grafana\_dashboard\_folder\_created](#output\_grafana\_dashboard\_folder\_created) | Boolean value indicating if the module created a dashboard folder in Amazon Managed Grafana |
| <a name="output_grafana_dashboards_folder_id"></a> [grafana\_dashboards\_folder\_id](#output\_grafana\_dashboards\_folder\_id) | Grafana folder ID for automatic dashboards. Required by workload modules |
| <a name="output_grafana_prometheus_datasource_test"></a> [grafana\_prometheus\_datasource\_test](#output\_grafana\_prometheus\_datasource\_test) | Grafana save & test URL for Amazon Managed Prometheus workspace |
| <a name="output_managed_grafana_workspace_endpoint"></a> [managed\_grafana\_workspace\_endpoint](#output\_managed\_grafana\_workspace\_endpoint) | Amazon Managed Grafana workspace endpoint |
| <a name="output_managed_grafana_workspace_id"></a> [managed\_grafana\_workspace\_id](#output\_managed\_grafana\_workspace\_id) | Amazon Managed Grafana workspace ID |
| <a name="output_managed_prometheus_workspace_endpoint"></a> [managed\_prometheus\_workspace\_endpoint](#output\_managed\_prometheus\_workspace\_endpoint) | Amazon Managed Prometheus workspace endpoint |
| <a name="output_managed_prometheus_workspace_id"></a> [managed\_prometheus\_workspace\_id](#output\_managed\_prometheus\_workspace\_id) | Amazon Managed Prometheus workspace ID |
| <a name="output_managed_prometheus_workspace_region"></a> [managed\_prometheus\_workspace\_region](#output\_managed\_prometheus\_workspace\_region) | Amazon Managed Prometheus workspace region |
| <a name="output_prometheus_data_source_created"></a> [prometheus\_data\_source\_created](#output\_prometheus\_data\_source\_created) | Boolean value indicating if the module created a prometheus data source in Amazon Managed Grafana |
<!-- END OF PRE-COMMIT-TERRAFORM DOCS HOOK -->

## Contributing
24 changes: 13 additions & 11 deletions docs/concepts.md
@@ -39,22 +39,24 @@ The grafana-operator is a Kubernetes operator built to help you manage your Graf

GitOps is a way of managing application and infrastructure deployment so that the whole system is described declaratively in a Git repository. It is an operational model that offers you the ability to manage the state of multiple Kubernetes clusters leveraging the best practices of version control, immutable artifacts, and automation. Flux is a declarative, GitOps-based continuous delivery tool that can be integrated into any CI/CD pipeline. It gives users the flexibility of choosing their Git provider (GitHub, GitLab, BitBucket). Now, with grafana-operator supporting the management of external Grafana instances such as Amazon Managed Grafana, operations personas can use GitOps mechanisms using CNCF projects such as Flux to create and manage the lifecycle of resources in Amazon Managed Grafana.

We have setup a [GitRepository](https://fluxcd.io/flux/components/source/gitrepositories/) and [Kustomization](https://fluxcd.io/flux/components/kustomize/kustomization/) using flux to sync our GitHub Repository to add Grafana Datasources, folder and Dashboards to Amazon Managed Grafana using Grafana Operator. GitRepository defines a Source to produce an Artifact for a Git repository revision. Kustomization defines a pipeline for fetching, decrypting, building, validating and applying Kustomize overlays or plain Kubernetes manifests. we are also using [Flux Post build variable substitution](https://fluxcd.io/flux/components/kustomize/kustomization/#post-build-variable-substitution) to dynamically render variables such as AMG_AWS_REGION, AMP_ENDPOINT_URL, AMG_ENDPOINT_URL,GRAFANA_NODEEXP_DASH_URL on the YAML manifests during deployment time to avoid hardcoding on the YAML manifests stored in Git repo.
We have set up a [GitRepository](https://fluxcd.io/flux/components/source/gitrepositories/) and a [Kustomization](https://fluxcd.io/flux/components/kustomize/kustomization/) using Flux to sync our GitHub repository and add Grafana data sources, folders and dashboards to Amazon Managed Grafana using Grafana Operator. GitRepository defines a Source to produce an Artifact for a Git repository revision. Kustomization defines a pipeline for fetching, decrypting, building, validating and applying Kustomize overlays or plain Kubernetes manifests. We also use [Flux post-build variable substitution](https://fluxcd.io/flux/components/kustomize/kustomization/#post-build-variable-substitution) to dynamically render variables such as `AMG_AWS_REGION`, `AMP_ENDPOINT_URL`, `AMG_ENDPOINT_URL` and `GRAFANA_NODEEXP_DASH_URL` in the YAML manifests at deployment time, which avoids hardcoding these values in the manifests stored in the Git repository.

We have placed our declarative code snippet to create an Amazon Managed Service For Promethes datasource and Grafana Dashboard in Amazon Managed Grafana in our [AWS Observabiity Accelerator GitHub Repository](https://github.com/aws-observability/aws-observability-accelerator/tree/main/artifacts/grafana-operator-manifests). We have setup a GitRepository to point to the AWS Observabiity Accelerator GitHub Repository and `Kustomization` for flux to sync Git Repository with artifacts in `./artifacts/grafana-operator-manifests` path in the AWS Observabiity Accelerator GitHub Repository. You can use this extension of our solution to point your own Kubernetes manifests to create Grafana Datasources and personified Grafana Dashboards of your choice using GitOps with Grafana Operator and Flux in Kubernetes native way with altering and redeploying this solution for changes to Grafana resources.
We have placed our declarative code snippets to create an Amazon Managed Service for Prometheus data source and Grafana dashboards in Amazon Managed Grafana in our [AWS Observability Accelerator GitHub Repository](https://github.com/aws-observability/aws-observability-accelerator). We have set up a GitRepository that points to the AWS Observability Accelerator GitHub Repository and a `Kustomization` for Flux to sync the Git repository with the artifacts under the `./artifacts/grafana-operator-manifests/*` path in that repository. You can extend this solution by pointing to your own Kubernetes manifests to create Grafana data sources and personalized Grafana dashboards of your choice, using GitOps with Grafana Operator and Flux in a Kubernetes-native way, without altering and redeploying this solution for changes to Grafana resources.
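
To make the post-build variable substitution described above more concrete, here is a minimal local sketch of the idea in Bash. It only approximates the behaviour: Flux performs the real substitution itself when it applies a `Kustomization`. The `render_manifest` helper, the manifest fragment and the endpoint value are hypothetical and exist only for illustration; the variable names `AMP_ENDPOINT_URL` and `AMG_AWS_REGION` are the ones mentioned in the text.

```bash
#!/usr/bin/env bash
# Local approximation of post-build variable substitution.
# ASSUMPTION: render_manifest is a hypothetical helper, not part of Flux.

# Read a manifest on stdin and replace each ${NAME} placeholder with the
# value passed as a NAME=value argument.
render_manifest() {
  local manifest pair name value
  manifest="$(cat)"
  for pair in "$@"; do
    name="${pair%%=*}"   # text before the first '='
    value="${pair#*=}"   # text after the first '='
    # Replace every literal ${name} occurrence with the value.
    manifest=${manifest//\$\{${name}\}/"${value}"}
  done
  printf '%s\n' "$manifest"
}

# Usage: the manifest keeps placeholders in Git, values are supplied at render time.
# The endpoint below is a placeholder, not a real workspace URL.
render_manifest \
  AMP_ENDPOINT_URL=https://example.invalid/workspaces/ws-xxx \
  AMG_AWS_REGION=us-west-2 <<'EOF'
url: ${AMP_ENDPOINT_URL}
region: ${AMG_AWS_REGION}
EOF
```

Keeping only placeholders in Git is the design point: the same manifests can be reused across clusters and regions, and environment-specific values never need to be committed.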



## v2.x changes
## Release notes

v2.x [releases](https://github.com/aws-observability/terraform-aws-observability-accelerator/releases) introduce
couple of breaking changes compared to previous versions:
We encourage you to use our [release versions](https://github.com/aws-observability/terraform-aws-observability-accelerator/releases)
as much as possible to avoid breaking changes when deploying Terraform modules. You can
also read our change log on the releases page. Here's an example of using a fixed version:

```hcl
module "eks_monitoring" {
  source = "github.com/aws-observability/terraform-aws-observability-accelerator//modules/managed-prometheus-monitoring?ref=v2.5.0"
}
```

- `modules/workloads/infra` module moves to `modules/eks-monitoring`
- EKS configuration options moves from the base module to the `eks-monitoring` module
- EKS workload modules **java,nginx** merge into `eks-monitoring` as configuration options (patterns),
see [examples](https://github.com/aws-observability/terraform-aws-observability-accelerator/tree/main/examples)
- Examples have been updated to reflect these changes

## Base module

@@ -138,4 +140,4 @@ classDiagram

If you are new to AWS Observability services, or want to dive deeper into them,
check our [One Observability Workshop](https://catalog.workshops.aws/observability/)
for a hands-on experience in a self-paced environement or at an AWS venue.
for a hands-on experience in a self-paced environment or at an AWS venue.
54 changes: 35 additions & 19 deletions docs/eks/index.md
@@ -111,25 +111,40 @@ terraform apply

## Visualization

#### 1. Prometheus data source on Grafana

Make sure to open the link in the output. After a successful deployment, this will open
the Prometheus data source configuration on Grafana.
Click `Save & test` and you should see a notification confirming that the Amazon Managed Service for Prometheus workspace is ready to be used on Grafana.
#### 1. Grafana dashboards

```bash
terraform output grafana_prometheus_datasource_test
```
Log in to your Grafana workspace and navigate to the Dashboards panel. You should see a list of dashboards under the `Observability Accelerator Dashboards` folder.
<img width="1540" alt="image" src="https://user-images.githubusercontent.com/10175027/190000716-29e16698-7c90-49d6-8c37-79ca1790e2cc.png">

#### 2. Grafana dashboards
Open a specific dashboard and you should be able to view its visualization
<img width="2056" alt="cluster headlines" src="https://user-images.githubusercontent.com/10175027/199110753-9bc7a9b7-1b45-4598-89d3-32980154080e.png">

Go to the Dashboards panel of your Grafana workspace. You should see a list of dashboards under the `Observability Accelerator Dashboards`
With v2.5 and above, the dashboards are managed by a Grafana Operator running in your cluster.
To view all dashboards as Kubernetes objects from the cluster, run:

<img width="1540" alt="image" src="https://user-images.githubusercontent.com/10175027/190000716-29e16698-7c90-49d6-8c37-79ca1790e2cc.png">
```console
kubectl get grafanadashboards -A
NAMESPACE          NAME                                   AGE
grafana-operator   cluster-grafanadashboard               138m
grafana-operator   java-grafanadashboard                  143m
grafana-operator   kubelet-grafanadashboard               13h
grafana-operator   namespace-workloads-grafanadashboard   13h
grafana-operator   nginx-grafanadashboard                 134m
grafana-operator   node-exporter-grafanadashboard         13h
grafana-operator   nodes-grafanadashboard                 13h
grafana-operator   workloads-grafanadashboard             13h
```
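
If you want to check that a given set of dashboards exists, the listing above can be filtered with a small helper. This is only a sketch: `missing_dashboards` is a hypothetical function name, and it assumes the three-column `NAMESPACE NAME AGE` layout shown above.

```bash
#!/usr/bin/env bash
# ASSUMPTION: missing_dashboards is a hypothetical helper for illustration.

# Read `kubectl get grafanadashboards -A` output on stdin and print every
# expected dashboard name (given as arguments) that is not listed.
missing_dashboards() {
  local listing expected
  # Skip the header row and keep the NAME column only.
  listing="$(awk 'NR > 1 { print $2 }')"
  for expected in "$@"; do
    # -x: match the whole line, -F: treat the name as a fixed string.
    printf '%s\n' "$listing" | grep -qxF -- "$expected" || printf '%s\n' "$expected"
  done
}

# Usage with a captured sample; against a live cluster you would pipe
# `kubectl get grafanadashboards -A` into the function instead.
missing_dashboards cluster-grafanadashboard nginx-grafanadashboard <<'EOF'
NAMESPACE          NAME                       AGE
grafana-operator   cluster-grafanadashboard   138m
grafana-operator   kubelet-grafanadashboard   13h
EOF
# prints: nginx-grafanadashboard
```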

Open a specific dashboard and you should be able to view its visualization
To inspect the details of a single dashboard, run:

```console
kubectl describe grafanadashboards cluster-grafanadashboard -n grafana-operator
```

Grafana Operator and Flux always work together to synchronize your dashboards with Git.
If you delete your dashboards by accident, they will be re-provisioned automatically.

<img width="2056" alt="cluster headlines" src="https://user-images.githubusercontent.com/10175027/199110753-9bc7a9b7-1b45-4598-89d3-32980154080e.png">

#### 3. Amazon Managed Service for Prometheus rules and alerts

@@ -216,19 +231,20 @@ export GO_AMG_API_KEY=$(aws grafana create-workspace-api-key \
--output text)
```

- Next, lets grab the Grafana API key secret name from AWS Secrets Manager. The keyname should start with `terraform-..`
- Finally, update the Grafana API key parameter in AWS Systems Manager (SSM) Parameter Store using the above new Grafana API key:

```bash
aws secretsmanager list-secrets
aws ssm put-parameter \
  --name "/terraform-accelerator/grafana-api-key" \
  --type "SecureString" \
  --value "{\"GF_SECURITY_ADMIN_APIKEY\": \"${GO_AMG_API_KEY}\"}" \
  --overwrite \
  --region <Your AWS Region>
```

- Finally, update the Grafana API key secret in AWS Secrets Manager using the above new Grafana API key:
- If the issue persists, you can force the synchronization by deleting the `externalsecret` Kubernetes object.

```bash
aws secretsmanager update-secret \
--secret-id <Your Secret Name> \
--secret-string "{\"GF_SECURITY_ADMIN_APIKEY\": \"${GO_AMG_API_KEY}\"}" \
--region <Your AWS Region>
kubectl delete externalsecret/external-secrets-sm -n grafana-operator
```
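
The escaped JSON string in the `put-parameter` command is easy to get wrong, and an empty `GO_AMG_API_KEY` would silently store an unusable value. A small helper can build the payload and refuse an empty key. This is a sketch only: `grafana_key_payload` is a hypothetical name, and it assumes the key contains no double quotes or backslashes.

```bash
#!/usr/bin/env bash
# ASSUMPTION: grafana_key_payload is a hypothetical helper for illustration.
# It assumes the API key contains no double quotes or backslashes.

# Print the JSON document expected in the parameter, or fail if the key is empty.
grafana_key_payload() {
  local key="${1:-}"
  if [ -z "$key" ]; then
    echo "error: Grafana API key is empty" >&2
    return 1
  fi
  printf '{"GF_SECURITY_ADMIN_APIKEY": "%s"}\n' "$key"
}

# Usage with a placeholder key:
grafana_key_payload "example-key"
```

The result can then be passed to the command above as `--value "$(grafana_key_payload "$GO_AMG_API_KEY")"`.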

### 2. Upgrade from 2.1.0 or earlier
2 changes: 1 addition & 1 deletion docs/eks/java.md
@@ -32,7 +32,7 @@ Make sure to refresh your temporary Grafana API key

```bash
export TF_VAR_managed_grafana_workspace_id=g-xxx
export TF_VAR_grafana_api_key=`aws grafana create-workspace-api-key --key-name "observability-accelerator-$(date +%s)" --key-role ADMIN --seconds-to-live 1200 --workspace-id $TF_VAR_managed_grafana_workspace_id --query key --output text`
export TF_VAR_grafana_api_key=`aws grafana create-workspace-api-key --key-name "observability-accelerator-$(date +%s)" --key-role ADMIN --seconds-to-live 7200 --workspace-id $TF_VAR_managed_grafana_workspace_id --query key --output text`
```

## Deploy
8 changes: 4 additions & 4 deletions docs/eks/multicluster.md
@@ -11,7 +11,7 @@ Using the example [eks-cluster-with-vpc](https://aws-observability.github.io/ter
1. `eks-cluster-1`
2. `eks-cluster-2`

#### 2. Amazon Managed Serivce for Prometheus (AMP) workspace
#### 2. Amazon Managed Service for Prometheus (AMP) workspace

We recommend that you create a new AMP workspace. To do that you can run the following command.

Expand Down Expand Up @@ -48,7 +48,7 @@ Ensure you have the following necessary IAM permissions
* `grafana.DeleteWorkspaceApiKey`

```sh
export TF_VAR_grafana_api_key=`aws grafana create-workspace-api-key --key-name "observability-accelerator-$(date +%s)" --key-role ADMIN --seconds-to-live 1200 --workspace-id $TF_VAR_managed_grafana_workspace_id --query key --output text`
export TF_VAR_grafana_api_key=`aws grafana create-workspace-api-key --key-name "observability-accelerator-$(date +%s)" --key-role ADMIN --seconds-to-live 7200 --workspace-id $TF_VAR_managed_grafana_workspace_id --query key --output text`
```

## Setup
@@ -70,8 +70,8 @@ Verify by looking at the file `variables.tf` that there are two EKS clusters tar

The difference in deployment between these clusters is that Terraform, when setting up the EKS cluster behind variable `eks_cluster_1_id` for observability, also sets up:

* Dashboard folder and files in `AMG`
* Prometheus and Java, alerting and recording rules in `AMP`
* Dashboard folder and files in Amazon Managed Grafana
* Prometheus and Java, alerting and recording rules in Amazon Managed Service for Prometheus

!!! warning
To override the defaults, create a `terraform.tfvars` and change the default values of the variables.