Move all dashboards to GitOps #175

Merged on Jun 12, 2023 (33 commits). Showing changes from 22 commits.

Commits
bca7d48 Typo (bonclay7, Jun 5, 2023)
396720c Remove Grafana provider (bonclay7, Jun 5, 2023)
f55cb27 Temp: move dashbaords to gitOps (bonclay7, Jun 6, 2023)
a09ddbf Move external labels to resource attributes (bonclay7, Jun 6, 2023)
d1c1d08 Avoid DDoS with using 0.0.0.0 (bonclay7, Jun 6, 2023)
f7b8973 Pre-commit (bonclay7, Jun 6, 2023)
3613389 Transition in two steps (bonclay7, Jun 6, 2023)
183f556 Move patterns' dashboards creation to gitOps (bonclay7, Jun 6, 2023)
7dde178 Pre-commit (bonclay7, Jun 6, 2023)
56b2671 Merge branch 'main' into cleanup/grafana (bonclay7, Jun 7, 2023)
783b929 Create AMP dashboard from external source with Grafana provider (bonclay7, Jun 7, 2023)
bbd8b5f Merge remote-tracking branch 'origin/fix/adot-restart' into cleanup/g… (bonclay7, Jun 7, 2023)
7330816 Merge branch 'main' into cleanup/grafana (bonclay7, Jun 7, 2023)
fb28387 Fix deprecated option (bonclay7, Jun 7, 2023)
70d4b67 Fix Flux requirements (bonclay7, Jun 7, 2023)
1832395 Run pre-commit (bonclay7, Jun 7, 2023)
8d2bfaa Update example with operator (bonclay7, Jun 7, 2023)
0348f45 Cleanup examples (bonclay7, Jun 7, 2023)
42e7897 Update multicluster example (bonclay7, Jun 7, 2023)
a71a9ea Update multicluster example (bonclay7, Jun 7, 2023)
3d8002e Drop dead variable (bonclay7, Jun 7, 2023)
f7c385b Update docs (bonclay7, Jun 8, 2023)
9406b98 Change GitOps branch name (bonclay7, Jun 9, 2023)
354df99 Update docs (bonclay7, Jun 9, 2023)
e4f30fe Replacing Secrets Manager to SSM to store Grafana API Key (#178) (elamaran11, Jun 9, 2023)
d5130cb Update architecture diagram (bonclay7, Jun 9, 2023)
471db3e Update architecture diagram (bonclay7, Jun 9, 2023)
c320931 Update README.md (bonclay7, Jun 9, 2023)
fd3dd0b Update index.md (bonclay7, Jun 9, 2023)
d21cf98 Fixing Grafana Operator Version (elamaran11, Jun 9, 2023)
b0e7366 Fix multicluster example (bonclay7, Jun 9, 2023)
276f484 Merge remote-tracking branch 'origin/feature/grafanaOperatorVersion' … (bonclay7, Jun 9, 2023)
324e491 Update docs (bonclay7, Jun 9, 2023)
23 changes: 2 additions & 21 deletions README.md
@@ -33,15 +33,6 @@ visit the [Amazon EKS cluster monitoring documentation](https://aws-observabilit
The sections below demonstrate how you can leverage AWS Observability Accelerator
to enable monitoring to an existing EKS cluster.

### v2.x changes

v2+ releases introduces couple of breaking changes compared to previous versions:

- `modules/workloads/infra` module moves to `modules/eks-monitoring`
- All EKS configuration options moves from the base module to the `eks-monitoring` module
- All EKS workload modules `modules/workloads/{java,nginx}` merge into `eks-monitoring` as configuration options (patterns), see [examples](./examples) to provide a more complete visibility
- All examples have been updated to reflect these changes
- Introducing GitOps for Grafana contents (Dashboards, Folders and Data sources) with [Grafana Operator](https://github.com/grafana-operator/grafana-operator) and [Flux CD](https://fluxcd.io/)

### Base Module

@@ -161,14 +152,13 @@ If you are interested in contributing, see the
| <a name="requirement_terraform"></a> [terraform](#requirement\_terraform) | >= 1.1.0 |
| <a name="requirement_aws"></a> [aws](#requirement\_aws) | >= 4.0.0 |
| <a name="requirement_awscc"></a> [awscc](#requirement\_awscc) | >= 0.24.0 |
| <a name="requirement_grafana"></a> [grafana](#requirement\_grafana) | 1.25.0 |
| <a name="requirement_grafana"></a> [grafana](#requirement\_grafana) | >= 1.25.0 |

## Providers

| Name | Version |
|------|---------|
| <a name="provider_aws"></a> [aws](#provider\_aws) | >= 4.0.0 |
| <a name="provider_grafana"></a> [grafana](#provider\_grafana) | 1.25.0 |

## Modules

@@ -180,8 +170,6 @@ No modules.
|------|------|
| [aws_prometheus_alert_manager_definition.this](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/prometheus_alert_manager_definition) | resource |
| [aws_prometheus_workspace.this](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/prometheus_workspace) | resource |
| [grafana_data_source.amp](https://registry.terraform.io/providers/grafana/grafana/1.25.0/docs/resources/data_source) | resource |
| [grafana_folder.this](https://registry.terraform.io/providers/grafana/grafana/1.25.0/docs/resources/folder) | resource |
| [aws_grafana_workspace.this](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/grafana_workspace) | data source |
| [aws_region.current](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/region) | data source |

@@ -190,12 +178,10 @@ No modules.
| Name | Description | Type | Default | Required |
|------|-------------|------|---------|:--------:|
| <a name="input_aws_region"></a> [aws\_region](#input\_aws\_region) | AWS Region | `string` | n/a | yes |
| <a name="input_create_dashboard_folder"></a> [create\_dashboard\_folder](#input\_create\_dashboard\_folder) | Boolean flag to enable Amazon Managed Grafana folder and dashboards | `bool` | `true` | no |
| <a name="input_create_prometheus_data_source"></a> [create\_prometheus\_data\_source](#input\_create\_prometheus\_data\_source) | Boolean flag to enable Amazon Managed Grafana datasource | `bool` | `true` | no |
| <a name="input_enable_alertmanager"></a> [enable\_alertmanager](#input\_enable\_alertmanager) | Creates Amazon Managed Service for Prometheus AlertManager for all workloads | `bool` | `false` | no |
| <a name="input_enable_managed_prometheus"></a> [enable\_managed\_prometheus](#input\_enable\_managed\_prometheus) | Creates a new Amazon Managed Service for Prometheus Workspace | `bool` | `true` | no |
| <a name="input_grafana_api_key"></a> [grafana\_api\_key](#input\_grafana\_api\_key) | Grafana API key for the Amazon Managed Grafana workspace | `string` | n/a | yes |
| <a name="input_managed_grafana_workspace_id"></a> [managed\_grafana\_workspace\_id](#input\_managed\_grafana\_workspace\_id) | Amazon Managed Grafana Workspace ID | `string` | `""` | no |
| <a name="input_managed_grafana_workspace_id"></a> [managed\_grafana\_workspace\_id](#input\_managed\_grafana\_workspace\_id) | Amazon Managed Grafana Workspace ID | `string` | n/a | yes |
| <a name="input_managed_prometheus_workspace_id"></a> [managed\_prometheus\_workspace\_id](#input\_managed\_prometheus\_workspace\_id) | Amazon Managed Service for Prometheus Workspace ID | `string` | `""` | no |
| <a name="input_managed_prometheus_workspace_region"></a> [managed\_prometheus\_workspace\_region](#input\_managed\_prometheus\_workspace\_region) | Region where Amazon Managed Service for Prometheus is deployed | `string` | `null` | no |
| <a name="input_tags"></a> [tags](#input\_tags) | Additional tags (e.g. `map('BusinessUnit`,`XYZ`) | `map(string)` | `{}` | no |
@@ -205,15 +191,10 @@ No modules.
| Name | Description |
|------|-------------|
| <a name="output_aws_region"></a> [aws\_region](#output\_aws\_region) | AWS Region |
| <a name="output_grafana_dashboard_folder_created"></a> [grafana\_dashboard\_folder\_created](#output\_grafana\_dashboard\_folder\_created) | Boolean value indicating if the module created a dashboard folder in Amazon Managed Grafana |
| <a name="output_grafana_dashboards_folder_id"></a> [grafana\_dashboards\_folder\_id](#output\_grafana\_dashboards\_folder\_id) | Grafana folder ID for automatic dashboards. Required by workload modules |
| <a name="output_grafana_prometheus_datasource_test"></a> [grafana\_prometheus\_datasource\_test](#output\_grafana\_prometheus\_datasource\_test) | Grafana save & test URL for Amazon Managed Prometheus workspace |
| <a name="output_managed_grafana_workspace_endpoint"></a> [managed\_grafana\_workspace\_endpoint](#output\_managed\_grafana\_workspace\_endpoint) | Amazon Managed Grafana workspace endpoint |
| <a name="output_managed_grafana_workspace_id"></a> [managed\_grafana\_workspace\_id](#output\_managed\_grafana\_workspace\_id) | Amazon Managed Grafana workspace ID |
| <a name="output_managed_prometheus_workspace_endpoint"></a> [managed\_prometheus\_workspace\_endpoint](#output\_managed\_prometheus\_workspace\_endpoint) | Amazon Managed Prometheus workspace endpoint |
| <a name="output_managed_prometheus_workspace_id"></a> [managed\_prometheus\_workspace\_id](#output\_managed\_prometheus\_workspace\_id) | Amazon Managed Prometheus workspace ID |
| <a name="output_managed_prometheus_workspace_region"></a> [managed\_prometheus\_workspace\_region](#output\_managed\_prometheus\_workspace\_region) | Amazon Managed Prometheus workspace region |
| <a name="output_prometheus_data_source_created"></a> [prometheus\_data\_source\_created](#output\_prometheus\_data\_source\_created) | Boolean value indicating if the module created a prometheus data source in Amazon Managed Grafana |
<!-- END OF PRE-COMMIT-TERRAFORM DOCS HOOK -->

## Contributing
24 changes: 13 additions & 11 deletions docs/concepts.md
@@ -39,22 +39,24 @@ The grafana-operator is a Kubernetes operator built to help you manage your Graf

GitOps is a way of managing application and infrastructure deployment so that the whole system is described declaratively in a Git repository. It is an operational model that offers you the ability to manage the state of multiple Kubernetes clusters leveraging the best practices of version control, immutable artifacts, and automation. Flux is a declarative, GitOps-based continuous delivery tool that can be integrated into any CI/CD pipeline. It gives users the flexibility of choosing their Git provider (GitHub, GitLab, BitBucket). Now, with grafana-operator supporting the management of external Grafana instances such as Amazon Managed Grafana, operations personas can use GitOps mechanisms using CNCF projects such as Flux to create and manage the lifecycle of resources in Amazon Managed Grafana.

We have set up a [GitRepository](https://fluxcd.io/flux/components/source/gitrepositories/) and [Kustomization](https://fluxcd.io/flux/components/kustomize/kustomization/) using flux to sync our GitHub Repository to add Grafana Datasources, folder and Dashboards to Amazon Managed Grafana using Grafana Operator. GitRepository defines a Source to produce an Artifact for a Git repository revision. Kustomization defines a pipeline for fetching, decrypting, building, validating and applying Kustomize overlays or plain Kubernetes manifests. We are also using [Flux Post build variable substitution](https://fluxcd.io/flux/components/kustomize/kustomization/#post-build-variable-substitution) to dynamically render variables such as AMG_AWS_REGION, AMP_ENDPOINT_URL, AMG_ENDPOINT_URL, GRAFANA_NODEEXP_DASH_URL on the YAML manifests during deployment time to avoid hardcoding on the YAML manifests stored in Git repo.
We have set up a [GitRepository](https://fluxcd.io/flux/components/source/gitrepositories/) and [Kustomization](https://fluxcd.io/flux/components/kustomize/kustomization/) using Flux to sync our GitHub Repository to add Grafana Datasources, folder and Dashboards to Amazon Managed Grafana using Grafana Operator. GitRepository defines a Source to produce an Artifact for a Git repository revision. Kustomization defines a pipeline for fetching, decrypting, building, validating and applying Kustomize overlays or plain Kubernetes manifests. We are also using [Flux Post build variable substitution](https://fluxcd.io/flux/components/kustomize/kustomization/#post-build-variable-substitution) to dynamically render variables such as AMG_AWS_REGION, AMP_ENDPOINT_URL, AMG_ENDPOINT_URL, GRAFANA_NODEEXP_DASH_URL on the YAML manifests during deployment time to avoid hardcoding on the YAML manifests stored in Git repo.

We have placed our declarative code snippet to create an Amazon Managed Service for Prometheus datasource and Grafana Dashboard in Amazon Managed Grafana in our [AWS Observability Accelerator GitHub Repository](https://github.com/aws-observability/aws-observability-accelerator/tree/main/artifacts/grafana-operator-manifests). We have set up a GitRepository to point to the AWS Observability Accelerator GitHub Repository and `Kustomization` for flux to sync Git Repository with artifacts in `./artifacts/grafana-operator-manifests` path in the AWS Observability Accelerator GitHub Repository. You can use this extension of our solution to point your own Kubernetes manifests to create Grafana Datasources and personified Grafana Dashboards of your choice using GitOps with Grafana Operator and Flux in Kubernetes native way with altering and redeploying this solution for changes to Grafana resources.
We have placed our declarative code snippet to create an Amazon Managed Service for Prometheus datasource and Grafana Dashboard in Amazon Managed Grafana in our [AWS Observability Accelerator GitHub Repository](https://github.com/aws-observability/aws-observability-accelerator). We have set up a GitRepository to point to the AWS Observability Accelerator GitHub Repository and `Kustomization` for flux to sync Git Repository with artifacts in `./artifacts/grafana-operator-manifests/*` path in the AWS Observability Accelerator GitHub Repository. You can use this extension of our solution to point your own Kubernetes manifests to create Grafana Datasources and personified Grafana Dashboards of your choice using GitOps with Grafana Operator and Flux in Kubernetes native way with altering and redeploying this solution for changes to Grafana resources.
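
As a rough illustration of the post-build variable substitution idea mentioned above, here is a conceptual sketch in plain bash. It is not how Flux implements the feature (Flux does the substitution itself when it applies a Kustomization); it only shows `${NAME}` placeholders in a manifest being replaced at deploy time instead of being hardcoded in Git. The `render_template` helper, the two-line template and the example values are all hypothetical; only the variable names `AMG_AWS_REGION` and `AMP_ENDPOINT_URL` come from the text above.

```bash
#!/usr/bin/env bash
# Conceptual sketch only: Flux performs post-build substitution on its own.
# The helper, the template and the values below are hypothetical.

# render_template TEMPLATE NAME=VALUE...
# Replaces every ${NAME} placeholder in TEMPLATE with VALUE and prints the result.
# Placeholders without a matching NAME=VALUE pair are left untouched.
# Note: values containing `&` may be treated specially by bash 5.2+.
render_template() {
  local text=$1
  shift
  local pair name value placeholder
  for pair in "$@"; do
    name=${pair%%=*}              # text before the first '='
    value=${pair#*=}              # text after the first '='
    placeholder="\${${name}}"     # literal string such as ${AMG_AWS_REGION}
    text=${text//"$placeholder"/$value}
  done
  printf '%s\n' "$text"
}

# Hypothetical manifest fragment with placeholders kept in Git.
template='region: ${AMG_AWS_REGION}
url: ${AMP_ENDPOINT_URL}'

# Values supplied at deployment time (made-up examples).
render_template "$template" \
  AMG_AWS_REGION=us-west-2 \
  AMP_ENDPOINT_URL=https://aps.example.invalid/workspaces/ws-123/
```

Keeping the placeholders in the repository and supplying the values per deployment is what lets the same manifests be reused across regions and workspaces.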



## v2.x changes
## Release notes

v2.x [releases](https://github.com/aws-observability/terraform-aws-observability-accelerator/releases) introduce
couple of breaking changes compared to previous versions:
We encourage you to use our [release versions](https://github.com/aws-observability/terraform-aws-observability-accelerator/releases)
as much as possible to avoid breaking changes when deploying Terraform modules. You can
also read our change log on the releases page. Here's an example of using a fixed version:

```hcl
module "eks_monitoring" {
source = "github.com/aws-observability/terraform-aws-observability-accelerator//modules/managed-prometheus-monitoring?ref=v2.5.0"
}
```

- `modules/workloads/infra` module moves to `modules/eks-monitoring`
- EKS configuration options moves from the base module to the `eks-monitoring` module
- EKS workload modules **java,nginx** merge into `eks-monitoring` as configuration options (patterns),
see [examples](https://github.com/aws-observability/terraform-aws-observability-accelerator/tree/main/examples)
- Examples have been updated to reflect these changes

## Base module

@@ -138,4 +140,4 @@ classDiagram

If you are new to AWS Observability services, or want to dive deeper into them,
check our [One Observability Workshop](https://catalog.workshops.aws/observability/)
for a hands-on experience in a self-paced environement or at an AWS venue.
for a hands-on experience in a self-paced environment or at an AWS venue.
45 changes: 33 additions & 12 deletions docs/eks/index.md
@@ -111,25 +111,40 @@ terraform apply

## Visualization

#### 1. Prometheus data source on Grafana

Make sure to open the link in the output. After a successful deployment, this will open
the Prometheus data source configuration on Grafana.
Click `Save & test` and you should see a notification confirming that the Amazon Managed Service for Prometheus workspace is ready to be used on Grafana.
#### 1. Grafana dashboards

```bash
terraform output grafana_prometheus_datasource_test
```
Log in to your Grafana workspace and navigate to the Dashboards panel. You should see a list of dashboards under the `Observability Accelerator Dashboards`
<img width="1540" alt="image" src="https://user-images.githubusercontent.com/10175027/190000716-29e16698-7c90-49d6-8c37-79ca1790e2cc.png">

#### 2. Grafana dashboards
Open a specific dashboard and you should be able to view its visualization
<img width="2056" alt="cluster headlines" src="https://user-images.githubusercontent.com/10175027/199110753-9bc7a9b7-1b45-4598-89d3-32980154080e.png">

Go to the Dashboards panel of your Grafana workspace. You should see a list of dashboards under the `Observability Accelerator Dashboards`
With v2.5 and above, the dashboards are managed with the Grafana Operator running in your cluster.
To view all dashboards as Kubernetes objects from the cluster, run

<img width="1540" alt="image" src="https://user-images.githubusercontent.com/10175027/190000716-29e16698-7c90-49d6-8c37-79ca1790e2cc.png">
```console
kubectl get grafanadashboards -A
NAMESPACE NAME AGE
grafana-operator cluster-grafanadashboard 138m
grafana-operator java-grafanadashboard 143m
grafana-operator kubelet-grafanadashboard 13h
grafana-operator namespace-workloads-grafanadashboard 13h
grafana-operator nginx-grafanadashboard 134m
grafana-operator node-exporter-grafanadashboard 13h
grafana-operator nodes-grafanadashboard 13h
grafana-operator workloads-grafanadashboard 13h
```

Open a specific dashboard and you should be able to view its visualization
You can inspect more details per dashboard using this command

```console
kubectl describe grafanadashboards cluster-grafanadashboard -n grafana-operator
```

Grafana Operator and Flux always work together to synchronize your dashboards with Git.
If you delete your dashboards by accident, they will be re-provisioned automatically.
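
To check that re-provisioning has actually happened after an accidental deletion, a small helper along these lines may be useful. It is a minimal sketch, assuming the column layout of the `kubectl get grafanadashboards -A` output shown above (namespace, name, age); the `missing_dashboards` function name is hypothetical and not part of this project.

```bash
#!/usr/bin/env bash
# Sketch only: relies on the NAMESPACE NAME AGE column layout shown above.

# missing_dashboards NAME...
# Reads `kubectl get grafanadashboards -A` output on stdin and prints each
# expected NAME that does not appear in the NAME column (second column).
missing_dashboards() {
  local listing name
  listing=$(awk 'NR > 1 { print $2 }')   # skip the header row, keep names
  for name in "$@"; do
    # -x matches the whole line, -F treats the name as a fixed string
    grep -qxF -- "$name" <<<"$listing" || printf '%s\n' "$name"
  done
}

# Usage against a live cluster (assumes kubectl is configured for it):
# kubectl get grafanadashboards -A | missing_dashboards cluster-grafanadashboard nodes-grafanadashboard
```

An empty result means every expected dashboard object is present; any name printed is still missing and may simply not have been synchronized yet.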

<img width="2056" alt="cluster headlines" src="https://user-images.githubusercontent.com/10175027/199110753-9bc7a9b7-1b45-4598-89d3-32980154080e.png">

#### 3. Amazon Managed Service for Prometheus rules and alerts

@@ -231,6 +246,12 @@ aws secretsmanager update-secret \
--region <Your AWS Region>
```

- If the issue persists, you can force the synchronization by deleting the `externalsecret` Kubernetes object.

```bash
kubectl delete externalsecret/external-secrets-sm -n grafana-operator
```

### 2. Upgrade from 2.1.0 or earlier

When you upgrade the eks-monitoring module from v2.1.0 or earlier, the following error may occur.
2 changes: 1 addition & 1 deletion docs/eks/java.md
@@ -32,7 +32,7 @@ Make sure to refresh your temporary Grafana API key

```bash
export TF_VAR_managed_grafana_workspace_id=g-xxx
export TF_VAR_grafana_api_key=`aws grafana create-workspace-api-key --key-name "observability-accelerator-$(date +%s)" --key-role ADMIN --seconds-to-live 1200 --workspace-id $TF_VAR_managed_grafana_workspace_id --query key --output text`
export TF_VAR_grafana_api_key=`aws grafana create-workspace-api-key --key-name "observability-accelerator-$(date +%s)" --key-role ADMIN --seconds-to-live 7200 --workspace-id $TF_VAR_managed_grafana_workspace_id --query key --output text`
```

## Deploy
8 changes: 4 additions & 4 deletions docs/eks/multicluster.md
@@ -11,7 +11,7 @@ Using the example [eks-cluster-with-vpc](https://aws-observability.github.io/ter
1. `eks-cluster-1`
2. `eks-cluster-2`

#### 2. Amazon Managed Serivce for Prometheus (AMP) workspace
#### 2. Amazon Managed Service for Prometheus (AMP) workspace

We recommend that you create a new AMP workspace. To do that you can run the following command.

@@ -48,7 +48,7 @@ Ensure you have the following necessary IAM permissions
* `grafana.DeleteWorkspaceApiKey`

```sh
export TF_VAR_grafana_api_key=`aws grafana create-workspace-api-key --key-name "observability-accelerator-$(date +%s)" --key-role ADMIN --seconds-to-live 1200 --workspace-id $TF_VAR_managed_grafana_workspace_id --query key --output text`
export TF_VAR_grafana_api_key=`aws grafana create-workspace-api-key --key-name "observability-accelerator-$(date +%s)" --key-role ADMIN --seconds-to-live 7200 --workspace-id $TF_VAR_managed_grafana_workspace_id --query key --output text`
```

## Setup
@@ -70,8 +70,8 @@ Verify by looking at the file `variables.tf` that there are two EKS clusters tar

The difference in deployment between these clusters is that Terraform, when setting up the EKS cluster behind variable `eks_cluster_1_id` for observability, also sets up:

* Dashboard folder and files in `AMG`
* Prometheus and Java, alerting and recording rules in `AMP`
* Dashboard folder and files in Amazon Managed Grafana
* Prometheus and Java, alerting and recording rules in Amazon Managed Service for Prometheus

!!! warning
To override the defaults, create a `terraform.tfvars` and change the default values of the variables.