update docs
npalm committed Mar 18, 2024
1 parent 46964a9 commit 99795c9
Showing 5 changed files with 39 additions and 15 deletions.
42 changes: 32 additions & 10 deletions docs/configuration.md
@@ -175,16 +175,6 @@ This tracing config generates timelines for following events:

This feature has been disabled by default.

## Termination watcher

This feature is at an early stage and therefore disabled by default.

The termination watcher currently watches for spot termination notifications. When deployed as part of one of the main modules (root or multi-runner), the module by default only takes events into account for instances tagged with `ghr:environment`. The module can also be deployed stand-alone; in that case the tag filter needs to be tuned.

- Logs: The module logs all termination notifications. For each warning it looks up the instance details and logs the environment, the instance type, the time the instance has been running, and some other details.
- Metrics: Metrics are disabled by default to avoid costs. Once enabled, a metric is created for each warning, with at least dimensions for the environment and instance type.


### Multiple runner modules in your AWS account

The watcher acts on all spot termination notifications and logs all the ones relevant to the runner module. We therefore suggest deploying the watcher only once. You can either enable the watcher in one of your deployments or deploy it as a stand-alone module.
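
One way to follow this suggestion, sketched under assumptions: two deployments of the runner module in the same AWS account, with the watcher enabled in only one of them. The module labels `runners_a` and `runners_b` and the relative `source` path are hypothetical, and all other required inputs are omitted; only the `instance_termination_watcher` variable and its `enable` attribute are taken from the module's `variables.tf`.

```hcl
module "runners_a" {
  # Hypothetical path; point this at the runner module you deploy.
  source = "../../"

  # ... other required inputs omitted ...

  # Enable the watcher in exactly one deployment.
  instance_termination_watcher = {
    enable = true
  }
}

module "runners_b" {
  # Hypothetical path; same module, second deployment.
  source = "../../"

  # ... other required inputs omitted ...

  # No instance_termination_watcher block here: the watcher is disabled by
  # default, so this deployment relies on the one enabled in runners_a.
}
```

Going by the description above, the single watcher should still pick up notifications for instances of both deployments, as long as they carry the tag the watcher filters on.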
@@ -202,6 +192,38 @@ In case the setup does not work as intended, trace the events through this seque

## Experimental features

### Termination watcher

This feature is at an early stage and therefore disabled by default.

The termination watcher currently watches for spot termination notifications. When deployed as part of one of the main modules (root or multi-runner), the module by default only takes events into account for instances tagged with `ghr:environment`. The module can also be deployed stand-alone; in that case the tag filter needs to be tuned.

- Logs: The module logs all termination notifications. For each warning it looks up the instance details and logs the environment, the instance type, the time the instance has been running, and some other details.
- Metrics: Metrics are disabled by default to avoid costs. Once enabled, a metric is created for each warning, with at least dimensions for the environment and instance type. The metric namespace can be configured via the variables. The metric name used is `SpotInterruptionWarning`.
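
A minimal sketch of enabling the watcher through one of the main modules. The `instance_termination_watcher` variable and its `enable` attribute come from the module's `variables.tf`; the module label `runners` and the relative `source` path are placeholders, and all other required inputs are omitted.

```hcl
module "runners" {
  # Placeholder path; point this at the root or multi-runner module you deploy.
  source = "../../"

  # ... other required inputs omitted ...

  # The watcher is disabled by default, so it has to be switched on explicitly.
  instance_termination_watcher = {
    enable = true
  }
}
```

Metrics stay off in this sketch, so only log messages are produced.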

#### Log example

Below is an example of the log messages created.

```
{
"level": "INFO",
"message": "Received spot notification warning:",
"environment": "default",
"instanceId": "i-0039b8826b3dcea55",
"instanceType": "c5.large",
"instanceLaunchTime": "2024-03-15T08:10:34.000Z",
"instanceRunningTimeInSeconds": 68,
"tags": [
{
"Key": "ghr:environment",
"Value": "default"
}
... all tags ...
]
}
```

### Queue to publish workflow job events

This queue is an experimental feature that allows you to receive a copy of the `workflow_job` events sent by the GitHub App. This can be used to calculate a matrix or monitor the system.
6 changes: 4 additions & 2 deletions docs/index.md
@@ -64,9 +64,11 @@ The control plane (scale up lambda) will store the runner registration configura

The AMI cleaner is a lambda that cleans up AMIs that are older than a configurable number of days. This is useful when using the AMI builder to create AMIs. The cleaner also checks which AMIs are used in the latest version of the launch template, and you can provide SSM config paths pointing to AMI IDs. The cleaner will not delete these AMIs. The AMI cleaner is opt-in; it is not created by default.

### Spot Instance Termination Watcher
### Instance Termination Watcher

The Spot Instance Termination Watcher creates logs and optional metrics for Spot Instance termination warnings, which AWS sends two minutes before termination. The Lambda only logs instance details for instances tagged with `ghr:createdBy`. The module is not enabled by default.
> This feature is in beta; changes will not trigger a major release as long as it is in beta.

The Instance Termination Watcher creates logs and optional metrics for the termination of instances. Currently only spot termination warnings are watched. See [configuration](configuration/) for more details.

### Security

Expand Down
2 changes: 1 addition & 1 deletion examples/termination-watcher/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -50,4 +50,4 @@ No inputs.
## Outputs

No outputs.
<!-- END_TF_DOCS -->
<!-- END_TF_DOCS -->
2 changes: 1 addition & 1 deletion modules/multi-runner/variables.tf
@@ -619,7 +619,7 @@ variable "metrics_namespace" {

variable "instance_termination_watcher" {
description = <<-EOF
Configuration for the spot termination watcher lambda function.
Configuration for the spot termination watcher lambda function. This feature is in beta; changes will not trigger a major release as long as it is in beta.
`enable`: Enable or disable the spot termination watcher.
`enable_metrics`: Enable metrics for the lambda. If `spot_warning` is set to true, the lambda will emit a metric when it detects a spot termination warning.
2 changes: 1 addition & 1 deletion variables.tf
@@ -854,7 +854,7 @@ variable "metrics_namespace" {

variable "instance_termination_watcher" {
description = <<-EOF
Configuration for the instance termination watcher.
Configuration for the instance termination watcher. This feature is in beta; changes will not trigger a major release as long as it is in beta.
`enable`: Enable or disable the spot termination watcher.
`enable_metrics`: Enable or disable the metrics for the spot termination watcher.