# AWS Integration
The AWS integration is used to fetch logs and metrics from [Amazon Web Services](https://aws.amazon.com/).
Use the AWS integration to collect metrics and logs across many AWS services managed by your AWS account.
Visualize that data in Kibana, create alerts to notify you if something goes wrong,
and reference data when troubleshooting an issue.
**This integration generates extra AWS charges for CloudWatch API requests. See [API requests](#api-requests) for more details.**
## Data streams
The AWS integration collects two types of data, logs and metrics, across many AWS services.
**Logs** help you keep a record of events that happen in your AWS account.
This may include every user request that CloudFront receives, every action taken on your services
by an AWS user or role, and more.
**Metrics** give you insight into the state of your AWS services.
This may include understanding where you're spending the most and why, the volume of storage you're using,
CPU utilization of your instances, and more.
For a complete list of all AWS services and the data streams available for each, see [Reference](#reference).
## API requests
### Overview
The AWS integration uses several AWS APIs to bootstrap and to collect metrics and logs. The following table lists the APIs that the integration calls, how many requests each one makes, and how often.
Each of these APIs may generate extra charges on your AWS account. Refer to [AWS Pricing](https://aws.amazon.com/pricing) for more information.
| AWS API Name | AWS API Count | Frequency | Datastream |
|------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--------------------------------------------------------------------|----------------------|
| IAM ListAccountAliases | 1 | Once on startup | all |
| STS GetCallerIdentity | 1 | Once on startup | all |
| EC2 DescribeRegions | 1 | Once on startup | all |
| CloudWatch ListMetrics | Total number of results / ListMetrics max page size (500, based on [AWS API ListMetrics](https://docs.aws.amazon.com/AmazonCloudWatch/latest/APIReference/API_ListMetrics.html)) | Per region per collection period | metrics related only |
| CloudWatch GetMetricData | Total number of results / GetMetricData max page size (500, based on [AWS API GetMetricData](https://docs.aws.amazon.com/AmazonCloudWatch/latest/APIReference/API_GetMetricData.html)) | Per region per namespace per collection period | metrics related only |
| CloudWatch DescribeLogGroups | Total number of results / DescribeLogGroups max page size (50, based on [AWS API DescribeLogGroups](https://docs.aws.amazon.com/AmazonCloudWatchLogs/latest/APIReference/API_DescribeLogGroups.html)) | Per region per collection period | logs related only |
| CloudWatch FilterLogEvents | Total number of results / FilterLogEvents max page size (1 MB or 10,000 events, based on [AWS API FilterLogEvents](https://docs.aws.amazon.com/AmazonCloudWatchLogs/latest/APIReference/API_FilterLogEvents.html)) | Per log group per region per collection period | logs related only |
| CostExplorer GetCostAndUsage | Total number of results / GetCostAndUsage max page size (8192, based on [AWS API GetCostAndUsage](https://docs.aws.amazon.com/aws-cost-management/latest/APIReference/API_GetCostAndUsage.html)) | Per CostExplorer Group Definition per region per collection period | AWS Billing |
| S3 ListObjectsV2 | Total number of results / ListObjectsV2 max page size (up to 1,000, based on [AWS API ListObjectsV2](https://docs.aws.amazon.com/AmazonS3/latest/API/API_ListObjectsV2.html)) | Per bucket per region per collection period | logs related only |
| S3 GetObject | 1 | Per object per collection period | logs related only |
| SecurityHub GetFindings | Total number of results / GetFindings max page size (100, based on [AWS API GetFindings](https://docs.aws.amazon.com/securityhub/1.0/APIReference/API_GetFindings.html)) | Per region per collection period | AWS Security Hub |
| SecurityHub GetInsights | Total number of results / GetInsights max page size (100, based on [AWS API GetInsights](https://docs.aws.amazon.com/securityhub/1.0/APIReference/API_GetInsights.html)) | Per region per collection period | AWS Security Hub |
| SQS ReceiveMessage | 1 | Every 20s minimum (more frequent if messages are waiting) | logs related only (S3 notifications) |
| SQS DeleteMessage | 1 | Once per received message | logs related only (S3 notifications) |
| SQS ChangeMessageVisibility | 1 | When message processing exceeds 150s | logs related only (S3 notifications) |
| SQS GetQueueAttributes | 1 | Every minute to capture queue depth metric | logs related only (S3 notifications) |
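
The "Total number of results / max page size" entries in the table amount to a ceiling division. The following is a minimal sketch of that arithmetic, not the integration's own code. The function name `estimated_calls` is hypothetical, the page sizes are copied from the table, and the rule that an empty result set still costs one request is an assumption.

```python
import math

# Maximum page sizes, copied from the table above.
MAX_PAGE_SIZE = {
    "ListMetrics": 500,
    "GetMetricData": 500,
    "DescribeLogGroups": 50,
    "ListObjectsV2": 1000,
    "GetFindings": 100,
}


def estimated_calls(api_name: str, total_results: int) -> int:
    """Estimate how many paginated requests are needed to fetch all results.

    Hypothetical helper: it only models pagination, not retries or throttling.
    """
    if total_results < 0:
        raise ValueError("total_results must be non-negative")
    pages = math.ceil(total_results / MAX_PAGE_SIZE[api_name])
    # Assumption: at least one request is sent even when nothing is returned.
    return max(1, pages)
```

For example, 1,200 metrics in one region need three `ListMetrics` requests per collection period, because the last page is only partially filled.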
### Metrics collection and cost considerations
For each AWS service you enable metrics data collection for, the AWS integration will collect metrics in all the AWS regions where there are available metrics for that service. The collection period is also set to sensible defaults
that should fit the majority of use cases.
The extra charges generated by GetMetricData API calls are proportional to how frequently data is collected and to the number of metrics queried. If you are concerned about the cost incurred by enabling metrics collection, we recommend reviewing the following parameters:
* `Regions`. By selecting only the AWS Regions you are interested in, you can make sure that no unnecessary CloudWatch API call is performed against irrelevant AWS regions.
* `Collection Period` and `Data Granularity`. Together, these settings control how frequently your metrics are collected and how granular they are. If you can tolerate an extra delay in retrieving metrics as a trade-off, consider setting `Data Granularity` and increasing the value of `Collection Period` to reduce extra charges. For example, setting `Data Granularity` to your current value for `Collection Period`, and then doubling the value of `Collection Period`, may lead to 50% savings.
* `Tags Filter`. By specifying a tag, you can ensure that no CloudWatch API call is performed for AWS resources you are not interested in.
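
The effect of the collection period on cost can be approximated with simple arithmetic. The sketch below assumes that charges scale linearly with the number of metrics requested per day, which is a simplification; refer to AWS Pricing for the actual rates. The function names are hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def metrics_requested_per_day(metric_count: int, collection_period_s: int) -> int:
    """Metrics requested per day in one region for a given collection period."""
    if metric_count < 0 or collection_period_s <= 0:
        raise ValueError("metric_count must be >= 0 and the period must be > 0")
    cycles_per_day = SECONDS_PER_DAY // collection_period_s
    return cycles_per_day * metric_count


def savings_ratio(old_period_s: int, new_period_s: int) -> float:
    """Fraction of requests saved when moving from the old to the new period."""
    if old_period_s <= 0 or new_period_s <= 0:
        raise ValueError("periods must be > 0")
    return 1 - old_period_s / new_period_s
```

With 1,000 metrics, going from a 300-second to a 600-second collection period halves the number of collection cycles per day, which is where the 50% figure comes from.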
### Cross-account observability
The `include_linked_accounts` parameter enables the inclusion of metrics from different accounts linked to a
main monitoring account. By setting this parameter to true, users can gather metrics from multiple AWS accounts that are
linked through [CloudWatch cross-account observability](https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/CloudWatch-Unified-Cross-Account.html).
Internally, the agent uses the [ListMetrics](https://docs.aws.amazon.com/AmazonCloudWatch/latest/APIReference/API_ListMetrics.html) API to include metrics from the monitoring account and all linked source accounts in the returned data, providing a comprehensive cross-account view.
You can also use the `owning_account` parameter to narrow the scope of cross-account observability. This parameter accepts a valid AWS account ID, which should be linked to the monitoring account.
If configured, metrics are collected only from the specified linked (owning) account.
This parameter maps to the [OwningAccount](https://docs.aws.amazon.com/AmazonCloudWatch/latest/APIReference/API_ListMetrics.html#API_ListMetrics_RequestParameters) parameter of the ListMetrics API request.
For logs, the integration supports monitoring log groups from linked accounts when log groups are selected using the `log_group_name_prefix` option.
You can enable `include_linked_accounts_for_prefix_mode` to include log groups from linked accounts. This is disabled by default.
*_Note_:* Users should ensure that the necessary IAM roles and policies are properly set up in order to link the monitoring
account and source accounts together.
Please see [Link monitoring accounts with source accounts](https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/CloudWatch-Unified-Cross-Account-Setup.html#CloudWatch-Unified-Cross-Account-Setup-permissions) for more details.
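
To make the relationship between the two metrics options and the ListMetrics request concrete, here is a hedged sketch that builds the request parameters. `IncludeLinkedAccounts` and `OwningAccount` are documented ListMetrics request parameters; the function name `build_list_metrics_params` is hypothetical, and how the agent actually assembles its request is an assumption.

```python
from typing import Any, Dict, Optional


def build_list_metrics_params(
    namespace: str,
    include_linked_accounts: bool = False,
    owning_account: Optional[str] = None,
) -> Dict[str, Any]:
    """Build ListMetrics request parameters from the integration options.

    Hypothetical helper for illustration; the resulting dict has the shape
    accepted by the CloudWatch ListMetrics API.
    """
    params: Dict[str, Any] = {"Namespace": namespace}
    if include_linked_accounts:
        params["IncludeLinkedAccounts"] = True
        if owning_account is not None:
            # AWS account IDs are 12-digit numbers.
            if not (owning_account.isdigit() and len(owning_account) == 12):
                raise ValueError("owning_account must be a 12-digit account ID")
            # OwningAccount only applies together with IncludeLinkedAccounts.
            params["OwningAccount"] = owning_account
    return params
```

In this sketch, `owning_account` is ignored unless `include_linked_accounts` is enabled, which mirrors the ListMetrics requirement that both be set to filter on a single source account.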
## Requirements
Before using the AWS integration you will need:
* **AWS Credentials** to connect with your AWS account.
* **AWS Permissions** to make sure the user you're using to connect has permission to share the relevant data.
### AWS Credentials
AWS credentials are required for running AWS integrations.
There are a few ways to provide AWS credentials:
* Use access keys directly
* Use temporary security credentials
* Use a shared credentials file
* Use an IAM role Amazon Resource Name (ARN)
#### Use access keys directly
Access keys are long-term credentials for an IAM user or the AWS account root user.
To use access keys as credentials, you need to provide:
* `access_key_id`: The first part of the access key.
* `secret_access_key`: The second part of the access key.
For more details see [AWS Access Keys and Secret Access Keys](https://docs.aws.amazon.com/general/latest/gr/aws-sec-cred-types.html#access-keys-and-secret-access-keys).
#### Use temporary security credentials
Temporary security credentials can be configured in AWS to last for some period of time.
They consist of an access key ID, a secret access key, and a security token, which is
typically returned using `GetSessionToken`.
IAM users with multi-factor authentication (MFA) enabled need to submit an MFA code
while calling `GetSessionToken`.
For more details see [Temporary Security Credentials](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_temp.html).
You can use the AWS CLI to generate temporary credentials.
For example, you would use `sts get-session-token` if you have MFA enabled:
```shell
aws sts get-session-token --serial-number arn:aws:iam::1234:mfa/[email protected] --duration-seconds 129600 --token-code 123456
```
Then, use the response to provide the following options to the AWS integration:
* `access_key_id`: The first part of the access key.
* `secret_access_key`: The second part of the access key.
* `session_token`: A token required when using temporary security credentials.
Because temporary security credentials are short term, after they expire, you will need
to generate new ones and manually update the package configuration to continue collecting AWS metrics.
This will cause data loss if the configuration is not updated with the new credentials before the old ones expire.
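
As an illustration of mapping the `GetSessionToken` response to the integration options, the sketch below parses the JSON that the AWS CLI prints. The field names under `Credentials` come from the STS response. The sample values are made up, and the helper names `to_integration_options` and `is_expired` are hypothetical.

```python
import json
from datetime import datetime, timezone

# Made-up sample output in the shape returned by `aws sts get-session-token`.
SAMPLE_OUTPUT = """
{
  "Credentials": {
    "AccessKeyId": "ASIAEXAMPLEKEYID",
    "SecretAccessKey": "exampleSecretAccessKey",
    "SessionToken": "exampleSessionToken",
    "Expiration": "2023-05-11T10:30:00Z"
  }
}
"""


def to_integration_options(cli_output: str) -> dict:
    """Map the STS response fields to the integration option names."""
    creds = json.loads(cli_output)["Credentials"]
    return {
        "access_key_id": creds["AccessKeyId"],
        "secret_access_key": creds["SecretAccessKey"],
        "session_token": creds["SessionToken"],
    }


def is_expired(cli_output: str, now: datetime) -> bool:
    """Return True if the temporary credentials have expired at `now`."""
    raw = json.loads(cli_output)["Credentials"]["Expiration"]
    # Normalize a trailing "Z" so older Python versions can parse it.
    expiration = datetime.fromisoformat(raw.replace("Z", "+00:00"))
    return now >= expiration
```

Checking the expiration ahead of time gives you a chance to rotate the credentials in the package configuration before collection stops.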
#### Use a shared credentials file
If you use different credentials for different tools or applications, you can use profiles to
configure multiple access keys in the same configuration file.
For more details see [Create Shared Credentials File](https://docs.aws.amazon.com/sdkref/latest/guide/file-format.html#file-format-creds).
Instead of providing the `access_key_id` and `secret_access_key` directly to the integration,
you will provide two advanced options to look up the access keys in the shared credentials file:
* `credential_profile_name`: The profile name in the shared credentials file.
* `shared_credential_file`: The path to the shared credentials file.
**Note**: If you don't provide values for all keys, the integration will use defaults:
- If `access_key_id`, `secret_access_key` and `role_arn` are all not provided, then the package will check for `credential_profile_name`.
- If there is no `credential_profile_name` given, the default profile will be used.
- If `shared_credential_file` is empty, the default location will be used:
  - On Windows, the shared credentials file is located at `C:\Users\<yourUserName>\.aws\credentials`.
  - On Linux, macOS, or Unix, the file is located at `~/.aws/credentials`.
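
The lookup described above can be sketched with the Python standard library. This is an illustration of the file format and the defaults, not the integration's implementation. The function names are hypothetical; the key names `aws_access_key_id` and `aws_secret_access_key` are the ones used by the shared credentials file format.

```python
import configparser
from pathlib import Path
from typing import Dict, Optional


def default_credentials_path() -> Path:
    # Path.home() resolves to C:\Users\<yourUserName> on Windows
    # and to ~ on Linux, macOS, or Unix.
    return Path.home() / ".aws" / "credentials"


def read_profile(
    credential_profile_name: Optional[str] = None,
    shared_credential_file: Optional[str] = None,
) -> Dict[str, str]:
    """Read access keys for a profile, falling back to the documented defaults."""
    path = Path(shared_credential_file) if shared_credential_file else default_credentials_path()
    profile = credential_profile_name or "default"

    parser = configparser.ConfigParser()
    if not parser.read(path):
        raise FileNotFoundError(f"cannot read credentials file: {path}")
    if not parser.has_section(profile):
        raise KeyError(f"profile not found: {profile}")

    section = parser[profile]
    return {
        "access_key_id": section["aws_access_key_id"],
        "secret_access_key": section["aws_secret_access_key"],
    }
```

Calling `read_profile()` with no arguments reads the `default` profile from the default location for the current operating system.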
#### Use an IAM role Amazon Resource Name (ARN)
An IAM role is an IAM identity that you can create in your AWS account, and its ARN uniquely identifies it. You determine what the role has permission to do.
A role does not have standard long-term credentials such as a password or access keys associated with it.
Instead, when you assume a role, it provides you with temporary security credentials for your role session.
An IAM role ARN specifies which AWS IAM role to assume to generate temporary credentials.
For more details see [AssumeRole API documentation](https://docs.aws.amazon.com/STS/latest/APIReference/API_AssumeRole.html).
To use an IAM role ARN, you need to provide either a [credential profile](#use-a-shared-credentials-file) or
[access keys](#use-access-keys-directly) along with the `role_arn` advanced option.
`role_arn` is used to specify which AWS IAM role to assume for generating temporary credentials.
Note: If `role_arn` is given, the package will check if access keys are given.
If they are not given, the package will check for a credential profile name.
If neither is given, the default credential profile will be used.
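
The fallback order in the notes above can be summarized as a small decision function. This is a sketch of the documented order only; the function name `resolve_credentials` and the returned labels are hypothetical and do not correspond to anything in the integration.

```python
from typing import Optional, Tuple


def resolve_credentials(
    access_key_id: Optional[str] = None,
    secret_access_key: Optional[str] = None,
    credential_profile_name: Optional[str] = None,
    role_arn: Optional[str] = None,
) -> Tuple[str, bool]:
    """Return (base credential source, whether a role is assumed on top of it)."""
    if access_key_id and secret_access_key:
        # Access keys take precedence when they are provided.
        base = "access_keys"
    elif credential_profile_name:
        # Otherwise fall back to the named profile in the shared credentials file.
        base = f"profile:{credential_profile_name}"
    else:
        # With nothing else given, the default profile is used.
        base = "profile:default"
    # When role_arn is set, the base credentials are used to assume that role.
    return base, bool(role_arn)
```

For instance, passing only `role_arn` yields `("profile:default", True)`: the default profile supplies the base credentials, which are then used to assume the role.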
### AWS Permissions
Specific AWS permissions are required for the IAM user to make specific AWS API calls.
To enable the AWS integration to collect metrics and logs from all supported services,
make sure these permissions are given:
* `ce:GetCostAndUsage`
* `cloudwatch:GetMetricData`
* `cloudwatch:ListMetrics`
* `ec2:DescribeInstances`
* `ec2:DescribeRegions`
* `iam:ListAccountAliases`
* `inspector2:ListFindings`
* `logs:DescribeLogGroups`
* `logs:FilterLogEvents`
* `organizations:ListAccounts`
* `rds:DescribeDBInstances`
* `rds:ListTagsForResource`
* `s3:GetBucketLocation`
* `s3:GetObject`
* `s3:ListBucket`
* `sns:ListTopics`
* `sqs:ChangeMessageVisibility`
* `sqs:DeleteMessage`
* `sqs:GetQueueAttributes`
* `sqs:ListQueues`
* `sqs:ReceiveMessage`
* `sts:AssumeRole`
* `sts:GetCallerIdentity`
* `tag:GetResources`
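
One way to grant these permissions is an IAM policy document. The sketch below generates one from the list above. Using `"Resource": "*"` is an assumption made for brevity, and you may want to scope resources more tightly. The function name `build_policy` is hypothetical.

```python
import json
from typing import Iterable

# Actions copied from the list above.
REQUIRED_ACTIONS = [
    "ce:GetCostAndUsage",
    "cloudwatch:GetMetricData",
    "cloudwatch:ListMetrics",
    "ec2:DescribeInstances",
    "ec2:DescribeRegions",
    "iam:ListAccountAliases",
    "inspector2:ListFindings",
    "logs:DescribeLogGroups",
    "logs:FilterLogEvents",
    "organizations:ListAccounts",
    "rds:DescribeDBInstances",
    "rds:ListTagsForResource",
    "s3:GetBucketLocation",
    "s3:GetObject",
    "s3:ListBucket",
    "sns:ListTopics",
    "sqs:ChangeMessageVisibility",
    "sqs:DeleteMessage",
    "sqs:GetQueueAttributes",
    "sqs:ListQueues",
    "sqs:ReceiveMessage",
    "sts:AssumeRole",
    "sts:GetCallerIdentity",
    "tag:GetResources",
]


def build_policy(actions: Iterable[str]) -> dict:
    """Build an IAM policy document that allows the given actions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": sorted(set(actions)),
                # Assumption: all resources; narrow this for least privilege.
                "Resource": "*",
            }
        ],
    }


policy_json = json.dumps(build_policy(REQUIRED_ACTIONS), indent=2)
```

If you only enable some data streams, you can pass a subset of `REQUIRED_ACTIONS` to `build_policy` instead of the full list.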
## Setup
Use the AWS integration to connect to your AWS account and collect data from multiple AWS services.
When you configure the integration, you can collect data from as many AWS services as you'd like.
If you only need to collect data from one AWS service, consider using the individual integration
(for example, to only collect monitoring metrics for EC2, you can configure only the **AWS EC2** integration).
For step-by-step instructions on how to set up an integration, see the
{{ url "getting-started-observability" "Getting started" }} guide.
## Debug
### Latency causes missing metrics
Some AWS services send monitoring metrics to CloudWatch with a processing latency that is longer than the integration's collection
period. This causes data points to be missing, or none to be collected by the agent at all. In this case, specify a
`latency` parameter so that the collection start time and end time are shifted by the given latency amount.
To check how large the latency is, log into the AWS CloudWatch portal. Wait until a new data point
shows up in AWS CloudWatch and record the current timestamp. Compare the timestamp of this latest data point with the
current timestamp. The difference between the two can be used as the latency.
For example, the screenshot below is taken at `2023-05-09 22:30 UTC` and the timestamp for the last data point is
`2023-05-09 22:15 UTC`. This means there is a 15-minute delay between the current time and the latest data in CloudWatch. With this information,
we should set the `latency` configuration to `15m` when adding the integration.
![CloudWatch Last Data Point Timestamp](../img/metricbeat-aws-cloudwatch-latency.png)
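
The latency calculation from the example above can be written out as a short sketch. The function name `estimate_latency` is hypothetical, and rounding up to whole minutes is an assumption made so that the suggested value never undershoots the observed delay.

```python
import math
from datetime import datetime, timezone


def estimate_latency(observed_at: datetime, last_data_point: datetime) -> str:
    """Return a latency value such as "15m" from two timestamps."""
    delta_seconds = (observed_at - last_data_point).total_seconds()
    if delta_seconds < 0:
        raise ValueError("last_data_point must not be later than observed_at")
    # Assumption: round up to whole minutes.
    minutes = math.ceil(delta_seconds / 60)
    return f"{minutes}m"


# Timestamps taken from the example above.
observed_at = datetime(2023, 5, 9, 22, 30, tzinfo=timezone.utc)
last_data_point = datetime(2023, 5, 9, 22, 15, tzinfo=timezone.utc)
latency = estimate_latency(observed_at, last_data_point)
```

With the two timestamps from the example, `latency` is `"15m"`.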
## Reference
Below is an overview of the type of data you can collect from each AWS service.
Visit the page for each individual AWS integration to see details about exported fields.
| Service | Metrics | Logs |
|------------------|:-------:|:-------:|
| API Gateway | x | |
| Billing | x | |
| CloudFront | | x |
| CloudTrail | | x |
| CloudWatch | x | x |
| DynamoDB | x | |
| EBS | x | |
| EC2 | x | x |
| ECS | x | |
| ELB | x | x |
| Fargate | x | |
| Kinesis | x | |
| Network Firewall | x | x |
| Lambda | x | |
| NAT Gateway | x | |
| Redshift | x | |
| RDS | x | |
| Route 53 | | x |
| S3 | x | x |
| S3 Storage Lens | x | |
| SNS | x | |
| SQS | x | |
| Transit Gateway | x | |
| Usage | x | |
| VPC Flow | | x |
| VPN | x | |
| WAF | | x |
| Custom | | x |