ALB target group tag selections failing due to resource id #268

Closed
muvster opened this issue Apr 24, 2020 · 9 comments · Fixed by #648

Comments

@muvster

muvster commented Apr 24, 2020

Hi. I fail to get any matching metrics with this configuration (running 0.8.0):

    metrics:
    - aws_namespace: AWS/ApplicationELB
      aws_metric_name: HealthyHostCount
      aws_tag_select:
        tag_selections:
          "ingress.k8s.aws/cluster": [myCluster]
          "kubernetes.io/namespace": [myNamespace]
        resource_type_selection: "elasticloadbalancing:targetgroup"
        resource_id_dimension: TargetGroup
      aws_dimensions: [TargetGroup,LoadBalancer]
      aws_statistics: [Minimum]

It works if I instead filter by LoadBalancer:

        resource_type_selection: "elasticloadbalancing:loadbalancer"
        resource_id_dimension: LoadBalancer

From reading the source, it looks to me like the reason for this is that the resource IDs that are extracted from the ARNs don't match the dimension values returned by listMetrics() in the target group case.

The ResourceARNs returned by aws resourcegroupstaggingapi get-resources ... look like this for load balancers and target groups, respectively (lightly obfuscated):

    "ResourceARN": "arn:aws:elasticloadbalancing:eu-north-1:1234567890:loadbalancer/app/ce9e96ed-somelbname-6b08/3a16156ee3244yyy"
    "ResourceARN": "arn:aws:elasticloadbalancing:eu-north-1:1234567890:targetgroup/ce9e96ed-cbe56901daf6f7a1xxx/37def13b297a7yyy"

As far as I can tell, this would result in the following extracted resource IDs:

    app/ce9e96ed-somelbname-6b08/3a16156ee3244yyy
    ce9e96ed-cbe56901daf6f7a1xxx/37def13b297a7yyy

However, aws list-metrics ... returns these dimensions:

        {
            "Namespace": "AWS/ApplicationELB",
            "MetricName": "HealthyHostCount",
            "Dimensions": [
                {
                    "Name": "TargetGroup",
                    "Value": "targetgroup/ce9e96ed-cbe56901daf6f7a1xxx/37def13b297a7yyy"
                },
                {
                    "Name": "LoadBalancer",
                    "Value": "app/ce9e96ed-somelbname-6b08/3a16156ee3244yyy"
                }
            ]
        }

So the value for the LoadBalancer dimension matches the extracted ID, but the one for TargetGroup doesn't (because of the targetgroup/ prefix).
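
The mismatch can be reproduced outside the exporter. What follows is a minimal Python sketch, not the exporter's actual Java code: `extract_resource_id` is a hypothetical helper that assumes the ID is whatever follows the first `/` in the last colon-separated ARN field, which is consistent with the extracted IDs listed above.

```python
def extract_resource_id(arn: str) -> str:
    """Hypothetical helper: an assumed approximation of the ID extraction."""
    resource = arn.split(":")[-1]  # e.g. "targetgroup/<name>/<hash>"
    if "/" in resource:
        # Drop the leading resource-type segment ("loadbalancer", "targetgroup").
        resource = resource.split("/", 1)[1]
    return resource


# ARNs and dimension values copied from the outputs quoted above.
LB_ARN = ("arn:aws:elasticloadbalancing:eu-north-1:1234567890:"
          "loadbalancer/app/ce9e96ed-somelbname-6b08/3a16156ee3244yyy")
TG_ARN = ("arn:aws:elasticloadbalancing:eu-north-1:1234567890:"
          "targetgroup/ce9e96ed-cbe56901daf6f7a1xxx/37def13b297a7yyy")
LB_DIMENSION = "app/ce9e96ed-somelbname-6b08/3a16156ee3244yyy"
TG_DIMENSION = "targetgroup/ce9e96ed-cbe56901daf6f7a1xxx/37def13b297a7yyy"

lb_matches = extract_resource_id(LB_ARN) == LB_DIMENSION
tg_matches = extract_resource_id(TG_ARN) == TG_DIMENSION
print(lb_matches, tg_matches)
```

Under that assumption the load balancer ID equals its dimension value, while the target group ID lacks the `targetgroup/` prefix and so never matches.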

I can't say that I understand why the dimension values look like they do. Perhaps there's some CloudWatch bug there. But on the other hand it doesn't seem clear to me that the current resource ID extraction will work for all ARN flavours given the IMO less than super-clear documentation at https://docs.aws.amazon.com/general/latest/gr/aws-arns-and-namespaces.html.

Could it be an option to just pass around the full ARNs in CloudWatchCollector.java and consider a metric a match as long as there's an ARN that ends with the dimension value?
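
To illustrate that suggestion, here is a small Python sketch (the exporter itself is Java, and `matches_any_arn` is a made-up name, not an existing function). It treats a dimension value as matching when some tagged ARN ends with it. Requiring the match to start at a `:` or `/` boundary is my own added assumption, meant to avoid accidental partial-name matches.

```python
from typing import Iterable


def matches_any_arn(dimension_value: str, arns: Iterable[str]) -> bool:
    """Hypothetical check: does any ARN end with the dimension value?"""
    if not dimension_value:
        return False
    for arn in arns:
        if not arn.endswith(dimension_value):
            continue
        prefix = arn[: len(arn) - len(dimension_value)]
        # Accept only if the suffix starts right after ':' or '/'
        # (or the dimension value is the entire ARN).
        if prefix == "" or prefix[-1] in ":/":
            return True
    return False


TAGGED_ARNS = [
    "arn:aws:elasticloadbalancing:eu-north-1:1234567890:"
    "loadbalancer/app/ce9e96ed-somelbname-6b08/3a16156ee3244yyy",
    "arn:aws:elasticloadbalancing:eu-north-1:1234567890:"
    "targetgroup/ce9e96ed-cbe56901daf6f7a1xxx/37def13b297a7yyy",
]
```

With the values from this issue, both `"targetgroup/ce9e96ed-cbe56901daf6f7a1xxx/37def13b297a7yyy"` and `"app/ce9e96ed-somelbname-6b08/3a16156ee3244yyy"` would be accepted.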

@brian-brazil
Contributor

@louisfelix do you know why these are differing? Offhand I'd presume a bug on the AWS side.

@louisfelix
Contributor

@brian-brazil No, I don't know. I'd also presume an AWS bug here; this is unexpected to me.

@muvster
Author

muvster commented Apr 25, 2020

Thanks for your attention. I didn't see this before posting, but it looks like this is actually the expected naming. From https://docs.aws.amazon.com/elasticloadbalancing/latest/application/load-balancer-cloudwatch-metrics.html:

[Image: alb_monitoring]

@brian-brazil
Contributor

So an AWS design flaw then.

@muvster
Author

muvster commented Apr 25, 2020

Does seem a bit arbitrary, yes.

In this particular case it would work to (1) assume that the dimension value will be a suffix of the ARN instead of extracting the resource ID, as mentioned above. But for full generality I suppose something like (2) a user-provided regex-based mapping from ARN to dimension value would be needed. Alternatively, (3) special cases in the code akin to the one for "AWS/DynamoDB".
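
Option (2) could look roughly like the following Python sketch. The rule table, its keys, and `arn_to_dimension_value` are all hypothetical and do not correspond to any existing exporter configuration. Each rule is a regex whose first capture group is taken as the dimension value.

```python
import re
from typing import Optional

# Hypothetical user-provided rules: resource type -> regex with one capture group.
ARN_RULES = {
    # Target groups keep the "targetgroup/" prefix in the dimension value.
    "elasticloadbalancing:targetgroup": re.compile(r":(targetgroup/.+)$"),
    # Load balancers drop the "loadbalancer/" prefix.
    "elasticloadbalancing:loadbalancer": re.compile(r":loadbalancer/(.+)$"),
}


def arn_to_dimension_value(resource_type: str, arn: str) -> Optional[str]:
    """Map an ARN to the expected dimension value, or None if no rule applies."""
    rule = ARN_RULES.get(resource_type)
    if rule is None:
        return None
    match = rule.search(arn)
    return match.group(1) if match else None
```

The appeal of this variant is that a new ARN flavour needs only a new rule rather than a code change, at the cost of users having to know the ARN layout for each service.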

@eranreshef

So has anyone opened a ticket with AWS about it?
Is there a planned workaround in cloudwatch-exporter?

@tokheim

tokheim commented Aug 10, 2020

MSK seems to have a similar issue with a mismatching ARN ID. As seen in https://docs.aws.amazon.com/msk/latest/developerguide/msk-create-cluster.html, a cluster is given an ARN like "arn:aws:kafka:us-east-1:123456789012:cluster/CustomConfigExampleCluster/abcd1234-abcd-dcba-4321-a1b2abcd9f9f-2". The random characters appended to the ARN keep it from matching the "Cluster Name" resource dimension (which would be CustomConfigExampleCluster in this example).

It would be good to have an extended syntax for tag selections, so people can work around such problems on their own. @muvster's 2nd suggestion would seem to solve the original issue, my example, and #273.
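
For the MSK example, a regex that captures the middle path segment would recover the cluster name, whereas a plain suffix comparison would not. A minimal Python sketch, where the pattern and `msk_cluster_name` are illustrative assumptions rather than an existing exporter feature:

```python
import re
from typing import Optional

# Assumed ARN layout: ...:cluster/<cluster name>/<generated id>
MSK_CLUSTER_PATTERN = re.compile(
    r"^arn:aws:kafka:[^:]*:[^:]*:cluster/([^/]+)/[^/]+$"
)


def msk_cluster_name(arn: str) -> Optional[str]:
    """Return the cluster-name segment of an MSK cluster ARN, or None."""
    match = MSK_CLUSTER_PATTERN.match(arn)
    return match.group(1) if match else None


MSK_ARN = ("arn:aws:kafka:us-east-1:123456789012:cluster/"
           "CustomConfigExampleCluster/abcd1234-abcd-dcba-4321-a1b2abcd9f9f-2")

name = msk_cluster_name(MSK_ARN)
# Suffix matching alone fails here because the generated ID trails the name.
suffix_match = MSK_ARN.endswith("CustomConfigExampleCluster")
print(name, suffix_match)
```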

@brian-brazil
Contributor

So sometimes the extraneous data is a suffix, sometimes it's a prefix?

@josephreynolds

I also encountered the MSK issue last summer, and today I experienced a similar issue for billing, which caused the "inconsistent label names" message. In the latter case the exporter was yace.

It would be nice to drop/rewrite before the metric is converted to a prometheus metric.

msvticket added a commit to msvticket/cloudwatch_exporter that referenced this issue Feb 8, 2024
msvticket added a commit to msvticket/cloudwatch_exporter that referenced this issue Mar 4, 2024

6 participants