
CloudWatch Agent Fails on EKS with ContainerD Runtime #261

Closed
fitchtech opened this issue Aug 21, 2021 · 3 comments

Comments


fitchtech commented Aug 21, 2021

I have tried applying the fix from issue #188 on EKS 1.21 with the containerd runtime enabled, but I'm still getting the same error messages:

2021-08-21T00:09:00Z W! No pod metric collected, metrics count is still 5 is containerd socket mounted? #188

2021-08-21T00:09:05Z W! [outputs.cloudwatchlogs] Invalid SequenceToken used, will use new token and retry: The given sequenceToken is invalid. The next expected sequenceToken is: 49605661750447750614958043896578931231172344896032866930

2021-08-21T00:09:05Z W! [outputs.cloudwatchlogs] Retried 0 time, going to sleep 105.761168ms before retrying.

Support for containerd runtime on EKS was added in July when EKS 1.21 was released.
https://aws.amazon.com/blogs/containers/amazon-eks-1-21-released/

This is the manifest of the CloudWatch agent DaemonSet on EKS 1.21 with containerd:

apiVersion: apps/v1
kind: DaemonSet
metadata:
  annotations:
    deprecated.daemonset.template.generation: "1"
    meta.helm.sh/release-name: cloudwatch-metrics
    meta.helm.sh/release-namespace: cloudwatch
  creationTimestamp: "2021-08-21T00:08:54Z"
  generation: 1
  labels:
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: aws-cloudwatch-metrics
    app.kubernetes.io/version: "1.247345"
    helm.sh/chart: aws-cloudwatch-metrics-0.0.5
  name: cloudwatch-metrics
  namespace: cloudwatch
  resourceVersion: "6523977"
  uid: c0655b85-6861-4a09-b9ae-5eaa93917520
spec:
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app.kubernetes.io/name: aws-cloudwatch-metrics
  template:
    metadata:
      creationTimestamp: null
      labels:
        app.kubernetes.io/name: aws-cloudwatch-metrics
    spec:
      containers:
      - env:
        - name: HOST_IP
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: status.hostIP
        - name: HOST_NAME
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: spec.nodeName
        - name: K8S_NAMESPACE
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
        - name: CI_VERSION
          value: k8s/1.2.2
        image: amazon/cloudwatch-agent:1.247349.0b251399
        imagePullPolicy: IfNotPresent
        name: aws-cloudwatch-metrics
        resources:
          limits:
            cpu: 200m
            memory: 200Mi
          requests:
            cpu: 200m
            memory: 200Mi
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /etc/cwagentconfig
          name: cwagentconfig
        - mountPath: /rootfs
          name: rootfs
          readOnly: true
        - mountPath: /var/run/docker.sock
          name: dockersock
          readOnly: true
        - mountPath: /var/lib/docker
          name: varlibdocker
          readOnly: true
        - mountPath: /run/containerd/containerd.sock
          name: containerdsock
          readOnly: true
        - mountPath: /sys
          name: sys
          readOnly: true
        - mountPath: /dev/disk
          name: devdisk
          readOnly: true
      dnsPolicy: ClusterFirst
      hostNetwork: true
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      serviceAccount: aws-cloudwatch-metrics
      serviceAccountName: aws-cloudwatch-metrics
      terminationGracePeriodSeconds: 60
      volumes:
      - configMap:
          defaultMode: 420
          name: cloudwatch-metrics
        name: cwagentconfig
      - hostPath:
          path: /
          type: ""
        name: rootfs
      - hostPath:
          path: /var/run/docker.sock
          type: ""
        name: dockersock
      - hostPath:
          path: /var/lib/docker
          type: ""
        name: varlibdocker
      - hostPath:
          path: /run/containerd/containerd.sock
          type: ""
        name: containerdsock
      - hostPath:
          path: /sys
          type: ""
        name: sys
      - hostPath:
          path: /dev/disk/
          type: ""
        name: devdisk
  updateStrategy:
    rollingUpdate:
      maxSurge: 0
      maxUnavailable: 1
    type: RollingUpdate
status:
  currentNumberScheduled: 5
  desiredNumberScheduled: 5
  numberAvailable: 5
  numberMisscheduled: 0
  numberReady: 5
  observedGeneration: 1
  updatedNumberScheduled: 5
@pingleig
Member

pingleig commented Aug 21, 2021

Linked reply in #188 (comment): the reason is that the containerd socket on the host is /run/dockershim.sock, while cwagent uses /run/containerd/containerd.sock in the default manifest. You need to change the path under volumes (and keep volumeMounts as it is).
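As a sketch of the change described above (assuming an EKS 1.21 node where containerd is exposed at /run/dockershim.sock), only the hostPath of the containerdsock volume in the DaemonSet changes; the container's volumeMounts entry stays at /run/containerd/containerd.sock, so the agent still finds the socket at the in-container path it expects:

      volumes:
      - hostPath:
          # On EKS the containerd socket lives at /run/dockershim.sock on
          # the host; the existing volumeMount still maps it into the
          # container at /run/containerd/containerd.sock.
          path: /run/dockershim.sock
          type: ""
        name: containerdsock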

@jhnlsn
Contributor

jhnlsn commented Sep 23, 2021

Please feel free to reopen if you have any other questions

@jhnlsn jhnlsn closed this as completed Sep 23, 2021
@pingleig
Member

I think the problem here is that our default config is wrong for EKS, and we should:

  • change the default yaml to follow EKS's config for the containerd sock, assuming most users are on EKS rather than another distro, e.g. one created using kops
  • update the helm chart, which is not maintained by the agent team AFAIK

Initially, when I was fixing this issue, there was no EKS containerd support (except for Bottlerocket), so I followed the kops config in the yaml files.
