
recommender Error adding metric sample for container #2010

Closed
qist opened this issue May 11, 2019 · 9 comments

Labels
area/vertical-pod-autoscaler lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed.

qist commented May 11, 2019

{"log":"W0511 06:07:10.273944 6 cluster_feeder.go:386] Error adding metric sample for container {{default my-rec-deployment-55c8bd8657-j5fmp} POD}: KeyError: {{default my-rec-deployment-55c8bd8657-j5fmp} POD}\n","stream":"stderr","time":"2019-05-11T06:07:10.276302417Z"}

bskiba commented May 13, 2019

Does it cause any visible issues? This is expected if we get a sample for a pod that has stopped existing in the meantime.
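
For illustration, a minimal sketch of the mechanism described above, using simplified, hypothetical types rather than the real recommender model: the feeder only tracks pods it currently believes exist, so a metric sample that refers to a pod deleted in the meantime fails the lookup and is reported as a warning rather than a hard error.

package main

import (
    "fmt"
    "log"
)

// PodID is a simplified, hypothetical stand-in for the identifier the
// recommender uses for pods (namespace + pod name).
type PodID struct {
    Namespace string
    PodName   string
}

// clusterState is a stripped-down sketch of the feeder's in-memory state:
// only pods it currently knows about have an entry.
type clusterState struct {
    pods map[PodID]struct{}
}

// addSample rejects a sample whose pod is no longer (or not yet) tracked,
// returning a key-style error instead of failing hard.
func (c *clusterState) addSample(pod PodID) error {
    if _, ok := c.pods[pod]; !ok {
        return fmt.Errorf("KeyError: %v", pod)
    }
    // ...otherwise the sample would be aggregated into the pod's histograms...
    return nil
}

func main() {
    state := &clusterState{pods: map[PodID]struct{}{}}

    // A sample arrives for a pod that was deleted in the meantime: the
    // feeder logs a warning like the one in this report and moves on.
    gone := PodID{Namespace: "default", PodName: "my-rec-deployment-55c8bd8657-j5fmp"}
    if err := state.addSample(gone); err != nil {
        log.Printf("W ... Error adding metric sample for container %v: %v", gone, err)
    }
}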

qist commented May 14, 2019

$ kube-apiserver --version
Kubernetes v1.14.0
$ docker images
k8s.gcr.io/vpa-recommender:0.5.1


apiVersion: v1
kind: ServiceAccount
metadata:
  name: vpa-recommender
  namespace: kube-system
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: vpa-recommender
  namespace: kube-system
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: vpa-recommender
    spec:
      serviceAccountName: vpa-recommender
      containers:
        - name: recommender
          image: k8s.gcr.io/vpa-recommender:0.5.1
          imagePullPolicy: Always
          args:
            - "--v=4"
            - "--storage=prometheus"
            - "--prometheus-address=http://prometheus-k8s.monitoring.svc:9090"
            - "--prometheus-cadvisor-job-name=kubelet"
          resources:
            limits:
              cpu: 200m
              memory: 1000Mi
            requests:
              cpu: 50m
              memory: 500Mi
          ports:
            - containerPort: 8080

bskiba commented May 14, 2019

@qist Sorry, but I do not understand what problem you are facing. Is the VPA misbehaving in any way or is it just the warning message worrying you?

fejta-bot commented:

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

k8s-ci-robot added the lifecycle/stale label (Denotes an issue or PR has remained open with no activity and has become stale.) on Aug 12, 2019

fejta-bot commented:

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

k8s-ci-robot added the lifecycle/rotten label (Denotes an issue or PR that has aged beyond stale and will be auto-closed.) and removed the lifecycle/stale label on Sep 11, 2019
qist closed this as completed Sep 18, 2019

lli-hiya commented Jul 20, 2020

I am also seeing this issue.

To be more specific, I am using VPA release 0.8, and I have VPAs set up for only a tiny portion of the Deployments in our cluster. When the Recommender first starts up and initializes history from Prometheus, its logs are filled with errors like

I0720 22:10:10.160316       1 cluster_feeder.go:212] Adding pod {syslog disk-metrics-1595223600-plpwj} with labels map[controller_uid:d5fa6460-cacd-4dc4-ac32-e35895069f0a job_name:disk-metrics-1595223600]
I0720 22:10:10.160335       1 cluster_feeder.go:218] Adding 11 samples for container {{syslog disk-metrics-1595223600-plpwj} disk-metrics}
W0720 22:10:10.160347       1 cluster_feeder.go:224] Error adding metric sample for container {{syslog disk-metrics-1595223600-plpwj} disk-metrics}: KeyError: {{syslog disk-metrics-1595223600-plpwj} disk-metrics}
W0720 22:10:10.160356       1 cluster_feeder.go:224] Error adding metric sample for container {{syslog disk-metrics-1595223600-plpwj} disk-metrics}: KeyError: {{syslog disk-metrics-1595223600-plpwj} disk-metrics}
...

The error logs seem to come from this code: https://github.com/kubernetes/autoscaler/blob/vpa-release-0.8/vertical-pod-autoscaler/pkg/recommender/input/cluster_feeder.go#L219-L224, and the error message for each container repeats one or more times. This suggests that the Recommender was able to get pod history samples from Prometheus, but that the containers were not initialized correctly in the ClusterState:

if !containerExists {
    return NewKeyError(sample.Container)
}

I don't understand why, because most of the containers shown in the error logs did exist.

These error logs only appear when the Recommender first starts up; they do not repeat in later cycles.

Has anyone else seen this problem? Functionally it is not a blocker, but it is very annoying: whenever I query the Recommender logs, these errors flood the screen before the useful information starts.
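
For illustration, a minimal sketch of the failure mode described above, using simplified, hypothetical types rather than the actual ClusterState implementation: the pod entry exists (so the pod lookup succeeds), but the per-container entry was never initialized before the history samples were replayed, so every sample for that container trips the containerExists check and logs a KeyError warning.

package main

import (
    "fmt"
    "log"
)

// Simplified, hypothetical stand-ins for the recommender's pod and
// container identifiers (illustration only, not the real model types).
type PodID struct{ Namespace, PodName string }

type ContainerID struct {
    PodID
    ContainerName string
}

// podState tracks per-container aggregation state; a container must be
// initialized here before samples for it can be accepted.
type podState struct {
    containers map[string]struct{}
}

type clusterState struct {
    pods map[PodID]*podState
}

// addSample mirrors the shape of the check quoted above: the pod lookup
// succeeds, but if the container entry was never created the sample is
// rejected with a KeyError.
func (c *clusterState) addSample(id ContainerID) error {
    pod, podExists := c.pods[id.PodID]
    if !podExists {
        return fmt.Errorf("KeyError: %v", id.PodID)
    }
    if _, containerExists := pod.containers[id.ContainerName]; !containerExists {
        return fmt.Errorf("KeyError: %v", id)
    }
    // ...otherwise the historical sample would be aggregated here...
    return nil
}

func main() {
    // During history init the pod itself gets added ("Adding pod ...")...
    pod := PodID{Namespace: "syslog", PodName: "disk-metrics-1595223600-plpwj"}
    state := &clusterState{pods: map[PodID]*podState{
        pod: {containers: map[string]struct{}{}}, // ...but no container entries yet.
    }}

    // Every historical sample for the container then fails the
    // containerExists check, producing one warning per sample.
    id := ContainerID{PodID: pod, ContainerName: "disk-metrics"}
    for i := 0; i < 2; i++ {
        if err := state.addSample(id); err != nil {
            log.Printf("W ... Error adding metric sample for container %v: %v", id, err)
        }
    }
}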

sjentzsch commented:

Yes, same here. I also just stumbled upon it, thinking I had gotten my label parameters wrong.

superset1 commented:

I am still getting "Error adding metric sample for container" in our logs.

theophileds commented:

I'm seeing the same error as well; it appeared after a cluster upgrade to 1.24.
