
Contour dont start, started to fail after automatic restart #959

Closed
vongohren opened this issue Mar 24, 2019 · 6 comments
Labels
blocked/needs-info Categorizes the issue or PR as blocked because there is insufficient information to advance it.

Comments

@vongohren

What steps did you take and what happened:
I did nothing; it just started to fail in prod after an automatic restart.
Dev was working fine, but when I restarted it as well, it began failing with the same problems. So I have not done anything to cause this.

What did you expect to happen:
I expected the clusters to run and the Contour container to work smoothly.

Anything else you would like to add:
These are the logs I get:

 time="2019-03-24T12:17:31Z" level=info msg="args: [serve --incluster]"
 E 
 time="2019-03-24T12:17:31Z" level=info msg=started context=grpc
 E 
 time="2019-03-24T12:17:31Z" level=info msg="waiting for cache sync" context=coreinformers
 E 
 time="2019-03-24T12:17:31Z" level=info msg=started context=coreinformers
 E 
 time="2019-03-24T12:17:31Z" level=info msg="waiting for cache sync" context=contourinformers
 E 
 time="2019-03-24T12:17:31Z" level=info msg=started context=contourinformers
 E 
 time="2019-03-24T12:17:31Z" level=info msg=started address="127.0.0.1:6060" context=debugsvc
 E 
 time="2019-03-24T12:17:31Z" level=info msg=started address="0.0.0.0:8000" context=metricsvc
 E 
 time="2019-03-24T12:17:31Z" level=info msg="forcing update" context=HoldoffNotifier last update=2562047h47m16.854775807s
 E 
 pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:95: Failed to list *v1beta1.TLSCertificateDelegation: tlscertificatedelegations.contour.heptio.com is forbidden: User "system:serviceaccount:heptio-contour:contour" cannot list tlscertificatedelegations.contour.heptio.com at the cluster scope E 
 pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:95: Failed to list *v1beta1.TLSCertificateDelegation: tlscertificatedelegations.contour.heptio.com is forbidden: User "system:serviceaccount:heptio-contour:contour" cannot list tlscertificatedelegations.contour.heptio.com at the cluster scope E 
 log: exiting because of error: log: cannot create log: open /tmp/contour.contour-74cc4d58c6-gt9f8.unknownuser.log.ERROR.20190324-121731.1: no such file or directory
 E 
 [2019-03-24 12:17:31.892][1][info][main] source/server/server.cc:183] initializing epoch 0 (hot restart version=10.200.16384.127.options=capacity=16384, num_slots=8209 hash=228984379728933363 size=2654312)
 E 
 [2019-03-24 12:17:31.892][1][info][main] source/server/server.cc:185] statically linked extensions:
 E 
 [2019-03-24 12:17:31.892][1][info][main] source/server/server.cc:187]   access_loggers: envoy.file_access_log,envoy.http_grpc_access_log
 E 
 [2019-03-24 12:17:31.892][1][info][main] source/server/server.cc:190]   filters.http: envoy.buffer,envoy.cors,envoy.ext_authz,envoy.fault,envoy.filters.http.header_to_metadata,envoy.filters.http.jwt_authn,envoy.filters.http.rbac,envoy.grpc_http1_bridge,envoy.grpc_json_transcoder,envoy.grpc_web,envoy.gzip,envoy.health_check,envoy.http_dynamo_filter,envoy.ip_tagging,envoy.lua,envoy.rate_limit,envoy.router,envoy.squash
 E 
 [2019-03-24 12:17:31.892][1][info][main] source/server/server.cc:193]   filters.listener: envoy.listener.original_dst,envoy.listener.proxy_protocol,envoy.listener.tls_inspector
 E 
 [2019-03-24 12:17:31.892][1][info][main] source/server/server.cc:196]   filters.network: envoy.client_ssl_auth,envoy.echo,envoy.ext_authz,envoy.filters.network.thrift_proxy,envoy.http_connection_manager,envoy.mongo_proxy,envoy.ratelimit,envoy.redis_proxy,envoy.tcp_proxy
 E 
 [2019-03-24 12:17:31.892][1][info][main] source/server/server.cc:198]   stat_sinks: envoy.dog_statsd,envoy.metrics_service,envoy.statsd
 E 
 [2019-03-24 12:17:31.892][1][info][main] source/server/server.cc:200]   tracers: envoy.dynamic.ot,envoy.lightstep,envoy.zipkin
 E 
 [2019-03-24 12:17:31.892][1][info][main] source/server/server.cc:203]   transport_sockets.downstream: envoy.transport_sockets.capture,raw_buffer,tls
 E 
 [2019-03-24 12:17:31.892][1][info][main] source/server/server.cc:206]   transport_sockets.upstream: envoy.transport_sockets.capture,raw_buffer,tls
 E 
 [2019-03-24 12:17:31.900][1][info][config] source/server/configuration_impl.cc:50] loading 0 static secret(s)
 E 
 [2019-03-24 12:17:31.904][1][critical][main] source/server/server.cc:78] error initializing configuration '/config/contour.yaml': logical_dns clusters must have a single host
 E 
 [2019-03-24 12:17:31.904][1][info][main] source/server/server.cc:437] exiting
 E 

Environment:

  • Contour version:
    contour: gcr.io/heptio-images/contour:master
  • Kubernetes version: (use kubectl version):
    not sure, it's scripted
  • Kubernetes installer & version:
    not sure, it's scripted
  • Cloud provider or hardware configuration:
    GCloud
    Machine type: n1-standard-1 (1 vCPU, 3.75 GB memory)
    Total: 3 vCPUs, 11.25 GB memory
  • OS (e.g. from /etc/os-release):
    Docker, linux
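
For context on the "forbidden" errors in the logs above: the contour service account lacks RBAC permission to list the TLSCertificateDelegation CRD, which suggests the ClusterRole in the deployed manifests had drifted from what the :master image expects. A rule along these lines would grant the missing access (group and resource names are taken from the error message; the verbs are the usual informer set and are an assumption, as is the ClusterRole name):

 # Excerpt of a ClusterRole for the contour service account. Names come from
 # the error above; get/list/watch is assumed, since that is what informers need.
 apiVersion: rbac.authorization.k8s.io/v1
 kind: ClusterRole
 metadata:
   name: contour
 rules:
 - apiGroups: ["contour.heptio.com"]
   resources: ["tlscertificatedelegations"]
   verbs: ["get", "list", "watch"]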
vongohren changed the title from "Contour have started to fail after a restart" to "Contour dont start, started to fail after automatic restart" on Mar 24, 2019
@vongohren
Author

This was solved by setting the container to a stable version, but this might be an important bug in the master branch that should be fixed before the next release.

@davecheney
Contributor

davecheney commented Mar 24, 2019 via email

@davecheney davecheney added the blocked/needs-info Categorizes the issue or PR as blocked because there is insufficient information to advance it. label Mar 29, 2019
@davecheney
Contributor

@vongohren did my previous comment prove useful? Please update the issue if you were unable to resolve the problem.

@vongohren
Author

@davecheney hi, sorry for the lack of feedback!
Actually, just switching to a stable version fixed it for me, as I mentioned in my follow-up comment #959 (comment).
No troubles after updating to 0.10.

So I have not followed up on your suggestion quite yet.
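
Concretely, "switching to a stable version" means pinning the Deployment image to a released tag instead of the moving :master. A minimal sketch (the exact tag name is an assumption based on the "0.10" above; check the project's releases):

 # contour Deployment spec, abbreviated; tag v0.10.0 is an assumed name
 containers:
 - name: contour
   image: gcr.io/heptio-images/contour:v0.10.0  # instead of :master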

unicell added a commit to unicell/contour that referenced this issue Apr 9, 2019
Related to projectcontour#959

Client-go uses glog even though Contour itself doesn't. In failure scenarios, such as CRD types not being registered, glog in client-go attempts to log to files under /tmp, which may not exist (e.g. in a `scratch` Docker image) or may not be writable (container started as a non-root user, etc.). When that happens it crashes the whole Contour process, which should not happen.

This change overrides the glog flag so that it always dumps to stderr, avoiding logging to files, which is undesirable in a container environment.

Signed-off-by: Qiu Yu <[email protected]>
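
For illustration, a minimal sketch of that kind of override, assuming glog's standard flag names (this is not the exact PR diff):

 // Force glog (pulled in transitively via client-go) to write to stderr
 // instead of files under /tmp, which may not exist in a scratch image
 // or be writable by a non-root user.
 package main

 import (
     "flag"

     _ "github.com/golang/glog" // blank import: registers glog's flags on the default FlagSet
 )

 func main() {
     flag.Parse()
     // Override after parsing so the stderr setting wins regardless of defaults.
     if err := flag.Set("logtostderr", "true"); err != nil {
         panic(err)
     }
     // ... start the rest of the process (e.g. the serve loop) here ...
 }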
@unicell
Contributor

unicell commented Apr 9, 2019

Although the issue as originally reported was caused by a mismatch between the CRDs registered in the cluster and those Contour recognizes, Contour shouldn't crash when that happens. I filed a PR above to fix the crash.

@davecheney
Contributor

I’m going to close this now that #1004 has landed.
