consul panics on read of multi-service definition #3874

Closed
davidkarlsen opened this issue Feb 8, 2018 · 8 comments
Labels
type/bug: Feature does not function as expected
type/crash: The issue description contains a golang panic and stack trace

@davidkarlsen

davidkarlsen commented Feb 8, 2018

Description of the Issue (and unexpected/desired result)

consul panics when a multi-service definition is placed as a .json file in the services dir and loaded with consul reload. It should at least not panic; it should report the format error and ignore the definition.

Reproduction steps

Create service def like this:

less /tmp/pmq-exporter.json 
{
  "services": [
    {
      "id": "myhostip1:mq-exporter:9101",
      "name": "pmq-exporter",
      "tags": ["prom_monitored", "metrics_path=/metrics", "ogteams=fsdevops"],
      "address": "myhostip1",
      "port": 9101
    },
    {
      "id": "myhostip2:mq-exporter:9101",
      "name": "pmq-exporter",
      "tags": ["prom_monitored", "metrics_path=/metrics", "ogteams=fsdevops"],
      "address": "myhostip2",
      "port": 9101
    }
  ]
}

This follows the "Multiple service definitions" section of https://www.consul.io/docs/agent/services.html.
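For illustration, here is a minimal Go sketch of the behaviour requested above: parse a multi-service file and return a descriptive error rather than crashing. The types `serviceDef` and `serviceFile` and the function `parseServices` are hypothetical names made up for this example; they are not Consul's own code, and only the fields from the file above are modelled.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// serviceDef is a hypothetical, trimmed-down shape for one entry of the
// "services" array. It is NOT Consul's internal type.
type serviceDef struct {
	ID      string   `json:"id"`
	Name    string   `json:"name"`
	Tags    []string `json:"tags"`
	Address string   `json:"address"`
	Port    int      `json:"port"`
}

// serviceFile models the top-level object of a multi-service file.
type serviceFile struct {
	Services []serviceDef `json:"services"`
}

// parseServices returns the decoded definitions, or an error that
// describes what is wrong with the file so the caller can skip it.
func parseServices(data []byte) ([]serviceDef, error) {
	var f serviceFile
	if err := json.Unmarshal(data, &f); err != nil {
		return nil, fmt.Errorf("invalid service file: %w", err)
	}
	if len(f.Services) == 0 {
		return nil, errors.New(`invalid service file: no "services" entries`)
	}
	for i, s := range f.Services {
		if s.Name == "" {
			return nil, fmt.Errorf("invalid service file: services[%d] has no name", i)
		}
	}
	return f.Services, nil
}

func main() {
	input := []byte(`{"services": [
		{"id": "myhostip1:mq-exporter:9101", "name": "pmq-exporter", "address": "myhostip1", "port": 9101},
		{"id": "myhostip2:mq-exporter:9101", "name": "pmq-exporter", "address": "myhostip2", "port": 9101}
	]}`)

	defs, err := parseServices(input)
	if err != nil {
		// Report the problem and ignore the file instead of panicking.
		fmt.Println("skipping file:", err)
		return
	}
	for _, d := range defs {
		fmt.Printf("%s -> %s:%d\n", d.ID, d.Address, d.Port)
	}
}
```

The point of the sketch is only that every failure path returns an error the caller can log before moving on to the next file.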

consul version for both Client and Server

server: 1.0.3
client: N/A

Operating system and Environment details

uname -a
Linux alp-aot-ccm02 3.10.0-693.1.1.el7.x86_64 #1 SMP Thu Aug 3 08:15:31 EDT 2017 x86_64 x86_64 x86_64 GNU/Linux
[root@alp-aot-ccm02 services]# docker info
Containers: 16
Running: 16
Paused: 0
Stopped: 0
Images: 22
Server Version: 1.12.6
Storage Driver: btrfs
Build Version: Btrfs v4.9.1
Library Version: 102
Logging Driver: syslog
Cgroup Driver: systemd
Plugins:
Volume: local
Network: host overlay null bridge
Authorization: rhel-push-plugin
Swarm: inactive
Runtimes: docker-runc runc
Default Runtime: docker-runc
Security Options: seccomp
Kernel Version: 3.10.0-693.1.1.el7.x86_64
Operating System: Red Hat Enterprise Linux Server 7.4 (Maipo)
OSType: linux
Architecture: x86_64
Number of Docker Hooks: 3
CPUs: 6
Total Memory: 15.51 GiB
Name: alp-aot-ccm02
ID: WDB6:MHPJ:P72T:P4IT:2F7I:4XW5:XHYX:WHKE:EF67:YJXK:4UZH:O464
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://xxxx:8085/v1/
Insecure Registries:
127.0.0.0/8

Log Fragments or Link to gist

Feb  8 03:02:53 alp-aot-ccm02 docker/4e3120af2781[17967]: panic: runtime error: invalid memory address or nil pointer dereference
Feb  8 03:02:53 alp-aot-ccm02 docker/4e3120af2781[17967]: [signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0xf2de5f]
Feb  8 03:02:53 alp-aot-ccm02 docker/4e3120af2781[17967]: 
Feb  8 03:02:53 alp-aot-ccm02 docker/4e3120af2781[17967]: goroutine 1 [running]:
Feb  8 03:02:53 alp-aot-ccm02 docker/4e3120af2781[17967]: github.com/hashicorp/consul/agent.(*Agent).loadServices(0xc420544000, 0xc420191800, 0xc4201a01e0, 0x7f178d4fcd40)
Feb  8 03:02:53 alp-aot-ccm02 docker/4e3120af2781[17967]: #011/gopath/src/github.com/hashicorp/consul/agent/agent.go:2162 +0xa5f
Feb  8 03:02:53 alp-aot-ccm02 docker/4e3120af2781[17967]: github.com/hashicorp/consul/agent.(*Agent).Start(0xc420544000, 0xc420544000, 0x0)
Feb  8 03:02:53 alp-aot-ccm02 docker/4e3120af2781[17967]: #011/gopath/src/github.com/hashicorp/consul/agent/agent.go:312 +0x4f3
Feb  8 03:02:53 alp-aot-ccm02 docker/4e3120af2781[17967]: github.com/hashicorp/consul/command/agent.(*cmd).run(0xc420191000, 0xc42000e120, 0xe, 0xe, 0x0)
Feb  8 03:02:53 alp-aot-ccm02 docker/4e3120af2781[17967]: #011/gopath/src/github.com/hashicorp/consul/command/agent/agent.go:337 +0x414
Feb  8 03:02:53 alp-aot-ccm02 docker/4e3120af2781[17967]: github.com/hashicorp/consul/command/agent.(*cmd).Run(0xc420191000, 0xc42000e120, 0xe, 0xe, 0xc420270c20)
Feb  8 03:02:53 alp-aot-ccm02 docker/4e3120af2781[17967]: #011/gopath/src/github.com/hashicorp/consul/command/agent/agent.go:77 +0x50
Feb  8 03:02:53 alp-aot-ccm02 docker/4e3120af2781[17967]: github.com/hashicorp/consul/vendor/github.com/mitchellh/cli.(*CLI).Run(0xc420238900, 0xc420238900, 0x40, 0xc420270fc0)
Feb  8 03:02:53 alp-aot-ccm02 docker/4e3120af2781[17967]: #011/gopath/src/github.com/hashicorp/consul/vendor/github.com/mitchellh/cli/cli.go:242 +0x1eb
Feb  8 03:02:53 alp-aot-ccm02 docker/4e3120af2781[17967]: main.realMain(0xfb66c7)
Feb  8 03:02:53 alp-aot-ccm02 docker/4e3120af2781[17967]: #011/gopath/src/github.com/hashicorp/consul/main.go:52 +0x416
Feb  8 03:02:53 alp-aot-ccm02 docker/4e3120af2781[17967]: main.main()
Feb  8 03:02:53 alp-aot-ccm02 docker/4e3120af2781[17967]: #011/gopath/src/github.com/hashicorp/consul/main.go:19 +0x22
Feb  8 03:02:53 alp-aot-ccm02 systemd-machined: Machine 4e3120af27816b3a59b13d98c8b51823 terminated.
@slackpad
Contributor

slackpad commented Feb 8, 2018

Hi @davidkarlsen I don't see the same crash on my Mac, but this code does look potentially dangerous - https://github.com/hashicorp/consul/blob/v1.0.3/agent/agent.go#L2153-L2162 if p ended up as nil.
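As a generic illustration of how a pointer can end up nil after a decode that reports no error, here is a small Go sketch. The types `service` and `persisted` and the function `loadID` are hypothetical stand-ins invented for this example, not the code at the link above. It relies on the fact that `json.Unmarshal` ignores unknown keys, so a file with an unexpected shape can decode cleanly and still leave a pointer field unset.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// service and persisted are hypothetical stand-ins for illustration;
// they are NOT Consul's types.
type service struct {
	ID string
}

type persisted struct {
	Token   string
	Service *service
}

// loadID decodes buf and returns the service ID. When the decoded value
// carries no service, it returns an error instead of dereferencing nil.
func loadID(buf []byte) (string, error) {
	var p persisted
	if err := json.Unmarshal(buf, &p); err != nil {
		return "", fmt.Errorf("decode failed: %w", err)
	}
	// Unknown keys are ignored by json.Unmarshal, so p.Service can still
	// be nil here even though decoding succeeded.
	if p.Service == nil {
		return "", errors.New("decoded file has no Service entry")
	}
	return p.Service.ID, nil
}

func main() {
	// A file in an unexpected shape: valid JSON, but no "Service" key.
	if _, err := loadID([]byte(`{"services": [{"id": "a"}]}`)); err != nil {
		fmt.Println("ignored:", err)
	}

	// A file in the shape this sketch expects.
	id, err := loadID([]byte(`{"Service": {"ID": "web1"}}`))
	fmt.Println(id, err)
}
```

Without the nil check, the first call would hit a nil pointer dereference when reading `p.Service.ID`.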

@slackpad
Contributor

slackpad commented Feb 8, 2018

Could you please post your file some place? I'm wondering if it contains an odd control character or something like that.

slackpad added the type/bug and type/crash labels Feb 8, 2018
slackpad added this to the Next milestone Feb 8, 2018
@davidkarlsen
Author

davidkarlsen commented Feb 8, 2018

@hanshasselberg
Member

I started to look into this.

@hfarooqui

hfarooqui commented Feb 25, 2018

I see the same issue with multiple service definitions

"service": [
  {
    "name": "master",
    "port": 8883,
    "tags": ["master", "mynode_1"],
    "check": {
      "tcp": "192.168.122.120:8883",
      "interval": "10s"
    }
  },
  {
    "name": "slave",
    "port": 8884,
    "tags": ["slave", "mynode_1"],
    "check": {
      "tcp": "192.168.122.120:8884",
      "interval": "10s"
    }
  }
]

@hanshasselberg
Member

hanshasselberg commented Feb 25, 2018

@hfarooqui @davidkarlsen I can reproduce your issue on v1.0.3 and master. I am a bit surprised that you put it into data/services, and I am wondering if that is supposed to work. https://www.consul.io/docs/agent/services.html says:

To configure a service, either provide it as a -config-file option to the agent or place it inside the -config-dir of the agent.

@hanshasselberg
Member

I thought about it more and I don't think you should be touching data-dir, because consul owns that. I think it is ok to panic if an internal file is suddenly no longer in the expected format. You can change things in config-dir, and there the multiple service definition actually works.

@preetapan
Contributor

Closing this based on @i0rek's comment above. The consul data directory is internal to the agent and you should not be editing files or adding content there.
