Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

change travis config script #54

Merged
merged 1 commit into from
Dec 21, 2016

Conversation

andyxning
Copy link
Member

@andyxning andyxning commented Dec 21, 2016

Currently, make node-problem-detector error with the following message:

CGO_ENABLED=0 GOOS=linux godep go build -a -installsuffix cgo -ldflags '-w' -o node-problem-detector
vendor/github.com/coreos/go-systemd/sdjournal/functions.go:19:2: no buildable Go source files in /home/xiening/go/src/k8s.io/node-problem-detector/vendor/github.com/coreos/pkg/dlopen
godep: go exit status 1
Makefile:15: recipe for target 'bin/node-problem-detector' failed
make: *** [bin/node-problem-detector] Error 1

dlopen uses cgo to compile. So, we should turn on CGO_ENABLED.

In order to make node-problem-detector statically linked. We can use -extldflags "-static"

  • remove ./bin/node-problem-detector target. IMO, it will not work as expected. It will not generate node-problem-detector binary under bin directory.
  • change go build -race script to make node-problem-detector
  • add dep target, it seems travis ci does not add godep binary automatically. So, we will check it and if it does not exist, we will manually download it.
  • Enable cgo, because we dlopen uses it to compile.
  • In order to compile a statically linked binary. we may should use -extldflags "-static".

This change is Reviewable

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Dec 21, 2016
@andyxning andyxning force-pushed the change_travis_config_script branch 2 times, most recently from b84e7ea to 0cf7e50 Compare December 21, 2016 06:52
@Random-Liu
Copy link
Member

Thanks. That's what I mean the repo is in a bad state before #39 is merged.

Some of the fixes are already covered in #39, but I'm totally fine to merge this first. Because it may take time for #39 to get merged. Thanks a lot! @andyxning

@andyxning
Copy link
Member Author

@Random-Liu Once this is merged, i will rebase #53 . IMO, they may conflict.

@@ -1 +1 @@
/bin/node-problem-detector
./node-problem-detector
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need the dot here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. Will remove it.

@@ -9,10 +9,11 @@ PROJ = google_containers

PKG_SOURCES := $(shell find pkg -name '*.go')

node-problem-detector: ./bin/node-problem-detector
dep:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

./bin/node-problem-detector: $(PKG_SOURCES) node_problem_detector.go
CGO_ENABLED=0 GOOS=linux godep go build -a -installsuffix cgo -ldflags '-w' -o node-problem-detector
node-problem-detector: $(PKG_SOURCES) node_problem_detector.go dep
GOOS=linux godep go build -ldflags '-w -extldflags "-static"' -o node-problem-detector
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In fact currently, the journald integration won't work, because of the lack of systemd shared library in the container.

I'm fine with adding "-static" for now, but in #39 we don't need to do that because we'll be using fedora as the base image which should have the required libc libraries.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Random-Liu The static linked binary is mainly used for standalone deploy. We may have heterogeneous machines. So, in order to make things simple, we can just make npd static linked always wherever it will be deployed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, make sense to me.

@@ -14,4 +14,4 @@ install:
- cd $HOME/gopath/src/k8s.io/node-problem-detector
script:
- make test
- go build -race
- make node-problem-detector
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In #39, we changed this to also build container, but in this PR, it's fine to only binary for now.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK to be changed in #39

@Random-Liu Random-Liu self-assigned this Dec 21, 2016
@andyxning andyxning force-pushed the change_travis_config_script branch from 0cf7e50 to acfbaa6 Compare December 21, 2016 08:52
@Random-Liu
Copy link
Member

LGTM. @andyxning Thanks!

@Random-Liu Random-Liu merged commit ec46587 into kubernetes:master Dec 21, 2016
@andyxning andyxning deleted the change_travis_config_script branch December 21, 2016 09:12
./bin/node-problem-detector: $(PKG_SOURCES) node_problem_detector.go
CGO_ENABLED=0 GOOS=linux godep go build -a -installsuffix cgo -ldflags '-w' -o node-problem-detector
node-problem-detector: $(PKG_SOURCES) node_problem_detector.go
GOOS=linux go build -ldflags '-w -extldflags "-static"' -o node-problem-detector
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@andyxning There is still some problem here.
go-systemd is using dlopen to load libsystemd-dev. -static flag causes the following error:

fatal error: unexpected signal during runtime execution
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x0]

runtime stack:
runtime.throw(0x1a01fd9, 0x2a)
	/usr/local/go/src/runtime/panic.go:566 +0x95
runtime.sigpanic()
	/usr/local/go/src/runtime/sigpanic_unix.go:12 +0x2cc

goroutine 1 [syscall, locked to thread]:
runtime.cgocall(0x165b210, 0xc4204af8e0, 0xc400000000)
	/usr/local/go/src/runtime/cgocall.go:131 +0x110 fp=0xc4204af898 sp=0xc4204af858
k8s.io/node-problem-detector/vendor/github.com/coreos/pkg/dlopen._Cfunc_dlopen(0x7fdcfc000970, 0x1, 0x0)
	k8s.io/node-problem-detector/vendor/github.com/coreos/pkg/dlopen/_obj/_cgo_gotypes.go:95 +0x4e fp=0xc4204af8e0 sp=0xc4204af898
k8s.io/node-problem-detector/vendor/github.com/coreos/pkg/dlopen.GetHandle(0x2645e40, 0x4, 0x4, 0x0, 0x0, 0x0)
	/usr/local/google/home/lantaol/workspace/src/k8s.io/node-problem-detector/vendor/github.com/coreos/pkg/dlopen/dlopen.go:45 +0xd4 fp=0xc4204af948 sp=0xc4204af8e0
k8s.io/node-problem-detector/vendor/github.com/coreos/go-systemd/sdjournal.getFunction(0x19f3f88, 0x19, 0x0, 0x0, 0x0)
	/usr/local/google/home/lantaol/workspace/src/k8s.io/node-problem-detector/vendor/github.com/coreos/go-systemd/sdjournal/functions.go:46 +0x1d4 fp=0xc4204af9a8 sp=0xc4204af948
k8s.io/node-problem-detector/vendor/github.com/coreos/go-systemd/sdjournal.NewJournalFromDir(0xc420486600, 0xc, 0xc42040c890, 0x0, 0x0)
	/usr/local/google/home/lantaol/workspace/src/k8s.io/node-problem-detector/vendor/github.com/coreos/go-systemd/sdjournal/journal.go:377 +0xa4 fp=0xc4204afa58 sp=0xc4204af9a8
k8s.io/node-problem-detector/pkg/kernelmonitor/logwatchers/journald.getJournal(0xc4204865f0, 0x8, 0xc420486600, 0xc, 0xc4204865f8, 0x3, 0xc4204afc40, 0x465cf3, 0x268c1a0)
	/usr/local/google/home/lantaol/workspace/src/k8s.io/node-problem-detector/pkg/kernelmonitor/logwatchers/journald/log_watcher.go:133 +0x1d4 fp=0xc4204afbd8 sp=0xc4204afa58
k8s.io/node-problem-detector/pkg/kernelmonitor/logwatchers/journald.(*journaldWatcher).Watch(0xc420010460, 0x1, 0x1, 0x17b9f00)
	/usr/local/google/home/lantaol/workspace/src/k8s.io/node-problem-detector/pkg/kernelmonitor/logwatchers/journald/log_watcher.go:62 +0x47 fp=0xc4204afc50 sp=0xc4204afbd8
k8s.io/node-problem-detector/pkg/kernelmonitor.(*kernelMonitor).Start(0xc42000f260, 0x0, 0x1a86701, 0xb)
	/usr/local/google/home/lantaol/workspace/src/k8s.io/node-problem-detector/pkg/kernelmonitor/kernel_monitor.go:86 +0xb5 fp=0xc4204afcb8 sp=0xc4204afc50
k8s.io/node-problem-detector/pkg/problemdetector.(*problemDetector).Run(0xc4203ee1b0, 0xc42000f260, 0x26506a0)
	/usr/local/google/home/lantaol/workspace/src/k8s.io/node-problem-detector/pkg/problemdetector/problem_detector.go:56 +0x66 fp=0xc4204afec0 sp=0xc4204afcb8
main.main()
	/usr/local/google/home/lantaol/workspace/src/k8s.io/node-problem-detector/node_problem_detector.go:36 +0x6e fp=0xc4204aff08 sp=0xc4204afec0
runtime.main()
	/usr/local/go/src/runtime/proc.go:183 +0x1f4 fp=0xc4204aff60 sp=0xc4204aff08
runtime.goexit()
	/usr/local/go/src/runtime/asm_amd64.s:2086 +0x1 fp=0xc4204aff68 sp=0xc4204aff60

goroutine 17 [syscall, locked to thread]:
runtime.goexit()
	/usr/local/go/src/runtime/asm_amd64.s:2086 +0x1

goroutine 5 [chan receive]:
k8s.io/node-problem-detector/vendor/github.com/golang/glog.(*loggingT).flushDaemon(0x268c1a0)
	/usr/local/google/home/lantaol/workspace/src/k8s.io/node-problem-detector/vendor/github.com/golang/glog/glog.go:879 +0x7a
created by k8s.io/node-problem-detector/vendor/github.com/golang/glog.init.1
	/usr/local/google/home/lantaol/workspace/src/k8s.io/node-problem-detector/vendor/github.com/golang/glog/glog.go:410 +0x21d

goroutine 19 [chan receive]:
k8s.io/node-problem-detector/pkg/condition.(*conditionManager).syncLoop(0xc420011950)
	/usr/local/google/home/lantaol/workspace/src/k8s.io/node-problem-detector/pkg/condition/manager.go:94 +0x6e
created by k8s.io/node-problem-detector/pkg/condition.(*conditionManager).Start
	/usr/local/google/home/lantaol/workspace/src/k8s.io/node-problem-detector/pkg/condition/manager.go:79 +0x3f

There may be some way to fix this, I'll figure it out.
If it's hard to fix, we may have to:

  • Disable CGO when not building journald support, so the binary will be statically linked automatically.
  • Enable CGO when building journald support, so the binary will be dynamically linked.

For standalone NPD, either use non-journald version (statically linked) or use journald version (takes care of shared library).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will check this later. @Random-Liu

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that we can't use dlopen in statically linked application https://www.sourceware.org/ml/libc-alpha/2002-06/msg00079.html.

So for journald support, we have to use dynamically link then.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Random-Liu I have manually download #49. And compile it in a machine where libsystemd-dev has been downloaded. I run it well in a machine where libsystemd-dev has not been installed.

Seems i can not reproduce it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Random-Liu maybe you compile npd in a machine where libsystemd-dev is not available.

Copy link
Member Author

@andyxning andyxning Dec 27, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Random-Liu maybe we should take a look at etcd. It seems that the log package etcd uses import github.com/coreos/go-systemd/journal too.

FYI

Copy link
Member Author

@andyxning andyxning Dec 27, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Random-Liu According to code in https://github.com/kubernetes/node-problem-detector/blob/master/pkg/kernelmonitor/kernel_log_watcher.go#L83-L103. I do not think make logPath be empty in kernel-monitor.json will make node-problem-detector use systemd. It will uses default /var/log/kern.log.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nevermind, etcd does not depends on dlopen.

Copy link
Member Author

@andyxning andyxning Dec 28, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Random-Liu
After some search, i think i can make some conclusions:

  • the way i add -extldflags "-static" to go build is not necessary and wrong. After adding this option, node-problem-detector can be compiled to be a static linked binary. However, this will make it panic just like what you have posted in #discussion_r93560095 when using journal.
    BTW, i can not tell why we got a SIGSEGV.
  • we should compile node-problem-detector in dynamic only.
  • Just as you said, we can not get a pure static linked binary with dlopen cause dlopen is targeted with loading shared library dynamically.

Copy link
Member

@Random-Liu Random-Liu Jan 5, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@andyxning Good summary!

Sorry for the delay. I was on new year vocation in last 2 weeks and just come back today. Will review your PR (#49) today and hope we can get it in as soon as possible. :)

@Random-Liu Random-Liu added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Dec 22, 2016
@andyxning andyxning mentioned this pull request Jan 19, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants