-
Notifications
You must be signed in to change notification settings - Fork 641
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Journald support #39
Journald support #39
Conversation
@adohe Could you help me review this? :) |
7471ee0
to
3feb8d1
Compare
3feb8d1
to
6e333a4
Compare
@adohe Wait, there is still something wrong. It seems that there is still something wrong with the statically linked binary. |
@Random-Liu this cadvisor#1531 has merged, please remove the third commit and update Godep. |
func getLogReader(cfg WatcherConfig) (io.ReadCloser, error) { | ||
var errs []error | ||
for name, getter := range logReaderGetters { | ||
r, err := getter(cfg) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what's the meaning of name here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do!
LGTM. @Random-Liu 👍 |
@adohe In fact there is a problem with current solution. There are several ways to solve this:
@adohe Do you have any thoughts? :) |
@Random-Liu I have another option that solves that problem: Don't use journald! See #41 for that solution 😀 |
@euank as kernel monitor just cares about the kernel log message, access kmsg directly sounds like a good option. |
@Random-Liu What are the problems with Option 2? ABRT watches for kernel oopses in journald too. The ABRT container ships with own libsystemd and we haven't experienced any problem with backward compatibility. Option 3 works for me on docker run -it --privileged --rm -v /:/host fedora sh
journalctl -f -D /host/var/log/journal |
@jfilak Thanks for your information! I'm also using fedora now, because it has newer systemd (229) and supports My PR is almost ready. I'll send an update soon. |
6e333a4
to
e727ea1
Compare
e727ea1
to
c9b9651
Compare
@adohe @euank @jfilak Updated the PR.
I've manually tested journald support of this PR, and tested file based log support with the NPD k8s e2e test. @dchen1107 |
|
||
# NOTE that enable journald will increase the image size. | ||
ifeq ($(ENABLE_JOURNALD), 1) | ||
# Enable journald bulid tag. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nitpicking: a typo in the comment - s/bulid/build/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do. Thanks!
|
||
container: node-problem-detector | ||
cp Dockerfile.in Dockerfile | ||
sed -i 's|BASEIMAGE|$(BASEIMAGE)|g' Dockerfile | ||
sed -i 's|SETUP|$(SETUP)|g' Dockerfile |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nitpicking: I would rather use @baseimage@ and @setup@ (Autotools dialect) in the Dockerfile file, so it would be much clearer that these strings will be replaced. And I usually create a solo rule for the file transformations as follows:
Dockerfile: Dockerfile.in
sed -e 's|BASEIMAGE|$(BASEIMAGE)|g' \
-e 's|SETUP|$(SETUP)|g' \
$^ >$@
What is supposed to be the value of the SETUP variable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: the Dockerfile target must be .PHONY
target in your case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good suggestion. Will do!
In fact SETUP
can be removed. Initially I planed to use ubuntu, and needs a SETUP
to download the libsystemd-dev. However, later I decided to use fedora because systemd on ubuntu doesn't support +LZ
. Because systemd is already in fedora image, an extra SETUP
is not needed.
test: | ||
go test -v -race ./pkg/... $(BUILD_TAGS) | ||
|
||
.PHONY: all |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a question: Why do you make only the all target PHONY target? I believe that the other targets that should run unconditionally should be made PHONY targets too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because pushing to gcr.io/google_containers
needs some special authentication. I don't think we want to do that every time we do make
. :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PHONY
isn't about whether it happens every time, just about whether it treats the target name as a file to check the existence of (e.g. if you do touch clean; make clean
it won't run the commands because it treats "clean" as a filename and sees that the target already exists... unless you make clean PHONY).
node-problem-detector
isn't a PHONY necessarily, but everything else should be.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@euank Thanks for the clarification. I'm really a Makefile fresher. :)
Will review your PR and rebase this one.
c9b9651
to
7ba9832
Compare
Addressed comments. :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know if it's intended that this plugin interface can work for kmsg.. but it couldn't.
I think it would be a cleaner isolation for both reading and parsing of the messages to be abstracted behind one interface, not have reading straddle the interface boundary.
|
||
PKG_SOURCES := $(shell find pkg -name '*.go') | ||
|
||
node-problem-detector: ./bin/node-problem-detector | ||
|
||
./bin/node-problem-detector: $(PKG_SOURCES) node_problem_detector.go | ||
CGO_ENABLED=0 GOOS=linux godep go build -a -installsuffix cgo -ldflags '-w' -o node-problem-detector | ||
CGO_ENABLED=$(CGO_ENABLED) GOOS=linux go build -o ./bin/node-problem-detector -a -ldflags '-w' $(BUILD_TAGS) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reason we ever want CGO_ENABLED=0? Only distribution we support is the container, and it'll work fine in the container either way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the binary is dynamically linked, and there is no corresponding dynamic library in the container, it won't work, right?
test: | ||
go test -timeout=1m -v -race ./pkg/... $(BUILD_TAGS) | ||
|
||
.PHONY: all |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: shoulda been rebased out
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My bad. :(
|
||
test: | ||
go test -timeout=1m -v -race ./pkg/... | ||
Dockerfile: Dockerfile.in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should there be a .gitignore
for Dockerfile?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you don't add Dockerfile to PHONY targets, users will be confused. Consider this scenario:
make
# The Dockerfile file didn't exist and make created it.
# @BASEIMAGE@ is Fedora and systemd-journal is enabled.
make ENABLE_JOURNALD=0
# The Dockerfile file already exists and make skipped the Dockerfile target.
# However the user believes that @BASEIMAGE@ is set to "alpine:3.4" and
# systemd-journal is disabled.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jfilak I see. You are right. Thanks for explaining. :)
Will do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
According to #39(discussion), we should also make ./bin/node-problem-detector
target phony or otherwise make will think that ./bin/node-problem-detector
is already exists and will not recompile it with journald disabled. WDYT.
continue | ||
} | ||
if line == "" { | ||
time.Sleep(100 * time.Millisecond) | ||
// Trim tailing `\n`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
redundant comment imo.
Also, strings.TrimSpace
would be fine I expect
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
) | ||
|
||
// Plugin is the interface of a log parsing plugin. | ||
type Plugin interface { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think an interface that is just: StreamLogMessages() (<-chan *types.KernelMessage, error)
would be better.
As is, this interface won't work for a kmsg plugin due to the reading abstraction. A read
of the kmsg device has special semantics which are lost by the attempts to buffer and so on above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The advantage of current interface is:
- Less code change, because current code is acting as.
- Plugin can share some code, such as the read loop.
However, delegating the whole logic to the plugin is actually more flexible:
- We can support kmsg, which has different read logic.
- We can use sdjournal api directly instead of the reader interface which has an extra buffer inside.
Comparing these 2 options, I prefer the later. :)
/cc @adohe
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer the later too. kmsg has different read logical, which could be regarded as a new
kernel log watcher, but not suitable for current buffer approach plugin.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
cd06f6d
to
4956ccd
Compare
e6cc1cb
to
09d357b
Compare
Thanks for your pull request. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). 📝 Please follow instructions at https://github.com/kubernetes/kubernetes/wiki/CLA-FAQ to sign the CLA. Once you've signed, please reply here (e.g. "I signed it!") and we'll verify. Thanks.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
09d357b
to
b39bff5
Compare
b39bff5
to
5917f80
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried a build of this, but could not get it to work.
One thing that didn't work at all was the magic with /etc/localtime, and I ended up using an init container to create a symlink instead of trying to mount the file directly.
# ENABLE_JOURNALD enables build journald support or not. Building journald support needs libsystemd-dev | ||
# or libsystemd-journal-dev. | ||
# TODO(random-liu): Build NPD inside container. | ||
ENABLE_JOURNALD?=1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: spaces around the assignment operators missing to keep this in-sync with the other variables here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK. Will fix. Thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
@@ -30,14 +79,17 @@ version: | |||
@echo $(VERSION) | |||
|
|||
./bin/node-problem-detector: $(PKG_SOURCES) | |||
GOOS=linux go build -o bin/node-problem-detector \ | |||
CGO_ENABLED=$(CGO_ENABLED) GOOS=linux go build -o bin/node-problem-detector \ | |||
-ldflags '-w -extldflags "-static" -X $(PKG)/pkg/version.version=$(VERSION)' \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As per the comment above the "-static" here looks fishy, and probably should only be here when not building with journald support?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ankon
Yes, you're right. -static
should only be used when journald is disabled. For more info please consult the discussions about this review
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ankon Good catch. This is introduced during last rebase. As is mentioned in #54 (review), we should not try to static link when journald is enabled.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
What problem did you hit? |
It (not sure who exactly - docker, kubelet, kubernetes?) refused to start the pod, because it couldn't mkdir the directory FWIW: $ kubectl version
Client Version: version.Info{Major:"1", Minor:"5", GitVersion:"v1.5.1", GitCommit:"82450d03cb057bab0950214ef122b67c83fb11df", GitTreeState:"clean", BuildDate:"2016-12-14T00:57:05Z", GoVersion:"go1.7.4", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"5", GitVersion:"v1.5.1", GitCommit:"82450d03cb057bab0950214ef122b67c83fb11df", GitTreeState:"clean", BuildDate:"1970-01-01T00:00:00Z", GoVersion:"go1.7.1", Compiler:"gc", Platform:"linux/amd64"}
$ minikube version
minikube version: v0.15.0
$ uname -r
4.9.3-200.fc25.x86_64 |
5917f80
to
1cda162
Compare
1cda162
to
457272c
Compare
Could you post the log here? In fact, I don't know whether npd works for minikube. :) |
LGTM |
@dchen1107 Thanks a lot for reviewing! :) |
Fixes #3.
Fixes #14.
This PR depends on google/cadvisor#1531. I'll remove the 3rd commit and update Godeps after the cadvisor patch is merged.
This PR finished the journald support.
Building
sdjournal
needslibsystemd-dev
orlibsystemd-journal-dev
which may not be easy to get on some os distros. So I added a knob to disable it:@adohe @dchen1107
/cc @kubernetes/sig-node
This change is