Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Contrib image: journaldreceiver receiver is missing journalctl #462

Open
pavolloffay opened this issue Jan 24, 2024 · 19 comments
Open

Contrib image: journaldreceiver receiver is missing journalctl #462

pavolloffay opened this issue Jan 24, 2024 · 19 comments

Comments

@pavolloffay
Copy link
Member

The contrib docker image does not have journalctl

kanse@ikanse-mac journaldreceiver % oc logs app-log-journal-rc-kj9r6 -c otc-container
2024-01-24T09:30:20.755Z	info	[email protected]/telemetry.go:86	Setting up own telemetry...
2024-01-24T09:30:20.755Z	info	[email protected]/telemetry.go:159	Serving metrics	{"address": ":8888", "level": "Basic"}
2024-01-24T09:30:20.767Z	info	[email protected]/service.go:151	Starting otelcol...	{"Version": "0.92.0", "NumCPU": 4}
2024-01-24T09:30:20.767Z	info	extensions/extensions.go:34	Starting extensions...
2024-01-24T09:30:20.768Z	info	adapter/receiver.go:45	Starting stanza receiver	{"kind": "receiver", "name": "journald", "data_type": "logs"}
2024-01-24T09:30:20.768Z	info	[email protected]/service.go:191	Starting shutdown...
2024-01-24T09:30:20.768Z	info	adapter/receiver.go:140	Stopping stanza receiver	{"kind": "receiver", "name": "journald", "data_type": "logs"}
2024-01-24T09:30:20.768Z	info	extensions/extensions.go:59	Stopping extensions...
2024-01-24T09:30:20.768Z	info	[email protected]/service.go:205	Shutdown complete.
Error: cannot start pipelines: start stanza: start journalctl: exec: "journalctl": executable file not found in $PATH
2024/01/24 09:30:20 collector server run finished with error: cannot start pipelines: start stanza: start journalctl: exec: "journalctl": executable file not found in $PATH
@swiatekm
Copy link

This may be a better default. I'm not sure what kinds of compatibility guarantees systemd gives for the journal format, but I've seen issues surface from shipping journalctl with the Sumo otel distro container image. In general, it's safer to mount the necessary files from the host, and this is something the operator should consider doing eventually.

@ba1ajinaidu
Copy link

ba1ajinaidu commented Mar 20, 2024

I'm running into the same issue as well, is there a different image which has journalctl? or any other workarounds to export journal logs to a otlp exporter?

@TylerHelmuth
Copy link
Member

There aren't plans right now to include journalctl as part of the contrib image we release. In the meantime, if you require journalctl, building you're own image is a good option.

@pecigonzalo
Copy link

@TylerHelmuth I think that is not a great solution, particularly while depending on a binary. The collector should be self-sufficient for the most part, for all receivers/exporters that it includes.

At least an image that includes it, or instructions on how to set it up would be appreciated. eg. Mounting the file from host, or just COPY --from=some-image journald.
Based on some of the discussions on this topic like open-telemetry/opentelemetry-collector-contrib#2332 (comment) it seems like each one creating an image that contains journald is not feasable.

@jpkrohling
Copy link
Member

The collector should be self-sufficient for the most part, for all receivers/exporters that it includes.

I agree. I'd vote to remove journaldreceiver from the image, as providing all its dependencies is not feasible, IMO.

@pecigonzalo
Copy link

I don't agree with that, since journald is a fairly common, if not hte most common way systems output system logs nowday. But I would rather have that, than shipping something that can't really run. Imagine if we shipped without some dynamically linked libraries, or other critical CLIs. Maybe relying on the CLI is not the best thing here.

@swiatekm
Copy link

The correct solution to this problem is a pure Go implementation of a journald file format reader. As we don't have that yet, journalctl is the next best thing, and here the correct thing to do is to mount the necessary files from the host. There is no reliable way to include the journalctl binary in the container image, as systemd doesn't guarantee file format compatibility between different versions.

If you want to do it on your own, here' is an example of how to do it in a relatively thorough fashion.

@pecigonzalo
Copy link

Mounting would but great, but since it’s dynamically linked it’s not just mounting a binary or path. You have to build a full image.

Out of curiosity I wonder how fluentd, fluentbit, datadog, etc do it. I’ll go check, but if they ship the binary, I would say that is a safe bet.

@swiatekm
Copy link

I'm pretty sure you can mount all the dependencies. The Dockerfile I linked copies them from a Debian container.

@jpkrohling
Copy link
Member

I believe the code owners for this component should make the decision (cc @sumo-drosiek, @djaglowski). That said, I would appreciate hearing about the concrete use-case of having journaldreceiver on a Collector on a container: are you trying to read journald files from the host? Or do you have a whole journald process on the container itself?

@sumo-drosiek
Copy link
Member

sumo-drosiek commented Apr 22, 2024

The aim is to read journald files from the host (eg. in Kubernetes environment), but for now there is no pure go implementation which is doing that.

The libraries I know require CGO, which AFAIR we are avoiding for Open Telemetry Collector, so we rely on the binary itself.

Like @swiatekm-sumo compatibility between version is not guaranteed and because of that I'm sceptical to put the binary into the container as it won't work everywhere. On the other hand it will solve most of the usages by default.

I will try to find some time this week to verify how complex is writing the reader in pure go, and then I will have stricter opinion

@pecigonzalo
Copy link

I'm pretty sure you can mount all the dependencies. The Dockerfile I linked copies them from a Debian container

While it is likely technically possible, it's not practical in my opinion to mount ~30 paths to get this working. If this was a self-contained binary, I would agree it would be a suitable solution.

systemd doesn't guarantee file format compatibility between different versions.

Is there confirmation of this? Because I believe it was an assumption. The spec seems to be fairly well defined: https://systemd.io/JOURNAL_FILE_FORMAT/

I checked both Datadog and Fluentbit

  • Datadog: Seems to use the systemd go bindings with CGO
  • Fluentbit/Fluentd: Are C so use the C API

@sumo-drosiek
Copy link
Member

Is there confirmation of this? Because I believe it was an assumption.

I am under impression we had one issue related to the incompatibility of systemd. @swiatekm-sumo do you remind such case?

@sumo-drosiek
Copy link
Member

After more research and making some PoC for golang based journald reader I have the following opinion. It doesn't seem to be hard to write journald reader, but it may need some effort in order to test it properly and solve edge cases. I have some initial implementation which I'm going to share after I add some features, todo comments and small refactor.

Is there confirmation of this? Because I believe it was an assumption. The spec seems to be fairly well defined: https://systemd.io/JOURNAL_FILE_FORMAT/

Yes, at the end seems that there is full backward compatibility for journal format

Due to above, I think we may add binary to the image and replace it as soon as it will be possible. Firstly using golang based reader as experimental optional reader and then after fixing remaining bugs and adding more feature use it as only way

cc: @djaglowski for another opinion 😄

@djaglowski
Copy link
Member

After more research and making some PoC for golang based journald reader I have the following opinion. It doesn't seem to be hard to write journald reader, but it may need some effort in order to test it properly and solve edge cases. I have some initial implementation which I'm going to share after I add some features, todo comments and small refactor.

That's great news. This would be a huge improvement.

Due to above, I think we may add binary to the image and replace it as soon as it will be possible.

My understanding is that this is a messy undertaking. I'd rather avoid it especially if it's only temporary. If we have a potential solution in the works then I'd just as soon wait for it at this point.

@sumo-drosiek
Copy link
Member

My understanding is that this is a messy undertaking. I'd rather avoid it especially if it's only temporary. If we have a potential solution in the works then I'd just as soon wait for it at this point.

I cannot give any ETA for now, other on that I agree 🙈
I should push some code next week so more people can be involved. Especially in discussion about design and how to make the code suitable for OpenTelemetry

@sumo-drosiek
Copy link
Member

Related issue in OpenTelemetry repository: open-telemetry/opentelemetry-collector-contrib#32711

I pushed my poc code to the following repo: https://github.com/sumo-drosiek/gournal

@gillg
Copy link

gillg commented Jun 8, 2024

Why not doing it with journald-remote to be agnostic instead of adding a complex binary dependency on the otel image ?

Design proposed here: open-telemetry/opentelemetry-collector-contrib#2332 (comment)

This solution is natively supported by systemd, on the host without tricks.

@alita1991
Copy link

+1 for having this feature, I'm currently trying to scrape the kubelet logs, which are only available in the journal files

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

10 participants