Docs: clarify node source(s) on kubernetes_logs #21474

Closed
Firehed opened this issue Oct 10, 2024 · 1 comment · Fixed by #21477
Labels
type: bug (A code related bug.)

Comments

Firehed commented Oct 10, 2024

A note for the community

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Problem

On the docs for the kubernetes_logs source, it's unclear to me if any given instance of Vector produces the logs for:

a) the host node of the vector instance
b) all nodes in the cluster, regardless of which node Vector instance(s) run on
c) some configurable combination of the above
d) something else entirely

Basically, I'd like to know what the default configuration will provide depending on the deployment topology, so I can de-risk logs getting either omitted or duplicated (or both!), especially when running in an autoscaling cluster.

I'm assuming that running Vector as a DaemonSet with a minimal config for the kubernetes_logs source will generally do the expected thing (implying the source is "all logs from the host node, and no other nodes"), but I've run into easily-misconfigured tools for log aggregation before. Being very explicit in the documentation on this would help ensure everyone is able to use it in the way they intend.

This could literally be a change as small as

-Collects Pod logs from Kubernetes Nodes, automatically enriching data with metadata via the Kubernetes API.
+Collects Pod logs from Vector's host Kubernetes Node, automatically enriching data with metadata via the Kubernetes API.

(assuming this is actually the case, of course)

Configuration

No response

Version

0.41.0

Debug Output

No response

Example Data

No response

Additional Context

No response

References

I searched open issues and couldn't find anything!

Firehed added the "type: bug" label Oct 10, 2024

@jszwedko (Member)

Thanks @Firehed! I agree the docs are unclear here. Your assumption is correct, though: Vector will collect logs from pods on the same host as it. I'll submit your diff as a PR.
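
For anyone landing here later, a minimal sketch of what a node-scoped setup might look like, assuming Vector is deployed as a DaemonSet so that one instance runs on each node. The component names `k8s_logs` and `stdout` are placeholders, not anything from this issue. The `self_node_name` option is written out only for illustration; as far as I know it already defaults to `${VECTOR_SELF_NODE_NAME}`, which the DaemonSet manifest is expected to populate from the pod's `spec.nodeName`.

```yaml
# Sketch only: component names are hypothetical placeholders.
sources:
  k8s_logs:
    type: kubernetes_logs
    # Shown for illustration; this is assumed to match the default value.
    # The DaemonSet should set VECTOR_SELF_NODE_NAME from spec.nodeName.
    self_node_name: "${VECTOR_SELF_NODE_NAME}"

sinks:
  stdout:
    type: console
    inputs: ["k8s_logs"]
    encoding:
      codec: json
```

With this shape, each Vector instance reads only the pod logs of its own node, so full coverage of an autoscaling cluster depends on the DaemonSet scheduling a Vector pod onto every node.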
