Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
If you are interested in working on this issue or have submitted a pull request, please leave a comment
Problem
On the docs for the kubernetes_logs source, it's unclear to me if any given instance of Vector produces the logs for:
a) the host node of the Vector instance
b) all nodes in the cluster, regardless of which node Vector instance(s) run on
c) some configurable combination of the above
d) something else entirely
Basically, I'd like to know what the default configuration will provide depending on the deployment topology, so I can de-risk logs getting either omitted or duplicated (or both!), especially when running in an autoscaling cluster.
I'm assuming that running Vector as a DaemonSet with a minimal config for the kubernetes_logs source will generally do the expected thing (implying the source is "all logs from the host node, and no other nodes"), but I've run into easily-misconfigured tools for log aggregation before. Being very explicit in the documentation on this would help ensure everyone is able to use it in the way they intend.
This could literally be a change as small as
-Collects Pod logs from Kubernetes Nodes, automatically enriching data with metadata via the Kubernetes API.
+Collects Pod logs from Vector's host Kubernetes Node, automatically enriching data with metadata via the Kubernetes API.
(assuming this is actually the case, of course)
Configuration
No response
Version
0.41.0
Debug Output
No response
Example Data
No response
Additional Context
No response
References
I searched open issues and couldn't find anything!
Thanks @Firehed! I agree the docs are unclear here. Your assumption is correct, though: Vector collects logs only from pods running on the same node as the Vector instance. I'll submit your diff as a PR.
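For anyone landing here later, a minimal sketch of the DaemonSet-style config discussed above. It assumes the `self_node_name` option of the `kubernetes_logs` source and its documented default of `${VECTOR_SELF_NODE_NAME}`; the source name `k8s_logs` and the console sink are placeholders for illustration, and the environment variable is assumed to be populated from `spec.nodeName` by the DaemonSet manifest.

```yaml
# Sketch only: one Vector instance per node (DaemonSet), each reading
# logs for pods scheduled on its own node.
sources:
  k8s_logs:                     # hypothetical source name
    type: kubernetes_logs
    # Shown explicitly for clarity; this is assumed to match the default.
    # The env var is expected to be injected by the DaemonSet manifest.
    self_node_name: "${VECTOR_SELF_NODE_NAME}"

sinks:
  debug_out:                    # placeholder sink for local inspection
    type: console
    inputs: ["k8s_logs"]
    encoding:
      codec: json
```

With one instance per node and this scoping, each pod's logs should be picked up by exactly one Vector instance, which is what avoids both omission and duplication as nodes scale up or down.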