[Elastic Agent ] Support of processors functionality with hints autodiscovery #2959
Comments
Hey @gizas thanks for filing this one! Because of #735 we cannot support a generic approach like the following:
The reason for 1) failing is that something like the following is not valid YAML at all:
processors:
  - add_fields:
      fields:
        custom2: foo
      target: baz
  ${local_dynamic.procs}
Thus we don't have a way to define a placeholder to be populated by the hints' emitted dictionaries. I see 3 options here:
A) Find a way to support all the use cases by solving #735.
processors:
  ${local_dynamic.procs}
C) Based on the assumption that only Filebeat supported the equivalent, only support the processors hint for a
inputs:
  - name: filestream-generic
    id: hints-container-logs-${kubernetes.hints.container_id}
    type: filestream
    use_output: default
    streams:
      - condition: ${kubernetes.hints.generic_logs.container_logs.enabled} == true
        data_stream:
          dataset: kubernetes.container_logs
          type: logs
        exclude_files: [ ]
        exclude_lines: [ ]
        parsers:
          - container:
              format: auto
              stream: ${kubernetes.hints.generic_logs.container_logs.stream|'all'}
        paths:
          - /var/log/containers/*${kubernetes.hints.container_id}.log
        prospector:
          scanner:
            symlinks: true
        tags: [ ]
    data_stream.namespace: default
    processors:
      ${kubernetes.hints.generic_logs.container_logs.procs}
Let me know what you think, and we can try to validate options B or C to verify that we don't miss anything. |
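For readers unfamiliar with the `${...}` syntax used in these templates: a placeholder names a dotted path into the variables emitted by the provider, and a suffix such as `|'all'` supplies a fallback value. The Python sketch below illustrates that kind of scalar substitution only. `lookup` and `render` are hypothetical helpers written for this note, not Elastic Agent code, and the variable values are made up. It also hints at the difficulty discussed above: a string-level substitution like this has no natural way to expand into a list of dictionaries.

```python
import re

# Hypothetical resolver for illustration; NOT the Elastic Agent implementation.
# Matches ${dotted.path} and ${dotted.path|'default'}.
_VAR = re.compile(r"\$\{([^}|]+)(?:\|'([^']*)')?\}")


def lookup(path, variables):
    """Walk a dotted path through nested dicts; return None if a key is missing."""
    node = variables
    for key in path.split("."):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def render(template, variables):
    """Replace every placeholder with its value, or with its quoted default."""
    def substitute(match):
        value = lookup(match.group(1), variables)
        if value is None:
            if match.group(2) is None:
                # No value and no default: report the unresolved path.
                raise KeyError(match.group(1))
            return match.group(2)
        return str(value)

    return _VAR.sub(substitute, template)


# Made-up variables standing in for what a provider might emit.
hints = {"kubernetes": {"hints": {"container_id": "abc123"}}}

print(render("/var/log/containers/*${kubernetes.hints.container_id}.log", hints))
print(render("${kubernetes.hints.generic_logs.container_logs.stream|'all'}", hints))
```

With the sample variables, the first call fills in the container ID and the second falls back to `all` because no `stream` hint is present.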
After our sync we have verified that the processors map used when we update the hints mapping (see https://github.com/elastic/elastic-agent/blob/main/internal/pkg/composable/providers/kubernetes/pod.go#L218) can read the additional processors from the configuration file and append them as additional processor blocks. I have created the branch to track the work. @gsantoro, if you want to quickly verify that processors defined in our config are also added to the final input after hint generation, see the steps below:
providers.kubernetes:
  node: "pas-control-plane"
  kube_config: /Users/andreasgkizas/.kube/config
  #Uncomment to enable hints' support
  hints.enabled: true
inputs:
  # Collecting system metrics
  - name: filestream-generic
    id: hints-container-logs-${kubernetes.hints.container_id}
    type: filestream
    use_output: default
    processors:
      - add_fields:
          fields:
            name: myproject
            id: '574734885120952459'
    streams:
      - condition: ${kubernetes.hints.generic_logs.container_logs.enabled} == true
        data_stream:
          dataset: kubernetes.container_logs
          type: logs
        exclude_files: [ ]
        exclude_lines: [ ]
        parsers:
          - container:
              format: auto
              stream: ${kubernetes.hints.generic_logs.container_logs.stream|'all'}
        paths:
          - /var/log/containers/*${kubernetes.hints.container_id}.log
        prospector:
          scanner:
            symlinks: true
    data_stream.namespace: default
Run the inspect command to see the rendered output:
./elastic-agent inspect -v --variables --variables-wait 2s
Rendered output:
./elastic-agent inspect -v --variables --variables-wait 2s
agent:
  logging:
    to_stderr: true
inputs:
  - data_stream.namespace: default
    id: hints-container-logs-34bad0cd3deeffb8b2029b9615f6777567192619c4ad52a614cbd84c84cb895c-kubernetes-dc8a5011-15ac-4579-81a8-31a93fe9b721.nginx
    name: filestream-generic
    original_id: hints-container-logs-34bad0cd3deeffb8b2029b9615f6777567192619c4ad52a614cbd84c84cb895c
    processors:
      - add_fields:
          fields:
            id: 34bad0cd3deeffb8b2029b9615f6777567192619c4ad52a614cbd84c84cb895c
            image:
              name: nginx
            runtime: containerd
          target: container
      - add_fields:
          fields:
            container:
              name: nginx
            deployment:
              name: nginx
            labels:
              app: nginx
              pod-template-hash: 654557c679
            namespace: default
            namespace_labels:
              kubernetes_io/metadata_name: default
            namespace_uid: 052c58b8-c5c0-4f4c-bcb2-4bf3a471583f
            node:
              hostname: pas-control-plane
              labels:
                beta_kubernetes_io/arch: arm64
                beta_kubernetes_io/os: linux
                kubernetes_io/arch: arm64
                kubernetes_io/hostname: pas-control-plane
                kubernetes_io/os: linux
                node-role_kubernetes_io/control-plane: ""
                node_kubernetes_io/exclude-from-external-load-balancers: ""
              name: pas-control-plane
              uid: 0511824c-88f3-4b38-a0a5-406ed2eda282
            pod:
              ip: 10.244.0.5
              name: nginx-654557c679-5pg8g
              uid: dc8a5011-15ac-4579-81a8-31a93fe9b721
            replicaset:
              name: nginx-654557c679
          target: kubernetes
      - add_fields:
          fields:
            cluster:
              name: kind-pas
              url: https://127.0.0.1:57202
          target: orchestrator
      - add_fields:
          fields:
            id: "574734885120952459"
            name: myproject
    streams:
      - data_stream:
          dataset: kubernetes.container_logs
[output truncated ...]
So, long story short: on the branch above I just retrieve any processors that are part of the config and append them to the existing processor map. |
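The append step described above can be sketched as follows. This is a minimal Python illustration, not the Go code on the branch: `merge_processors` is a hypothetical name, and the sample dictionaries are abbreviated stand-ins for the processors in the rendered output (the container ID is a placeholder).

```python
def merge_processors(generated, configured):
    """Return the generated processors followed by those from the input config.

    Neither argument is modified; `configured` may be None when the input
    defines no processors of its own.
    """
    return list(generated) + list(configured or [])


# Abbreviated stand-ins for the processors generated from pod metadata.
generated = [
    {"add_fields": {"target": "container", "fields": {"id": "example-container-id"}}},
    {"add_fields": {"target": "orchestrator", "fields": {"cluster": {"name": "kind-pas"}}}},
]

# The processor declared under the input in the config above.
configured = [
    {"add_fields": {"fields": {"name": "myproject", "id": "574734885120952459"}}},
]

merged = merge_processors(generated, configured)
for processor in merged:
    print(processor)
```

Appending rather than prepending matches the order seen in the rendered output, where the config-defined `add_fields` block comes last.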
@gizas we just need to make sure we document that processors from hints will only be added at the integration/input/package level and not at the data_stream level. As a first iteration this is fine, but let's make sure we document it and even file an issue for the Elastic Agent team to evaluate whether we can tune the providers' interface so as to emit processors at the data_stream level as well (this happens at
|
Sure sure! Let's first evaluate and test this, and then we will do it! Thanks man once more @ChrsMark! |
I have pushed the latest changes in : I built the above Elastic Agent image and loaded it to elastic-standalone by following https://github.com/elastic/elastic-agent#testing-elastic-agent-on-kubernetes Please keep in mind that this is a draft. The goal is to read the processors annotations from the pod and append them to the processor map that is going to be emitted. In more detail:
providers.kubernetes:
  node: ${NODE_NAME}
  scope: node
  hints:
    enabled: true
    default_container_logs: false
And I mount the /etc/elastic-agent/inputs.d/container_logs.yml template from https://github.com/elastic/elastic-agent/blob/main/deploy/kubernetes/elastic-agent-standalone/templates.d/container_logs.yml
nginx.yml
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: nginx
  name: nginx
  namespace: default
spec:
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
      annotations:
        co.elastic.hints/package: "container_logs"
        co.elastic.hints/processors.1.add_fields.target: "project"
        co.elastic.hints/processors.1.add_fields.fields.name: "myproject"
    spec:
      containers:
        - image: nginx
          name: nginx
Nginx is installed with id:
k describe pod nginx-65c56f67cb-2svvr | grep -i id
Container ID: containerd://52efcae02df01b7cdd359914dc3d16eb6b419236c9a4fd0f9f6b88e84c67574c
So I see the following flow:
This is what is in processorlist
This is what is in final processor map https://github.com/elastic/elastic-agent/blob/processorhints_enhancement/internal/pkg/composable/providers/kubernetes/pod.go#L493
But when I resume the debugger and expect the final configuration to be emitted in the agent, it is not! See output:
elastic-agent inspect -v --variables --variables-wait 2s
I0719 12:52:22.681261 91 leaderelection.go:248] attempting to acquire leader lease kube-system/elastic-agent-cluster-leader...
I0719 12:52:22.688726 91 leaderelection.go:258] successfully acquired lease kube-system/elastic-agent-cluster-leader
agent:
  logging:
    to_stderr: true
inputs:
  - data_stream.namespace: default
    id: unique-system-metrics-input
    streams:
      - data_stream.dataset: system.cpu
        metricsets:
          - cpu
      - data_stream.dataset: system.memory
        metricsets:
          - memory
      - data_stream.dataset: system.network
        metricsets:
          - network
      - data_stream.dataset: system.filesystem
        metricsets:
          - filesystem
    type: system/metrics
    use_output: default
outputs:
  default:
    hosts: http://elasticsearch:9200
    password: changeme
    type: elasticsearch
    username: elastic
|
@gizas I'm not sure if the Did you try to put the template within the |
Hmm, something seems broken as it still does not work, and inspect does not show anything. And yes, inspect honours the templates under inputs.d, but I also checked manually |
If you comment out the part that extends the processors list at e4d1c95#diff-ed933ad9cf09eb9ca797c5c89b5e8991624cbe5301d88f996d9bd947f4e3ea16R450-R455 for example, what do you see? Also I would try to add a dummy processor body there as an initial step and see if it works. Adding some unit tests might also make this easier to debug. If none of these help, I can find some time to check out the branch and try it on my end. |
It worked !!!!!
- data_stream.namespace: default
  id: hints-filestream-container-logs-52efcae02df01b7cdd359914dc3d16eb6b419236c9a4fd0f9f6b88e84c67574c-kubernetes-5d63b8c2-bbff-4e04-9bd8-7337555aa861.nginx
  name: hints-filestream-container-logs
  original_id: hints-filestream-container-logs-52efcae02df01b7cdd359914dc3d16eb6b419236c9a4fd0f9f6b88e84c67574c
  processors:
    - add_fields:
        fields:
          id: 52efcae02df01b7cdd359914dc3d16eb6b419236c9a4fd0f9f6b88e84c67574c
          image:
            name: nginx
          runtime: containerd
        target: container
    [output truncated .....]
    - add_fields:
        fields:
          name: myproject
        target: project
  streams:
    - data_stream:
        dataset: kubernetes.container_logs
        type: logs
      parsers:
        - container:
            format: auto
            stream: all
      paths:
        - /var/log/containers/*52efcae02df01b7cdd359914dc3d16eb6b419236c9a4fd0f9f6b88e84c67574c.log
You can not believe what it was!!!! |
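The annotation-to-processor mapping exercised here can be sketched in Python as follows. This is an illustration under assumptions, not the logic in pod.go: `processors_from_annotations` and `set_path` are hypothetical helpers, the numeric segment after `processors.` is assumed to be an ordering index, and every remaining dot is assumed to separate one nesting level from the next.

```python
PREFIX = "co.elastic.hints/"


def set_path(tree, path, value):
    """Store value in nested dicts, creating intermediate levels as needed."""
    node = tree
    for key in path[:-1]:
        node = node.setdefault(key, {})
    node[path[-1]] = value


def processors_from_annotations(annotations):
    """Build an ordered processors list from processors.<n>.<name>.<option> hints.

    Assumption for this sketch: <n> is a number used only for ordering, and
    option names never contain dots themselves.
    """
    by_index = {}
    for key, value in annotations.items():
        if not key.startswith(PREFIX + "processors."):
            continue
        # e.g. ["processors", "1", "add_fields", "fields", "name"]
        parts = key[len(PREFIX):].split(".")
        if len(parts) < 4 or not parts[1].isdigit():
            continue  # need an index, a processor name and at least one option
        set_path(by_index.setdefault(parts[1], {}), parts[2:], value)
    return [by_index[index] for index in sorted(by_index, key=int)]


annotations = {
    "co.elastic.hints/package": "container_logs",
    "co.elastic.hints/processors.1.add_fields.target": "project",
    "co.elastic.hints/processors.1.add_fields.fields.name": "myproject",
}

print(processors_from_annotations(annotations))
```

For the nginx annotations used in this test, the sketch yields a single `add_fields` entry with `target: project` and `fields.name: myproject`, which has the same shape as the last processor in the rendered output.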
hey @gizas , ./elastic-agent inspect -c /etc/elastic-agent/agent.yml -v --variables --variables-wait 2s when running the command from the elastic-agent pod. Notice the param |
hey @gizas and @ChrsMark I am not able to replicate the outcome. I was able to run the command and generate a policy, but that policy doesn't match the output that you created. I was able to build the code from your feature branch and run the debugger, but the output elastic-agent config doesn't look the same as yours. |
hello @gizas and @ChrsMark I finally managed to get the correct policy printed by elastic-agent and the data correctly sent to ES. I'm not sure if it was me not being aware of how hints work, but it took me an entire day to replicate this issue. This is what I have done:
outputs:
  default:
    type: elasticsearch
    hosts:
      - >-
        ${ES_HOST}
    protocol: https
    ssl.verification_mode: "${ES_SSL_VERIFICATION_MODE}"
    allow_older_versions: ${ES_ALLOW_OLDER_VERSIONS}
    username: ${ES_USERNAME}
    password: ${ES_PASSWORD}
agent:
  monitoring:
    enabled: true
    use_output: default
    logs: true
    metrics: true
providers.kubernetes:
  node: ${NODE_NAME}
  scope: node
  #Uncomment to enable hints' support
  hints.enabled: true
inputs:
  # Collecting system metrics
  - name: filestream-generic
    id: hints-container-logs-${kubernetes.hints.container_id}
    type: filestream
    use_output: default
    processors:
      - add_fields:
          fields:
            name: myproject
            id: '574734885120952459'
    streams:
      - condition: ${kubernetes.hints.kubernetes.container_logs.enabled} == true
        data_stream:
          dataset: kubernetes.container_logs
          type: logs
        exclude_files: [ ]
        exclude_lines: [ ]
        parsers:
          - container:
              format: auto
              stream: ${kubernetes.hints.kubernetes.container_logs.stream|'all'}
        paths:
          - /var/log/containers/*${kubernetes.hints.container_id}.log
        prospector:
          scanner:
            symlinks: true
    data_stream.namespace: default
Note from the above that:
I used a pod with the following annotations
annotations:
  co.elastic.hints/package: "kubernetes"
  co.elastic.hints/data_streams: "container_logs"
  co.elastic.hints/processors.1.add_fields.target: "project"
  co.elastic.hints/processors.1.add_fields.fields.name: "myproject"
Note that I "guessed" those hints from looking at the redis docs at #2386.
default_container_logs: false
I don't know what it is but we don't need it.
kubectl exec -n kube-system -it elastic-agent-standalone-f422w -- ./elastic-agent inspect -c /etc/elastic-agent/agent.yml --variables --variables-wait 2s > ./policy.yaml
Clearly you need to adapt the previous command to the elastic-agent pod that you are running.
inputs:
  - data_stream.namespace: default
    id: hints-container-logs-ddcae4cf518fc3645461f6ae903da145713376bc73c128d6a05ded1dbb0421f0-kubernetes-fdf67bc3-394c-495c-b87e-02b11b90898f.nginx
    name: filestream-generic
    original_id: hints-container-logs-ddcae4cf518fc3645461f6ae903da145713376bc73c128d6a05ded1dbb0421f0
    processors:
      - add_fields:
          fields:
            id: ddcae4cf518fc3645461f6ae903da145713376bc73c128d6a05ded1dbb0421f0
            image:
              name: nginx:1.23.1-alpine
            runtime: containerd
          target: container
      - add_fields:
          fields:
            name: myproject
          target: project
      - add_fields:
          fields:
            id: "574734885120952459"
            name: myproject
    streams:
      - data_stream:
          dataset: kubernetes.container_logs
          type: logs
        exclude_files: []
        exclude_lines: []
        parsers:
          - container:
              format: auto
              stream: all
        paths:
          - /var/log/containers/*ddcae4cf518fc3645461f6ae903da145713376bc73c128d6a05ded1dbb0421f0.log
        prospector:
          scanner:
            symlinks: true
    type: filestream
    use_output: default
I hope that's all we wanted to test. |
I think that Also the
annotations:
  co.elastic.hints/package: "kubernetes"
  co.elastic.hints/data_streams: "container_logs"
  co.elastic.hints/processors.1.add_fields.target: "project"
  co.elastic.hints/processors.1.add_fields.fields.name: "myproject"
looks weird to me. There is no Not sure what else you might have missed here:). I could reproduce the feature locally even without running from inside the Pod. Note that the Here is what I use:
providers:
  kubernetes:
    kube_config: /home/chrismark/.kube/config
    node: "kind-worker"
    hints.enabled: true
    hints.default_container_logs: false
inputs:
  - name: hints-filestream-container-logs
    id: hints-filestream-container-logs-${kubernetes.hints.container_id}
    type: filestream
    use_output: default
    streams:
      - condition: ${kubernetes.hints.container_logs.enabled} == true
        data_stream:
          dataset: kubernetes.container_logs
          type: logs
        parsers:
          - container:
              format: auto
              stream: ${kubernetes.hints.container_logs.stream|'all'}
        paths:
          - /var/log/containers/*${kubernetes.hints.container_id}.log
        prospector:
          scanner:
            symlinks: true
    data_stream.namespace: default
I run to get:
agent:
  logging:
    to_stderr: true
inputs:
  - data_stream.namespace: default
    id: hints-filestream-container-logs-d8af5ef2c60a2b5bdb7f2bd239d6994a5015f71bf70cb5584f6a3b86ed0f82ca-kubernetes-e46a6433-dd51-496c-a987-45d78a63b50f.nginx
    name: hints-filestream-container-logs
    original_id: hints-filestream-container-logs-d8af5ef2c60a2b5bdb7f2bd239d6994a5015f71bf70cb5584f6a3b86ed0f82ca
    processors:
      - add_fields:
          fields:
            id: d8af5ef2c60a2b5bdb7f2bd239d6994a5015f71bf70cb5584f6a3b86ed0f82ca
            image:
              name: nginx
            runtime: containerd
          target: container
      - add_fields:
          fields:
            container:
              name: nginx
            deployment:
              name: nginx
            labels:
              app: nginx
              pod-template-hash: 678f56544b
            namespace: default
            namespace_labels:
              kubernetes_io/metadata_name: default
            namespace_uid: ba30ee51-a9ac-4044-96bc-c110183aeb2d
            node:
              hostname: kind-worker
              labels:
                beta_kubernetes_io/arch: amd64
                beta_kubernetes_io/os: linux
                kubernetes_io/arch: amd64
                kubernetes_io/hostname: kind-worker
                kubernetes_io/os: linux
              name: kind-worker
              uid: 2f643ef8-be15-45b1-9972-ceb7c91c81f6
            pod:
              ip: 10.244.1.124
              name: nginx-678f56544b-wd892
              uid: e46a6433-dd51-496c-a987-45d78a63b50f
            replicaset:
              name: nginx-678f56544b
          target: kubernetes
      - add_fields:
          fields:
            cluster:
              name: kind-kind
              url: https://127.0.0.1:37345
          target: orchestrator
      - add_fields:
          fields:
            name: myproject
          target: project
      - rename:
          fail_on_error: "false"
          fields:
            "0":
              from: a.g
            "1":
              to: e.d
    streams:
      - data_stream:
          dataset: kubernetes.container_logs
          type: logs
        parsers:
          - container:
              format: auto
              stream: all
        paths:
          - /var/log/containers/*d8af5ef2c60a2b5bdb7f2bd239d6994a5015f71bf70cb5584f6a3b86ed0f82ca.log
        prospector:
          scanner:
            symlinks: true
    type: filestream
    use_output: default
outputs:
  default:
    api_key: example-key
    hosts:
      - 127.0.0.1:9200
    type: elasticsearch
providers:
  kubernetes:
    hints:
      default_container_logs: false
      enabled: true
    kube_config: /home/chrismark/.kube/config
    node: kind-worker
|
thanks for the feedback @ChrsMark, it works as expected now. A couple of points:
I'm just glad that it is all solved now and I appreciate the help, but given the complexity of this setup, the fact that I had no prior knowledge about hints, and how docs are sometimes not enough or out of date, it's very important for me to have a self-contained PR with all the info to replicate the issue. I hope that you understand |
@gsantoro all the questions are sane and valuable feedback. However, keep in mind that the feature is still in Beta and there is active development happening here, so lots of things might be missing in terms of docs etc. The purpose of #3063 is to identify such issues and make the feature as stable as possible. So first we need to answer the question of what the purpose of your testing is as part of this issue: is it to get familiar with the codebase/feature and actively contribute to it, or just to perform QA testing? QA testing without codebase knowledge might not make sense at this point, since there is still active development and docs etc. might be missing. If the goal is to get familiar with the codebase, then the process of trial and error you described sounds normal and is the learning curve related to this advanced feature. Answering the questions specifically inline:
I think we indeed miss the respective docs part from https://www.elastic.co/guide/en/fleet/8.9/hints-annotations-autodiscovery.html. It should have been added after #2386 but priorities changed so we missed that (my bad mostly :) ). Feel free to file an issue for this or a PR directly. cc-ing @gizas to ensure this becomes part of the #3063.
I don't get it. The template that @gizas points out in testing notes of the respective PR is https://github.com/elastic/elastic-agent/blob/main/deploy/kubernetes/elastic-agent-standalone/templates.d/container_logs.yml. And this one contains the
I don't see how #2386 is related to this PR and what docs you are referring to. If you mean the "How to test this manually" section I think the examples are accurate with one called
In general if you think that the testing notes of #3107 should be more accurate/correct please suggest to @gizas directly what is missing and how it would be better. |
thanks for the clarification:
I have no idea. It was pointed out above at
I think they are accurate. It's a pity that I wasted 2 days looking at the wrong instructions. |
Describe the enhancement:
Currently Elastic Agent Autodiscovery with Hints supports a specific list of annotations.
The full list of both required and optional annotations supported can be found in
https://www.elastic.co/guide/en/fleet/current/hints-annotations-autodiscovery.html#_required_hints
This feature will introduce an additional annotation:
co.elastic.hints/processor
The goal is for users to specify the addition of a new processor with this annotation.
Describe a specific use case for the enhancement or feature:
Users will define e.g.
The following annotations
Elastic Agent will emit the following configuration:
What is the definition of done?
Related Issue
Tasks