Fleet managed: enable debug mode #143

Closed
mtojek opened this issue Oct 13, 2021 · 24 comments
Labels: good first issue, Team:Elastic-Agent-Control-Plane

Comments

@mtojek
Contributor

mtojek commented Oct 13, 2021

Hi Team,

While investigating the root cause of elastic/integrations#1566, we confirmed that it's really useful to enable debug logs for filebeat and metricbeat running as Docker containers (under CI). The ideal option would be an extra property in the policy to enable debug logging, or at least a special Docker image ENV that can be hardcoded in system tests.

@mtojek added the Team:Elastic-Agent-Control-Plane label Oct 13, 2021
@elasticmachine
Contributor

Pinging @elastic/elastic-agent-control-plane (Team:Elastic-Agent-Control-Plane)

@mtojek
Contributor Author

mtojek commented Oct 27, 2021

Hey! Is there an option to give this some priority? It might be a useful feature for supporting customers once we're GA. Any workaround would also be appreciated.

@jlind23
Contributor

jlind23 commented Oct 28, 2021

Assigning @michel-laterman to have more insight as he worked on the diagnosis topic. As soon as we have a clear target on it we'll move forward.
(fyi @nimarezainia)

@michel-laterman
Contributor

We can pass the logging.level setting to beats/processes

@jlind23
Contributor

jlind23 commented Oct 29, 2021

Should we add it in the same PR as ECS logging? elastic/beats#28573
wdyt @michel-laterman

@mtojek
Contributor Author

mtojek commented Oct 29, 2021

> We can pass the logging.level setting to beats/processes

Sounds good, but to be complete we need to control the log level with ENVs injected to the Docker image. Is this option available?

@michel-laterman
Contributor

There's some support for env vars now (https://www.elastic.co/guide/en/beats/metricbeat/7.15/using-environ-vars.html); however, it requires changes to the config file in order to work. Is that acceptable for the time being?

@mtojek
Contributor Author

mtojek commented Nov 2, 2021

This is the way we start elastic-agent and fleet-server:
https://github.com/elastic/elastic-package/blob/9e5e39f0bfcdd0d92439384b4e86836da47b1391/internal/profile/_static/docker-compose-stack.yml#L94

We depend on environment variables only.

@jlind23 added the good first issue label Nov 2, 2021
@michel-laterman
Contributor

After a chat with @ruflin, I think we agreed that the best approach is to add the ability to define logging levels in the policy instead of just passing the agent's level to the process. This way we can use environment variables with a default, for example (syntax may not be correct):

log.level: ${CUSTOM_LOG_LEVEL:'info'}
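
For illustration only, here is a minimal Python sketch of how a placeholder like the one above could be expanded. It is not the Beats or Elastic Agent implementation; the function name `expand_env`, the accepted placeholder grammar, and the handling of single quotes around the default are all assumptions made for this example.

```python
import os
import re

# Matches ${NAME} or ${NAME:default}. This grammar is an assumption for the
# sketch and mirrors the (possibly incorrect) syntax quoted in the thread.
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)(?::([^}]*))?\}")


def expand_env(text, env=None):
    """Replace ${NAME:default} placeholders using env (os.environ if omitted)."""
    env = os.environ if env is None else env

    def substitute(match):
        name, default = match.group(1), match.group(2)
        if name in env:
            # A value set in the environment always wins over the default.
            return env[name]
        if default is None:
            raise KeyError(f"{name} is not set and has no default")
        # Drop optional single quotes around the default value.
        return default.strip("'")

    return _PLACEHOLDER.sub(substitute, text)
```

With an empty environment, `expand_env("log.level: ${CUSTOM_LOG_LEVEL:'info'}", env={})` falls back to the default; passing `env={"CUSTOM_LOG_LEVEL": "debug"}` overrides it.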

@ruflin
Contributor

ruflin commented Nov 4, 2021

@joshdover I wonder if we should use an environment variable with the preset values as the default from Fleet. This means, the policy sent down by Fleet would always contain something like log.level: ${env.ELASTIC_AGENT_LOG_LEVEL|'error'} in the case of error (syntax may also not be correct).

Now what happens if a user was overwriting the log level for an elastic agent from Fleet? Which one wins?

@joshdover
Contributor

> I wonder if we should use an environment variable with the preset values as the default from Fleet. This means, the policy sent down by Fleet would always contain something like log.level: ${env.ELASTIC_AGENT_LOG_LEVEL|'error'} in the case of error (syntax may also not be correct).

Seems reasonable to me. Let us know if we should open an issue to start tracking this.

> Now what happens if a user was overwriting the log level for an elastic agent from Fleet? Which one wins?

Do we have precedence with any other settings? I sorta expect the local agent config to override the managed config for simple debugging purposes. I don't think there would be any security risk here since the user needs root access on the machine to edit the local configuration, which means they likely have access to everything that Agent is collecting data from.

@jlind23
Contributor

jlind23 commented Nov 15, 2021

> Now what happens if a user was overwriting the log level for an elastic agent from Fleet? Which one wins?

Do we need to always have the same winner? Shouldn't we take the most verbose for diagnosis purposes?

@ruflin
Contributor

ruflin commented Nov 16, 2021

@joshdover Sounds like this is the issue to track this? You mean we need an additional issue in Beats / Kibana?

@jlind23 I expect us to have many more configs where this logic applies. Instead of case-by-case logic, I'd rather have a principle in place so that we always have the same behaviour.

I like the idea that the most local one wins. Basically the order would be:

local config > fleet config > template default
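
As a rough sketch of that ordering (not how the Agent actually merges configuration), the lookup could walk the layers from most local to least local and return the first value it finds. The function name `resolve_setting` and the flat dictionaries with dotted keys are assumptions made for this example.

```python
def resolve_setting(key, local, fleet, template_default):
    """Return the value of key from the most local layer that defines it.

    Layers are plain dicts here (an assumption for the sketch), checked in
    the order: local config, then Fleet config, then template default.
    """
    for layer in (local, fleet, template_default):
        if layer.get(key) is not None:
            return layer[key]
    raise KeyError(key)
```

For example, `resolve_setting("logging.level", {}, {"logging.level": "warning"}, {"logging.level": "info"})` picks the Fleet value because the local layer does not define the key.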

@jlind23
Contributor

jlind23 commented Nov 16, 2021

Local first suits me well then! 👍🏼

@joshdover
Contributor

> Sounds like this is the issue to track this? You mean we need an additional issue in Beats / Kibana?

Yes, I was asking if we should open a Fleet UI issue for making this the default value: log.level: ${env.ELASTIC_AGENT_LOG_LEVEL|'error'}. But given the precedence discussion, should this just always be the behavior from the Agent side rather than requiring this in the configuration yaml directly?

@ruflin
Contributor

ruflin commented Jan 10, 2022

I think we need both. It should be a default in the Elastic Agent, and it should also be included when shipped down in the policy, as the policy will overwrite it.

@axw
Member

axw commented Feb 17, 2022

Being able to enable debug logging is important for APM Server too, as we are going all in on Fleet. If users can't enable debug logging we're going to have a harder time debugging issues.

Ideally we would also be able to set logging.selectors, as turning on debug level logging for everything tends to be overwhelming.
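
To make the selector idea concrete, here is a simplified Python model of selector-based filtering. It is not libbeat's implementation: the function name `debug_enabled` is hypothetical, and the only behaviour taken from Beats is that a `*` selector enables debug output for all components. How an empty selector list should behave is deliberately left as a choice of this sketch (nothing matches).

```python
def debug_enabled(component, selectors):
    """Return True if debug output for component passes the selector list.

    Simplified model: "*" enables every component, otherwise the component
    name must be listed explicitly. An empty list matches nothing here,
    which is a choice made for this sketch only.
    """
    return "*" in selectors or component in selectors


def filter_debug(records, selectors):
    """Keep only (component, message) debug records allowed by selectors."""
    return [(c, m) for c, m in records if debug_enabled(c, selectors)]
```

Given records from components named `publisher` and `input`, `filter_debug(records, ["publisher"])` keeps only the `publisher` entries, which is the kind of narrowing that makes debug-level output manageable.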

@joshdover
Contributor

I've opened elastic/kibana#125956 to track this on the Fleet UI side.

Following from @axw's suggestion above, it seems we're likely to want to expose more than just logging.level to inputs. Instead of adding an env var for each one, should we instead extend the agent context provider to allow inputs to read the overall agent logging configuration? This would allow a policy like:

id: my-policy
agent:
  monitoring:
    # ...
outputs:
  # ...
fleet.hosts: ''
inputs:
  - id: <uuid>
    type: logfile
    logging:
      level: "${agent.logging.level | 'error'}"
      selectors: "${agent.logging.selectors | '[beat]'}"

TBH I'm still unclear on why Fleet needs to provide this in the policy and the default can't be part of the Elastic Agent logic. What wouldn't be possible if the default came from the policy? The only thing I can think of is whenever we get around to adding support for variables and conditions or global variables that this would be necessary.
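
A minimal Python sketch of how variables like `${agent.logging.level | 'error'}` in the policy above might be resolved against an agent context with a fallback. This only illustrates the idea and is not the Elastic Agent variable engine; the names `lookup` and `render`, the nested-dict context, and the exact placeholder grammar are assumptions.

```python
import re

# ${dotted.path} or ${dotted.path | 'fallback'}; grammar assumed for the sketch.
_VAR = re.compile(r"\$\{\s*([\w.]+)\s*(?:\|\s*'([^']*)'\s*)?\}")


def lookup(context, dotted):
    """Walk a nested dict along a dotted path; return None if any part is missing."""
    node = context
    for part in dotted.split("."):
        if not isinstance(node, dict) or part not in node:
            return None
        node = node[part]
    return node


def render(value, context):
    """Substitute each variable with its context value, else its fallback."""

    def substitute(match):
        path, fallback = match.group(1), match.group(2)
        found = lookup(context, path)
        if found is not None:
            return str(found)
        if fallback is None:
            raise KeyError(path)
        return fallback

    return _VAR.sub(substitute, value)
```

With a context of `{"agent": {"logging": {"level": "debug"}}}`, the level variable renders to the agent's value, while the selectors variable falls back to its quoted default because the context does not define it.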

@jlind23
Contributor

jlind23 commented Feb 21, 2022

@ph would it be possible for you to take a first stab at designing it?

@ph
Contributor

ph commented Feb 22, 2022

I need a bit more detail. Looking at the original description and the information above, we are looking for a way to define a new log level, a new selector, or both using an environment variable. Looking at this comment, are we looking at a log level per input?

What is the actual need: a log level per input, or one that we can specify at the global level of the agent policy? I am asking this for a few reasons:

  • I am not sure yet how logging will work with v2 inputs, so adding a new field that we would need to support may be a bit uneasy.
  • What happens when there are multiple inputs for logs that define different log levels or even different selectors?

@jlind23
Contributor

jlind23 commented Mar 7, 2022

If there are inputs with different log levels then I think we should take the most verbose one.
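
The "most verbose wins" rule could be sketched as follows. The level names and their ordering are assumed to be the usual Beats levels (error, warning, info, debug), and `most_verbose` is a hypothetical helper, not an existing Agent function.

```python
# Ordered from least to most verbose; assumed to match the usual Beats levels.
LEVELS = ["error", "warning", "info", "debug"]


def most_verbose(levels, default="info"):
    """Return the most verbose known level, or default if none is given."""
    known = [level for level in levels if level in LEVELS]
    if not known:
        return default
    # A higher index in LEVELS means more verbose output.
    return max(known, key=LEVELS.index)
```

If three inputs ask for `error`, `debug`, and `info`, the shared process would run at `debug`; unknown level names are ignored rather than rejected in this sketch.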

@jlind23 transferred this issue from elastic/beats Mar 7, 2022
@oren-zohar
Contributor

The debug logging option is also crucial for cloudbeat. We are less concerned about log level per input and even a log level that's inherited from the agent log level / env var would be helpful. Maybe we can start with that and reiterate once we have a complete definition?
I would be happy to help with implementing some basic way of controlling the log level to speed things up.

@jlind23
Contributor

jlind23 commented Apr 28, 2022

@oren-zohar starting with a global log level sounds good as a first step. If you give it a first try, let me know and I'll find someone to help you out if needed.

@jlind23
Contributor

jlind23 commented May 14, 2024

Closing this as duplicate as it will be covered by #3090
cc @ycombinator

@jlind23 closed this as not planned May 14, 2024
9 participants