enhancement: diagnostics subcommand #4660

Open
lukesteensen opened this issue Oct 20, 2020 · 2 comments
Labels
domain: observability (anything related to monitoring/observing Vector)
domain: setup (anything related to setting up or installing Vector)
type: enhancement (a value-adding code change that enhances its existing functionality)

Comments

@lukesteensen
Member

One of the biggest challenges when trying to help Vector users debug issues is collecting all of the relevant data from their environment. This can involve a lot of back and forth, and it's not always clear which commands to run to get Vector-specific info like file checkpoints.

To address this, we should add a diagnostics subcommand (similar to Homebrew's doctor command) that loads a user's config and gathers relevant information. That information can then be formatted nicely and written to the terminal or to a file for the user to pass along to us.

Each component can independently implement its own set of checks via another build-style method on the config traits. For example, the file source checks could return information like the following:

  1. Count and sizes of files in the configured directories
  2. Checkpointed positions compared to the current file size (i.e. lag)
  3. Latency of operations like globbing, fingerprinting, etc.
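
A minimal sketch of what a per-component check could look like, covering the first two items in the list (file counts/sizes and checkpoint lag). Everything here is hypothetical: `Diagnose`, `DiagnosticEntry`, and `FileSourceSnapshot` are illustrative names, not existing Vector types, and the sketch assumes that a file with no checkpoint has been read from offset 0.

```rust
use std::collections::BTreeMap;

/// One line of a diagnostics report: a label and a human-readable value.
#[derive(Debug, Clone, PartialEq)]
pub struct DiagnosticEntry {
    pub label: String,
    pub value: String,
}

/// Hypothetical trait (not one of Vector's actual config traits).
/// Components that have nothing to report inherit the empty default.
pub trait Diagnose {
    fn diagnostics(&self) -> Vec<DiagnosticEntry> {
        Vec::new()
    }
}

/// Hypothetical stand-in for what a file source knows about its files.
pub struct FileSourceSnapshot {
    /// path -> current file size in bytes
    pub file_sizes: BTreeMap<String, u64>,
    /// path -> checkpointed read offset in bytes
    pub checkpoints: BTreeMap<String, u64>,
}

impl FileSourceSnapshot {
    /// Lag in bytes for one file, or None if the file is not being watched.
    /// Assumption: a missing checkpoint means offset 0.
    /// saturating_sub avoids underflow when a file was truncated
    /// and the checkpoint is now past the end of the file.
    pub fn lag(&self, path: &str) -> Option<u64> {
        let size = *self.file_sizes.get(path)?;
        let checkpoint = self.checkpoints.get(path).copied().unwrap_or(0);
        Some(size.saturating_sub(checkpoint))
    }
}

impl Diagnose for FileSourceSnapshot {
    fn diagnostics(&self) -> Vec<DiagnosticEntry> {
        let total: u64 = self.file_sizes.values().sum();
        // First entry: count and combined size of the watched files.
        let mut entries = vec![DiagnosticEntry {
            label: "files".to_string(),
            value: format!("{} files, {} bytes total", self.file_sizes.len(), total),
        }];
        // One entry per file: checkpoint lag in bytes.
        for path in self.file_sizes.keys() {
            if let Some(lag) = self.lag(path) {
                entries.push(DiagnosticEntry {
                    label: format!("lag {}", path),
                    value: format!("{} bytes", lag),
                });
            }
        }
        entries
    }
}
```

Latency measurements (item 3) are left out of this sketch; they would need timing around the real globbing and fingerprinting calls.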
@lukesteensen added the type: enhancement, domain: observability, and domain: setup labels on Oct 20, 2020
@binarylogic
Contributor

Please also see #4670 for more details.

@neuronull
Contributor

Jotting down some thoughts that came to mind when this was brought up recently.
There was a lot of discussion around giving Vector a command similar to the Agent's "flare" command.

For some of the investigations I've looked into, here are some things that would be helpful to capture:

  • Full Vector configuration: not just the parsed result, but also which parts live in which files when the config is split across multiple files.
    • This is often the first thing we need from a customer, and it isn't always available, so we have to ask for it. That can mean some back and forth, and even user error in providing the full / correct config.
  • Vector version (obvious).
  • Some of the same output that vector top provides: basically total counts for some of our more important internal telemetry, such as bytes/events sent/received, component errors, and events dropped, both overall and per component.
  • I like the idea from the initial comment about each component being able to provide specific data. This could be modularized and default to none, so there isn't a requirement, but we could add it to critical or problem-prone components and set a precedent for adding it to new components.
    • For example, the datadog metrics sink could provide data on what types of metrics it has processed and on batch sizes.
    • The sink infrastructure layer could report retry attempt data.
  • We could even add a parameter to the diagnostics subcommand to collect profiling data. Profiling would be entirely optional, but if it was enabled and the running instance exercised a given code path, then perhaps through RAII techniques we could learn how much time is spent at various levels of that code path, from initialization to data flow through the components.
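
To make the RAII idea in the last bullet concrete, here is a rough sketch of a scope timer: a guard records its start time when created and adds the elapsed time to a shared table when dropped. `Profiler` and `ScopeTimer` are hypothetical names for illustration only, not existing Vector code, and a real implementation would likely want something cheaper than a mutex-protected map on hot paths.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use std::time::{Duration, Instant};

/// Hypothetical registry that accumulates total time spent per named scope.
#[derive(Clone, Default)]
pub struct Profiler {
    totals: Arc<Mutex<HashMap<&'static str, Duration>>>,
    enabled: bool,
}

impl Profiler {
    pub fn new(enabled: bool) -> Self {
        Profiler { totals: Arc::default(), enabled }
    }

    /// Starts timing a scope. Returns None when profiling is disabled,
    /// so the disabled case never reads the clock or takes the lock.
    pub fn scope(&self, name: &'static str) -> Option<ScopeTimer> {
        if !self.enabled {
            return None;
        }
        Some(ScopeTimer {
            name,
            start: Instant::now(),
            totals: Arc::clone(&self.totals),
        })
    }

    /// Total time recorded so far for a scope, if it was ever exercised.
    pub fn total(&self, name: &str) -> Option<Duration> {
        self.totals.lock().unwrap().get(name).copied()
    }
}

/// RAII guard: the elapsed time is recorded when the guard is dropped.
pub struct ScopeTimer {
    name: &'static str,
    start: Instant,
    totals: Arc<Mutex<HashMap<&'static str, Duration>>>,
}

impl Drop for ScopeTimer {
    fn drop(&mut self) {
        let elapsed = self.start.elapsed();
        // Skip recording if the lock is poisoned rather than panic in drop.
        if let Ok(mut totals) = self.totals.lock() {
            *totals.entry(self.name).or_default() += elapsed;
        }
    }
}
```

A caller would write `let _timer = profiler.scope("init");` at the top of the code path being measured. The guard must be bound to a named variable such as `_timer`; binding it to a bare `_` drops it immediately and records almost no time. Scopes that were never exercised simply have no entry, which matches the "only if the code path exercised it" behavior described above.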

@pront self-assigned this on Oct 31, 2024
4 participants