Better tracking of number of open file descriptors #7986

Merged

Conversation

kvch
Contributor

@kvch kvch commented Aug 16, 2018

New metrics are introduced to better track the number of open file descriptors.

The original issue asked for the number of open file descriptors per input. Reporting the number of files opened by harvesters is already implemented; it is exposed as filebeat.harvester.open_files.

In addition, I added process-level file descriptor reporting for every Beat that runs on Linux.

New metrics

  • beat.fd.open: Number of files opened by the Beat process.
    This is the number of entries under /proc/{{ filebeat-pid }}/fd. Only implemented on Linux.
  • beat.fd.limit.soft: Soft limit on open file descriptors for the Beat process.
    Could be used to notify a user when the process is approaching the limit (in the Monitoring UI).
  • beat.fd.limit.hard: Hard limit on open file descriptors for the Beat process.
    This is the maximum limit that can be set on a host without modifying kernel params.
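
For illustration, the three metrics above could be gathered roughly as follows. This is a minimal sketch, not the code from this PR: the fdStats type and the readSelfFDStats function are hypothetical names, and it assumes Linux with /proc mounted.

```go
package main

import (
	"fmt"
	"os"
	"syscall"
)

// fdStats mirrors the three metrics from the description.
// The type and field names are illustrative, not taken from libbeat.
type fdStats struct {
	Open      uint64 // beat.fd.open
	SoftLimit uint64 // beat.fd.limit.soft
	HardLimit uint64 // beat.fd.limit.hard
}

// readSelfFDStats collects descriptor usage for the calling process (Linux only).
func readSelfFDStats() (fdStats, error) {
	// Each entry under /proc/self/fd is one open descriptor. The count
	// includes the descriptor used to read the directory itself.
	entries, err := os.ReadDir("/proc/self/fd")
	if err != nil {
		return fdStats{}, err
	}

	// RLIMIT_NOFILE carries the soft (Cur) and hard (Max) descriptor limits.
	var lim syscall.Rlimit
	if err := syscall.Getrlimit(syscall.RLIMIT_NOFILE, &lim); err != nil {
		return fdStats{}, err
	}

	return fdStats{
		Open:      uint64(len(entries)),
		SoftLimit: uint64(lim.Cur),
		HardLimit: uint64(lim.Max),
	}, nil
}

func main() {
	stats, err := readSelfFDStats()
	if err != nil {
		fmt.Println("could not read fd stats:", err)
		return
	}
	fmt.Printf("open=%d soft=%d hard=%d\n", stats.Open, stats.SoftLimit, stats.HardLimit)
}
```

Note that recent Go runtimes may raise the soft limit towards the hard limit at startup, so the two values can be equal.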

@kvch kvch added the review, Filebeat, libbeat, and needs_backport labels Aug 16, 2018
@kvch kvch requested review from ph and urso August 16, 2018 09:45
@ruflin
Contributor

ruflin commented Aug 17, 2018

I assume we would also want to show these in the stack monitoring UI? Should we open an issue here as well to track it across the stack? https://github.com/elastic/stack-monitoring

@urso

urso commented Aug 17, 2018

These seem very Linux-specific. Will we add these stats for other systems as well (do we even have similar ones on Windows)?

  • beat.fd.open: Number of files opened by the Beat process.
    This is the number of entries under /proc/{{ filebeat-pid }}/fd. Only implemented on Linux.
  • beat.fd.limit.soft: Soft limit on open file descriptors for the Beat process.
    Could be used to notify a user when the process is approaching the limit (in the Monitoring UI).
  • beat.fd.limit.hard: Hard limit on open file descriptors for the Beat process.
    This is the maximum limit that can be set on a host without modifying kernel params.

The TCP ones are definitely nice to have, but let's keep them out for now. These metrics relate to Filebeat inputs only. Instead, I'd like to have more 'generic' stats per Filebeat input/module. E.g.

type: tcp
workers: 5
clients: 4
fds: 5
errors: 10

@kvch
Contributor Author

kvch commented Aug 21, 2018

AFAIK file descriptors are part of Unix. Windows uses file handles, which are a bit different. (But those differences are not relevant in this case.) File handle usage is not part of gosigar. I looked for ways to add it, but all I found were GUI programs and documentation that report the number of open file handles and the limit on open file handles.
I gave up the search, because the current filebeat.harvester.open_files metric is sufficient for tracking open files on both Linux and Windows. In addition, I intended to report the limits on open file handles, so users can track the ratio of open FDs to the limit.
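
The open/limit ratio mentioned above could be computed along these lines. This is only a sketch: fdUsageRatio and nearLimit are hypothetical helper names, and the 0.8 threshold is an arbitrary example value, not something defined in this PR.

```go
package main

import "fmt"

// fdUsageRatio returns open/softLimit. ok is false when the limit is zero,
// which avoids a division by zero for a missing or unreported limit.
func fdUsageRatio(open, softLimit uint64) (ratio float64, ok bool) {
	if softLimit == 0 {
		return 0, false
	}
	return float64(open) / float64(softLimit), true
}

// nearLimit reports whether usage has reached the given threshold,
// e.g. 0.8 for "80% of the soft limit is in use".
func nearLimit(open, softLimit uint64, threshold float64) bool {
	ratio, ok := fdUsageRatio(open, softLimit)
	return ok && ratio >= threshold
}

func main() {
	// Example values standing in for beat.fd.open and beat.fd.limit.soft.
	open, soft := uint64(900), uint64(1024)
	if nearLimit(open, soft, 0.8) {
		fmt.Printf("warning: %d of %d file descriptors in use\n", open, soft)
	}
}
```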

@ph
Contributor

ph commented Aug 21, 2018

The TCP ones are definitely nice to have, but let's keep them out for now. These metrics relate to Filebeat inputs only. Instead, I'd like to have more 'generic' stats per Filebeat input/module. E.g.

I would be +1 to not use them for now.

@urso

urso commented Aug 27, 2018

I would be +1 to not use them for now.

Let's remove them for now.

AFAIK file descriptors are part of Unix. Windows uses file handles, which are a bit different. (But those differences are not relevant in this case.) File handle usage is not part of gosigar. I looked for ways to add it, but all I found were GUI programs and documentation that report the number of open file handles and the limit on open file handles.
I gave up the search, because the current filebeat.harvester.open_files metric is sufficient for tracking open files on both Linux and Windows. In addition, I intended to report the limits on open file handles, so users can track the ratio of open FDs to the limit.

We want to display total resource usage, such as the number of file descriptors or handles (FD/H). That covers descriptors and handles in general, not just open files; sockets use them too, for example. Because we use third-party libs, we can't rely on manually adding counters to some of our own libs. I'd say filebeat.harvester.open_files is definitely helpful, but not really sufficient for all use cases.

It's OK if we can't get Windows working yet, but we should make sure that the metrics can be implemented for Windows as well, so that we can reach feature parity at some point in the future.

Do we have a gosigar issue to track progress for Windows support?
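
One way to keep the door open for Windows is to hide the platform-specific part behind a small interface, as sketched below. All names here (handleCounter, procfsCounter, unsupportedCounter, newHandleCounter, errNotImplemented) are hypothetical and not part of libbeat or gosigar. Real code would more likely pick the implementation with build tags; the goos parameter is used here only to keep the example in a single file.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"runtime"
)

// errNotImplemented signals that the metric is not available on this platform yet.
var errNotImplemented = errors.New("open handle count not implemented on this platform")

// handleCounter abstracts "number of open descriptors or handles" per platform.
type handleCounter interface {
	OpenHandles() (uint64, error)
}

// procfsCounter counts entries under /proc/self/fd (Linux).
type procfsCounter struct{}

func (procfsCounter) OpenHandles() (uint64, error) {
	entries, err := os.ReadDir("/proc/self/fd")
	if err != nil {
		return 0, err
	}
	return uint64(len(entries)), nil
}

// unsupportedCounter is a placeholder until a platform gets a real implementation.
type unsupportedCounter struct{}

func (unsupportedCounter) OpenHandles() (uint64, error) {
	return 0, errNotImplemented
}

// newHandleCounter selects an implementation for the given GOOS value.
func newHandleCounter(goos string) handleCounter {
	if goos == "linux" {
		return procfsCounter{}
	}
	return unsupportedCounter{}
}

func main() {
	n, err := newHandleCounter(runtime.GOOS).OpenHandles()
	if errors.Is(err, errNotImplemented) {
		fmt.Println("handle count unavailable:", err)
		return
	}
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("open handles:", n)
}
```

With this shape, a Windows implementation could later be added as another type satisfying handleCounter without changing the callers.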

@kvch kvch force-pushed the feature/filebeat/report-open-fd-all-inputs branch from b1062f2 to 8f916bd Compare August 30, 2018 10:37
@kvch
Contributor Author

kvch commented Aug 30, 2018

I have dropped TCP metrics.

I would like to merge this PR as is, because reporting open file handles is going to be a bigger piece of work: I need to add functions to go-windows and go-sysinfo and then update them in this repo. I am keeping the original issue open, so we won't forget about the Windows metrics.

@ph
Contributor

ph commented Aug 30, 2018

@kvch Thanks for removing the metrics. I am OK with doing a follow-up to add Windows support, but before we merge this, could we have some kind of test that makes sure we don't have a regression later?

@kvch
Contributor Author

kvch commented Aug 30, 2018

Metrics are tested in beats-tester. I opened an issue there to add tests: elastic/beats-tester#89

@urso

urso commented Aug 30, 2018

jenkins test this please.

@kvch
Contributor Author

kvch commented Aug 31, 2018

jenkins test this

@kvch kvch changed the title from "Better tracking of number of open file descriptors of Filebeat" to "Better tracking of number of open file descriptors" Oct 4, 2018
kvch added a commit to kvch/beats that referenced this pull request Oct 5, 2018
…ic#7986)

(cherry picked from commit f10096a)
kvch added a commit that referenced this pull request Oct 5, 2018
…riptors of Filebeat (#8514)

* Better tracking of number of open file descriptors of Filebeat (#7986)

(cherry picked from commit f10096a)
4 participants