Document harvester_limit for Filestream input and fix typo (#39244) (#39274)

This commit documents `harvester_limit` for the filestream input and
replaces `close_*` by the correct key `close.on_state_change.*`.

(cherry picked from commit 59421bb)

Co-authored-by: Tiago Queiroz <[email protected]>
mergify[bot] and belimawr authored Apr 30, 2024
1 parent 65fd14a commit 0acd9c1
Showing 2 changed files with 27 additions and 2 deletions.
24 changes: 24 additions & 0 deletions filebeat/docs/inputs/input-filestream-file-options.asciidoc
@@ -498,6 +498,30 @@ less than or equal to `prospector.scanner.check_interval`
If `backoff.max` needs to be higher, it is recommended to close the file handler
instead and let {beatname_uc} pick up the file again.

[float]
[id="{beatname_lc}-input-{type}-harvester-limit"]
===== `harvester_limit`

The `harvester_limit` option limits the number of harvesters that are started in
parallel for one input. This directly relates to the maximum number of file
handlers that are opened. The default for `harvester_limit` is 0, which means
there is no limit. This configuration is useful if the number of files to be
harvested exceeds the open file handler limit of the operating system.

Setting a limit on the number of harvesters means that potentially not all files
are opened in parallel. Therefore, we recommend that you use this option in
combination with the `close.on_state_change.*` options to make sure
harvesters are stopped more often so that new files can be picked up.

Currently, when a new harvester can be started, the harvester is picked
randomly. This means it's possible that the harvester for a file that was just
closed and then updated again might be started instead of the harvester for a
file that hasn't been harvested for a longer period of time.

This configuration option applies per input. You can use this option to
indirectly set higher priorities on certain inputs by assigning a higher
limit of harvesters.

[float]
===== `file_identity`

5 changes: 3 additions & 2 deletions filebeat/docs/inputs/input-filestream.asciidoc
@@ -11,8 +11,9 @@ Use the `filestream` input to read lines from active log files. It is the
new, improved alternative to the `log` input. It comes with various improvements
to the existing input:

-1. Checking of `close_*` options happens out of band. Thus, if an output is blocked,
-{beatname_uc} can close the reader and avoid keeping too many files open.
+1. Checking of `close.on_state_change.*` options happens out of
+band. Thus, if an output is blocked, {beatname_uc} can close the
+reader and avoid keeping too many files open.

2. Detailed metrics are available for all files that match the `paths` configuration
regardless of the `harvester_limit`. This way, you can keep track of all files,
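To illustrate the `harvester_limit` documentation added in this commit, here is a minimal configuration sketch, not part of the commit itself. It assumes two hypothetical `filestream` inputs; the `id` values, paths, limits, and timeout are made up for illustration. It pairs `harvester_limit` with `close.on_state_change.inactive`, as the new text recommends, and gives one input a higher limit than the other.

```yaml
filebeat.inputs:
  # Hypothetical higher-priority input: allowed more parallel harvesters.
  - type: filestream
    id: app-logs                      # illustrative id
    paths:
      - /var/log/app/*.log            # illustrative path
    harvester_limit: 100              # at most 100 files open at once for this input
    # Close inactive files sooner so waiting files can be picked up.
    close.on_state_change.inactive: 2m

  # Hypothetical lower-priority input: fewer parallel harvesters.
  - type: filestream
    id: archive-logs                  # illustrative id
    paths:
      - /var/log/archive/*.log        # illustrative path
    harvester_limit: 10
    close.on_state_change.inactive: 2m
```

Because the limit applies per input, the total number of open file handlers in this sketch could reach the sum of both limits, so that sum should stay below the operating system's open file limit.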