[receiver/filelog] Improve receiver logging and monitoring metrics #31256
Comments
We have some of this already and the rest should be reasonable to add. However, there are some caveats.
What's your definition of stale? Up-down counters for open & reading should be straightforward enough to maintain.
This is pretty much the same as open less reading, but maybe there's some more nuance. We can consider when implementing. In any case, I'm marking this issue as required for ga because I think these are very basic observability signals which we should have in place.
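To make the "open less reading" relationship concrete, here is a minimal Go sketch of two up-down counters and the value derived from them. The `fileStats` type and its method names are hypothetical and only illustrate the bookkeeping; this is not the fileconsumer implementation, and a real receiver would presumably report through the collector's telemetry settings rather than plain atomics.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// fileStats is a hypothetical holder for the two up-down counters discussed
// in this thread. It is an illustration, not the fileconsumer code.
type fileStats struct {
	open    atomic.Int64 // files currently held open
	reading atomic.Int64 // open files that are actively being read
}

func (s *fileStats) opened()       { s.open.Add(1) }
func (s *fileStats) closed()       { s.open.Add(-1) }
func (s *fileStats) startReading() { s.reading.Add(1) }
func (s *fileStats) stopReading()  { s.reading.Add(-1) }

// openLessReading returns the number of files that are open but not
// currently being read.
func (s *fileStats) openLessReading() int64 {
	return s.open.Load() - s.reading.Load()
}

func main() {
	var s fileStats
	s.opened()
	s.opened()
	s.startReading()
	// Two files open, one of them being read, one open but not read.
	fmt.Println(s.open.Load(), s.reading.Load(), s.openLessReading())
}
```

Because both counters only ever move by +1 or -1 at well-defined events (open, close, start reading, stop reading), the derived value needs no separate bookkeeping.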
…erface (#32662)

**Description:**

This PR resumes the work from #31618. The goal is to pass in the `component.TelemetrySettings` so as to use them later in various ways:

1) report the filelog's state stats: #31544
2) switch from `SugaredLogger` to `Logger`: #32177

More about the breaking change decision at #31618 (comment).

**Link to tracking Issue:** #31256

**Testing:** Testing suite got updated.

#### Manual Testing

1.
   ```yaml
   receivers:
     filelog:
       start_at: end
       include:
       - /var/log/busybox/refactroring_test.log
       operators:
         - id: addon
           type: add
           field: attributes.extra
           value: my-val
   exporters:
     debug:
       verbosity: detailed
   service:
     pipelines:
       logs:
         receivers: [filelog]
         exporters: [debug]
         processors: []
   ```
2. `./bin/otelcontribcol_linux_amd64 --config ~/otelcol/config.yaml`
3. `echo 'some line' >> /var/log/busybox/refactroring_test.log`
4.
   ```console
   2024-04-24T09:29:17.104+0300	info	LogsExporter	{"kind": "exporter", "data_type": "logs", "name": "debug", "resource logs": 1, "log records": 1}
   2024-04-24T09:29:17.104+0300	info	ResourceLog #0
   Resource SchemaURL:
   ScopeLogs #0
   ScopeLogs SchemaURL:
   InstrumentationScope
   LogRecord #0
   ObservedTimestamp: 2024-04-24 06:29:17.005433317 +0000 UTC
   Timestamp: 1970-01-01 00:00:00 +0000 UTC
   SeverityText:
   SeverityNumber: Unspecified(0)
   Body: Str(some line)
   Attributes:
        -> extra: Str(my-val)
        -> log.file.name: Str(1.log)
   Trace ID:
   Span ID:
   Flags: 0
   	{"kind": "exporter", "data_type": "logs", "name": "debug"}
   ```

**Documentation:** TBA.

Signed-off-by: ChrsMark <[email protected]>
…ileconsumer (#31544)

Blocked on #31618

**Description:**

This PR adds support for filelog receiver to emit observable metrics about its current state: how many files are opened, and harvested.

**Link to tracking Issue:** #31256

**Testing:**

#### How to test this manually

1. Use the following collector config:
   ```yaml
   receivers:
     filelog:
       start_at: end
       include:
       - /var/log/busybox/monitoring/*.log
   exporters:
     debug:
       verbosity: detailed
   service:
     telemetry:
       metrics:
         level: detailed
         address: ":8888"
     pipelines:
       logs:
         receivers: [filelog]
         exporters: [debug]
         processors: []
   ```
2. Build and run the collector: `make otelcontribcol && ./bin/otelcontribcol_linux_amd64 --config ~/otelcol/monitoring_telemetry/config.yaml`
3. Produce some logs:
   ```console
   echo 'some line' >> /var/log/busybox/monitoring/1.log
   while true; do echo -e "This is a log line" >> /var/log/busybox/monitoring/2.log; done
   ```
4. Verify that metrics are produced:
   ```console
   curl 0.0.0.0:8888/metrics | grep _files
     % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                    Dload  Upload   Total   Spent    Left  Speed
   100  4002    0  4002    0     0  1954k      0 --:--:-- --:--:-- --:--:-- 1954k
   # HELP otelcol_fileconsumer_open_files Number of open files
   # TYPE otelcol_fileconsumer_open_files gauge
   otelcol_fileconsumer_open_files{service_instance_id="72b4899d-6ce3-41de-a25b-8f0370e22ec1",service_name="otelcontribcol",service_version="0.99.0-dev"} 2
   # HELP otelcol_fileconsumer_reading_files Number of open files that are being read
   # TYPE otelcol_fileconsumer_reading_files gauge
   otelcol_fileconsumer_reading_files{service_instance_id="72b4899d-6ce3-41de-a25b-8f0370e22ec1",service_name="otelcontribcol",service_version="0.99.0-dev"} 1
   ```

**Documentation:** Added a respective section in Filelog receiver's docs.

---------

Signed-off-by: ChrsMark <[email protected]>
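For anyone who wants to check these metrics programmatically instead of with `curl | grep`, the following Go sketch extracts the `otelcol_fileconsumer_*` samples from Prometheus text output like the one shown in the PR description. The function name `parseFileMetrics` is made up for this example, and the parser is deliberately simplified: it assumes label values contain no spaces and that the sample value is the second whitespace-separated field.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// parseFileMetrics is a hypothetical helper that collects the value of every
// otelcol_fileconsumer_* sample found in Prometheus text exposition output.
// Simplification: label values are assumed to contain no whitespace.
func parseFileMetrics(text string) map[string]float64 {
	out := make(map[string]float64)
	sc := bufio.NewScanner(strings.NewReader(text))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		// Skip blanks, "# HELP"/"# TYPE" comments, and unrelated metrics.
		if line == "" || strings.HasPrefix(line, "#") ||
			!strings.HasPrefix(line, "otelcol_fileconsumer_") {
			continue
		}
		fields := strings.Fields(line)
		if len(fields) < 2 {
			continue
		}
		// Drop the {label="..."} part to get the bare metric name.
		name := fields[0]
		if i := strings.IndexByte(name, '{'); i >= 0 {
			name = name[:i]
		}
		v, err := strconv.ParseFloat(fields[1], 64)
		if err != nil {
			continue
		}
		out[name] = v
	}
	return out
}

func main() {
	sample := `# TYPE otelcol_fileconsumer_open_files gauge
otelcol_fileconsumer_open_files{service_name="otelcontribcol"} 2
# TYPE otelcol_fileconsumer_reading_files gauge
otelcol_fileconsumer_reading_files{service_name="otelcontribcol"} 1
`
	m := parseFileMetrics(sample)
	fmt.Println("open:", m["otelcol_fileconsumer_open_files"])
	fmt.Println("reading:", m["otelcol_fileconsumer_reading_files"])
}
```

In a real check the input string would come from an HTTP GET against the address configured under `service.telemetry.metrics`.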
Heads up on this. #31544 has been merged. That one provides the metrics about the What is remaining from this issue now is the logging enhancements. I will be working on this soon.
Hey @djaglowski, I tried my luck on the Did you have something additional in mind when mentioning the following?
When we detect this condition and continue, we're not concerned with whether or not a file has been rotated. It may have, or may not have. The goal when originally written here was to find files which were rotated out of the matching pattern. Files which were rotated within the matching pattern would not be logged if we only log based on the
Thanks for clarifying this. I see the distinction between the "out-of-pattern" rotated files and the "within-pattern" ones. I did some extra digging into the implementation and found that we can spot the "within-pattern" rotated files at this if-block if we compare the fileName of the However, at this point we only know that a file has been rotated from one filePath to a new one within the matching pattern. I guess we can use the same I'm testing these scenarios manually with the steps I mention at #33237.
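The name comparison described above can be sketched roughly as follows. Everything here is hypothetical: `trackedFile`, its string `fingerprint` field, and `movedWithinPattern` are stand-ins invented for illustration and do not mirror the actual fileconsumer types. The idea is only that a file whose identity matches a previously tracked file, but whose path differs, has been moved within the matching pattern.

```go
package main

import "fmt"

// trackedFile is a hypothetical record of a file seen during a poll.
// The string fingerprint stands in for whatever content-based identity
// the real implementation uses.
type trackedFile struct {
	fingerprint string
	path        string
}

// movedWithinPattern returns oldPath -> newPath for every file whose
// fingerprint was seen in the previous poll under a different path.
func movedWithinPattern(previous, current []trackedFile) map[string]string {
	byFingerprint := make(map[string]string, len(previous))
	for _, f := range previous {
		byFingerprint[f.fingerprint] = f.path
	}
	moved := make(map[string]string)
	for _, f := range current {
		if old, ok := byFingerprint[f.fingerprint]; ok && old != f.path {
			moved[old] = f.path
		}
	}
	return moved
}

func main() {
	previous := []trackedFile{
		{fingerprint: "fp-a", path: "/var/log/busybox/monitoring/stable_trunc.log"},
	}
	current := []trackedFile{
		// Same content identity, new name that still matches stable*.log.
		{fingerprint: "fp-a", path: "/var/log/busybox/monitoring/stable_trunc_1.log"},
		// A fresh file created under the old name.
		{fingerprint: "fp-b", path: "/var/log/busybox/monitoring/stable_trunc.log"},
	}
	for from, to := range movedWithinPattern(previous, current) {
		fmt.Printf("file rotated within pattern: %s -> %s\n", from, to)
	}
}
```

As noted in the comment above, a path change alone only tells us that a rotation happened within the pattern; it says nothing about which rotation strategy produced it.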
**Description:**

This PR adds the logging part from #31256. With this addition, a log message is emitted every time a file is identified as rotated, either by move/create or by copy/truncate.

**Link to tracking Issue:** #31256

**Testing:** Updated existing unit tests

### How to test this manually

Using the following config:

```yaml
receivers:
  filelog:
    start_at: beginning
    poll_interval: 5s
    include:
    - /var/log/busybox/monitoring/stable*.log
exporters:
  debug:
    verbosity: detailed
service:
  telemetry:
    logs:
      level: info
  pipelines:
    logs:
      receivers: [filelog]
      exporters: [debug]
      processors: []
```

#### Testing truncate (out of pattern)

```console
echo "$(date '+%FT%H:%M:%S.%NZ') some line1" >> /var/log/busybox/monitoring/stable_trunc.log &&
echo "$(date '+%FT%H:%M:%S.%NZ') some line2" >> /var/log/busybox/monitoring/stable_trunc.log &&
echo "$(date '+%FT%H:%M:%S.%NZ') some line3" >> /var/log/busybox/monitoring/stable_trunc.log &&
sleep 6 &&
cp /var/log/busybox/monitoring/stable_trunc.log /var/log/busybox/monitoring/stable_trunc.log.1 &&
: > /var/log/busybox/monitoring/stable_trunc.log &&
echo "$(date '+%FT%H:%M:%S.%NZ') some line new0" >> /var/log/busybox/monitoring/stable_trunc.log
```

#### Testing truncate (in pattern)

```console
echo "$(date '+%FT%H:%M:%S.%NZ') some line1" >> /var/log/busybox/monitoring/stable_trunc.log &&
echo "$(date '+%FT%H:%M:%S.%NZ') some line2" >> /var/log/busybox/monitoring/stable_trunc.log &&
echo "$(date '+%FT%H:%M:%S.%NZ') some line3" >> /var/log/busybox/monitoring/stable_trunc.log &&
sleep 6 &&
cp /var/log/busybox/monitoring/stable_trunc.log /var/log/busybox/monitoring/stable_trunc_1.log &&
: > /var/log/busybox/monitoring/stable_trunc.log &&
echo "$(date '+%FT%H:%M:%S.%NZ') some line new1" >> /var/log/busybox/monitoring/stable_trunc.log
```

#### Testing move/create (out of pattern)

```console
echo "$(date '+%FT%H:%M:%S.%NZ') some line1" >> /var/log/busybox/monitoring/stable_trunc.log &&
echo "$(date '+%FT%H:%M:%S.%NZ') some line2" >> /var/log/busybox/monitoring/stable_trunc.log &&
echo "$(date '+%FT%H:%M:%S.%NZ') some line3" >> /var/log/busybox/monitoring/stable_trunc.log &&
sleep 6 &&
mv /var/log/busybox/monitoring/stable_trunc.log /var/log/busybox/monitoring/stable_trunc.log.1 &&
echo "$(date '+%FT%H:%M:%S.%NZ') some line new0" >> /var/log/busybox/monitoring/stable_trunc.log
```

#### Testing move/create (in pattern)

```console
echo "$(date '+%FT%H:%M:%S.%NZ') some line1" >> /var/log/busybox/monitoring/stable_trunc.log &&
echo "$(date '+%FT%H:%M:%S.%NZ') some line2" >> /var/log/busybox/monitoring/stable_trunc.log &&
echo "$(date '+%FT%H:%M:%S.%NZ') some line3" >> /var/log/busybox/monitoring/stable_trunc.log &&
sleep 6 &&
mv /var/log/busybox/monitoring/stable_trunc.log /var/log/busybox/monitoring/stable_trunc_1.log &&
echo "$(date '+%FT%H:%M:%S.%NZ') some line new0" >> /var/log/busybox/monitoring/stable_trunc.log
```

**Documentation:** Add some extra notes in the `design.md`

---------

Signed-off-by: ChrsMark <[email protected]>
Since both metrics and logging have been added we can close this. We can follow-up if we spot any additional need.
Component(s)
receiver/filelog
Is your feature request related to a problem? Please describe.
At the moment OTel Collector can collect logs through the filelog receiver.
However, the receiver neither logs enough detail about its file-tracking actions nor exposes extensive summary metrics.
Describe the solution you'd like
OTel Collector should be able to offer the following:
Describe alternatives you've considered
No response
Additional context
Sample logs:
Sample metrics: