💡 [Feature] Log download stats from THREDDS server #444
Comments
Consider downloads from WPS outputs and STAC data proxy endpoints as well, for the same reasons.
ESGF uses Beats and Logstash to collect logs and compute their stats. See https://drive.google.com/drive/folders/1LbvoYeQ_6L_bzTsO-EEhwqjIx1jZ-G1k
If the "node collector" can be located on the same instance, Logstash seems like an interesting candidate. If there is no distinction between Beats and Logstash as "log producers", I would favor the 2nd architecture to limit the number of configurations/technologies involved.
## Overview

This version of canarie-api permits running the proxy (nginx) container independently of the canarie-api application. This makes it easier to monitor the logs of canarie-api and proxy containers simultaneously and allows for the configuration files for canarie-api to be mapped to the canarie-api containers where appropriate.

## Changes

**Non-breaking changes**
- New component version canarie-api:1.0.0

**Breaking changes**

## Related Issue / Discussion

- Resolves [issue id](url)

## Additional Information

Links to other issues or sources.

- This might make parsing the nginx logs slightly easier as well which could help with #12 and #444

## CI Operations

<!--
The test suite can be run using a different DACCS config with ``birdhouse_daccs_configs_branch: branch_name`` in the PR description.
To globally skip the test suite regardless of the commit message use ``birdhouse_skip_ci`` set to ``true`` in the PR description.
Note that using ``[skip ci]``, ``[ci skip]`` or ``[no ci]`` in the commit message will override ``birdhouse_skip_ci`` from the PR description.
-->

birdhouse_daccs_configs_branch: master
birdhouse_skip_ci: false
Parser for nginx logs and prometheus counter
I've created two PRs to implement log parsing in different ways. I'd like to summarize each below and briefly discuss the pros and cons of each. We should decide here which one we are interested in.

Prometheus log parser: #473
Summary: reads log files with a lightweight Python library and converts log lines to metrics using Python functions built on the Prometheus Python client.
Pros:
Cons:
Promtail and Loki: #474
Summary: reads log files with the promtail component and converts log lines to metrics using the metrics pipeline stage. Optionally supports shipping the parsed logs themselves to Grafana (through Loki) for custom log inspection.
Pros:
Cons:
Why not something else... logstash, beats, fluentbit ...
These could totally work as well... probably. I didn't have time to investigate them all. The main reason I didn't investigate these options is that, for most of them, exporting log data to metrics required additional plugins, and the complexity of setting them up seemed much higher. For our goals, I think we can achieve what we want with promtail or the prometheus log exporter. If there's a use-case that we can't achieve with either of those two, I'm happy to look into other technologies, but I'd rather stick with these two options for now.
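To make the first option more concrete, here is a minimal sketch of the "log line in, metric out" idea. It is not the code from #473: it uses only the standard library instead of the Prometheus Python client, and the handler name `count_requests_by_status`, the metric name `http_requests_total`, and the assumption that lines follow nginx's default "combined" format are all hypothetical choices made for illustration.

```python
from collections import Counter


def count_requests_by_status(line, counts):
    """Hypothetical line handler: count requests per HTTP status code.

    Assumes nginx's default "combined" format, where the status code is
    the first field after the quoted request string.
    """
    parts = line.split('"')
    if len(parts) < 3:
        return  # not a line this handler understands
    fields = parts[2].split()
    if fields and fields[0].isdigit():
        counts[("http_requests_total", fields[0])] += 1


def render_exposition(counts):
    """Render the counters in the Prometheus text exposition format."""
    lines = ["# TYPE http_requests_total counter"]
    for (name, status), value in sorted(counts.items()):
        lines.append(f'{name}{{status="{status}"}} {value}')
    return "\n".join(lines) + "\n"


counts = Counter()
for log_line in [
    '1.2.3.4 - - [t] "GET /a HTTP/1.1" 200 10 "-" "ua"',
    '1.2.3.4 - - [t] "GET /missing HTTP/1.1" 404 5 "-" "ua"',
]:
    count_requests_by_status(log_line, counts)
print(render_exposition(counts), end="")
```

Adding a new metric in this style means writing another small Python function, which is the property that makes this option easy to extend without learning another configuration format.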
Thanks for the overview. I think one challenge we're having by plugging together different servers is the expertise required to configure each one. I don't think we have anyone within our group who is fluent in Grafana, for example. I'm concerned that as we add components, we're going to make the problem worse. In that sense, I'm leaning toward your first approach, which is simple and can be easily extended without delving into yet another configuration format.
I agree with @huard for the same reasons.
Same here.
## Overview

This component parses log files from other components and converts their logs to prometheus metrics that are then ingested by the monitoring Prometheus instance (the one created by the `components/monitoring` component).

For more information on how this component reads log files and converts them to prometheus metrics, see the [log-parser](https://github.com/DACCS-Climate/log-parser/) documentation.

To configure this component:

* set the `PROMETHEUS_LOG_PARSER_POLL_DELAY` variable to a number of seconds to set how often the log parser checks if new lines have been added to log files (default: 1)
* set the `PROMETHEUS_LOG_PARSER_TAIL` variable to `"true"` to only parse new lines in log files. If unset, this will parse all existing lines in the log file as well (default: `"true"`)

To view all metrics exported by the log parser:

* Navigate to the `https://<BIRDHOUSE_FQDN>/prometheus/graph` search page
* Put `{job="log_parser"}` in the search bar and click the "Execute" button

Update the prometheus version to the current latest `v2.53.3`. This is required to support loading multiple prometheus scrape configuration files with the `scrape_config_files` configuration option.

## Changes

**Non-breaking changes**
- New component version prometheus:v2.53.3

**Breaking changes**
- None

## Related Issue / Discussion

- #444

## Additional Information

- implements parser given as an example here: #444 (comment)
- this is an alternative to #474. See discussion in #444 to help decide which we should pick.

## CI Operations

<!--
The test suite can be run using a different DACCS config with ``birdhouse_daccs_configs_branch: branch_name`` in the PR description.
To globally skip the test suite regardless of the commit message use ``birdhouse_skip_ci`` set to ``true`` in the PR description.

Using ``[<cmd>]`` (with the brackets) where ``<cmd> = skip ci`` in the commit message will override ``birdhouse_skip_ci`` from the PR description.
Such a commit command can be used to override the PR description behavior for a specific commit update.
However, a commit message cannot 'force run' a PR whose description turns off the CI.
To run the CI, the PR should instead be updated with a ``true`` value, and a running message can be posted in following PR comments to trigger tests once again.
-->

birdhouse_daccs_configs_branch: master
birdhouse_skip_ci: false
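As a rough illustration of what the two settings described in the PR above control, the sketch below follows a log file with a polling loop. It is not the log-parser implementation: the functions `settings_from_env` and `follow`, the `max_idle_polls` parameter (added only so the loop can terminate in an example), and the exact way the environment variables are interpreted are assumptions.

```python
import os
import time


def settings_from_env(env=os.environ):
    """Read the two variables named in the PR (interpretation is assumed)."""
    poll_delay = float(env.get("PROMETHEUS_LOG_PARSER_POLL_DELAY", "1"))
    tail = env.get("PROMETHEUS_LOG_PARSER_TAIL", "true") == "true"
    return poll_delay, tail


def follow(path, poll_delay=1.0, tail=True, max_idle_polls=None):
    """Yield complete lines from `path`, polling for newly appended ones.

    tail=True skips lines already in the file; tail=False replays them.
    max_idle_polls is a hypothetical stop condition for demos and tests.
    """
    with open(path, "r") as f:
        if tail:
            f.seek(0, os.SEEK_END)  # start after the existing content
        buffer = ""
        idle = 0
        while max_idle_polls is None or idle < max_idle_polls:
            chunk = f.readline()
            if not chunk:
                # Nothing new yet: wait before checking the file again.
                idle += 1
                time.sleep(poll_delay)
                continue
            buffer += chunk
            if buffer.endswith("\n"):
                # Only emit a line once it has been written completely.
                yield buffer.rstrip("\n")
                buffer = ""
                idle = 0
```

A smaller poll delay reduces the lag between a request being logged and the metric being updated, at the cost of checking the file more often.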
Description
It would be useful for reporting purposes to monitor data downloads from THREDDS:
References
This information can be parsed from NGINX logs, but those logs need to be exposed to Prometheus to be aggregated and archived within the current architecture.
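As a sketch of what "parsed from NGINX logs" could look like, the snippet below sums the bytes sent per THREDDS file from access log lines. It assumes nginx's default "combined" log format and that downloads go through the `/thredds/fileServer/` path; both should be checked against the actual proxy configuration. The function name `thredds_download_bytes` is hypothetical.

```python
import re
from collections import defaultdict

# nginx default "combined" log format (assumed; deployments may customize it).
COMBINED = re.compile(
    r'(?P<remote_addr>\S+) \S+ (?P<remote_user>\S+) \[(?P<time_local>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]*)" '
    r'(?P<status>\d{3}) (?P<body_bytes_sent>\d+) '
    r'"(?P<referer>[^"]*)" "(?P<user_agent>[^"]*)"'
)

# Assumed URL prefix for THREDDS HTTP file downloads.
THREDDS_PREFIX = "/thredds/fileServer/"


def thredds_download_bytes(lines):
    """Return total bytes sent per dataset path for successful downloads."""
    totals = defaultdict(int)
    for line in lines:
        match = COMBINED.match(line)
        if match is None:
            continue  # skip lines that do not follow the assumed format
        path = match["path"]
        if not path.startswith(THREDDS_PREFIX):
            continue
        if not match["status"].startswith("2"):
            continue  # ignore redirects and errors
        dataset = path[len(THREDDS_PREFIX):].split("?", 1)[0]
        totals[dataset] += int(match["body_bytes_sent"])
    return dict(totals)
```

Totals like these only live in memory, which is why the values still need to be exposed as metrics for Prometheus to aggregate and archive them.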
Possible solutions:
Additional info
See also:
Concerned Organizations