This program collects Prometheus data for the last n minutes (default: 5 minutes), takes the average of each metric, and writes the result to a text file.
Use it together with another Prometheus instance to store the downsampled data. In our case, we set the long-term Prometheus retention to 2 years (still testing; we hope it works as expected).
After testing for a while, memory usage on the long-term Prometheus keeps growing. The likely cause: metrics that haven't been updated recently but haven't reached retention yet keep their index in memory. We are now trying Thanos instead.
This program only writes a text file to a Kubernetes emptyDir volume; an nginx container in the same pod then exposes the output to the long-term Prometheus. You also need to set `honor_labels: true` in the long-term Prometheus scrape job, otherwise conflicting labels will be renamed.
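For reference, the scrape job on the long-term Prometheus could look like the sketch below; the job name, target, and interval are placeholders for your own setup:

```yaml
scrape_configs:
  - job_name: downsampled      # placeholder name
    honor_labels: true         # keep the labels from the source Prometheus
    scrape_interval: 5m        # match the downsample interval
    static_configs:
      - targets: ['downsampler-nginx:80']  # the nginx sidecar serving the output file
```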
There are 4 configurable parameters. You can set each one either with a command-line arg or an environment variable.
- Source Prometheus endpoint
  - Default: `http://127.0.0.1:9090`
  - Args: `-s`
  - Environment variable: `PDS_SOURCE`
- Output file path
  - Default: `/tmp/prometheus_downsample_output.txt`
  - Args: `-o`
  - Environment variable: `PDS_OUTPUT`
- Interval in minutes for collecting data from the source Prometheus
  - Default: `5m`
  - Args: `-i`
  - Environment variable: `PDS_INTERVAL`
- Max concurrent connections to the source Prometheus
  - Default: `50`
  - Args: `-c`
  - Environment variable: `PDS_CONCURRENT`
Example: your Prometheus endpoint is `http://192.168.1.20:9090` and you want to downsample data every 10 minutes:

```
go run prometheus-downsampler.go -s http://192.168.1.20:9090 -i 10m
```

or

```
./prometheus-downsampler -s http://192.168.1.20:9090 -i 10m
```
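Equivalently, the same run configured through the environment variables listed above:

```
PDS_SOURCE=http://192.168.1.20:9090 PDS_INTERVAL=10m ./prometheus-downsampler
```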
- Call the Querying label values API to get all metric names
- Call the Range Queries API to fetch every metric with a 1-minute step
- Take the average of each metric
- Write all metrics in exposition format to a temp file
- Rename the temp file to the output file name
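The sketch below illustrates these steps, assuming the standard Prometheus HTTP API; it is not the project's actual code. For brevity it prints to stdout instead of doing the temp-file write and rename, and it skips the concurrency limit:

```go
// A sketch of the collect-and-average flow, assuming the standard Prometheus
// HTTP API. Illustrative only, not the project's actual code.
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/url"
	"sort"
	"strconv"
	"strings"
	"time"
)

const source = "http://127.0.0.1:9090" // source Prometheus (-s)

// metricNames lists all metric names via the label values API.
func metricNames() ([]string, error) {
	resp, err := http.Get(source + "/api/v1/label/__name__/values")
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	var body struct {
		Data []string `json:"data"`
	}
	if err := json.NewDecoder(resp.Body).Decode(&body); err != nil {
		return nil, err
	}
	return body.Data, nil
}

// series is one range-query result: a label set and its samples.
type series struct {
	Metric map[string]string    `json:"metric"`
	Values [][2]json.RawMessage `json:"values"` // [timestamp, "value"] pairs
}

// rangeQuery fetches one metric over [start, end] at a 1-minute step.
func rangeQuery(name string, start, end time.Time) ([]series, error) {
	resp, err := http.Get(fmt.Sprintf("%s/api/v1/query_range?query=%s&start=%d&end=%d&step=60",
		source, url.QueryEscape(name), start.Unix(), end.Unix()))
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	var body struct {
		Data struct {
			Result []series `json:"result"`
		} `json:"data"`
	}
	if err := json.NewDecoder(resp.Body).Decode(&body); err != nil {
		return nil, err
	}
	return body.Data.Result, nil
}

// average returns the mean of the string-encoded sample values.
func average(values [][2]json.RawMessage) float64 {
	var sum float64
	for _, v := range values {
		var s string
		_ = json.Unmarshal(v[1], &s)
		f, _ := strconv.ParseFloat(s, 64)
		sum += f
	}
	return sum / float64(len(values))
}

func main() {
	end := time.Now()
	start := end.Add(-5 * time.Minute) // collection interval (-i)
	names, err := metricNames()
	if err != nil {
		panic(err)
	}
	for _, name := range names {
		result, err := rangeQuery(name, start, end)
		if err != nil {
			continue
		}
		for _, s := range result {
			if len(s.Values) == 0 {
				continue
			}
			// One exposition-format line: name{labels} <average> <timestamp-ms>
			var labels []string
			for k, v := range s.Metric {
				if k != "__name__" {
					labels = append(labels, fmt.Sprintf("%s=%q", k, v))
				}
			}
			sort.Strings(labels)
			fmt.Printf("%s{%s} %g %d\n", name, strings.Join(labels, ","), average(s.Values), end.UnixMilli())
		}
	}
}
```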
This program could collect data over a longer time range, group the samples into n-minute buckets, and average each bucket (a sketch of that bucketing follows the list below). But for the reasons below, it currently only processes a single time group.

- The exposition format documentation states: "Each line must have a unique combination of a metric name and labels. Otherwise, the ingestion behavior is undefined." It doesn't say whether repeating the same metric name and labels with different timestamps is safe.
- We also tested exporting 1 hour of data as 12 data points per metric for a while, and the long-term Prometheus lost some of the data points.
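A minimal sketch of that n-minute bucketing (the `sample` type is illustrative, not from the project):

```go
package main

import (
	"fmt"
	"time"
)

// sample is an illustrative (timestamp, value) pair.
type sample struct {
	Time  time.Time
	Value float64
}

// bucketAverage groups samples into n-sized buckets by truncating each
// timestamp, then averages every bucket.
func bucketAverage(samples []sample, n time.Duration) map[time.Time]float64 {
	sums := map[time.Time]float64{}
	counts := map[time.Time]int{}
	for _, s := range samples {
		b := s.Time.Truncate(n)
		sums[b] += s.Value
		counts[b]++
	}
	avgs := make(map[time.Time]float64, len(sums))
	for b, sum := range sums {
		avgs[b] = sum / float64(counts[b])
	}
	return avgs
}

func main() {
	t := time.Unix(1600000000, 0)
	samples := []sample{{t, 1}, {t.Add(time.Minute), 3}, {t.Add(11 * time.Minute), 5}}
	// Two 10-minute buckets: the first averages 1 and 3, the second holds 5.
	fmt.Println(bucketAverage(samples, 10*time.Minute))
}
```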
We scrape over 650K metrics every 10 seconds (with 2 Prometheus servers for HA). We tried `remote_write` to InfluxDB (a single server, not the enterprise edition), but it caused very high CPU usage on InfluxDB, which quickly became unresponsive and died from OOM; it also took down the operational Prometheus. So we tried a different approach (this project).
It also needs only minor modifications to the Grafana dashboards, with no need to rebuild them all. (We don't know why Grafana always hit proxy timeouts when reading from InfluxDB through Prometheus `remote_read`.)