Fix missing points in data source. #1109
Quoting Zabbix documentation:
Or, as they say, "no martini no party". Why do these "missing" values need to be plotted? And how will the plugin know that they've ever existed? Compared to MySQL/PostgreSQL plugins there aren't even

For the record, there is a Zabbix issue on the topic.

P.S. I think the throttling feature is flawed from inception and should not be used. Zabbix should grow its own support of time series databases. Most of them provide lossless data compression out of the box. No extra configuration is required on the user side and no magic tricks are required on the side of the visualization tool.
FYI https://www.zabbix.com/documentation/current/manual/appendix/install/timescaledb No promises on the 'No extra configuration required on the user side' though. : )
It is still better to configure TimescaleDB compression once than to configure throttling for every item... :)
@i-ky throttling is a feature to deal with high frequency monitoring and solves (tries to solve?) a different problem than compression. Without discard, if you are collecting data every 1ms you will still need to write 1000 values per second to database before you compress the data. In addition, this data needs to travel to your database. For example if we are monitoring thousands of) device(s) in a remote network, we can setup a proxy in that network which discards unnecessary value updates before sending to the zabbix server. Compression will not solve these problems, isn't it so? About the zabbix issue ticket on the topic of graphs. I am aware of it and it is a tricky problem to resolve yes.... I am already using Zabbix with TimescaleDB on PostgreSQl. Wouldn't it be possible to fetch data directly from PostgreSQL leveraging the feature for filling gaps in Grafana PostgreSQL data source? |
@yurtesen, in my understanding the only thing that was (still is?) holding back "high frequency monitoring" was database size. Compression solves this problem. Compression can also solve the high network traffic issue.

I think that Zabbix throttling is completely unfit for purpose. It is like a video codec that only compresses the stream if it is a still image. If your values have the slightest variation, Zabbix won't be able to discard anything. However, compression algorithms will be able to do a pretty good job even in this case. After all, we are streaming 4K video on demand these days.

Yes, it is a tricky issue to resolve. And I believe it should be resolved on the Zabbix side. Throttling makes the life of every third-party visualization and data analysis tool a bit trickier.
@i-ky It is an interesting example, as video compression often employs some sort of discard mechanism to avoid sending unchanged parts of an image. So you are saying that what Zabbix does is a form of compression. I understand your point, and the issue might need to be fixed on the Zabbix side. But the fix is probably not undoing the discard feature. Zabbix already supports TimescaleDB and compression. Discarding values has many other benefits (like saving processing cycles on triggers and expression evaluations, etc.), so I doubt they will revert that feature. But let's assume the Zabbix API should deal with figuring out the missing values related to the discard feature.

Independent of that, shouldn't users be allowed to decide how the graph should behave in case there are missing values in the time series? For example, what about the direct DB connection of this plugin? Could it be adapted to use https://docs.timescale.com/latest/api#locf via a user setting, independent of what the Zabbix API does?
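(For illustration, here is a minimal Python sketch of the `locf` idea being discussed: carry the last stored value forward to the timestamps where throttling discarded a sample. The `locf` helper and the sample data are hypothetical, not TimescaleDB's or the plugin's actual code.)

```python
from bisect import bisect_right

def locf(samples, query_times):
    """Last observation carried forward: for each query timestamp,
    return the most recent stored value at or before it (None if none)."""
    # samples: sorted list of (timestamp, value) pairs that were actually stored
    times = [t for t, _ in samples]
    filled = []
    for qt in query_times:
        i = bisect_right(times, qt)
        filled.append((qt, samples[i - 1][1] if i else None))
    return filled

# Hypothetical throttled item: 1 reported at t=1, 3 at t=4, 4 at t=5;
# the values at t=2 and t=3 were discarded as unchanged.
stored = [(1, 1), (4, 3), (5, 4)]
print(locf(stored, [1, 2, 3, 4, 5]))
# [(1, 1), (2, 1), (3, 1), (4, 3), (5, 4)]
```

This reconstructs exactly the points throttling dropped, because a discarded value is by definition equal to the previous one.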
Well, throttling has its strengths. It works more or less adequately for textual data. And it absolutely shines with low-level discovery. LLD is a database-intense process, and discarding unchanged information lets you increase the frequency of LLD rule checks without increasing the burden on the database. Thus you achieve low-latency LLD, which is very good for dynamic environments. (I prefer the term "low latency" to "high frequency"; it better describes the goal.)

But throttling is not well suited for numeric data, which needs to be aggregated in trends, interpolated and plotted on graphs. I hope Zabbix will eventually acknowledge that and will restrict throttling to text data. Or users will learn that numeric data should not be discarded.

I don't think you can save a lot of cycles on trigger evaluation. I'm not even mentioning that trigger functions like

To be honest, I don't understand why Grafana's Staircase display option doesn't solve the problem. It also seems to be available in Zabbix. Doesn't Staircase apply the same logic as
@i-ky no doubt discard helps LLD save some CPU cycles, but actually LLD requires no database access. Zabbix has an in-memory configuration cache which it can check against. The discard feature has nothing to do with LLD whatsoever.

That a value is numeric does not automatically mean that it will keep changing. I have hosts with 10k+ monitored values, and less than half of those ever change. It could be an application counter where the code is never executed. Compression won't help, because when I collect 10k items from 1k hosts I would need to write 10 million values per interval into the database. Only then can it be compressed in the database to save space. But who really cares about that space? Disk space is cheap and plentiful. There are real gains in the processing power saved by discarding identical values.

Staircase... because staircase is not the correct representation, e.g. when I have a value like temperature. Staircase and `locf` work the same (I think?). Sure, for the missing-data part it works rather well, but it breaks the representation of existing data by adding extra values.
Just as a note: the gap-filling feature of the Grafana PostgreSQL/MySQL data sources does NOT work by replacing NULL values either.
Duplicate of #388.
@borjasueiro is it? It is a very old ticket and the description is in Russian. If it is a duplicate: in his last message there, @alexanderzobnin says a TSDB should solve it. Zabbix now supports TimescaleDB, so the plugin could possibly use the PostgreSQL data source to query the data directly, using the TimescaleDB functionality.
In #388 (comment) Alexander is saying that it is not easy to know where the gaps/missing values are, because the Zabbix API does not provide such information. In #388 (comment) he says that to solve the problem Zabbix should store null values. My speculation is that this would provide information about when checks were performed but returned no data.
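(As a rough Python sketch of inferring gaps without Zabbix storing nulls, assuming the item's update interval is known. The function name and the 50% jitter tolerance are my own assumptions, and, as noted below in the thread, this cannot tell a throttled gap from an outage.)

```python
def fill_gaps(points, interval, fill=None):
    """Insert one fill marker wherever consecutive stored points are more
    than one update interval apart (with tolerance for scheduling jitter)."""
    out = []
    tolerance = interval * 1.5  # assumption: allow 50% jitter in check timing
    for i, (t, v) in enumerate(points):
        if i and t - points[i - 1][0] > tolerance:
            # A check was expected here but no value was stored: either the
            # value was discarded by throttling, or collection failed due to
            # maintenance/network issues. The data alone cannot distinguish.
            out.append((points[i - 1][0] + interval, fill))
        out.append((t, v))
    return out

# Item checked every 60 s; nothing stored between t=60 and t=300.
print(fill_gaps([(0, 10), (60, 10), (300, 12)], interval=60))
# [(0, 10), (60, 10), (120, None), (300, 12)]
```

A single marker per gap is enough to make Grafana break the line there (or apply its own null-handling option) instead of connecting the dots.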
Sorry, I do not understand. Staircase is how Grafana connects data points. In your example it would draw a horizontal line at 1 from t1 to t4, then a vertical line from 1 to 3 at t4, then a horizontal line at 3 from t4 to t5, then a vertical line from 3 to 4 at t5, and so on.
@i-ky First of all, that other ticket is more like a distant cousin from the past, dealing with data that is actually missing. The discard-unchanged feature was added in Zabbix 4.2, which was released in 2019, two years after that ticket was created. The original poster in #388 mentions Grafana's NULL value conversion option (in the screenshot in the ticket), so I understand why @alexanderzobnin thinks Zabbix should output NULL values where data is missing "for that specific case". However, things have changed quite a bit since 2017. Grafana now also has an option for data sources to fill missing data, and that feature works independently of whether NULL values exist. Actually, other time series databases can also return empty values when there is no data; this is why they have fill functions. But anyway... About the solution of the problem, there are a few ways to go:
About the staircase: if you draw a staircase, you would have 2 data points at t4 and t5. Here is an example, you can count the points for yourself:

(Screenshot: a Staircase graph drawing 7 points.)
@yurtesen, sorry, I wasn't aware that Staircase draws points where there are no real data points.

Your first suggestion is reasonable, but you also need a way to distinguish cases when Zabbix is discarding values from cases when no values are being collected due to maintenance, network issues, etc. I honestly don't understand why such complexity needs to be implemented (and maintained) in a third-party visualization tool when Zabbix itself hasn't been able to implement this solution for 6 months now.

Your second solution will not work for everyone. There are users of MySQL, Oracle and even PostgreSQL without TimescaleDB. I also have a feeling that
@i-ky When data is compressed, it may sometimes be acceptable that some data is lost. Losing the reason why there was a gap is one of those cases, because in most cases data will be missing due to discard anyway. For items where this is not an acceptable compromise, the user should not enable the discard functionality. The Grafana PostgreSQL/MySQL gap-fill algorithm also does not detect or care why there is a gap in the data. So the solution does not have to be complicated, and there is nothing to fix on the Zabbix side. It would be an additional feature if Zabbix did it in their API.

https://blog.timescale.com/blog/sql-functions-for-time-series-analysis/
The article is about aggregating values using time buckets and handling missing data. It says nothing like |
Yes, the PostgreSQL solution is not universal, but there is no rule saying: if the solution is not universal, do not implement it. There is a switch for TimescaleDB support in the PostgreSQL data source in Grafana, so people who do not use TimescaleDB can't use its features. (The feature is implemented even though not everybody uses TimescaleDB.) Sure enough, this feature does NOT exist for the MySQL data source, yet it is implemented for PostgreSQL. So features get implemented even though not every database supports them. No problem there...
No, lossless compression is a thing, otherwise file archivers wouldn't make much sense.
In my view "user should" and "user should not" is a design flaw. Good design must not allow the user to select an option that does not work.
Yes, I did.
I did read it too. It says "The
Indeed, there is no such rule. But there are always cost vs. benefit considerations, especially in projects maintained by small teams and individuals.
@i-ky I do not know if you are just joking with me now or not. Your response is totally unrelated.
Actually, #1082 fixes this. Grafana now requires points to be aligned (for stacking graphs to work properly), so that PR contains changes which allow aligning points according to the item update interval and filling missing points with
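(The alignment idea can be sketched in Python like this. It is a hypothetical illustration of snapping timestamps to an interval grid, not the actual #1082 code, and it assumes the gap-fill value is None.)

```python
def align_points(points, interval):
    """Align (timestamp, value) points to an interval grid, emitting a
    None for every grid slot that has no stored point."""
    # Snap each timestamp down to its interval bucket; if two points fall
    # into the same bucket, the later one wins.
    buckets = {(t // interval) * interval: v for t, v in points}
    start, end = min(buckets), max(buckets)
    # One entry per bucket across the whole range; empty buckets get None,
    # so series from different items line up for stacking.
    return [(t, buckets.get(t)) for t in range(start, end + interval, interval)]

# Jittery 60-second item with one missing check around t=120.
print(align_points([(2, 1.0), (63, 1.5), (184, 2.0)], interval=60))
# [(0, 1.0), (60, 1.5), (120, None), (180, 2.0)]
```

Once every series sits on the same grid, stacking and gap display both become straightforward for the frontend.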
@alexanderzobnin thanks for the response. I had a quick look at the PR you mentioned. |
Describe the bug
When Zabbix uses discard/throttle preprocessing, not all data points are available. Grafana naively connects the dots, and this results in a badly drawn graph. Not that Zabbix itself draws it correctly; Zabbix gets it wrong too. But that is not the point 😄
Expected behavior
Grafana data sources for MySQL/PostgreSQL etc. already have a solution in place: they fill missing points with a user-defined value. In the Zabbix discard case, it should simply fill in the previous value.
grafana/grafana#12756
Screenshots
Above is Zabbix, and below is Grafana.
Software versions
Can this be fixed in this plugin?