-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ZFS: io stats removed #2068
Comments
Ok, looks like someone needs to implement the "more efficient ways .. to obtain detailed statistics". |
The "more efficient ways" are probably as done by
(And more stats are available, e.g. The question would be how to pull this stuff in:
The other fields in the now-defunct "io" file are from https://github.com/openzfs/zfs/blob/master/lib/libspl/include/sys/kstat.h#L582 I don't know which of these are more or less useful to try to get. Sadly, my knowledge of Go is practically nonexistent, as is the time I have to put towards this. |
We don't allow running external commands. Technically using libzfs2 might be an option, but we'd try to avoid linking to c libraries. So ideally we can use the ioctl directly but dunno how involved that would be. |
Looks like it's not straight forward (you've gotta pull apart "nvlists" and ...stuff), but someone's started on it: https://github.com/lorenz/go-zfs |
There's another "go-zfs" project at https://github.com/mistifyio/go-zfs but that one appears to be a wrapper around the command line utilities so not suitable given "We don't allow running external commands". |
Huh. And another at https://github.com/bicomsystems/go-libzfs which links to the libzfs c library so less suitable. |
@discordianfish @chrisrd anything new on this? Encountered the same bug after upgrading openzfs to the latest release ... |
@discostur Nothing new from me - I'm not in a position to fix this. In the meantime I'm just monitoring one of the disks in my pool with:
This obviously doesn't provide actual pool-wide read/write stats, but it's enough to track how busy the pool is over time. |
@discostur No unfortunately not. Someone should look into using to get these stats from https://github.com/lorenz/go-zfs. I might get to that eventually but can't commit to when. |
FYI I (the go-zfs author) am working on getting a ZFS exporter up and running. Once I have something we can work on integrating it into node_exporter. |
I invite everyone here to give feedback on https://github.com/lorenz/zfs_exporter. As we get consensus on which metrics to expose, how and under what names we can start merging this back into node_exporter, but I think we can iterate this faster outside. |
Awesome! For the node-exporter we're trying to not break existing metrics before the next major release, so ideally in the version for the node-exporter the change shouldn't change the metrics generated for now. Or it could be a new collector that is disabled by default. Alternatively, I was also thinking about adding versioned metrics endpoint. But I guess that requires some further discussion which I should finally kick off. |
Hello was this resolved somehow, are there new metrics/collector available that exposes Thanks |
Just a note, since I'm not sure it was clear from the conversation so far, this is breaking the collection of all zpool metrics, since if the |
As a quick fix, would it be possible to make the exporter skip over the IO metrics if their paths are missing, to make the other metrics work again? 'state' especially would be useful, since zed's state alerts don't fire for some states if you're on zfs < 2.1.3. |
@calliecameron That sounds reasonable to me |
As a note, Ubuntu 22.04 has a version of ZFS that lacks the old 'io' kstat file for per-pool IO statistics. Anyone upgrading Ubuntu ZFS machines is no longer getting their ZFS stats through the node_exporter, perhaps unlike what they expect (the stats are certainly important to our operations, and we were looking forward to per-filesystem stats). |
This fixes an issue where the `node_zfs_zpool_*` metrics disappeared due to an upstream ZFS change[1]. With these metrics back, I can drop the monitoring scripts and switch to Alert Manager. [1]: prometheus/node_exporter#2068
I don't agree that #2451 has resolved this issue: it fixes the case that the missing zpool IO metrics would cause the rest of the (otherwise available) zpool metrics to be unavailable, but the original issue was specifically that the zpool IO metrics are no longer available (due to the 'io' kstat file being defunct). These metrics ARE available via a system call. The solution should be to use the system call to make the zpool IO metrics available once again. |
Unfortunately, the ZFS IO stats that are now available have changed significantly such that some of the metrics simply can't be generated any more; there is no drop-in replacement system call that could be used to get the traditional metrics (and no way to synthetically generate all of them). An illustration of what's available in the 'new' ZFS IO stats (and implicitly what's missing) can be found in eg siebenmann/zfs_exporter (which we currently use in production). There are enough metrics and enough choices involved that I'm far from confident that they should all be shoved into the node_exporter. (Some of the 'new' ZFS IO stats are actually present fairly far back, because they're the information 'zpool iostat' will present. The old pool-level IO stats were a separate system that OpenZFS eliminated for reasons; see openzfs/zfs#12212 ) |
Thanks @siebenmann, I'll certainly take a look at siebenmann/zfs_exporter. For my use-case, I don't specifically need "the same as before" IO stats, just some stats that allow monitoring the pool level IO, so your zfs_exporter should certainly do that. I still think this ticket should remain open until the 'new' IO pool level stats (such as available via |
@SuperQ / @discordianfish please consider reopening, hiding the errors does not solve the issue described here |
Host operating system:
OpenZFS as of commit openzfs/zfs@371f88d
node_exporter version
...through to master, current HEAD (3762191)
node_exporter command line flags
--collector.zfs
What did you do that produced an error?
Upgraded to OpenZFS HEAD openzfs/zfs@HEAD
What did you expect to see?
What did you see instead?
Node_exporter uses
/proc/spl/kstat/zfs/*/io
to collect these stats. This proc file has been removed as of OpenZFS openzfs/zfs@371f88d.See Also: openzfs/zfs#12212
The text was updated successfully, but these errors were encountered: