This is a very simple, file-counter and mtime
-stats aggregator,
Linux program designed to just tell me how many pictures I still have
to process. It's an (expanded) port of a previous Python version,
mostly as a learning exercise, and is designed to work together with
the file structure as used by github.com/iustin/corydalis.
The program is intended to export metrics for Prometheus based on the count, age and distribution of files in a directory.
The directory structure is based just on the first-level directories; anything
deeper will be aggregated to the first-level directory. So for example
the output of this tree
command:
$ tree
.
├── 2023-11-01 Long trip
│ ├── Day 1
│ │ └── b.jpg
│ └── Day 2
│ └── c.jpg
└── 2023-11-19 Some pictures
├── a.jpg
├── b.nef
└── b.xmp
5 directories, 5 files
Will be exported as:
- 4 total files pending; the
xmp
file is ignored by default, see in the usage section; - 2 directories (
2023-11-01 Long trip
and2023-11-19 Some pictures
); - for each directory, an aggregated "age" will be computed (sum of ages, relative to the current time);
- and an overall histogram with pending file ages will be exported;
The program currently exports counters for four different error classes:
- scan errors: some directories cannot be scanned due to permissions;
- ownership errors: known file type user or group doesn't match the passed owner/group;
- permission errors: known file type or directory permissions doesn't match the configured ones;
- unknown errors: file extension is of unknown type; this is in order to make sure that all files are either categorized or ignored;
Suggestions for more (or less) checks are welcome.
I always lag behind photo processing. When I do process, I like to see the counters going down, kind of like a gamification of the processing. And with Prometheus and Grafana, the graphs are more or less real-time indeed; the cost for scanning a ~5K directory tree are around 15ms, which allows me to poll this on a normal Prometheus poll.
See the examples/grafana-dashboard.json
file as an example
dashboard.
What is missing from the dashboard is the alert definition (Grafana
doesn't export both). I have a simple alert on the
photo_backlog_errors
metrics, condition '> 0 for 4 hours'. In the
future I plan to add permission/ownership checks as well.
Run the usual cargo build -r
. Copy the resulting binary (from
target/release
) somewhere, then run it. You have two options:
- run the
photo-backlog-exporter
binary as a daemon - run the
oneshot
binary as a text exporter, and save its output somewhere so that the commonnode-exporter
can pick it up.
Since this doesn't need any special rights, just to be able to look at
directories and files, I run the daemon as a dynamic systemd user, just
with supplemental groups my photos group. You can find an example
systemd unit in examples/systemd.server
. This is accompanied by the
"defaults" file examples/prometheus-photo-backlog-exporter.defaults
(see the service file, move the defaults where it is appropriate).
Note that the binary expects at least the path to the root of the "incoming" photos directory, i.e. a minimal invocation is:
photo-backlog-exporter -P /my/incoming/photos/directory
The full list of arguments is:
$ photo-backlog-exporter --help
Optional arguments:
-h, --help print help message
-p, --port PORT port to listen on (default: 8813)
-l, --listen LISTEN address to listen on (default: ::)
-P, --path PATH path to root of incoming photo directory
-i, --ignored-exts IGNORED-EXTS
ignored file extension (default: xmp,lua,DS_Store)
-r, --raw-exts RAW-EXTS raw or other files that should not be editable (default: nef,cr2,arw,orf,raf)
-e, --editable-exts EDITABLE-EXTS
editable files, e.g. jpg, png, tif (default: jpg,jpeg,heic,heif,mov,mp4,avi,gpr,dng,png,tif,tiff,3gp,pano)
-a, --age-buckets AGE-BUCKETS
Photos age histogram buckets, in weeks (default: 1,2,3,4,5,7,10,13,17,20,26,30,35,52,104)
-o, --owner OWNER Optional owner expected for all files
-g, --group GROUP Optional group expected for all files
-d, --dir-mode DIR-MODE Optional numeric mode (permissions) expected for directories, e.g 750
-R, --raw-file-mode RAW-FILE-MODE
Optional numeric mode (permissions) expected for non-editable files, e.g. 640
-E, --editable-file-mode EDITABLE-FILE-MODE
Optional numeric mode (permissions) expected for editable files, e.g. 660
I hope they are self-explanatory. Well, maybe the --ignored-exts
:
these are extensions for which files should be completely
ignored. For example, the xmp
extension should normally be ignored
because Lightroom can store metadata (if so configured) outside of the
catalog and in xmp
files for each proprietary RAW format. (Yes, this
also means that the mtime-counting doesn't work well for jpeg
files,
for example. Sorry - if you have ideas, file a bug!)
The file permissions are split in two categories:
- raw files, which in general should not be edited, at least not for proprietary RAW files, like Nikon's NEF and Canon's CRW, as opposed to DNG which is editable;
- and editable files, which is everything else - DNG, TIFF, JPEG, etc.
The difference arises that ideally, the processing tool shouldn't be given write
permissions to raw files. If you do use Lightroom to store edits into RAW files
(in my opinion, not a good idea), then don't pass -R
and override the -r
options.
Note that the binary uses the env_logger
rust package, and thus
logging can be configured via the usual RUST_LOG=info
and similar
environment variables. This is why the example systemd service file
mentioned uses an env file, to allow easy passing of both arguments
but also (in this case) RUST_LOG
.
I'm not a Rust programmer, just having fun learning a new language, so please do point out mistakes, obvious or not. Thanks!