This is FABS, the "Flexible AFS Backup System", a set of tools and daemons to help backup AFS cells. This file serves as a kind of "quick start" for setting up fabs, as well as a general overview of what it is. It does not document all of the functionality in fabs; see our manpages for more detailed info.
A few prebuilt packages are available for each release on GitHub: https://github.com/openafs-contrib/fabs/releases/
For RHEL/CentOS and Fedora, you can also use the openafs-contrib yum repo provided by Sine Nomine Associates:
$ yum install https://download.sinenomine.net/openafs/contrib/sna-openafs-contrib-release-latest.noarch.rpm
See https://download.sinenomine.net/openafs/contrib/ for info about the repo.
Note that the RPM packaging does not declare dependencies for the following:
- OpenAFS itself
- dumpscan
This just makes it easier to install fabs when those are not installed via RPM, which is pretty common. You must make sure yourself that OpenAFS and dumpscan are built and installed somewhere. You can get dumpscan from the OpenAFS source tree, or you can build a standalone dumpscan from here: https://github.com/openafs-contrib/cmu-dumpscan.
To install using pip instead, run:
$ pip3 install --upgrade fabs
To manually install from a git checkout, run something like the following:
$ python3 setup.py build
$ python3 setup.py install --skip-build
However, by default, that will use paths in /opt/fabs. For more traditional paths, you can specify a few path variables like so:
$ export PREFIX=/usr
$ export LOCALSTATEDIR=/var
$ export SYSCONFDIR=/etc
$ python3 setup.py build && python3 setup.py install --skip-build
This will only install the fabs libraries and commands, though. There are some additional man pages and other documentation in the doc dir. Ideally, just use the RPM or other packaging to install fabs.
Example Debian packaging exists in the debian dir. Run the normal debuild command or similar to build a Debian package.
RPM packaging is in the rpm dir. To see a list of targets to build, run:
$ ./rpm/rules help
After installing fabs, you must initialize a couple of things (the SQL db, and our dump blob storage). But before you can initialize those, you should check the configuration to see if the db and dump blob storage are configured for where you want them to go.
To change the configuration, edit or add new files in /etc/fabs/fabs.yaml.d/. To see what the built-in defaults are, run:
$ fabsys config --dump-all
Which can give you an idea for the configuration file format, and what options are available. For actual documentation on the configuration directives, see the manpage for fabsys_config(1).
To check the db connection url, run this:
$ fabsys config db/url
That will give you the currently-configured url fabs will use to connect to the database. The built-in default is a sqlite database, which is fine unless you want to use some other external database. (See the section about the db/url directive in fabsys_config(1) for more information about other database types.)
To get the relevant SQL commands to initialize the db (create tables and such), run this:
$ fabsys db-init --sql
Of course, if you want to save that SQL to a file, just redirect the output:
$ fabsys db-init --sql > /tmp/fabs-db.sql
Or, if you want fabs to run that SQL itself, and cause the tables to be created, you can do this:
$ fabsys db-init --exec
Of course, in order to run that, you must have rights (via the db/url connection url) to access the database and create tables, etc. See the manpage for fabsys_config(1) for more information (look for the section on db/url).
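The steps above can be combined into a small review-then-apply helper. This is only a sketch: the function name and the default output path are illustrative, and it relies only on the fabsys config db/url, fabsys db-init --sql, and fabsys db-init --exec invocations shown above.

```shell
#!/bin/sh
# Sketch: save the schema SQL for review, then let fabs create the tables.
# init_fabs_db and the default output path are illustrative, not part of fabs.
init_fabs_db() {
    sql_file="${1:-/tmp/fabs-db.sql}"

    # Show which database fabs is configured to use before touching it.
    echo "db/url: $(fabsys config db/url)"

    # Save the SQL so it can be reviewed (or applied by a DBA instead).
    fabsys db-init --sql > "$sql_file" || return 1
    echo "Schema SQL written to $sql_file"

    # Create the tables using the rights granted by the db/url connection.
    fabsys db-init --exec
}
```

If someone else manages the database, stop after the --sql step and hand them the saved file instead of running --exec.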
To check what directories fabs will use for storing volume blobs, run this:
$ fabsys config dump/storage_dirs
That will tell you what fabs will use as a directory to store volume blobs. Before fabs will be willing to use that directory, though, you must initialize it, by running this command:
$ fabsys storage-init --all
That will create a few things in that directory to indicate that it is valid for use as a fabs blob storage directory.
In order for fabs to be able to dump volumes and do other privileged operations with AFS, it needs some credentials to do so. These can be provided by a krb5 keytab.
You can also use -localauth for authenticated AFS commands, but this is not recommended for production use, since not all operations support -localauth. To use this (perhaps for initial testing or debugging), enable the configuration option afs/localauth.
For non-localauth mode, you need a krb5 keytab to authenticate to AFS. By default, fabs looks for this keytab in /etc/fabs/afsadmin.keytab, but you can specify a different path with the config option afs/keytab.
You must be able to authenticate to AFS using this keytab. Here is an example of how you can do so manually:
$ k5start -t -f /etc/fabs/afsadmin.keytab -U -- vos examine root.cell
That command will examine the root.cell volume in your cell after authenticating to AFS using the afsadmin.keytab file via k5start. If you see any errors or warnings, something is probably wrong.
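The manual check above can be wrapped in a small helper that is convenient to run before the first backup. This is a sketch only; the function name is made up, and it runs exactly the k5start command shown above.

```shell
#!/bin/sh
# Sketch: verify that the fabs keytab can authenticate to AFS.
# check_fabs_keytab is an illustrative name, not a fabs command.
check_fabs_keytab() {
    keytab="${1:-/etc/fabs/afsadmin.keytab}"

    if [ ! -r "$keytab" ]; then
        echo "keytab $keytab is missing or unreadable" >&2
        return 1
    fi

    # Same check as above: get tokens from the keytab, then run a vos command.
    if k5start -t -f "$keytab" -U -- vos examine root.cell >/dev/null; then
        echo "keytab $keytab OK"
    else
        echo "could not authenticate to AFS with $keytab" >&2
        return 1
    fi
}
```

Pass a different path as the first argument if you have changed afs/keytab.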
FABS also makes use of some AFS fs commands, so make sure you have an AFS client running and operational.
fabs has a single server daemon that is used for orchestrating backup runs, as well as checking for errors, sending reports, etc. You can run it manually like so:
$ fabsys server
And it will run in the foreground until it receives a SIGINT or SIGTERM, logging to the local syslog "daemon" facility.
To run it with debugging turned on, you can run it like so:
$ fabsys server -x log/level=debug
But be warned that that generates a LOT of data when backup runs are actually running.
We do not provide an init script for fabsys server. Instead, it is intended that you just run it under bosserver. See OpenAFS documentation for how to run a command under bosserver.
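As a rough illustration of running the server under bosserver, the sketch below registers it as a "simple" bnode with bos create. The instance name "fabs", the /usr/bin/fabsys path, and the use of -localauth (which requires running as root on the server machine) are all assumptions; check the OpenAFS bos documentation for what fits your site.

```shell
#!/bin/sh
# Sketch: register "fabsys server" as a simple bnode under bosserver.
# The instance name "fabs" and the /usr/bin/fabsys path are assumptions;
# adjust them (and the server name) for your site.
register_fabs_bnode() {
    server="${1:-localhost}"
    fabsys_path="${2:-/usr/bin/fabsys}"

    bos create -server "$server" -instance fabs -type simple \
        -cmd "$fabsys_path server" -localauth
}
```

Since the server does not re-read its configuration, restarting this bnode is also how configuration changes would be picked up in such a setup.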
Note that the server process currently has no way to re-read configuration while running. You must restart the server process for it to notice any configuration changes.
To run a backup, run this:
$ fabsys backup-start --all
However, by default, fabs will not backup any volumes. You have to give it a pattern of volumes to match in the dump/include/volumes configuration option (to backup by volume name), or via dump/include/fileservers (to backup by fileserver). For example, specify a pattern of app.* to backup all volumes that begin with "app.".
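To make the dump/include/volumes example concrete, the sketch below writes a drop-in file under /etc/fabs/fabs.yaml.d/. The file name and the nested YAML layout (each "/" in the option name becoming a level of nesting, with the patterns given as a list) are assumptions, not taken from the fabs documentation; compare against the output of fabsys config --dump-all and fabsys_config(1) before relying on it.

```shell
#!/bin/sh
# Sketch: write a drop-in config file that selects volumes by name.
# The file name and the nested YAML layout are assumptions; compare the
# result against "fabsys config --dump-all" before relying on it.
write_include_config() {
    conf_dir="${1:-/etc/fabs/fabs.yaml.d}"

    cat > "$conf_dir/50-include.yaml" <<'EOF'
dump:
  include:
    volumes:
      - "app.*"
EOF
}
```

Afterwards, the effective value can be checked the same way as any other option:
$ fabsys config dump/include/volumes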
To backup a specific volume, run this:
$ fabsys backup-start --volume app.foo
Which will only backup the volume app.foo.
It is also useful to be able to give a "note" to a backup run, like so:
$ fabsys backup-start --all --note "Daily backup for Fri Jun 19"
Or something like that. That note will be associated with the backup run in the database, and will be printed by status reporting tools and the like.
Also note that fabsys backup-start just schedules/starts the backup. The process that actually runs the backup is the fabsys server process. The fabsys server process also only scans for scheduled backups every minute or so, so you will need to wait up to a minute for the dumps to actually start running.
There are also more options for including/excluding volumes and fileservers from backups. See the documentation for the dump/include/*, dump/exclude/*, and dump/filter_cmd/* options in fabsys_config(1).
Note that, at least currently, there is no mechanism for scheduling backups at certain times. To run a backup every day or every week, etc, just run fabsys backup-start via cron.
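A cron job for this can be a very small wrapper script. The following is a sketch, with an illustrative function name; it generates a dated note in the same style as the example above.

```shell
#!/bin/sh
# Sketch: wrapper intended to be run from cron to start a daily backup.
# The function name is illustrative; the note text follows the example above.
start_daily_backup() {
    # Produces a note such as "Daily backup for Fri Jun 19".
    note="Daily backup for $(date '+%a %b %d')"
    fabsys backup-start --all --note "$note"
}
```

Saved as a script that calls start_daily_backup on its last line, this could be run from a crontab entry such as "0 1 * * * /usr/local/sbin/fabs-daily-backup" (the path and schedule are placeholders). Remember that the fabsys server process must be running for the scheduled backup to actually start.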
To view the status of a backup run while it is running, you can run this:
$ fabsys backup-status
Which shows some information about a backup run that is currently running. It also shows information about the individual dumps that are spawned by the backup run, but depending on the state, each dump is either just included in a count (e.g. "56 job(s) in state CLONE_PENDING"), or details about the job are actually shown (such as, when the vos dump command is actually running).
By default this just shows some human-readable plain text. This information is also available as machine-readable JSON by running this:
$ fabsys backup-status --format json
Which can be used to format the status output differently, or can be used for more automated monitoring/alerts, or something similar.
fabs can generate reports when a backup run finishes, providing some information about failures, which volumes were unchanged, and other information. By default no report is generated, but if you specify a command in the config options report/txt/command or report/json/command, that command will be run with the report data on stdin when a backup run finishes.
The report/txt/command command is given a plain text human-readable report. The report/json/command command is given JSON-formatted data instead, so you can format your own reports. See the fabsys_config(1) manpage on those directives for more information.
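As an illustration of what a report/txt/command could look like, the sketch below forwards the report it receives on stdin by email. The function name and the recipient address are placeholders, and it assumes a mail(1) command that accepts -s for the subject; how the command is referenced from the configuration is described in fabsys_config(1).

```shell
#!/bin/sh
# Sketch: a command suitable for report/txt/command. fabs supplies the
# plain-text report on stdin; this forwards it by email.
# The recipient address is a placeholder.
send_fabs_report() {
    recipient="${1:-backup-admins@example.com}"
    subject="fabs backup report for $(date '+%Y-%m-%d')"

    # mail(1) reads the message body from stdin, which is the report itself.
    mail -s "$subject" "$recipient"
}
```

A report/json/command handler would follow the same pattern, but would parse the JSON on stdin instead of forwarding it verbatim.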
The format of the volume dump blob storage directory is fairly simple. There are a few levels of directories just to reduce the number of directory entries per directory, but at the lowest level, there is just one dump blob per volume. Right now, the filenames are all formatted as such:
<volid>.<fabs_internal_id>.dump
So, for example:
536870915.324.dump
So, whenever a volume has changed and is dumped again, it gets a different (but very similar) filename.
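If you need to pull the two ids out of such a filename in a script, POSIX parameter expansion is enough. This is a sketch that assumes the current <volid>.<fabs_internal_id>.dump naming described above, which may change in later versions.

```shell
#!/bin/sh
# Sketch: split a dump blob filename of the form <volid>.<fabs_internal_id>.dump
# into its two ids using POSIX parameter expansion.
parse_dump_name() {
    name=$(basename "$1" .dump)   # e.g. 536870915.324
    volid=${name%%.*}             # text before the first dot
    dump_id=${name#*.}            # text after the first dot
    echo "$volid $dump_id"
}

parse_dump_name 536870915.324.dump   # prints "536870915 324"
```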
Currently, the general scheme in mind for limiting space in the blob storage directory is to keep only the most recent copy of a volume around, and delete all other copies from disk (but keeping them around on tape). The way that you can achieve this is to periodically run fabsys storage-trim, which will give you a list of path names that are redundant (that is, we have another dump file that is the most recent for that volume).
The idea is that you run fabsys storage-trim, and examine each printed filename to see if it has been backed up to tape or other long-term storage. If it has, then you can just delete the file from disk. If not, just skip the file, or you can back it up to tape immediately.
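The trimming procedure described above could be scripted roughly as follows. This sketch assumes that fabsys storage-trim prints one path per line, and is_archived is a hypothetical placeholder for whatever check your tape or long-term storage system provides.

```shell
#!/bin/sh
# Sketch: delete redundant dump blobs that are already on long-term storage.
# Assumes "fabsys storage-trim" prints one path per line. is_archived is a
# hypothetical site-specific check; replace it with a query of your tape system.
is_archived() {
    # Placeholder: treat nothing as archived until a real check is supplied.
    return 1
}

trim_archived_dumps() {
    fabsys storage-trim | while IFS= read -r path; do
        if is_archived "$path"; then
            rm -f "$path" && echo "removed $path"
        else
            echo "skipped $path (not archived yet)"
        fi
    done
}
```

With the placeholder is_archived left in place, nothing is ever deleted, which makes the sketch safe to try as a dry run.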
In the event of losing everything on the fabs backup server, it's still possible to restore volume data relatively intuitively. We store volume blobs as described in the previous section, so manually looking for dump blobs for a specific volume id is straightforward, and those blobs can be passed to vos restore like any volume dump blob.
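For example, a manual search for the blobs of one volume id could look like the sketch below. The function name is made up, and the default directory is the built-in default storage dir; pass your own storage directory as the second argument if you changed it.

```shell
#!/bin/sh
# Sketch: locate dump blobs for a volume id under a storage directory.
# The default directory is the built-in default storage location.
find_dump_blobs() {
    volid="$1"
    storage_dir="${2:-/var/lib/fabs/fabs-dumps}"

    # Blobs are named <volid>.<fabs_internal_id>.dump, somewhere below the
    # storage directory.
    find "$storage_dir" -type f -name "${volid}.*.dump"
}
```

A blob found this way can then be restored with something like the following, where the server, partition, volume name, and file path are placeholders:
$ vos restore -server fs1.example.com -partition a -name app.foo.restored -file /path/to/536870915.324.dump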
You may want to backup all fabs-related data, though, so in the event of data loss on the fabs backup server, you can easily restore all of the fabs-related data so all path data and other information is just as it was before. You can of course ensure that you've backed up everything needed for running fabs by backing up everything on the machine, but you may want to know what specific files are important to fabs.
All of the fabs-related data that is currently worth backing up is as follows:
- The fabs configuration, typically in /etc/fabs.
- The fabs database. Where this is located can be found by running fabsys config db/url. By default, our database is a sqlite database in /var/lib/fabs/fabs.sqlite.
- The volume dump blob storage. The locations for this can be found by running fabsys config dump/storage_dirs. By default, there is only one dir, which is /var/lib/fabs/fabs-dumps.
So, by default, you can backup the whole fabs system by backing up all files inside /etc/fabs and /var/lib/fabs. Of course, this may change if you change the paths in the fabs configuration, and future versions of fabs may add data in other locations. In general, though, you should be able to always backup everything needed for fabs by backing up two things:
- The /etc/fabs directory.
- All locations referenced by the configuration. You can see the entire configuration by running fabsys config --dump-all.
How to actually backup this data is up to you; fabs does not currently have a mechanism for backing up itself.
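One simple approach, assuming the default paths, is to archive the two directories with tar. The sketch below is only an illustration: the function name and archive path are arbitrary, and copying the sqlite database while the fabsys server process is writing to it may produce an inconsistent copy, so it is probably best run while the server is stopped or idle.

```shell
#!/bin/sh
# Sketch: archive the default fabs locations into one tarball.
# The paths are the defaults described above; the archive name is arbitrary.
archive_fabs_data() {
    archive="$1"
    shift

    # Default to the built-in locations when no directories are given.
    [ "$#" -gt 0 ] || set -- /etc/fabs /var/lib/fabs

    tar -czf "$archive" "$@"
}
```

For example:
$ archive_fabs_data /backup/fabs-data.tar.gz

Note that /var/lib/fabs includes the dump blob storage by default, so the archive can be very large.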
To restore a volume, first you must find which backup of it you want to restore. You can search for backups of a volume using the fabsys dump-find command, which can find backup dumps by volume name or by path. For example:
$ fabsys dump-find --volume user.adeason --near 1438491600 --admin
That will search for backups of the volume user.adeason around the timestamp 1438491600 (unix time). The --admin flag indicates this is being performed by an administrator, and so no access checks will be performed.
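Since --near takes a unix timestamp, it can be handy to compute one from a readable date. The sketch below relies on the -d option of GNU date, which is an assumption about your platform; other date(1) implementations use different options.

```shell
#!/bin/sh
# Sketch: convert a UTC date string to the unix timestamp expected by --near.
# Relies on GNU date's -d option, which is not available in every date(1).
to_epoch() {
    date -u -d "$1" '+%s'
}

to_epoch '2015-08-02 05:00:00'   # prints 1438491600
```

The result can be substituted directly into the search:
$ fabsys dump-find --volume user.adeason --near "$(to_epoch '2015-08-02 05:00:00')" --admin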
Searching by path looks like this:
$ fabsys dump-find --path /afs/cell/user/adeason --near 1438491600 --admin
To search on behalf of a user (typically done via some scripting frontend or web interface), you can run it like this:
$ fabsys dump-find --path /afs/cell/user/adeason --near 1438491600 --user adeason
Which checks that adeason has permission to restore that volume. That authorization check is done by checking if adeason can write to the root directory of the volume (specifically, if they have rlidw rights).
The output of those commands shows some info for each dump found, including the volume dump id. That volume dump id is what you use to actually restore the volume, with the fabsys restore-start command:
$ fabsys restore-start --dump-id 6 --admin
Again, --admin bypasses authorization checks (and --user <user> says what user to check authorization for).
To view status on a running restore, you can run:
$ fabsys restore-status
Or to see restore requests from a specific user:
$ fabsys restore-status --user adeason
When the data has been restored to a staging location, that location will be shown in the fabsys restore-status output, and the /etc/fabs/hooks/stage-notify script will be run. That script can, for example, send an email to the user to let them know a restore has finished.
After a certain amount of time (configurable via stage/lifetime, defaults to 1 week), the staging data will be removed, and the restore request will be marked as done.
Backup runs and restore requests can be forcibly killed with the two commands:
$ fabsys backup-kill <brun_id>
$ fabsys restore-kill <req_id>
This will cause the relevant job to be marked as "failed", just as if it encountered an error multiple times and gave up. Pass the --note option to provide a reason as to why the job was killed (the note is free-form text, stored in the database with the relevant job).
Later on, you can cause a failed job to be retried from the last known good point by running:
$ fabsys backup-retry <brun_id>
$ fabsys restore-retry <req_id>
That can be done for jobs that have failed either because they were killed, or because they failed too many times. Of course, it is recommended that you fix whatever problem was causing the job to fail, before issuing that command.
If you are upgrading from a previous version of fabs where the database schema was different, you will need to upgrade the database with the fabsys db-upgrade command. This command can tell you if the db needs to be upgraded:
$ fabsys db-upgrade
Database MUST be upgraded (version 3 -> 4)
See the documentation for the --exec and --sql options to perform the
upgrade. Please make a backup copy of the database before upgrading!
Or if the db is already at the current version:
$ fabsys db-upgrade
Database already at latest version 4
To perform the upgrade, make a backup of the database, and then run fabsys db-upgrade --exec or fabsys db-upgrade --sql. See fabsys_db-upgrade(1) for details.
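For the default sqlite setup, the backup-then-upgrade sequence might be scripted as in the sketch below. The function name and the .bak naming are arbitrary, and the file copy only makes sense when db/url points at the default sqlite file; other databases need their own backup tooling. It is probably safest to run this while the fabsys server process is stopped.

```shell
#!/bin/sh
# Sketch: copy the default sqlite database aside, then upgrade the schema.
# Only applies when db/url points at the default sqlite file; the .bak
# naming is arbitrary.
upgrade_fabs_db() {
    db_file="${1:-/var/lib/fabs/fabs.sqlite}"
    backup="$db_file.$(date '+%Y%m%d%H%M%S').bak"

    # Refuse to upgrade unless the backup copy succeeded.
    cp -p "$db_file" "$backup" || return 1
    echo "backup saved to $backup"

    fabsys db-upgrade --exec
}
```

Running fabsys db-upgrade with no options first, as shown above, tells you whether the upgrade is needed at all.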