The archive differ (Ardi) is an application that takes an archived bag in the BagIt format and compares its METS file against the METS file of another bag. If the preservation steps (outlined in the PREMIS metadata) differ between the two bags, it displays those differences to the user in a readable format.
Before installing, please ensure that you have Go installed on your machine. You can download it from the official Go Download Page.
You can install the binary quickly with:
go install github.com/Diogenesoftoronto/ardi@latest
Alternatively, to build from source, clone the repository to your local machine or download the source code as a zip:
git clone https://github.com/Diogenesoftoronto/ardi
Navigate into the project directory:
cd ardi
Install the necessary dependencies:
go mod tidy
Build the application:
go build
The first focus is to display the differences between the preservation events of two bags, along with the count of each event. It will also give users the output of the diff tag if asked for.
Other features may be added as time allows.
The archive differ is primarily a command line tool, although if it proves useful, its feature set may in future include a text user interface.
To use the application, run the following command with your METS file paths as arguments:
- This assumes you have added ardi to your PATH variable.
ardi path1/to/metsfile.xml path2/to/metsfile.xml
Ardi works with tar, zip, and 7z archives, but not with directories. You can give Ardi compressed files and it will find the METS file in each and compare them on its own. Ardi can run multiple diffs at the same time, but make sure you give it an even number of files; otherwise it won't be able to compare the odd one out.
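As a rough illustration of the behaviour described above, the sketch below pairs up arguments and looks for a METS entry inside a zip archive. It is not taken from Ardi's source. The helper names (`pairPaths`, `findMETSName`, `demoZip`) are hypothetical, and it assumes that files are compared in consecutive pairs, that an odd count is rejected outright, and that a METS file can be recognised by a name of the form `METS*.xml`.

```go
package main

import (
	"archive/zip"
	"bytes"
	"errors"
	"fmt"
	"path"
	"strings"
)

// pairPaths groups arguments into consecutive pairs (1st with 2nd, 3rd with
// 4th, and so on). Rejecting an odd count is a choice made for this sketch.
func pairPaths(args []string) ([][2]string, error) {
	if len(args) == 0 || len(args)%2 != 0 {
		return nil, errors.New("expected an even, non-zero number of paths")
	}
	pairs := make([][2]string, 0, len(args)/2)
	for i := 0; i < len(args); i += 2 {
		pairs = append(pairs, [2]string{args[i], args[i+1]})
	}
	return pairs, nil
}

// findMETSName returns the name of the first zip entry that looks like a METS
// file. The "METS*.xml" naming rule is an assumption, not Ardi's actual rule.
func findMETSName(zr *zip.Reader) (string, bool) {
	for _, f := range zr.File {
		base := strings.ToLower(path.Base(f.Name))
		if strings.HasPrefix(base, "mets") && strings.HasSuffix(base, ".xml") {
			return f.Name, true
		}
	}
	return "", false
}

// demoZip builds a tiny in-memory archive so the example runs without files.
func demoZip() (*zip.Reader, error) {
	var buf bytes.Buffer
	zw := zip.NewWriter(&buf)
	for _, name := range []string{"bag/bagit.txt", "bag/data/METS.example.xml"} {
		if _, err := zw.Create(name); err != nil {
			return nil, err
		}
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return zip.NewReader(bytes.NewReader(buf.Bytes()), int64(buf.Len()))
}

func main() {
	pairs, err := pairPaths([]string{"a.zip", "b.zip", "c.tar", "d.tar"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, p := range pairs {
		fmt.Printf("compare %s with %s\n", p[0], p[1])
	}

	zr, err := demoZip()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	if name, ok := findMETSName(zr); ok {
		fmt.Println("METS entry:", name)
	}
}
```

Only the zip case is shown because it is covered by the Go standard library; tar works similarly through `archive/tar`, while 7z would need a third-party package.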
To add ardi to your PATH in the bash shell, run:
export PATH=$PATH:/path/to/ardi/bin
Adding a directory to your PATH in the fish shell is even easier:
fish_add_path /path/to/ardi/bin
This tool exists to quickly test whether different digital archiving tools produce the same types of preservation metadata, without relying on error-prone and labour-intensive manual comparison.
The primary outputs it is meant to test are the Archival Information Packages produced by a3m and Archivematica.