Jakob Voß edited this page Jun 4, 2024 · 14 revisions

Install the application

Prerequisites

You need Docker Engine installed. If you don't have it, check its installation guide at https://docs.docker.com/engine/install/.

You also need the docker-compose.yml and docker/qa-catalogue files. There are two ways to get them:

Either get the whole repository by cloning it:

git clone https://github.com/pkiraly/qa-catalogue.git
cd qa-catalogue

or by downloading it as a zip archive:

wget https://github.com/pkiraly/qa-catalogue/archive/refs/heads/main.zip
unzip main
cd qa-catalogue-main

Or download only the necessary files:

wget https://raw.githubusercontent.com/pkiraly/qa-catalogue/main/docker-compose.yml
mkdir docker
cd docker
wget https://raw.githubusercontent.com/pkiraly/qa-catalogue/main/docker/qa-catalogue
chmod +x qa-catalogue
cd ..
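
Whichever way you choose, the current directory should end up containing docker-compose.yml and docker/qa-catalogue. As a quick sanity check, the following sketch lists what is still missing. The function name missing_prerequisites is only illustrative and is not part of QA catalogue.

```shell
# Print the required files (as named in the text above) that are
# missing below the given directory. Prints nothing if all are present.
missing_prerequisites() {
  dir="${1:-.}"
  for f in docker-compose.yml docker/qa-catalogue; do
    [ -f "$dir/$f" ] || echo "$f"
  done
}

# Check the current directory.
missing_prerequisites .

# Docker itself has to be available on the PATH as well.
command -v docker >/dev/null 2>&1 || echo "docker not found"
```

If the snippet prints nothing, the prerequisites described here are in place.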

Installation

Docker has two main concepts: an image is a packaged version of the application (think of it as an installation package), and a container is a running instance of that application. QA catalogue's container contains a full Ubuntu operating system and all the components the application needs: Java, PHP, R, Apache web server, Apache Solr, SQLite, the application itself and a default configuration.

The following process will download the image and create a container with default values:

docker compose up -d

If you would like to modify the configuration, you have three options: a) global environment variables, b) local environment variables, or c) storing the variables in a .env file.

a) global environment variables

export WEBPORT=9000
export CONTAINER=qa-catalogue
docker compose up -d

b) local environment variables

WEBPORT=9000 CONTAINER=qa-catalogue docker compose up -d

c) storing variables in a .env file

The file should be named .env and placed next to docker-compose.yml. Here is a sample .env file:

IMAGE="ghcr.io/pkiraly/qa-catalogue:main"
WEBPORT=9000
CONTAINER=qa-catalogue

Once you have saved it, you can run:

docker compose up -d

It is also possible to explicitly reference a .env file with option --env-file.
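
As a minimal sketch of this option: the snippet below writes an alternative environment file and shows, in a comment, how it could be passed to Docker Compose. The file name staging.env, the values, and the helper write_env_file are placeholders chosen for illustration.

```shell
# Write an env file with the two variables used in the examples above.
# Arguments: file name, web port, container name.
write_env_file() {
  file="$1"; port="$2"; container="$3"
  {
    echo "WEBPORT=$port"
    echo "CONTAINER=$container"
  } > "$file"
}

write_env_file staging.env 9001 qa-catalogue-staging
cat staging.env

# Start the container with the explicitly referenced file:
#   docker compose --env-file staging.env up -d
```

Keeping several such files side by side makes it easy to switch between setups without editing .env.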

The WEBCONFIG variable holds the name of a directory containing a configuration.cnf file, which is used by the web application.

A sample web-config/configuration.cnf file:

default-tab=completeness
label=My custom Catalogue
url=https://my-catalogue.org
linkTemplate=https://my-catalogue.org/catalogue/{id}
language=de

See the QA catalogue UI documentation for the available configuration parameters.

The properties that describe the library are label, url, schema, language and linkTemplate. The rest configure the behaviour of the application.
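
The following sketch creates such a configuration directory with the sample values shown above. The helper name create_web_config is hypothetical, and passing WEBCONFIG on the command line like the other variables is an assumption based on the options described earlier.

```shell
# Create a directory with a configuration.cnf for the web application.
# The values are the sample values from the text; adjust them to your library.
create_web_config() {
  dir="$1"
  mkdir -p "$dir"
  cat > "$dir/configuration.cnf" <<'EOF'
default-tab=completeness
label=My custom Catalogue
url=https://my-catalogue.org
linkTemplate=https://my-catalogue.org/catalogue/{id}
language=de
EOF
}

create_web_config web-config

# Assumption: WEBCONFIG is set in the same way as WEBPORT and CONTAINER, e.g.
#   WEBCONFIG=web-config docker compose up -d
```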

The INPUT variable holds the directory where the bibliographic files are located. It should be inside your current directory, but it may be a linked directory. The default value is ./input. In the following we assume that you have a file ./input/rug01.backup.gz that contains bibliographic records in gzipped alephseq format, with some MARC data elements defined locally at Ghent University Library.

At the end of the process you will have the image and a running container. You can check these with the docker images -a and docker ps -a commands.

Show results

You can reach the web interface (qa-catalogue-web) at http://localhost:80/ (or at another port, as configured with the environment variable WEBPORT).
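
To avoid guessing the port, the URL can be derived from WEBPORT. This is a small sketch that assumes port 80 when WEBPORT is unset, as in the address above. The function name web_url is illustrative.

```shell
# Build the URL of the web interface from WEBPORT (assumed default: 80).
web_url() {
  echo "http://localhost:${WEBPORT:-80}/"
}

echo "QA catalogue should be reachable at $(web_url)"

# With curl installed, print only the HTTP status code of the start page:
#   curl -s -o /dev/null -w '%{http_code}\n' "$(web_url)"
```

Note that this only sees WEBPORT if it is set in the current shell. A value stored in .env is read by Docker Compose, not by the shell.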

Updates

Stop the running container and update the image, for example:

docker pull ghcr.io/pkiraly/qa-catalogue:main

Then start a new container as described above.
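
Put together, an update could look like the following sketch. It assumes the container was started with docker compose from the current directory and uses docker compose down to stop and remove it. The run helper and the DRY_RUN variable are illustrative additions that let you preview the commands before executing them.

```shell
# Print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

update_qa_catalogue() {
  # Same default image name as in the sample .env file.
  image="${IMAGE:-ghcr.io/pkiraly/qa-catalogue:main}"
  run docker compose down      # stop and remove the running container
  run docker pull "$image"     # fetch the current version of the image
  run docker compose up -d     # start a new container from the updated image
}

# Preview only; call update_qa_catalogue without DRY_RUN=1 to execute.
DRY_RUN=1 update_qa_catalogue
```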

Running analyses

Once we have the running container, we can run the analyses.

./docker/qa-catalogue \
  --params "--marcVersion GENT --alephseq" \
  --input-dir "" \
  --mask "rug01.backup.gz" \
  --catalogue gent \
  completeness

The script reads a single Docker-related environment variable: CONTAINER. If you set it when starting the container with docker compose, set it to the same value when you run the script.

The command runs the completeness analysis on our input file. --params contains the catalogue-specific parameters; here there are two: marcVersion specifies the locally defined data elements, and alephseq specifies the serialization format. The empty --input-dir means that there is no extra subdirectory within the input directory on the host (the local machine), which is mapped to /opt/qa-catalogue/marc/ within the container. --mask is a file name pattern; if you have multiple files, you can use shell wildcard characters such as *. The last argument, completeness, is the name of the analysis to run.
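
Before starting a long-running analysis it can help to check on the host which files a mask would select. The sketch below assumes that the mask behaves like an ordinary shell glob relative to the input directory. The function preview_mask and the pattern rug01.*.gz are illustrative only.

```shell
# List the files below an input directory that match a mask.
# Assumption: the mask is expanded like a shell glob.
preview_mask() {
  input="${1:-./input}"
  mask="$2"
  for f in "$input"/$mask; do
    # Skip the unexpanded pattern when nothing matches.
    [ -e "$f" ] && basename "$f"
  done
  return 0
}

preview_mask ./input "rug01.*.gz"
```

If the list matches your expectation, the same pattern can be passed to the script as --mask "rug01.*.gz". Keep the quotes so that the local shell does not expand it.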
