Software to automatically fill a Virtuoso DB with Europeana datasets and update it regularly
- Check that the configuration in the file `/src/main/resources/sparql-updater.user.properties` is present and correct
- Run `mvn clean install` to create the file `/target/sparql-updater.jar`. This file contains the code that automatically loads datasets from the Europeana FTP server and writes them to Virtuoso. It also checks regularly whether datasets were modified and, if so, updates Virtuoso.
- Run `docker build . -t europeana/sparql-updater-virtuoso` to create a Docker image containing both Virtuoso and the sparql-updater.jar
- Start everything using the file `docker-compose-development-environment`. The Virtuoso GUI is available via http://localhost:8890/
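The steps above can be strung together in a small shell script. The following is a minimal sketch and not part of the repository. It assumes that it is run from the repository root, that `mvn` and Docker with the `docker compose` plugin are on the PATH, and that the compose file is named `docker-compose-development-environment.yml`. The helper names `check_config` and `build_and_start` are hypothetical, and the `-d` flag (start in the background) is a choice made for this sketch.

```sh
#!/bin/sh
# Sketch only: helper names are illustrative, not part of the project.

CONFIG_FILE="src/main/resources/sparql-updater.user.properties"
COMPOSE_FILE="docker-compose-development-environment.yml"

# Succeeds if the given properties file exists and is not empty.
# It does not validate the values inside the file.
check_config() {
    if [ ! -s "$1" ]; then
        echo "Missing or empty configuration: $1" >&2
        return 1
    fi
}

# Check the configuration, build the jar, build the image, then start
# the containers. Stops at the first step that fails.
build_and_start() {
    check_config "$CONFIG_FILE" &&
    mvn clean install &&
    docker build . -t europeana/sparql-updater-virtuoso &&
    docker compose -f "$COMPOSE_FILE" up -d
}
```

Calling `build_and_start` at the end of the script runs all four steps. Drop `-d` to keep the container logs in the foreground.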
Some things to be aware of:
- Loading all Europeana datasets will require around 500GB of disk space!
- For local testing purposes we have a hard-coded password, see the `docker-compose-development-environment.yml` file
- After startup a folder named `/volumes` is created containing 3 folders:
  - `/database` containing the Virtuoso data files
  - `/ingest` containing the sparql-updater's log file and generated SQL files
  - `/ttl-import` containing files downloaded from the FTP server
- You can check which datasets are loaded using this SPARQL query:
SELECT DISTINCT ?g WHERE { GRAPH ?g {?s a ?o} }
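
The query can also be sent from the command line instead of the GUI. This is a minimal sketch, assuming that Virtuoso's SPARQL endpoint is exposed at its default path `/sparql` on the same port as the GUI and that it honours a `text/csv` Accept header. The function names `graphs_query` and `list_graphs` and the `SPARQL_ENDPOINT` variable are hypothetical.

```sh
#!/bin/sh
# Sketch only: assumes the SPARQL endpoint is reachable at /sparql.

SPARQL_ENDPOINT="${SPARQL_ENDPOINT:-http://localhost:8890/sparql}"

# The query from the list above: every graph that contains at least
# one typed resource.
graphs_query() {
    echo 'SELECT DISTINCT ?g WHERE { GRAPH ?g {?s a ?o} }'
}

# Send the query as a URL-encoded GET request and print the result
# as CSV. Returns non-zero if the HTTP request fails.
list_graphs() {
    curl --silent --fail --get \
        --header 'Accept: text/csv' \
        --data-urlencode "query=$(graphs_query)" \
        "$SPARQL_ENDPOINT"
}
```

Calling `list_graphs` once the container is up prints the loaded graphs. Set `SPARQL_ENDPOINT` to point at a different host or port.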
If you are making changes to the sparql-updater, don't forget to:
- Rebuild the jar
- Rebuild the Docker image
- Recreate the container
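
These three steps could be wrapped in one helper so none of them is skipped. The following is a sketch under assumptions: it is run from the repository root, the compose file is named `docker-compose-development-environment.yml`, and `docker compose up --force-recreate` is used as one way to recreate the container. The helper names `run` and `rebuild_all` and the `DRY_RUN` variable are hypothetical.

```sh
#!/bin/sh
# Sketch only: `run` and `rebuild_all` are illustrative helper names.

COMPOSE_FILE="docker-compose-development-environment.yml"

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Rebuild the jar and the image, then recreate the container so that
# it uses the new image. Stops at the first step that fails.
rebuild_all() {
    run mvn clean install &&
    run docker build . -t europeana/sparql-updater-virtuoso &&
    run docker compose -f "$COMPOSE_FILE" up -d --force-recreate
}
```

Setting `DRY_RUN=1` before calling `rebuild_all` only prints the three commands, which is a cheap way to review them before running the real build.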