Install

AtesComp edited this page May 26, 2022 · 21 revisions

Prerequisites

You need to have Java and OpenRefine installed on your machine.

  • Java 11 to 17 (see notes below)
  • OpenRefine 3.5.x to 3.6-SNAPSHOT

NOTE: The author has tested RDF Transform using OpenJDK 11 and OpenJDK 17.
NOTE: For Java Standard Edition releases after Java 8, you cannot install the JRE separately from the JDK unless you use a site like JustJ and its JRE Downloads. RDF Transform has not been tested with JustJ installs, and supporting them is beyond the scope of this project.
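
To check whether the installed Java falls in the supported range, something like the following sketch may help. The function names java_major and java_supported are made up for this illustration, and the parsing assumes the usual `java -version` banner with the version in double quotes.

```shell
#!/bin/sh
# Sketch: check that the active Java is within the 11 to 17 range.

# Print the major version for a version string.
# Handles the legacy "1.8.0_292" scheme and the modern "11.0.15" scheme.
java_major() {
    # $1: a version string such as 17.0.3 or 1.8.0_292
    case "$1" in
        1.*) echo "$1" | cut -d. -f2 ;;
        *)   echo "$1" | cut -d. -f1 | sed 's/[^0-9].*//' ;;
    esac
}

# Succeed when the major version is between 11 and 17 inclusive.
java_supported() {
    major=$(java_major "$1")
    [ "$major" -ge 11 ] && [ "$major" -le 17 ]
}

if command -v java >/dev/null 2>&1; then
    # `java -version` writes to stderr; take the first quoted version string.
    version=$(java -version 2>&1 | sed -n 's/.*version "\(.*\)".*/\1/p' | head -n 1)
    if java_supported "$version"; then
        echo "Java $version is within the 11 to 17 range"
    else
        echo "Java $version is outside the 11 to 17 range"
    fi
fi
```

The check only looks at the major version; it says nothing about which vendor build is installed.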

Additionally, if you need to compile from source, you will need a JDK and Maven:

  • Java JDK 11 to 17
  • Apache Maven 3.6 or better
  • OpenRefine 3.6-SNAPSHOT Source (optional)

From Compiled Release

The compiled release file is the "Easy Button" to get RDF Transform installed as an extension to OpenRefine. Follow these instructions to get it running.

  1. If it does not exist, create a folder named extensions under your user workspace directory for OpenRefine. The workspace should be in one of the following places, depending on your operating system (see the OpenRefine FAQ for more details):
    • Linux ~/.local/share/openrefine
    • Windows C:/Documents and Settings/<user>/Application Data/OpenRefine OR C:/Documents and Settings/<user>/Local Settings/Application Data/OpenRefine
    • Mac OSX ~/Library/Application Support/OpenRefine
    Alternatively, you can use the OpenRefine application's own extensions directory, but this is not recommended.
  2. Unzip the downloaded release (ensuring it is an rdf-transform-x.x.x.zip and not a source code .zip or .tar.gz) in the extensions folder (within the directory of step 1). This will create an rdf-transform directory containing the extension.
  3. Start (or restart) OpenRefine (see the OpenRefine User Documentation).
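
The steps above can be sketched as a small script. This is only an illustration: the function names ensure_extensions_dir and install_release are invented here, and the paths in the usage comment are assumptions to adjust for your system. The final check relies on the statement above that unzipping a compiled release creates an rdf-transform directory.

```shell
#!/bin/sh
# Sketch: install a compiled release zip into an OpenRefine workspace.

# Create the extensions folder under the workspace if it is missing
# and print its path.
ensure_extensions_dir() {
    # $1: the OpenRefine workspace directory
    mkdir -p "$1/extensions" && echo "$1/extensions"
}

# Unzip a release archive into the extensions folder, then confirm that
# the expected rdf-transform directory exists.
install_release() {
    # $1: path to the release zip, $2: the OpenRefine workspace directory
    [ -f "$1" ] || { echo "no such file: $1" >&2; return 1; }
    ext_dir=$(ensure_extensions_dir "$2") || return 1
    unzip -o -q "$1" -d "$ext_dir" || return 1
    [ -d "$ext_dir/rdf-transform" ] || {
        echo "no rdf-transform directory was created; was this a source archive?" >&2
        return 1
    }
}

# Example usage (both paths are assumptions):
# install_release ~/Downloads/rdf-transform-2.0.5.zip ~/.local/share/openrefine
```

Restart OpenRefine afterwards so it picks up the extension.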

NOTE: It is recommended that you have an active Internet connection when using the extension as it can download ontologies from specified namespaces (such as rdf, rdfs, owl and foaf). You can (re)add namespaces and specify whether to download the ontology (or not) from the namespace declaration URL. If you must run OpenRefine from an offline location, you can copy the ontologies to files in your offline space and use the "from file" feature to load the ontologies.

From Source - Build

The source code is for those of you who want more depth and want to dig into the inner workings. You still need to install the extension to test and debug any modifications, so here are the complete instructions.

NOTE: If you have previously installed the extension, you will need to replace it in the extensions directory with the newly built version, e.g., delete the rdf-transform directory in the extensions directory and unzip the new file.

TL;DR:

Short:

git clone https://github.com/AtesComp/rdf-transform
cd rdf-transform
mvn clean compile
mvn assembly:single
rm -rf ~/.local/share/openrefine/extensions/rdf-transform*
unzip target/rdf-transform-2.0.5.zip -d ~/.local/share/openrefine/extensions
~/path/to/openrefine/refine

Long:

git clone https://github.com/AtesComp/rdf-transform
git clone https://github.com/OpenRefine/OpenRefine
cd OpenRefine
./refine clean
./refine build
./refine dist 3.6-SNAPSHOT
cd ../rdf-transform
mvn install:install-file -Dfile=../OpenRefine/main/target/openrefine-main.jar -DpomFile=openrefine-shim-pom.xml -DcreateChecksum=true -DlocalRepositoryPath=./project-repository
mvn clean compile
mvn assembly:single
rm -rf ~/.local/share/openrefine/extensions/rdf-transform*
unzip target/rdf-transform-2.0.5.zip -d ~/.local/share/openrefine/extensions
cd ../OpenRefine
./refine

Short Steps

A local project repository (see the "project-repository" directory) contains an OpenRefine jar file ready for use by the Maven compile process. If you want or need to compile OpenRefine yourself, see the Long Steps below to create the OpenRefine jar file.

  1. From some top level development directory, create a local repository for this RDF Transform extension:
    • Clone the extension at the top level development directory where you want the /rdf-transform sub-directory:
      • git clone https://github.com/AtesComp/rdf-transform
  2. Compile the RDF Transform extension:
    • Change directories to the RDF Transform extension:
      • cd rdf-transform
    • Clean and compile the extension's dev environment:
      • mvn clean compile
    • Assemble the extension:
      • mvn assembly:single
    • Unzip the target/rdf-transform-x.x.x.zip file into the extensions directory as documented in From Compiled Release above
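
The TL;DR commands hard-code the version in the zip file name (2.0.5), which changes between releases. A minimal sketch for locating the built zip without naming the version follows; find_release_zip is a hypothetical helper, and it assumes the target directory holds exactly one rdf-transform zip after a clean build.

```shell
#!/bin/sh
# Print the path of the single rdf-transform zip in a Maven target directory.
find_release_zip() {
    # $1: the target directory written by `mvn assembly:single`
    set -- "$1"/rdf-transform-*.zip
    # When nothing matches, the pattern is left unexpanded, so test for a file.
    [ -f "$1" ] || { echo "no release zip found" >&2; return 1; }
    [ "$#" -eq 1 ] || { echo "more than one release zip found" >&2; return 1; }
    echo "$1"
}

# Example, run from the rdf-transform directory (the workspace path is the
# Linux location used in the TL;DR commands):
# zip=$(find_release_zip target) && unzip "$zip" -d ~/.local/share/openrefine/extensions
```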

Long Steps

Sometimes you just have to do everything yourself. If you want or need to compile OpenRefine, then you'll probably want to create the jar file for RDF Transform to match. Compared with the Short Steps, these instructions insert two steps (2 and 3 below) between the original steps 1 and 2.

  1. From some top level development directory, create a local repository for this RDF Transform extension:
    • Clone the extension at the top level development directory where you want the /rdf-transform sub-directory:
      • git clone https://github.com/AtesComp/rdf-transform
    • Alternatively, to update an existing clone, in the /rdf-transform directory:
      • Change directories to the RDF Transform development directory:
        • cd rdf-transform
      • Update the code:
        • git pull (or git fetch --all; git reset --hard; git pull for a forced refresh)
      • Change directories up one level:
        • cd ..
  2. Prepare the OpenRefine jar file:
    • Clone OpenRefine from the same top level development directory to create a local repository:
      • git clone https://github.com/OpenRefine/OpenRefine
    • Create the OpenRefine jar:
      • Change directories to OpenRefine:
        • cd OpenRefine
      • Clean OpenRefine's dev environment:
        • ./refine clean
      • Build OpenRefine:
        • ./refine build
      • Build the OpenRefine jar:
        • ./refine dist 3.6-SNAPSHOT (or use the latest version id)
      • Among many other things, this builds the needed jar file: OpenRefine/main/target/openrefine-main.jar
      • Change directories up one level:
        • cd ..
  3. Process the OpenRefine jar file for the RDF Transform extension:
    • Change directories to the RDF Transform extension:
      • cd rdf-transform
    • Adjust the openrefine-shim-pom.xml file to use the proper OpenRefine version id - in this example: 3.6-SNAPSHOT
    • Adjust the pom.xml file to use the proper OpenRefine version id - in this example: 3.6-SNAPSHOT
    • Install the OpenRefine jar in the local Maven repository for RDF Transform (the project-repository directory):
      • mvn install:install-file -Dfile=../OpenRefine/main/target/openrefine-main.jar -DpomFile=openrefine-shim-pom.xml -DcreateChecksum=true -DlocalRepositoryPath=./project-repository
  4. Compile the RDF Transform extension:
    • Clean and compile the extension's dev environment:
      • mvn clean compile
    • Assemble the extension:
      • mvn assembly:single
    • Unzip the target/rdf-transform-x.x.x.zip file into the extensions directory as documented in From Compiled Release above
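
Before running the mvn install:install-file command in step 3, it can be worth confirming that the jar from step 2 exists and that both pom files mention the version id you intend to use. The sketch below does only a literal string search; it makes no assumption about where in the pom files the version appears. The function names mentions_version and preflight are invented for this illustration.

```shell
#!/bin/sh
# Sketch: pre-flight checks before `mvn install:install-file`.
# Run from the rdf-transform directory.

# Succeed when the file contains the version id as a literal string.
mentions_version() {
    # $1: file to search, $2: version id such as 3.6-SNAPSHOT
    grep -q -F -- "$2" "$1"
}

# Check the jar and both pom files; report anything that is missing.
preflight() {
    # $1: path to openrefine-main.jar, $2: version id
    status=0
    [ -f "$1" ] || { echo "missing jar: $1" >&2; status=1; }
    for pom in openrefine-shim-pom.xml pom.xml; do
        mentions_version "$pom" "$2" || { echo "$pom does not mention $2" >&2; status=1; }
    done
    return "$status"
}

# Example:
# preflight ../OpenRefine/main/target/openrefine-main.jar 3.6-SNAPSHOT
```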

Java JDKs, JREs, and JVMs! Oh, My!

Java does not supply simple JRE installs for versions after 8 (1.8), so you might want to create your own.

You can create your own JRE from a JDK install (version 9 or later) by running the following command:

jlink --compress=2 --strip-debug --add-modules=java.base,java.compiler,java.datatransfer,java.logging,java.desktop,java.instrument,java.management,java.management.rmi,java.naming,java.net.http,java.prefs,java.rmi,java.scripting,java.se,java.security.jgss,java.security.sasl,java.smartcardio,java.sql,java.sql.rowset,java.transaction.xa,java.xml,java.xml.crypto --output ~/JRE

Change ~/JRE to whatever non-existing directory you like.

The --add-modules parameter gets its modules from the output of:

java --list-modules

using whatever Java version is currently active. The jlink command above includes only the listed "java" modules and ignores the "jdk" modules.
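
Rather than typing the module list by hand, you could derive it from the `java --list-modules` output. The following is a sketch: java_modules_csv is a made-up helper name, and it assumes each output line has the form name@version (for example java.base@17.0.3).

```shell
#!/bin/sh
# Read `java --list-modules` output on stdin and print a comma-separated
# list of the java.* module names with the @version suffix removed.
java_modules_csv() {
    grep '^java\.' | sed 's/@.*//' | paste -s -d, -
}

# Example:
# jlink --compress=2 --strip-debug \
#       --add-modules="$(java --list-modules | java_modules_csv)" \
#       --output ~/JRE
```

The jdk.* modules are filtered out, matching the hand-written list above.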

You can run OpenRefine using this newly created JRE directory by setting the JAVA_HOME environment variable to it and running the OpenRefine script file. A one-liner for Linux is:

export JAVA_HOME=~/JRE; ./refine

while in the OpenRefine directory.
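
If the jlink step failed or the directory name was mistyped, JAVA_HOME would point at something unusable. A small guard, sketched below, checks for bin/java before exporting; use_jre is a hypothetical helper name.

```shell
#!/bin/sh
# Point JAVA_HOME at a jlink-built runtime, but only if it looks usable.
use_jre() {
    # $1: the directory passed to jlink via --output
    if [ -x "$1/bin/java" ]; then
        JAVA_HOME=$1
        export JAVA_HOME
    else
        echo "no executable bin/java under $1" >&2
        return 1
    fi
}

# Example, run from the OpenRefine directory:
# use_jre ~/JRE && ./refine
```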

To recreate the JRE, remove the JRE directory, adjust the jlink command, and re-execute it. For Linux, to remove the JRE directory, do:

rm -rf ~/JRE