VELOXDFS Java library for HADOOP.
Make a file named ~/.velox-env.sh
which the content such as:
export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk
export HADOOP_CLASSPATH="$HOME/.m2/repository/org/dicl/velox/velox-hadoop/1.0/velox-hadoop-1.0.jar"
export HADOOP_CLASSPATH="$HOME/sandbox/lib/java/*:$HADOOP_CLASSPATH"
export HADOOP_CONF_DIR=$HOME/hadoop-etc
export JAVA_LIBRARY_PATH=$HOME/sandbox/lib
export HADOOP_HOME=$HOME/opt/hadoop-2.7.3
export PATH+=:$HADOOP_HOME/sbin:$HADOOP_HOME/bin
export HADOOP_SCRATCH=/scratch/vicente
$ gradle publishToMavenLocal
# Or if you do not have gradle installed
$ ./gradlew publishToMavenLocal
Refer to the project https://github.com/vicentebolea/hadoop-etc
for configuring hadoop for VeloxDFS
One example way of running it will be for Aggregate WordCount:
LIBJARS=~/.m2/repository/org/dicl/velox/velox-hadoop/1.0/velox-hadoop-1.0.jar
# Aggregation Wordcount
time hadoop jar $LIBJARS AggregateWordCount -libjars $LIBJARS /text_100GB.dat /output.`date +%N` 1 textinputformat
# Simple WordCount
time hadoop jar $LIBJARS wordcount -libjars $LIBJARS /text_100GB.dat /output.`date +%N`