Bitcoin Pricing of Local Currency deduced from similar Bitcoin Currency Pairs

egoEconometrics

This project runs Apache Spark on a single-node Hadoop/YARN setup.

Warning:

This install modifies your ~/.ssh folder, specifically the .ssh/authorized_keys file.
It allows Hadoop to run the ssh localhost command using a passphrase-less DSA key.
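A rough sketch of what such a setup does (the key filename and RSA key type below are assumptions for illustration; install.sh may do it differently, and note that DSA keys are deprecated in recent OpenSSH releases):

```shell
# Hypothetical illustration of passphrase-less key setup for "ssh localhost".
# KEY is an assumed filename; the installer may use a different file or key type.
KEY=~/.ssh/id_rsa_hadoop
mkdir -p ~/.ssh && chmod 700 ~/.ssh
# Generate a key with an empty passphrase (-N '') if one doesn't exist yet
[ -f "$KEY" ] || ssh-keygen -q -t rsa -N '' -f "$KEY"
# Authorize the key for logins to this same machine
cat "$KEY.pub" >> ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys
# After this, "ssh localhost" should work without a password prompt.
```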

  • Definition:

    • Hadoop HDFS (Hadoop Distributed File System)
    • YARN (MapReduce 2.0)
    • Spark, a general engine for large-scale data processing
  • Languages:

    • Scala and Python

Prerequisites:

  • wget
  • java (JVM)
  • *nix - Darwin; Cygwin (not yet supported)
  • Python (if running Python)
  • sbt to run the Scala examples

Running the install:

./bootstrap/install.sh

Hadoop NameNode Daemons

  • Set the Hadoop Home

HDFS_HOME=~/bin/local/bigdata/hadoop

  • Starting the services

$HDFS_HOME/sbin/start-dfs.sh

Note: On macOS, make sure SSH is enabled: System Preferences > Sharing > Remote Login [ON]

  • Checking Services are running

jps

13049 NameNode (HDFS Name Node) -- Make sure this is running
13241 DataNode (HDFS Data Node)
22752 ResourceManager (Yarn Resource)
22894 NodeManager (Yarn Node)
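The check above can be scripted (a sketch, using the daemon names listed in the jps output):

```shell
# Scan the jps listing for each expected daemon; "missing" means it didn't start
for d in NameNode DataNode ResourceManager NodeManager; do
  jps 2>/dev/null | grep -q "$d" && echo "$d OK" || echo "$d missing"
done
```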


Monitoring DFS Health

Browsing the File System's health

http://localhost:50070

YARN Daemons

  • Start ResourceManager daemon and NodeManager daemon:

$HDFS_HOME/sbin/start-yarn.sh

Monitoring Resource Manager

To view running or already-executed jobs (a jobwatch equivalent):

http://localhost:8088
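Both web UIs can also be probed from the command line (the ports are the Hadoop 2.x defaults used above; this is only a reachability sketch):

```shell
# Probe the NameNode (50070) and ResourceManager (8088) web UIs
for url in http://localhost:50070 http://localhost:8088; do
  if curl -sf -o /dev/null "$url"; then
    echo "$url reachable"
  else
    echo "$url not reachable (is the daemon running?)"
  fi
done
```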

Hadoop Distributed File System (Hadoop DFS) handling

  • Create and Mount a new Hadoop DFS

${HDFS_HOME}/bin/hdfs namenode -format

Note: You need to restart HDFS after formatting.

  • Create a directory in Hadoop DFS

Create the user directory along with the owner directory

${HDFS_HOME}/bin/hdfs dfs -mkdir -p /user/${USER}
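To confirm the new directory works, you can stage a local file and list it back (a sketch; it needs the DFS daemons running, hence the fallback message):

```shell
# Copy a small local file into the new HDFS home directory and list it.
# Falls back to a message when HDFS is not up yet.
echo "sample" > /tmp/sample.txt
${HDFS_HOME}/bin/hdfs dfs -put -f /tmp/sample.txt /user/${USER}/ \
  && ${HDFS_HOME}/bin/hdfs dfs -ls /user/${USER} \
  || echo "HDFS not reachable; start dfs first"
```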

Running the example

Package a jar containing your application

sbt package

... [success] Total time: ...
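The jar name used with spark-submit below suggests Scala 2.10 and a 0.1-SNAPSHOT version; a minimal build.sbt along these lines would produce it (the Spark version is an assumption, chosen as a contemporary 1.x release; the streaming dependency matches the QuickStreamingApp example later on):

```shell
# Hypothetical build.sbt consistent with target/scala-2.10/egoeconometrics_2.10-0.1-SNAPSHOT.jar
cat > build.sbt <<'EOF'
name := "egoEconometrics"

version := "0.1-SNAPSHOT"

scalaVersion := "2.10.4"

libraryDependencies += "org.apache.spark" %% "spark-core" % "1.1.0"

libraryDependencies += "org.apache.spark" %% "spark-streaming" % "1.1.0"
EOF
```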

Set SPARK HOME

SPARK_HOME=~/bin/local/bigdata/spark

Use spark-submit to run your application

${SPARK_HOME}/bin/spark-submit --class "SimpleApp" --master local[4] target/scala-2.10/egoeconometrics_2.10-0.1-SNAPSHOT.jar

... Lines with a: 41, Lines with b: 17

Running Interactive Shell

  • Scala

${SPARK_HOME}/bin/spark-shell

  • Python

${SPARK_HOME}/bin/pyspark --master local[4]

Running Spark Streaming

  • You will first need to run Netcat (a small utility found in most Unix-like systems) as a data server by using

nc -lk 9999

  • Then, in a different terminal, you can start the example by using

${SPARK_HOME}/bin/spark-submit --class "QuickStreamingApp" --master local[4] target/scala-2.10/egoeconometrics_2.10-0.1-SNAPSHOT.jar localhost 9999

Stopping the Services

When you're done, stop the daemons with:

$HDFS_HOME/sbin/stop-yarn.sh

$HDFS_HOME/sbin/stop-dfs.sh

AWS Ubuntu

sudo /bin/dd if=/dev/zero of=/var/swap.1 bs=1M count=1024
sudo /sbin/mkswap /var/swap.1
sudo /sbin/swapon /var/swap.1

To turn off the swap:

sudo /sbin/swapoff /var/swap.1
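After enabling it, you can verify the swap file is active (a sketch; output format varies across distros and util-linux versions):

```shell
# Show active swap devices; /var/swap.1 should appear when enabled
swapon -s 2>/dev/null || cat /proc/swaps
# Show overall memory and swap totals in MiB
free -m
```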

Recommended reading for AWS: http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster/

Known issues:

License

Copyleft © 2014 EgoOyiri [AfricaCoin]

Distributed under the Eclipse Public License either version 1.0 or (at your option) any later version.
