This is a simple Big Data project that writes a dataset to HDFS and runs Hadoop MapReduce jobs on it. The MapReduce functions are written in Java, and the web GUI is built with Python Flask. We used the Netflix Prize dataset from Kaggle, with Hadoop 3.2.1 and Java 8.
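
As a rough illustration of how a descriptive statistic fits the MapReduce model, here is a minimal sketch in plain Java. It is not the project's code and does not use the Hadoop API: it imitates the map, shuffle, and reduce phases in memory to compute a mean rating per key. The class name `MeanRatingSketch`, its method names, and the `movieId,rating` record layout are assumptions made for this example only.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class MeanRatingSketch {

    // Map phase: turn one "movieId,rating" record into a (key, value) pair.
    // The record layout is an assumption, not the dataset's documented format.
    public static Map.Entry<String, Integer> map(String line) {
        String[] parts = line.split(",");
        return new SimpleEntry<>(parts[0].trim(), Integer.parseInt(parts[1].trim()));
    }

    // Reduce phase: collapse all ratings collected for one key into their mean.
    public static double reduce(List<Integer> ratings) {
        long sum = 0;
        for (int rating : ratings) {
            sum += rating;
        }
        return (double) sum / ratings.size();
    }

    // Driver: map every line, group values by key (the "shuffle"), then reduce each group.
    public static Map<String, Double> run(List<String> lines) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String line : lines) {
            Map.Entry<String, Integer> pair = map(line);
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
        }
        Map<String, Double> means = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> group : grouped.entrySet()) {
            means.put(group.getKey(), reduce(group.getValue()));
        }
        return means;
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList("1,5", "1,3", "2,4");
        System.out.println(run(sample)); // prints {1=4.0, 2=4.0}
    }
}
```

In a real Hadoop job the grouping step is done by the framework between the mapper and the reducer, and the input would be read from HDFS rather than from an in-memory list.
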
We implemented 5 descriptive statistics functions as MapReduce jobs: