Big-Data

Projects based on Big Data.

This project performs document analysis: given a large corpus, the relevant documents are extracted using Term Frequency-Inverse Document Frequency (TF-IDF), a feature extraction technique.
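As a rough illustration of the idea, the sketch below computes TF-IDF weights in plain Python and ranks documents against a few search terms. It is not the code from this repository: the function names (`tokenize`, `tf_idf_scores`, `rank_documents`), the sample corpus, and the choice of the common unsmoothed formula tf = count / document length, idf = ln(N / document frequency) are all assumptions made for the example.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase and whitespace splitting is enough for the example.
    return text.lower().split()


def tf_idf_scores(corpus):
    """Return {doc_id: {term: tf-idf weight}} for a {doc_id: text} corpus."""
    tokenized = {doc_id: tokenize(text) for doc_id, text in corpus.items()}
    n_docs = len(tokenized)

    # Document frequency: number of documents that contain each term.
    doc_freq = Counter()
    for tokens in tokenized.values():
        doc_freq.update(set(tokens))

    scores = {}
    for doc_id, tokens in tokenized.items():
        counts = Counter(tokens)
        total = len(tokens)
        scores[doc_id] = {
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        }
    return scores


def rank_documents(query, corpus):
    """Rank documents by the sum of TF-IDF weights of the query terms."""
    scores = tf_idf_scores(corpus)
    terms = tokenize(query)
    ranked = [
        (doc_id, sum(weights.get(term, 0.0) for term in terms))
        for doc_id, weights in scores.items()
    ]
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


# Hypothetical three-document corpus, for illustration only.
corpus = {
    "doc1": "spark processes large data sets in memory",
    "doc2": "hadoop stores large data sets on disk",
    "doc3": "cats sleep most of the day",
}
scores = tf_idf_scores(corpus)
ranking = rank_documents("spark memory", corpus)
```

Terms that occur in only one document get the largest IDF, so a document containing the rare search terms rises to the top of `ranking`, while a term present in every document would get a weight of zero.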

This project uses Hadoop MapReduce, Spark RDD and Spark SQL.
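To show how a TF-IDF building block fits the map and reduce model, here is a minimal sketch of the per-document term count step. It is a plain Python stand-in that only imitates the map, shuffle, and reduce phases; it does not use Hadoop or Spark, and `map_phase`, `shuffle`, `reduce_phase`, and the sample documents are hypothetical names chosen for the example, not taken from this repository.

```python
from collections import defaultdict


def map_phase(doc_id, text):
    """Mapper: emit ((term, doc_id), 1) for every word occurrence."""
    for term in text.lower().split():
        yield (term, doc_id), 1


def shuffle(pairs):
    """Group emitted values by key, as the framework would between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reducer: sum the ones to get the count of a term in a document."""
    return key, sum(values)


# Hypothetical input documents, for illustration only.
docs = {"doc1": "big data big ideas", "doc2": "data pipelines"}

emitted = [
    pair
    for doc_id, text in docs.items()
    for pair in map_phase(doc_id, text)
]
term_counts = dict(
    reduce_phase(key, values) for key, values in shuffle(emitted).items()
)
```

The resulting `term_counts` maps each (term, document) pair to its raw count, which is the input a later pass would need to compute term frequency and document frequency.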

All of these programs, with explanations, are available in the MapReduce, SparkRDD, and SparkSQL folders.

nikhilkumawat03/Extracting-Relevant-Document
