fundamental-hadoop

This is an introductory repository with working examples that help you learn the basics of the Hadoop ecosystem.

Hadoop

Hadoop is an open-source, Java-based framework that supports the storage and processing of extremely large data sets in a distributed computing environment. It is an Apache project developed under the Apache Software Foundation.

Hadoop Official Website

https://hadoop.apache.org/

MapReduce

A MapReduce job usually splits the input data set into independent chunks, which are processed by the map tasks in a completely parallel manner. The framework sorts the outputs of the maps, which then become the input to the reduce tasks. Typically both the input and the output of the job are stored in a file system.
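The flow described above can be sketched in plain Python as a word count. This is only a single-process illustration of the map, sort, and reduce steps, not Hadoop's API; the function names (`map_phase`, `shuffle_and_sort`, `reduce_phase`, `word_count`) and the sample input are assumptions made for this example.

```python
from itertools import groupby
from operator import itemgetter


def map_phase(chunk):
    """Map task: emit a (word, 1) pair for every word in one input chunk."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle_and_sort(mapped_pairs):
    """Framework step: sort all map outputs by key, then group values per key."""
    ordered = sorted(mapped_pairs, key=itemgetter(0))
    return [
        (key, [count for _, count in group])
        for key, group in groupby(ordered, key=itemgetter(0))
    ]


def reduce_phase(key, values):
    """Reduce task: sum the counts collected for one word."""
    return key, sum(values)


def word_count(chunks):
    """Run the three steps in sequence over a list of independent chunks."""
    mapped = []
    for chunk in chunks:  # each chunk is processed independently of the others
        mapped.extend(map_phase(chunk))
    grouped = shuffle_and_sort(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped)


if __name__ == "__main__":
    # Hypothetical input: two chunks of one small data set.
    print(word_count(["hadoop stores data", "hadoop processes data"]))
```

In a real cluster the map calls would run in parallel on different nodes and the chunks would be read from a distributed file system; here they run in a simple loop to keep the sketch self-contained.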

Apache Pig

Apache Pig is a high-level platform for creating programs that run on Apache Hadoop. The language for this platform is called Pig Latin.

Apache Pig Official Website

https://pig.apache.org/

Apache Hive

The Apache Hive™ data warehouse software facilitates reading, writing, and managing large datasets residing in distributed storage using SQL. Structure can be projected onto data already in storage. A command line tool and JDBC driver are provided to connect users to Hive.
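To make "structure can be projected onto data already in storage" more concrete, here is a rough plain-Python analogy: the raw text exists first, and column names and types are applied only when it is read. This is not Hive's API or HiveQL; the sample data, the `SCHEMA` list, and the helper names `project_schema` and `select_where` are all assumptions made for illustration.

```python
import csv
import io

# Hypothetical raw file contents already sitting in storage, with no schema attached.
RAW_DATA = "1,alice,34\n2,bob,29\n"

# Hypothetical schema declared after the data was written: (column name, type).
SCHEMA = [("id", int), ("name", str), ("age", int)]


def project_schema(raw_text, schema):
    """Apply column names and types to raw delimited text at read time."""
    rows = []
    for fields in csv.reader(io.StringIO(raw_text)):
        rows.append(
            {name: cast(value) for (name, cast), value in zip(schema, fields)}
        )
    return rows


def select_where(rows, column, predicate):
    """Rough analogue of a SQL SELECT with a WHERE clause on one column."""
    return [row for row in rows if predicate(row[column])]


if __name__ == "__main__":
    table = project_schema(RAW_DATA, SCHEMA)
    print(select_where(table, "age", lambda age: age > 30))
```

The stored text is never rewritten; changing `SCHEMA` changes how the same bytes are interpreted, which is the idea behind projecting structure onto existing data.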

Apache Hive Official Website

https://hive.apache.org/

Cloudera Virtual Quickstart

Use the Cloudera QuickStart VM to explore the Hadoop ecosystem in detail.

Cloudera Virtual Quickstart link

https://www.cloudera.com/downloads/quickstart_vms/5-8.html

