Navigator is a data service that prepares content for travel agencies so that it can be explored in the EWNS (East-West-North-South) directions, letting them show end-users content that matches where they want to travel. The service was created as part of a hands-on Hadoop course from IBM Headstart. The problem being solved is described in full in the Navigator_SRS.pdf file.
The travel agency's mobile application can consume the data stream generated by this service to deliver reliable, up-to-date content at all times. For example, if the end-user is at a point x on the map, the JSON output of the service contains geocoded data for all the cities within a 25 km radius of x, in all four directions. This helps the traveller navigate.
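The 25 km example above can be sketched in a few lines of Python. This is only an illustration and not the project's MapReduce or Hive implementation: the function names, the output keys, and the rule for assigning a city to a direction (the dominant axis of displacement) are all assumptions made here, and the coordinates are made-up test points. The actual rules are defined in the SRS document.

```python
import json
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def direction(lat1, lon1, lat2, lon2):
    """Assumed rule: pick the axis with the larger displacement.

    Longitude is scaled by cos(latitude) so both axes are comparable.
    Wrap-around at the +/-180 degree meridian is ignored in this sketch.
    """
    dlat = lat2 - lat1
    dlon = (lon2 - lon1) * math.cos(math.radians(lat1))
    if abs(dlat) >= abs(dlon):
        return "north" if dlat >= 0 else "south"
    return "east" if dlon > 0 else "west"


def nearby_by_direction(origin, cities, radius_km=25.0):
    """Group cities within radius_km of origin by compass direction.

    origin is (lat, lon); cities is an iterable of (name, lat, lon).
    The output layout is hypothetical, not the service's real schema.
    """
    lat0, lon0 = origin
    result = {"north": [], "south": [], "east": [], "west": []}
    for name, lat, lon in cities:
        dist = haversine_km(lat0, lon0, lat, lon)
        if 0 < dist <= radius_km:  # skip the origin itself
            result[direction(lat0, lon0, lat, lon)].append(
                {"city": name, "lat": lat, "lon": lon,
                 "distance_km": round(dist, 2)}
            )
    return result


if __name__ == "__main__":
    # Made-up points around (0, 0); "E" is about 111 km away and is dropped.
    sample = [("A", 0.1, 0.0), ("B", 0.0, 0.1), ("C", -0.1, 0.0),
              ("D", 0.0, -0.1), ("E", 1.0, 0.0)]
    print(json.dumps(nearby_by_direction((0.0, 0.0), sample), indent=2))
```

Scaling the longitude difference by the cosine of the latitude keeps the east-west comparison meaningful away from the equator. A different rule, such as fixed 90-degree bearing sectors, would work equally well if the SRS calls for it.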
The geocoded input data was extracted from Wikipedia, chosen for the reliability that comes from its crowdsourcing methodology; the link is in the SRS document, 'Navigator_SRS.pdf'. The solution is developed using the MapReduce paradigm of Hadoop on the Cloudera Apache Hadoop framework, with Eclipse version 4.4.2 and Apache Hive. The output is generated in JSON format.
- Set up a single-node Hadoop cluster on your system. The system can be either a standalone machine or a node in an existing cluster. Installation instructions: http://doctuts.readthedocs.io/en/latest/hadoop.html
- Install Apache Hive on your Linux system. Installation instructions: https://www.edureka.co/blog/apache-hive-installation-on-ubuntu
All the information about generating the output JSON file is in the step_wise_queries.txt file. The compressed input data is provided as the geo_coordinates_en.rar file. After extracting the data from this archive, follow the step_wise_queries.txt file.
This project is licensed under the MIT License - see the LICENSE file for details.