Technologies Used: Hortonworks Hadoop, HDFS, Spark SQL, Scala

StephyJacob/Citi-Bike-Data-Analysis

Citi-Bike-Data-Analysis

Citi Bike is the nation's largest bike share program, with 10,000 bikes and 600 stations across Manhattan, Brooklyn, Queens and Jersey City.

This project analyses Citi Bike trip data from July 2013 to February 2014 (1 GB of CSV data) and explores the following questions:

  1. Which routes do Citi Bike riders take most often?
  2. Which trip is the longest, and what is its duration?
  3. When do riders ride?
  4. What is the distance of each trip, computed from the start and end latitudes and longitudes?
  5. What are the start and end station names of the ride that covers the greatest distance?
  6. Which stations are most popular?
  7. On which days of the week are the most rides taken?
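
As a rough illustration of how questions 1, 2, 4, 5 and 7 could be approached, here is a minimal sketch in plain Scala collections. It is not the project's Spark SQL code: the `Trip` case class, its field names, the timestamp format, the haversine formula as the distance measure, and the three sample rows (stations "A", "B", "C") are all assumptions made up for the example. The same grouping and max logic would likely translate to `GROUP BY` and `ORDER BY` queries in Spark SQL.

```scala
import java.time.{DayOfWeek, LocalDateTime}
import java.time.format.DateTimeFormatter

object CitiBikeSketch {
  // Hypothetical, trimmed-down trip record; field names are illustrative only.
  final case class Trip(
    startTime: String, // assumed format: "yyyy-MM-dd HH:mm:ss"
    durationSec: Int,
    startStation: String, startLat: Double, startLon: Double,
    endStation: String, endLat: Double, endLon: Double)

  private val EarthRadiusKm = 6371.0
  private val fmt = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss")

  // Question 4: great-circle (haversine) distance between two points, in km.
  def haversineKm(lat1: Double, lon1: Double, lat2: Double, lon2: Double): Double = {
    val dLat = math.toRadians(lat2 - lat1)
    val dLon = math.toRadians(lon2 - lon1)
    val a = math.pow(math.sin(dLat / 2), 2) +
      math.cos(math.toRadians(lat1)) * math.cos(math.toRadians(lat2)) *
        math.pow(math.sin(dLon / 2), 2)
    2 * EarthRadiusKm * math.asin(math.sqrt(a))
  }

  def distanceKm(t: Trip): Double =
    haversineKm(t.startLat, t.startLon, t.endLat, t.endLon)

  // Question 1: the most frequent (start station, end station) pair and its count.
  def topRoute(trips: Seq[Trip]): ((String, String), Int) =
    trips.groupBy(t => (t.startStation, t.endStation))
      .map { case (route, ts) => route -> ts.size }
      .maxBy(_._2)

  // Question 2: the trip with the largest duration.
  def longestByDuration(trips: Seq[Trip]): Trip = trips.maxBy(_.durationSec)

  // Question 5: the trip with the largest start-to-end distance.
  def longestByDistance(trips: Seq[Trip]): Trip = trips.maxBy(distanceKm)

  // Question 7: number of rides per day of the week.
  def ridesPerDay(trips: Seq[Trip]): Map[DayOfWeek, Int] =
    trips.groupBy(t => LocalDateTime.parse(t.startTime, fmt).getDayOfWeek)
      .map { case (day, ts) => day -> ts.size }

  // Made-up sample rows, not taken from the real dataset.
  val sample: Seq[Trip] = Seq(
    Trip("2013-07-01 08:00:00", 600, "A", 40.75, -73.99, "B", 40.76, -73.98),
    Trip("2013-07-01 18:30:00", 540, "A", 40.75, -73.99, "B", 40.76, -73.98),
    Trip("2013-07-02 09:15:00", 1800, "B", 40.76, -73.98, "C", 40.70, -74.01)
  )
}
```

Note that the haversine value is the straight-line distance between the start and end stations, not the path actually ridden, so a round trip that returns to its start station has distance zero.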

Data: https://s3.amazonaws.com/tripdata/index.html

Data Description: https://www.citibikenyc.com/system-data
