- Download all the Yellow Taxi Trip Data and Green Taxi Trip Data from https://www.nyc.gov/site/tlc/about/tlc-trip-record-data.page
- Place the data in data/raw.
- Run step-0-combine_all.py to generate the dataset.
- Download the POI data from https://data.cityofnewyork.us/City-Government/Points-Of-Interest/rxuy-2muj in CSV format.
- Place the POI data and the New York taxi zone file in data/rawPOI.
- Run process_task_regression.py to generate the dataset.
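
Before running the two scripts, it can help to confirm that the downloads landed in the right folders. The following is a minimal sketch, not part of the repository: the helper name `missing_inputs` is hypothetical, and it only checks that data/raw and data/rawPOI exist and contain at least one entry. It does not validate file names or formats.

```python
from pathlib import Path

# Directory names are taken from the steps above; the helper is hypothetical.
EXPECTED_DIRS = ("data/raw", "data/rawPOI")


def missing_inputs(root="."):
    """Return the expected input directories that are absent or empty."""
    missing = []
    for rel in EXPECTED_DIRS:
        folder = Path(root) / rel
        # A directory counts as populated if it holds at least one entry.
        if not folder.is_dir() or not any(folder.iterdir()):
            missing.append(rel)
    return missing


if __name__ == "__main__":
    problems = missing_inputs()
    if problems:
        print("Missing or empty:", ", ".join(problems))
    else:
        print("All input directories are populated.")
```

Run it from the repository root, so that the relative paths resolve the same way they do for the processing scripts.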
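
To train, first run with --pretrain 1, then run again with --pretrain 0 and pass the path of the saved model to --load:

    python train.py --pretrain 1
    python train.py --pretrain 0 --load where_you_save_the_model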
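
The two training commands can also be chained from Python so that the second run starts only if the first one succeeds. This is a sketch under assumptions: the helper names `training_commands` and `run_training` are hypothetical, and it assumes train.py is in the current working directory and accepts exactly the flags shown above.

```python
import subprocess
import sys


def training_commands(model_path):
    """Build the two train.py invocations listed above, in order."""
    # sys.executable is used instead of "python" so both runs use the
    # interpreter that launched this script.
    return [
        [sys.executable, "train.py", "--pretrain", "1"],
        [sys.executable, "train.py", "--pretrain", "0", "--load", model_path],
    ]


def run_training(model_path):
    """Run both stages; stop at the first command that fails."""
    for cmd in training_commands(model_path):
        # check=True raises CalledProcessError on a nonzero exit code.
        subprocess.run(cmd, check=True)
```

Call `run_training("where_you_save_the_model")` with the placeholder replaced by the actual path where the first run saved the model.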