SINPA

This repo is the implementation of our IJCAI 2024 paper (AI for Social Good Track) entitled Predicting Carpark Availability in Singapore with Cross-Domain Data: A New Dataset and A Data-Driven Approach. In this study, we crawl, process, and release the SINPA dataset, a large-scale parking availability dataset incorporating cross-domain data in Singapore. We then propose a novel deep-learning framework DeepPA to collectively forecast future PA readings across Singapore.

Framework

Figure (a) Distribution of 1,687 carparks throughout Singapore. (b) The framework of DeepPA.

Dataset

In this section, we will outline the procedure for downloading the SINPA dataset, followed by a detailed description of the dataset.

Dataset Download. We provide the dataset on: https://huggingface.co/datasets/Huaiwu/SINPA/tree/main. There are five files in the ./data folder:
```
  ├── data
  │   ├── train.npz
  │   ├── val.npz
  │   ├── test.npz
```
train.npz, val.npz and test.npz include training (12167 samples), validation(1217 samples), and test (1216 samples) set respectively. To download the data, you can download all data from the provided link. You can download each file by clicking on its download button.

Dataset Description. We crawled over three-year real-time PA data every 5 minutes from 1,921 parking lots throughout Singapore from Data.gov.sg. To mitigate the impact of missing values, we re-sampled the raw dataset into the 15-minute interval and chose lots with a missing rate of PA of less than 30%. In addition, due to the temporal distribution shift, we only use one-year data (2020/07/01 to 2021/06/30), and the ratio of training: validation: testing sets is set as 10:1:1. We then remove parking lots with obvious distribution shift (i.e., high KL divergence). After sample filtering, it remains 1,687 parking lots with stationary data distributions. We also crawl external attributes for these lots, including meteorological data (i.e., temperature, humidity, and wind speed), panning areas, utilization type, and road networks data from Data.gov.sg, the Urban Redevelopment Authority (URA) and the Land Transport Authority (LTA) respectively. A detailed description of the dataset can be found in the following table.

Dimension	Type	Category	Feature name	Detail
0	Predict Target	Parking Availability	Parking Availability	Real value
1	Temporal Factor	Time-related	Time of day	0 to 95 int number (24*4)
2			Weekday	0 to 6 int number (7)
3			Is_holiday	One-hot
4		Meteorology	Temperature	Normalized value
5			Humidity	Normalized value
6			Windspeed	Normalized value
7	Spatial Factor	Utilization Type	Utilization Type	0 to 9 int number (10)
8		Region-related	Planning area	0 to 35 int number (36)
9		Road-related	Road Density	Normalized value
10		Location	Latitude	Normalized value
11		Location	Longitude	Normalized value

Note: Normalized refers to Z-score normalization, which is applied for fast convergence.

Auxiliary Data. If you would like to visualize the parking lots or customize the adjacency matrix, you can access the parking lot locations in the file aux_data/lots_location.csv.

Requirements

DeepPA uses the following dependencies:

Pytorch 1.10 and its dependencies
Numpy and Scipy
CUDA 11.3 or latest version, cuDNN.

Folder Structure

We list the code of the major modules as follows:

The main function to train/test our model: click here.
The source code of our model: click here.
The trainer/tester: click here.
Data preparation and preprocessing are located at click here.
Computations: click here.

Arguments

We introduce some major arguments of our main function here.

Training settings:

mode: indicating the mode (train or test).
n_exp: experimental group number.
gpu: which gpu used to train.
seed: the random seed for experiments. (default: 0)
dataset: dataset path for the experiment.
batch_size: batch size of training or testing.
seq_len: the length of historical steps.
horizon: the length of future steps.
input_dim: the dimension of inputs.
output_dim: the dimension of inputs.
max_epochs: maximum number of training epochs.
patience: the patience of early stopping.
save_preds: whether to save prediction results.
wandb: whether to use wandb.

Model hyperparameters:

dropout: dropout rate.
n_blocks: number of layers of SLBlock and TLBlock.
n_hidden: hidden dimensions in SLBlock and TLBlock.
n_heads: number of heads in MSA.
spatial_flag: whether to use SLBlock.
temporal_flag: whether to use TLBlock.
spatial_encoding: whether to treat temporal factor as a station.
temporal_encoding: Whether to incorporate spatial factor into TLBlock.
temporal_PE: whether to use temporal position encoding.
GCO: whether to use GCO.
GCO_Thre: the proportion of low frequency signals.
base_lr: base learning rate.
lr_decay_ratio: learning rate decay ratio.

Model training

The following examples are conducted on the base dataset of SINPA:

Example 1 (DeepPA with default setting):

python ./experiments/DeepPA/main.py --dataset /base/ --mode train --gpu 0

Example 2 (DeepPA without GCO):

python ./experiments/DeepPA/main.py --dataset /base/ --mode train --gpu 0 --GCO False

Example 2 (DeepPA with the 0.7 proportion of low frequency signals):

python ./experiments/DeepPA/main.py --dataset /base/ --mode train --gpu 0 --GCO_Thre 0.7

Model Evaluation

To test the above-trained models, you can use the following command:

Example 1 (DeepPA with default setting):

python ./experiments/DeepPA/main.py --dataset /base/ --mode test --gpu 0

Example 2 (DeepPA with the 0.7 proportion of low frequency signals):

python ./experiments/DeepPA/main.py --dataset /base/ --mode test --gpu 0 --GCO_Thre 0.7

License

The SINPA dataset is released under the Singapore Open Data Licence: https://beta.data.gov.sg/open-data-license.

Citation

If you find our work useful in your research, please cite:

@inproceedings{zhang2024predicting,
  title={Predicting Parking Availability in Singapore with Cross-Domain Data: A New Dataset and A Data-Driven Approach},
  author={Zhang, Huaiwu and Xia, Yutong and Zhong, Siru and Wang, Kun and Tong, Zekun and Wen, Qingsong and Zimmermann, Roger and Liang, Yuxuan},
  booktitle={Proceedings of the Thirty-third International Joint Conference on Artificial Intelligence, IJCAI-24},
  year={2024}
}

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

README.md

README.md

SINPA

Framework

Dataset

Requirements

Folder Structure

Arguments

Model training

Model Evaluation

License

Citation

Files

README.md

Latest commit

History

README.md

File metadata and controls

SINPA

Framework

Dataset

Requirements

Folder Structure

Arguments

Model training

Model Evaluation

License

Citation