Authors: Sangwon Lim and Omar Kawach
Purpose: Submission for the group project in University of Victoria's Artificial Intelligence course (ECE 470).
Description: Train a sea ice concentration classification model and assess whether it can be applied to other data.
- Sea ice concentration classification models generated using Deep Learning architectures
- Utilized Gray Level Co-occurrence Matrix (GLCM) products for feature engineering
- Devised the training and test data splitting strategy to mitigate the spatial auto-correlation in training
- Utilized 1D-CNN to generate convolved features for maximized relationships between optical bands
- Devised a deep learning architecture concatenating Multi-layer Neural Network and 1D-CNN
- Identified the optimum feature selections, architectures and classification scheme for the problem
- Model assessments based on confusion matrices, accuracy and F1 score
- 47.65% accuracy on 8-class classification, comparable to an existing 2D-CNN model’s 55.3%, which was tested without accounting for spatial auto-correlation
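The idea behind a spatially aware train/test split, mentioned in the list above, can be illustrated with a minimal sketch. It assumes each sample pixel carries the ID of the image patch it came from and holds out whole patches, so that neighboring (and therefore correlated) pixels never land on both sides of the split. The function name `split_by_patch`, the `patch_ids` array, and the 25% test fraction are hypothetical and are not taken from this package, whose actual strategy may differ.

```python
import numpy as np


def split_by_patch(patch_ids, test_fraction=0.25, seed=0):
    """Return boolean train/test masks that keep each patch on one side only."""
    patch_ids = np.asarray(patch_ids)
    unique_patches = np.unique(patch_ids)
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(unique_patches)
    # Hold out at least one whole patch for testing.
    n_test = max(1, int(round(test_fraction * len(shuffled))))
    test_patches = shuffled[:n_test]
    test_mask = np.isin(patch_ids, test_patches)
    return ~test_mask, test_mask


# Hypothetical example: 8 sample pixels drawn from 4 patches.
patch_ids = np.array([0, 0, 1, 1, 2, 2, 3, 3])
train_mask, test_mask = split_by_patch(patch_ids)
print("train patches:", np.unique(patch_ids[train_mask]))
print("test patches:", np.unique(patch_ids[test_mask]))
```

Splitting on patch IDs rather than on individual pixels is what prevents near-identical neighbors from leaking between the two sets.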
To avoid conflicts, first isolate this project by creating a Python virtual environment called venv. The virtual environment will have its own Python interpreter, dependencies, and scripts. Only enter commands in a terminal where venv is active.
On macOS or Linux:
python -m venv venv
source venv/bin/activate
pip install .
pip install -r requirements.txt
On Windows:
python -m venv venv
venv/Scripts/activate
pip install .
pip install -r requirements.txt
For our research we used optical data from Kaggle. The Python programs in the package were built with the following dataset:
Sylvester, S. (2021, April). Arctic sea ice image masking, Version 3. Retrieved May 17, 2021
To use the dataset we selected, make sure that a Kaggle API token is saved locally. Once the token is in place, cd into the data folder and run the following command:
kaggle datasets download alexandersylvester/arctic-sea-ice-image-masking
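Before running the download, it can help to confirm that the token file is where the Kaggle client usually looks for it. The sketch below assumes the commonly documented location `~/.kaggle/kaggle.json`, optionally overridden by the `KAGGLE_CONFIG_DIR` environment variable; other client versions or credential methods may behave differently. The helper `find_kaggle_token` is hypothetical and not part of this package.

```python
import os
from pathlib import Path


def find_kaggle_token(home=None, env=None):
    """Return the path to kaggle.json if it exists, otherwise None."""
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)
    # Assumption: KAGGLE_CONFIG_DIR, when set, replaces the default ~/.kaggle.
    config_dir = Path(env.get("KAGGLE_CONFIG_DIR", home / ".kaggle"))
    token = config_dir / "kaggle.json"
    return token if token.is_file() else None


token = find_kaggle_token()
if token is None:
    print("No Kaggle API token found in the assumed location.")
else:
    print(f"Kaggle API token found at {token}")
```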
The workflow commands below are listed in the order they should be run. Again, make sure venv is active when running them.
Note: On Windows, write python before the relative path to the script you want to run (for example, python scripts/dist-stat). On other platforms the command name alone is enough. The commands below assume you are in the project's home directory.
Purpose: Preprocessing step for feature extraction.
Note: The reference image of patch locations is retrieved using the extract_patch_locations shell script or batch script.
./extract_patch_locations.sh
./extract_patch_locations.bat
Purpose: Features for machine learning should be extracted for each sample pixel. This step extracts pixel samples so that the number of samples per class is nearly balanced throughout the resulting dataset.
Command to run distribution statistics on a folder (macOS/Linux):
dist-stat --input data/arctic-sea-ice-image-masking/Masks
Command to run distribution statistics on a single file (macOS/Linux):
dist-stat --input data/arctic-sea-ice-image-masking/Masks/P0-2016042417-mask.png
Command to run distribution statistics on a folder (Windows):
python scripts/dist-stat --input data/arctic-sea-ice-image-masking/Masks
Command to run distribution statistics on a single file (Windows):
python scripts/dist-stat --input data/arctic-sea-ice-image-masking/Masks/P0-2016042417-mask.png
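What a per-class pixel distribution looks like can be sketched as follows. The snippet counts how often each pixel value occurs in a small synthetic mask. The `pixel_distribution` helper and the mask values are hypothetical, and this is not the implementation behind `dist-stat`.

```python
import numpy as np


def pixel_distribution(mask):
    """Map each pixel value in the mask to the number of pixels holding it."""
    values, counts = np.unique(np.asarray(mask), return_counts=True)
    return {int(v): int(c) for v, c in zip(values, counts)}


# Hypothetical 3x4 mask with three classes encoded as pixel values 0, 1 and 2.
mask = np.array([
    [0, 0, 1, 1],
    [0, 2, 2, 1],
    [0, 0, 2, 1],
])
distribution = pixel_distribution(mask)
print(distribution)
```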
Command to create datasets via multiprocessing (macOS/Linux):
create-datasets --images ./data/arctic-sea-ice-image-masking/Images --masks ./data/arctic-sea-ice-image-masking/Masks --dist ./data/pixel_values.csv --patch-loc ./data/AOIs_R_thresh_CL_centroids.csv --multiprocess
Command to create datasets without multiprocessing:
create-datasets --images ./data/arctic-sea-ice-image-masking/Images --masks ./data/arctic-sea-ice-image-masking/Masks --dist ./data/pixel_values.csv --patch-loc ./data/AOIs_R_thresh_CL_centroids.csv
Note: WinError 5 will occur if you try creating datasets with multiprocessing on Windows.
Command to create datasets without multiprocessing (Windows):
python scripts/create-datasets --images ./data/arctic-sea-ice-image-masking/Images --masks ./data/arctic-sea-ice-image-masking/Masks --dist ./data/pixel_values.csv --patch-loc ./data/AOIs_R_thresh_CL_centroids.csv
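The goal of drawing a nearly equal number of samples per class can be sketched in a few lines. The example below is an illustration only: the function `sample_balanced`, the `per_class` cap, and the toy label array are hypothetical, and the sampling logic in `create-datasets` may differ.

```python
import numpy as np


def sample_balanced(labels, per_class, seed=0):
    """Pick up to `per_class` sample indices for every class label."""
    labels = np.asarray(labels)
    rng = np.random.default_rng(seed)
    chosen = []
    for cls in np.unique(labels):
        candidates = np.flatnonzero(labels == cls)
        # Rare classes contribute every pixel they have.
        n = min(per_class, candidates.size)
        chosen.append(rng.choice(candidates, size=n, replace=False))
    return np.sort(np.concatenate(chosen))


# Hypothetical flattened mask: class 0 dominates, class 2 is rare.
labels = np.array([0] * 10 + [1] * 5 + [2] * 2)
indices = sample_balanced(labels, per_class=3)
print(np.bincount(labels[indices]))
```

Capping each class at the same count keeps dominant classes from swamping the rare ones, at the cost of discarding some pixels from the dominant classes.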
Purpose: Generate five GLCM products for each data point.
On macOS or Linux:
GLCM --input ./data/train_dataset/raw.csv --img-dir ./data/arctic-sea-ice-image-masking/Images
On Windows:
python scripts/GLCM --input ./data/train_dataset/raw.csv --img-dir ./data/arctic-sea-ice-image-masking/Images
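For readers unfamiliar with GLCM products, the sketch below builds a co-occurrence matrix for one pixel offset in plain NumPy and derives texture measures from it. This README does not name the five products the package generates, so the five computed here (contrast, dissimilarity, homogeneity, energy, correlation) are an illustrative choice. The function names, the 4x4 window, and the four gray levels are likewise hypothetical.

```python
import numpy as np


def glcm(window, levels, dx=1, dy=0):
    """Symmetric, normalized co-occurrence matrix for one pixel offset."""
    window = np.asarray(window)
    rows, cols = window.shape
    matrix = np.zeros((levels, levels), dtype=float)
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                matrix[window[r, c], window[r2, c2]] += 1
    matrix += matrix.T  # count each pair in both directions
    return matrix / matrix.sum()


def glcm_products(p):
    """Texture measures from a normalized GLCM (undefined for constant windows)."""
    i, j = np.indices(p.shape)
    mu_i, mu_j = (i * p).sum(), (j * p).sum()
    sd_i = np.sqrt(((i - mu_i) ** 2 * p).sum())
    sd_j = np.sqrt(((j - mu_j) ** 2 * p).sum())
    return {
        "contrast": ((i - j) ** 2 * p).sum(),
        "dissimilarity": (np.abs(i - j) * p).sum(),
        "homogeneity": (p / (1.0 + (i - j) ** 2)).sum(),
        "energy": np.sqrt((p ** 2).sum()),
        "correlation": ((i - mu_i) * (j - mu_j) * p).sum() / (sd_i * sd_j),
    }


# Hypothetical 4x4 window already quantized to gray levels 0..3.
window = np.array([
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
])
p = glcm(window, levels=4)
products = glcm_products(p)
for name, value in products.items():
    print(f"{name}: {value:.4f}")
```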
Purpose: Normalizing the data can improve model performance. Every dataset other than the training dataset should be normalized with reference min and max values instead of values calculated from that dataset itself. The reference values come from the training dataset.
To normalize the training dataset (macOS/Linux):
normalize --input ./data/train_dataset/GLCM.csv --std-data ./data/train_dataset/GLCM.csv
To normalize the test dataset (macOS/Linux):
normalize --input ./data/test_dataset/GLCM.csv --std-data ./data/train_dataset/GLCM.csv
To normalize the training dataset (Windows):
python scripts/normalize --input ./data/train_dataset/GLCM.csv --std-data ./data/train_dataset/GLCM.csv
To normalize the test dataset (Windows):
python scripts/normalize --input ./data/test_dataset/GLCM.csv --std-data ./data/train_dataset/GLCM.csv
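The reason the test dataset is normalized against the training dataset can be shown with a small min-max example. This is a sketch under the assumption that normalization means `(x - min) / (max - min)` per column. The function `min_max_normalize` and the toy arrays are hypothetical and are not the package's code.

```python
import numpy as np


def min_max_normalize(data, reference):
    """Scale each column of `data` using the min and max of `reference`."""
    data = np.asarray(data, dtype=float)
    reference = np.asarray(reference, dtype=float)
    ref_min = reference.min(axis=0)
    ref_range = reference.max(axis=0) - ref_min
    # Guard against constant columns to avoid dividing by zero.
    ref_range[ref_range == 0] = 1.0
    return (data - ref_min) / ref_range


train = np.array([[0.0, 10.0], [5.0, 20.0], [10.0, 30.0]])
test = np.array([[2.5, 15.0], [12.0, 30.0]])

# Both datasets are scaled with the training min and max.
train_norm = min_max_normalize(train, reference=train)
test_norm = min_max_normalize(test, reference=train)
print(train_norm)
print(test_norm)
```

Because the test data reuse the training range, a test value can fall outside [0, 1], but identical raw values map to identical normalized values in both datasets.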
Purpose: Training, testing, and predicting of the model.
Note: The commands below appear to work only on macOS with an M1 chip.
Train neural network:
neural-network --dl-config ./DL_configs/GLCM_C6_cat.yml
Train the 1D-CNN (to concatenate it with the multi-layer neural network, add features other than spectral data and GLCM products):
# 1D-CNN
CNN --dl-config ./DL_configs/GLCM_C6.yml
# Concatenation of 1D-CNN and multi-layer NN
CNN --dl-config ./DL_configs/GLCM_C6_cat.yml
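How the two branches fit together can be sketched at the level of array shapes. The NumPy forward pass below slides 1D kernels across the per-pixel feature vector, passes the remaining features through one dense layer, concatenates both outputs, and classifies with softmax. It is not the package's training code: the weights are random, nothing is trained, and every size (8 CNN inputs, 4 other features, 4 filters, 6 classes) is a hypothetical choice.

```python
import numpy as np

rng = np.random.default_rng(0)


def conv1d_valid(x, kernels):
    """Slide each kernel along the feature axis ('valid' padding, stride 1).

    x: (n_samples, n_features); kernels: (n_filters, kernel_size).
    Returns shape (n_samples, n_features - kernel_size + 1, n_filters).
    """
    k = kernels.shape[1]
    windows = np.lib.stride_tricks.sliding_window_view(x, k, axis=1)
    return windows @ kernels.T


def relu(z):
    return np.maximum(z, 0.0)


def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract the row max for stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


n_samples, n_cnn_inputs, n_other, n_classes = 4, 8, 4, 6
cnn_inputs = rng.random((n_samples, n_cnn_inputs))  # stand-in for spectral data and GLCM products
other = rng.random((n_samples, n_other))            # stand-in for any additional features

# Branch 1: 1D convolution over neighboring inputs, then flatten.
kernels = rng.normal(size=(4, 3))
conv_features = relu(conv1d_valid(cnn_inputs, kernels)).reshape(n_samples, -1)

# Branch 2: one dense layer standing in for the multi-layer network.
w_dense = rng.normal(size=(n_other, 8))
dense_features = relu(other @ w_dense)

# Concatenate both branches and classify.
merged = np.concatenate([conv_features, dense_features], axis=1)
w_out = rng.normal(size=(merged.shape[1], n_classes))
probabilities = softmax(merged @ w_out)
print(merged.shape, probabilities.shape)
```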
Test the model:
test-model --dl-config ./DL_configs/GLCM_C6_cat.yml --result-dir ./results/CNN_GLCM_C6_cat
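The assessment measures named in the description (confusion matrix, accuracy, F1 score) can be computed as in the sketch below. It assumes macro-averaged F1, which this README does not specify. The function names and the toy labels are hypothetical.

```python
import numpy as np


def confusion_matrix(y_true, y_pred, n_classes):
    """Rows are true classes, columns are predicted classes."""
    matrix = np.zeros((n_classes, n_classes), dtype=int)
    np.add.at(matrix, (y_true, y_pred), 1)
    return matrix


def accuracy(matrix):
    """Share of samples on the diagonal, i.e. classified correctly."""
    return np.trace(matrix) / matrix.sum()


def macro_f1(matrix):
    """Unweighted mean of the per-class F1 scores."""
    tp = np.diag(matrix).astype(float)
    predicted = matrix.sum(axis=0)
    actual = matrix.sum(axis=1)
    # Per-class F1 = 2*TP / (predicted + actual); empty classes score 0.
    denom = predicted + actual
    f1 = np.divide(2 * tp, denom, out=np.zeros_like(tp), where=denom > 0)
    return f1.mean()


y_true = np.array([0, 0, 1, 1, 2, 2])
y_pred = np.array([0, 1, 1, 1, 2, 0])
cm = confusion_matrix(y_true, y_pred, n_classes=3)
print(cm)
print("accuracy:", accuracy(cm))
print("macro F1:", macro_f1(cm))
```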
For an image, run a prediction:
predict --patch-loc ./data/AOIs_R_thresh_CL_centroids.csv --std-data ./data/train_dataset/GLCM.csv --result-dir ./results/CNN_GLCM_C4_cat/ckpt_1 --dl-config ./DL_configs/GLCM_C4_cat.yml --mask-dir ./data/arctic-sea-ice-image-masking/Masks --input ./data/arctic-sea-ice-image-masking/Images/P54-2018071616.jpg --classes 4
Figure 1. Expert Image
Figure 2. Prediction Image
[1] R. Ressel, A. Frost and S. Lehner, "A Neural Network-Based Classification for Sea Ice Types on X-Band SAR Images," in IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 8, no. 7, pp. 3672-3680, July 2015, doi: 10.1109/JSTARS.2015.2436993.