Synopsis
- This is the first major release of Gym. It includes a well-tested implementation of UNets and Residual UNets for segmentation of 2 or more classes on 1-, 3- and N-band imagery. Models are trained using mixed precision by default, which allows for larger batches and shorter training times.
- This version uses doodleverse-utils v0.0.11 (https://github.com/Doodleverse/doodleverse_utils and https://pypi.org/manage/project/doodleverse-utils/release/0.0.11/)
- This version is supported by the test dataset published here: https://zenodo.org/record/7232051#.Y1HDKnbMIuU
- This version is slightly modified from (but functionally equivalent to) that documented in the following paper: Buscombe, D., & Goldstein, E. B. (2022). A reproducible and reusable pipeline for segmentation of geoscientific imagery. Earth and Space Science, 9, e2022EA002332. https://doi.org/10.1029/2022EA002332
Implementation overview
The implementation includes the following features and user choices, grouped by script:
make_nd_datasets.py
- use of datasets with or without augmentations, controlled by the 'mode' parameter
- an efficient compressed file storage format, npz (NumPy's zipped array archive format)
- images and labels are resized to a so-called 'TARGET SIZE' for the purposes of model training, but model inference can be applied to full-size imagery
- creates a subset of augmented and non-augmented example image-label overlays for verification of npz file contents
- supports on-the-fly label filtering using morphological operations to remove class islands and holes smaller than a threshold size
- supports on-the-fly class-remapping (recoding integer label by merging classes)
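As an illustration of the class-remapping and npz storage steps listed above, the following is a minimal sketch using NumPy. It is not Gym's own code: the function name `remap_classes`, the archive keys `image` and `label`, and the toy arrays are all hypothetical and chosen only for this example.

```python
import io

import numpy as np


def remap_classes(label, mapping):
    """Recode an integer label image using a {old_class: new_class} dict.

    Hypothetical helper for illustration; classes not named in `mapping`
    keep their original value.
    """
    # Build a lookup table large enough for every class that appears
    # either in the label image or as a key of the mapping.
    size = max(int(label.max()), max(mapping)) + 1
    lut = np.arange(size)
    for old, new in mapping.items():
        lut[old] = new
    # Indexing the table with the label image recodes every pixel at once.
    return lut[label].astype(label.dtype)


# Toy 3x3 label image with four classes (0 to 3).
label = np.array([[0, 1, 2],
                  [3, 3, 2],
                  [1, 0, 0]], dtype=np.uint8)

# Merge class 2 into class 1, then renumber class 3 as class 2.
merged = remap_classes(label, {2: 1, 3: 2})

# Store an image-label pair in a compressed npz archive. An in-memory
# buffer is used here so the sketch writes nothing to disk; a file path
# would work the same way. The key names are assumptions.
image = np.zeros((3, 3, 3), dtype=np.uint8)
buffer = io.BytesIO()
np.savez_compressed(buffer, image=image, label=merged)

# Read the archive back to confirm the round trip.
buffer.seek(0)
with np.load(buffer) as data:
    restored = data["label"]
```

A lookup table keeps the recoding vectorised, so the cost is a single indexing operation per label image regardless of how many classes are merged.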
train_model.py
- available loss functions: a) Dice, b) weighted Dice, c) categorical crossentropy, d) hinge, e) Kullback-Leibler divergence
- single and multi-GPU training
- highly configurable learning rate scheduler
- models are evaluated by keeping track of multiple metrics including a) mean IoU (Intersection over Union, also known as the Jaccard index), b) mean Dice, c) overall accuracy, d) mean frequency weighted IoU, e) Matthews Correlation Coefficient, f) per-class precision, g) per-class recall, and h) per-class F1 score
- an efficient data throughput pipeline using the TensorFlow tf.data API, including batching and auto pre-fetching
- available models: UNet, residual UNet, and 'satellite' UNet
- training employs early stopping and model checkpoints
- model weights are saved in h5 format, and a utility script is available for conversion to a portable model
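To make some of the evaluation metrics listed above concrete, here is a dependency-free sketch of how mean IoU, mean Dice, overall accuracy, and frequency weighted IoU can be derived from a confusion matrix. It uses the standard textbook definitions and is not taken from Gym's source; the function names and toy data are hypothetical, and scoring a class that is absent from both truth and prediction as 0 is one convention among several.

```python
def confusion_matrix(y_true, y_pred, n_classes):
    """Count pixels: rows are true classes, columns are predicted classes."""
    cm = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def segmentation_scores(cm):
    """Summarise a confusion matrix with four common segmentation metrics."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    ious, dices, freqs = [], [], []
    for k in range(n):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp                       # true k, predicted other
        fp = sum(cm[r][k] for r in range(n)) - tp  # predicted k, true other
        union = tp + fp + fn
        # Assumption: a class missing from both maps scores 0.
        ious.append(tp / union if union else 0.0)
        dices.append(2 * tp / (2 * tp + fp + fn) if union else 0.0)
        freqs.append(sum(cm[k]) / total)           # share of true pixels
    return {
        "mean_iou": sum(ious) / n,
        "mean_dice": sum(dices) / n,
        "overall_accuracy": sum(cm[k][k] for k in range(n)) / total,
        "freq_weighted_iou": sum(f * i for f, i in zip(freqs, ious)),
    }


# Toy flattened label maps for a 2-class problem (hypothetical data).
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 1, 1, 1, 1, 0]
scores = segmentation_scores(confusion_matrix(y_true, y_pred, 2))
```

For this toy input each class has 3 true positives, 1 false positive, and 1 false negative, so the per-class IoU is 3/5 and the per-class Dice is 6/8, which shows that Dice is never lower than IoU for the same prediction.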
seg_images_in_folder.py
- supports single model application, and model ensembling
- supports simple softmax thresholding and per-image Otsu thresholding for 2-class problems (binary segmentation)
- creates greyscale and color label outputs
- supports per-label postprocessing using a Conditional Random Field
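The difference between simple thresholding and per-image Otsu thresholding for binary segmentation can be sketched as follows. This is a plain-Python illustration of the general technique, assuming it is applied to the per-pixel foreground probability; it is not Gym's implementation, and the function name, bin count, and toy probabilities are hypothetical.

```python
def otsu_threshold(values, n_bins=256):
    """Return the threshold in (0, 1) that maximises between-class variance.

    `values` are per-pixel foreground probabilities in [0, 1].
    """
    # Histogram of the probabilities.
    hist = [0] * n_bins
    for v in values:
        hist[min(int(v * n_bins), n_bins - 1)] += 1

    total = len(values)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_var, best_bin = -1.0, 0
    w0, sum0 = 0, 0.0
    # Try every split: bins 0..i form the background, the rest foreground.
    for i in range(n_bins - 1):
        w0 += hist[i]
        sum0 += i * hist[i]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        # Between-class variance, up to a constant factor of 1 / total**2.
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_bin = var_between, i
    # The threshold is the upper edge of the last background bin.
    return (best_bin + 1) / n_bins


# Toy foreground probabilities for eight pixels (hypothetical data).
probs = [0.05, 0.10, 0.12, 0.08, 0.60, 0.65, 0.70, 0.62]

# Simple thresholding: one fixed cut-off for every image.
simple_mask = [int(p >= 0.5) for p in probs]

# Otsu thresholding: the cut-off is chosen from this image's own histogram.
t = otsu_threshold(probs)
otsu_mask = [int(p >= t) for p in probs]
```

On well-separated probabilities like these both approaches give the same mask; a per-image threshold matters mainly when the model's scores are systematically low or high for a particular image.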
Contribution credits
Package maintainers:
Contributors:
Future releases
Future releases will incorporate planned extensions in at least 3 areas:
- switching from keras image augmentation to a third-party library such as albumentations
- more control over use of train, validation, and test subsets in model training and evaluation
- more model architectures such as UNet++, U^2-Net, Attention UNet (etc)