Generalized Sparse Convolutional Neural Networks for Semantic Segmentation of Point Clouds Derived from Tri-Stereo Satellite Imagery
This GitHub repository contains the machine learning models described in Stefan Bachhofner, Ana-Maria Loghin, Michael Hornacek, Johannes Otepka, Andrea Siposova, Niklas Schmidinger, Norbert Pfeifer, Kurt Hornik, Nikolaus Schiller, Olaf Kähler, Ronald Hochreiter: Generalized Sparse Convolutional Neural Networks for Semantic Segmentation of Point Clouds Derived from Tri-Stereo Satellite Imagery.
@article{remoteSensing2020gscnn,
title={Generalized Sparse Convolutional Neural Networks for Semantic Segmentation of Point Clouds Derived from Tri-Stereo Satellite Imagery},
author={Bachhofner, Stefan and Loghin, Ana-Maria and Otepka, Johannes and Pfeifer, Norbert and Hornacek, Michael and Siposova, Andrea and Schmidinger, Niklas and Hornik, Kurt and Schiller, Nikolaus and K\"ahler, Olaf and Hochreiter, Ronald},
journal={Remote Sensing},
volume={12},
number={8},
article-number={1289},
year={2020},
month={April},
day={18},
publisher={Multidisciplinary Digital Publishing Institute},
url={https://www.mdpi.com/2072-4292/12/8/1289#cite},
issn={2072-4292},
doi={10.3390/rs12081289}
}
- Add docker
- Add python training script for GSCNN
- Add R training script for the decision tree
- Release Data to public (if possible)
1. Instructions
1.1 Installation Instructions
1.2 Usage Instructions
2. Paper
2.1 Abstract
2.3 Tables and Figures
2.3.1 Segmentation Results
2.3.2 Study Area
3. General Information
3.1 Authors by Institution
3.2 Project Partners
3.3 Funding
- Ubuntu 14.04 or higher
- Python 3.6 or higher
- CUDA 10.0 or higher
- pytorch 1.3 or higher
- Ubuntu 14.04 or higher
- Python 3.6 or higher
- CUDA 10.0 or higher
- pytorch 1.2 or higher
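The minimum versions listed above can be checked before installing anything else. The following is a minimal sketch, not part of this repository: the helper names `parse_version` and `meets_minimum` are hypothetical, and the PyTorch minimum is assumed to be the stricter of the two listed values (1.3).

```python
import re
import sys

# Minimum versions taken from the requirement lists above.
# Assumption: the stricter PyTorch value (1.3) is used.
MINIMUMS = {"python": "3.6", "cuda": "10.0", "torch": "1.3"}


def parse_version(text):
    """Turn a string such as '1.3.0+cu100' into (1, 3, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(installed, minimum):
    """Return True if the installed version is at least the minimum."""
    a, b = parse_version(installed), parse_version(minimum)
    # Pad with zeros so that (1, 3) and (1, 3, 0) compare as equal.
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


python_version = "{}.{}".format(*sys.version_info[:2])
print("python ok:", meets_minimum(python_version, MINIMUMS["python"]))
```

Once PyTorch is installed, `torch.__version__` and `torch.version.cuda` could be passed to `meets_minimum` in the same way.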
We recommend using Anaconda to isolate the environment.
The following command creates the conda environment py3-mink
and installs the necessary Python dependencies.
conda env create -f py3-mink.yml
To install the Minkowski Engine in the created environment run
conda activate py3-mink
sh install_minkowski_engine.sh
import numpy as np
import torch
import MinkowskiEngine as ME

# For loading LiDAR files
from laspy.file import File

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

def predict(model, features, coordinates):
    '''
    Takes the given model and returns its predictions for the given features,
    and coordinates. Note that only the features are used for making the predictions.
    The predictions are sent back to the cpu and returned as a numpy array.
    '''
    model.eval()
    model.to(device)

    point_cloud = ME.SparseTensor(features, coords=coordinates).to(device)

    with torch.no_grad():
        # Raw per-class scores for every point.
        output = model(point_cloud)

    # The predicted class is the index of the highest score.
    _, y_pred = torch.max(output.F, dim=1)
    return y_pred.cpu().numpy()

def load_point_cloud(path_to_point_cloud):
    '''
    Opens a point cloud in read mode.
    '''
    return File(path_to_point_cloud, mode="r")

def load_coordinates_from_point_cloud(path_to_point_cloud):
    '''
    Returns a numpy array of the point cloud's coordinates, one row per point.
    '''
    point_cloud = load_point_cloud(path_to_point_cloud=path_to_point_cloud)
    coordinates = np.vstack([point_cloud.X, point_cloud.Y, point_cloud.Z]).transpose()
    return coordinates

def normalize_coordinates(coordinates, denominator=10000):
    '''
    Scales the given coordinates by dividing them by denominator. The result
    lies in the range [0, 1] only if every raw coordinate lies in [0, denominator].
    '''
    return np.divide(coordinates, denominator)

model = torch.hub.load('MacOS/ReKlaSat-3D', 'coordinates')
coordinates = load_coordinates_from_point_cloud(path_to_point_cloud="./data/my_point_cloud.laz")
features = normalize_coordinates(coordinates=coordinates)
y_pred = predict(model=model, features=features, coordinates=coordinates)
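Because the classes in the study area are heavily unbalanced, it can help to check how the predicted labels are distributed before looking at any other metric. The sketch below is an illustration only: `label_shares` is a hypothetical helper, the example labels are made up, and the mapping from label index to class name is not stated here, so the output uses plain indices.

```python
from collections import Counter


def label_shares(labels):
    """Return {label: fraction of points} for an iterable of integer labels."""
    counts = Counter(int(label) for label in labels)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: counts[label] / total for label in sorted(counts)}


# Hypothetical predictions for six points; in practice, pass the array
# returned by predict().
example = [0, 0, 3, 0, 2, 3]
for label, share in label_shares(example).items():
    print(f"class {label}: {share:.1%}")
```

A share close to 100% for a single label would suggest that the model has collapsed onto the dominant class.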
import torch
entrypoints = torch.hub.list('MacOS/ReKlaSat-3D', force_reload=True)
print(entrypoints)
import torch
model = torch.hub.load('MacOS/ReKlaSat-3D', 'coordinates')
import torch
model = torch.hub.load('MacOS/ReKlaSat-3D', 'coordinates_epoch', epoch=40)
import torch
model = torch.hub.load('MacOS/ReKlaSat-3D', 'coordinates_colors')
import torch
model = torch.hub.load('MacOS/ReKlaSat-3D', 'coordinates_colors_epoch', epoch=40)
import torch
model = torch.hub.load('MacOS/ReKlaSat-3D', 'coordinates_colors_weighted')
import torch
model = torch.hub.load('MacOS/ReKlaSat-3D', 'coordinates_colors_weighted_epoch', epoch=149)
import torch
model = torch.hub.load('MacOS/ReKlaSat-3D', 'get_minkunet34c')
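The entry point names used above follow a regular pattern: `coordinates`, optionally followed by `_colors`, `_weighted`, and `_epoch`. A small helper can build the name from a few flags. This is a sketch under that assumption; `entrypoint_name` is a hypothetical function and not part of this repository, and `torch.hub.list` remains the authoritative source of available entry points.

```python
def entrypoint_name(colors=False, weighted=False, epoch=None):
    """Build an entry point name and its keyword arguments.

    Returns a (name, kwargs) pair that mirrors the names shown above.
    """
    if weighted and not colors:
        # Only a weighted model that also uses colors is listed above.
        raise ValueError("the weighted model listed above also uses colors")

    name = "coordinates"
    if colors:
        name += "_colors"
    if weighted:
        name += "_weighted"

    kwargs = {}
    if epoch is not None:
        name += "_epoch"
        kwargs["epoch"] = epoch
    return name, kwargs


name, kwargs = entrypoint_name(colors=True, epoch=40)
print(name, kwargs)
# With PyTorch installed, the pair could then be used as:
# model = torch.hub.load('MacOS/ReKlaSat-3D', name, **kwargs)
```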
We studied the applicability of point clouds derived from tri-stereo satellite imagery for semantic segmentation for generalized sparse convolutional neural networks by the example of an Austrian study area. We examined, in particular, if the distorted geometric information, in addition to color, influences the performance of segmenting clutter, roads, buildings, trees, and vehicles. In this regard, we trained a fully convolutional neural network that uses generalized sparse convolution one time solely on 3D geometric information (i.e., 3D point cloud derived by dense image matching), and twice on 3D geometric as well as color information. In the first experiment, we did not use class weights, whereas in the second we did. We compared the results with a fully convolutional neural network that was trained on a 2D orthophoto, and a decision tree that was once trained on hand-crafted 3D geometric features, and once trained on hand-crafted 3D geometric as well as color features. The decision tree using hand-crafted features has been successfully applied to aerial laser scanning data in the literature. Hence, we compared our main interest of study, a representation learning technique, with another representation learning technique, and a non-representation learning technique. Our study area is located in Waldviertel, a region in Lower Austria. The territory is a hilly region covered mainly by forests, agriculture, and grasslands. Our classes of interest are heavily unbalanced. However, we did not use any data augmentation techniques to counter overfitting. For our study area, we reported that geometric and color information only improves the performance of the Generalized Sparse Convolutional Neural Network (GSCNN) on the dominant class, which leads to a higher overall performance in our case. We also found that training the network with median class weighting partially reverts the effects of adding color. 
The network also started to learn the classes with lower occurrences. The fully convolutional neural network that was trained on the 2D orthophoto generally outperforms the other two with a kappa score of over 90% and an average per class accuracy of 61%. However, the decision tree trained on colors and hand-crafted geometric features has a 2% higher accuracy for roads.
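The abstract mentions median class weighting without giving a formula. One common definition is median frequency balancing, where each class weight is the median class frequency divided by that class's frequency. The sketch below assumes this definition, which may differ from the one used in the paper; the point counts are invented for illustration and are not the study's data.

```python
from statistics import median


def median_frequency_weights(class_counts):
    """Weight each class by median(frequency) / frequency.

    Classes with zero points are skipped, since they have no frequency.
    """
    total = sum(class_counts.values())
    if total == 0:
        raise ValueError("class_counts must contain at least one point")
    freqs = {c: n / total for c, n in class_counts.items() if n > 0}
    med = median(freqs.values())
    return {c: med / f for c, f in freqs.items()}


# Hypothetical point counts per class (not taken from the paper).
counts = {"clutter": 6400, "roads": 800, "buildings": 400,
          "trees": 2300, "vehicles": 100}
for name, weight in median_frequency_weights(counts).items():
    print(f"{name}: {weight:.3f}")
```

Rare classes receive weights above 1 and the dominant class a weight below 1. In PyTorch, such weights could be passed as the `weight` argument of `torch.nn.CrossEntropyLoss`.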
Quantitative overall comparison of the GSCNN, FCN-8s, and the decision tree. We use six commonly used metrics obtained from the segmentation results. The best value for each metric (that is, in each column) is shown in bold, and the best value among the GSCNN models in italics. Please see the paper for a class-level comparison.
| Models | Avg. Precision | Avg. Recall | Avg. F1 | Kappa | OA | Avg per Class Acc. |
|---|---|---|---|---|---|---|
| | % | % | % | % | % | % |
| baseline A | 12.85 | 20.00 | 15.64 | 47.33 | 64.25 | 20.00 |
| U-Net based GSCNN (3D) | | | | | | |
| Coordinates | *23.69* | *24.33* | *23.30* | 38.90 | 56.01 | *24.32* |
| Coordinates, Colors | 19.31 | 19.98 | 17.38 | *45.07* | *62.14* | 19.97 |
| Coordinates, Colors, W.L. | 21.92 | 22.24 | 21.36 | 34.30 | 51.07 | 22.22 |
| FCN-8s (2D) | | | | | | |
| Colors | **62.43** | **61.15** | **59.12** | **90.76** | **96.11** | **61.15** |
| Decision Tree (3D) | | | | | | |
| Coordinates | 43.89 | 38.73 | 39.54 | 82.00 | 89.10 | 38.73 |
| Coordinates, Colors | 61.03 | 58.72 | 58.96 | 86.60 | 93.18 | 58.71 |
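To make the column headers concrete, the sketch below computes five of the six metrics from a confusion matrix using their standard textbook definitions (macro averages over classes, Cohen's kappa). It is an illustration, not the evaluation code used for the paper: `segmentation_metrics` is a hypothetical helper, the example matrix is made up, and the average per-class accuracy is left out because its exact definition is given only in the paper. Results are fractions, whereas the table reports percentages.

```python
def segmentation_metrics(confusion):
    """Compute overall metrics from a square confusion matrix.

    confusion[i][j] is the number of points whose true class is i
    and whose predicted class is j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    if total == 0:
        raise ValueError("confusion matrix contains no points")

    row_sums = [sum(row) for row in confusion]  # points per true class
    col_sums = [sum(confusion[i][j] for i in range(n)) for j in range(n)]  # per predicted class
    diag = [confusion[i][i] for i in range(n)]  # correctly classified points

    def ratio(a, b):
        # Treat 0/0 as 0, e.g. precision of a class that is never predicted.
        return a / b if b else 0.0

    precision = [ratio(diag[i], col_sums[i]) for i in range(n)]
    recall = [ratio(diag[i], row_sums[i]) for i in range(n)]
    f1 = [ratio(2 * p * r, p + r) for p, r in zip(precision, recall)]

    overall_accuracy = sum(diag) / total
    # Agreement expected by chance, from the marginal distributions.
    expected = sum(r * c for r, c in zip(row_sums, col_sums)) / (total * total)
    kappa = ratio(overall_accuracy - expected, 1 - expected)

    return {
        "avg_precision": sum(precision) / n,
        "avg_recall": sum(recall) / n,
        "avg_f1": sum(f1) / n,
        "kappa": kappa,
        "overall_accuracy": overall_accuracy,
    }


# Made-up two-class example.
for key, value in segmentation_metrics([[40, 10], [5, 45]]).items():
    print(f"{key}: {value:.4f}")
```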
Overall accuracy progress over epochs for the GSCNN models. For the model that uses the weighted loss, only the first 50 epochs are shown.
Waldviertel, Lower Austria: (a) Overview map of Austria with marked location of study area; (b) Pléiades orthophoto of Waldviertel; the selected area used for semantic segmentation is marked with yellow.
Examples of point clouds derived from tri-stereo satellite imagery for each class: (a) Clutter; (b) Roads; (c) Buildings; (d) Trees; (e) Vehicles.
Assoc. Prof. Dr. Ronald Hochreiter (Project Manager)
Univ.Prof. Dipl.-Ing. Dr.techn. Norbert Pfeifer
Dipl.-Ing. Dr.techn. Johannes Otepka-Schremmer
Vienna University of Technology, Department of Geodesy and Geoinformation.
Siemens AG Österreich, Corporate Technology.
Federal Ministry of Defence, Austria.
This research was funded by the Austrian Research Promotion Agency (FFG) project “3D Reconstruction and Classification from Very High Resolution Satellite Imagery (ReKlaSat 3D)” (grant agreement No. 859792).