Team project at Skoltech summer ML school
Unfortunately, we were asked not to share the dataset, but here are some details about its structure:
./datasets                  # 66G total
├── Russia
│   ├── test                # 2.7G
│   │   ├── images          # 630 files
│   │   └── masks           # 630 files
│   ├── train               # 43G
│   │   ├── images          # 8636 files
│   │   └── masks           # 8636 files
│   └── valid               # 5.5G
│       ├── images          # 1199 files
│       └── masks           # 1199 files
└── USA
    ├── train               # 14G
    │   ├── images          # 2605 files
    │   └── masks           # 2605 files
    └── valid               # 1.8G
        ├── images          # 346 files
        └── masks           # 346 files
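
Each split holds the same number of images and masks, and an image and its mask appear to share a file name (for example `large_22_09.tif` exists under both `images` and `masks`). A minimal sketch of a layout check under that assumption follows; the helper names `check_split` and `check_dataset` are hypothetical and not part of the project code.

```python
from pathlib import Path


def check_split(split_dir):
    """Return (n_images, n_masks, unmatched_names) for one split directory.

    Assumes the split contains `images/` and `masks/` subfolders and that
    an image and its mask share the same file name.
    """
    split_dir = Path(split_dir)
    images = {p.name for p in (split_dir / "images").iterdir() if p.is_file()}
    masks = {p.name for p in (split_dir / "masks").iterdir() if p.is_file()}
    # Symmetric difference: names present on one side only.
    return len(images), len(masks), sorted(images ^ masks)


def check_dataset(root, splits):
    """Run check_split for every split path given relative to `root`."""
    root = Path(root)
    return {split: check_split(root / split) for split in splits}
```

For the tree above, the call could look like `check_dataset("./datasets", ["Russia/test", "Russia/train", "Russia/valid", "USA/train", "USA/valid"])`; an empty list of unmatched names for every split means images and masks line up.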
N | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
---|---|---|---|---|---|---|---|---|---|---|
Sentinel-2A | B02 | B03 | B04 | B05 | B06 | B07 | B08 | B8A | B11 | B12 |
Standard | B | G | R | RE1 | RE2 | RE3 | N | N2 | S1 | S2 |
N | 0 | 1 | 2 | 3 | 4 |
---|---|---|---|---|---|
seg. mask | Water | Urban | Bare soil | Forest | Grassland |
Spectral indices used:
short | long | type | formula |
---|---|---|---|
BI | Bare Soil Index | soil | ((S1+R)-(N+B))/((S1+R)+(N+B)) |
BNDVI | Blue Normalized Difference Vegetation Index | vegetation | (N-B)/(N+B) |
MGRVI | Modified Green Red Vegetation Index | vegetation | (G**2-R**2)/(G**2+R**2) |
NDCI | Normalized Difference Chlorophyll Index | water | (RE1-R)/(RE1+R) |
NLI | Non-Linear Vegetation Index | vegetation | ((N**2)-R)/((N**2)+R) |
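
The formulas in the table can be written down directly in code. The sketch below uses the channel order from the Sentinel-2A table; the names `BAND_NAMES`, `split_bands`, `spectral_indices` and the small `eps` term in the denominators are assumptions made for illustration, not the project's actual implementation.

```python
# Channel order copied from the Sentinel-2A table (channel index -> short name).
BAND_NAMES = ["B", "G", "R", "RE1", "RE2", "RE3", "N", "N2", "S1", "S2"]


def split_bands(pixel):
    """Map a 10-value pixel to {short band name: float value}."""
    if len(pixel) != len(BAND_NAMES):
        raise ValueError(f"expected {len(BAND_NAMES)} bands, got {len(pixel)}")
    return {name: float(value) for name, value in zip(BAND_NAMES, pixel)}


def spectral_indices(bands, eps=1e-9):
    """Compute the five indices from the table.

    `eps` is an assumption (not from the source); it only guards against
    0/0 on pixels where the denominator bands are all zero.
    """
    B, G, R = bands["B"], bands["G"], bands["R"]
    RE1, N, S1 = bands["RE1"], bands["N"], bands["S1"]
    return {
        "BI": ((S1 + R) - (N + B)) / ((S1 + R) + (N + B) + eps),
        "BNDVI": (N - B) / (N + B + eps),
        "MGRVI": (G**2 - R**2) / (G**2 + R**2 + eps),
        "NDCI": (RE1 - R) / (RE1 + R + eps),
        "NLI": (N**2 - R) / (N**2 + R + eps),
    }
```

Because `spectral_indices` uses only arithmetic operators, it should also work element-wise when the dictionary holds whole bands as floating-point arrays. The `uint16` data would need to be cast to float first, since unsigned subtraction wraps around. Note also that NLI depends on the scale of the input values; whether the bands were rescaled before computing it is not stated here.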
Filename: datasets/USA/train/images/large_22_09.tif
Dimensions: 3D
Shape: (512, 512, 10)
Number of bands: 10
Data type: uint16
Bit depth: 16
Unique colors per band: [1762, 1786, 1786, 1788, 1770, 1803, 1777, 1768, 1767, 1748]
Filename: datasets/USA/train/masks/large_22_09.tif
Dimensions: 2D
Shape: (512, 512)
Number of bands: 1
Data type: uint8
Bit depth: 8
Unique colors per band: [5]
- Pre-train a CNN on the `USA` dataset, measure performance: Validation;
- Measure performance on `Russia/test`: Baseline;
- Randomly choose 1k shots from `Russia/train`;
- Train for 1 epoch, measure performance: Just;
- Undo changes, freeze layers 0-6;
- Train for 1 epoch, measure performance: Frozen 6.
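
The random 1k-shot selection step can be sketched as below. This is a minimal illustration that assumes a fresh subset is drawn for every trial and that the trial index is used as the seed; neither detail is confirmed by the description above, and the function names `sample_shots` and `trial_subsets` are hypothetical.

```python
import random


def sample_shots(filenames, k=1000, seed=0):
    """Pick k distinct file names uniformly at random, reproducibly per seed."""
    rng = random.Random(seed)
    # Sort first so the result does not depend on directory listing order.
    return rng.sample(sorted(filenames), k)


def trial_subsets(filenames, n_trials=10, k=1000):
    """Draw one k-shot subset per trial; the trial index doubles as the seed."""
    return [sample_shots(filenames, k=k, seed=trial) for trial in range(n_trials)]
```

Seeding each trial separately keeps every subset reproducible, so the two fine-tuning variants (with and without frozen layers) can be trained on exactly the same shots within a trial.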
Experiment results, averaged over 10 trials:
The experiment performed is not domain adaptation _per se_. However, adding computed spectral indices (SI) as approximately invariant properties of land cover types comes much closer to a domain adaptation approach. Further experiments are needed to compare model performance with and without the added SI bands.
- Pre-train CNN-1 (no added spectral index bands) and CNN-2 (with 5 added spectral index bands) on `USA` (source domain), measure performance;
- Measure the performance drop on `Russia/test` for both models;
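
One way to make the "performance drop" comparable between the two models is to report it both in absolute terms and relative to the source-domain score. The sketch below assumes a single scalar metric where higher is better; the metric itself and the helper names `performance_drop` and `compare_models` are assumptions, not taken from the project.

```python
def performance_drop(source_score, target_score):
    """Return (absolute, relative) drop from the source to the target domain."""
    absolute = source_score - target_score
    # The relative drop is undefined when the source score is zero.
    relative = absolute / source_score if source_score else float("nan")
    return absolute, relative


def compare_models(scores):
    """`scores` maps a model name to (source_score, target_score)."""
    return {name: performance_drop(src, tgt) for name, (src, tgt) in scores.items()}
```

A smaller relative drop for CNN-2 than for CNN-1 would suggest that the added spectral index bands help the model transfer from `USA` to `Russia/test`.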