Change BigEarthNet class mapping order to match original code #1127

Merged
merged 4 commits into from
Feb 20, 2023

lucastao
Contributor

It seems like the mapping from 43 to 19 classes in the BigEarthNet dataset code is incorrect.

Example
The image below is the RGB visualization of S2B_MSIL2A_20180502T093039_42_25. Its textual labels are 'Broad-leaved forest', 'Natural grassland', and 'Transitional woodland/shrub'. Per the mapping provided here, these correspond to indices 5, 24, and 39. After converting these from 43-class labels to 19-class labels using the label converter, 5 is discarded, 24 (Natural grassland) maps to 10 (Mixed forest), and 39 (Transitional woodland/shrub) maps to 17 (Inland waters). All three results are incorrect: 5 (Broad-leaved forest) should map to 8 (Broad-leaved forest), 24 (Natural grassland) should map to 11 (Natural grassland and sparsely vegetated areas), and 39 (Transitional woodland/shrub) should map to 13 (Transitional woodland, shrub).

[Image: RGB visualization of S2B_MSIL2A_20180502T093039_42_25]

Root Cause and Fix
The root cause appears to be that the list of 43 classes in torchgeo is sorted alphabetically (https://github.com/microsoft/torchgeo/blob/main/torchgeo/datasets/bigearthnet.py#L145-L190), when it should instead match the order used by the BigEarthNet code (https://git.tu-berlin.de/rsim/BigEarthNet-S2_19-classes_models/-/blob/master/label_indices.json#L2-46).

  1. Look at the BigEarthNet code for 43 to 19 class mappings provided at https://bigearth.net/, specifically this file: https://git.tu-berlin.de/rsim/BigEarthNet-S2_19-classes_models/-/blob/master/label_indices.json#L2-46. Here, the 43 classes are not sorted in alphabetical order.

  2. Look at the BigEarthNet code for 43 to 19 class mappings in torchgeo: https://github.com/microsoft/torchgeo/blob/main/torchgeo/datasets/bigearthnet.py#L145-L190. These are sorted in alphabetical order.

  3. However, the mapping from 43 to 19 classes is index-based and uses the same indices in both torchgeo (https://github.com/microsoft/torchgeo/blob/main/torchgeo/datasets/bigearthnet.py#L193-L226) and the original code (https://git.tu-berlin.de/rsim/BigEarthNet-S2_19-classes_models/-/blob/master/label_indices.json#L47-67). Because those indices assume the original class order, applying them to the alphabetically sorted list selects the wrong classes. To resolve the mismatch, either the order of the 43 classes or the indexing dictionary must change.
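The mismatch described in the steps above can be reproduced with a small sketch. It is not the torchgeo implementation: the dictionary and function names are hypothetical, and the converter contains only the handful of entries quoted in this PR rather than the full 43-to-19 table.

```python
# Illustrative sketch only; names are hypothetical, not torchgeo's API.
LABELS = ["Broad-leaved forest", "Natural grassland", "Transitional woodland/shrub"]

# Index of each example label in the 43-class list under the two orderings,
# using the values reported in this PR.
ALPHABETICAL_INDEX = {
    "Broad-leaved forest": 5,
    "Natural grassland": 24,
    "Transitional woodland/shrub": 39,
}
ORIGINAL_INDEX = {
    "Broad-leaved forest": 22,
    "Natural grassland": 25,
    "Transitional woodland/shrub": 28,
}

# Excerpt of the index-based 43 -> 19 converter: only the entries mentioned
# in this PR. Index 5 has no entry, so it is discarded.
LABEL_CONVERTER_EXCERPT = {22: 8, 24: 10, 25: 11, 28: 13, 39: 17}

# Names of the 19-class labels that appear in the example.
CLASS_19_NAMES = {
    8: "Broad-leaved forest",
    10: "Mixed forest",
    11: "Natural grassland and sparsely vegetated areas",
    13: "Transitional woodland, shrub",
    17: "Inland waters",
}


def convert(labels, index_of):
    """Map textual labels to 19-class indices; None means the label is dropped."""
    return [LABEL_CONVERTER_EXCERPT.get(index_of[label]) for label in labels]


if __name__ == "__main__":
    for title, index_of in [("alphabetical", ALPHABETICAL_INDEX),
                            ("original", ORIGINAL_INDEX)]:
        for label, idx19 in zip(LABELS, convert(LABELS, index_of)):
            target = CLASS_19_NAMES.get(idx19, "discarded")
            print(f"{title}: {label} -> {idx19} ({target})")
```

With the alphabetical indices the converter drops or mislabels all three classes; with the original indices it yields 8, 11, and 13.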

This commit adjusts the order of the 43 classes to match the original. After testing, I find that the above example now has the correct mapping:

Textual label: ['Broad-leaved forest', 'Natural grassland', 'Transitional woodland/shrub']
43 class: [22, 25, 28]
19 class: [8, 11, 13]
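A check along the following lines could guard against the ordering drifting again. This is a minimal sketch, assuming the reference ordering is available as a name-to-index dictionary (for example, loaded from label_indices.json; the exact structure of that file is not shown here). The function name and the toy class names are made up for illustration.

```python
# Hypothetical helper, not part of torchgeo.
def find_order_mismatches(class_names, reference_index):
    """Return (name, position, expected) for each class whose position in
    class_names differs from its index in the reference mapping."""
    mismatches = []
    for position, name in enumerate(class_names):
        expected = reference_index.get(name)  # None if the name is unknown
        if expected != position:
            mismatches.append((name, position, expected))
    return mismatches


if __name__ == "__main__":
    # Toy reference with made-up class names, standing in for the real 43 classes.
    reference = {"urban": 0, "forest": 1, "water": 2}

    alphabetical = sorted(reference)                  # the ordering that caused the bug
    original = sorted(reference, key=reference.get)   # the ordering of the reference

    print(find_order_mismatches(alphabetical, reference))
    print(find_order_mismatches(original, reference))
```

An empty result means the class list and the reference agree, so an index-based converter built for the reference can be applied safely.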

github-actions bot added the "datasets" (Geospatial or benchmark datasets) label Feb 20, 2023
@lucastao
Contributor Author

@microsoft-github-policy-service agree

adamjstewart added this to the 0.4.1 milestone Feb 20, 2023
Collaborator

adamjstewart left a comment

Good catch, thanks for fixing this!

adamjstewart enabled auto-merge (squash) February 20, 2023 19:47
@calebrob6
Member

Thanks @lucastao!

adamjstewart merged commit 562778c into microsoft:main Feb 20, 2023
calebrob6 added a commit that referenced this pull request Apr 10, 2023
* Change BigEarthNet class mapping order to match original code

* Add data source comment and fix Sphinx numbered list

* Override flake8 long line warning for url

* Update torchgeo/datasets/bigearthnet.py

Co-authored-by: Adam J. Stewart

---------

Co-authored-by: Caleb Robinson
Co-authored-by: Adam J. Stewart
yichiac pushed a commit to yichiac/torchgeo that referenced this pull request Apr 29, 2023