Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inconsistencies between handling of bands in datasets #289

Open
nilsleh opened this issue Dec 15, 2021 · 10 comments
Open

Inconsistencies between handling of bands in datasets #289

nilsleh opened this issue Dec 15, 2021 · 10 comments
Labels
datasets Geospatial or benchmark datasets

Comments

@nilsleh
Copy link
Collaborator

nilsleh commented Dec 15, 2021

I am currently working on adding plotting methods based on #253. For datasets that contain Sentinel multispectral imagery, I have two questions.

  1. In the CV4AKenyaCropType, the band names can be chosen, when creating a dataset by passing a tuple of strings. In the SEN12MS dataset, the bands can be chose with integers in a list. In ZueriCrop dataset it is not possible to specify bands. Should there not be a standardized way across all datasets?
  2. Given that people can choose different bands, how should the plot method be handled. For example, when RGB bands are not chosen? Or another way of handling plotting for multispectral datasets?
@adamjstewart adamjstewart added the datasets Geospatial or benchmark datasets label Dec 15, 2021
@adamjstewart
Copy link
Collaborator

  1. Yes, this should be standardized some way
  2. We could either have plotting fail, or have it still work as long as there are at least 3 bands. @calebrob6 thoughts?

@calebrob6
Copy link
Member

calebrob6 commented Dec 15, 2021

Great questions!

  1. Agree with Adam
  2. I've been raising a ValueError when the RGB bands aren't present (https://github.com/microsoft/torchgeo/blob/main/torchgeo/datasets/seco.py#L260, https://github.com/microsoft/torchgeo/blob/plotting/torchgeo/datasets/benin_cashews.py#L440). I wanted the plot methods to be an easy way for someone to see what the data looks like / gently introduce people who aren't familiar with satellite imagery to the fact that it doesn't come nicely normalized (i.e. I don't want users to get stuck trying to visualize S2 patches). I don't want the plot methods to try to solve every case of visualizing the data.

@nilsleh
Copy link
Collaborator Author

nilsleh commented Dec 15, 2021

  1. If you have a preference for a standardized way of doing this, I can change all the datasets accordingly.
  2. Given a standardized way of choosing bands, I can implement the plotting methods for those datasets like the examples you gave.

@calebrob6
Copy link
Member

  1. I like list or tuple of string band names
  2. Great!

I think both of these can be done at the same time

@isaaccorley
Copy link
Collaborator

An alternative could be to plot a false color image from a user defined subset of the bands in the image (basically a user defines indices like [0, 1, 2]) as an argument to plot. This is how it's done in torchgeo.datasets.IDTReeS when plotting the hyperspectral image.

@adamjstewart adamjstewart added this to the 0.2.0 milestone Dec 19, 2021
@adamjstewart adamjstewart changed the title Dealing with Multispectral Bands in different Datasets Inconsistencies between handling of bands in datasets Dec 28, 2021
@adamjstewart
Copy link
Collaborator

Another issue I noticed is that So2Sat doesn't allow you to select bands but So2SatDataModule does. The issue with this is that the So2Sat.plot method assumes that the sample will always have all bands. We should copy the So2SatDataModule band selection to So2Sat and update So2Sat.plot to handle this.

@remtav
Copy link
Contributor

remtav commented Feb 23, 2022

Relating to band selection: as more datasets arise in torchgeo, could band selection become standardized to the STAC specification common_names? This could also become a standard way of doing band selection from STAC items when some datasets point to STAC items.

FYI, I'm currently trying this "band selection interface" out with a dataset implemented torchgeo's way. I think it has some potential.

@adamjstewart adamjstewart modified the milestones: 0.3.0, 0.4.0 Jul 9, 2022
@adamjstewart
Copy link
Collaborator

Related to, and partially addressed by #687.

As I mentioned in #394, I'm starting to think that it doesn't make much sense to allow band selection for curated benchmark datasets, only for uncurated raster datasets. What do you think?

@calebrob6
Copy link
Member

Why not? E.g. experiments that compare how multispectral information compare to only rgb information on benchmark datasets are quite interesting, I think.

@adamjstewart
Copy link
Collaborator

Hmm, that's fair, someone might want to do a more in-depth ablation study on which bands actually matter. I guess I'm not opposed if someone wants it, but I wouldn't go changing everything if we don't need it, especially if it doesn't speed up data loading times.

I think I want to rename the So2Sat bands anyway, will submit a PR to do that.

@adamjstewart adamjstewart removed this from the 0.4.0 milestone Dec 6, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
datasets Geospatial or benchmark datasets
Projects
None yet
Development

No branches or pull requests

5 participants