Add OpenBuildings dataset #402

Merged 12 commits on Feb 27, 2022

Conversation

nilsleh
Collaborator

@nilsleh nilsleh commented Feb 15, 2022

This PR closes #68 and adds the OpenBuildings dataset.

Dataset features:

  • more than 516 M building detections across the African continent
  • confidence score per building detection
  • Google Plus Code corresponding to the center of each building polygon

Dataset format:

  • building data comes in 135 csv.gz files
  • a metadata file, tiles.json, that holds the geometry of each of the 130 tiles
  • 5 csv.gz files are not covered by the metadata file, presumably because they are very small (one entry)
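
As a rough illustration of this layout, the sketch below streams one gzip-compressed CSV tile row by row with the Python standard library and keeps only detections above a confidence threshold. The column names `confidence` and `geometry`, the helper name `iter_buildings`, and the sample rows are assumptions made for the example; they are not taken from this PR.

```python
import csv
import gzip
import io
from typing import Dict, Iterator


def iter_buildings(fileobj, min_confidence: float = 0.0) -> Iterator[Dict[str, str]]:
    """Lazily yield rows from one gzip-compressed CSV tile.

    Column names ("confidence", "geometry") are assumed for illustration.
    """
    # Text mode lets csv parse the decompressed stream without loading it all.
    with gzip.open(fileobj, mode="rt", newline="") as f:
        for row in csv.DictReader(f):
            if float(row["confidence"]) >= min_confidence:
                yield row


# Tiny in-memory stand-in for one csv.gz tile (made-up values).
raw = (
    "confidence,geometry\n"
    '0.92,"POLYGON((0 0, 0 1, 1 1, 1 0, 0 0))"\n'
    '0.61,"POLYGON((2 2, 2 3, 3 3, 3 2, 2 2))"\n'
)
tile = io.BytesIO(gzip.compress(raw.encode("utf-8")))
kept = list(iter_buildings(tile, min_confidence=0.7))
```

Because the function is a generator, a caller can stop early or filter further without holding a whole tile in memory.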

Questions/Concerns:

  • I could not find the source crs
  • Currently, loading an item can take fairly long because some csv.gz contain millions of buildings
  • It is quite possible that a BoundingBox query hits a file but still finds no shapes in that area, because the tiles are so large.
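
The last point can be pictured as a two-stage filter: a coarse test against the tile's bounds, then a per-building test. The sketch below is not the code in this PR; the names `wkt_bounds`, `intersects`, and `query` are hypothetical, it handles only simple 2D WKT polygons, and it compares bounding boxes rather than exact geometries.

```python
import re
from typing import Iterable, List, Tuple

Box = Tuple[float, float, float, float]  # (minx, miny, maxx, maxy)

_NUMBER = re.compile(r"-?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?")


def wkt_bounds(wkt: str) -> Box:
    """Bounds of a simple 2D WKT polygon, read from its coordinate pairs."""
    nums = [float(n) for n in _NUMBER.findall(wkt)]
    xs, ys = nums[0::2], nums[1::2]
    return min(xs), min(ys), max(xs), max(ys)


def intersects(a: Box, b: Box) -> bool:
    """True if two axis-aligned boxes overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def query(tile_box: Box, polygons: Iterable[str], query_box: Box) -> List[str]:
    """Stage 1: coarse tile test. Stage 2: per-polygon bounds test."""
    if not intersects(tile_box, query_box):
        return []
    return [p for p in polygons if intersects(wkt_bounds(p), query_box)]


tile_box = (0.0, 0.0, 10.0, 10.0)
polygons = ["POLYGON((1 1, 1 2, 2 2, 2 1, 1 1))"]
hit_but_empty = query(tile_box, polygons, (8.0, 8.0, 9.0, 9.0))
hit_with_shape = query(tile_box, polygons, (0.0, 0.0, 3.0, 3.0))
```

Here the first query overlaps the tile but none of its buildings, so the tile counts as a hit while the result is empty, which is the situation described above.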

Thus, I would welcome suggestions on these points, as well as any other suggestions of course.

Plot example:
[Image: openBuildings]

@github-actions github-actions bot added datasets Geospatial or benchmark datasets testing Continuous integration testing labels Feb 15, 2022
@adamjstewart adamjstewart added this to the 0.3.0 milestone Feb 15, 2022
@github-actions github-actions bot added the documentation Improvements or additions to documentation label Feb 16, 2022
@adamjstewart
Collaborator

> I could not find the source crs

Can you elaborate on this? If there is no CRS, then there's no way to make this a GeoDataset, right? You have to have some transform to map this dataset from one CRS to another in order to use it geospatially.

@nilsleh
Collaborator Author

nilsleh commented Feb 23, 2022

> Can you elaborate on this? If there is no CRS, then there's no way to make this a GeoDataset, right? You have to have some transform to map this dataset from one CRS to another in order to use it geospatially.

I could not find a CRS in the metadata file or in the CSV files that contain the polygon strings. Since the data consists of geographic coordinates, I assumed a CRS would be specified somewhere. Because it is not, is there another way to determine it, or to handle its absence?

@adamjstewart
Collaborator

Hmm, if the latitudes fall within [-90, 90] and the longitudes within [-180, 180] or [0, 360], then we can probably assume the coordinates are simply in lat/long units?
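
A minimal sketch of that range check, assuming the coordinates have already been parsed into (lon, lat) pairs. The function name `looks_like_lon_lat` and the sample values are hypothetical. Passing the check is only consistent with degree units; it does not prove which CRS or datum the data uses.

```python
from typing import Iterable, Tuple


def looks_like_lon_lat(coords: Iterable[Tuple[float, float]]) -> bool:
    """Return True if all (lon, lat) pairs fall inside plausible degree ranges."""
    pairs = list(coords)
    if not pairs:
        return False  # nothing to judge
    lons = [lon for lon, _ in pairs]
    lats = [lat for _, lat in pairs]
    lat_ok = all(-90.0 <= lat <= 90.0 for lat in lats)
    # Accept either longitude convention, but consistently across the sample.
    lon_ok = all(-180.0 <= lon <= 180.0 for lon in lons) or all(
        0.0 <= lon <= 360.0 for lon in lons
    )
    return lat_ok and lon_ok


degrees_like = looks_like_lon_lat([(36.8, -1.3), (3.4, 6.5)])
projected_like = looks_like_lon_lat([(500000.0, 4649776.0)])
```

Values in the hundreds of thousands, as in the second call, would instead suggest projected coordinates in metres.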

adamjstewart previously approved these changes Feb 24, 2022
adamjstewart previously approved these changes Feb 27, 2022
@adamjstewart adamjstewart enabled auto-merge (squash) February 27, 2022 20:03
@adamjstewart adamjstewart merged commit 06ec364 into microsoft:main Feb 27, 2022
@adamjstewart adamjstewart mentioned this pull request Jul 11, 2022
yichiac pushed a commit to yichiac/torchgeo that referenced this pull request Apr 29, 2023
* populate index attempt

* added tests

* correct plot method

* fix test

* fix documentation

* fix docs

* name changes

* lazy import pandas and Any instead of Tensor

* requested changes

* mypy fixes

* Close plot filehandles

Co-authored-by: Adam J. Stewart <[email protected]>
Labels
datasets Geospatial or benchmark datasets documentation Improvements or additions to documentation testing Continuous integration testing
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Open Buildings Dataset
2 participants