Add custom RasterDataset notebook #283
Can you add the tutorial name to docs/index.rst?
Goes to show I should finally learn how to use Sphinx...
Thanks @RitwikGupta! Just reviewed this but it is hard to leave comments inline on notebooks...
""" xview3tiny We would like to create a custom Thoughts?
I think we try to run each notebook as an integration test before new releases and this one would break because there is no way to auto download xView3 tiny. This is generally OK with me (as other notebooks have special test behavior and exist to show one-off things). @adamjstewart, thoughts? See the notebook render here: https://github.com/RitwikGupta/torchgeo/blob/custom_rasterdataset_doc/docs/tutorials/custom_raster_dataset.ipynb
@calebrob6 if you want to self-host xView3 tiny like you self host your subset of xView2, that should be simple enough to add in. At some point once we finish writing the paper for xView3 I can add the right citation for it too.
We don't self-host any real data as to not run into any licensing issues. I.e., all the data at
How are the integration tests for
(Disclaimer, I might have my terminology mixed up, I'm new to this software engineering stuff) I don't think we run "integration" tests on Some parts of the code take too long to run on every commit to a PR, so we mark these as "slow" and only run them occasionally (see https://github.com/microsoft/torchgeo/blob/main/tests/test_train.py#L12 for an example). These are what I'm calling "integration" tests and the notebooks fall in this category. To run the unit tests you can do
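For readers unfamiliar with the "slow" marking described above, here is a minimal sketch of how a pytest marker can tag a long-running test. It assumes pytest is installed; the test name and body are hypothetical placeholders, and how the project actually selects or skips marked tests is not shown in this thread.

```python
import pytest


@pytest.mark.slow  # custom marker; a project would normally register it in its pytest config
def test_notebook_style_check():
    # Hypothetical placeholder for a long-running check, such as executing a tutorial
    total = sum(range(1000))
    assert total == 499500
```

The marker only attaches a label to the test; deciding when labelled tests run is left to the test configuration.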
Okay, how's this. I created completely empty xView3 rasters with the same geotransforms as original source files. The custom
haha yes, that will definitely work! Note: I just wanted Adam to weigh in on how to handle these notebooks in general. It seems reasonable to me that we don't have to force these tutorial notebooks be fully executable by the github tests (at the cost of losing some usefulness / cool factor). However, if we decide that, then we'll have to be extra vigilant about running them ourselves periodically.
I think it makes sense to expect all of our tutorials to actually run. Without this, we have no way of testing them, so they rapidly become out-of-date. Manually running tests is a perfect recipe for not running tests. If we just want to document a feature, it doesn't need to be in a notebook. Note: there are actually two types of integration tests we run, the slow unit tests can be run with
Okay so last changes:
Zipping all the files together brings the total file size down to 142 KB. What's the way to have TorchGeo automatically unzip files before using them?
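As a rough illustration of the unzip-before-use step asked about here, the sketch below uses only the Python standard library. It is a stand-in, not TorchGeo's own helper (the notebook excerpt quoted later in this thread imports `extract_archive` from `torchgeo.datasets.utils`, which is not reproduced here). The function name `extract_and_list` and the archive and file names are hypothetical.

```python
import tempfile
import zipfile
from pathlib import Path


def extract_and_list(archive, dest, pattern="*_dB.tif"):
    """Extract a zip archive into dest and return the files matching pattern."""
    archive, dest = Path(archive), Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(dest)
    # Search recursively in case the archive contains subdirectories
    return sorted(dest.rglob(pattern))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        tmp = Path(tmp)
        archive = tmp / "sample_data.zip"  # hypothetical archive name
        with zipfile.ZipFile(archive, "w") as zf:
            zf.writestr("scene_VV_dB.tif", b"")  # empty placeholder file
        print([p.name for p in extract_and_list(archive, tmp / "extracted")])
```

Extracting into a dedicated directory keeps the unpacked files separate from the committed archive, so the directory can then be passed as the dataset root.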
e3c91bf to 71f6b5c
Try that on for size
Fits great!
This lgtm but while I'm here looking at the notebook, I think you can simplify the dataset further from

    class XView3Polarizations(RasterDataset):
        '''
        Load xView3 polarization data that ends in *_dB.tif
        '''
        filename_glob = "*_dB.tif"

        def __init__(
            self,
            root: Path = None,
            crs: Optional[CRS] = None,
            res: Optional[float] = None,
            transforms: Optional[Callable[[Dict[str, Tensor]], Dict[str, Tensor]]] = None,
            cache: bool = True
        ) -> None:
            self.root = root
            super().__init__(root, crs, res, transforms, cache)

to just:

    class XView3Polarizations(RasterDataset):
        '''
        Load xView3 polarization data that ends in *_dB.tif
        '''
        filename_glob = "*_dB.tif"
This was a willful decision to be explicit. If people want to modify the root, they can! Should I add a markdown comment discussing this instead?
I agree with @isaaccorley on this one, there's no need to reimplement methods of the parent class unless you need to change them. If a different root is needed, you can change the variable that is passed into the dataset when you instantiate it.
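To make the inheritance point concrete, here is a small standard-library sketch. `GlobDataset` is a hypothetical stand-in base class, not torchgeo's `RasterDataset`, and the file names are made up; the only point is that a subclass overriding a class attribute still inherits `__init__`, so a different root is simply passed at instantiation.

```python
import tempfile
from pathlib import Path


class GlobDataset:
    """Hypothetical stand-in base class; NOT torchgeo's RasterDataset."""

    filename_glob = "*"

    def __init__(self, root):
        self.root = Path(root)
        self.files = sorted(self.root.glob(self.filename_glob))


class Polarizations(GlobDataset):
    """Only the glob is overridden; __init__ is inherited as-is."""

    filename_glob = "*_dB.tif"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as a, tempfile.TemporaryDirectory() as b:
        (Path(a) / "scene1_VV_dB.tif").touch()
        (Path(b) / "scene2_VH_dB.tif").touch()
        (Path(b) / "readme.txt").touch()
        # The same class works for either root without redefining the constructor
        for root in (a, b):
            ds = Polarizations(root)
            print(ds.root.name, [f.name for f in ds.files])
```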
In this case I think we should remove the boilerplate to show how simple it is for a new user to inherit and just change the glob. However, I do think we should come up with some more examples in a future PR which show how you can customize the dataset more.
Is this any better?
lgtm
What is the difference in the new one? Is the idea
Yeah. It's a pretty weak way of showing how you could change things up, to be honest.
But I think it does open your mind up to the fact that you can add additional args and things, and use them, in your constructor.
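A sketch of what "additional args in your constructor" could look like, again using a hypothetical stand-in base class rather than torchgeo's `RasterDataset`. The `polarization` argument and the `_VV_dB.tif` / `_VH_dB.tif` naming are assumptions made for illustration only.

```python
from pathlib import Path


class GlobDataset:
    """Hypothetical stand-in base class; NOT torchgeo's RasterDataset."""

    filename_glob = "*"

    def __init__(self, root):
        self.root = Path(root)
        self.files = sorted(self.root.glob(self.filename_glob))


class PolarizationSubset(GlobDataset):
    """Adds one extra constructor argument and forwards the rest to the parent."""

    filename_glob = "*_dB.tif"

    def __init__(self, root, polarization=None):
        super().__init__(root)  # parent handles root and file discovery
        self.polarization = polarization
        if polarization is not None:
            # Keep only files for the requested polarization (assumed naming scheme)
            suffix = f"_{polarization}_dB.tif"
            self.files = [f for f in self.files if f.name.endswith(suffix)]
```

Overriding `__init__` is only worthwhile in a case like this, where the subclass actually uses the new argument.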
No problem, maybe like a final cell that says something to the effect of, "Now you can do this
You could do that before and after, I don't see any point of setting
Looks like the header levels are too high. Can you reduce the header levels so they match the other tutorials? Also, the plot is empty. Is it supposed to be?
"source": [
"from torchgeo.datasets.utils import extract_archive\n",
"\n",
"data_root = Path('../../tests/data/xview3/')\n",
This directory doesn't exist on Colab, so this tutorial crashes immediately when you run it. We should really write these files ourselves using rasterio inside the notebook, or download them (from GitHub or just use real images).
* Add custom RasterDataset notebook
* Update docs index.rst
* Update copyright, fix URL typo, and add verbose description
* Add xview3 sample data
* Update notebook
* Show simple example first, complicated example second
* Remove the second half of the notebook, can expand later
Using xView3 as an example to address issue #280