-
Notifications
You must be signed in to change notification settings - Fork 370
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
GridGeoSampler: Sample beyond roi limit if needed #448
Conversation
…l and issue warning
Sorry it took me so long to review this. For the sake of completeness, here are all possible solutions to #431 that I can think of:
The more I think about it, I agree that 1 is not the best solution, especially if you are trying to make predictions for an entire region. You raised some concerns about how to pad things properly with 2 but I think it's actually easier than you think, rasterio will automatically pad with nodata values if you do an out-of-bounds windowed sample (if I remember correctly). I think 5 and 6 are both bad solutions because they make it difficult to construct a mini-batch without padding. I personally like 4 the best (calculate the total number of patches we need then adjust stride to match) but it is also very unintuitive. I can imagine someone adjusting stride only to find that the sampler doesn't obey their choice and being very frustrated. For the sake of completeness (sampling all parts of the image) and intuitiveness (not changing the meaning of size/stride) I think I'm actually leaning towards 2. Before we decide to go with this solution, can you see how other libraries deal with this? I can imagine that satellite imagery isn't the only image source large enough that you have to chop it up into patches. @isaaccorely what does Kornia do? P.S. I noticed that you added warnings to let the user know that the stride is being changed. I don't think we'll want to do this. I usually only add warnings to let users know that what they're trying to do isn't the correct way to do it (e.g. this function is deprecated, this input value is not ideal). In this case, because all satellite images vary in width/height by a few pixels, it is actually impossible to choose a value for size/stride that doesn't result in this warning. |
Thanks for the complete evaluation and feedback @adamjstewart ! |
We could of course make this a parameter that the user decides like we did with size (CRS or pixel units) but I would like to avoid this unless we have a good use case where one of these other solutions is needed. Trying to keep things simple when possible. |
FWIW, Pytorch's |
The solaris library seems to favor option 2.
You'll see that tile() calls a tile_generator(), which then uses get_tile_bounds() to split geometry into "equal sized tiles" with split_geom() This process ends up with a rasterio.window.from_bounds() on the generated bounds (which extend passed the original bounds). Rasterio is therefore responsible for the padding. |
Based on https://docs.rastervision.io/en/0.13/api.html#geodatawindowconfig it seems like Raster Vision adds a |
Let's go with option 2, where size and stride are held constant but the last patch samples outside the bounds of the image with padding. I think this will be the least surprising behavior. No need for a warning since the image size will almost never be an exact multiple of size. I would also like to undo the change in #376 so that areas smaller than the patch size will still be sampled from. We need to be careful to skip patches with area=0 though. This occurs when two tiles are side-by-side with an overlapping edge face. We'll want tests for at least 4 cases:
|
@remtav have you gotten a chance to work on this? We're planning a 0.3.0 release in about a week and I would love to see this change in that release. |
Thanks for the reminder @adamjstewart . I can move this on top of my TODO and tackle this during this week. |
…artial patch (to be padded) test_single.py: add tests for multiple limit cases (see issue microsoft#448)
I can't easily rebase my github's branch on top of main to integrate latest developments (I'd new github desktop and don't have admin rights on my laptop...). I'll close this PR and open a new one with an up-to-date branch. |
…artial patch (to be padded) test_single.py: add tests for multiple limit cases (see issue microsoft#448)
* Adjust minx/miny with a smaller stride for the last sample per row/col and issue warning * style and mypy fixes * black test fix * Adjust minx/miny with a smaller stride for the last sample per row/col and issue warning * style and mypy fixes * black test fix * single.py: adapt gridgeosampler to sample beyond limit of ROI for a partial patch (to be padded) test_single.py: add tests for multiple limit cases (see issue #448) * format for black and flake8 * format for black and flake8 * once again, format for black and flake8 * Revert "Adjust minx/miny with a smaller stride for the last sample per row/col and issue warning" This reverts commit cb554c6 * adapt unit tests, remove warnings * flake8: remove warnings import * Address some comments * Simplify computation of # rows/cols * Document this new feature * Fix size of ceiling symbol * Simplify tests Co-authored-by: Adam J. Stewart <[email protected]>
…rosoft#630) * Adjust minx/miny with a smaller stride for the last sample per row/col and issue warning * style and mypy fixes * black test fix * Adjust minx/miny with a smaller stride for the last sample per row/col and issue warning * style and mypy fixes * black test fix * single.py: adapt gridgeosampler to sample beyond limit of ROI for a partial patch (to be padded) test_single.py: add tests for multiple limit cases (see issue microsoft#448) * format for black and flake8 * format for black and flake8 * once again, format for black and flake8 * Revert "Adjust minx/miny with a smaller stride for the last sample per row/col and issue warning" This reverts commit cb554c6 * adapt unit tests, remove warnings * flake8: remove warnings import * Address some comments * Simplify computation of # rows/cols * Document this new feature * Fix size of ceiling symbol * Simplify tests Co-authored-by: Adam J. Stewart <[email protected]>
* Adjust minx/miny with a smaller stride for the last sample per row/col and issue warning * style and mypy fixes * black test fix * Adjust minx/miny with a smaller stride for the last sample per row/col and issue warning * style and mypy fixes * black test fix * single.py: adapt gridgeosampler to sample beyond limit of ROI for a partial patch (to be padded) test_single.py: add tests for multiple limit cases (see issue #448) * format for black and flake8 * format for black and flake8 * once again, format for black and flake8 * Revert "Adjust minx/miny with a smaller stride for the last sample per row/col and issue warning" This reverts commit cb554c6 * adapt unit tests, remove warnings * flake8: remove warnings import * Address some comments * Simplify computation of # rows/cols * Document this new feature * Fix size of ceiling symbol * Simplify tests Co-authored-by: Adam J. Stewart <[email protected]>
…rosoft#630) * Adjust minx/miny with a smaller stride for the last sample per row/col and issue warning * style and mypy fixes * black test fix * Adjust minx/miny with a smaller stride for the last sample per row/col and issue warning * style and mypy fixes * black test fix * single.py: adapt gridgeosampler to sample beyond limit of ROI for a partial patch (to be padded) test_single.py: add tests for multiple limit cases (see issue microsoft#448) * format for black and flake8 * format for black and flake8 * once again, format for black and flake8 * Revert "Adjust minx/miny with a smaller stride for the last sample per row/col and issue warning" This reverts commit cb554c6 * adapt unit tests, remove warnings * flake8: remove warnings import * Address some comments * Simplify computation of # rows/cols * Document this new feature * Fix size of ceiling symbol * Simplify tests Co-authored-by: Adam J. Stewart <[email protected]>
fix for #431 as per suggested by @adamjstewart