Sampling the middlemost Bounding box in images #1170
-
For large global studies like you're describing, most of the time, there are enough images to cover the entire Earth without nodata pixels. If you sample near the corner of an image where there are nodata pixels, TorchGeo should detect all neighboring overlapping images and use them to fill in nodata pixels. So this is the logic behind the current design. Of course, if you don't have that many images and there are significant regions with nodata pixels, this doesn't help you.

Unfortunately, there isn't an easy way to get what you want at the moment. However, there are a couple of places where you could hack this in.

In RasterDataset

To prevent the sampler from sampling near the edges of images, you can change the edges of images. Following this line, you could add:

boundary = 10  # in units of CRS, might be meters or degrees, so be careful
minx += boundary
maxx -= boundary
miny += boundary
maxy -= boundary

You could also calculate

In GeoSampler

On the sampler side, you can also tell the sampler to avoid sampling boxes near the boundary of an image. This is easiest to do in this function which all samplers use for random sampling (not used by GridGeoSampler). Just add something similar to the code above in this location.

I believe @calebrob6 also has some code to support non-rectangular polygon ROIs for long, narrow objects like New Zealand or Japan that was never upstreamed. It sounds like that wouldn't help in this situation though.

I completely agree that this is a common issue that users have, and I haven't yet found a robust way to solve it other than cutting off the corners from all images, which obviously isn't ideal. I haven't yet found an easy way in rasterio/GDAL to get the true corners of an image instead of the corners of the bounding box. Maybe a convex hull would help? Probably too slow. Even if we get those, we can no longer use an R-tree which requires all boxes to be at right angles. An alternative would be to somehow define the CRS in terms of WRS paths/rows such that everything is at an angle instead of N-S being vertical and E-W being horizontal, but I haven't yet figured out how to do that either, and it may not help for satellites that don't follow the same WRS paths/rows trajectories. Any suggestions would be welcome.
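To make the boundary-trimming idea concrete, here is a minimal standalone sketch in plain Python. It is not TorchGeo code: `Bounds`, `shrink_bounds`, and `sample_interior_box` are hypothetical names used only for illustration, and the field order (minx, maxx, miny, maxy) is simply chosen to match the snippet above. It assumes the image extent and the patch size are expressed in the same CRS units.

```python
import random
from typing import NamedTuple, Optional


class Bounds(NamedTuple):
    """Axis-aligned extent in CRS units (hypothetical helper, not a TorchGeo class)."""

    minx: float
    maxx: float
    miny: float
    maxy: float


def shrink_bounds(bounds: Bounds, boundary: float) -> Bounds:
    """Trim `boundary` CRS units off every side of an image extent."""
    shrunk = Bounds(
        bounds.minx + boundary,
        bounds.maxx - boundary,
        bounds.miny + boundary,
        bounds.maxy - boundary,
    )
    # Guard against a margin that eats the whole image.
    if shrunk.minx >= shrunk.maxx or shrunk.miny >= shrunk.maxy:
        raise ValueError("boundary is too large for this image extent")
    return shrunk


def sample_interior_box(
    bounds: Bounds,
    size: float,
    boundary: float,
    rng: Optional[random.Random] = None,
) -> Bounds:
    """Draw a random size-by-size box that stays inside the trimmed extent."""
    if rng is None:
        rng = random.Random()
    inner = shrink_bounds(bounds, boundary)
    if inner.maxx - inner.minx < size or inner.maxy - inner.miny < size:
        raise ValueError("patch does not fit inside the trimmed extent")
    # Pick the lower-left corner so the whole box fits in the trimmed extent.
    minx = rng.uniform(inner.minx, inner.maxx - size)
    miny = rng.uniform(inner.miny, inner.maxy - size)
    return Bounds(minx, minx + size, miny, miny + size)
```

Note that a fixed margin only helps if it is at least as wide as the nodata wedge at the image corners, so for rotated scenes the margin may need to be a sizable fraction of the image width. The same caveat about CRS units (meters versus degrees) applies here.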
-
I will investigate this further and keep you posted. While at it, could you help me better understand the mint and maxt values? They are very large floating-point numbers, but I wasn't sure how they are calculated. Are they in seconds?
-
I am trying to avoid sampling bounding boxes near or inside the edges, where all or a significant share of the pixels are empty (black pixels). I am trying to formulate a way to do random sampling where the entire bounding box stays within the image and/or preferably closer to the center. I think random sampling is convenient since it can create an augmentation effect, but I need to use it carefully to avoid these edge cases. I looked into defining an ROI, but the dataset includes images from different path/row regions on the globe, and a predefined ROI that stays within an image cannot be universally defined. Is there a predefined method or convenient way to compute reliable pixels from within the image?