Skip to content
This repository has been archived by the owner on Dec 11, 2022. It is now read-only.

✅ resizing one layer to the size of another, and ambiguity in rescale function meaning #131

Open
gottacatchenall opened this issue Oct 24, 2021 · 20 comments
Assignees

Comments

@gottacatchenall
Copy link
Member

gottacatchenall commented Oct 24, 2021

Description of the to-do item

As far as I can tell, there is not a function to take a raster and change its resolution to match the dimensions of a target raster. Doing this would be straightforward by iterating over the longitudes(a), latitudes(a) in the smaller raster and then access the values of the other raster at that coord, and have the getindex(b::SimpleSDMLayer, lat, long) handle that conversion.

When trying to do this I found rescale(layer1::T, layer2::T) where {T <: SimpleSDMLayer},which scales the values of a given raster to match a target, but not dimensions. To avoid this semantic ambiguity, perhaps the method rescale could rescale both values and dimensions by default with kw arguments to turn either off

@gottacatchenall
Copy link
Member Author

my hacky workaround atm is

function make_same_dims(input::IT, target::TT) where {IT,TT <: SimpleSDMLayer}
	lat, long = collect(longitudes(target)), collect(latitudes(target))
	newgrid = similar(target)
	for lt in lat, lg in long
		newgrid[lt,lg] = input[lt,lg]
	end
	newgrid
end

@tpoisot
Copy link
Member

tpoisot commented Oct 24, 2021

That's a good point, but ideally something we might want to do with some sort of spatial interpolation. There's a way to get the haversine distance, do we can use this. I wouldn't necessarily call it rescale, maybe reproject? There's a little more thinking to do on this one before moving to a PR.

@mkborregaard
Copy link
Contributor

FWIW, GeoData calls this "resample" and forwards the functionality to ArchGDAL.gdalwarp

@tpoisot
Copy link
Member

tpoisot commented Oct 25, 2021

That sounds like something we might do -- my current rough draft is

using SimpleSDMLayers
using StatsPlots
using Statistics

ref = SimpleSDMPredictor(WorldClim, BioClim, 3; left=-20.0, right=40.0, top=60.0, bottom=30.0)

target = SimpleSDMResponse(zeros(Float32, 200, 200), -12.056, 37.065, 33.1234, 55.06)

for p in keys(target)
    if isnothing(ref[p])
        target[p] = nothing
    else
        vals = SimpleSDMLayers._sliding_values(ref, p, 20.0)
        target[p] = isempty(vals) ? nothing : mean(vals)
    end
end

diff
overlap

@gottacatchenall , given the info @mkborregaard provided, do you want to see if a version based on ArchGDAL.gdalwarp is possible? It's already a dependency, and we can write/read temp geotiff files...

@tpoisot tpoisot assigned gottacatchenall and unassigned tpoisot Oct 25, 2021
@gottacatchenall
Copy link
Member Author

@tpoisot Yeah GeoData.resample wraps it with arguments for interpolation rules, it isn't currently a dependency though

i've also written a few functions to take a set of SimpleSDMLayers and return PCA layers which i could include in the same PR

@tpoisot
Copy link
Member

tpoisot commented Oct 25, 2021

Two PRs! Mostly because I'd like to do some solid work on the PCA, to make sure we can drop layers into multivariate stats pretty much directly.

@tpoisot
Copy link
Member

tpoisot commented Oct 25, 2021

Also I wouldn't bring too much stuff from Geo until they figure out whether they use the Geometry types or not...

@gottacatchenall
Copy link
Member Author

@tpoisot i'm not sure if GeoData.jl is affiliated with JuliaGeo, here's its deps, which don't seem too heavy

also in progress on opening prs for both PCA and this, from what i can tell interfacing with Mvstats is pretty easy

@tpoisot
Copy link
Member

tpoisot commented Oct 25, 2021

Re. GeoData that's still a lot of deps. I'm curious to see how far we can go using ArchGDAL only, since we depend on that no matter what.

@gottacatchenall
Copy link
Member Author

gottacatchenall commented Oct 26, 2021

As far as I can tell resample only uses ArchGDAL functions.I could port it, but it would effectively be copy/pasting which isn't ideal.

I suppose long-term the julian-esque would be to move all functions that rely on ArchGDAL to a shared lib and have any other heavy dependencies loaded from higher level packages

Thoughts? @tpoisot @mkborregaard @rafaqz

@mkborregaard
Copy link
Contributor

So I don't think it would make sense to take a GeoData dep just to get resample. GeoData is a general raster package, very similar to R's terra (and their old raster package), and as such has a lot of overlapping functionality with SimpleSDMLayers (such as masking, cropping, extracting data, downloading data, aggregating, opening raster files etc etc). GeoData is more focused on being a general raster package, though, whereas in my understanding SimpleSDMLayers is focused on creating a smooth user experience for fitting SDM models.

One could consider having SimpleSDMLayers depend on GeoData, which would give all the raster operations here for free, and I guess make it easier to maintain, but that would change the focus of the package to be explicitly on creating the smooth SDM interface. That depends completely on @tpoisot 's vision for this package.

I think most of the added deps for geodata (e.g. DiskArrays, Flatten, Mmap, ) relate to the functionality for working with on-disk rather than in-memory rasters.

When I originally posted here my suggestion was that you could be inspired (or essentially copy) GeoData's resample function, which should also be doable.

@rafaqz
Copy link

rafaqz commented Oct 26, 2021

I agree with that assessment @mkborregaard . Depending on GeoData.jl would be a larger change in philosophy and more like a decision to reduce the codebase here by half. GeoData generalizes many of the methods here, including resample.

Proper broadcasting, base array methods and the Tables.jl interface in GeoData.jl would also save work here. These things are hard to do properly.

So I wouldn't add a GeoData.jl dep just for resample. You can just copy code from GeoData.jl, and delete all of the generalization code. resample is just a wrapper to warp which is n-dimensionally generalised gdalwarp. Although @gottacatchenall I tend to agree that copy-pasting doesn't seem like a sustainable long-term strategy.

In terms of JuliaGeo, GeoData is no more affiliated than ArchGDAL is - its not a JuliaGeo package. Its my package, partly dog-fed by cesaraustralia but to do generic things better, not specifically to work with cesar packages.

It's also less tightly integrated with GeoInterface.jl than ArchGDAL is. I agree about the GeoInterface/GeometryBasics problem and will likely be involved in the solution in the coming months. But I don't think that's an insurmountable problem currently.

Many of the dependencies like DiskArrays/GeoInterface/ColorTypes/GeoFormatTypes are already ArchGDAL deps. HDF5 and NCDatasets are included because Requires.jl is not a great solution to handle dependency bounds, and we dont have proper glue packages yet. Please chip in there too if you would like the dependency problem resolved, I've used up my quota pushing that...

@gottacatchenall
Copy link
Member Author

Yeah for now I think copying warp will work

@rafaqz
Copy link

rafaqz commented Oct 26, 2021

Yep. GeoData.jl is not going anywhere if you want to cut some code and upgrade the tooling and functionality here later on. I just hope we can aim for integration rather than copy-paste as a long term goal.

@tpoisot
Copy link
Member

tpoisot commented Oct 26, 2021

One could consider having SimpleSDMLayers depend on GeoData, which would give all the raster operations here for free, and I guess make it easier to maintain, but that would change the focus of the package to be explicitly on creating the smooth SDM interface. That depends completely on @tpoisot 's vision for this package.

Medium-term, I don't think GeoData will become a dependency of this package. As you'll see in the manuscript we're preparing (on network distribution models), SimpleSDMLayers does things in a different way and is, therefore, both less and more flexible depending on the use-case. Functionalities as they are currently cover our research needs, so don't expect changes in development direction soon.

The overlap in data download is because we couldn't agree on an interface, and that ship has sailed, meaning that there are no plans to remove these functionalities from SimpleSDMLayers in order to use RasterDataSources either.

@mkborregaard
Copy link
Contributor

Great, yeah then it definitely makes more sense to just copy the implementation over :-)

@rafaqz
Copy link

rafaqz commented Oct 26, 2021

I suppose long-term the julian-esque would be to move all functions that rely on ArchGDAL to a shared lib and have any other heavy dependencies loaded from higher level packages

@gottacatchenall I have thought about this with warp... it could start by wrapping the arguments to gdalwarp actually in ArchGDAL.jl instead of in GeoData.jl. ArchGDAL.jl kind of already is that shared lib you're talking about. But it needs a Dict based front-end to GDAL arguments rather than the current Vector of String approach. What I've written in warp could really be moved there. It's just probably going to be a bigger job doing that consistently for the whole package than it is in GeoData.jl.

@gottacatchenall
Copy link
Member Author

Looks like most of the calls to GDAL.jl happen in this file. Seems that if we changed the ArchGDAL wrappers (unsafe_gdalwarp, etc.) to pass the options keyword to a function that converts from whatever the options input type is to the vector of strings required by GDAL.jl, then this could work

@rafaqz
Copy link

rafaqz commented Nov 1, 2021

I'm pretty sure this has been discussed as a thing ArchGDAL needs already, so a PR would probably be appreciated.

@tpoisot
Copy link
Member

tpoisot commented Apr 8, 2022

Here's the code I'm using for a project where we need to do this:

function coerce(template::TT, source::TS, f) where {TS<:SimpleSDMLayer,TT<:SimpleSDMLayer}
    coerced = similar(template)
    for k in keys(template)
        coerced[k] = f(clip(source, k .- stride(template), k .+ stride(template)))
    end
    return coerced
end

There are a few possible adjustments to make, but I wouldn't be opposed to adding this to a new release.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants