distributed rtree #22
base: main
The distributed tree already works, but I'm sure the code, especially the part using dask, could be optimized further. I still have to check whether the results actually make sense (see also #7). Additionally, the chunk boundaries are currently computed as the union of the grid geometries, which takes quite a while for large chunks. It might thus make sense to allow passing pre-computed chunk boundary polygons, which would make the index truly lazy. cc @rsignell, @maxrjones, @norlandrhagen, @TomNicholas, in case you're interested
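
To illustrate the idea of passing pre-computed boundaries, here is a minimal sketch, not the code in this PR. It assumes axis-aligned bounding boxes as a stand-in for the real boundary polygons (an envelope rather than a true geometric union), and `envelope`, `chunk_boundary`, and `load_cells` are hypothetical names. The point is only that a supplied boundary lets the index skip loading the chunk's geometries.

```python
from typing import Callable, Iterable, Optional, Tuple

# Assumption: a box (xmin, ymin, xmax, ymax) stands in for a boundary polygon.
Box = Tuple[float, float, float, float]


def envelope(cells: Iterable[Box]) -> Box:
    """Bounding box covering all cells; has to visit every cell of the chunk."""
    xmins, ymins, xmaxs, ymaxs = zip(*cells)
    return (min(xmins), min(ymins), max(xmaxs), max(ymaxs))


def chunk_boundary(
    load_cells: Callable[[], Iterable[Box]],
    precomputed: Optional[Box] = None,
) -> Box:
    """Return the chunk boundary, loading cell geometries only if needed."""
    if precomputed is not None:
        return precomputed  # lazy path: the chunk data is never loaded
    return envelope(load_cells())


loaded = []  # records every time the chunk's geometries are actually loaded


def load_cells():
    loaded.append(True)
    return [(0.0, 0.0, 1.0, 1.0), (1.0, 0.0, 2.0, 1.0), (0.0, 1.0, 1.0, 2.0)]


computed = chunk_boundary(load_cells)  # loads the cells once
supplied = chunk_boundary(load_cells, precomputed=(0.0, 0.0, 2.0, 2.0))  # no load
```

In this sketch only the first call touches the cell geometries; the second returns the supplied boundary without calling `load_cells` at all.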
Single-chunk and in-memory indexing appear to return the right results, but multi-chunk indexing does not quite work yet. I suspect the issue is with the chunking: I'm flattening rectangular chunks and assembling the result as one 1D array per target chunk, which could explain the wrong results.
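
As a sketch of why flattening rectangular chunks is easy to get wrong (one possible cause, not a diagnosis of this PR): a rectangular chunk does not occupy a contiguous range of the flattened full grid unless it spans every column. The helper below is hypothetical and assumes 2D row-major (C order) grids.

```python
def global_flat_indices(chunk_origin, chunk_shape, grid_shape):
    """Positions in the flattened full grid of each cell of a rectangular
    chunk, listed in the chunk's own row-major order."""
    row0, col0 = chunk_origin
    n_rows, n_cols = chunk_shape
    _, grid_cols = grid_shape
    return [
        (row0 + i) * grid_cols + (col0 + j)
        for i in range(n_rows)
        for j in range(n_cols)
    ]


# A 2x3 chunk whose top-left cell is at (2, 3) in a 4x6 grid.
indices = global_flat_indices((2, 3), (2, 3), (4, 6))

# Adding the local flat index to the flat position of the chunk origin
# assumes the chunk is contiguous in the flattened grid, which it is not here.
naive = [2 * 6 + 3 + k for k in range(2 * 3)]

print(indices)  # [15, 16, 17, 21, 22, 23]
print(naive)    # [15, 16, 17, 18, 19, 20]
```

The two lists agree on the first chunk row and diverge from the second row onwards, which is the kind of mismatch that would only show up with more than one chunk.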
This contains a draft of the distributed rtree. The main remaining issues are constructing the indexes for the chunks on a worker and assembling the final grids as a sparse matrix (though I guess the latter may be borrowed from VirtualiZarr?).