Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Finalsplit #220

Merged
merged 8 commits into from
Dec 7, 2020
Merged

Finalsplit #220

merged 8 commits into from
Dec 7, 2020

Conversation

meggart
Copy link
Member

@meggart meggart commented Aug 7, 2020

These are the latest changes to split out the whole machinery to YAXArrays and leave the Geo-specific parts inside ESDL.jl

@meggart
Copy link
Member Author

meggart commented Aug 7, 2020

@felixcremer

@felixcremer
Copy link
Member

I tested it, and the computations are working, but it feels slower as before, but I am not sure, whether this is also because I am currently not using the distributed computations. And I have to set up proper tests to compare the timings of the different versions.

I still can't open NetCDF files while the julia process is running but I can copy them and then they can be opened.
But I now can't open my RQA results, but I am not sure, whether this is due to ESDL changes or my QGIS version, because the netcdf file looks ok, in ncdump.
Could you test, whether you can open the file at
https://upload.uni-jena.de/data/5f3400804a46b4.93034284/lakecuberqa2.nc

I would also suggest, to not set the version number of YAXArrays directly to 1.0 because we still have some deprecations in there and I would also like to talk again about reducing the number of submodules. We could also think about renaming functions with cube in the name, because at the moment, this is a mix of Cube naming and YAXArray naming.

I would suggest to first make the split and then we can tidy up YAXArrays further.

@meggart
Copy link
Member Author

meggart commented Aug 14, 2020

I just tested your file and could read it using NetCDF.jl without problems. I could not try ArchGDAL though, because I don't have the NetCDF driver installed. Otherwise I agree with your comments about version 1.0 and the removing of the remaining "cube" functions.

Copy link
Member

@felixcremer felixcremer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am still testing things and it seems that the type inference for RangeAxis is not working, because when I get the eltype of a RangeAxis it is any.

julia> testax = RangeAxis(:Test, 1:10)
Test                Axis with 10 Elements from 1 to 10

julia> eltype(testax)
Any

julia> eltype(testax.values)
Int64

On ESDL master testax has an eltype of Int64.
This is a problem when I try to use the TimeAxis in a computation, because I can't set the CacheSizes and the computation fails with the following error:

ERROR: Abstract type Any does not have a definite size.
Stacktrace:
 [1] sizeof at ./essentials.jl:449 [inlined]
 [2] mysizeof(::Type{T} where T) at /home/crem_fe/.julia/dev/YAXArrays/src/DAT/DAT.jl:564
 [3] (::YAXArrays.DAT.var"#463#467")(::Array{Int64,1}, ::YAXArrays.DAT.InputCube{1}) at /home/crem_fe/.julia/dev/YAXArrays/src/DAT/DAT.jl:580
 [4] #3 at ./generator.jl:36 [inlined]
 [5] iterate at ./generator.jl:47 [inlined]
 [6] collect_to!(::Array{Int64,1}, ::Base.Generator{Base.Iterators.Zip{Tuple{Array{Array{Int64,1},1},Tuple{YAXArrays.DAT.InputCube{3},YAXArrays.DAT.InputCube{1}}}},Base.var"#3#4"{YAXArrays.DAT.var"#463#467"}}, ::Int64, ::Tuple{Int64,Int64}) at ./array.jl:732
 [7] collect_to_with_first!(::Array{Int64,1}, ::Int64, ::Base.Generator{Base.Iterators.Zip{Tuple{Array{Array{Int64,1},1},Tuple{YAXArrays.DAT.InputCube{3},YAXArrays.DAT.InputCube{1}}}},Base.var"#3#4"{YAXArrays.DAT.var"#463#467"}}, ::Tuple{Int64,Int64}) at ./array.jl:710
 [8] collect(::Base.Generator{Base.Iterators.Zip{Tuple{Array{Array{Int64,1},1},Tuple{YAXArrays.DAT.InputCube{3},YAXArrays.DAT.InputCube{1}}}},Base.var"#3#4"{YAXArrays.DAT.var"#463#467"}}) at ./array.jl:691
 [9] map(::Function, ::Array{Array{Int64,1},1}, ::Tuple{YAXArrays.DAT.InputCube{3},YAXArrays.DAT.InputCube{1}}) at ./abstractarray.jl:2248
 [10] getCacheSizes(::YAXArrays.DAT.DATConfig{2,1}, ::Dict{Any,Any}) at /home/crem_fe/.julia/dev/YAXArrays/src/DAT/DAT.jl:586
 [11] mapCube(::typeof(ESDLHelpers.clombscargle), ::Tuple{YAXArray{Float32,3,ArchGDAL.RasterDataset{Float32,ArchGDAL.IDataset},Array{CubeAxis,1}},RangeAxis{Date,:Time,Array{Date,1}}}; max_cache::Float64, indims::Tuple{InDims,InDims}, outdims::OutDims, inplace::Bool, ispar::Bool, debug::Bool, include_loopvars::Bool, showprog::Bool, nthreads::Array{Int64,1}, loopchunksize::Dict{Any,Any}, kwargs::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}) at /home/crem_fe/.julia/dev/YAXArrays/src/DAT/DAT.jl:273
 [12] lombscargle(::YAXArray{Float32,3,ArchGDAL.RasterDataset{Float32,ArchGDAL.IDataset},Array{CubeAxis,1}}) at /home/crem_fe/.julia/dev/ESDLHelpers/src/tsanalysis.jl:78
 [13] top-level scope at REPL[38]:1
 [14] include_string(::Function, ::Module, ::String, ::String) at ./loading.jl:1088

@@ -62,7 +62,7 @@ function fill_msc_poly!(tmsc)
tmsc .= view(a,(n+1):(n+n))
end

gapfillpoly(x::AbstractCubeData;max_gap=30,nbefore_after=10, polyorder = 2) =
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removing this typehint gives a clash with the inner gapfillpoly function a few lines below and Julia starts to complain about compilation problems. I circumvented the problem for now, by renaming the inner gapfillpoly function to gapfillpoly! but I am not sure, whether this is the way to go.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that is the way to go, alternatively, _gapfillpoly would be ok as well.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason for removing the AbstractCubeData type was that in the medium term I want to make the operations in YAXArrays work for any array that implements the interface in YAXArrayBase, which means you can directly plug in a DImensionalArray or AxisArray.

src/ESDL.jl Outdated
@@ -9,12 +9,13 @@ module ESDL
include("Proc.jl")
include("countrydict.jl")
include("esdc.jl")
include("cubeinfo.jl")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had to put this include below the using YAXArrays lines, because otherwise some YAXArray type wasn't defined yet which is used in the cubeinfo function.

@felixcremer
Copy link
Member

The type inference fails also for categorical axes.

@meggart
Copy link
Member Author

meggart commented Aug 21, 2020

I just fixed the eltype bug which was simply a missing method

@felixcremer
Copy link
Member

What is the state of this?
Is there anything I could help with to push this forward?

@meggart
Copy link
Member Author

meggart commented Dec 2, 2020

I think we can merge now. I have registered YAXArrays, so that the unit tests pass again now, so I think I will just do it....

@meggart meggart merged commit ac4620e into SplitPackages Dec 7, 2020
meggart added a commit that referenced this pull request Dec 7, 2020
* Removed a lot of files

* move some files

* break more things

* Clean up Proc module

* Many fixes to get dataset creation running again

* savecube to netcdf works

* more improvements

* Commit many more changes

* Separate out API

* some test fixes

* switch to YAXArrayBase

* read variables in a typed way

* add ESDLArray export

* Fixes to remap

* remove NetCDF and Zarr dependency

* delete some unused stuff

* Do not permute cubes by default

* Add concatenations

* Fixes to findaxis

* Tests pass again

* Try to fix chunking

* dont depend on reexport

* remove reexport dep

* dont forgegt using it

* update test

* Allow NamedTuples

* minor changes

* Finalsplit (#220)

* Delete almost everything

* More updates

* Old tests for ESDL pass

* More changes

* Fix bugs in ESDL

* Some fixes

* Fix some warnings

* tests should pass
meggart added a commit that referenced this pull request Dec 7, 2020
* Removed a lot of files

* move some files

* break more things

* Clean up Proc module

* Many fixes to get dataset creation running again

* savecube to netcdf works

* more improvements

* Commit many more changes

* Separate out API

* some test fixes

* switch to YAXArrayBase

* read variables in a typed way

* add ESDLArray export

* Fixes to remap

* remove NetCDF and Zarr dependency

* delete some unused stuff

* Do not permute cubes by default

* Add concatenations

* Fixes to findaxis

* Tests pass again

* Try to fix chunking

* dont depend on reexport

* remove reexport dep

* dont forgegt using it

* update test

* Allow NamedTuples

* minor changes

* Finalsplit (#220)

* Delete almost everything

* More updates

* Old tests for ESDL pass

* More changes

* Fix bugs in ESDL

* Some fixes

* Fix some warnings

* tests should pass
meggart added a commit that referenced this pull request Dec 7, 2020
* Start reworking everything

* Loading cubes works again

* mapslices running

* Simple mapslices works

* Access tests pass

* Tests pass with new data type

* Code improvements and TODOs based on @felixcremer s comments

* add CubeAxis import to Shapes.jl

* some fixes

* small fix for readblock

* Show CodeCov badge in README

* fix bug when subsetting concatcubes

* Implement changes suggested by @felixcremer

* Add sync after exporting cube

* Require NetCDF 0.10

* make it load again

* Fix ambiguity

* Fix cartind bug

* Add DiskArrays as direct dependency

This doesn't solve the test failure because some readblock! method is
still missing.

* Fix bug in readblock!

* Import readblock! from Zarr

I also changed the import Zarr to using Zarr: ... and sorted the imports
into types and functions which are sorted alphabetically.

* Make wrap keyword available in cubefromshape (#199)

Wrap is a keyword which is projection specific and which gives the
projection value at which the projection wraps into itself.
The default values fail for non Lat Lon projections.
Therefore, this makes the keyword available in the cubefromshape
interface.
This is an intermediate solution, because it would be better to have the
projection as a metadata information of a cube, so that we could use
this to set the wrapping values.

* Revert "Make wrap keyword available in cubefromshape (#199)"

This reverts commit 6e778a2.
Unfortunately, this pull request was not yet ready.
I should have indicated that more clearly.
I am reverting this for now and then I am going to push a tidied up
version of the PR.

* Change clone to add in readme install instructions

Clone is the keyword from the old package manger.

* Update README.md

* typos (#224)

* Split packages (#198)

* Removed a lot of files

* move some files

* break more things

* Clean up Proc module

* Many fixes to get dataset creation running again

* savecube to netcdf works

* more improvements

* Commit many more changes

* Separate out API

* some test fixes

* switch to YAXArrayBase

* read variables in a typed way

* add ESDLArray export

* Fixes to remap

* remove NetCDF and Zarr dependency

* delete some unused stuff

* Do not permute cubes by default

* Add concatenations

* Fixes to findaxis

* Tests pass again

* Try to fix chunking

* dont depend on reexport

* remove reexport dep

* dont forgegt using it

* update test

* Allow NamedTuples

* minor changes

* Finalsplit (#220)

* Delete almost everything

* More updates

* Old tests for ESDL pass

* More changes

* Fix bugs in ESDL

* Some fixes

* Fix some warnings

* tests should pass

* Start reworking everything

* Loading cubes works again

* mapslices running

* Simple mapslices works

* Access tests pass

* Tests pass with new data type

* Code improvements and TODOs based on @felixcremer s comments

* add CubeAxis import to Shapes.jl

* some fixes

* fix bug when subsetting concatcubes

* Implement changes suggested by @felixcremer

* Add sync after exporting cube

* Require NetCDF 0.10

* make it load again

* Fix ambiguity

* Split packages (#198)

* Removed a lot of files

* move some files

* break more things

* Clean up Proc module

* Many fixes to get dataset creation running again

* savecube to netcdf works

* more improvements

* Commit many more changes

* Separate out API

* some test fixes

* switch to YAXArrayBase

* read variables in a typed way

* add ESDLArray export

* Fixes to remap

* remove NetCDF and Zarr dependency

* delete some unused stuff

* Do not permute cubes by default

* Add concatenations

* Fixes to findaxis

* Tests pass again

* Try to fix chunking

* dont depend on reexport

* remove reexport dep

* dont forgegt using it

* update test

* Allow NamedTuples

* minor changes

* Finalsplit (#220)

* Delete almost everything

* More updates

* Old tests for ESDL pass

* More changes

* Fix bugs in ESDL

* Some fixes

* Fix some warnings

* tests should pass

Co-authored-by: Felix Cremer <[email protected]>
Co-authored-by: Guido Kraemer <[email protected]>
@meggart meggart deleted the finalsplit branch March 17, 2021 14:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants