to_netcdf problem with echopype installed using pip install #801

Closed
emiliom opened this issue Sep 5, 2022 · 10 comments · Fixed by #843
Labels
bug Something isn't working

Comments

@emiliom
Collaborator

emiliom commented Sep 5, 2022

When echopype is installed with pip install echopype, saving an xarray dataset to netCDF fails. The problem involves the compression of unicode strings and is external to echopype itself. It was reported in #789.

I'm describing the problem in more detail here, copying directly from that other issue. Both environments were created with echopype 0.6.2.

This simple freestanding code illustrates the problem:

import numpy as np
import xarray as xr

x = np.array([1,2],dtype='U') #This results in RuntimeError.
#x = np.array([1,2],dtype='f') #This is working well.
da = xr.DataArray(x,name="x")

encoding={'x': {'zlib': True, 'complevel': 4}}
da.to_netcdf(path="output.nc", encoding=encoding)

This environment generates the error using the sample code:

conda create -n echopype_pip -c conda-forge python=3.9 ipykernel
conda activate echopype_pip
pip install echopype

In contrast, with this similar environment where echopype is installed from the conda package, the same code runs without errors:

conda create -n echopype_conda -c conda-forge python=3.9 ipykernel echopype
conda activate echopype_conda
emiliom added the bug label Sep 5, 2022
emiliom changed the title from "netcdf problem with echopype installed using pip install" to "to_netcdf problem with echopype installed using pip install" Sep 8, 2022
@erikatbeof

@emiliom Is anyone looking into this?

@emiliom
Collaborator Author

emiliom commented Sep 23, 2022

Not yet. Have you tried using conda?

@leewujung leewujung moved this to Todo in Echopype Oct 6, 2022
@erikatbeof

I went through the process of installing conda, but the demo, echopype_tour.ipynb, fails to run:


ModuleNotFoundError Traceback (most recent call last)
Input In [1], in <cell line: 5>()
2 from pathlib import Path
4 import fsspec
----> 5 import geopandas as gpd
6 import xarray as xr
8 import matplotlib.pyplot as plt

ModuleNotFoundError: No module named 'geopandas'

@leewujung
Member

@erikatbeof : not sure what you're running into there. Try conda list to see if you actually have geopandas in your environment since the error message is not finding the module.

@erikatbeof

I tried that, of course, and it's not installed. But I got the impression that using conda would be an easy, one-line solution. It appears not to be. On the contrary, conda is clunky. I advise you to make your code work well with pip and abandon conda.

@leewujung
Member

leewujung commented Oct 11, 2022

@erikatbeof : Heh, I don't think abandoning conda is the right solution. If you are reluctant to fix your environment, that is your choice, though an unfortunate one.

The problem you're running into is external to the echopype codebase and is due to netcdf4. To see that, create the following environment that has nothing to do with echopype:

$ conda create -n min_test -c conda-forge python=3.9 ipykernel -y
$ conda activate min_test
$ pip install netcdf4 xarray

Once you have this, run the following, as in the first comment:

import numpy as np
import xarray as xr

x = np.array([1,2],dtype='U')  #This results in RuntimeError.
#x = np.array([1,2],dtype='f')  #This is working well.
da = xr.DataArray(x,name="x")

encoding={'x': {'zlib': True, 'complevel': 4}}
da.to_netcdf(path="output.nc", encoding=encoding)  # this will error out

If you create the environment below instead:

$ conda create -n min_test -c conda-forge python=3.9 ipykernel netcdf4 -y  # use netcdf4 from conda-forge
$ conda activate min_test
$ pip install netcdf4 xarray

The da.to_netcdf(path="output.nc", encoding=encoding) line above will then run without problems.

@lsetiawan
Member

Looks like the core of this issue comes from the underlying netcdf-c. The wheel for netcdf4 >= 1.6.0 coming from PyPI is built with netcdf-c 4.9.0, which is the problem (see discussion). Looking at the conda install, we're actually getting netcdf-c version 4.8.1, which does not cause an error. I think for now echopype should pin netcdf4 to <1.6 to avoid any errors.
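To compare the two environments, it can help to print which netcdf4 package and which netcdf-c library are actually in use. The following is a minimal sketch, not part of echopype: the helper names is_affected_version and report_netcdf_versions are hypothetical, and the 1.6 threshold is taken only from the pin proposed in this thread.

```python
import re
from importlib import metadata


def is_affected_version(version: str) -> bool:
    """Return True if a netCDF4 version string is 1.6 or newer.

    Assumption: the 1.6 threshold mirrors the "<1.6" pin discussed in this
    thread; it is not an official compatibility statement.
    """
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    return (major, minor) >= (1, 6)


def report_netcdf_versions() -> None:
    """Print the installed netCDF4 package version and the linked netcdf-c version."""
    try:
        pkg_version = metadata.version("netCDF4")
    except metadata.PackageNotFoundError:
        print("netCDF4 is not installed in this environment")
        return

    import netCDF4  # imported lazily so is_affected_version works without it

    print("netCDF4 (Python package):", pkg_version)
    print("netcdf-c (linked library):", netCDF4.__netcdf4libversion__)
    print("satisfies the <1.6 pin:", not is_affected_version(pkg_version))


if __name__ == "__main__":
    report_netcdf_versions()
```

Running this in both the pip-based and the conda-based environment should show whether they differ in the linked netcdf-c version.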

The true cause of the issue is that compression does not work with variable-length types, so having 'zlib': True in the encoding is the problem. If you run the above example with 'zlib': False, everything works just fine, but then there's no compression happening. Hope this helps sort this out.
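One way to keep compression for numeric variables while avoiding it for strings is to build the encoding per variable. This is only a sketch under stated assumptions: compression_encoding is a hypothetical helper, and treating the NumPy dtype kinds 'U', 'S' and 'O' as string data is a conservative guess, not something confirmed in this thread.

```python
from typing import Dict, Mapping

# Assumption: these NumPy dtype kinds ('U' unicode, 'S' bytes, 'O' object)
# are treated as string data and therefore left uncompressed.
STRING_KINDS = frozenset("USO")


def compression_encoding(
    dtype_kinds: Mapping[str, str], complevel: int = 4
) -> Dict[str, dict]:
    """Build a to_netcdf-style encoding dict from variable name -> dtype kind.

    String-like variables get zlib disabled; all others get zlib enabled
    at the requested compression level.
    """
    encoding = {}
    for name, kind in dtype_kinds.items():
        if kind in STRING_KINDS:
            encoding[name] = {"zlib": False}
        else:
            encoding[name] = {"zlib": True, "complevel": complevel}
    return encoding


# Example with the variable from the first comment plus a float variable.
print(compression_encoding({"x": "U", "y": "f"}))
```

With xarray, the kinds could be collected as {name: var.dtype.kind for name, var in ds.variables.items()} and the result passed as the encoding argument of to_netcdf.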

@emiliom
Collaborator Author

emiliom commented Oct 11, 2022

Thank you @lsetiawan ! Your PR #843 (pinning netcdf4 to <1.6) fixes the issue. This fix will make it into the next echopype release, so that pip install echopype should work out of the box because it'll install a version of netcdf4 that doesn't introduce this problem.

@lsetiawan
Member

Currently, when I install the echopype dev branch, the problem doesn't show up:

Screenshot from 2022-10-11 10-13-22

@leewujung
Member

This is resolved now. Closing.

Repository owner moved this from Todo to Done in Echopype Oct 15, 2022