add rasterio to docker image? #249

Closed
rabernat opened this issue May 11, 2018 · 7 comments

@rabernat
Member

I wanted to try an example using rasterio to read from some of the cloud-optimized GeoTIFF archives on S3 (from pangeo.esipfed.org). Providing an example that works with imagery could open up a huge new community to Pangeo.

However, we don't currently have rasterio in the Pangeo Docker images. Is this a dependency we should add? It is very heavy and forces changes to many package versions: the solver plan below downloads about 143 MB and downgrades libnetcdf, netcdf4, numpy, scipy, and scikit-learn. This would make the Docker images much bigger and could break other dependencies (e.g. netCDF) in unforeseen ways.

Thoughts?

$ conda install -c conda-forge rasterio
Solving environment: done


==> WARNING: A newer version of conda exists. <==
  current version: 4.4.10
  latest version: 4.5.3

Please update conda by running

    $ conda update -n base conda



## Package Plan ##

  environment location: /opt/conda

  added / updated specs:
    - rasterio


The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    rasterio-0.36.0            |           py36_3         3.4 MB  conda-forge
    proj4-4.9.3                |                5         4.0 MB  conda-forge
    geos-3.6.2                 |                1        19.9 MB  conda-forge
    scikit-learn-0.19.1        |py36_nomklh27f7947_0         5.2 MB  defaults
    openjpeg-2.3.0             |                2         372 KB  conda-forge
    freexl-1.0.5               |                0         135 KB  conda-forge
    libkml-1.3.0               |                6         604 KB  conda-forge
    scipy-1.1.0                |py36_nomklh9d22d0a_0        18.1 MB  defaults
    libgfortran-ng-7.2.0       |       hdf63c60_3         1.2 MB  defaults
    libpq-9.6.6                |       h4e02ad2_0         101 KB  defaults
    boost-1.66.0               |           py36_1         317 KB  conda-forge
    libnetcdf-4.4.1.1          |               10         2.0 MB  conda-forge
    giflib-5.1.4               |                0         200 KB  conda-forge
    snuggs-1.4.1               |           py36_0           9 KB  conda-forge
    affine-2.2.0               |             py_0          15 KB  conda-forge
    gdal-2.2.2                 |   py36hc209d97_1         767 KB  defaults
    click-plugins-1.0.3        |           py36_0           7 KB  conda-forge
    blas-1.0                   |           noblas           1 KB  conda-forge
    netcdf4-1.3.1              |           py36_1         2.9 MB  conda-forge
    json-c-0.12.1              |                0          47 KB  conda-forge
    libspatialite-4.3.0a       |      h72746d6_18         3.1 MB  defaults
    libgdal-2.2.2              |       h804cdde_1        16.1 MB  defaults
    util-linux-2.21            |                0          35 KB  defaults
    poppler-data-0.4.9         |                0         3.5 MB  conda-forge
    libopenblas-0.2.20         |       h9ac9557_4         8.7 MB  defaults
    bzip2-1.0.6                |                1         310 KB  conda-forge
    boost-cpp-1.66.0           |                1        18.8 MB  conda-forge
    xerces-c-3.2.1             |                0         3.9 MB  conda-forge
    cairo-1.14.12              |       h77bcde2_0         1.3 MB  defaults
    numpy-1.14.2               |py36_nomklh2b20989_1         4.0 MB  defaults
    cligj-0.4.0                |           py36_0          12 KB  conda-forge
    poppler-0.60.1             |       hc909a00_0         6.6 MB  defaults
    pixman-0.34.0              |                2         1.2 MB  conda-forge
    kealib-1.4.7               |                4         173 KB  conda-forge
    libdap4-3.19.2             |                1        15.6 MB  conda-forge
    ------------------------------------------------------------
                                           Total:       142.7 MB

The following NEW packages will be INSTALLED:

    affine:         2.2.0-py_0                    conda-forge
    boost:          1.66.0-py36_1                 conda-forge
    boost-cpp:      1.66.0-1                      conda-forge
    bzip2:          1.0.6-1                       conda-forge
    cairo:          1.14.12-h77bcde2_0            defaults
    click-plugins:  1.0.3-py36_0                  conda-forge
    cligj:          0.4.0-py36_0                  conda-forge
    freexl:         1.0.5-0                       conda-forge
    gdal:           2.2.2-py36hc209d97_1          defaults
    geos:           3.6.2-1                       conda-forge
    giflib:         5.1.4-0                       conda-forge
    json-c:         0.12.1-0                      conda-forge
    kealib:         1.4.7-4                       conda-forge
    libdap4:        3.19.2-1                      conda-forge
    libgdal:        2.2.2-h804cdde_1              defaults
    libgfortran-ng: 7.2.0-hdf63c60_3              defaults
    libkml:         1.3.0-6                       conda-forge
    libopenblas:    0.2.20-h9ac9557_4             defaults
    libpq:          9.6.6-h4e02ad2_0              defaults
    libspatialite:  4.3.0a-h72746d6_18            defaults
    openjpeg:       2.3.0-2                       conda-forge
    pixman:         0.34.0-2                      conda-forge
    poppler:        0.60.1-hc909a00_0             defaults
    poppler-data:   0.4.9-0                       conda-forge
    proj4:          4.9.3-5                       conda-forge
    rasterio:       0.36.0-py36_3                 conda-forge
    snuggs:         1.4.1-py36_0                  conda-forge
    util-linux:     2.21-0                        defaults
    xerces-c:       3.2.1-0                       conda-forge

The following packages will be DOWNGRADED:

    blas:           1.1-openblas                  conda-forge --> 1.0-noblas                  conda-forge
    libnetcdf:      4.6.1-2                       conda-forge --> 4.4.1.1-10                  conda-forge
    netcdf4:        1.4.0-py36_0                  conda-forge --> 1.3.1-py36_1                conda-forge
    numpy:          1.14.3-py36_blas_openblas_200 conda-forge [blas_openblas] --> 1.14.2-py36_nomklh2b20989_1 defaults    [nomkl]
    scikit-learn:   0.19.1-py36_blas_openblas_201 conda-forge [blas_openblas] --> 0.19.1-py36_nomklh27f7947_0 defaults    [nomkl]
    scipy:          1.1.0-py36_blas_openblas_200  conda-forge [blas_openblas] --> 1.1.0-py36_nomklh9d22d0a_0  defaults    [nomkl]

Proceed ([y]/n)? n
@jhamman
Member

jhamman commented May 11, 2018

I suggest we give it a try and if the image size is truly causing problems, we rethink the decision.

@rsignell-usgs
Member

rsignell-usgs commented May 11, 2018

I was experimenting with rasterio this morning on our http://pangeo.esipfed.org instance. I installed it from the conda-forge/label/dev channel (-c conda-forge/label/dev), and the notebook image is 3.8 GB and the worker image is 3.1 GB.

It works great single-threaded, but with distributed I'm currently getting errors like:

distributed.protocol.core - CRITICAL - Failed to Serialize
Traceback (most recent call last):
  File "/opt/conda/lib/python3.6/site-packages/distributed/protocol/pickle.py", line 38, in dumps
    result = pickle.dumps(x, protocol=pickle.HIGHEST_PROTOCOL)
TypeError: can't pickle rasterio._io.RasterReader objects

which I'm trying to figure out.

@jhamman
Member

jhamman commented May 11, 2018

In [1]: import rasterio

In [2]: ds = rasterio.open('RGB.byte.tif')

In [3]: ds
Out[3]: <open RasterReader name='RGB.byte.tif' mode='r'>

In [4]: import pickle

In [5]: pickle.dumps(ds)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-5-a165c2473431> in <module>()
----> 1 pickle.dumps(ds)

TypeError: can't pickle rasterio._io.RasterReader objects

I think we should probably open an issue in xarray to discuss creating a picklable wrapper for rasterio objects, much like the one we already have for netCDF objects.
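
For anyone following along, here is a minimal sketch of the general "picklable wrapper" idea. It is not xarray's implementation: the class name `ReopenOnUnpickle` and its attributes are hypothetical, made up for illustration. The wrapper pickles the recipe for opening a handle (a callable plus its arguments) rather than the live handle, and reopens lazily on the other side. The demo uses the built-in `open()`, whose file objects are also unpicklable, so it runs without rasterio; passing `rasterio.open` as the opener should behave the same way, assuming the path is reachable from every worker.

```python
import os
import pickle
import tempfile


class ReopenOnUnpickle:
    """Hypothetical wrapper: pickle how to open a handle, not the handle itself."""

    def __init__(self, opener, *args, **kwargs):
        self._opener = opener    # any picklable callable, e.g. rasterio.open (assumed usage)
        self._args = args
        self._kwargs = kwargs
        self._handle = None      # opened lazily, never pickled

    @property
    def handle(self):
        # Open on first access, then reuse the same handle.
        if self._handle is None:
            self._handle = self._opener(*self._args, **self._kwargs)
        return self._handle

    def close(self):
        if self._handle is not None:
            self._handle.close()
            self._handle = None

    def __getstate__(self):
        # Drop the live handle; keep only what is needed to reopen it.
        return {"opener": self._opener, "args": self._args, "kwargs": self._kwargs}

    def __setstate__(self, state):
        self._opener = state["opener"]
        self._args = state["args"]
        self._kwargs = state["kwargs"]
        self._handle = None


# Demo with the built-in open() standing in for rasterio.open.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("hello")

wrapped = ReopenOnUnpickle(open, tmp.name)
first_read = wrapped.handle.read()

# The round trip succeeds even though wrapped currently holds an open file.
restored = pickle.loads(pickle.dumps(wrapped))
second_read = restored.handle.read()

wrapped.close()
restored.close()
os.unlink(tmp.name)
```

The trade-off is that each unpickled copy reopens the file independently, so any state on the original handle (such as the read position) is not carried over.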

@jhamman
Member

jhamman commented May 15, 2018

I've started working on a fix for the rasterio+distributed issue. pydata/xarray#2131

@jhamman
Member

jhamman commented Jun 7, 2018

@rsignell-usgs - this should be working in xarray now. Any chance we can convince you to donate a working notebook as an example for Pangeo's website (http://pangeo-data.org/use_cases/index.html)?

@stale

stale bot commented Aug 6, 2018

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Aug 6, 2018
@stale

stale bot commented Aug 13, 2018

This issue has been automatically closed because it had not seen recent activity. The issue can always be reopened at a later date.

@stale stale bot closed this as completed Aug 13, 2018