Jupyter docker: new full build with latest of almost everything except xclim and ravenpy to smooth transition #121

Merged
tlvu merged 110 commits into master from new-docker-build
May 9, 2024

@tlvu tlvu commented Jun 30, 2023

Overview

This new full build has the latest of almost everything except xclim and ravenpy, as an intermediate step to smooth the transition to the pandas 2.2 frequency-string changes.
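For readers unfamiliar with the pandas 2.2 change: several offset aliases used in calls such as `resample` or `date_range` were deprecated in favour of new spellings (for example `M` became `ME`). The snippet below is only a minimal sketch of that renaming. The helper name `modernize_freq` is hypothetical, the mapping is deliberately partial, and it does not apply to period frequencies, so check the pandas release notes before relying on it.

```python
import re

# Partial mapping of offset aliases deprecated in pandas 2.2 to their
# replacements. Not exhaustive: an illustrative subset only.
_RENAMED = {
    "M": "ME",   # month end
    "Q": "QE",   # quarter end
    "Y": "YE",   # year end
    "H": "h",    # hour
    "T": "min",  # minute
    "S": "s",    # second
}

# optional multiplier, alias letters, optional anchor such as "-DEC"
_FREQ_RE = re.compile(r"^(\d*)([A-Za-z]+)(-.+)?$")


def modernize_freq(freq: str) -> str:
    """Return `freq` with a deprecated alias replaced, if it is in the map."""
    match = _FREQ_RE.match(freq)
    if match is None:
        return freq  # leave anything we do not understand untouched
    multiplier, alias, anchor = match.group(1), match.group(2), match.group(3) or ""
    return f"{multiplier}{_RENAMED.get(alias, alias)}{anchor}"
```

Aliases that did not change, such as `MS` or `D`, pass through unmodified.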

Changes

  • New: save conda env export, DockerHub build logs and Jenkins test result in the repo to track changes much more easily between releases

  • Jenkins: add SAVE_RESULTING_NOTEBOOK_TIMEOUT for slow notebooks or slow machines

  • Jupyter env changes:

    • add conda-pack so we can export the conda env outside of the docker image if we need to run it locally without docker
    • upgrade from Python 3.9 to 3.11
    • Relevant changes (alphabetical order):
-  - birdy=0.8.4=pyh1a96a4e_0
+      - birdhouse-birdy==0.8.7

# major upgrade from v2 to v3
-  - bokeh=2.4.3=pyhd8ed1ab_3
+  - bokeh=3.4.1=pyhd8ed1ab_0

-  - cartopy=0.21.1=py39h6e7ad6e_0
+  - cartopy=0.23.0=py311h320fe9a_0

-  - cf_xarray=0.8.0=pyhd8ed1ab_0
+  - cf_xarray=0.9.0=pyhd8ed1ab_0

-  - cfgrib=0.9.10.4=pyhd8ed1ab_0
+  - cfgrib=0.9.11.0=pyhd8ed1ab_0

-  - cftime=1.6.2=py39h2ae25f5_1
+  - cftime=1.6.3=py311h1f0f07a_0

-  - climpred=2.3.0=pyhd8ed1ab_0
+  - climpred=2.4.0=pyhd8ed1ab_0

-  - clisops=0.9.6=pyh1a96a4e_0
+  - clisops=0.13.0=pyhca7485f_0

-  - dask=2023.5.1=pyhd8ed1ab_0
+  - dask=2024.5.0=pyhd8ed1ab_0

-  - geopandas=0.13.0=pyhd8ed1ab_0
+  - geopandas=0.14.4=pyhd8ed1ab_0

-  - hvplot=0.8.3=pyhd8ed1ab_0
+  - hvplot=0.9.2=pyhd8ed1ab_0

-  - numpy=1.23.5=py39h3d75532_0
+  - numpy=1.24.4=py311h64a7726_0

-  - numba=0.57.0=py39hb75a051_1
+  - numba=0.59.1=py311h96b013e_0

# major upgrade from v1 to v2
-  - pandas=1.3.5=py39hde0f152_0
+  - pandas=2.1.4=py311h320fe9a_0

# major upgrade to v1
-  - panel=0.14.4=pyhd8ed1ab_0
+  - panel=1.4.2=pyhd8ed1ab_0

# major upgrade from v1 to v2
-  - pydantic=1.10.8=py39hd1e30aa_0
+  - pydantic=2.7.1=pyhd8ed1ab_0

# Python 3.9 to 3.11
-  - python=3.9.16=h2782a2a_0_cpython
+  - python=3.11.6=hab00c5b_0_cpython

-  - raven-hydro=0.2.1=py39h8e2dbb5_1
+  - raven-hydro=0.2.4=py311h64a4d7b_0

-  - ravenpy=0.12.1=py39hf3d152e_0
+      - ravenpy==0.13.1

-  - rioxarray=0.14.1=pyhd8ed1ab_0
+  - rioxarray=0.15.5=pyhd8ed1ab_0

-  - roocs-utils=0.6.4=pyh1a96a4e_0
+  - roocs-utils=0.6.8=pyhd8ed1ab_0

-  - scipy=1.9.1=py39h8ba3f38_0
+  - scipy=1.13.0=py311h517d4fd_1

-  - xarray=2023.1.0=pyhd8ed1ab_0
+  - xarray=2023.8.0=pyhd8ed1ab_0

-  - xclim=0.43.0=py39hf3d152e_1
+  - xclim=0.47.0=py311h38be061_0

-  - xesmf=0.7.1=pyhd8ed1ab_0
+  - xesmf=0.8.5=pyhd8ed1ab_0

-  - xskillscore=0.0.24=pyhd8ed1ab_0
+  - xskillscore=0.0.26=pyhd8ed1ab_0

+  - xscen=0.8.2=pyhd8ed1ab_0

+      - figanos==0.3.0

-      - xncml==0.2
+      - xncml==0.4.0

Test

Related Issue / Discussion

Additional Information

Full diff conda env export:
81deb99...931cfc9#diff-e8f2a6a53085ae29bb7cedc701c1d345a330651ae971555e85a5c005e94f4cd9

Full new conda env export:
https://github.com/Ouranosinc/PAVICS-e2e-workflow-tests/blob/931cfc924a147d07b59e88badff9f170e852a03b/docker/saved_buildout/conda-env-export.yml

DockerHub build log
https://github.com/Ouranosinc/PAVICS-e2e-workflow-tests/blob/931cfc924a147d07b59e88badff9f170e852a03b/docker/saved_buildout/docker-buildlogs.txt

tlvu added 24 commits July 4, 2023 12:33
Anything outside of /notebook_dir/writable-workspace/ is not persisted
on disk.

So we make /notebook_dir/ read-only to prevent users from accidentally
saving valuable data there and losing it.

We need to pre-create /notebook_dir/writable-workspace/ and
/notebook_dir/pavics-homepage/ because Jenkins will write there, so these
folders must remain writable.  See
https://github.com/Ouranosinc/PAVICS-landing/blob/f733612c9805bbd46da7d8204497553d891ffd1b/content/notebooks/climate_indicators/setup_dirlayout.sh#L14-L27
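The layout described in this commit message can be sketched as follows. This is an illustration only, assuming a POSIX file system; the function name `lock_down` is hypothetical and the actual image most likely does this with shell commands in its Dockerfile.

```python
import os


def lock_down(root: str, writable_subdirs: list[str]) -> None:
    """Pre-create writable sub-folders, then make `root` itself read-only."""
    for name in writable_subdirs:
        path = os.path.join(root, name)
        os.makedirs(path, exist_ok=True)
        # chmod explicitly: the mode given to makedirs is filtered by the umask.
        os.chmod(path, 0o777)
    # r-x for everyone: existing entries can be listed and traversed, but
    # nothing new can be created directly under `root`.
    os.chmod(root, 0o555)
```

The sub-folders have to exist before the parent is locked, otherwise nothing could create them afterwards. Note that a process running as root bypasses these permission bits.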
tlvu added 2 commits May 7, 2024 16:25
Using the latest 0.4x series, hopefully it is compatible.

```
Could not solve for environment specs
The following packages are incompatible
├─ jupyterlab-git 0.50.0  is installable with the potential options
│  ├─ jupyterlab-git 0.50.0 would require
│  │  └─ jupyterlab >=4,<5 , which can be installed;
│  └─ jupyterlab-git 0.50.0 conflicts with any installable versions previously reported;
└─ jupyterlab-topbar is not installable because there are no viable options
   ├─ jupyterlab-topbar 0.6.1 would require
   │  └─ jupyterlab >=3.0.0,<4 , which conflicts with any installable versions previously reported;
   └─ jupyterlab-topbar 0.6.1 would require
      └─ jupyterlab >=3.0.0rc10,==3.* , which does not exist (perhaps a missing channel).
The command '/bin/sh -c umask 0000     && pip uninstall -y ravenpy     && pip install --no-cache-dir git+https://github.com/CSHS-CWRA/RavenPy.git@1977732c8dbf39ea7d563e7e30052707ba8fb2ec     && mamba install -c conda-forge -c cdat -c bokeh -c plotly -c pyviz/label/dev -c defaults -n birdy jupyterlab-git==0.50.0     && mamba clean --all --yes' returned a non-zero code: 1
```
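The solver output above boils down to two version ranges for `jupyterlab` that do not intersect: `>=4,<5` (from jupyterlab-git 0.50.0) and `>=3.0.0,<4` (from jupyterlab-topbar 0.6.1). The sketch below shows that check in isolation. It is a simplification with hypothetical helper names, handles only plain numeric versions with up to three components, and is not how mamba resolves environments.

```python
def parse(version: str) -> tuple[int, ...]:
    """Turn "3.0.0" or "4" into a 3-tuple so "4" and "4.0.0" compare equal."""
    parts = [int(p) for p in version.split(".")]
    return tuple(parts + [0] * (3 - len(parts)))


def ranges_overlap(a: tuple[str, str], b: tuple[str, str]) -> bool:
    """Each range is (lower_inclusive, upper_exclusive), i.e. ">=lo,<hi"."""
    lower = max(parse(a[0]), parse(b[0]))
    upper = min(parse(a[1]), parse(b[1]))
    return lower < upper


jupyterlab_git_needs = ("4", "5")         # jupyterlab >=4,<5
jupyterlab_topbar_needs = ("3.0.0", "4")  # jupyterlab >=3.0.0,<4
compatible = ranges_overlap(jupyterlab_git_needs, jupyterlab_topbar_needs)
```

Here `compatible` is `False`, which matches the "Could not solve for environment specs" failure.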
@tlvu tlvu changed the title from "New docker build" to "Jupyter docker: new full build with latest of almost everything except xclim and ravenpy to smooth transition" May 8, 2024
To fix notebook 02_Extract_geographical_watershed_properties.ipynb.
@tlvu tlvu merged commit c7af8b8 into master May 9, 2024
@tlvu tlvu deleted the new-docker-build branch May 9, 2024 17:21
tlvu added a commit to Ouranosinc/pavics-sdi that referenced this pull request May 9, 2024
tlvu added a commit to CSHS-CWRA/RavenPy that referenced this pull request May 9, 2024
In notebook 02 using the PAVICS beta image `stats_resp.get(asobj=True)`
returns an `xr.Dataset` ... plotting requires accessing the `band_data`
variable

See Ouranosinc/pavics-jupyter-env-issues#7

Note this will break the notebook when using `current` image. Do not
merge until all fixes are ready

Fix for
Ouranosinc/PAVICS-e2e-workflow-tests#121.
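One way a notebook could tolerate both return types is to select the variable only when a dataset is received. This is a hedged sketch: `as_plottable` is a hypothetical helper, not part of RavenPy or birdy, and it assumes the older image returns an object that can be plotted directly. It relies on the fact that `xarray.Dataset` exposes a `data_vars` attribute while `xarray.DataArray` does not.

```python
def as_plottable(obj, var="band_data"):
    """Return the variable `var` from a Dataset-like object, else `obj` itself."""
    # xarray.Dataset has `data_vars`; xarray.DataArray does not.
    if hasattr(obj, "data_vars"):
        return obj[var]
    return obj
```

In the notebook this would be used as `as_plottable(stats_resp.get(asobj=True)).plot()`, assuming the returned object supports `.plot()` once the variable is selected.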

tlvu commented May 9, 2024

@tlogan2000 The new Jupyter env is live. You can send an announcement asking our users to restart their servers to get the new version.

The previous "stable" version is now the new "previous" version. I've also added a "last-py38-stable-version" to preserve our last Python 3.8 env in case our users need an even older version to reproduce their work.

[Screenshot: 20240509_141805]

@Zeitsperre

We need to cull a few of those entries. beta, alpha, gamma, and last are a lot of development options for images.

@tlvu
tlvu commented May 9, 2024

> We need to cull a few of those entries. beta, alpha, gamma, and last are a lot of development options for images.

Well, beta, alpha, and gamma are placeholders for when we have many test candidates with slight variations to test for bug fixes. As we have just experienced, it's pretty labor-intensive to release a fully tested/certified Jupyter env, so all those placeholders do help during the testing phase.

Previously we deployed those test versions on our staging host, but since the staging host does not have all the data, we could not fully test those environments. The only 100% sure way to avoid surprises is to test directly on the production host! I've seen cases where JupyterHub was unable to launch the JupyterLab server! Having the test env on the production host eliminates this issue.

All the app-specific envs are for our apps to reproduce the exact env of their runtime. Not all of them, as you can see, are on the "Stable" image. Since they have their own versions, they are not forced to update at the same time we release a new Stable version. Just ensuring all the notebooks pass is already far too labor-intensive; I do not want to add all the apps to the certification list!

The "last" entries stem from an old remark by David. We will eventually have last-py39, last-py311 ...

So that's the reason for all those multiple entries.

Oh, I forgot to add that we are trying to release more often, so we are continuously testing newer versions. Those beta, alpha, and gamma placeholders will continue to be useful.

tlvu added a commit to bird-house/birdhouse-deploy that referenced this pull request May 11, 2024
tlvu added a commit that referenced this pull request Nov 14, 2024
# Overview

New full build with latest of everything.


## Changes

- New: save output of `conda env export` and environment size directly
in the docker image for ease of tracking change between releases.
- Updated pull request template for new docker release.
- Jupyter env changes:
  - Unpin `libnetcdf`
(Ouranosinc/PAVICS-landing#66 fixed).
  - Avoid `dask` 2024.11.0 (pin `dask != 2024.11.0`) due to bugs with the Raven notebooks.
  - Relevant changes (alphabetical order):
```diff
-  - bokeh=3.4.1=pyhd8ed1ab_0    
+  - bokeh=3.5.2=pyhd8ed1ab_0          

-  - cartopy=0.23.0=py311h320fe9a_0
+  - cartopy=0.24.0=py311h7db5c69_0

-  - cf_xarray=0.9.0=pyhd8ed1ab_0
+  - cf_xarray=0.10.0=pyhd8ed1ab_0

-  - cfgrib=0.9.11.0=pyhd8ed1ab_0      
+  - cfgrib=0.9.14.1=pyhd8ed1ab_0

-  - cftime=1.6.3=py311h1f0f07a_0
+  - cftime=1.6.4=py311h9f3472d_1   
   
-  - climpred=2.4.0=pyhd8ed1ab_0 
+  - climpred=2.5.0=pyhd8ed1ab_0   

-  - clisops=0.13.0=pyhca7485f_0 
+  - clisops=0.14.1=pyhd8ed1ab_0  

-  - dask=2024.5.0=pyhd8ed1ab_0   
+  - dask=2024.10.0=pyhd8ed1ab_0     

-  - esmf=8.4.0=nompi_hdb2cfa9_4      
+  - esmf=8.6.1=nompi_h4441c20_3   

-  - fiona=1.9.1=py311h3f14cef_0      
+  - fiona=1.9.5=py311hf8e0aa6_2    

-  - gdal=3.6.2=py311hadb6153_6          
+  - gdal=3.8.5=py311hf92cf48_11  

-  - geopandas=0.14.4=pyhd8ed1ab_0
+  - geopandas=1.0.1=pyhd8ed1ab_1      

-  - hvplot=0.9.2=pyhd8ed1ab_0        
+  - hvplot=0.11.1=pyhd8ed1ab_0    

-  - libnetcdf=4.8.1=nompi_h261ec11_106
+  - libnetcdf=4.9.2=nompi_h135f659_114

-  - numba=0.59.1=py311h96b013e_0     
+  - numba=0.60.0=py311h4bc866e_0        

-  - numpy=1.24.4=py311h64a7726_0 
+  - numpy=1.26.4=py311h64a7726_0  

-  - owslib=0.28.1=pyhd8ed1ab_0      
+  - owslib=0.32.0=pyhd8ed1ab_0       

-  - pandas=2.1.4=py311h320fe9a_0  
+  - pandas=2.2.3=py311h7db5c69_1     

-  - panel=1.4.2=pyhd8ed1ab_0       
+  - panel=1.5.3=pyhd8ed1ab_0         

-  - pydantic=2.7.1=pyhd8ed1ab_0
+      - pydantic==2.7.4                                                       

-  - pyogrio=0.5.1=py311h3f14cef_0                                                    
+  - pyogrio=0.7.2=py311hf8e0aa6_1 

-  - python=3.11.6=hab00c5b_0_cpython               
+  - python=3.11.10=hc5c86c4_3_cpython              

-  - rasterio=1.3.6=py311h567e639_0                 
+  - rasterio=1.3.10=py311h239598e_2                

-  - raven-hydro=0.2.4=py311h64a4d7b_0              
+  - raven-hydro=0.3.2=py311h81cb690_1              

-      - ravenpy==0.13.1     
+  - ravenpy=0.16.0=pyhd8ed1ab_0                                                                         

-  - rioxarray=0.15.5=pyhd8ed1ab_0                  
+  - rioxarray=0.17.0=pyhd8ed1ab_0                         

-  - roocs-utils=0.6.8=pyhd8ed1ab_0                        
+  - roocs-utils=0.6.9=pyhd8ed1ab_0                        

-  - scipy=1.13.0=py311h517d4fd_1                          
+  - scipy=1.14.1=py311he9a78e4_1                          

-  - shapely=2.0.1=py311h0f577a2_0                         
+  - shapely=2.0.4=py311h0bed3d6_1                         

-  - xarray=2023.8.0=pyhd8ed1ab_0                          
+  - xarray=2024.9.0=pyhd8ed1ab_1                          

-  - xclim=0.47.0=py311h38be061_0                          
+  - xclim=0.53.2=pyhd8ed1ab_0                                                 

-  - xesmf=0.8.5=pyhd8ed1ab_0                                                                                          
+  - xesmf=0.8.8=pyhd8ed1ab_0                                                  

-  - xscen=0.8.2=pyhd8ed1ab_0                                                  
+  - xscen=0.10.1=pyhd8ed1ab_0                                                 

```
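A list like the one above can be derived from two saved `conda env export` files. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the input is the already-extracted dependency lines (not the whole YAML file), and every line is a well-formed `name=version=build` conda entry or `name==version` pip entry.

```python
def parse_spec(line: str) -> tuple[str, str]:
    """Extract (name, version) from one dependency line of an env export."""
    spec = line.strip().lstrip("-").strip()
    if "==" in spec:
        # pip-style entry: name==version
        name, version = spec.split("==", 1)
    else:
        # conda-style entry: name=version=build
        name, version = spec.split("=")[:2]
    return name, version


def changed_versions(old_lines, new_lines):
    """Map each package whose version differs to (old_version, new_version).

    A package present on only one side gets None for the missing side.
    """
    old = dict(parse_spec(line) for line in old_lines)
    new = dict(parse_spec(line) for line in new_lines)
    return {
        name: (old.get(name), new.get(name))
        for name in sorted(old.keys() | new.keys())
        if old.get(name) != new.get(name)
    }
```

Build strings are ignored on purpose, so a rebuild of the same version is not reported as a change. A package that moves between conda and pip, as `pydantic` and `ravenpy` do here, is still matched by name.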


## Test

- Deployed as "beta" image in production for bokeh visualization
performance regression testing.
- Manually tested notebook
https://github.com/Ouranosinc/PAVICS-landing/blob/master/content/notebooks/climate_indicators/PAVICStutorial_ClimateDataAnalysis-5Visualization.ipynb
for bokeh visualization performance; it looks fine.
- Jenkins build:
  - Default notebooks, all passed:
https://github.com/Ouranosinc/PAVICS-e2e-workflow-tests/blob/f8c1c585a4f1b17c0a3dc67453695deb2c30fb11/docker/saved_buildout/jenkins-buildlogs-default.txt
  - Raven notebooks, only the known `HydroShare_integration.ipynb` failure:
https://github.com/Ouranosinc/PAVICS-e2e-workflow-tests/blob/0b3419bba4aa42ee5a1fa9a8de0c8a6b91bf8547/docker/saved_buildout/jenkins-buildlogs-raven.txt


## Related Issue / Discussion

- Matching notebook fixes:
  - Pavics-sdi: Ouranosinc/pavics-sdi#336
  - Finch: PR url
  - PAVICS-landing: Ouranosinc/PAVICS-landing#98
  - RavenPy: CSHS-CWRA/RavenPy#395
  - (...)

- Deployment to PAVICS:
bird-house/birdhouse-deploy#475

- Jenkins-config changes for new notebooks: PR url

- Other issues found while working on this one
  - Issue 1 URL
  - Issue 2 URL
  - (...)

- Previous release:
#121


## Additional Information

Full diff conda env export:

release-py311-240506-update240508...f8c1c58

Full new conda env export:

https://github.com/Ouranosinc/PAVICS-e2e-workflow-tests/blob/0b3419bba4aa42ee5a1fa9a8de0c8a6b91bf8547/docker/saved_buildout/conda-env-export.yml

DockerHub build log

https://github.com/Ouranosinc/PAVICS-e2e-workflow-tests/blob/0b3419bba4aa42ee5a1fa9a8de0c8a6b91bf8547/docker/saved_buildout/docker-buildlogs.txt