
Memory issue when plotting a 2D contour of surface relative vorticity in a large ROMS dataset #17

Open
jl2158 opened this issue Nov 6, 2020 · 3 comments

jl2158 commented Nov 6, 2020

Hi everyone,

I have a memory issue when plotting a Hovmöller diagram of the vertical component of surface/bottom relative vorticity (dv/dx - du/dy) from a large ROMS dataset, whose velocity v has dimensions {"ocean_time": 96, "s_rho": 30, "eta_v": 601, "xi_u": 677} (@hetland).

Here is what I did.

```python
import matplotlib.pyplot as plt
import xroms
import cmocean
from glob import glob
%matplotlib inline

files = glob('./roms_*.nc')
ds = xroms.open_mfnetcdf(files)
ds, grid = xroms.roms_dataset(ds)

zeta = xroms.relative_vorticity(ds.u, ds.v, grid)
plt.pcolormesh(ds.lon_rho, ds.ocean_time, zeta.isel(s_w=-1, eta_v=0), cmap=cmocean.cm.balance)
```
In the last line, I tried to get a pcolor plot of zeta, which should have a size of 96 by 677 (ocean_time by xi_u).
I ran this in a Jupyter notebook on a local desktop with 16 GB of memory. However, it got stuck at the very last line, and after some time the notebook ran out of memory. Does anyone have an idea how to solve this? Thanks.

Jinliang

hetland (Collaborator) commented Nov 6, 2020

I looked into this problem. You can see more information about what is going on with dask if you explicitly create a dask-distributed cluster before loading your data.

```python
from dask.distributed import Client
client = Client()
client
```

You can then look at the linked dashboard. For even more information, get the dask-labextension widget and start your cluster from within JupyterLab. When I did this, I could get results for 24 timesteps (with the other dimensions being what you say above) in about 30 seconds. If I try to do 5*24 timesteps, I end up using about 150 GB of memory, and it is incredibly slow.
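
As a side note, you can also cap dask's memory use when you create the client; the numbers below are just example values for a 16 GB desktop, not a recommendation:

```python
from dask.distributed import Client

# Example values only for a 16 GB desktop -- adjust to your machine.
# A per-worker memory limit makes dask spill to disk rather than
# letting the notebook kernel run out of memory.
client = Client(n_workers=4, threads_per_worker=1, memory_limit="3GB")
client.dashboard_link  # URL of the dashboard, to watch tasks and memory
```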

I'm not sure why it is so slow, since the calculation you are doing only involves operations in space, not across time, so parallelizing in time should be trivial. It's possible rechunking may help, e.g.,

```python
zeta = zeta.chunk({'ocean_time': 4, 's_w': 31, 'eta_v': 601, 'xi_u': 676})
```

to use more time steps at once per chunk, but I find this does not speed up the calculation for 24 timesteps, and it uses slightly more memory. @kthyng has also had some trouble with this, and perhaps she can chime in and describe her solutions.
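
One other thing worth trying, as a sketch only (reusing the names from your snippet): select the surface/edge slice while zeta is still lazy and force the computation before plotting, so that in principle dask only has to materialize the 96 x 677 result rather than the full 3D vorticity field.

```python
# Sketch only, reusing the names from the snippet above.
# Slice while zeta is still lazy, then force the computation so that
# matplotlib only ever sees the small (96 x 677) surface/edge array.
zeta_surf = zeta.isel(s_w=-1, eta_v=0).compute()
plt.pcolormesh(zeta_surf, cmap=cmocean.cm.balance)
```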

kthyng (Contributor) commented Nov 6, 2020

I would try a few things: making sure `zeta.isel(s_w=-1, eta_v=0)` has the expected shape and dimensions, seeing if plotting it directly with `zeta.isel(s_w=-1, eta_v=0).plot()` works, and seeing if I can just get the values back with `zeta.isel(s_w=-1, eta_v=0).values`. I would also play around a bit with chunks other than the default in `xroms.open_mfnetcdf`, but I don't have enough experience with that to give general advice.
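
Roughly, and only as a sketch (I'm assuming `open_mfnetcdf` accepts a `chunks` argument, since it applies a default chunking):

```python
# Rough sketch of the checks above -- not a recipe.
sliced = zeta.isel(s_w=-1, eta_v=0)
print(sliced.dims, sliced.shape)  # expect ('ocean_time', 'xi_u') and (96, 677)

sliced.plot()          # let xarray drive the plot
vals = sliced.values   # or just force the computation without plotting

# Opening with non-default chunks; the exact sizes here are guesses,
# and the chunks keyword itself is an assumption on my part.
ds = xroms.open_mfnetcdf(files, chunks={'ocean_time': 4})
```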

kthyng (Contributor) commented May 24, 2023

Hi @jl2158! I've recently put out a new version on PyPI (v0.2.4, still coming through on conda-forge). Please check whether this issue is still present in the new version so that over time we can work to address these issues. Thanks!
