Download multiple fields with FastHerbie? #242

Closed
williamhobbs opened this issue Nov 2, 2023 · 3 comments

Comments

@williamhobbs
Contributor

Is there a simple way to download multiple variables/fields at once with FastHerbie()?

For example, this:

import pandas as pd
from herbie import FastHerbie

DATES = pd.date_range(start='2023-06-21 06:00',
                      end='2023-06-22 06:00',
                      freq='24H')
fxx = range(30,32)
variables = 'DSWRF:surface|TMP:2 m'  # <--- two variables
FH = FastHerbie(DATES, model="HRRR", product="sfc", fxx=fxx)
ds = FH.xarray(variables)

returns:

Note: Returning a list of [2] xarray.Datasets because cfgrib opened with multiple hypercubes.
Note: Returning a list of [2] xarray.Datasets because cfgrib opened with multiple hypercubes.
Note: Returning a list of [2] xarray.Datasets because cfgrib opened with multiple hypercubes.
Note: Returning a list of [2] xarray.Datasets because cfgrib opened with multiple hypercubes.

and

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
c:\Users\willh\Documents\Python Scripts\herbie_examples\herbie_pvlib_hrrr.ipynb Cell 22 line 1
----> 1 ds = FH.xarray(variables)

File c:\Users\willh\anaconda3\envs\herbie_conda_venv\lib\site-packages\herbie\fast.py:294, in FastHerbie.xarray(self, searchString, max_threads, **xarray_kwargs)
    291     ds_list = [H.xarray(**xarray_kwargs) for H in self.file_exists]
    293 # Sort the DataSets, first by lead time (step), then by run time (time)
--> 294 ds_list.sort(key=lambda x: x.step.data.max())
    295 ds_list.sort(key=lambda x: x.time.data.max())
    297 # Reshape list with dimensions (len(DATES), len(fxx))

File c:\Users\willh\anaconda3\envs\herbie_conda_venv\lib\site-packages\herbie\fast.py:294, in FastHerbie.xarray.<locals>.<lambda>(x)
    291     ds_list = [H.xarray(**xarray_kwargs) for H in self.file_exists]
    293 # Sort the DataSets, first by lead time (step), then by run time (time)
--> 294 ds_list.sort(key=lambda x: x.step.data.max())
    295 ds_list.sort(key=lambda x: x.time.data.max())
    297 # Reshape list with dimensions (len(DATES), len(fxx))

AttributeError: 'list' object has no attribute 'step'

I could run a loop, downloading a dataset one variable at a time, but I wanted to make sure there wasn't a better way to do it.

Thanks!

@williamhobbs
Contributor Author

It looks like I can pull multiple variables, as long as they are at the same level/height. E.g.,

DATES = pd.date_range(start='2023-06-21 06:00',
                      end='2023-06-22 06:00',
                      freq='24H')
fxx = range(30,32)
variables = 'UGRD:10 m|VGRD:10 m'  # <--- two variables, both at 10 m
FH = FastHerbie(DATES, model="HRRR", product="sfc", fxx=fxx)
ds = FH.xarray(variables)

works ok.

I'd still be interested in knowing if there is a way to pull multiple variables at multiple heights. In my case, I don't need the height information, so that could be dropped/overridden.
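One way to act on the same-level observation is to split a combined search string into one string per level, so that each request only mixes fields that share a level. The helper below is a minimal sketch, not part of Herbie: `group_by_level` is a hypothetical name, and it assumes every term has the simple `VARIABLE:level` form used in this thread (no parentheses or nested `|` inside a term).

```python
from collections import defaultdict


def group_by_level(search):
    """Split 'A:lvl1|B:lvl2' into one '|'-joined search string per level."""
    groups = defaultdict(list)
    for term in search.split("|"):
        # Everything after the first ':' is treated as the level.
        _, _, level = term.partition(":")
        groups[level.strip()].append(term.strip())
    return {level: "|".join(terms) for level, terms in groups.items()}


print(group_by_level("DSWRF:surface|TMP:2 m"))
print(group_by_level("UGRD:10 m|VGRD:10 m"))

# Hypothetical use with the FH object from the snippets above (needs network):
# per_level = {lvl: FH.xarray(s) for lvl, s in group_by_level(variables).items()}
```

Each value could then be passed to `FH.xarray(...)` in a loop. Whether a given group really comes back as a single Dataset still depends on how cfgrib opens those fields, so treat this as a starting point.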

@blaylockbk
Owner

Hi @williamhobbs, that's pretty much what I would have done.

The main limitation is in cfgrib, which sets the variable level as a coordinate. Because xarray Datasets with different levels can't be joined together, cfgrib returns a list of Datasets instead. Herbie could drop the level coordinate so that the Datasets can be joined, but I hesitate to trim metadata.
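For anyone who wants to do the join on their own side, here is a rough sketch of dropping the level coordinate while keeping its value as a variable attribute. It is not Herbie or cfgrib code: `merge_without_levels` and the `LEVEL_COORDS` tuple are assumptions made for illustration, and the two Datasets are synthetic stand-ins rather than real HRRR output.

```python
import numpy as np
import xarray as xr

# Assumption: names cfgrib commonly gives to scalar level coordinates.
LEVEL_COORDS = ("surface", "heightAboveGround", "isobaricInhPa", "atmosphere")


def merge_without_levels(datasets):
    """Merge Datasets after moving scalar level coordinates into variable attrs."""
    cleaned = []
    for ds in datasets:
        levels = [n for n in LEVEL_COORDS if n in ds.coords and ds[n].ndim == 0]
        ds = ds.copy()  # work on a copy rather than the caller's Dataset
        for var in ds.data_vars:
            for n in levels:
                # Keep the level value as metadata on the variable itself.
                ds[var].attrs[f"level_{n}"] = float(ds[n])
        cleaned.append(ds.drop_vars(levels))
    return xr.merge(cleaned)


# Synthetic stand-ins for two cfgrib hypercubes (not real HRRR data).
t2m = xr.Dataset({"t2m": (("y", "x"), np.zeros((2, 3)))},
                 coords={"heightAboveGround": 2.0})
u10 = xr.Dataset({"u10": (("y", "x"), np.ones((2, 3)))},
                 coords={"heightAboveGround": 10.0})
merged = merge_without_levels([t2m, u10])
print(merged)
```

Storing the level in `attrs` keeps some of the metadata, but the level is no longer a coordinate, so selecting by level stops working on the merged Dataset.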

I actually think the right solution is for cfgrib to return an xarray Datatree instead of a list of Datasets. This hasn't really gone anywhere yet, but I'd love to see it happen. xarray-contrib/datatree#195.

See also ecmwf/cfgrib#344 and pydata/xarray#7437

@williamhobbs
Contributor Author

Thanks. That's helpful background.

I'll close this issue - let me know if you'd like me to leave it open.
