SanDiego.nc file throws error in load_arbitrary_ugrid.py example script #56
Comments
Hi, I just tried it locally, and the file opens fine for me. I'm guessing something went wrong during your download. We could find out with a checksum:
import hashlib

def md5_hash(path: str) -> str:
    with open(path, "rb") as f:
        content = f.read()
    return hashlib.md5(content).hexdigest()

print(md5_hash("SanDiego.nc"))
# prints 1ed883e7318883ef654c123106ed09c0
Did you already try re-downloading the netCDF file? (Lame suggestion I know, but sometimes the solution is mundane!) (By the way, ordinary URLs will not work with the netCDF4 library, only an OPeNDAP URL will work: https://www.opendap.org/)
Works for me, too. I think your file got corrupted, or your netcdf lib is somehow broken. You might try running ncdump on the file to test it out:
$ ncdump -h SanDiego.nc
-CHB
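If ncdump isn't at hand, a rough alternative is to look at the first bytes of the download. The sketch below is not part of gridded or netCDF4; `sniff_format` and `NETCDF_MAGIC` are hypothetical names. It assumes the signature sits at offset 0, which is the usual case but not guaranteed for every HDF5 file.

```python
# Leading bytes of the on-disk formats the netCDF library reads natively.
NETCDF_MAGIC = {
    b"CDF\x01": "netCDF classic",
    b"CDF\x02": "netCDF 64-bit offset",
    b"CDF\x05": "netCDF 64-bit data (CDF-5)",
    b"\x89HDF\r\n\x1a\n": "netCDF-4 / HDF5",
}


def sniff_format(path: str) -> str:
    """Hypothetical helper: guess what a downloaded file really is."""
    with open(path, "rb") as f:
        head = f.read(512)
    for magic, label in NETCDF_MAGIC.items():
        if head.startswith(magic):
            return label
    # A web page saved under a .nc name is a common download mistake.
    if head.lstrip().lower().startswith((b"<!doctype html", b"<html")):
        return "HTML page"
    return "unknown"
```

Calling `sniff_format("SanDiego.nc")` on a good copy should report one of the netCDF labels; "HTML page" would suggest the download saved a web page instead of the data file.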
Huite and Chris, yes, I did try redownloading the file, but I am still getting the same error. The checksum changes every time I download the file. Here is how I am downloading it within my Jupyter notebook on a university cluster (although I get the same result on my local machine):
which gives the following output:
Here is the checksum code output:
And here is the same error using the example script:
@ChrisBarker-NOAA : ncdump does not work. Am I downloading from the right weblink? @Huite : Good to know the OPeNDAP trick.
Ah, you were using See: This'll do the trick:
(I didn't know this either before now.) You're probably on a *nix machine (if you're using wget), so you can check immediately:
I got a little curious: you can get the data into Python without saving it to a file, provided you once again give the right URL:
import requests
import netCDF4

my_url = "https://github.com/erdc/AdhModel/blob/master/tests/test_files/SanDiego/SanDiego.nc?raw=true"
response = requests.get(my_url, stream=True)
# Pass the downloaded bytes directly; 'name' is only a placeholder filename.
ds = netCDF4.Dataset('name', mode='r', memory=response.content)
OPeNDAP is probably a lot fancier, but it's pretty cool that this works. Xarray dispatches based on the type you're passing in, in which case you need:
import io
import requests
import xarray as xr

response = requests.get("https://github.com/erdc/AdhModel/blob/master/tests/test_files/SanDiego/SanDiego.nc?raw=true")
# Wrap the bytes in a file-like object so xarray treats them as file contents.
ds = xr.open_dataset(io.BytesIO(response.content))
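Both snippets rely on the `?raw=true` form of the URL. A plain github.com "blob" link points at the HTML page that displays the file, not at the file's bytes. As a small sketch (the helper name `github_raw_url` is made up for illustration), the rewrite can be done with the standard library:

```python
from urllib.parse import urlsplit, urlunsplit


def github_raw_url(blob_url: str) -> str:
    """Hypothetical helper: rewrite a GitHub 'blob' page URL to its ?raw=true form."""
    parts = urlsplit(blob_url)
    # Drop any fragment (e.g. '#mode=bytes') and replace the query string.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "raw=true", ""))


blob = "https://github.com/erdc/AdhModel/blob/master/tests/test_files/SanDiego/SanDiego.nc#mode=bytes"
print(github_raw_url(blob))
```

The resulting string is the same `?raw=true` URL used in the two snippets above, so it can be passed to `requests.get` in either of them.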
@Huite: thanks for beating me to it! And the trick about xarray -- another good reason to build the next version of gridded on it :-)
-CHB
@Huite : Thank you so much, it works!!! I had no idea about the wget issue with GitHub files. Your suggestion for loading through a URL is really helpful and works just great! I am currently trying out gridded for post-processing the output from an atmospheric NWP model. Will let you guys know if I need any help. Any words of wisdom from your end are highly appreciated. I found out about gridded from @ChrisBarker-NOAA's AMS 2017 talk.
@ChrisBarker-NOAA @Huite : What's the best way to reach out to you to discuss gridded's application to a meteorological numerical model output?
Actually, GitHub is a pretty good way to do it. Why not start a new issue (or issues) with a question or proposal?
Hi, I am new to gridded. I was trying to replicate the load_arbitrary_ugrid.py example script but could not load the SanDiego.nc file. I tried downloading the file to the working directory as well as accessing it directly through the provided URL, using both the netCDF4 and xarray libraries.
Approach 1:
with netCDF4.Dataset("SanDiego.nc") as nc:
OSError Traceback (most recent call last)
in
----> 1 with netCDF4.Dataset("SanDiego.nc") as nc:
2
3 # need to convert to zero-indexing
4 nodes = nc.variables['nodes'][:] - 1
5 faces = nc.variables['E3T'][:, :3] - 1
netCDF4/_netCDF4.pyx in netCDF4._netCDF4.Dataset.__init__()
netCDF4/_netCDF4.pyx in netCDF4._netCDF4._ensure_nc_success()
OSError: [Errno -51] NetCDF: Unknown file format: b'SanDiego.nc'
Approach 2:
url = ('https://github.com/erdc/AdhModel/blob/master/tests/test_files/SanDiego/SanDiego.nc#mode=bytes')
dataset = netCDF4.Dataset(url)
FileNotFoundError Traceback (most recent call last)
in
1 url = ('https://github.com/erdc/AdhModel/blob/master/tests/test_files/SanDiego/SanDiego.nc#mode=bytes')
----> 2 dataset = netCDF4.Dataset(url)
netCDF4/_netCDF4.pyx in netCDF4._netCDF4.Dataset.__init__()
netCDF4/_netCDF4.pyx in netCDF4._netCDF4._ensure_nc_success()
FileNotFoundError: [Errno 2] No such file or directory: b'https://github.com/erdc/AdhModel/blob/master/tests/test_files/SanDiego/SanDiego.nc#mode=bytes'
Approach 3:
ds = xr.open_dataset(url)
KeyError Traceback (most recent call last)
~/.conda/envs/cent7/5.3.1-py37/arps/lib/python3.8/site-packages/xarray/backends/file_manager.py in _acquire_with_cache_info(self, needs_lock)
197 try:
--> 198 file = self._cache[self._key]
199 except KeyError:
~/.conda/envs/cent7/5.3.1-py37/arps/lib/python3.8/site-packages/xarray/backends/lru_cache.py in __getitem__(self, key)
52 with self._lock:
---> 53 value = self._cache[key]
54 self._cache.move_to_end(key)
KeyError: [<class 'netCDF4._netCDF4.Dataset'>, ('https://github.com/erdc/AdhModel/blob/master/tests/test_files/SanDiego/SanDiego.nc#mode=bytes',), 'r', (('clobber', True), ('diskless', False), ('format', 'NETCDF4'), ('persist', False))]