This repository has been archived by the owner on Aug 29, 2023. It is now read-only.

Unable to open soil moisture dataset #326

Closed
kjpearson opened this issue Aug 15, 2017 · 18 comments

@kjpearson

Expected behavior

Open soil moisture dataset ESA Soil Moisture Climate Change Initiative (Soil_Moisture_cci): 'Combined' Product, version 03.2

Actual behavior

I have downloaded it to be a local source but when I try to open it, Cate fails with short error message "set_workspace_resource() call raised exception: "unable to decode time units 'days since 1970-01-01 00:00:00 UTC' with the default calendar. Try opening your dataset with decode_times=False.""

How can I set this option in the GUI? If I do so will I still be able to look at the time behaviour alongside the SST dataset I have? UC6 is about looking for time delays between the two for example.

Detailed error message

Cate Desktop, version 0.9.0-dev.4

set_workspace_resource() call raised exception: "unable to decode time units 'days since 1970-01-01 00:00:00 UTC' with the default calendar. Try opening your dataset with decode_times=False."

An error (code 20) occurred while executing a backend process:

Traceback (most recent call last):
File "C:\Users\gs900529\AppData\Local\Continuum\cate\lib\site-packages\xarray\conventions.py", line 155, in decode_cf_datetime
pd.to_timedelta(flat_num_dates.min(), delta) + ref_date
File "C:\Users\gs900529\AppData\Local\Continuum\cate\lib\site-packages\pandas\core\tools\timedeltas.py", line 89, in to_timedelta
box=box, errors=errors)
File "C:\Users\gs900529\AppData\Local\Continuum\cate\lib\site-packages\pandas\core\tools\timedeltas.py", line 134, in _coerce_scalar_to_timedelta_type
result = tslib.convert_to_timedelta64(r, unit)
File "pandas/_libs/tslib.pyx", line 3526, in pandas._libs.tslib.convert_to_timedelta64 (pandas\_libs\tslib.c:62190)
File "pandas/_libs/tslib.pyx", line 3570, in pandas._libs.tslib.convert_to_timedelta64 (pandas\_libs\tslib.c:61660)
File "pandas/_libs/tslib.pyx", line 4028, in pandas._libs.tslib.cast_from_unit (pandas\_libs\tslib.c:68471)
OverflowError: int too big to convert

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "C:\Users\gs900529\AppData\Local\Continuum\cate\lib\site-packages\xarray\conventions.py", line 393, in __init__
result = decode_cf_datetime(example_value, units, calendar)
File "C:\Users\gs900529\AppData\Local\Continuum\cate\lib\site-packages\xarray\conventions.py", line 161, in decode_cf_datetime
dates = _decode_datetime_with_netcdf4(flat_num_dates, units, calendar)
File "C:\Users\gs900529\AppData\Local\Continuum\cate\lib\site-packages\xarray\conventions.py", line 103, in _decode_datetime_with_netcdf4
dates = np.asarray(nc4.num2date(num_dates, units, calendar))
File "netCDF4\_netCDF4.pyx", line 5555, in netCDF4._netCDF4.num2date (netCDF4\_netCDF4.c:67494)
File "netcdftime\_netcdftime.pyx", line 880, in netcdftime._netcdftime.utime.num2date (netcdftime\_netcdftime.c:15941)
File "netcdftime\_netcdftime.pyx", line 267, in netcdftime._netcdftime.DateFromJulianDay (netcdftime\_netcdftime.c:4920)
ValueError: Julian Day must be positive

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "C:\Users\gs900529\AppData\Local\Continuum\cate\lib\site-packages\cate\util\web\jsonrpchandler.py", line 190, in send_service_method_result
result = future.result()
File "C:\Users\gs900529\AppData\Local\Continuum\cate\lib\concurrent\futures\_base.py", line 398, in result
return self.__get_result()
File "C:\Users\gs900529\AppData\Local\Continuum\cate\lib\concurrent\futures\_base.py", line 357, in __get_result
raise self._exception
File "C:\Users\gs900529\AppData\Local\Continuum\cate\lib\concurrent\futures\thread.py", line 55, in run
result = self.fn(*self.args, **self.kwargs)
File "C:\Users\gs900529\AppData\Local\Continuum\cate\lib\site-packages\cate\util\web\jsonrpchandler.py", line 267, in call_service_method
result = method(*method_params, monitor=monitor)
File "C:\Users\gs900529\AppData\Local\Continuum\cate\lib\site-packages\cate\webapi\websocket.py", line 283, in set_workspace_resource
monitor=monitor)
File "C:\Users\gs900529\AppData\Local\Continuum\cate\lib\site-packages\cate\core\wsmanag.py", line 313, in set_workspace_resource
workspace.execute_workflow(res_name=res_name, monitor=monitor)
File "C:\Users\gs900529\AppData\Local\Continuum\cate\lib\site-packages\cate\core\workspace.py", line 630, in execute_workflow
self.workflow.invoke_steps(steps, context=self._new_context(), monitor=monitor)
File "C:\Users\gs900529\AppData\Local\Continuum\cate\lib\site-packages\cate\core\workflow.py", line 624, in invoke_steps
steps[0].invoke(context=context, monitor=monitor)
File "C:\Users\gs900529\AppData\Local\Continuum\cate\lib\site-packages\cate\core\workflow.py", line 315, in invoke
self._invoke_impl(_new_context(context, step=self), monitor=monitor)
File "C:\Users\gs900529\AppData\Local\Continuum\cate\lib\site-packages\cate\core\workflow.py", line 977, in _invoke_impl
return_value = self._op(monitor=monitor, **input_values)
File "C:\Users\gs900529\AppData\Local\Continuum\cate\lib\site-packages\cate\core\op.py", line 211, in __call__
return_value = self._wrapped_op(**input_values)
File "C:\Users\gs900529\AppData\Local\Continuum\cate\lib\site-packages\cate\ops\io.py", line 63, in open_dataset
var_names=var_names, region=region)
File "C:\Users\gs900529\AppData\Local\Continuum\cate\lib\site-packages\cate\core\ds.py", line 546, in open_dataset
return data_source.open_dataset(time_range, region, var_names)
File "C:\Users\gs900529\AppData\Local\Continuum\cate\lib\site-packages\cate\ds\local.py", line 162, in open_dataset
ds = open_xarray_dataset(paths)
File "C:\Users\gs900529\AppData\Local\Continuum\cate\lib\site-packages\cate\core\ds.py", line 593, in open_xarray_dataset
temp_ds = xr.open_dataset(paths[0])
File "C:\Users\gs900529\AppData\Local\Continuum\cate\lib\site-packages\xarray\backends\api.py", line 301, in open_dataset
return maybe_decode_store(store, lock)
File "C:\Users\gs900529\AppData\Local\Continuum\cate\lib\site-packages\xarray\backends\api.py", line 225, in maybe_decode_store
drop_variables=drop_variables)
File "C:\Users\gs900529\AppData\Local\Continuum\cate\lib\site-packages\xarray\conventions.py", line 955, in decode_cf
decode_coords, drop_variables=drop_variables)
File "C:\Users\gs900529\AppData\Local\Continuum\cate\lib\site-packages\xarray\conventions.py", line 888, in decode_cf_variables
decode_times=decode_times)
File "C:\Users\gs900529\AppData\Local\Continuum\cate\lib\site-packages\xarray\conventions.py", line 825, in decode_cf_variable
data = DecodedCFDatetimeArray(data, units, calendar)
File "C:\Users\gs900529\AppData\Local\Continuum\cate\lib\site-packages\xarray\conventions.py", line 402, in __init__
raise ValueError(msg)
ValueError: unable to decode time units 'days since 1970-01-01 00:00:00 UTC' with the default calendar. Try opening your dataset with decode_times=False.

@JanisGailis
Member

@kjpearson Currently it seems that the time unit description has changed from how it was before, as it used to work with version 02-2.

From CF Conventions:

The specification:

    seconds since 1992-10-8 15:15:42.5 -6:00

indicates seconds since October 8th, 1992  at  3  hours,  15
minutes  and  42.5 seconds in the afternoon in the time zone
which is six hours to the west of Coordinated Universal Time
(i.e.  Mountain Daylight Time).  The time zone specification
can also be written without a colon using one or  two-digits
(indicating hours) or three or four digits (indicating hours
and minutes).

The problem is likely that the time zone designation in 'days since 1970-01-01 00:00:00 UTC' breaks machine readability for it.

@JanisGailis
Member

Nope, this is probably not it. Just checked some 02-2 files on my computer and these seem to have the exact same time unit.

@mzuehlke
Collaborator

I have taken a look at this. The problem is not with time variable, but with the t0 variable.

        double t0(time, lat, lon) ;
            t0:_FillValue = -9999. ;
            t0:long_name = "Observation Timestamp" ;
            t0:units = "days since 1970-01-01 00:00:00 UTC" ;
            t0:valid_range = 7885.5, 7886.5 ;
            t0:_CoordinateAxes = "time lat lon" ;

This is still valid, but in one example file all values of the variable are -2450586.5. That is roughly 6,700 years before the 1970 reference date (around 4740 BC), which does not look valid at all.
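The arithmetic behind this can be checked with the standard library alone. The sketch below is not taken from Cate or xarray; the helper name `days_since_epoch_to_datetime` is made up for illustration, and the Julian Day constant for 1970-01-01 00:00 UTC (2440587.5) is an added fact that does not appear in the thread. It shows that the declared `valid_range` decodes to plausible dates, while the stored value cannot be represented as a date at all.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)   # reference date from t0:units
JD_UNIX_EPOCH = 2440587.5      # Julian Day of 1970-01-01 00:00 UTC (not from the thread)


def days_since_epoch_to_datetime(days):
    """Convert 'days since 1970-01-01' to a datetime, or None if out of range."""
    try:
        return EPOCH + timedelta(days=days)
    except OverflowError:
        # Python's datetime cannot represent dates before year 1.
        return None


# Both ends of the declared valid_range decode to dates in August 1991.
print(days_since_epoch_to_datetime(7885.5))
print(days_since_epoch_to_datetime(7886.5))

# The value actually stored in the example file is far outside that range.
bad_value = -2450586.5
print(days_since_epoch_to_datetime(bad_value))  # None

# Rough year (astronomical numbering), using a mean Gregorian year length.
approx_year = 1970 + bad_value / 365.2425
print(round(approx_year))

# Expressed as a Julian Day, the bad value is negative, which is consistent
# with the "Julian Day must be positive" error in the traceback.
print(JD_UNIX_EPOCH + bad_value)  # -9999.0
```

The last line prints -9999.0, which happens to equal the variable's declared `_FillValue`. Whether that is a coincidence or a hint about how the files were produced is not established in this thread.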

@JanisGailis
Member

JanisGailis commented Aug 30, 2017

Right, I was also investigating this, and I can attest that 'most' datasets are fine and can be opened and decoded correctly, while around 20% don't open. This is probably it.

EDIT: I also ran the debugger on it, and can confirm that the variable with which there was a problem has a 'long_name' 'Observation Timestamp'.

@mzuehlke
Collaborator

The files from 02-2 (still on FTP) contain NaN instead of this big negative number.
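To illustrate the difference, here is a minimal sketch of masking values that fall outside the declared `valid_range`, so that they end up as NaN (as in the 02-2 files) instead of a large negative number. This is only an illustration and says nothing about how either product version was actually generated; the function name and the sample values are hypothetical, and only the range limits come from the `t0` attributes quoted above.

```python
import math


def mask_outside_valid_range(values, valid_min, valid_max):
    """Return a copy of values with out-of-range entries replaced by NaN."""
    return [v if valid_min <= v <= valid_max else math.nan for v in values]


# Hypothetical sample: two plausible timestamps and the bad value from the file.
sample = [7885.75, -2450586.5, 7886.0]
masked = mask_outside_valid_range(sample, 7885.5, 7886.5)
print(masked)  # [7885.75, nan, 7886.0]
```

A NaN time value is something a decoder can skip or treat as missing, whereas -2450586.5 has to be converted and fails.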

@JanisGailis
Member

Until a new version of the Soil Moisture dataset with a fixed 't0' variable is released, the data can be accessed, opened, and used in Cate if the problematic variable is explicitly excluded:

cate ds copy esacci.SOILMOISTURE.day.L3S.SSMV.multi-sensor.multi-platform.COMBINED.03-2.r1 --name SOIL_2007 --time '2007-01-01,2007-12-31' --region '72,8,85,17' --vars 'sm,sm_uncertainty'
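For anyone reading the downloaded files directly in Python instead of through Cate, the same idea can be sketched with xarray. `drop_variables` and `decode_times` are existing parameters of `xarray.open_dataset`; the helper names and the file path in the usage comment are hypothetical, and the sketch assumes, as reported above, that `t0` is the only affected variable.

```python
def build_open_kwargs(skip_vars=(), decode_times=True):
    """Build keyword arguments for xarray.open_dataset.

    skip_vars: names of variables to leave out when opening (e.g. ["t0"]).
    decode_times: set to False to keep all time values as raw numbers.
    """
    kwargs = {"decode_times": decode_times}
    if skip_vars:
        kwargs["drop_variables"] = list(skip_vars)
    return kwargs


def open_soil_moisture(path, **options):
    """Open one NetCDF file with the options above (hypothetical helper)."""
    import xarray as xr  # imported here so the kwargs helper works without xarray

    return xr.open_dataset(path, **build_open_kwargs(**options))


# Usage (the path is a placeholder, not a real file name):
# ds = open_soil_moisture("soil_moisture_file.nc", skip_vars=["t0"])
# ds = open_soil_moisture("soil_moisture_file.nc", decode_times=False)
print(build_open_kwargs(skip_vars=["t0"]))
```

Dropping `t0` leaves the `time` coordinate decoded as dates, so it can still be compared with another dataset along time. With `decode_times=False` the file opens, but `time` stays as raw numbers and would have to be decoded separately before any such comparison.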

@kjpearson
Author

Can this workaround be used to make the dataset available within the desktop GUI?

@JanisGailis
Member

It should be. In the download dialog there must be a field for variable selection.
In general, everything one can do on the CLI should also be doable in the GUI. If that's not the case, it's either a missing feature or a bug that needs fixing!

@mzuehlke
Collaborator

mzuehlke commented Sep 6, 2017

Fixed in commit: 9b64f2f

@forman
Member

forman commented Sep 15, 2017

Removed cu_mandatory because it is implied by cu_review

@forman forman changed the title Unable to open soil mositure dataset Unable to open soil moisture dataset Sep 20, 2017
@HerzogStephan
Contributor

I propose closing this issue: when the right variables are selected, the dataset can be opened in the GUI.
But (!) I would not put this dataset on the "whitelist".

Remark:
By the way, this is not the only dataset that produces the kind of traceback quoted by @kjpearson in the opening comment. Shall we compile a separate list of the datasets that have this traceback in common?

@JanisGailis
Member

Marco implemented a fix for this, where the troublesome variable is excluded automatically by default. Hence, the user should not see the stacktrace 'ever'. I'm not sure if we're displaying a warning regarding the excluded variable, and if we can try to coerce Cate into opening said variable by choice.

Either way, with this fix, I think it should be in the whitelist. Maybe with a caveat.

@HerzogStephan
Contributor

Ok. This means I will delete what I have and load it again to check it thoroughly. If it works well I think we can close this issue and proceed according to your suggestion.

@JanisGailis
Member

I'm not 100% sure the fix is in the version you have, i.e. whether it has been released yet.

@forman forman added this to the 1.0 milestone Sep 21, 2017
@HerzogStephan
Contributor

OK, I'll wait for dev7 then.

@mzuehlke
Collaborator

Hi,
the fix will be in the 0.9.0.dev7 release and requires a new make_local step for these broken datasets, because the broken variable is removed during the make_local step.

@HerzogStephan
Contributor

Can be closed from my point of view and can be put on the list.

@forman forman added this to the IRM7 milestone Oct 17, 2017
@kbernat kbernat modified the milestones: IRM7, IRM8 Mar 8, 2018
@kbernat kbernat self-assigned this Mar 8, 2018
kbernat pushed a commit that referenced this issue Mar 9, 2018
kbernat pushed a commit that referenced this issue Mar 9, 2018
kbernat pushed a commit that referenced this issue Mar 14, 2018
@papesci
Contributor

papesci commented Mar 29, 2018

On Chris's behalf, I am closing this issue.

@papesci papesci closed this as completed Mar 29, 2018