Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ignore dtype_vert for data that isn't vertically defined #84

Open
spencerahill opened this issue Jul 5, 2016 · 3 comments
Open

Ignore dtype_vert for data that isn't vertically defined #84

spencerahill opened this issue Jul 5, 2016 · 3 comments

Comments

@spencerahill
Copy link
Owner

Sometimes, I'll forget to set dtype_vert=False when I'm doing a calculation on a 2D variable, e.g. t_surf. Rather than throwing an error or being ignored, this causes the calculation to be performed as normal but then with the erroneous dtype_vert incorporated into the output filename. So then it can't be found subsequently.

I think the most reasonable default behavior (which we should ultimately enable toggling via config) would be to raise a warning/log that there's a meaningless dtype_vert option specified, and then override it. Should be simple enough to implement via checking Var.def_vert. and Calc.dtype_out_vert.

@spencerahill
Copy link
Owner Author

Working through this offline with @micahkim23, here's an initial stab at how to proceed. The key is that this introduces nested looping, where we need to loop first over the vertical reductions and then over the temporal and regional ones.

Steps to take within Calc:

  1. Remove vertical reductions from Calc._compute_full_ts
  2. Move them into a new method Calc._apply_vert_reduction
  3. Create a new method Calc._apply_all_reductions, which will look something like:
def _apply_all_reductions(self, data):
    for dov in self.dtype_out_vert:
        vert_reduced = self._apply_vert_reduction(data, dov)
        for dot in self.dtype_out_time:
            self._apply_all_time_reductions(vert_reduced)

(the call signatures of the above method calls may not be correct; double check)

@micahkim23
Copy link
Collaborator

micahkim23 commented Dec 8, 2017

Hey, @spencerahill, I started working on this and things are starting to get a little thorny. I'm going to explain my understanding of the problem. If we change dtype_out_vert to a tuple:

Currently

def _apply_all_time_reductions(self, full_ts, monthly_ts, eddy_ts):
returns an OrderedDict that maps the name of the time reduction to its corresponding Dataset or DataArray. Ultimately that's what gets looped over here

aospy/aospy/calc.py

Lines 597 to 602 in 841a756

for dtype_time, data in reduced.items():
data = _add_metadata_as_attrs(data, self.var.units,
self.var.description,
self.dtype_out_vert)
self.save(data, dtype_time, dtype_out_vert=self.dtype_out_vert,
save_files=True, write_to_tar=write_to_tar)

My understanding is that at this part we want to have an OrderedDict that contains all possible combinations of vertical reductions and time reductions. Does that mean that the keys of the OrderedDict are string concatenations of each possible combination of dtype_out_time and dtype_out_vert? But if one is None and the other is not, then the keys will be really weird. Things that rely on this key will get weird like _add_metadata_as_attrs since you would have to split the key on ' ' and get one of the elements.

I think there is a similar issue when determining self.file_name and self.path_out here

aospy/aospy/calc.py

Lines 267 to 269 in 841a756

self.file_name = {d: self._file_name(d) for d in self.dtype_out_time}
self.path_out = {d: self._path_out(d)
for d in self.dtype_out_time}
When writing files out, we have 4 different cases:

  • dtype_out_vert is None and dtype_out_time is None
  • dtype_out_vert is not empty and dtype_out_time is None
  • dtype_out_vert is None and dtype_out_time is not empty
  • dtype_out_vert is not empty and dtype_out_time is not empty

@spencerahill
Copy link
Owner Author

I won't have time in the next week to dig into this, and @micahkim23 and I agreed offline that he is now officially done for the fall term. But wanted to note briefly:

Does that mean that the keys of the OrderedDict are string concatenations of each possible combination of dtype_out_time and dtype_out_vert?

Dictionaries can take tuples as keys: {(None, 'av'): 1, ('vert_av', 'reg.ts'): 2} is totally valid. Although we may want to take a step back and think about if we should continue trying to coerce this into a dict or come up with a new data structure that's better suited.

Also wondering if perhaps we should tackle this alongside/after implementing the Reductions class, c.f. #208

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants