
Fix llcreader klevels bug #225

Merged: 6 commits merged into MITgcm:master from the fix-llcreader-klevels-bug branch on Nov 13, 2020
Conversation

rabernat (Member)

Possible fix for #224.

Will Travis even run on this repo anymore? It seems broken in all our other repos.

@rabernat changed the title from "empty commit to trigger CI" to "Fix llcreader klevels bug" on Sep 28, 2020
@timothyas (Member)

I think Travis is not working here because fsspec seems to require aiohttp, which needs to be in the conda environment. I had to add it to the environment files in #223, e.g. here

@rabernat force-pushed the fix-llcreader-klevels-bug branch from 96d1738 to 12f7419 on September 30, 2020
@rabernat (Member, Author)

This did not immediately reproduce the error reported in #224 (although it did catch a failure for Python 2.7). We will have to dig deeper.

        iter_stop=iter_stop,
        read_grid=False
    )
    ds_vel.sum().compute()
@rabernat (Member, Author)

@fraserwg and @urielz: I have added the above test to check for the errors you reported in #224 and #233. The test is passing even with the latest master version of xarray.

I was wondering if you can help me figure out what is going on here. Could you try running the test suite in your local environment? You will have to clone my branch. For example:

git clone -b fix-llcreader-klevels-bug https://github.com/rabernat/xmitgcm.git
cd xmitgcm
pip install -e .
py.test -v

@fraserwg (Contributor) commented Nov 9, 2020

I've tried running your test suite in my environment and everything passes. I've found that the error seems to be caused by a squeeze operation, which I omitted from my initial bug report.

The following (modified) test function, however, fails:

@pytest.mark.slow
def test_llc4320_klevels_bug(ecco_portal_model):
    # just get three timesteps
    iter_stop = ecco_portal_model.iter_start + ecco_portal_model.iter_step + 1
    ds_vel = ecco_portal_model.get_dataset(
        varnames=['U', 'V'],
        type='latlon',
        k_levels=[1],
        iter_stop=iter_stop,
        read_grid=True
    )
    ds_vel.squeeze().sum().compute()

It does not fail if read_grid=False, suggesting the behaviour may still be inconsistent?

@rabernat (Member, Author) commented Nov 9, 2020

> It does not fail if read_grid=False, suggesting the behaviour may still be inconsistent?

Ok this is a useful clue. Thanks for checking.

@rabernat requested a review from timothyas on November 12, 2020
@@ -533,12 +533,12 @@ def _get_kp1_levels(self,k_levels):
 # determine kp1 levels
 # get borders to all k (center) levels
 # ki used to get Zu, Zl later
-ku = np.concatenate([k_levels[1:],[k_levels[-1]+1]])
+ku = k_levels[1:] + [k_levels[-1] + 1 ]
@rabernat (Member, Author)

Note: the original code was producing a float array (the cause of #224). Avoiding numpy here ensures everything stays int.
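
For illustration, a minimal sketch of one way the float array can arise, assuming k_levels is a plain Python list of ints as in the k_levels=[1] test, so that the slice k_levels[1:] is empty:

import numpy as np

k_levels = [1]

# old approach: np.concatenate promotes the empty slice to float64,
# so the whole result comes out float
ku_old = np.concatenate([k_levels[1:], [k_levels[-1] + 1]])
print(ku_old.dtype)  # float64

# new approach: plain list concatenation keeps the levels as ints
ku_new = k_levels[1:] + [k_levels[-1] + 1]
print(ku_new)  # [2]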

Member

Nice, that is pretty straightforward

@rabernat (Member, Author)

By adding a more comprehensive test, I discovered a couple of new bugs. So I ended up refactoring some of the internal logic.

@rabernat force-pushed the fix-llcreader-klevels-bug branch from c884677 to 213a25f on November 12, 2020
Comment on lines +593 to +602
def _key_and_task(n_k, these_klevels, n_iter=None, iternum=None):
    if n_iter is None:
        key = name, n_k, 0, 0, 0
    else:
        key = name, n_iter, n_k, 0, 0, 0
    task = (_get_facet_chunk, self.store, varname, iternum,
            nfacet, these_klevels, self.nx, self.nz, self.dtype,
            self.mask_override)
    return key, task

Member

so clean... 👍
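
As background for readers, here is a toy sketch of how (key, task) pairs like these are assembled into a dask array; the name, shapes, and loader below are made up for illustration and are not xmitgcm's actual graph construction:

import numpy as np
import dask.array as dsa

name = 'toy-facet'  # hypothetical dask key prefix

def load_level(k):
    # stand-in for _get_facet_chunk: return a single vertical level
    return np.full((1, 1, 4, 4), k, dtype='f8')

# one (key, task) pair per k level, analogous to _key_and_task with no time axis
dsk = {(name, k, 0, 0, 0): (load_level, k) for k in range(3)}
arr = dsa.Array(dsk, name, chunks=((1, 1, 1), (1,), (4,), (4,)), dtype='f8')
print(arr.compute().shape)  # (3, 1, 4, 4)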

@timothyas (Member) left a comment

This looks good to me @rabernat, thanks!

@@ -436,7 +436,8 @@ def _get_facet_chunk(store, varname, iternum, nfacet, klevels, nx, nz, dtype,
 data.shape = facet_shape
 level_data.append(data)

-return np.concatenate(level_data, axis=1)
+out = np.concatenate(level_data, axis=-4)
Member

Was this the line causing the "extra bugs"?

@rabernat (Member, Author)

Yes. We were getting errors about grid variables having the wrong shape on computation. That's because the concatenation axis is supposed to be the vertical axis; this line assumed the vertical axis was axis 1, which is not the case when the array has no time dimension.
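
A shape-only sketch of that axis bookkeeping (the shapes below are illustrative, not the exact llcreader facet shapes): with three trailing spatial dimensions, the vertical axis is always fourth from the end, with or without a leading time dimension.

import numpy as np

# three single-level chunks, with and without a leading time dimension
with_time = [np.zeros((1, 1, 13, 90, 90)) for _ in range(3)]  # (time, k, face, y, x)
no_time = [np.zeros((1, 13, 90, 90)) for _ in range(3)]       # (k, face, y, x)

print(np.concatenate(with_time, axis=1).shape)   # (1, 3, 13, 90, 90) -- correct
print(np.concatenate(no_time, axis=1).shape)     # (1, 39, 90, 90)    -- wrong axis
print(np.concatenate(with_time, axis=-4).shape)  # (1, 3, 13, 90, 90) -- correct
print(np.concatenate(no_time, axis=-4).shape)    # (3, 13, 90, 90)    -- correct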

Member

Makes sense to me, nice catch!

@rabernat merged commit 8341ec9 into MITgcm:master on Nov 13, 2020
fraserwg pushed a commit to fraserwg/xmitgcm that referenced this pull request Nov 23, 2021
* empty commit to trigger CI

* add test for k_levels=[1]

* add explicit test for MITgcm#233

* test for bug in llc90

* fixes MITgcm#224

* resolve final bugs