dl tutorial files to tmp directory, then move them once successful #1393

Merged
merged 8 commits into from
May 21, 2017

Conversation

gidden
Contributor

@gidden gidden commented May 2, 2017

closes #1392

  • tests added / passed
  • passes git diff upstream/master | flake8 --diff
  • whatsnew entry

I'm not sure how to best add tests for this. Also not sure how best to import shutil. I followed the current pattern for os, but am not sure why that pattern exists in this file. I can clean up further if this is deemed useful.

if not _os.path.exists(tmpfile):
    raise ValueError('File could not be downloaded, please try again')

_shutil.move(tmpfile, localfile)
Member

@fmaussion fmaussion May 2, 2017

I thought that the issue wasn't that the file isn't downloaded, but rather that it is incompletely downloaded? Will this new temporary step solve this?

Is there a way to make urlretrieve more robust, with checksums or something?
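
A minimal sketch of the idea raised here, not the code merged in this PR: download to a temporary file, compare its MD5 digest with an expected value, and only then move it into place. The name `fetch_verified`, its parameters, and the injectable `retrieve` argument are assumptions made for illustration.

```python
import hashlib
import os
import shutil
import tempfile
from urllib.request import urlretrieve


def fetch_verified(url, localfile, expected_md5, retrieve=urlretrieve):
    """Download url to a temp file, check its MD5, then move it into place.

    Hypothetical helper for illustration; `retrieve` can be swapped out
    so the function can be exercised without network access.
    """
    fd, tmpfile = tempfile.mkstemp(suffix=".part")
    os.close(fd)
    try:
        retrieve(url, tmpfile)
        # Reading the whole file is fine for small tutorial datasets.
        with open(tmpfile, "rb") as f:
            actual_md5 = hashlib.md5(f.read()).hexdigest()
        if actual_md5 != expected_md5:
            raise IOError("MD5 checksum does not match, try downloading again")
        # Only a fully downloaded, verified file ever reaches localfile.
        shutil.move(tmpfile, localfile)
    finally:
        # Remove the temporary file if it was not moved into place.
        if os.path.exists(tmpfile):
            os.remove(tmpfile)
```

With this shape, an interrupted or corrupted download raises an error and leaves nothing at `localfile`, so a later call starts from a clean state.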

@gidden
Contributor Author

gidden commented May 2, 2017 via email

@gidden
Contributor Author

gidden commented May 2, 2017

Successful locally with pydata/xarray-data#9:

In [3]: xr.tutorial.load_dataset('air_temperature', github_url='https://github.com/gidden/xarray-data', branch='md5')
Out[3]: 
<xarray.Dataset>
Dimensions:  (lat: 25, lon: 53, time: 2920)
Coordinates:
  * lat      (lat) float32 75.0 72.5 70.0 67.5 65.0 62.5 60.0 57.5 55.0 52.5 ...
  * lon      (lon) float32 200.0 202.5 205.0 207.5 210.0 212.5 215.0 217.5 ...
  * time     (time) datetime64[ns] 2013-01-01 2013-01-01T06:00:00 ...
Data variables:
    air      (time, lat, lon) float64 241.2 242.5 243.5 244.0 244.1 243.9 ...
Attributes:
    Conventions:  COARDS
    title:        4x daily NMC reanalysis (1948)
    description:  Data is from NMC initialized reanalysis\n(4x/day).  These a...
    platform:     Model
    references:   http://www.esrl.noaa.gov/psd/data/gridded/data.ncep.reanaly...

@shoyer
Member

shoyer commented May 2, 2017

Thanks for taking this on @gidden.

"Also not sure how best to import shutil. I followed the current pattern for os, but am not sure why that pattern exists in this file."

I think this exists to work around the fact that IPython shows everything that isn't private in auto-complete (ipython/ipykernel#129). But we should probably switch back to normal imports in this module -- this is really more of an IPython issue.
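
A small sketch of the pattern being described, assuming the usual convention that names starting with an underscore count as private. It builds a throwaway module (the name `demo` is hypothetical) and lists the names a completer that hides private names would still offer.

```python
import types

# One import uses an underscore alias, the other is a normal import.
source = "import os as _os\nimport shutil\n"

demo = types.ModuleType("demo")
exec(source, demo.__dict__)

# Names that do not start with an underscore are the "public" ones.
public = [name for name in dir(demo) if not name.startswith("_")]
print(public)  # ['shutil']
```

The aliased `_os` stays out of the public listing, while the plainly imported `shutil` shows up next to the module's real API.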

@gidden
Contributor Author

gidden commented May 2, 2017 via email

Member

@shoyer shoyer left a comment

It's kind of ridiculous that we have to implement this logic ourselves, but this looks good to me.

@@ -18,9 +20,17 @@
_default_cache_dir = _os.sep.join(('~', '.xarray_tutorial_data'))


def _md5(fname):
Member

Call this something more descriptive like check_file_md5_hash?

Contributor Author

should it remain private (for ipython completion) or no?

Member

I wouldn't stress about that. Given that we use it in the script in xarray-data, it's fine either way for me.

@gidden
Contributor Author

gidden commented May 3, 2017

ok, this should be good to go

@shoyer
Member

shoyer commented May 3, 2017

Thanks @gidden, this looks great to me. Could you kindly add a brief note to the "Bug fixes" section of whats-new.rst?

@shoyer
Member

shoyer commented May 3, 2017

I'd be slightly more comfortable if we had this unit tested on Travis-CI, but that's somewhat tricky to set up because we don't want to require network access for the test suite by default. You could do this by adding a flag like --run-network-tests in conftest.py, and then hooking it through in xarray/tests/__init__.py like --run-flaky. It would be awesome if you could take a look at that and set the Travis/Appveyor tests to test the tutorial, but it isn't strictly required to merge this.
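
One way such an opt-in flag could look, sketched with pytest's documented hooks. This is an illustration under assumptions, not the code that ended up in xarray: the `network` marker name and the skip message are hypothetical, and only the flag name `--run-network-tests` comes from the comment above.

```python
# conftest.py (sketch)
import pytest


def pytest_addoption(parser):
    # Register an opt-in flag; network tests stay off unless it is passed.
    parser.addoption(
        "--run-network-tests",
        action="store_true",
        default=False,
        help="run tests that require network access",
    )


def pytest_configure(config):
    # Declare the (hypothetical) marker so pytest knows about it.
    config.addinivalue_line("markers", "network: test requires network access")


def pytest_collection_modifyitems(config, items):
    if config.getoption("--run-network-tests"):
        return  # flag given: leave network tests enabled
    skip_network = pytest.mark.skip(reason="needs --run-network-tests to run")
    for item in items:
        # Skip every collected test carrying the network marker.
        if "network" in item.keywords:
            item.add_marker(skip_network)
```

A test decorated with `@pytest.mark.network` would then be skipped by a plain `pytest` run and executed with `pytest --run-network-tests`, which a CI configuration can pass explicitly.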

@gidden gidden force-pushed the tutorial-tmp-files branch from 80b6737 to 440b6da Compare May 3, 2017 16:49
@gidden
Contributor Author

gidden commented May 3, 2017

Hey @shoyer, if I have time later I will try to address the unit test issue. If I'm too late, feel free to pull this in the meantime.

@gidden gidden force-pushed the tutorial-tmp-files branch from 440b6da to 349099a Compare May 21, 2017 14:55
@gidden
Contributor Author

gidden commented May 21, 2017

hey @shoyer, @fmaussion. test decorator added with tutorial dataset unit test. i think this is good to go.

Member

@shoyer shoyer left a comment

one small fix, otherwise looks good to me -- thanks!


def setUp(self):
    self.testfile = 'tiny'
    self.testfilepath = os.path.expanduser(os.sep.join(
        ('~', '.xarray_tutorial_data', self.testfile)))
    with suppress(OSError):
        os.remove(self.testfilepath)
        os.remove('{}.nc'.format(self.testfilepath))
        os.remove('{}.md5'.format(self.testfilepath))
Member

This should go in its own suppress block

Contributor Author

done!
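
The reason behind this request can be seen in a short sketch: `contextlib.suppress` leaves the `with` block at the first exception, so any statement after a failing `os.remove` never runs. The helper names below are hypothetical and use a loop only to keep the example short.

```python
import os
import tempfile
from contextlib import suppress


def remove_all_shared_block(paths):
    # One shared block: the first OSError exits the block,
    # so the remaining paths are never attempted.
    with suppress(OSError):
        for path in paths:
            os.remove(path)


def remove_all_separate_blocks(paths):
    # One block per removal: a missing file does not stop the others.
    for path in paths:
        with suppress(OSError):
            os.remove(path)


workdir = tempfile.mkdtemp()
missing = os.path.join(workdir, "tiny.nc")   # never created
present = os.path.join(workdir, "tiny.md5")  # created below
open(present, "w").close()

remove_all_shared_block([missing, present])
still_there = os.path.exists(present)        # True: removal was skipped

remove_all_separate_blocks([missing, present])
gone = not os.path.exists(present)           # True: removal now happened
```

For test setup this matters because a leftover `.md5` or `.nc` file from an earlier run would otherwise survive whenever the first removal fails.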

@shoyer shoyer merged commit 3737d26 into pydata:master May 21, 2017
@shoyer
Member

shoyer commented May 21, 2017

thanks!

Successfully merging this pull request may close these issues.

tutorial datasets can be corrupted on dl with bad connection
3 participants