-
Notifications
You must be signed in to change notification settings - Fork 226
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Wrap surface #245
Wrap surface #245
Conversation
Initial commit for GenericMappingTools#243. Implement GMT surface function under gridding.py. Test cases checking file/matrix/vectors type inputs stored in test_surface.py. Sample dataset for tests uses newly created load_tut_ship function in datasets/tutorial.py. Note that original GMT surface module https://gmt.soest.hawaii.edu/doc/latest/surface.html requires a -Goutputfile.nc parameter. Here, the implementation of surface does not respect this -G parameter properly. Instead, it stores the output file to a GMTTempFile and returns an xarray.Dataset in Python.
Checks that the loaded pandas.DataFrame's shape and minmax summary on each column is correct.
Getting code coverage up to 100% for gridding.py.
Allow `surface` to store the output grid to a netcdf file. Will still return xarray.Dataset in python bindings.
💖 Thanks for opening this pull request! 💖 Please make sure you read our contributing guidelines and abide by our code of conduct. A few things to keep in mind:
|
@weiji14 thanks for submitting this PR! I have a few comments that need to addressed first. I wasn't sure if we should merge this yet because I'm making some big changes in the C lib wrapper and how we build the wrapper modules. But I'll merge this in as a test case for some of the changes I'm planning. Let me know if you need any help.
Yep, you can put it under the "tabular data" section. |
Renaming the function to load the @tut_ship.xyz tutorial dataset.
Using long aliases 'spacing' and 'region' instead of -I and -R respectively. Change from x, y, z to longitude, latitude, bathymetry for load_sample_bathymetry tutorial dataset. Note that the sample dataset's longitude is in 0-360 format instead of -180 to 180.
The xarray.Dataset may sometimes be lazily loaded, so we use `.load()` to force loading of the data. See e.g. https://github.com/pydata/xarray/blame/9f4474d657193f1c7c9aac25bb2edf94755a8593/doc/io.rst#L169-L170. This is important because we want to persist the data even after the tempfile is deleted.
Cosmetic changes mainly. Try to reduce cognitive complexity of the surface docstring. Also fix pylint C0411: third party import "import xarray as xr" should be placed before "import pytest" (wrong-import-order) in test_surface.py
@leouieda thanks for taking the time to review this! I've made some commits to address the issues you've highlighted, and have now added You've definitely spotted the problematic part in this comment #245 (review). I do think there needs to be a bit more discussion there on how we handle the output. I.e. do we want the gridding function to create a file only by default, or have it output an xarray.Dataset, or have some gmt-python specific flag (e.g. return_grid = True/False) that allows the user to decide. Another question is, what's a suitable alias for -G? 'outputfile' seems a bit long, 'grdfile' or 'out_grdfile' seems quite technical/specific. My preference is for 'outfile', but I'm open to suggestions. |
Using 'outfile' as long alias for the 'G' parameter in gmt surface. Created some if-then logic to return no output (None) if this outfile parameter is set, otherwise return an xarray.Dataset if outfile parameter is unset (the default result from previous code).
Ok, it's been almost a month already so I'm just going ahead with the changes suggested by @leouieda and me:
Also, I do have a jupyter notebook called P.S. Let me know if I should merge in the latest commits from master. Also, not sure what I can do about the codeclimate cognitive complexity problem. |
Hi @weiji14, sorry for the absence. I'm glad you went ahead with the changes. The code is fine and the tests seem to work well. The issue with the complexity is because of the many nested levels of the code. I don't see a good way around this for this function so we can leave it as is. The output handling is something I'm still undecided. I find it very unnatural to have all these Python functions that operate with file names for input and output. In any other library in the ecosystem, the computations happen in memory and there are dedicated functions for reading and writing files. This also minimizes the complexity of the input arguments making the code much easier to test (which is a major issue with GMT). Regarding the example, please don't commit the notebook yet. I'm trying to get sphinx-gallery working which will greatly facilitate making examples and tutorials. After that is done, then I'd welcome some help with writing gallery plots or tutorials. I'll merge this in now and we can see think of a better way to handle IO later on. Thanks for your work on this! I hope you'll consider making more contributions in the future 🥇 🎉 |
🎉🎉🎉 Congrats on merging your first pull request and welcome to the team! 🎉🎉🎉 Please open a new pull request to add yourself to the |
In terms of working on filenames, I think a good example would be the pandas library, and in fact, I did look at their I/O implementation when wrapping |
Yep. And they have no function that computes things and saves them directly to a file. You always do But yes, we'll discuss this more in another issue :)
🚀 🎉 |
Couple of breaking changes. Since GenericMappingTools/pygmt#261, the conda-forge binary package is broken (hence all the recent CI build failures...) and we have to build the GMT binary from source, sigh. The new Dockerfile code for handling this isn't nice but it'll do for now. Also, the Python `gmt-python` package has been renamed to `pygmt` in GenericMappingTools/pygmt#265, so the Pipfile is updated accordingly here. Also pinning to a newer git version on the upstream branch instead of my personal fork, since GenericMappingTools/pygmt#245 has been merged in!
Description of proposed changes
Wrapping the surface gridding function. A new load_tut_ship() dataset function was also added as an example for the tests.
Question: Do I put the API docs for surface under Data Processing (along with gmt.info and gmt.grdinfo)? Or have it under another section?
Fixes #243
Reminders
make format
andmake check
to make sure the code follows the style guide.doc/api/index.rst
.