Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Wrap the GMT API function GMT_Read_Data to read data into GMT data containers #3324

Merged
merged 24 commits into from
Jul 19, 2024

Conversation

seisman
Copy link
Member

@seisman seisman commented Jul 10, 2024

Description of proposed changes

See #3318 for the motivations.

This PR wraps the GMT API functions GMT_Read_Data for reading data file into GMT data containers. Currently, only GMT_DATASET and GMT_GRID are supported, other data containers will be supported later.

The first two commits 5a56ba8 and 7ccf641 are cherry-picked from PR #3318.

Preview at https://pygmt-dev--3324.org.readthedocs.build/en/3324/api/generated/pygmt.clib.Session.read_data.html

Reminders

  • Run make format and make check to make sure the code follows the style guide.
  • Add tests for new features or tests that would have caught the bug that you're fixing.
  • Add new public functions/methods/classes to doc/api/index.rst.
  • Write detailed docstrings for all functions/methods.
  • If wrapping a new module, open a 'Wrap new GMT module' issue and submit reasonably-sized PRs.
  • If adding new functionality, add an example to docstrings or tutorials.
  • Use underscores (not hyphens) in names of Python files and directories.

Slash Commands

You can write slash commands (/command) in the first line of a comment to perform
specific operations. Supported slash command is:

  • /format: automatically format and lint the code

@seisman seisman added the feature Brand new feature label Jul 10, 2024
@seisman seisman added this to the 0.13.0 milestone Jul 10, 2024
Comment on lines 87 to 96
data_ptr = lib.read_data(infile, kind="grid", mode="GMT_CONTAINER_ONLY")
grid = data_ptr.contents
header = grid.header.contents
assert header.n_rows == 14
assert header.n_columns == 8
assert header.wesn[:] == [-55.0, -47.0, -24.0, -10.0]
assert header.z_min == 190.0
assert header.z_max == 981.0
assert header.n_bands == 1 # Explicitely check n_bands
assert not grid.data # The data is not read yet
Copy link
Member Author

@seisman seisman Jul 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't do xrgrid = data_ptr.contents.to_dataarray() here because data is not read yet, so grid.data is NULL. Maybe we should modify the GMT_GRID.to_dataarray() method to fill the DataArray with np.empty() if grid.data is not read yet?

Edit: I think it's not necessary after thinking twice.

@seisman seisman marked this pull request as ready for review July 11, 2024 03:56
@seisman seisman added the needs review This PR has higher priority and needs review. label Jul 11, 2024
pygmt/clib/session.py Outdated Show resolved Hide resolved
@seisman seisman requested a review from a team July 17, 2024 06:15
pygmt/clib/session.py Outdated Show resolved Hide resolved
pygmt/tests/test_clib_read_data.py Outdated Show resolved Hide resolved
pygmt/tests/test_clib_read_data.py Outdated Show resolved Hide resolved
Comment on lines 1101 to 1103
mode
How the data is to be read from the file. This option varies depending on
the given family. See the GMT API documentation for details.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you link to where in the GMT API documentation what options there are for mode?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pygmt/tests/test_clib_read_data.py Outdated Show resolved Hide resolved
pygmt/tests/test_clib_read_data.py Outdated Show resolved Hide resolved
kind: Literal["dataset", "grid"],
family: str | None = None,
geometry: str | None = None,
mode: str = "GMT_READ_NORMAL",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking on whether to make this an enum or use Literal instead of a str. Looking at https://docs.generic-mapping-tools.org/6.5/devdocs/api.html#import-from-a-file-stream-or-handle, it seems like it's possible to pass in mode as one of the variants of GMT_enum_container or GMT_enum_gridio? It also looks like mode can be None (or unset) if reading from text or table.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe the documentation is incomplete, and mode can accept specific values plus some modifiers, making it impossible to list all of them.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that what I thought too, there might be other options we's unaware of. That said, should we allow for mode=None (or whatever works when we don't want to set a mode)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the GMT C API, mode has an unsigned int type, so the mode argument must be an integer number and can't be None. The C enum GMT_READ_NORMAL has a value of 0 (https://github.com/GenericMappingTools/gmt/blob/3e9fcb1f54e4e93fc1e309e0be6e2288eb79f3d2/src/gmt_resources.h#L244), which is also the default read mode.

We can define mode=None, and add one line code like:

if mode == None:
   mode = "GMT_READ_NORMAL"

but I feel it's not as good as setting GMT_READ_NORMAL as the default.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah ok then, then the GMT_READ_NORMAL default here makes sense. Maybe good to mention it in the docstring.

Copy link
Member

@weiji14 weiji14 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @seisman, just one additional suggestion on documenting the default mode="GMT_READ_NORMAL". Also need to fix the W505 doc-line-too-long lint error before merging.

kind: Literal["dataset", "grid"],
family: str | None = None,
geometry: str | None = None,
mode: str = "GMT_READ_NORMAL",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah ok then, then the GMT_READ_NORMAL default here makes sense. Maybe good to mention it in the docstring.

pygmt/clib/session.py Outdated Show resolved Hide resolved
@seisman seisman removed the needs review This PR has higher priority and needs review. label Jul 19, 2024
@seisman seisman merged commit 917b3aa into main Jul 19, 2024
18 of 19 checks passed
@seisman seisman deleted the gmt-read-data branch July 19, 2024 06:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature Brand new feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants