-
Notifications
You must be signed in to change notification settings - Fork 224
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor plot and plot3d to use virtualfile_from_data #990
Conversation
Added an additional `extra_arrays` parameter to the `virtualfile_from_data` function to accept optional numpy arrays from the plot and plot3d functions.
More efficient to pass in whole 2D numpy array matrix as a virtualfile to GMT, and this fixes the segmentation fault crash on test_plot3d_matrix_color when the data was passed in via virtualfile_from_vectors instead.
pygmt/clib/session.py
Outdated
elif kind == "matrix": # turn 2D arrays into list of vectors | ||
try: | ||
# pandas.DataFrame and xarray.Dataset types | ||
_data = [array for _, array in data.items()] | ||
except AttributeError: | ||
# Python lists, tuples, and numpy ndarray types | ||
_data = np.atleast_2d(np.asanyarray(data).T) | ||
try: | ||
# Just use virtualfile_from_matrix for 2D | ||
# numpy.ndarray which are not datetime (M) types | ||
assert data.ndim == 2 and not data.dtype.kind == "M" | ||
_virtualfile_from = self.virtualfile_from_matrix | ||
_data = (data,) | ||
except (AssertionError, AttributeError): | ||
# Python lists, tuples, and numpy ndarray types | ||
_data = np.atleast_2d(np.asanyarray(data).T) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would appreciate advice on improving/refactoring this chunk of code. I'm half thinking whether to move some of the logic to pygmt/helpers/utils.py
, i.e. have a new 'kind' besides file/grid/matrix/vectors:
Lines 65 to 72 in 65b099a
if isinstance(data, str): | |
kind = "file" | |
elif isinstance(data, xr.DataArray): | |
kind = "grid" | |
elif data is not None: | |
kind = "matrix" | |
else: | |
kind = "vectors" |
Or we could just keep things like this as it is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't read the codes carefully, but I think the "matrix" kind is quite confusing.
The GMT API function GMT_Put_Matrix()
, the PyGMT wrapper put_matrix()
and the virtualfile function virtualfile_from_matrix()
all require a simple 2d matrix with a single dtype (e.g., np.float
or np.double
).
However, currently, data types like pandas.DataFrame are also "matrix". So we have to check the data types to choose either virtualfile_from_vector
or virtualfile_from_matrix
.
I agree with you that we can/shoudl add a new kind to distinguish between a GMT-compatible "matrix" and a more complicated data structure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good.
pygmt/clib/session.py
Outdated
@@ -1378,6 +1380,9 @@ def virtualfile_from_data(self, check_kind=None, data=None, x=None, y=None, z=No | |||
raster grid, a vector matrix/arrays, or other supported data input. | |||
x/y/z : 1d arrays or None | |||
x, y and z columns as numpy arrays. | |||
extra_arrays : list |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
extra_arrays : list | |
extra_arrays : list of 1d arrays |
pygmt/clib/session.py
Outdated
try: | ||
# Just use virtualfile_from_matrix for 2D | ||
# numpy.ndarray which are not datetime (M) types | ||
assert data.ndim == 2 and not data.dtype.kind == "M" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
assert data.ndim == 2 and not data.dtype.kind == "M"
The GMT_Put_Matrix()
function only supports a few numeric data types, but numpy.dtype.kind can have values other than M
, for example, data.dtype.kind == "O"
will pass the assert statement but is not supported by virtualfile_from_matrix
. Perhaps check data.dtype.kind in 'iuf'
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I'll try data.dtype.kind in 'iuf'
and see if it works on the test suite.
One more comment: perhaps we should add a new test to make sure #1021 is fixed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK to merge after all tests pass and the styling issue is fixed.
Co-Authored-By: Dongdong Tian <[email protected]>
dbc8ded
to
b0ab97d
Compare
…Tools#990) Added an additional `extra_arrays` parameter to the `virtualfile_from_data` function to accept optional numpy arrays from the plot and plot3d functions. * Just use virtualfile_from_matrix for non-datetime 2D numpy arrays More efficient to pass in whole 2D numpy array matrix as a virtualfile to GMT, and this fixes the segmentation fault crash on test_plot3d_matrix_color when the data was passed in via virtualfile_from_vectors instead. * Add docstring on extra_arrays parameter in virtualfile_from_data * Use virtualfile_from_matrix on int/uint/float types and add a test Co-authored-by: Dongdong Tian <[email protected]>
Description of proposed changes
Added an additional
extra_arrays
parameter to thevirtualfile_from_data
function to accept optional numpy arrays from the plot and plot3d functions.TODO: May need to tweak theEdit: see comment at #990 (comment)data_kind
function a little, to outputmatrix
for true numpy array matrices andvectors
for numpy arrays with mixed dtypes. This is to fix a segmentation fault ontest_plot3d_matrix_color
.Part of #949. Once this is done, it will simplify the
geopandas
integration (#608) (which will be done in a follow-up Pull Request at #1000).Also fixes #1021
Reminders
make format
andmake check
to make sure the code follows the style guide.doc/api/index.rst
.Slash Commands
You can write slash commands (
/command
) in the first line of a comment to performspecific operations. Supported slash commands are:
/format
: automatically format and lint the code/test-gmt-dev
: run full tests on the latest GMT development version