Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow passing arguments containing spaces into pygmt functions #1487

Merged
merged 23 commits into from
Mar 13, 2022

Conversation

weiji14
Copy link
Member

@weiji14 weiji14 commented Sep 7, 2021

Description of proposed changes

To allow for spaces in arguments to PyGMT functions, this pull request modifies the build_arg_string function in pygmt/helpers/utils.py to replace space characters ( ) with the equivalent octal code (\040). This enables users to do this:

import pygmt

fig = pygmt.Figure()
fig.basemap(region=[0, 2, 4, 8], projection="X2c", frame=r"WSne+tPlot Zero")
fig.show()

Note that no nested double quotes ('"some word"') are needed anymore!

I have also removed some hacky workarounds that allowed for text inputs with spaces into:

Also updated some unit tests for fig.legend and fig.rose that had used arguments like '"Some text"'.

Special handling of functions where spaces can't just be converted to \040:

Fixes #247 🎉 🎉 🎉

Reminders

  • Run make format and make check to make sure the code follows the style guide.
  • Add tests for new features or tests that would have caught the bug that you're fixing.
  • Add new public functions/methods/classes to doc/api/index.rst.
  • Write detailed docstrings for all functions/methods.
  • If adding new functionality, add an example to docstrings or tutorials.

Slash Commands

You can write slash commands (/command) in the first line of a comment to perform
specific operations. Supported slash commands are:

  • /format: automatically format and lint the code
  • /test-gmt-dev: run full tests on the latest GMT development version

Modifying build_arg_string function to replace blank space
characters with octal code 040, and added a doctest to
check various combinations with single and double quotes
included.
Supersedes workaround for subplot's autolabel (-A)
and title (-T) parameters in
a9d167d,
4126c16, and
eadb847.
Doesn't work yet, as the filename will contain the 040 octal
code, but committing to have the diff available for review.
@weiji14 weiji14 added the bug Something isn't working label Sep 7, 2021
@weiji14 weiji14 added this to the 0.5.0 milestone Sep 7, 2021
@weiji14 weiji14 self-assigned this Sep 7, 2021
@github-actions
Copy link
Contributor

github-actions bot commented Sep 7, 2021

Summary of changed images

This is an auto-generated report of images that have changed on the DVC remote

Status Path
added pygmt/tests/baseline/test_basemap_utm_projection.png
added pygmt/tests/baseline/test_config_format_date_map.png
modified pygmt/tests/baseline/test_rose_no_sectors.png

Image diff(s)

Added images

  • pygmt/tests/baseline/test_basemap_utm_projection.png

  • pygmt/tests/baseline/test_config_format_date_map.png

Modified images

Path Old New
test_rose_no_sectors.png

Report last updated at commit a526d45

@weiji14 weiji14 changed the title Arg with space Fix problem with arguments with spaces into pygmt functions Sep 7, 2021
@weiji14 weiji14 changed the title Fix problem with arguments with spaces into pygmt functions Allow passing arguments containing spaces into pygmt functions Sep 7, 2021
Comment on lines 117 to 120
with GMTTempFile(prefix="pygmt-filename with spaces", suffix=".png") as imgfile:
fig.savefig(imgfile.name)
assert r"\040" not in os.path.abspath(imgfile.name)
assert os.path.exists(imgfile.name)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A problem with this PR is that running fig.savefig with filenames that have spaces (introduced in #1116) will produce an output like "pygmt-filename\040with\040spaces.png". This test currently passes still, but it is a false positive, because Python's os.path.exists reads "pygmt-filename\040with\040spaces.png" as ""pygmt-filename with spaces.png"".

Any ideas how to improve the test and/or implementation of psconvert?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I've manually worked around the problem in 83c8c3b so filenames with spaces will be saved as usual (without \040 in the name).

@willschlitzer
Copy link
Contributor

@weiji14 I'm assuming this won't be done by Friday? Should I bump it to a later version?

@weiji14 weiji14 modified the milestones: 0.5.0, 0.6.0 Oct 27, 2021
@weiji14
Copy link
Member Author

weiji14 commented Oct 27, 2021

@weiji14 I'm assuming this won't be done by Friday? Should I bump it to a later version?

Yes, the bugfix here needs a thorough review and more tests I think, plus I haven't figured out how to resolve #1487 (comment). So let's move it to v0.6.0.

@weiji14 weiji14 linked an issue Nov 26, 2021 that may be closed by this pull request
@seisman
Copy link
Member

seisman commented Jan 10, 2022

xref: https://forum.generic-mapping-tools.org/t/errors-in-pygmt-when-there-are-spaces-in-string-arguments/2494/3

@@ -55,7 +55,7 @@ def __init__(self, **kwargs):
self.old_defaults[key] = lib.get_default(key)

# call gmt set to change GMT defaults
arg_str = " ".join([f"{key}={value}" for key, value in kwargs.items()])
arg_str = " ".join([f'{key}="{value}"' for key, value in kwargs.items()])
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

xref: https://forum.generic-mapping-tools.org/t/errors-in-pygmt-when-there-are-spaces-in-string-arguments/2494/3

Ok, I've fixed the FORMAT_DATE_MAP="o dd" issue reported in the forum by wrapping the arguments in double quotes here in commit c29e632. Didn't use \040 because o\040dd doesn't work. The problem with this solution is that the workaround suggested in the forum (use FORMAT_DATE_MAP='"o dd"') will break in PyGMT v0.6.0, but I think that's acceptable since that workaround isn't intuitive anyway.

So that fig.savefig won't insert `\040` characters when saving filenames with spaces. Resolves problem mentioned in https://github.com/GenericMappingTools/pygmt/pull/1487/files#r703116544
pygmt/figure.py Outdated
# Manually handle prefix -F argument so spaces aren't converted to \040
# by build_arg_string function. For more information, see
# https://github.com/GenericMappingTools/pygmt/pull/1487
prefix = kwargs.pop("F")
Copy link
Member

@maxrjones maxrjones Mar 12, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should there be a check before this that prefix is provided so that the user doesn't get a KeyError here? Or you could raise GMTInvalidInput in an except statement.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. Looking at the docs at https://docs.generic-mapping-tools.org/6.3/psconvert.html, it reads like prefix/-F is optional, but when I tried running psconvert without the -F option, it throws an error: psconvert [ERROR]: Modern GMT mode requires the -F option. See https://github.com/GenericMappingTools/gmt/blob/adb244afa51ca7246cc051080c9d47193087d6c2/src/psconvert.c#L1041-L1042

So since PyGMT is modern mode only, it should be fine to keep this line intact.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The docs say "If no input files are given, will convert the current active figure (see pygmt.figure). In this case, an output name must be given using parameter prefix." How does one provide other input files than the current figure using pygmt.Figure.psconvert?

If we chose not to support operations analogous to psconvert -Tg test.ps (which uses the same prefix as the original file name), I think we should make prefix required in the function signature or raise an invalid input exception to avoid the traceback below. I don't think it's obvious to that the error is caused by not using prefix.

image

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does one provide other input files than the current figure using pygmt.Figure.psconvert?

We can't, because it has not been implemented yet. The current fig.psconvert is a method of the pygmt.Figure() class and is tied to the fig, and can only 'psconvert' the active figure.

If we chose not to support operations analogous to psconvert -Tg test.ps (which uses the same prefix as the original file name), I think we should make prefix required in the function signature or raise an invalid input exception to avoid the traceback below. I don't think it's obvious to that the error is caused by not using prefix.

Ok, I've added a try-except block in 801ba01 to raise a GMTInvalidInput if prefix/-F is not given.

@maxrjones
Copy link
Member

I tried for a while to break it. Nice work on this complicated issue!

The two issues that I came across relate to the outgrid and outfile parameters.

Outgrid parameter

The following will produce a filename with spaces using the main branch but the filename contains "\040" with this branch:

pygmt.grdcut("@earth_relief_01d", region=[0, 1, 0, 1], outgrid="test grid output.nc")

Outfile parameter

File outputs with spaces still require using the trick of wrapping double quotes in single quotes (this could be addressed separately from this PR).

pygmt.blockmean(data="@ship_15.txt", region=[245, 255, 20, 30], spacing="5m", outfile='"Test file output.txt"') # Works
pygmt.blockmean(data="@ship_15.txt", region=[245, 255, 20, 30], spacing="5m", outfile="Test file output.txt") # Doesn't work

@weiji14
Copy link
Member Author

weiji14 commented Mar 12, 2022

I tried for a while to break it. Nice work on this complicated issue!

Thanks for testing this! I still have a nagging feeling that there's a random edge case that will break this, but we'll find out once this gets released into the wild 😈

The two issues that I came across relate to the outgrid and outfile parameters.

Outgrid parameter

The following will produce a filename with spaces using the main branch but the filename contains "\040" with this branch:

pygmt.grdcut("@earth_relief_01d", region=[0, 1, 0, 1], outgrid="test grid output.nc")

Outfile parameter

File outputs with spaces still require using the trick of wrapping double quotes in single quotes (this could be addressed separately from this PR).

pygmt.blockmean(data="@ship_15.txt", region=[245, 255, 20, 30], spacing="5m", outfile='"Test file output.txt"') # Works
pygmt.blockmean(data="@ship_15.txt", region=[245, 255, 20, 30], spacing="5m", outfile="Test file output.txt") # Doesn't work

Right, fixing this would require a workaround like the one applied to fig.savefig mentioned in #1487 (comment), but it'll be a lot of work to apply it individually to all the functions having an outgrid/outfile parameter. Maybe leave it for another time, and honestly, we really need to discourage people from having spaces in filenames in the first place.

Copy link
Member

@maxrjones maxrjones left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are still a couple issues to work out as mentioned in https://github.com/GenericMappingTools/pygmt/pull/1487/files#r825371307 and #1487 (comment), but this is a big improvement that would benefit users in v0.6.0.

@maxrjones maxrjones added the final review call This PR requires final review and approval from a second reviewer label Mar 13, 2022
@seisman
Copy link
Member

seisman commented Mar 13, 2022

The one thing that makes little sense to me is shown in the following example:

import pygmt
fig = pygmt.Figure()
fig.basemap(region=[0, 2, 4, 8], projection="X10c", frame=["WSne+tPlot Zero", "xaf+lX label+uK"])
fig.show()

As you can see, I have to use "xaf+lX label+uK", not "xaf+l'X label'+uK" or 'xaf+l"X label"+uK'. For me, the last two are much easier to read.

@weiji14
Copy link
Member Author

weiji14 commented Mar 13, 2022

As you can see, I have to use "xaf+lX label+uK", not "xaf+l'X label'+uK" or 'xaf+l"X label"+uK'. For me, the last two are much easier to read.

I just tested all three examples and only "xaf+l'X label'+uK" doesn't work properly (it prints 'X label instead of X label), but that's a known issue with using single quotes. This branch should be backwards compatible with the 'xaf+l"X label"+uK' method, and it now allows for "xaf+lX label+uK" which isn't nice but works if people like that format.

@seisman
Copy link
Member

seisman commented Mar 13, 2022

As you can see, I have to use "xaf+lX label+uK", not "xaf+l'X label'+uK" or 'xaf+l"X label"+uK'. For me, the last two are much easier to read.

I just tested all three examples and only "xaf+l'X label'+uK" doesn't work properly (it prints 'X label instead of X label), but that's a known issue with using single quotes. This branch should be backwards compatible with the 'xaf+l"X label"+uK' method, and it now allows for "xaf+lX label+uK" which isn't nice but works if people like that format.

Ah, you're right. Good to know that I can still use 'xaf+l"X label"+uK'.

I'm reading how GMT parses a long text string https://github.com/GenericMappingTools/gmt/blob/434ed1848a1e8886931694b2de4c352e4c6cddae/src/gmt_parse.c#L600, it seems it ignores double quotes (that's why 'xaf+l"X label"+uK'), but doesn't ignore single quotes (so "xaf+l'X label'+uK" doesn't work).

pygmt/helpers/utils.py Outdated Show resolved Hide resolved
Comment on lines +209 to +215
if key != "J": # non-projection parameters
_value = str(kwargs[key]).replace(" ", r"\040")
else:
# special handling if key == "J" (projection)
# remove any spaces in PROJ4 string
_value = str(kwargs[key]).replace(" ", "")
gmt_args.append(rf"-{key}{_value}")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know almost nothing about PROJ4, but does a PROJ4 string always contains +proj=xxx or EPSG? If yes, then we can simplify the if-test to something like:

            if key != "J" or (key == "J" and "+proj" not in value and "EPSG" not in value):
                _value = str(kwargs[key]).replace(" ", r"\040")
            else:
                # special handling if -J + PROJ4 string
                # remove any spaces in PROJ4 string
                _value = str(kwargs[key]).replace(" ", "")
            gmt_args.append(rf"-{key}{_value}")

The motivation is, since PROJ4 is not commonly used, it seems a waste of time to do call the replace method for GMT-style -J.

Copy link
Member Author

@weiji14 weiji14 Mar 13, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if all PROJ-strings have either +proj=xxx or EPSG (see https://proj.org/operations/projections/index.html), but they probably do? I know that there's +init=EPSG:xxx which kinda works when I tried here, but that's actually deprecated syntax (see https://pyproj4.github.io/pyproj/stable/gotchas.html#init-auth-auth-code-should-be-replaced-with-auth-auth-code) and probably not something we want to allow.

Also unsure if 2 extra if-checks saves time, considering that either way, a str.replace still needs to happen for the space character - either to \040 or to nothing ``.

@weiji14
Copy link
Member Author

weiji14 commented Mar 13, 2022

Thanks team! I'll merge this in and close that almost 4 year old #247 bug!!

@weiji14 weiji14 merged commit b173c69 into main Mar 13, 2022
@weiji14 weiji14 deleted the arg_with_space branch March 13, 2022 15:46
@weiji14 weiji14 removed the final review call This PR requires final review and approval from a second reviewer label Mar 13, 2022
sixy6e pushed a commit to sixy6e/pygmt that referenced this pull request Dec 21, 2022
…icMappingTools#1487)

* Replace spaces in arguments with octal code 040

Modifying build_arg_string function to replace blank space
characters with octal code 040, and added a doctest to
check various combinations with single and double quotes
included.

* Remove workarounds for spaces in fig.subplot's autolabel and title args

Supersedes workaround for subplot's autolabel (-A)
and title (-T) parameters in
a9d167d,
4126c16, and
eadb847.

* Remove workaround for spaces in fig.text's -F argument
* Remove double quotes around legend label test examples
* Edit test_rose_no_sectors to remove single quotes from title
* Remove workaround for spaces in fig.psconvert prefix

Doesn't work yet, as the filename will contain the 040 octal
code, but committing to have the diff available for review.

* Ensure spaces in pygmt.config arguments can work

Also added a regression test for
FORMAT_DATE_MAP="o dd".

* Manually handle prefix -F in psconvert

So that fig.savefig won't insert `\040` characters when saving filenames
with spaces. Resolves problem mentioned in
https://github.com/GenericMappingTools/pygmt/pull/1487/files#r703116544

* Handle PROJ4 strings with spaces

Instead of converting spaces to \040 in proj4 strings, just remove them directly.
Added parametrized unit tests to basemap and grdproject to check that it works.

* Use Modifier Letter Colon instead of regular colon to fix WIndows tests

Adapted from https://stackoverflow.com/questions/10386344/how-to-get-a-file-in-windows-with-a-colon-in-the-filename/25477235#25477235.

* Try using underscore instead of Modifier Letter Colon
* Raise GMTInvalidInput if no prefix argument is passed to psconvert

Co-authored-by: Dongdong Tian <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Dynamic legend labels with space in it Arguments with spaces in strings
4 participants