Two tests fail on macOS suddenly due to pandas + pyarrow #3732

Closed
seisman opened this issue Dec 30, 2024 · 1 comment
Labels
upstream Bug or missing feature of upstream core GMT

seisman commented Dec 30, 2024

Two tests suddenly started failing on macOS in the latest scheduled CI run (see https://github.com/GenericMappingTools/pygmt/actions/runs/12540348940), although yesterday's run passed (https://github.com/GenericMappingTools/pygmt/actions/runs/12532978920). Both fail inside pandas with NameError: name 'pa' is not defined while constructing a pyarrow-backed dtype.

=================================== FAILURES ===================================
___________________ test_vectors_to_arrays_pyarrow_datetime ____________________

    @pytest.mark.skipif(not _HAS_PYARROW, reason="pyarrow is not installed.")
    def test_vectors_to_arrays_pyarrow_datetime():
        """
        Test the vectors_to_arrays function with pyarrow arrays containing date32/date64
        types.
        """
        vectors = [
>           pd.Series(
                data=[datetime.date(2020, 1, 1), datetime.date(2021, 12, 31)],
                dtype="date32[day][pyarrow]",
            ),
            pd.Series(
                data=[datetime.date(2022, 1, 1), datetime.date(2023, 12, 31)],
                dtype="date64[ms][pyarrow]",
            ),
        ]

../pygmt/tests/test_clib_vectors_to_arrays.py:79: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
../../../../micromamba/envs/pygmt/lib/python3.11/site-packages/pandas/core/series.py:428: in __init__
    dtype = self._validate_dtype(dtype)
../../../../micromamba/envs/pygmt/lib/python3.11/site-packages/pandas/core/generic.py:458: in _validate_dtype
    dtype = pandas_dtype(dtype)
../../../../micromamba/envs/pygmt/lib/python3.11/site-packages/pandas/core/dtypes/common.py:1679: in pandas_dtype
    result = registry.find(dtype)
../../../../micromamba/envs/pygmt/lib/python3.11/site-packages/pandas/core/dtypes/base.py:521: in find
    return dtype_type.construct_from_string(dtype)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

cls = <class 'pandas.core.arrays.arrow.dtype.ArrowDtype'>
string = 'date32[day][pyarrow]'

    @classmethod
    def construct_from_string(cls, string: str) -> ArrowDtype:
        """
        Construct this type from a string.
    
        Parameters
        ----------
        string : str
            string should follow the format f"{pyarrow_type}[pyarrow]"
            e.g. int64[pyarrow]
        """
        if not isinstance(string, str):
            raise TypeError(
                f"'construct_from_string' expects a string, got {type(string)}"
            )
        if not string.endswith("[pyarrow]"):
            raise TypeError(f"'{string}' must end with '[pyarrow]'")
        if string == "string[pyarrow]":
            # Ensure Registry.find skips ArrowDtype to use StringDtype instead
            raise TypeError("string[pyarrow] should be constructed by StringDtype")
    
        base_type = string[:-9]  # get rid of "[pyarrow]"
        try:
>           pa_dtype = pa.type_for_alias(base_type)
E           NameError: name 'pa' is not defined

../../../../micromamba/envs/pygmt/lib/python3.11/site-packages/pandas/core/arrays/arrow/dtype.py:218: NameError
_____________________ test_virtualfile_from_vectors_pandas _____________________

dtypes_pandas = (<class 'numpy.int8'>, <class 'numpy.int16'>, <class 'numpy.int32'>, <class 'numpy.int64'>, <class 'numpy.longlong'>, <class 'numpy.uint8'>, ...)

    def test_virtualfile_from_vectors_pandas(dtypes_pandas):
        """
        Pass vectors to a dataset using pandas.Series, checking both numpy and pyarrow
        dtypes.
        """
        size = 13
    
        for dtype in dtypes_pandas:
>           data = pd.DataFrame(
                data={
                    "x": np.arange(size),
                    "y": np.arange(size, size * 2, 1),
                    "z": np.arange(size * 2, size * 3, 1),
                },
                dtype=dtype,
            )

../pygmt/tests/test_clib_virtualfile_from_vectors.py:157: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
../../../../micromamba/envs/pygmt/lib/python3.11/site-packages/pandas/core/frame.py:650: in __init__
    dtype = self._validate_dtype(dtype)
../../../../micromamba/envs/pygmt/lib/python3.11/site-packages/pandas/core/generic.py:458: in _validate_dtype
    dtype = pandas_dtype(dtype)
../../../../micromamba/envs/pygmt/lib/python3.11/site-packages/pandas/core/dtypes/common.py:1679: in pandas_dtype
    result = registry.find(dtype)
../../../../micromamba/envs/pygmt/lib/python3.11/site-packages/pandas/core/dtypes/base.py:521: in find
    return dtype_type.construct_from_string(dtype)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

cls = <class 'pandas.core.arrays.arrow.dtype.ArrowDtype'>
string = 'int8[pyarrow]'

    @classmethod
    def construct_from_string(cls, string: str) -> ArrowDtype:
        """
        Construct this type from a string.
    
        Parameters
        ----------
        string : str
            string should follow the format f"{pyarrow_type}[pyarrow]"
            e.g. int64[pyarrow]
        """
        if not isinstance(string, str):
            raise TypeError(
                f"'construct_from_string' expects a string, got {type(string)}"
            )
        if not string.endswith("[pyarrow]"):
            raise TypeError(f"'{string}' must end with '[pyarrow]'")
        if string == "string[pyarrow]":
            # Ensure Registry.find skips ArrowDtype to use StringDtype instead
            raise TypeError("string[pyarrow] should be constructed by StringDtype")
    
        base_type = string[:-9]  # get rid of "[pyarrow]"
        try:
>           pa_dtype = pa.type_for_alias(base_type)
E           NameError: name 'pa' is not defined

The PyGMT source code did not change between the two runs, and the differences between the two environments seem irrelevant (neither pandas nor pyarrow appears in the diff):

diff old.txt new.txt
47c47
<     coverage                          7.6.9         py311h4921393_0          conda-forge
---
>     coverage                          7.6.10        py311h4921393_0          conda-forge
59c59
<     fonttools                         4.55.3        py311h4921393_0          conda-forge
---
>     fonttools                         4.55.3        py311h4921393_1          conda-forge
62c62
<     gdal                              3.10.0        py311h6d86783_10         conda-forge
---
>     gdal                              3.10.0        py311h32e851c_13         conda-forge
95c95
<     libabseil                         20240722.0    cxx17_hf9b8971_1         conda-forge
---
>     libabseil                         20240722.0    cxx17_h07bc746_2         conda-forge
114,115c114,115
<     libgdal-core                      3.10.0        hcf82b6a_10              conda-forge
<     libgdal-jp2openjpeg               3.10.0        h4ea06f0_10              conda-forge
---
>     libgdal-core                      3.10.0        h9ef0d2d_13              conda-forge
>     libgdal-jp2openjpeg               3.10.0        h5de94d9_13              conda-forge
122c122
<     libheif                           1.18.2        gpl_he913df3_100         conda-forge
---
>     libheif                           1.19.5        gpl_h297b2c4_100         conda-forge
136c136
<     libspatialindex                   2.0.0         h00cdb27_0               conda-forge
---
>     libspatialindex                   2.1.0         h57eeb1c_0               conda-forge
205c205
<     rtree                             1.3.0         py311hc46b6d3_2          conda-forge
---
>     rtree                             1.3.0         py311heb40887_3          conda-forge
215c215
<     sphinx-gallery                    0.18.0        pyhd8ed1ab_0             conda-forge
---
>     sphinx-gallery                    0.18.0        pyhd8ed1ab_1             conda-forge
246c246
<     zstd                              1.5.6         hb46c0d2_0               conda-forge
---
>     zstd                              1.5.6         hb46c0d2_0               conda-forge
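
The comparison above can also be scripted, which makes it easier to check whether a suspect package changed between two runs. The following is a minimal sketch, assuming each environment listing consists of whitespace-separated rows of the form `name version build channel`; the helper names `parse_env_listing` and `changed_packages` are hypothetical, and the sample rows are copied from the diff above.

```python
def parse_env_listing(text):
    """Map package name -> (version, build) from whitespace-separated rows."""
    packages = {}
    for line in text.splitlines():
        fields = line.split()
        # Assumed row layout: name, version, build, channel.
        # Rows with fewer than three fields (blank lines, rulers) are skipped.
        if len(fields) >= 3:
            packages[fields[0]] = (fields[1], fields[2])
    return packages


def changed_packages(old_text, new_text):
    """Return {name: (old_entry, new_entry)} for packages that differ."""
    old = parse_env_listing(old_text)
    new = parse_env_listing(new_text)
    return {
        name: (old.get(name), new.get(name))
        for name in sorted(old.keys() | new.keys())
        if old.get(name) != new.get(name)
    }


# Sample rows taken from the diff in this issue.
OLD = """\
    coverage                          7.6.9         py311h4921393_0          conda-forge
    libheif                           1.18.2        gpl_he913df3_100         conda-forge
    zstd                              1.5.6         hb46c0d2_0               conda-forge
"""
NEW = """\
    coverage                          7.6.10        py311h4921393_0          conda-forge
    libheif                           1.19.5        gpl_h297b2c4_100         conda-forge
    zstd                              1.5.6         hb46c0d2_0               conda-forge
"""

changes = changed_packages(OLD, NEW)
# Packages we would expect to be involved, given the traceback.
suspects = changes.keys() & {"pandas", "pyarrow"}

for name, (before, after) in changes.items():
    print(f"{name}: {before} -> {after}")
print("suspect packages changed:", sorted(suspects))
```

With the sample rows, only `coverage` and `libheif` are reported, and the set of changed suspect packages is empty, which matches the observation that the environment difference does not obviously explain the failure.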

A similar issue was reported in the pandas repository (pandas-dev/pandas#60573), and it also occurs on macOS.
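
The NameError in the traceback means the name `pa` was never bound in the pandas module that defines `ArrowDtype`. One possible explanation, which is an assumption and not something confirmed in this thread, is that importing pyarrow failed quietly when pandas was imported. A quick way to test that hypothesis is to import the packages directly in the affected environment and print the full error. The sketch below uses only the standard library; `probe_import` is a hypothetical helper name.

```python
import importlib
import traceback


def probe_import(module_name):
    """Try to import a module.

    Returns (True, version string) on success, or (False, formatted
    traceback) on failure, so a hidden import error becomes visible.
    """
    try:
        module = importlib.import_module(module_name)
    except Exception:
        # ImportError is the usual case, but a broken shared library
        # can surface as other exception types, so catch broadly here.
        return False, traceback.format_exc()
    return True, getattr(module, "__version__", "unknown")


if __name__ == "__main__":
    # Assumption: these are the two packages worth probing for this issue.
    for name in ("pyarrow", "pandas"):
        ok, detail = probe_import(name)
        status = "OK" if ok else "FAILED"
        print(f"{name}: {status}")
        print(detail)
```

If `pyarrow` is reported as FAILED while `pandas` imports fine, the printed traceback should show the underlying cause, which is more informative than the NameError raised later inside pandas.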

seisman commented Jan 1, 2025

seisman closed this as completed Jan 1, 2025
seisman added this to the 0.15.0 milestone Jan 1, 2025
seisman added the upstream (Bug or missing feature of upstream core GMT) label Jan 1, 2025