Commit
DEPR: fix stacklevel for DataFrame(mgr) deprecation (#55591)
jorisvandenbossche authored Nov 9, 2023
1 parent a167f13 commit 775f716
Showing 22 changed files with 134 additions and 215 deletions.
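The gist of the fix: ``DataFrame.__init__`` and ``Series.__init__`` warn when handed an internal ``BlockManager``, and a warning's ``stacklevel`` decides which stack frame the warning is attributed to. A minimal sketch of that mechanism (illustrative names, not pandas code):

import warnings


def library_internal():
    # stacklevel=1 attributes the warning to this line (the library);
    # stacklevel=2 attributes it to the caller one frame up.
    warnings.warn(
        "Passing a BlockManager to DataFrame is deprecated",
        DeprecationWarning,
        stacklevel=2,
    )


def user_code():
    library_internal()  # with stacklevel=2, the warning reports this line


user_code()

``find_stack_level()`` instead walks up to the first frame outside pandas, which misfires when the constructor is reached from inside pyarrow's ``Table.to_pandas`` rather than from user code — hence the hardcoded values in the diffs below.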
2 changes: 0 additions & 2 deletions doc/source/user_guide/10min.rst
@@ -763,14 +763,12 @@ Parquet
 Writing to a Parquet file:

 .. ipython:: python
-   :okwarning:

    df.to_parquet("foo.parquet")

 Reading from a Parquet file Store using :func:`read_parquet`:

 .. ipython:: python
-   :okwarning:

    pd.read_parquet("foo.parquet")
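Context for the doc changes above and below: ``:okwarning:`` is an option of the ``.. ipython::`` Sphinx directive that lets a documented code block emit warnings without failing pandas' strict documentation build. It had been added because the parquet round trip surfaced the ``BlockManager`` deprecation; after this commit the warning no longer reaches user-level code for these readers (the commit touches 22 files, only some shown here), so the option can go. A rough stand-in for what the strict build enforces (my sketch; assumes pyarrow is installed and a pandas version with this fix):

import warnings

import pandas as pd

df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})
with warnings.catch_warnings():
    # Escalate DeprecationWarning to an error, as a strict docs build does.
    warnings.simplefilter("error", DeprecationWarning)
    df.to_parquet("foo.parquet")
    result = pd.read_parquet("foo.parquet")  # must complete without warning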
11 changes: 0 additions & 11 deletions doc/source/user_guide/io.rst
@@ -2247,7 +2247,6 @@ For line-delimited json files, pandas can also return an iterator which reads in
 Line-limited json can also be read using the pyarrow reader by specifying ``engine="pyarrow"``.

 .. ipython:: python
-   :okwarning:

    from io import BytesIO
    df = pd.read_json(BytesIO(jsonl.encode()), lines=True, engine="pyarrow")
@@ -5372,15 +5371,13 @@ See the documentation for `pyarrow <https://arrow.apache.org/docs/python/>`__ an
 Write to a parquet file.

 .. ipython:: python
-   :okwarning:

    df.to_parquet("example_pa.parquet", engine="pyarrow")
    df.to_parquet("example_fp.parquet", engine="fastparquet")

 Read from a parquet file.

 .. ipython:: python
-   :okwarning:

    result = pd.read_parquet("example_fp.parquet", engine="fastparquet")
    result = pd.read_parquet("example_pa.parquet", engine="pyarrow")
@@ -5390,7 +5387,6 @@ Read from a parquet file.
 By setting the ``dtype_backend`` argument you can control the default dtypes used for the resulting DataFrame.

 .. ipython:: python
-   :okwarning:

    result = pd.read_parquet("example_pa.parquet", engine="pyarrow", dtype_backend="pyarrow")
@@ -5404,7 +5400,6 @@ By setting the ``dtype_backend`` argument you can control the default dtypes use
 Read only certain columns of a parquet file.

 .. ipython:: python
-   :okwarning:

    result = pd.read_parquet(
        "example_fp.parquet",
@@ -5433,7 +5428,6 @@ Serializing a ``DataFrame`` to parquet may include the implicit index as one or
 more columns in the output file. Thus, this code:

 .. ipython:: python
-   :okwarning:

    df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})
    df.to_parquet("test.parquet", engine="pyarrow")
@@ -5450,7 +5444,6 @@ If you want to omit a dataframe's indexes when writing, pass ``index=False`` to
 :func:`~pandas.DataFrame.to_parquet`:

 .. ipython:: python
-   :okwarning:

    df.to_parquet("test.parquet", index=False)
@@ -5473,7 +5466,6 @@ Partitioning Parquet files
 Parquet supports partitioning of data based on the values of one or more columns.

 .. ipython:: python
-   :okwarning:

    df = pd.DataFrame({"a": [0, 0, 1, 1], "b": [0, 1, 0, 1]})
    df.to_parquet(path="test", engine="pyarrow", partition_cols=["a"], compression=None)
@@ -5539,14 +5531,12 @@ ORC format, :func:`~pandas.read_orc` and :func:`~pandas.DataFrame.to_orc`. This
 Write to an orc file.

 .. ipython:: python
-   :okwarning:

    df.to_orc("example_pa.orc", engine="pyarrow")

 Read from an orc file.

 .. ipython:: python
-   :okwarning:

    result = pd.read_orc("example_pa.orc")
@@ -5555,7 +5545,6 @@ Read from an orc file.
 Read only certain columns of an orc file.

 .. ipython:: python
-   :okwarning:

    result = pd.read_orc(
        "example_pa.orc",
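The partitioning example in the last hunk writes a Hive-style directory tree rather than a single file; a sketch of what to expect on disk (the layout is pyarrow's convention, and file names are generated):

import pandas as pd

df = pd.DataFrame({"a": [0, 0, 1, 1], "b": [0, 1, 0, 1]})
df.to_parquet(path="test", engine="pyarrow", partition_cols=["a"], compression=None)

# Resulting layout, roughly:
# test/
#     a=0/<generated>.parquet
#     a=1/<generated>.parquet
# Reading the tree back reconstructs "a" from the directory names:
result = pd.read_parquet("test")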
3 changes: 0 additions & 3 deletions doc/source/user_guide/pyarrow.rst
@@ -104,7 +104,6 @@ To convert a :external+pyarrow:py:class:`pyarrow.Table` to a :class:`DataFrame`,
 :external+pyarrow:py:meth:`pyarrow.Table.to_pandas` method with ``types_mapper=pd.ArrowDtype``.

 .. ipython:: python
-   :okwarning:

    table = pa.table([pa.array([1, 2, 3], type=pa.int64())], names=["a"])
@@ -165,7 +164,6 @@ functions provide an ``engine`` keyword that can dispatch to PyArrow to accelera
 * :func:`read_feather`

 .. ipython:: python
-   :okwarning:

    import io
    data = io.StringIO("""a,b,c
@@ -180,7 +178,6 @@ PyArrow-backed data by specifying the parameter ``dtype_backend="pyarrow"``. A r
 ``engine="pyarrow"`` to necessarily return PyArrow-backed data.

 .. ipython:: python
-   :okwarning:

    import io
    data = io.StringIO("""a,b,c,d,e,f,g,h,i
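The last hunk's point is easy to miss: ``engine="pyarrow"`` only accelerates parsing, while ``dtype_backend="pyarrow"`` controls the dtypes of the result; the two knobs are independent. A hedged illustration (the column names are made up):

import io

import pandas as pd

data = io.StringIO("a,b\n1,x\n2,y")
# Fast parsing AND Arrow-backed result dtypes require both parameters:
df = pd.read_csv(data, engine="pyarrow", dtype_backend="pyarrow")
print(df.dtypes)  # e.g. int64[pyarrow], string[pyarrow]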
3 changes: 0 additions & 3 deletions doc/source/user_guide/scale.rst
@@ -51,7 +51,6 @@ To load the columns we want, we have two options.
 Option 1 loads in all the data and then filters to what we need.

 .. ipython:: python
-   :okwarning:

    columns = ["id_0", "name_0", "x_0", "y_0"]
@@ -60,7 +59,6 @@ Option 2 only loads the columns we request.
 Option 2 only loads the columns we request.

 .. ipython:: python
-   :okwarning:

    pd.read_parquet("timeseries_wide.parquet", columns=columns)
@@ -202,7 +200,6 @@ counts up to this point. As long as each individual file fits in memory, this wi
 work for arbitrary-sized datasets.

 .. ipython:: python
-   :okwarning:

    %%time
    files = pathlib.Path("data/timeseries/").glob("ts*.parquet")
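The last scale.rst hunk belongs to the guide's file-at-a-time aggregation example; the surrounding (unchanged) code follows this pattern, which keeps peak memory bounded by a single file. A self-contained sketch (the "name" column follows the guide's example dataset):

import pathlib

import pandas as pd

# Aggregate one file at a time; only the running totals stay in memory.
counts = pd.Series(dtype="int64")
for path in sorted(pathlib.Path("data/timeseries/").glob("ts*.parquet")):
    df = pd.read_parquet(path)
    counts = counts.add(df["name"].value_counts(), fill_value=0)
counts = counts.astype("int64")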
1 change: 0 additions & 1 deletion doc/source/whatsnew/v2.0.0.rst
@@ -152,7 +152,6 @@ When this keyword is set to ``"pyarrow"``, then these functions will return pyar
 * :meth:`Series.convert_dtypes`

 .. ipython:: python
-   :okwarning:

    import io
    data = io.StringIO("""a,b,c,d,e,f,g,h,i
2 changes: 1 addition & 1 deletion pandas/core/frame.py
@@ -697,7 +697,7 @@ def __init__(
                 "is deprecated and will raise in a future version. "
                 "Use public APIs instead.",
                 DeprecationWarning,
-                stacklevel=find_stack_level(),
+                stacklevel=1,  # bump to 2 once pyarrow 15.0 is released with fix
             )

         if using_copy_on_write():
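Why ``stacklevel=1`` here rather than 2: the expected producer of a raw ``BlockManager`` is pyarrow's ``Table.to_pandas``, which calls this constructor from library code, so the frame one level up is pyarrow internals rather than the user; the inline comment defers the bump until a fixed pyarrow release is out. One way to observe where the warning gets attributed (version-dependent — a new enough pyarrow avoids the deprecated path entirely, in which case nothing is caught):

import warnings

import pyarrow as pa

table = pa.table({"a": [1, 2, 3]})
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    df = table.to_pandas()  # may pass a BlockManager to DataFrame internally

for w in caught:
    # With stacklevel=1 the filename points into pandas itself rather than
    # at whatever frame find_stack_level() happened to pick.
    print(w.category.__name__, w.filename, w.lineno)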
8 changes: 4 additions & 4 deletions pandas/core/series.py
@@ -407,7 +407,7 @@ def __init__(
                 "is deprecated and will raise in a future version. "
                 "Use public APIs instead.",
                 DeprecationWarning,
-                stacklevel=find_stack_level(),
+                stacklevel=2,
             )
             if using_copy_on_write():
                 data = data.copy(deep=False)
@@ -446,7 +446,7 @@ def __init__(
                 "is deprecated and will raise in a future version. "
                 "Use public APIs instead.",
                 DeprecationWarning,
-                stacklevel=find_stack_level(),
+                stacklevel=2,
             )

             if copy:
@@ -465,7 +465,7 @@ def __init__(
                 "is deprecated and will raise in a future version. "
                 "Use public APIs instead.",
                 DeprecationWarning,
-                stacklevel=find_stack_level(),
+                stacklevel=2,
             )

         name = ibase.maybe_extract_name(name, data, type(self))
@@ -539,7 +539,7 @@ def __init__(
                 "is deprecated and will raise in a future version. "
                 "Use public APIs instead.",
                 DeprecationWarning,
-                stacklevel=find_stack_level(),
+                stacklevel=2,
             )
             allow_mgr = True
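``Series.__init__``, by contrast, can plausibly receive a deprecated manager directly from calling code, so the fixed ``stacklevel=2`` (the constructor's immediate caller) is the right attribution. A small illustration (it touches private internals purely to provoke the path; assumes a pandas 2.x version with this deprecation):

import warnings

import pandas as pd

ser = pd.Series([1, 2, 3])
mgr = ser._mgr  # private attribute, used here only to trigger the warning

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    ser2 = pd.Series(mgr)  # stacklevel=2 attributes the warning to this line

print(caught[0].filename)  # this script, not pandas/core/series.py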
22 changes: 10 additions & 12 deletions pandas/tests/arrays/interval/test_interval.py
@@ -309,6 +309,9 @@ def test_arrow_array_missing():
     assert result.storage.equals(expected)


+@pytest.mark.filterwarnings(
+    "ignore:Passing a BlockManager to DataFrame:DeprecationWarning"
+)
 @pytest.mark.parametrize(
     "breaks",
     [[0.0, 1.0, 2.0, 3.0], date_range("2017", periods=4, freq="D")],
@@ -325,29 +328,26 @@ def test_arrow_table_roundtrip(breaks):

     table = pa.table(df)
     assert isinstance(table.field("a").type, ArrowIntervalType)
-    msg = "Passing a BlockManager to DataFrame is deprecated"
-    with tm.assert_produces_warning(DeprecationWarning, match=msg):
-        result = table.to_pandas()
+    result = table.to_pandas()
     assert isinstance(result["a"].dtype, pd.IntervalDtype)
     tm.assert_frame_equal(result, df)

     table2 = pa.concat_tables([table, table])
-    msg = "Passing a BlockManager to DataFrame is deprecated"
-    with tm.assert_produces_warning(DeprecationWarning, match=msg):
-        result = table2.to_pandas()
+    result = table2.to_pandas()
     expected = pd.concat([df, df], ignore_index=True)
     tm.assert_frame_equal(result, expected)

     # GH-41040
     table = pa.table(
         [pa.chunked_array([], type=table.column(0).type)], schema=table.schema
     )
-    msg = "Passing a BlockManager to DataFrame is deprecated"
-    with tm.assert_produces_warning(DeprecationWarning, match=msg):
-        result = table.to_pandas()
+    result = table.to_pandas()
     tm.assert_frame_equal(result, expected[0:0])


+@pytest.mark.filterwarnings(
+    "ignore:Passing a BlockManager to DataFrame:DeprecationWarning"
+)
 @pytest.mark.parametrize(
     "breaks",
     [[0.0, 1.0, 2.0, 3.0], date_range("2017", periods=4, freq="D")],
@@ -365,9 +365,7 @@ def test_arrow_table_roundtrip_without_metadata(breaks):
     table = table.replace_schema_metadata()
     assert table.schema.metadata is None

-    msg = "Passing a BlockManager to DataFrame is deprecated"
-    with tm.assert_produces_warning(DeprecationWarning, match=msg):
-        result = table.to_pandas()
+    result = table.to_pandas()
     assert isinstance(result["a"].dtype, pd.IntervalDtype)
     tm.assert_frame_equal(result, df)
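The test change above swaps ``tm.assert_produces_warning`` (which fails if the warning does *not* fire) for a ``filterwarnings`` mark that merely tolerates it — the right call once whether ``to_pandas`` warns depends on the installed pyarrow version. The pattern in isolation:

import pytest


@pytest.mark.filterwarnings(
    "ignore:Passing a BlockManager to DataFrame:DeprecationWarning"
)
def test_roundtrip():
    # The filter spec format is "action:message-regex:category"; the test
    # now passes whether or not the deprecation fires inside to_pandas().
    ...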
24 changes: 9 additions & 15 deletions pandas/tests/arrays/masked/test_arrow_compat.py
@@ -4,6 +4,10 @@
 import pandas as pd
 import pandas._testing as tm

+pytestmark = pytest.mark.filterwarnings(
+    "ignore:Passing a BlockManager to DataFrame:DeprecationWarning"
+)
+
 pa = pytest.importorskip("pyarrow")

 from pandas.core.arrays.arrow._arrow_utils import pyarrow_array_to_numpy_and_mask
@@ -36,9 +40,7 @@ def test_arrow_roundtrip(data):
     table = pa.table(df)
     assert table.field("a").type == str(data.dtype.numpy_dtype)

-    msg = "Passing a BlockManager to DataFrame is deprecated"
-    with tm.assert_produces_warning(DeprecationWarning, match=msg):
-        result = table.to_pandas()
+    result = table.to_pandas()
     assert result["a"].dtype == data.dtype
     tm.assert_frame_equal(result, df)

@@ -56,9 +58,7 @@ def types_mapper(arrow_type):
     record_batch = pa.RecordBatch.from_arrays(
         [bools_array, ints_array, small_ints_array], ["bools", "ints", "small_ints"]
     )
-    msg = "Passing a BlockManager to DataFrame is deprecated"
-    with tm.assert_produces_warning(DeprecationWarning, match=msg):
-        result = record_batch.to_pandas(types_mapper=types_mapper)
+    result = record_batch.to_pandas(types_mapper=types_mapper)
     bools = pd.Series([True, None, False], dtype="boolean")
     ints = pd.Series([1, None, 2], dtype="Int64")
     small_ints = pd.Series([-1, 0, 7], dtype="Int64")
@@ -75,9 +75,7 @@ def test_arrow_load_from_zero_chunks(data):
     table = pa.table(
         [pa.chunked_array([], type=table.field("a").type)], schema=table.schema
     )
-    msg = "Passing a BlockManager to DataFrame is deprecated"
-    with tm.assert_produces_warning(DeprecationWarning, match=msg):
-        result = table.to_pandas()
+    result = table.to_pandas()
     assert result["a"].dtype == data.dtype
     tm.assert_frame_equal(result, df)

@@ -98,18 +96,14 @@ def test_arrow_sliced(data):

     df = pd.DataFrame({"a": data})
     table = pa.table(df)
-    msg = "Passing a BlockManager to DataFrame is deprecated"
-    with tm.assert_produces_warning(DeprecationWarning, match=msg):
-        result = table.slice(2, None).to_pandas()
+    result = table.slice(2, None).to_pandas()
     expected = df.iloc[2:].reset_index(drop=True)
     tm.assert_frame_equal(result, expected)

     # no missing values
     df2 = df.fillna(data[0])
     table = pa.table(df2)
-    msg = "Passing a BlockManager to DataFrame is deprecated"
-    with tm.assert_produces_warning(DeprecationWarning, match=msg):
-        result = table.slice(2, None).to_pandas()
+    result = table.slice(2, None).to_pandas()
     expected = df2.iloc[2:].reset_index(drop=True)
     tm.assert_frame_equal(result, expected)
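When every test in a module needs the same filter, the file-level ``pytestmark`` used above (and in the next file) is the idiomatic shorthand: pytest picks up a module-global named ``pytestmark`` — a single mark or a list of marks — and applies it to every test collected from that module, equivalent to decorating each test individually. For example:

import pytest

# A list applies several marks module-wide (the second mark is illustrative):
pytestmark = [
    pytest.mark.filterwarnings(
        "ignore:Passing a BlockManager to DataFrame:DeprecationWarning"
    ),
    pytest.mark.filterwarnings("ignore::FutureWarning"),
]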
21 changes: 9 additions & 12 deletions pandas/tests/arrays/period/test_arrow_compat.py
@@ -11,6 +11,11 @@
     period_array,
 )

+pytestmark = pytest.mark.filterwarnings(
+    "ignore:Passing a BlockManager to DataFrame:DeprecationWarning"
+)
+
+
 pa = pytest.importorskip("pyarrow")


@@ -81,16 +86,12 @@ def test_arrow_table_roundtrip():

     table = pa.table(df)
     assert isinstance(table.field("a").type, ArrowPeriodType)
-    msg = "Passing a BlockManager to DataFrame is deprecated"
-    with tm.assert_produces_warning(DeprecationWarning, match=msg):
-        result = table.to_pandas()
+    result = table.to_pandas()
     assert isinstance(result["a"].dtype, PeriodDtype)
     tm.assert_frame_equal(result, df)

     table2 = pa.concat_tables([table, table])
-    msg = "Passing a BlockManager to DataFrame is deprecated"
-    with tm.assert_produces_warning(DeprecationWarning, match=msg):
-        result = table2.to_pandas()
+    result = table2.to_pandas()
     expected = pd.concat([df, df], ignore_index=True)
     tm.assert_frame_equal(result, expected)

@@ -109,9 +110,7 @@ def test_arrow_load_from_zero_chunks():
         [pa.chunked_array([], type=table.column(0).type)], schema=table.schema
     )

-    msg = "Passing a BlockManager to DataFrame is deprecated"
-    with tm.assert_produces_warning(DeprecationWarning, match=msg):
-        result = table.to_pandas()
+    result = table.to_pandas()
     assert isinstance(result["a"].dtype, PeriodDtype)
     tm.assert_frame_equal(result, df)

@@ -126,8 +125,6 @@ def test_arrow_table_roundtrip_without_metadata():
     table = table.replace_schema_metadata()
     assert table.schema.metadata is None

-    msg = "Passing a BlockManager to DataFrame is deprecated"
-    with tm.assert_produces_warning(DeprecationWarning, match=msg):
-        result = table.to_pandas()
+    result = table.to_pandas()
     assert isinstance(result["a"].dtype, PeriodDtype)
     tm.assert_frame_equal(result, df)