BUG: Fix astype str issue 54654 (#54687)
* Fix issue #54654
on pickle roundtrip astype(str) might change original array even when copy is True

* changelog

* Update v2.2.0.rst

rephrase

* rephrase

* Update lib.pyx

add gh comment

* Update v2.2.0.rst

* Update lib.pyx

fix CR

---------

Co-authored-by: Itay Azolay <[email protected]>
Itayazolay and itay-jether authored Oct 29, 2023
1 parent bd21f6b commit a39f783
Showing 3 changed files with 14 additions and 1 deletion.
1 change: 1 addition & 0 deletions doc/source/whatsnew/v2.2.0.rst
@@ -349,6 +349,7 @@ Numeric

 Conversion
 ^^^^^^^^^^
+- Bug in :func:`astype` when called with ``str`` on unpickled array - the array might change in-place (:issue:`54654`)
 - Bug in :meth:`Series.convert_dtypes` not converting all NA column to ``null[pyarrow]`` (:issue:`55346`)
 -

3 changes: 2 additions & 1 deletion pandas/_libs/lib.pyx
@@ -792,7 +792,8 @@ cpdef ndarray[object] ensure_string_array(

     result = np.asarray(arr, dtype="object")

-    if copy and result is arr:
+    if copy and (result is arr or np.shares_memory(arr, result)):
+        # GH#54654
         result = result.copy()
     elif not copy and result is arr:
         already_copied = False
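
The change above widens the copy condition from an identity check to a shared-memory check. The following is a minimal sketch of why that matters, not the pandas implementation: `copy_if_aliased` is a hypothetical helper, and a plain `ndarray.view()` is used here only as an assumed stand-in for any case where `np.asarray` hands back a different Python object backed by the same buffer.

```python
import numpy as np


def copy_if_aliased(source, result):
    """Return `result`, copied first if it aliases `source`.

    Hypothetical helper for illustration only; it is not part of pandas.
    """
    if result is source or np.shares_memory(source, result):
        return result.copy()
    return result


arr = np.array([(1, 2), None, 1], dtype="object")
view = arr.view()  # distinct Python object, same underlying buffer

# The identity check alone does not see the aliasing ...
print(view is arr)
# ... while the shared-memory check does.
print(np.shares_memory(arr, view))

# Writing into the guarded result leaves the source array untouched.
safe = copy_if_aliased(arr, view)
safe[1] = "changed"
print(arr[1])
```

With only `result is arr`, the view would be treated as an independent array and any in-place write would also modify `arr`.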
11 changes: 11 additions & 0 deletions pandas/tests/copy_view/test_astype.py
@@ -1,3 +1,5 @@
+import pickle
+
 import numpy as np
 import pytest

@@ -130,6 +132,15 @@ def test_astype_string_and_object_update_original(
     tm.assert_frame_equal(df2, df_orig)


+def test_astype_string_copy_on_pickle_roundrip():
+    # https://github.com/pandas-dev/pandas/issues/54654
+    # ensure_string_array may alter array inplace
+    base = Series(np.array([(1, 2), None, 1], dtype="object"))
+    base_copy = pickle.loads(pickle.dumps(base))
+    base_copy.astype(str)
+    tm.assert_series_equal(base, base_copy)
+
+
 def test_astype_dict_dtypes(using_copy_on_write):
     df = DataFrame(
         {"a": [1, 2, 3], "b": [4, 5, 6], "c": Series([1.5, 1.5, 1.5], dtype="float64")}