API: Harmonize dtype for index levels for Series.sparse.from_coo (#50926)

* API: Harmonize dtype for index levels for Series.sparse.from_coo

* add gh number
topper-123 authored Jan 23, 2023
1 parent 79b2610 commit 1128f5e
Showing 3 changed files with 5 additions and 10 deletions.
1 change: 1 addition & 0 deletions doc/source/whatsnew/v2.0.0.rst
@@ -602,6 +602,7 @@ Other API changes
   methods to get a full slice (for example ``df.loc[:]`` or ``df[:]``) (:issue:`49469`)
 - Disallow computing ``cumprod`` for :class:`Timedelta` object; previously this returned incorrect values (:issue:`50246`)
 - Loading a JSON file with duplicate columns using ``read_json(orient='split')`` renames columns to avoid duplicates, as :func:`read_csv` and the other readers do (:issue:`50370`)
+- The levels of the index of the :class:`Series` returned from ``Series.sparse.from_coo`` now always have dtype ``int32``. Previously they had dtype ``int64`` (:issue:`50926`)
 - :func:`to_datetime` with ``unit`` of either "Y" or "M" will now raise if a sequence contains a non-round ``float`` value, matching the ``Timestamp`` behavior (:issue:`50301`)
 -

5 changes: 1 addition & 4 deletions pandas/core/arrays/sparse/scipy_sparse.py
@@ -203,9 +203,6 @@ def coo_to_sparse_series(
     ser = ser.sort_index()
     ser = ser.astype(SparseDtype(ser.dtype))
     if dense_index:
-        # is there a better constructor method to use here?
-        i = range(A.shape[0])
-        j = range(A.shape[1])
-        ind = MultiIndex.from_product([i, j])
+        ind = MultiIndex.from_product([A.row, A.col])
         ser = ser.reindex(ind)
     return ser
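
Note on the hunk above: the dtype of the resulting index levels follows from what is passed to ``MultiIndex.from_product``. The sketch below is an illustration, not part of the commit. It assumes that ``int32`` arrays stand in for ``A.row`` and ``A.col`` (the test comment in this commit states that ``scipy.sparse.eye`` produces ``int32`` there), and the variable names are made up for the example. The printed dtypes depend on the installed pandas version, so no expected output is given.

```python
import numpy as np
import pandas as pd

# Old construction: plain Python ranges carry no NumPy dtype,
# so pandas falls back to its default integer dtype for the levels.
from_ranges = pd.MultiIndex.from_product([range(3), range(3)])

# New construction: int32 arrays stand in for A.row and A.col
# (an assumption for this sketch; no scipy matrix is built here).
row = np.arange(3, dtype=np.int32)
col = np.arange(3, dtype=np.int32)
from_arrays = pd.MultiIndex.from_product([row, col])

range_level_dtypes = [level.dtype for level in from_ranges.levels]
array_level_dtypes = [level.dtype for level in from_arrays.levels]

print("levels built from ranges:      ", range_level_dtypes)
print("levels built from int32 arrays:", array_level_dtypes)
```

Both constructions produce the same nine ``(row, col)`` pairs; only the dtype of the levels can differ.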
9 changes: 3 additions & 6 deletions pandas/tests/arrays/sparse/test_accessor.py
@@ -218,14 +218,11 @@ def test_series_from_coo(self, dtype, dense_index):
         A = scipy.sparse.eye(3, format="coo", dtype=dtype)
         result = pd.Series.sparse.from_coo(A, dense_index=dense_index)

-        # TODO: GH49560: scipy.sparse.eye always has A.row and A.col dtype as int32.
-        # fix index_dtype to follow scipy.sparse convention (always int32)?
-        index_dtype = np.int64 if dense_index else np.int32
         index = pd.MultiIndex.from_tuples(
             [
-                np.array([0, 0], dtype=index_dtype),
-                np.array([1, 1], dtype=index_dtype),
-                np.array([2, 2], dtype=index_dtype),
+                np.array([0, 0], dtype=np.int32),
+                np.array([1, 1], dtype=np.int32),
+                np.array([2, 2], dtype=np.int32),
             ],
         )
         expected = pd.Series(SparseArray(np.array([1, 1, 1], dtype=dtype)), index=index)
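
For readers who want to see the user-facing effect, here is a minimal sketch that mirrors the test setup above. It assumes pandas and scipy are installed, and the variable names are illustrative only. On a pandas build that includes this commit, both calls should report ``int32`` index levels, as the whatsnew entry states; other versions may print different dtypes, so no expected output is shown.

```python
import pandas as pd
import scipy.sparse

# A 3x3 identity matrix in COO format, as in the test above.
A = scipy.sparse.eye(3, format="coo", dtype="float64")

# dense_index=False keeps only the stored entries;
# dense_index=True reindexes over the product of A.row and A.col.
sparse_result = pd.Series.sparse.from_coo(A, dense_index=False)
dense_result = pd.Series.sparse.from_coo(A, dense_index=True)

sparse_level_dtypes = [level.dtype for level in sparse_result.index.levels]
dense_level_dtypes = [level.dtype for level in dense_result.index.levels]

print("dense_index=False:", len(sparse_result), "entries, levels", sparse_level_dtypes)
print("dense_index=True: ", len(dense_result), "entries, levels", dense_level_dtypes)
```

The point of the change is that the two calls agree on the level dtype, so code consuming the index no longer has to branch on ``dense_index``.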