Remove SparseSeries and SparseDataFrame #28425

Merged
merged 36 commits into from
Sep 18, 2019
Changes from 7 commits
acf5f2f
CLN: Remove sparse
TomAugspurger Sep 12, 2019
5418dd5
round 2
TomAugspurger Sep 12, 2019
f61b5e3
Round 3
TomAugspurger Sep 13, 2019
238db69
round 4
TomAugspurger Sep 13, 2019
7448795
remove hdf
TomAugspurger Sep 13, 2019
f285272
some more
TomAugspurger Sep 13, 2019
b6fb1aa
cleanup
TomAugspurger Sep 13, 2019
c476b21
note
TomAugspurger Sep 13, 2019
fc34fe8
fixups
TomAugspurger Sep 13, 2019
3cc4765
pickle changes
TomAugspurger Sep 13, 2019
766a2f2
pickle compat
TomAugspurger Sep 13, 2019
dd51140
skip feather
TomAugspurger Sep 13, 2019
129e89e
fixups
TomAugspurger Sep 13, 2019
5b711c6
Merge remote-tracking branch 'upstream/master' into remove-sparse
TomAugspurger Sep 13, 2019
075bfd2
cleanups
TomAugspurger Sep 13, 2019
2b58e53
black
TomAugspurger Sep 13, 2019
413347f
to_sparse docs
TomAugspurger Sep 13, 2019
d5828e3
doc note
TomAugspurger Sep 13, 2019
047773e
rm sparse frame
TomAugspurger Sep 13, 2019
2d8d195
rm sparse series
TomAugspurger Sep 13, 2019
fa508c1
docs
TomAugspurger Sep 13, 2019
9b61370
doc
TomAugspurger Sep 13, 2019
0c530ae
remove new pickle
TomAugspurger Sep 13, 2019
5d55a49
Update v0.25.0.rst
TomAugspurger Sep 13, 2019
58b848a
Update v0.25.0.rst
TomAugspurger Sep 13, 2019
7a7e2d3
Merge remote-tracking branch 'upstream/master' into remove-sparse
TomAugspurger Sep 16, 2019
f1afc8f
added new legacy pickle files
TomAugspurger Sep 16, 2019
7742e36
Merge remote-tracking branch 'upstream/master' into remove-sparse
TomAugspurger Sep 17, 2019
008931a
shim
TomAugspurger Sep 17, 2019
04bf466
shim
TomAugspurger Sep 17, 2019
a4a21ae
revert io changes
TomAugspurger Sep 17, 2019
a8b0d65
warning for sparse
TomAugspurger Sep 17, 2019
77b7da3
Fixup typing
TomAugspurger Sep 17, 2019
d265ba9
format
TomAugspurger Sep 17, 2019
c2a9514
fixup typing
TomAugspurger Sep 17, 2019
0c02b2a
0.24.0 todo
TomAugspurger Sep 17, 2019
7 changes: 0 additions & 7 deletions doc/source/reference/frame.rst
Original file line number Diff line number Diff line change
Expand Up @@ -361,10 +361,3 @@ Serialization / IO / conversion
DataFrame.to_string
DataFrame.to_clipboard
DataFrame.style

Sparse
~~~~~~
.. autosummary::
:toctree: api/

SparseDataFrame.to_coo
10 changes: 0 additions & 10 deletions doc/source/reference/series.rst
Expand Up @@ -581,13 +581,3 @@ Serialization / IO / conversion
Series.to_string
Series.to_clipboard
Series.to_latex


Sparse
------

.. autosummary::
:toctree: api/

SparseSeries.to_coo
SparseSeries.from_coo
20 changes: 5 additions & 15 deletions doc/source/user_guide/sparse.rst
Expand Up @@ -6,12 +6,6 @@
Sparse data structures
**********************

.. note::

``SparseSeries`` and ``SparseDataFrame`` have been deprecated. Their purpose
is served equally well by a :class:`Series` or :class:`DataFrame` with
sparse values. See :ref:`sparse.migration` for tips on migrating.

Pandas provides data structures for efficiently storing sparse data.
These are not necessarily sparse in the typical "mostly 0". Rather, you can view these
objects as being "compressed" where any data matching a specific value (``NaN`` / missing value, though any value
Expand Down Expand Up @@ -168,6 +162,11 @@ the correct dense result.
Migrating
---------

.. note::

``SparseSeries`` and ``SparseDataFrame`` were removed in pandas 1.0.0. This migration
guide is present to aid in migrating from previous versions.

In older versions of pandas, the ``SparseSeries`` and ``SparseDataFrame`` classes (documented below)
were the preferred way to work with sparse data. With the advent of extension arrays, these subclasses
are no longer needed. Their purpose is better served by using a regular Series or DataFrame with
Expand Down Expand Up @@ -366,12 +365,3 @@ row and columns coordinates of the matrix. Note that this will consume a signifi

ss_dense = pd.Series.sparse.from_coo(A, dense_index=True)
ss_dense


.. _sparse.subclasses:

Sparse subclasses
-----------------

The :class:`SparseSeries` and :class:`SparseDataFrame` classes are deprecated. Visit their
API pages for usage.
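
For readers following the migration note in this file, here is a minimal sketch of what "a regular Series with sparse values" can look like in practice. It assumes a pandas version that exposes `pd.arrays.SparseArray` and the `.sparse` accessor; the sample values are made up for illustration.

```python
import numpy as np
import pandas as pd

# A mostly-missing Series stored with sparse values (NaN is the fill value).
s = pd.Series(pd.arrays.SparseArray([np.nan, 1.0, np.nan, 2.0, np.nan]))

# The dtype carries both the value type and the fill value.
print(s.dtype)

# Sparse-specific attributes live on the .sparse accessor, not on a subclass.
print(s.sparse.npoints)     # number of stored (non-fill) values
print(s.sparse.density)     # stored values divided by total length
print(s.sparse.fill_value)  # the value that is not stored explicitly

# Convert back to an ordinary dense Series when needed.
dense = s.sparse.to_dense()
print(dense.dtype)
```

The point of the sketch is that the container stays a plain `Series`; only its values are sparse.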
2 changes: 2 additions & 0 deletions doc/source/whatsnew/v1.0.0.rst
Expand Up @@ -78,6 +78,8 @@ Deprecations

Removal of prior version deprecations/changes
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

- Removed ``SparseSeries`` and ``SparseDataFrame`` (:issue:``)
- Removed the previously deprecated :meth:`Series.get_value`, :meth:`Series.set_value`, :meth:`DataFrame.get_value`, :meth:`DataFrame.set_value` (:issue:`17739`)
- Changed the default value of `inplace` in :meth:`DataFrame.set_index` and :meth:`Series.set_axis`. It now defaults to False (:issue:`27600`)
- :meth:`pandas.Series.str.cat` now defaults to aligning ``others``, using ``join='left'`` (:issue:`27611`)
Expand Down
2 changes: 0 additions & 2 deletions pandas/__init__.py
Expand Up @@ -116,8 +116,6 @@

from pandas.core.sparse.api import (
SparseArray,
SparseDataFrame,
SparseSeries,
SparseDtype,
)

Expand Down
5 changes: 1 addition & 4 deletions pandas/_typing.py
Expand Up @@ -12,13 +12,10 @@
from pandas.core.dtypes.dtypes import ExtensionDtype # noqa: F401
from pandas.core.indexes.base import Index # noqa: F401
from pandas.core.series import Series # noqa: F401
from pandas.core.sparse.series import SparseSeries # noqa: F401
from pandas.core.generic import NDFrame # noqa: F401


AnyArrayLike = TypeVar(
"AnyArrayLike", "ExtensionArray", "Index", "Series", "SparseSeries", np.ndarray
)
AnyArrayLike = TypeVar("AnyArrayLike", "ExtensionArray", "Index", "Series", np.ndarray)
ArrayLike = TypeVar("ArrayLike", "ExtensionArray", np.ndarray)
DatetimeLikeScalar = TypeVar("DatetimeLikeScalar", "Period", "Timestamp", "Timedelta")
Dtype = Union[str, np.dtype, "ExtensionDtype"]
Expand Down
22 changes: 11 additions & 11 deletions pandas/core/arrays/sparse.py
Expand Up @@ -43,7 +43,6 @@
ABCIndexClass,
ABCSeries,
ABCSparseArray,
ABCSparseSeries,
)
from pandas.core.dtypes.missing import isna, na_value_for_dtype, notna

Expand Down Expand Up @@ -607,7 +606,7 @@ def __init__(
if fill_value is None and isinstance(dtype, SparseDtype):
fill_value = dtype.fill_value

if isinstance(data, (type(self), ABCSparseSeries)):
if isinstance(data, type(self)):
# disable normal inference on dtype, sparse_index, & fill_value
if sparse_index is None:
sparse_index = data.sp_index
Expand Down Expand Up @@ -1969,7 +1968,7 @@ def _delegate_method(self, name, *args, **kwargs):
@classmethod
def from_coo(cls, A, dense_index=False):
"""
Create a SparseSeries from a scipy.sparse.coo_matrix.
Create a Series with sparse values from a scipy.sparse.coo_matrix.

Parameters
----------
Expand All @@ -1982,7 +1981,8 @@ def from_coo(cls, A, dense_index=False):

Returns
-------
s : SparseSeries
s : Series
A Series with sparse values.

Examples
--------
Expand All @@ -1996,7 +1996,7 @@ def from_coo(cls, A, dense_index=False):
matrix([[ 0., 0., 1., 2.],
[ 3., 0., 0., 0.],
[ 0., 0., 0., 0.]])
>>> ss = pd.SparseSeries.from_coo(A)
>>> ss = pd.Series.sparse.from_coo(A)
>>> ss
0 2 1
3 2
Expand All @@ -2009,14 +2009,14 @@ def from_coo(cls, A, dense_index=False):
from pandas.core.sparse.scipy_sparse import _coo_to_sparse_series
from pandas import Series

result = _coo_to_sparse_series(A, dense_index=dense_index, sparse_series=False)
result = _coo_to_sparse_series(A, dense_index=dense_index)
result = Series(result.array, index=result.index, copy=False)

return result

def to_coo(self, row_levels=(0,), column_levels=(1,), sort_labels=False):
"""
Create a scipy.sparse.coo_matrix from a SparseSeries with MultiIndex.
Create a scipy.sparse.coo_matrix from a Series with MultiIndex.

Use row_levels and column_levels to determine the row and column
coordinates respectively. row_levels and column_levels are the names
Expand Down Expand Up @@ -2046,10 +2046,10 @@ def to_coo(self, row_levels=(0,), column_levels=(1,), sort_labels=False):
(2, 1, 'b', 0),
(2, 1, 'b', 1)],
names=['A', 'B', 'C', 'D'])
>>> ss = s.to_sparse()
>>> A, rows, columns = ss.to_coo(row_levels=['A', 'B'],
column_levels=['C', 'D'],
sort_labels=True)
>>> ss = s.astype("Sparse")
>>> A, rows, columns = ss.sparse.to_coo(row_levels=['A', 'B'],
... column_levels=['C', 'D'],
... sort_labels=True)
>>> A
<3x4 sparse matrix of type '<class 'numpy.float64'>'
with 3 stored elements in COOrdinate format>
Expand Down
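
The updated docstrings route both conversions through the `Series.sparse` accessor. A small round-trip sketch follows, assuming SciPy is installed; the matrix contents are illustrative only, and the default `row_levels=(0,)` / `column_levels=(1,)` from the signature above are used.

```python
import pandas as pd
from scipy import sparse

# Build a 3x4 COO matrix with three stored values (illustrative data).
A = sparse.coo_matrix(
    ([3.0, 1.0, 2.0], ([1, 0, 0], [0, 2, 3])), shape=(3, 4)
)

# Series with sparse values, indexed by a (row, column) MultiIndex.
ss = pd.Series.sparse.from_coo(A)
print(ss)

# Back to a COO matrix; the labels used for rows and columns are returned too.
B, rows, columns = ss.sparse.to_coo()
print(B.nnz, rows, columns)
```

Note that without `dense_index=True` the index only contains the coordinates of stored values, so the matrix coming back from `to_coo` is built from the labels that are present rather than from the original shape.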
17 changes: 1 addition & 16 deletions pandas/core/dtypes/common.py
Expand Up @@ -273,8 +273,6 @@ def is_sparse(arr):

See Also
--------
DataFrame.to_sparse : Convert DataFrame to a SparseDataFrame.
Series.to_sparse : Convert Series to SparseSeries.
Series.to_dense : Return dense representation of a Series.

Examples
Expand All @@ -283,7 +281,7 @@ def is_sparse(arr):

>>> is_sparse(pd.SparseArray([0, 0, 1, 0]))
True
>>> is_sparse(pd.SparseSeries([0, 0, 1, 0]))
>>> is_sparse(pd.Series(pd.SparseArray([0, 0, 1, 0])))
True

Returns `False` if the parameter is not sparse.
Expand All @@ -300,14 +298,6 @@ def is_sparse(arr):
False

Returns `False` if the parameter has more than one dimension.

>>> df = pd.SparseDataFrame([389., 24., 80.5, np.nan],
columns=['max_speed'],
index=['falcon', 'parrot', 'lion', 'monkey'])
>>> is_sparse(df)
False
>>> is_sparse(df.max_speed)
True
"""
from pandas.core.arrays.sparse import SparseDtype

Expand Down Expand Up @@ -340,8 +330,6 @@ def is_scipy_sparse(arr):
True
>>> is_scipy_sparse(pd.SparseArray([1, 2, 3]))
False
>>> is_scipy_sparse(pd.SparseSeries([1, 2, 3]))
False
"""

global _is_scipy_sparse
Expand Down Expand Up @@ -1715,9 +1703,6 @@ def is_extension_type(arr):
True
>>> is_extension_type(pd.SparseArray([1, 2, 3]))
True
>>> is_extension_type(pd.SparseSeries([1, 2, 3]))
True
>>>
>>> from scipy.sparse import bsr_matrix
>>> is_extension_type(bsr_matrix([1, 2, 3]))
False
Expand Down
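
With the subclasses gone, "is this sparse?" becomes a question about dtypes rather than container types. The sketch below uses a hypothetical helper, `has_sparse_values`, which is not part of pandas; it checks dtypes directly with `isinstance(..., pd.SparseDtype)` and treats a DataFrame as sparse only if every column is.

```python
import pandas as pd


def has_sparse_values(obj):
    """Hypothetical helper: True if a Series, or every column of a
    non-empty DataFrame, has a SparseDtype."""
    if isinstance(obj, pd.DataFrame):
        return len(obj.columns) > 0 and all(
            isinstance(dt, pd.SparseDtype) for dt in obj.dtypes
        )
    return isinstance(obj.dtype, pd.SparseDtype)


sparse_s = pd.Series(pd.arrays.SparseArray([0, 0, 1, 0]))
dense_s = pd.Series([0, 0, 1, 0])

print(has_sparse_values(sparse_s))                                 # True
print(has_sparse_values(dense_s))                                  # False
print(has_sparse_values(pd.DataFrame({"a": sparse_s})))            # True
print(has_sparse_values(pd.DataFrame({"a": sparse_s, "b": dense_s})))  # False
```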
7 changes: 1 addition & 6 deletions pandas/core/dtypes/generic.py
Expand Up @@ -52,12 +52,7 @@ def _check(cls, inst):

ABCSeries = create_pandas_abc_type("ABCSeries", "_typ", ("series",))
ABCDataFrame = create_pandas_abc_type("ABCDataFrame", "_typ", ("dataframe",))
ABCSparseDataFrame = create_pandas_abc_type(
"ABCSparseDataFrame", "_subtyp", ("sparse_frame",)
)
ABCSparseSeries = create_pandas_abc_type(
"ABCSparseSeries", "_subtyp", ("sparse_series", "sparse_time_series")
)

ABCSparseArray = create_pandas_abc_type(
"ABCSparseArray", "_subtyp", ("sparse_array", "sparse_series")
)
Expand Down
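
For context on what is being deleted here: these `ABC*` types answer `isinstance` by inspecting a marker attribute (`_typ` or `_subtyp`) instead of the class hierarchy. The following is a simplified, standalone sketch of that idea, not pandas' actual implementation; `make_attr_abc`, `FakeSeries`, and `FakeFrame` are hypothetical names.

```python
def make_attr_abc(name, attr, allowed):
    """Build a class whose isinstance() check looks at an attribute value."""

    class _Meta(type):
        def __instancecheck__(cls, inst):
            # Match when the instance's marker attribute is one of `allowed`.
            return getattr(inst, attr, None) in allowed

    # Create an otherwise empty class that uses the metaclass above.
    return _Meta(name, (), {})


ABCSeriesLike = make_attr_abc("ABCSeriesLike", "_typ", ("series",))


class FakeSeries:
    _typ = "series"


class FakeFrame:
    _typ = "dataframe"


print(isinstance(FakeSeries(), ABCSeriesLike))  # True
print(isinstance(FakeFrame(), ABCSeriesLike))   # False
print(isinstance(object(), ABCSeriesLike))      # False
```

Because the check is attribute-based, removing `ABCSparseSeries` and `ABCSparseDataFrame` only requires that no remaining code imports them, which is what the other hunks in this diff take care of.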
76 changes: 0 additions & 76 deletions pandas/core/frame.py
Expand Up @@ -1925,81 +1925,6 @@ def _from_arrays(cls, arrays, columns, index, dtype=None):
mgr = arrays_to_mgr(arrays, columns, index, columns, dtype=dtype)
return cls(mgr)

def to_sparse(self, fill_value=None, kind="block"):
"""
Convert to SparseDataFrame.

.. deprecated:: 0.25.0

Implement the sparse version of the DataFrame meaning that any data
matching a specific value it's omitted in the representation.
The sparse DataFrame allows for a more efficient storage.

Parameters
----------
fill_value : float, default None
The specific value that should be omitted in the representation.
kind : {'block', 'integer'}, default 'block'
The kind of the SparseIndex tracking where data is not equal to
the fill value:

- 'block' tracks only the locations and sizes of blocks of data.
- 'integer' keeps an array with all the locations of the data.

In most cases 'block' is recommended, since it's more memory
efficient.

Returns
-------
SparseDataFrame
The sparse representation of the DataFrame.

See Also
--------
DataFrame.to_dense :
Converts the DataFrame back to the its dense form.

Examples
--------
>>> df = pd.DataFrame([(np.nan, np.nan),
... (1., np.nan),
... (np.nan, 1.)])
>>> df
0 1
0 NaN NaN
1 1.0 NaN
2 NaN 1.0
>>> type(df)
<class 'pandas.core.frame.DataFrame'>

>>> sdf = df.to_sparse() # doctest: +SKIP
>>> sdf # doctest: +SKIP
0 1
0 NaN NaN
1 1.0 NaN
2 NaN 1.0
>>> type(sdf) # doctest: +SKIP
<class 'pandas.core.sparse.frame.SparseDataFrame'>
"""
warnings.warn(
"DataFrame.to_sparse is deprecated and will be removed "
"in a future version",
FutureWarning,
stacklevel=2,
)

from pandas.core.sparse.api import SparseDataFrame

with warnings.catch_warnings():
warnings.filterwarnings("ignore", message="SparseDataFrame")
return SparseDataFrame(
self._series,
index=self.index,
columns=self.columns,
default_kind=kind,
default_fill_value=fill_value,
)

@deprecate_kwarg(old_arg_name="encoding", new_arg_name=None)
def to_stata(
self,
Expand Down Expand Up @@ -7192,7 +7117,6 @@ def join(self, other, on=None, how="left", lsuffix="", rsuffix="", sort=False):
4 K4 A4 NaN
5 K5 A5 NaN
"""
# For SparseDataFrame's benefit
return self._join_compat(
other, on=on, how=how, lsuffix=lsuffix, rsuffix=rsuffix, sort=sort
)
Expand Down
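
Callers of the removed `DataFrame.to_sparse` need a replacement. One possible approach, sketched here under the assumption that `astype` with a `pd.SparseDtype` is acceptable for the use case, reuses the example frame from the deleted docstring. The `kind` argument of the old method is not covered by this sketch.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([(np.nan, np.nan), (1.0, np.nan), (np.nan, 1.0)])

# Rough stand-in for the removed df.to_sparse(): NaN is the fill value.
sdf = df.astype(pd.SparseDtype("float64", np.nan))
print(type(sdf))           # still a plain DataFrame
print(sdf.dtypes)
print(sdf.sparse.density)  # share of explicitly stored values

# The fill value is part of the dtype, so a different one is chosen there.
sdf0 = df.fillna(0.0).astype(pd.SparseDtype("float64", 0.0))
print(sdf0.dtypes)

# Round-trip back to dense values.
back = sdf0.sparse.to_dense()
```

The result is a regular `DataFrame` whose columns have a sparse dtype, so sparse-specific operations go through `sdf.sparse` instead of methods on a subclass.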
11 changes: 0 additions & 11 deletions pandas/core/generic.py
Expand Up @@ -5575,9 +5575,6 @@ def get_ftype_counts(self):

.. deprecated:: 0.23.0

This is useful for SparseDataFrame or for DataFrames containing
sparse arrays.

Returns
-------
dtype : Series
Expand Down Expand Up @@ -5672,7 +5669,6 @@ def ftypes(self):
See Also
--------
DataFrame.dtypes: Series with just dtype information.
SparseDataFrame : Container for sparse tabular data.

Notes
-----
Expand All @@ -5688,13 +5684,6 @@ def ftypes(self):
2 float64:dense
3 float64:dense
dtype: object

>>> pd.SparseDataFrame(arr).ftypes # doctest: +SKIP
0 float64:sparse
1 float64:sparse
2 float64:sparse
3 float64:sparse
dtype: object
"""
warnings.warn(
"DataFrame.ftypes is deprecated and will "
Expand Down
7 changes: 0 additions & 7 deletions pandas/core/groupby/generic.py
Expand Up @@ -58,7 +58,6 @@
import pandas.core.indexes.base as ibase
from pandas.core.internals import BlockManager, make_block
from pandas.core.series import Series
from pandas.core.sparse.frame import SparseDataFrame

from pandas.plotting import boxplot_frame_groupby

Expand Down Expand Up @@ -258,12 +257,6 @@ def aggregate(self, func, *args, **kwargs):
result.columns.levels[0], name=self._selected_obj.columns.name
)

if isinstance(self.obj, SparseDataFrame):
# Backwards compat for groupby.agg() with sparse
# values. concat no longer converts DataFrame[Sparse]
# to SparseDataFrame, so we do it here.
result = SparseDataFrame(result._data)

if not self.as_index:
self._insert_inaxis_grouper_inplace(result)
result.index = np.arange(len(result))
Expand Down
3 changes: 0 additions & 3 deletions pandas/core/ops/__init__.py
Expand Up @@ -32,7 +32,6 @@
ABCExtensionArray,
ABCIndexClass,
ABCSeries,
ABCSparseSeries,
ABCTimedeltaArray,
ABCTimedeltaIndex,
)
Expand Down Expand Up @@ -1151,8 +1150,6 @@ def wrapper(self, other):
if isinstance(other, ABCDataFrame):
return NotImplemented
elif isinstance(other, ABCSeries):
if not isinstance(other, ABCSparseSeries):
other = other.to_sparse(fill_value=self.fill_value)
return _sparse_series_op(self, other, op, op_name)
elif is_scalar(other):
with np.errstate(all="ignore"):
Expand Down