ENH/DOC: reimplement Series delegates/accessors using descriptors
This PR implements `Series.str`, `Series.dt` and `Series.cat` as descriptors
instead of properties. This means that the API docs can refer to methods like
`Series.str.lower` instead of `StringMethods.lower` and tab-completion like
`Series.str.<tab>` also works, even on the base class.

CC jorisvandenbossche jreback
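
The descriptor approach described above can be illustrated with a minimal, pandas-free sketch. The descriptor mirrors the `AccessorProperty` class added in `pandas/core/base.py` by this commit; `ToySeries`, `TextMethods` and the `text` accessor are hypothetical names used only for illustration and are not part of pandas.

```python
class AccessorProperty(object):
    """Descriptor for accessor attributes, modelled on the class in this commit."""

    def __init__(self, accessor_cls, construct_accessor):
        self.accessor_cls = accessor_cls
        self.construct_accessor = construct_accessor
        self.__doc__ = accessor_cls.__doc__

    def __get__(self, instance, owner=None):
        if instance is None:
            # Class-level access (e.g. ToySeries.text) returns the accessor
            # class, so ToySeries.text.upper is a well-defined attribute.
            return self.accessor_cls
        # Instance-level access builds an accessor bound to that instance.
        return self.construct_accessor(instance)

    def __set__(self, instance, value):
        raise AttributeError("can't set attribute")

    def __delete__(self, instance):
        raise AttributeError("can't delete attribute")


class TextMethods(object):
    """Hypothetical accessor class, standing in for StringMethods."""

    def __init__(self, data):
        self._data = data

    def upper(self):
        return [v.upper() for v in self._data.values]


class ToySeries(object):
    """Hypothetical container, standing in for Series."""

    def __init__(self, values):
        self.values = list(values)

    def _make_text_accessor(self):
        # Validation happens at access time, as in _make_cat_accessor.
        if not all(isinstance(v, str) for v in self.values):
            raise TypeError("Can only use .text accessor with string values")
        return TextMethods(self)

    text = AccessorProperty(TextMethods, _make_text_accessor)


print(ToySeries.text)                       # the TextMethods class itself
print(ToySeries(["a", "b"]).text.upper())   # ['A', 'B']
```

Because class-level access returns the accessor class rather than a property object, documentation tools and tab-completion can resolve names such as `ToySeries.text.upper` without an instance, which is the behaviour the commit message describes for `Series.str`, `Series.dt` and `Series.cat`.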
shoyer committed Jan 22, 2015
1 parent 5fd1fbd commit b7f5775
Showing 12 changed files with 189 additions and 120 deletions.
142 changes: 68 additions & 74 deletions doc/source/api.rst
@@ -449,114 +449,106 @@ Datetimelike Properties

``Series.dt`` can be used to access the values of the series as
datetimelike and return several properties.
Due to implementation details the methods show up here as methods of the
``DatetimeProperties/PeriodProperties/TimedeltaProperties`` classes. These can be accessed like ``Series.dt.<property>``.

.. currentmodule:: pandas.tseries.common
These can be accessed like ``Series.dt.<property>``.

**Datetime Properties**

.. autosummary::
:toctree: generated/

DatetimeProperties.date
DatetimeProperties.time
DatetimeProperties.year
DatetimeProperties.month
DatetimeProperties.day
DatetimeProperties.hour
DatetimeProperties.minute
DatetimeProperties.second
DatetimeProperties.microsecond
DatetimeProperties.nanosecond
DatetimeProperties.second
DatetimeProperties.weekofyear
DatetimeProperties.dayofweek
DatetimeProperties.weekday
DatetimeProperties.dayofyear
DatetimeProperties.quarter
DatetimeProperties.is_month_start
DatetimeProperties.is_month_end
DatetimeProperties.is_quarter_start
DatetimeProperties.is_quarter_end
DatetimeProperties.is_year_start
DatetimeProperties.is_year_end
Series.dt.date
Series.dt.time
Series.dt.year
Series.dt.month
Series.dt.day
Series.dt.hour
Series.dt.minute
Series.dt.second
Series.dt.microsecond
Series.dt.nanosecond
Series.dt.second
Series.dt.weekofyear
Series.dt.dayofweek
Series.dt.weekday
Series.dt.dayofyear
Series.dt.quarter
Series.dt.is_month_start
Series.dt.is_month_end
Series.dt.is_quarter_start
Series.dt.is_quarter_end
Series.dt.is_year_start
Series.dt.is_year_end

**Datetime Methods**

.. autosummary::
:toctree: generated/

DatetimeProperties.to_period
DatetimeProperties.to_pydatetime
DatetimeProperties.tz_localize
DatetimeProperties.tz_convert
Series.dt.to_period
Series.dt.to_pydatetime
Series.dt.tz_localize
Series.dt.tz_convert

**Timedelta Properties**

.. autosummary::
:toctree: generated/

TimedeltaProperties.days
TimedeltaProperties.seconds
TimedeltaProperties.microseconds
TimedeltaProperties.nanoseconds
TimedeltaProperties.components
Series.dt.days
Series.dt.seconds
Series.dt.microseconds
Series.dt.nanoseconds
Series.dt.components

**Timedelta Methods**

.. autosummary::
:toctree: generated/

TimedeltaProperties.to_pytimedelta
Series.dt.to_pytimedelta

String handling
~~~~~~~~~~~~~~~
``Series.str`` can be used to access the values of the series as
strings and apply several methods to it. Due to implementation
details the methods show up here as methods of the
``StringMethods`` class. These can be accessed like ``Series.str.<function/property>``.

.. currentmodule:: pandas.core.strings

.. autosummary::
:toctree: generated/

StringMethods.cat
StringMethods.center
StringMethods.contains
StringMethods.count
StringMethods.decode
StringMethods.encode
StringMethods.endswith
StringMethods.extract
StringMethods.findall
StringMethods.get
StringMethods.join
StringMethods.len
StringMethods.lower
StringMethods.lstrip
StringMethods.match
StringMethods.pad
StringMethods.repeat
StringMethods.replace
StringMethods.rstrip
StringMethods.slice
StringMethods.slice_replace
StringMethods.split
StringMethods.startswith
StringMethods.strip
StringMethods.title
StringMethods.upper
StringMethods.get_dummies
strings and apply several methods to it. These can be accessed like
``Series.str.<function/property>``.

.. autosummary::
:toctree: generated/

Series.str.cat
Series.str.center
Series.str.contains
Series.str.count
Series.str.decode
Series.str.encode
Series.str.endswith
Series.str.extract
Series.str.findall
Series.str.get
Series.str.join
Series.str.len
Series.str.lower
Series.str.lstrip
Series.str.match
Series.str.pad
Series.str.repeat
Series.str.replace
Series.str.rstrip
Series.str.slice
Series.str.slice_replace
Series.str.split
Series.str.startswith
Series.str.strip
Series.str.title
Series.str.upper
Series.str.get_dummies

.. _api.categorical:

Categorical
~~~~~~~~~~~

.. currentmodule:: pandas.core.categorical

If the Series is of dtype ``category``, ``Series.cat`` can be used to change the categorical
data. This accessor is similar to the ``Series.dt`` or ``Series.str`` and has the
following usable methods and properties (all available as ``Series.cat.<method_or_property>``).
@@ -579,6 +571,8 @@ To create a Series of dtype ``category``, use ``cat = s.astype("category")``.
The following two ``Categorical`` constructors are considered API but should only be used when
adding ordering information or special categories is needed at creation time of the categorical data:

.. currentmodule:: pandas.core.categorical

.. autosummary::
:toctree: generated/

2 changes: 1 addition & 1 deletion doc/source/reshaping.rst
@@ -478,7 +478,7 @@ This function is often used along with discretization functions like ``cut``:
get_dummies(cut(values, bins))
See also :func:`Series.str.get_dummies <pandas.core.strings.StringMethods.get_dummies>`.
See also :func:`Series.str.get_dummies <pandas.Series.str.get_dummies>`.

.. versionadded:: 0.15.0

48 changes: 24 additions & 24 deletions doc/source/text.rst
@@ -204,27 +204,27 @@ Method Summary
:header: "Method", "Description"
:widths: 20, 80

:meth:`~core.strings.StringMethods.cat`,Concatenate strings
:meth:`~core.strings.StringMethods.split`,Split strings on delimiter
:meth:`~core.strings.StringMethods.get`,Index into each element (retrieve i-th element)
:meth:`~core.strings.StringMethods.join`,Join strings in each element of the Series with passed separator
:meth:`~core.strings.StringMethods.contains`,Return boolean array if each string contains pattern/regex
:meth:`~core.strings.StringMethods.replace`,Replace occurrences of pattern/regex with some other string
:meth:`~core.strings.StringMethods.repeat`,Duplicate values (``s.str.repeat(3)`` equivalent to ``x * 3``)
:meth:`~core.strings.StringMethods.pad`,"Add whitespace to left, right, or both sides of strings"
:meth:`~core.strings.StringMethods.center`,Equivalent to ``pad(side='both')``
:meth:`~core.strings.StringMethods.wrap`,Split long strings into lines with length less than a given width
:meth:`~core.strings.StringMethods.slice`,Slice each string in the Series
:meth:`~core.strings.StringMethods.slice_replace`,Replace slice in each string with passed value
:meth:`~core.strings.StringMethods.count`,Count occurrences of pattern
:meth:`~core.strings.StringMethods.startswith`,Equivalent to ``str.startswith(pat)`` for each element
:meth:`~core.strings.StringMethods.endswith`,Equivalent to ``str.endswith(pat)`` for each element
:meth:`~core.strings.StringMethods.findall`,Compute list of all occurrences of pattern/regex for each string
:meth:`~core.strings.StringMethods.match`,"Call ``re.match`` on each element, returning matched groups as list"
:meth:`~core.strings.StringMethods.extract`,"Call ``re.match`` on each element, as ``match`` does, but return matched groups as strings for convenience."
:meth:`~core.strings.StringMethods.len`,Compute string lengths
:meth:`~core.strings.StringMethods.strip`,Equivalent to ``str.strip``
:meth:`~core.strings.StringMethods.rstrip`,Equivalent to ``str.rstrip``
:meth:`~core.strings.StringMethods.lstrip`,Equivalent to ``str.lstrip``
:meth:`~core.strings.StringMethods.lower`,Equivalent to ``str.lower``
:meth:`~core.strings.StringMethods.upper`,Equivalent to ``str.upper``
:meth:`~Series.str.cat`,Concatenate strings
:meth:`~Series.str.split`,Split strings on delimiter
:meth:`~Series.str.get`,Index into each element (retrieve i-th element)
:meth:`~Series.str.join`,Join strings in each element of the Series with passed separator
:meth:`~Series.str.contains`,Return boolean array if each string contains pattern/regex
:meth:`~Series.str.replace`,Replace occurrences of pattern/regex with some other string
:meth:`~Series.str.repeat`,Duplicate values (``s.str.repeat(3)`` equivalent to ``x * 3``)
:meth:`~Series.str.pad`,"Add whitespace to left, right, or both sides of strings"
:meth:`~Series.str.center`,Equivalent to ``pad(side='both')``
:meth:`~Series.str.wrap`,Split long strings into lines with length less than a given width
:meth:`~Series.str.slice`,Slice each string in the Series
:meth:`~Series.str.slice_replace`,Replace slice in each string with passed value
:meth:`~Series.str.count`,Count occurrences of pattern
:meth:`~Series.str.startswith`,Equivalent to ``str.startswith(pat)`` for each element
:meth:`~Series.str.endswith`,Equivalent to ``str.endswith(pat)`` for each element
:meth:`~Series.str.findall`,Compute list of all occurrences of pattern/regex for each string
:meth:`~Series.str.match`,"Call ``re.match`` on each element, returning matched groups as list"
:meth:`~Series.str.extract`,"Call ``re.match`` on each element, as ``match`` does, but return matched groups as strings for convenience."
:meth:`~Series.str.len`,Compute string lengths
:meth:`~Series.str.strip`,Equivalent to ``str.strip``
:meth:`~Series.str.rstrip`,Equivalent to ``str.rstrip``
:meth:`~Series.str.lstrip`,Equivalent to ``str.lstrip``
:meth:`~Series.str.lower`,Equivalent to ``str.lower``
:meth:`~Series.str.upper`,Equivalent to ``str.upper``
2 changes: 2 additions & 0 deletions doc/source/whatsnew/v0.16.0.txt
@@ -107,6 +107,8 @@ Enhancements

- ``Timedelta`` will now accept nanoseconds keyword in constructor (:issue:`9273`)

- Added auto-complete for ``Series.str.<tab>``, ``Series.dt.<tab>`` and ``Series.cat.<tab>`` (:issue:`9322`)

Performance
~~~~~~~~~~~

22 changes: 22 additions & 0 deletions pandas/core/base.py
@@ -166,6 +166,28 @@ def f(self, *args, **kwargs):
if not hasattr(cls, name):
setattr(cls,name,f)


class AccessorProperty(object):
    """Descriptor for implementing accessor properties like Series.str
    """
    def __init__(self, accessor_cls, construct_accessor):
        self.accessor_cls = accessor_cls
        self.construct_accessor = construct_accessor
        self.__doc__ = accessor_cls.__doc__

    def __get__(self, instance, owner=None):
        if instance is None:
            # this ensures that Series.str.<method> is well defined
            return self.accessor_cls
        return self.construct_accessor(instance)

    def __set__(self, instance, value):
        raise AttributeError("can't set attribute")

    def __delete__(self, instance):
        raise AttributeError("can't delete attribute")


class FrozenList(PandasObject, list):

"""
2 changes: 1 addition & 1 deletion pandas/core/categorical.py
@@ -829,7 +829,7 @@ def searchsorted(self, v, side='left', sorter=None):
array([3, 4]) # eggs before milk
>>> x = pd.Categorical(['apple', 'bread', 'bread', 'cheese', 'milk', 'donuts' ])
>>> x.searchsorted(['bread', 'eggs'], side='right', sorter=[0, 1, 2, 3, 5, 4])
array([3, 5]) # eggs after donuts, after switching milk and donuts
array([3, 5]) # eggs after donuts, after switching milk and donuts
"""
if not self.ordered:
raise ValueError("searchsorted requires an ordered Categorical.")
30 changes: 18 additions & 12 deletions pandas/core/series.py
@@ -27,7 +27,10 @@
from pandas.core.indexing import _check_bool_indexer, _maybe_convert_indices
from pandas.core import generic, base
from pandas.core.internals import SingleBlockManager
from pandas.core.categorical import Categorical
from pandas.core.categorical import Categorical, CategoricalAccessor
from pandas.core.strings import StringMethods
from pandas.tseries.common import (maybe_to_datetimelike,
CombinedDatetimelikeProperties)
from pandas.tseries.index import DatetimeIndex
from pandas.tseries.tdi import TimedeltaIndex
from pandas.tseries.period import PeriodIndex, Period
@@ -2452,11 +2455,6 @@ def asof(self, where):
new_values = com.take_1d(values, locs)
return self._constructor(new_values, index=where).__finalize__(self)

@cache_readonly
def str(self):
from pandas.core.strings import StringMethods
return StringMethods(self)

def to_timestamp(self, freq=None, how='start', copy=True):
"""
Cast to datetimeindex of timestamps, at *beginning* of period
@@ -2502,27 +2500,35 @@ def to_period(self, freq=None, copy=True):
return self._constructor(new_values,
index=new_index).__finalize__(self)

    #------------------------------------------------------------------------------
    # string methods

    def _make_str_accessor(self):
        return StringMethods(self)

    str = base.AccessorProperty(StringMethods, _make_str_accessor)

    #------------------------------------------------------------------------------
    # Datetimelike delegation methods

    @cache_readonly
    def dt(self):
        from pandas.tseries.common import maybe_to_datetimelike
    def _make_dt_accessor(self):
        try:
            return maybe_to_datetimelike(self)
        except (Exception):
            raise TypeError("Can only use .dt accessor with datetimelike values")

    dt = base.AccessorProperty(CombinedDatetimelikeProperties, _make_dt_accessor)

    #------------------------------------------------------------------------------
    # Categorical methods

    @cache_readonly
    def cat(self):
        from pandas.core.categorical import CategoricalAccessor
    def _make_cat_accessor(self):
        if not com.is_categorical_dtype(self.dtype):
            raise TypeError("Can only use .cat accessor with a 'category' dtype")
        return CategoricalAccessor(self.values, self.index)

    cat = base.AccessorProperty(CategoricalAccessor, _make_cat_accessor)

Series._setup_axes(['index'], info_axis=0, stat_axis=0,
aliases={'rows': 0})
Series._add_numeric_operations()