ENH: linearly spaced date_range (GH 20808) #20846

Merged 12 commits on May 3, 2018
1 change: 1 addition & 0 deletions doc/source/whatsnew/v0.23.0.txt
@@ -450,6 +450,7 @@ Other Enhancements
- Updated ``to_gbq`` and ``read_gbq`` signature and documentation to reflect changes from
the Pandas-GBQ library version 0.4.0. Adds intersphinx mapping to Pandas-GBQ
library. (:issue:`20564`)
- :func:`pandas.core.indexes.datetimes.date_range` now returns a linearly spaced DatetimeIndex if ``start``, ``stop``, and ``periods`` are specified, but ``freq`` is not. (:issue:`20808`)
Review comment (Member):

  • func:`date_range` should suffice here
  • DatetimeIndex --> ``DatetimeIndex``


.. _whatsnew_0230.api_breaking:

48 changes: 44 additions & 4 deletions pandas/core/indexes/datetimes.py
@@ -2583,13 +2583,15 @@ def _generate_regular_range(start, end, periods, freq):
return data


-def date_range(start=None, end=None, periods=None, freq='D', tz=None,
+def date_range(start=None, end=None, periods=None, freq=None, tz=None,
normalize=False, name=None, closed=None, **kwargs):
"""
Return a fixed frequency DatetimeIndex.

-Exactly two of the three parameters `start`, `end` and `periods`
-must be specified.
+Two or three of the three parameters `start`, `end` and `periods`
+must be specified. If all three parameters are specified, and `freq` is
+omitted, the resulting DatetimeIndex will have `periods` linearly spaced
+elements between `start` and `end` (closed on both sides).

Parameters
----------
@@ -2616,6 +2618,8 @@ def date_range(start=None, end=None, periods=None, freq='D', tz=None,
the 'left', 'right', or both sides (None, the default).
**kwargs
For compatibility. Has no effect on the result.
Can be used to pass arguments to `pd.to_datetime` when specifying
`start`, `end`, and `periods`, but not `freq`.

Returns
-------
@@ -2631,7 +2635,7 @@ def date_range(start=None, end=None, periods=None, freq='D', tz=None,
--------
**Specifying the values**

-The next three examples generate the same `DatetimeIndex`, but vary
+The next four examples generate the same `DatetimeIndex`, but vary
the combination of `start`, `end` and `periods`.

Specify `start` and `end`, with the default daily frequency.
@@ -2655,6 +2659,13 @@ def date_range(start=None, end=None, periods=None, freq='D', tz=None,
'2017-12-29', '2017-12-30', '2017-12-31', '2018-01-01'],
dtype='datetime64[ns]', freq='D')

Specify `start`, `end`, and `periods`; the frequency is generated
automatically (linearly spaced).

>>> pd.date_range(start='2018-04-24', end='2018-04-27', periods=3)
DatetimeIndex(['2018-04-24 00:00:00', '2018-04-25 12:00:00',
'2018-04-27 00:00:00'], freq=None)

**Other Parameters**

Changed the `freq` (frequency) to ``'M'`` (month end frequency).
@@ -2704,7 +2715,36 @@ def date_range(start=None, end=None, periods=None, freq='D', tz=None,
>>> pd.date_range(start='2017-01-01', end='2017-01-04', closed='right')
DatetimeIndex(['2017-01-02', '2017-01-03', '2017-01-04'],
dtype='datetime64[ns]', freq='D')

Declare extra parameters (kwargs) to be used with the pd.to_datetime
function that is used when all three parameters `start`, `end`, and
`periods` are declared.

>>> date_range('2018-04-24', '2018-04-27', periods=3, box=False)
array(['2018-04-24T00:00:00.000000000', '2018-04-25T12:00:00.000000000',
'2018-04-27T00:00:00.000000000'], dtype='datetime64[ns]')
"""

# See https://github.com/pandas-dev/pandas/issues/20808
Review comment (Contributor):

This needs to go in the DTI (`DatetimeIndex`) constructor itself. We already do some of this validation there, so it needs to fit in with that. Furthermore, you don't need to worry about the many other things you are repeating, e.g. `tz`, which are already handled.

if freq is None and periods is not None and start is not None \
and end is not None:
Review comment (Member):

A little bit cleaner to use `if freq is None and com._all_not_none(periods, start, end):`
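
The suggested condition can be sketched in isolation. The helper below is a hypothetical stand-in for the private `com._all_not_none` named in the comment (its real implementation does not appear in this diff), and `use_linspace` is an illustrative name, not part of the PR.

```python
def all_not_none(*args):
    """Return True only if no argument is None (hypothetical stand-in)."""
    return all(arg is not None for arg in args)


def use_linspace(start, end, periods, freq):
    """Mirror the condition that guards the linearly spaced branch."""
    return freq is None and all_not_none(periods, start, end)


print(use_linspace('2018-04-24', '2018-04-27', 3, None))  # True
print(use_linspace('2018-04-24', None, 3, None))          # False
print(use_linspace('2018-04-24', '2018-04-27', 3, 'D'))   # False
```

Compared with the chained `is not None` checks, the helper keeps the condition on one line and avoids the backslash continuation.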

if is_float(periods):
periods = int(periods)
elif not is_integer(periods):
msg = 'periods must be a number, got {periods}'
raise TypeError(msg.format(periods=periods))

start = Timestamp(start)
end = Timestamp(end)
Review comment (Member):

`start` and `end` should be normalized if `normalize=True` is passed to `date_range`.
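
One way to read this request, sketched with the standard library rather than pandas: when `normalize=True`, both endpoints are floored to midnight before any spacing is computed. `normalize_endpoints` is a hypothetical helper for illustration only and is not part of the PR.

```python
from datetime import datetime


def normalize_endpoints(start, end, normalize=False):
    """Floor both endpoints to midnight when normalize is True."""
    if normalize:
        start = start.replace(hour=0, minute=0, second=0, microsecond=0)
        end = end.replace(hour=0, minute=0, second=0, microsecond=0)
    return start, end


start, end = normalize_endpoints(datetime(2018, 4, 24, 15, 30),
                                 datetime(2018, 4, 27, 9, 45),
                                 normalize=True)
print(start, end)  # 2018-04-24 00:00:00 2018-04-27 00:00:00
```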

di = tools.to_datetime(np.linspace(start.value, end.value, periods),
**kwargs)
if name:
di.name = name
return di

if freq is None:
freq = 'D'

return DatetimeIndex(start=start, end=end, periods=periods,
freq=freq, tz=tz, normalize=normalize, name=name,
closed=closed, **kwargs)
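
The linearly spaced branch in the hunk above amounts to interpolating evenly between the two endpoints, both inclusive. A minimal standard-library sketch of that arithmetic follows. `linspace_datetimes` is a hypothetical name, and this is not the PR's implementation, which goes through `np.linspace` and `to_datetime`.

```python
from datetime import datetime


def linspace_datetimes(start, end, periods):
    """Return `periods` evenly spaced datetimes from start to end, inclusive."""
    if periods < 1:
        raise ValueError("periods must be at least 1")
    if periods == 1:
        return [start]
    span = end - start
    # Scale the whole span for each point so rounding does not accumulate.
    return [start + (span * i) / (periods - 1) for i in range(periods)]


for ts in linspace_datetimes(datetime(2018, 4, 24), datetime(2018, 4, 27), 3):
    print(ts)
# 2018-04-24 00:00:00
# 2018-04-25 12:00:00
# 2018-04-27 00:00:00
```

The three values match the docstring example for `start='2018-04-24'`, `end='2018-04-27'`, `periods=3`.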
15 changes: 15 additions & 0 deletions pandas/tests/indexes/datetimes/test_date_range.py
@@ -162,6 +162,21 @@ def test_date_range_ambiguous_arguments(self):
with tm.assert_raises_regex(ValueError, msg):
date_range(start, end, periods=10, freq='s')

def test_date_range_convenience_periods(self):
Review comment (Member):

Could you also test this with the `tz` arg specified? It would also be good to test a tz where there is a daylight saving time transition between `start` and `end`.

# GH 20808
rng = date_range('2018-04-24', '2018-04-27', periods=3)
exp = pd.DatetimeIndex(['2018-04-24 00:00:00', '2018-04-25 12:00:00',
'2018-04-27 00:00:00'], freq=None)

tm.assert_index_equal(rng, exp)

# Test if kwargs work for the to_datetime function used
rng = date_range('2018-04-24', '2018-04-27', periods=3, box=False)
exp = np.array(['2018-04-24T00:00:00', '2018-04-25T12:00:00',
'2018-04-27T00:00:00'], dtype='datetime64[ns]')

assert (rng == exp).all()
Review comment (Member):

You can use `tm.assert_numpy_array_equal` here.
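
A likely motivation, which the comment does not state and is assumed here, is that `(rng == exp).all()` compares values only and lets a dtype mismatch pass silently. The sketch below illustrates the difference with NumPy's public `np.testing.assert_array_equal` plus explicit shape and dtype checks. `assert_array_equal_strict` is a hypothetical helper, not the pandas testing function.

```python
import numpy as np


def assert_array_equal_strict(left, right):
    """Fail on a shape, dtype, or value mismatch (illustrative helper)."""
    assert left.shape == right.shape, "shape mismatch"
    assert left.dtype == right.dtype, "dtype mismatch"
    np.testing.assert_array_equal(left, right)


ns = np.array(['2018-04-24T00:00:00'], dtype='datetime64[ns]')
sec = np.array(['2018-04-24T00:00:00'], dtype='datetime64[s]')

print((ns == sec).all())           # True, although the dtypes differ
assert_array_equal_strict(ns, ns)  # passes silently
```

Calling `assert_array_equal_strict(ns, sec)` raises an `AssertionError` because the dtypes differ, which the bare `.all()` check would not catch.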


def test_date_range_businesshour(self):
idx = DatetimeIndex(['2014-07-04 09:00', '2014-07-04 10:00',
'2014-07-04 11:00',