BUG: to_timedelta() won't accept all freq units returned by infer_freq() #36769
Comments
Confirmed here with pandas 1.5.2.
The last line works, though, when replacing |
Why would you expect this to work?
I expect things to be consistent, at least within the same library. A time frequency is basically a repeated application of a time delta from a shifted start, so it's quite logical to expect to_timedelta to accept the frequency from infer_freq as a time offset. infer_freq() returns strings such as 3H, 2H, H. All of these are accepted by to_timedelta except the last, where '1H' is required. For unitary values, either the 1 should always be there (mandatory for all strings), or it should be assumed to be 1 when no multiplier is given, or both forms should be usable interchangeably and consistently. To put it another way, with this issue it's easy to write code that works perfectly on input data with a consistent multi-unit frequency such as 2H, 3H, 2T, 3T but then suddenly fails when the unitary versions are used. That seems like the definition of a 'bug' to me (if not the most critical one).
Btw, I see that the to_timedelta docs for 2.0 state: "Changed in version 2.0: Strings with units ‘M’, ‘Y’ and ‘y’ do not represent unambiguous timedelta values and will raise an exception." However, the same argument (that years and months have variable duration) applies regardless of whether they are specified with a 1 prefix or not.
False. infer_freq returns a string corresponding to a DateOffset (FWIW I'd rather it just returned the DateOffset itself, but that ship has sailed). Many DateOffsets do behave like timedeltas, in particular those that are Tick subclasses. But others, e.g. BDay, are distinct from timedeltas.
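The Tick versus non-Tick distinction described above can be checked directly. The following is only a minimal sketch: `as_timedelta` is a hypothetical helper name, not part of pandas, and it assumes the frequency string uses an alias the installed pandas version accepts.

```python
import pandas as pd
from pandas.tseries.frequencies import to_offset
from pandas.tseries.offsets import Tick


def as_timedelta(freq: str):
    """Hypothetical helper: Timedelta for Tick frequencies, None otherwise."""
    offset = to_offset(freq)  # parse the string into a DateOffset
    if isinstance(offset, Tick):
        # Tick offsets have a fixed duration, so Timedelta accepts them.
        return pd.Timedelta(offset)
    # Offsets such as BDay have no fixed duration.
    return None


print(as_timedelta("2min"))  # a Tick frequency
print(as_timedelta("B"))     # business day, not a Tick
```

Going through `to_offset` avoids string handling altogether, at the cost of the caller having to deal with the `None` case for frequencies that are not fixed durations.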
OK, I accept the point that infer_freq() can return some 'frequencies' such as BDay that can't be treated as a consistent fixed timedelta, so not all values can be passed. However, the main point remains: infer_freq returns consistent unitary offsets such as hourly as 'H', whereas to_timedelta() requires it to be specified as '1H'.
#2 seems much preferable.
[ x ] I have checked that this issue has not already been reported.
[ x ] I have confirmed this bug exists on the latest version of pandas.
(optional) I have confirmed this bug exists on the master branch of pandas.
Code Sample, a copy-pastable example
Problem description
to_timedelta() rejects bare unit strings such as 'H' or 'D', which are what infer_freq() returns, while strings such as '2H' are accepted. Even if you create the date range with a '1H' freq, date_range() reduces it to 'H'. So you are forced to special-case this and prepend a '1', which feels very clumsy.
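The special case described above might look roughly like the sketch below. This is not the original code sample from the report. `freq_to_timedelta` is a hypothetical helper, and it assumes a positive, fixed-duration frequency string. The index is built from a `pd.Timedelta` so the example does not depend on which hourly alias the installed pandas version uses.

```python
import pandas as pd


def freq_to_timedelta(freq: str) -> pd.Timedelta:
    """Hypothetical helper: convert a fixed-duration frequency string."""
    # infer_freq() omits the multiplier when it is 1, so add it back.
    if not freq[0].isdigit():
        freq = "1" + freq
    return pd.to_timedelta(freq)


idx = pd.date_range("2020-01-01", periods=5, freq=pd.Timedelta(hours=1))
freq = pd.infer_freq(idx)  # a bare unit string with no leading "1"

try:
    # Passing the inferred string straight through, as the report attempts.
    pd.to_timedelta(freq)
except ValueError as exc:
    print(f"to_timedelta({freq!r}) failed: {exc}")

step = freq_to_timedelta(freq)
print(step)
```

The helper only papers over the missing multiplier. Frequencies without a fixed duration, such as business days, would still be rejected by `to_timedelta`.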
Expected Output
to_timedelta() should treat strings like 'H' the same as '1H', so that the output of infer_freq() can be safely passed through.
Output of pd.show_versions()
INSTALLED VERSIONS
commit : 2a7d332
python : 3.6.4.final.0
python-bits : 64
OS : Darwin
OS-release : 19.6.0
Version : Darwin Kernel Version 19.6.0: Mon Aug 31 22:12:52 PDT 2020; root:xnu-6153.141.2~1/RELEASE_X86_64
machine : x86_64
processor : i386
byteorder : little
LC_ALL : None
LANG : None
LOCALE : None.None
pandas : 1.1.2
numpy : 1.19.2
pytz : 2020.1
dateutil : 2.8.1
pip : 20.2.3
setuptools : 28.8.0
Cython : None
pytest : 6.0.1
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 2.11.2
IPython : None
pandas_datareader: None
bs4 : None
bottleneck : None
fsspec : None
fastparquet : None
gcsfs : None
matplotlib : 3.3.2
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pytables : None
pyxlsb : None
s3fs : None
scipy : 1.5.2
sqlalchemy : None
tables : None
tabulate : None
xarray : None
xlrd : None
xlwt : None
numba : None