[BUG] Unlike its pandas counterpart, cudf.date_range doesn't include the end if it's specified #15116

Closed
amanlai opened this issue Feb 22, 2024 · 4 comments

amanlai commented Feb 22, 2024

If I create a DatetimeIndex object using cudf.date_range() by passing start, end and freq parameters:

cudf.date_range('2020-01-01', '2020-01-05', freq='D')

DatetimeIndex(['2020-01-01', '2020-01-02', '2020-01-03', '2020-01-04'], 
              dtype='datetime64[ns]', freq='D')

As you can see, the result is an object with length 4. The same code in pandas results in an object with length 5:

pd.date_range('2020-01-01', '2020-01-05', freq='D')

DatetimeIndex(['2020-01-01', '2020-01-02', '2020-01-03', '2020-01-04', '2020-01-05'],
              dtype='datetime64[ns]', freq='D')

Is it possible to make it consistent with pandas' date_range() so that the result includes both ends? Because the closed parameter is nonfunctional at the moment, making cudf.date_range() produce the same output as its pandas counterpart requires adding one freq step to end, which takes a fair amount of work: converting the string to a datetime, converting freq to a timedelta, and adding the two to define a new end.
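The workaround described above can be sketched roughly as follows. This is a minimal illustration using only the standard library: `shifted_end` and the `_FIXED_FREQS` table are hypothetical names, not part of cuDF or pandas, and the table covers only a few fixed-width frequencies. It applies only to cuDF versions that show the end-exclusive behavior reported here.

```python
from datetime import datetime, timedelta

# Hypothetical lookup for a few fixed-width frequency aliases.
# Not a cuDF or pandas API; extend as needed.
_FIXED_FREQS = {
    "D": timedelta(days=1),
    "h": timedelta(hours=1),
    "min": timedelta(minutes=1),
    "s": timedelta(seconds=1),
}


def shifted_end(end: str, freq: str) -> datetime:
    """Return `end` (an ISO 8601 string) moved forward by one `freq` step."""
    return datetime.fromisoformat(end) + _FIXED_FREQS[freq]


new_end = shifted_end("2020-01-05", "D")
print(new_end)  # 2020-01-06 00:00:00

# Assumption: on a cuDF version that excludes `end`, passing the shifted
# value should yield the five dates that pandas returns:
# cudf.date_range("2020-01-01", new_end, freq="D")
```

On a version where the end point is already included, shifting it this way would produce one extra element, so the helper should be dropped once the behavior matches pandas.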

Anyway, is it possible to make it clear in the documentation that this is different from its pandas counterpart?

I didn't know how to tag this since it's not really a bug. It just differs from the pandas API, which resulted in a subtle bug in my code.

@amanlai amanlai added the bug Something isn't working label Feb 22, 2024
@shwina shwina self-assigned this Feb 22, 2024
Contributor

shwina commented Feb 22, 2024

I think this is a legitimate bug. Thanks for reporting!

@shwina shwina added the Python Affects Python cuDF API. label Feb 22, 2024
Contributor

wence- commented Feb 27, 2024

Also #12133

rapids-bot bot pushed a commit that referenced this issue Apr 3, 2024
Precursor to #15116

* Aligns `date_range` signature with pandas, _technically_ an API breakage with `closed` changing defaults even though it still isn't supported
* Copies pandas behavior of allowing `date_range` with just two of `start/end/periods`
* Supports `tz` arg now

Authors:
  - Matthew Roeschke (https://github.com/mroeschke)

Approvers:
  - GALI PREM SAGAR (https://github.com/galipremsagar)

URL: #15139
Contributor

Matt711 commented Sep 16, 2024

Was this closed by #16277?

cc. @mroeschke

@mroeschke
Contributor

> Was this closed by #16277?
>
> cc. @mroeschke

Appears so! Closing
