[FEA] Series dayofyear / day_of_year #8625

Closed
beckernick opened this issue Jun 29, 2021 · 0 comments · Fixed by #8626
Labels: feature request, Python


While exploring our current datetime functionality in libcudf, I noticed that we have a libcudf implementation for day_of_year that isn't accessible in cuDF Python (and last_day_of_month, which I'll file separately for dt.is_month_end). We should plumb this up to Python and make it accessible through the Series.dt accessor.

```cpp
std::unique_ptr<cudf::column> day_of_year(
  cudf::column_view const& column,
  rmm::mr::device_memory_resource* mr = rmm::mr::get_current_device_resource());
```

```python
import pandas as pd
s = pd.Series(["2021-01-08", "2021-06-28", "2020-03-09", "2021-06-30"], dtype="datetime64[ms]")
print(s.dt.dayofyear)
print(s.dt.day_of_year)
```

```
0      8
1    179
2     69
3    181
dtype: int64
0      8
1    179
2     69
3    181
dtype: int64
```
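
As a sanity check on the expected values, here is a minimal pure-Python sketch of the day-of-year calculation. It is not the libcudf kernel; `day_of_year`, `is_leap_year`, and `_DAYS_IN_MONTH` are hypothetical helper names used only for illustration, and the sketch assumes the proleptic Gregorian calendar that Python's `datetime` uses.

```python
from datetime import date

# Days in each month of a non-leap year (index 0 = January).
_DAYS_IN_MONTH = (31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31)


def is_leap_year(year: int) -> bool:
    """Gregorian leap-year rule."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


def day_of_year(d: date) -> int:
    """Return the 1-based ordinal day within d's year (hypothetical helper)."""
    days_before_month = sum(_DAYS_IN_MONTH[: d.month - 1])
    # Add the leap day only for dates after February in a leap year.
    if d.month > 2 and is_leap_year(d.year):
        days_before_month += 1
    return days_before_month + d.day


# Same four dates as the pandas example.
dates = [date(2021, 1, 8), date(2021, 6, 28), date(2020, 3, 9), date(2021, 6, 30)]
results = [day_of_year(d) for d in dates]
print(results)  # [8, 179, 69, 181]
```

The third date falls in a leap year (2020), so it exercises the February 29 branch: 31 + 29 + 9 = 69.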
@beckernick beckernick added feature request New feature or request Python Affects Python cuDF API. Cython labels Jun 29, 2021
@beckernick beckernick self-assigned this Jun 29, 2021
rapids-bot bot pushed a commit that referenced this issue Jun 30, 2021
…Index (#8626)

This PR:
- [x] Adds `[Series/DatetimeColumn/DatetimeIndex].dt.dayofyear` and `day_of_year`
- [x] Updates the existing pytests to include dayofyear/day_of_year
- [x] Includes docstrings in new methods
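
One way to expose both spellings on an accessor is to define a single property and bind a second name to it. The sketch below illustrates that pattern in plain Python; `DatetimeAccessorSketch` is a hypothetical stand-in, not cuDF's actual class, and the list comprehension merely stands in for whatever call dispatches to libcudf.

```python
from datetime import date


class DatetimeAccessorSketch:
    """Hypothetical stand-in for a `.dt`-style accessor; not cuDF's real class."""

    def __init__(self, values):
        self._values = list(values)

    @property
    def dayofyear(self):
        # Stand-in for the call that would dispatch to the underlying kernel.
        return [v.timetuple().tm_yday for v in self._values]

    # Bind the second spelling to the same property object.
    day_of_year = dayofyear


dt = DatetimeAccessorSketch([date(2021, 1, 8), date(2020, 3, 9)])
print(dt.dayofyear)    # [8, 69]
print(dt.day_of_year)  # [8, 69]
```

Because both names refer to one property object, the two spellings cannot drift apart in behavior.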

```python
import cudf
import pandas as pd

s = pd.Series(["2021-01-08", "2021-06-28", "2020-03-09", "2021-06-30"], dtype="datetime64[ms]")
s = s.repeat(25000)  # 100K elements
gs = cudf.from_pandas(s)

# %timeit is an IPython magic: run this in a notebook or IPython session.
%timeit gs.dt.dayofyear
# 39 µs ± 169 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
%timeit s.dt.dayofyear
# 6.49 ms ± 39.1 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
```

This closes #8625

Authors:
  - Nick Becker (https://github.com/beckernick)

Approvers:
  - GALI PREM SAGAR (https://github.com/galipremsagar)

URL: #8626
@beckernick beckernick added this to the Time Series Analysis milestone Jul 14, 2021