[FEA] dt.date to extract date from a timestamp #7880
Comments
@bschifferer the @brandon-b-miller any ideas on a workaround to this for the time being? |
I think that is fine as long as the returned date (in |
There are a couple of approaches to making this happen, but none that I think are implemented today.
My suspicion is that we need to implement this natively or leverage |
This issue has been labeled |
Bumping this issue as the need for |
@charlesbluca apologies for missing this request. What's the expected return type here? Pandas returns an |
For dask-sql's purposes, ideally any column that can later be cast to a That being said, the usage in dask-sql is for a workaround to address the fact that cuDF and pandas (for timezoned datetime columns) don't support |
Your workaround may be the best approach to this for now. Perhaps pandas behavior here is something we simply don't want to enable, like iteration? cc @shwina |
Correct, although if all you care about is removing all the resolutions smaller than
In [42]: s
Out[42]:
0   2001-01-01 12:12:12
1   2001-01-02 01:02:03
dtype: datetime64[ns]
In [43]: s.dt.floor("D")
Out[43]:
0   2001-01-01
1   2001-01-02
dtype: datetime64[ns]
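To make the difference between the two approaches concrete, here is a minimal sketch in plain pandas (not cuDF), reusing the two timestamps from the session above. It only illustrates the return types being discussed: flooring keeps a datetime dtype, while .dt.date produces an object-dtype Series of Python datetime.date values. The variable names are illustrative.

```python
import datetime

import pandas as pd

# Same two timestamps as in the session above.
s = pd.Series(pd.to_datetime(["2001-01-01 12:12:12", "2001-01-02 01:02:03"]))

# Flooring to day resolution keeps a datetime dtype; the time part becomes midnight.
floored = s.dt.floor("D")

# pandas' .dt.date returns an object-dtype Series holding datetime.date values.
dates = s.dt.date

print(floored.dtype)          # a datetime64 dtype
print(dates.dtype)            # object
print(type(dates.iloc[0]))    # datetime.date
```

Because the floored column is still a datetime column, date arithmetic and casts continue to work on it, which is why it can serve as a stand-in when only the day resolution matters.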
I'm going to close this as a |
Now that pandas returns a proper Series in the pyarrow-dtypes case:
would you be open to reconsidering? We're xfailing some tests in Narwhals for cuDF because of this, though of course our preference would be to not have to do so |
I'd be open to reconsidering, but before this goes anywhere I think we need to figure out our story w.r.t. pandas and pyarrow dtypes going forward, especially in light of discussions like pandas-dev/pandas#57073. Aside from the strings-focused questions, cudf.pandas has raised a number of questions for me around the continued viability of supporting two different type systems in pandas (pyarrow-backed and not-pyarrow-backed) simultaneously. Given the differential level of support for these two in pandas and how tightly we couple ourselves to trying to reproduce pandas behaviors, I'm pretty bearish on trying to support both and would probably want to sort something out in that direction before trying to resolve specific issues like this one. |
Is your feature request related to a problem? Please describe.
As a user, I want to be able to extract the date from a timestamp.
Pandas uses .dt.date, for example (df_train['timestamp'].to_pandas()).dt.date
The reason is that my timestamp contains year-month-day and hour-minute.
I want to use the DATE as the index and apply shift functions based on differences in days.
Therefore, I want to extract only the date and not include the hour or minute.
Describe the solution you'd like
Similar to pandas, support .dt.date
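As a rough illustration of the use case described above, the sketch below uses plain pandas with made-up data. The column names (timestamp, value, day, prev_day_value) are hypothetical and not taken from the original report. It truncates each timestamp to the day with .dt.floor("D") rather than .dt.date, uses the day as the index, and shifts by one calendar day rather than by one row. Whether the same calls work unchanged on cuDF is not claimed here.

```python
import pandas as pd

# Hypothetical data: timestamps with hour and minute, one gap on 2021-01-03.
df = pd.DataFrame(
    {
        "timestamp": pd.to_datetime(
            ["2021-01-01 09:30", "2021-01-02 14:45", "2021-01-04 08:15"]
        ),
        "value": [10, 20, 30],
    }
)

# Drop hour and minute but keep a datetime dtype so day arithmetic still works.
df["day"] = df["timestamp"].dt.floor("D")
daily = df.set_index("day")["value"]

# Shift the index forward by one calendar day (not by one row).
shifted = daily.shift(1, freq="D")

# Look up, for each day, the value observed on the previous calendar day.
# Days whose predecessor has no row get NaN.
prev = shifted.reindex(daily.index)
df["prev_day_value"] = prev.to_numpy()

print(df[["day", "value", "prev_day_value"]])
```

Only 2021-01-02 has a row for the preceding calendar day, so it is the only row that receives a previous-day value; the other two rows are NaN.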