pandas.DataFrame.where not replacing NaTs properly #15613
Comments
here's a copy-pastable example
note that [15] we don't allow; [16] is not in-place but the same operation. The issue is that when you reconstruct A we always infer to datetimes, IOW, we don't allow np.nan, None or any null value to exist in a datetime dtype; instead these are coerced to @grechut why exactly are you doing this and what is the utility? The entire issue is that setting things to
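The coercion described in this comment can be seen in a small sketch. The frame below is hypothetical (the column name A is borrowed from the comment, the values are made up); it only illustrates that null-like values placed in a datetime column end up stored as NaT.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one real timestamp and two different null markers.
df = pd.DataFrame({"A": pd.to_datetime(["2017-01-01", None, np.nan])})

# The column has a datetime dtype, so neither None nor np.nan survives as-is.
print(df["A"].dtype)

# Both null markers are stored as the NaT singleton.
second_is_nat = df["A"].iloc[1] is pd.NaT
third_is_nat = df["A"].iloc[2] is pd.NaT
print(second_is_nat, third_is_nat)
```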
Thanks for the response! :) Sorry for the example not being copy-pastable. ( Our use case: We have a very brutal method that sanitizes all So what is unclear/confusing is that
I thought that maybe for our case, we should serialize before sending values to the database:
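The snippet that originally followed is not preserved here. As a rough sketch of what such a serialization step could look like, assuming scalar cell values: the helper name to_db_records is hypothetical and not part of pandas, and the sample frame is made up.

```python
import pandas as pd

def to_db_records(df):
    """Hypothetical helper: turn a frame into plain tuples, mapping every
    null-like cell (NaN, NaT, None) to None. Assumes scalar cell values."""
    records = []
    for row in df.itertuples(index=False, name=None):
        records.append(tuple(None if pd.isna(value) else value for value in row))
    return records

# Made-up frame with a datetime column and a float column, each with a null.
df = pd.DataFrame({
    "A": pd.to_datetime(["2017-01-01", None]),
    "B": [1.0, float("nan")],
})
records = to_db_records(df)
print(records)
```

Because the replacement happens on Python objects rather than inside the frame, the datetime column never has to hold None, which sidesteps the coercion to NaT.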
But that's an extra step to perform. With large datasets, it can be a significant step. I also thought about using
..and I felt that it would be more intuitive to return here
@grechut the way IIRC this is handled in
see also this comment: #15533 (comment) which is a similar issue. we have to come up with a good API for this.
So maybe According to the docs Another note, after reading docs, I thought that
But it didn't work this way. All those remarks are API-wise. Implementation-wise they might be hard and bring little benefit. So maybe just raise a warning/error (partially pseudocode):
    if (column.dtype == 'datetime' and column_has_NaTs
            and other is not pd.NaT and pd.isnull(other)):
        raise ValueError(
            f"Trying to replace NaT with {other} would require changing the type of {column.name}."
        )
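A runnable version of that idea might look like the following. This is only a sketch of the proposed check, not something pandas does: check_where_other is a hypothetical name, and it assumes other is a scalar.

```python
import pandas as pd
from pandas.api.types import is_datetime64_any_dtype

def check_where_other(df, other):
    """Hypothetical guard: refuse a null replacement value other than NaT
    when a datetime column already contains NaT."""
    for name, column in df.items():
        has_nats = is_datetime64_any_dtype(column) and column.isna().any()
        if has_nats and other is not pd.NaT and pd.isna(other):
            raise ValueError(
                f"Trying to replace NaT with {other!r} would require "
                f"changing the dtype of column {name!r}."
            )

# Made-up frame: column A holds a NaT, column B has no nulls.
df = pd.DataFrame({"A": pd.to_datetime(["2017-01-01", None]), "B": [1.0, 2.0]})

check_where_other(df, pd.NaT)  # NaT keeps the datetime dtype, so no error
try:
    check_where_other(df, None)
    message = None
except ValueError as exc:
    message = str(exc)
print(message)
```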
note #14968. So this is coerced here: This is correct, though I understand you want a different result. You can disambiguating This would work in this case, but likely will break other things. You can see what breaks and we can go from there. Note this same thinking would also change in a
The
pandas==1.0.3 but pandas==1.3.1 why? |
@Pysion-lin please open a new issue with a reproducible example if the above doesn't answer your question
Problem description
pandas.DataFrame.where seems not to replace NaTs properly. As in the example below, NaT values stay in the data frame after applying .where((pd.notnull(df)), None)
Code sample
INSTALLED VERSIONS
commit: None
python: 3.6.0.final.0
python-bits: 64
OS: Darwin
OS-release: 16.0.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
pandas: 0.19.2
nose: None
pip: 9.0.1
setuptools: 34.3.1
Cython: None
numpy: 1.12.0
scipy: 0.18.1
statsmodels: None
xarray: None
IPython: 5.3.0
sphinx: None
patsy: None
dateutil: 2.6.0
pytz: 2016.10
blosc: None
bottleneck: None
tables: None
numexpr: None
matplotlib: 2.0.0
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: 0.9999999
httplib2: None
apiclient: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.9.5
boto: None
pandas_datareader: None