reindex(fill_value=None) fills with np.NaN instead of None #14188

nekobon · 2016-09-08T22:47:30Z

Code Sample, a copy-pastable example if possible

In [12]: import pandas as pd

In [13]: s = pd.Series(['a', 'b'])

In [14]: s.reindex([0,1,2], fill_value=None)
Out[14]: 
0      a
1      b
2    NaN
dtype: object

Expected Output

0      a
1      b
2    None
dtype: object

output of `pd.show_versions()`

commit: None
python: 3.5.2.final.0
python-bits: 64
OS: Linux
OS-release: 4.4.0-34-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8

pandas: 0.18.0
nose: None
pip: 8.1.2
setuptools: 25.2.0
Cython: 0.20.1
numpy: 1.8.1
scipy: 0.17.0
statsmodels: 0.6.1
xarray: None
IPython: 1.1.0
sphinx: 1.2b1
patsy: 0.4.1
dateutil: 2.4.2
pytz: 2013b
blosc: None
bottleneck: None
tables: 3.1.1
numexpr: 2.4
matplotlib: None
openpyxl: 2.3.0
xlrd: 0.9.3
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: 0.9.8
pymysql: None
psycopg2: None
jinja2: 2.7
boto: None

in source

This is happening in BlockManager.reindex_indexer()
https://github.com/pydata/pandas/blob/8af626474f6f314527a9ad3f15403aa2dd8c402d/pandas/core/internals.py#L3820-L3822

jreback · 2016-09-08T22:54:41Z

This is by definition: np.nan is the missing-value indicator.

jreback · 2016-09-08T22:56:25Z

http://pandas.pydata.org/pandas-docs/stable/text.html

nekobon · 2016-09-08T23:34:21Z

I think it's confusing to get NaN when we give fill_value=None explicitly, without warnings or exceptions.

According to the reindex documentation (http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.reindex.html):

fill_value : scalar, default np.NaN
    Value to use for missing values. Defaults to NaN, but can be any “compatible” value

Perhaps we can improve the documentation to mention that None is not a compatible value there.

jreback · 2016-09-08T23:36:48Z

The default argument is None, meaning it's not passed. This is a fairly standard convention.

What I would take for documentation is a small section near the top of missing.rst, adding that strings use np.nan as the missing value.

nekobon · 2016-09-09T00:06:46Z

I see that not supporting None here is consistent with this issue on fillna.

It's true that defaulting to None is standard in Python, but it's also common to use a sentinel object (object()) when None could be a meaningful value. This would let us use None for both fillna and fill_value, and I think that would be an improvement. What do you think?
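
To make the suggestion concrete, here is a minimal sketch of the two conventions using a toy dict-based reindex. This is not pandas code: `_NO_FILL`, `reindex_none_default` and `reindex_sentinel_default` are hypothetical names made up for illustration, and the real logic in BlockManager.reindex_indexer() may well differ.

```python
import math

# Hypothetical sentinel meaning "fill_value was not passed".
# pandas does not define this name; it is an assumption for this sketch.
_NO_FILL = object()


def reindex_none_default(data, new_index, fill_value=None):
    """Toy reindex over a dict where None means "not passed"."""
    if fill_value is None:
        # An explicit fill_value=None is indistinguishable from the default.
        fill_value = math.nan
    return [data.get(key, fill_value) for key in new_index]


def reindex_sentinel_default(data, new_index, fill_value=_NO_FILL):
    """Toy reindex over a dict where only the sentinel means "not passed"."""
    if fill_value is _NO_FILL:
        fill_value = math.nan
    # An explicit None survives and is used as the fill value.
    return [data.get(key, fill_value) for key in new_index]


data = {0: "a", 1: "b"}

# Explicit None is replaced by NaN under the None-default convention.
with_none_default = reindex_none_default(data, [0, 1, 2], fill_value=None)

# Explicit None is kept under the sentinel convention.
with_sentinel = reindex_sentinel_default(data, [0, 1, 2], fill_value=None)

# Omitting fill_value still falls back to NaN under the sentinel convention.
with_sentinel_omitted = reindex_sentinel_default(data, [0, 1, 2])
```

With the sentinel, only omitting the argument triggers the NaN default, so callers who really want None can ask for it, while everyone else sees no change.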

jreback · 2016-09-09T00:10:52Z

You could use a sentinel, but we don't allow None filling for a variety of reasons.

nekobon · 2016-09-09T00:22:33Z

Could you share with me some of those reasons? I've been using None in Series and DataFrames, so I'm curious about why we shouldn't.

jreback · 2016-09-09T00:34:35Z

http://pandas.pydata.org/pandas-docs/stable/missing_data.html

nekobon · 2016-09-09T00:55:57Z

Thank you

nekobon closed this as completed Sep 9, 2016

jreback added the Missing-data and Usage Question labels Sep 9, 2016

jreback added this to the No action milestone Sep 9, 2016
