Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NaN value in dataframe caused _write_mysql error #4199

Closed
simomo opened this issue Jul 11, 2013 · 4 comments
Closed

NaN value in dataframe caused _write_mysql error #4199

simomo opened this issue Jul 11, 2013 · 4 comments

Comments

@simomo
Copy link

simomo commented Jul 11, 2013

  • I have a dataframe with nan value inside and wanna use io.sql.write_frame function to store it in a table.
  • But, nan value can not be treated well by MySQLdb, it will raise an exception:
OperationalError: (1054, "Unknown column 'nan' in 'field list'")
  • I've tried to convert all nan value to None, but it turns out to be impossible, fillna function treat param value=None as "the caller don't pass a value" not "the nan value should be None"
  • I think add some hacks to convert nan to None in io.sql._write_mysql can fix this.
@hayd
Copy link
Contributor

hayd commented Jul 11, 2013

closing as duplicate of #2754.

Think current workaround is to first use df[pd.isnull(df)] = None (rather than fillna).

@hayd hayd closed this as completed Jul 11, 2013
@simomo
Copy link
Author

simomo commented Jul 11, 2013

I tried this, and got an error:

df = DataFrame({'a': [1, 2, 3], 'b': ['1', '2', '3'], 'c': [nan, 2.0, 3.0]})
df[pd.isnull(df)] = None


---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
/home/duozhang/python-workspace/notebook/l-demo/<ipython-input-182-db74023de5e1> in <module>()
----> 1 df[pd.isnull(df)] = None

/usr/local/lib/python2.7/dist-packages/pandas/core/frame.pyc in __setitem__(self, key, value)
   2033             self._setitem_array(key, value)
   2034         elif isinstance(key, DataFrame):
-> 2035             self._setitem_frame(key, value)
   2036         else:
   2037             # set column


/usr/local/lib/python2.7/dist-packages/pandas/core/frame.pyc in _setitem_frame(self, key, value)
   2068         if self._is_mixed_type:
   2069             if not self._is_numeric_mixed_type:
-> 2070                 raise ValueError('Cannot do boolean setting on mixed-type frame')
   2071 
   2072         self.where(-key, value, inplace=True)

ValueError: Cannot do boolean setting on mixed-type frame

And I found the solution in #1972 .

In [10]: df.where(pd.notnull(df), None)
Out[10]: 
   a     m     p     x
0  1  None     0  None
1  2    10  None  None
2  3    11    20  None
3  4    12    21  None

This code works good.
But I think this transaction should be built in the write_frame function. Pandas shouldn't let user search on google for hours to solve such a common issue.

@hayd
Copy link
Contributor

hayd commented Jul 11, 2013

You're right, this should "just work" from read_sql, and I think it'll be fixed in 0.13 (probably in dev quite soon).

Team are currently working on better sql support, see #4163.

Thanks for reporting this and if you find any other bugs please let us know!

@Lwa3000
Copy link

Lwa3000 commented Jul 3, 2018

Hi, Although this tissue is closed, I'm still getting the error:OperationalError: (1054, "Unknown column 'nan' in 'field list'").
The following works for me:
df.where(pd.notnull(df), None)

The following does not work:
df[pd.isnull(df)] = None

my pandas version is higher than 0.13:

'0.22.0'

my other versions:
pymysql: '0.8.0'
sqlalchemy: '1.2.5'
Python 3.6.4

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants