'datetime64[ns]' columns cause fillna function fail #3047
Comments
You do realize that df.fillna(np.nan) is a no-op? |
In [169]: df_t1
Out[169]:
             date_date             date_obj  int
0  2012-01-01 00:00:00  2012-01-01 00:00:00    1
1  2012-01-01 00:00:00  2012-01-01 00:00:00    2
2  2012-01-01 00:00:00                 None  NaN |
Construct your series like this
The issue when constructing a Series without specifying a dtype is that it is ambiguous what you want: we cannot automatically convert the missing sentinels (NaN/None) properly. If you ONLY use
We allow np.nan setting after a series of dtype
The following 'work', but leave you in a funny state, and as you have seen
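To make the dtype point concrete, here is a minimal sketch (not the snippet originally posted in this comment) that assumes a recent pandas release. Passing `dtype` explicitly removes the ambiguity, so both missing sentinels end up as NaT:

```python
from datetime import datetime

import numpy as np
import pandas as pd

# Explicit dtype: None and NaN are both stored as NaT,
# the missing-value marker for datetime data.
dates = pd.Series([datetime(2012, 1, 1), None, np.nan], dtype="datetime64[ns]")

print(dates.dtype)         # datetime64[ns]
print(dates.isna().sum())  # 2

# With a proper datetime column, fillna accepts a datetime-like value.
filled = dates.fillna(pd.Timestamp("1970-01-01"))
```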
|
If this solves your problem, simomo, you can close this issue. |
def grab_data(table_name, size_of_page=20000):
    '''
    Grab data from a db table
    size_of_page: the second argument of sql's limit clause
    '''
    cur.execute('select count(*) from %s' % table_name)
    records_number = cur.fetchone()[0]
    loop_number = records_number / size_of_page + 1
    print '****\nStart Grab %s\n****\nrecords_number: %s\nloop_number: %s' % (table_name, records_number, loop_number)
    start_position = 0
    df = DataFrame()  # WARNING: this dataframe object will contain all records of a table, so BE CAREFUL of the MEMORY USAGE!
    for i in range(0, loop_number):
        sql_export = 'select * from %s limit %s, %s' % (table_name, start_position, size_of_page)
        df = df.append(psql.read_frame(sql_export, conn), verify_integrity=False, ignore_index=True)
        start_position += size_of_page
        print 'start_position: %s' % start_position
    return df
|
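For readers on a recent pandas release, the same paged read can be sketched roughly as below. This is only an illustration under stated assumptions: the in-memory SQLite table `events` and its `created` column are invented for the example, and the `date_columns` argument is a hypothetical addition that hands the date parsing to `read_sql_query` via its `parse_dates` parameter.

```python
import sqlite3

import pandas as pd


def grab_data(conn, table_name, date_columns, page_size=20000):
    """Read a whole table in pages and return a single DataFrame.

    table_name is interpolated into the SQL text, so it must come
    from trusted code, never from user input.
    """
    query = "SELECT * FROM %s" % table_name
    # chunksize makes read_sql_query yield one DataFrame per page;
    # parse_dates converts the listed columns to datetime on the way in.
    chunks = pd.read_sql_query(
        query, conn, parse_dates=date_columns, chunksize=page_size
    )
    # Concatenate once at the end instead of appending inside the loop.
    return pd.concat(list(chunks), ignore_index=True)


# Hypothetical demo table with one missing date.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER, created TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [(1, "2012-01-01 00:00:00"), (2, None), (3, "2012-01-03 00:00:00")],
)
conn.commit()

events = grab_data(conn, "events", ["created"], page_size=2)
```

The resulting frame still holds every row in memory, so the warning in the original function applies here as well.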
So after you read from SQL, just force the date columns, something like df['date'] = Series(df['date'].values, dtype='datetime64[ns]') |
I'd recommend using |
df['date'] = Series(df['date'].values,dtype='datetime64[ns]') |
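A small end-to-end sketch of that workaround, assuming a recent pandas release. The frame below is a hypothetical stand-in for a SQL result; the column is built with `dtype=object` so that it holds raw Python `datetime`/`None` values like the `date_obj` column shown earlier in the thread.

```python
from datetime import datetime

import pandas as pd

# Hypothetical stand-in for what the SQL reader returned.
df = pd.DataFrame({
    "date": pd.Series(
        [datetime(2012, 1, 1), datetime(2012, 1, 1), None], dtype=object
    ),
    "int": [1.0, 2.0, float("nan")],
})

# Force the column to datetime64[ns]; the None entry becomes NaT.
# index=df.index keeps the rows aligned if the index is not 0..n-1.
df["date"] = pd.Series(df["date"].values, index=df.index, dtype="datetime64[ns]")
n_missing_dates = int(df["date"].isna().sum())

# Fill each column with a value of the matching type.
df["date"] = df["date"].fillna(pd.Timestamp("2012-01-01"))
df["int"] = df["int"].fillna(0)
```

Filling per column avoids pushing one fill value through columns of different dtypes.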
Practicality beats purity. That said, I have pointed the sql reader authors to this, so that it can be added. Dealing with types is non-trivial and does take some time. Thanks for the report. If this solves your issue, pls close this. |
I've pasted the final solution here: |