BUG: various bug fixes for DataFrame/Series construction #2752
0 and 1 len ndarrays
datetimes that are single objects
mixed datetimes and objects (GH pandas-dev#2751)
astype now converts correctly with a datetime64 type to object, NaT are converted to np.nan
_get_numeric_data with empty mixed-type returning empty, but index was missing
DOC: release notes updated, added missing_data section to docs, whatsnew 0.10.2
Thanks jeff

fyi: this combined with #2708 is causing

just saw that
actually, here's what i have:

intname = np.dtype(np.int_).name
floatname = np.dtype(np.float_).name
datetime64name = np.dtype('M8[ns]').name
objectname = np.dtype(np.object_).name

# single item
df = DataFrame({'A' : 1, 'B' : 'foo', 'C' : 'bar', 'D' : Timestamp("20010101"), 'E' : datetime(2001,1,2,0,0) },
               index=np.arange(10))
result = df.get_dtype_counts()
expected = Series({intname: 1, datetime64name: 2, objectname : 2})
assert_series_equal(result, expected)

that's working fine but this is not:

# GH #2751 (construction with no index specified)
df = DataFrame({'a':[1,2,4,7], 'b':[1.2, 2.3, 5.1, 6.3], 'c':list('abcd'), 'd':[datetime(2000,1,1) for i in range(4)] })
result = df.get_dtype_counts()
expected = Series({intname: 1, floatname : 1, datetime64name: 1, objectname : 1})
assert_series_equal(result, expected)

because the integer list is being promoted to int64, even on 32-bit. is that intended?

see commit 9a047679c2e6f2064fb4b656a2461cceba7df679 (b05b97986f779cdd9007f281c4255c0d31ab263c below, if it's still showing, is out of date)
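For anyone following along, the platform-dependent names that the test relies on can be inspected directly. This is a minimal sketch, not code from the PR: the helper `platform_dtype_names` is hypothetical, and `np.float64` stands in for `np.float_`, which newer NumPy releases no longer provide.

```python
import numpy as np


def platform_dtype_names():
    """Return the dtype names a dtype-count test would key on.

    Hypothetical helper for illustration only; it is not part of pandas.
    """
    return {
        # np.int_ is the platform default integer, so this name can
        # differ between 32-bit and 64-bit builds.
        "int": np.dtype(np.int_).name,
        # np.float64 is used here instead of np.float_ (an assumption
        # made so the sketch also runs on recent NumPy).
        "float": np.dtype(np.float64).name,
        "datetime64": np.dtype("M8[ns]").name,
        "object": np.dtype(np.object_).name,
    }


names = platform_dtype_names()
for key, value in names.items():
    print(key, "->", value)
```

Only the `int` entry is expected to vary by platform, which is why a hard-coded `int64` in `expected` can pass on one build and fail on another.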
In [29]: p.DataFrame([1,2]).dtypes
Out[29]:
0    int32
Dtype: object

In [30]: p.DataFrame({'a': [1,2]}).dtypes
Out[30]:
a    int64
Dtype: object
this does look inconsistent. is your example on 32-bit? (default for ints should be np.int_)

On Feb 10, 2013, at 6:13 PM, stephenwlin wrote:

yeah, 32-bit
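The comparison above can be repeated on any install with a short script. This is a sketch, assuming a reasonably recent pandas where `.dtypes.iloc[0]` is available; it only reports what the two construction paths produce locally and makes no claim about what a 32-bit build will print.

```python
import pandas as pd

# Same integers, two construction paths: a bare list and a dict of lists.
from_list = pd.DataFrame([1, 2]).dtypes.iloc[0]
from_dict = pd.DataFrame({"a": [1, 2]}).dtypes.iloc[0]

print("list path:", from_list)
print("dict path:", from_dict)
print("consistent:", from_list == from_dict)
```

If the last line prints `False`, the two paths are inferring different integer widths, which is the inconsistency being discussed here.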
ok, just fyi to save you time, I've narrowed it down to here's the relevant lines, with some print statements added (and what they print when calling

print data, np.asarray(data).dtype
# "[1, 2] int32"
subarr = lib.list_to_object_array(data)
print subarr, subarr.dtype
# "[1 2] object"
print [np.asarray(x).dtype for x in subarr]
# "[dtype('int32'), dtype('int32')]"
subarr = lib.maybe_convert_objects(subarr)
print subarr, subarr.dtype
# "[1 2] int64"
subarr = com._possibly_cast_to_datetime(subarr, dtype)
print subarr, subarr.dtype
# "[1 2] int64"

i can look at it further if you want, but since you've been working on this I'm guessing you're more familiar with it and less likely to break something else with a fix (if this needs to be fixed)
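The trace above suggests the platform integer is lost once the list passes through an object array. Below is a toy model of that sequence using only NumPy. Both helpers are hypothetical stand-ins, not the pandas internals: `to_object_array` imitates `lib.list_to_object_array`, and `convert_objects_fixed_width` assumes (based only on the printed output) that the conversion step always targets `int64` for all-integer input.

```python
import numpy as np


def to_object_array(values):
    # Hypothetical stand-in for lib.list_to_object_array:
    # store the items in a 1-D array of dtype=object.
    arr = np.empty(len(values), dtype=object)
    arr[:] = values
    return arr


def convert_objects_fixed_width(arr):
    # Hypothetical stand-in for the object-conversion step.
    # Assumption: all-integer input is always cast to int64,
    # regardless of the platform default integer.
    all_ints = all(
        isinstance(x, (int, np.integer)) and not isinstance(x, bool)
        for x in arr
    )
    return arr.astype(np.int64) if all_ints else arr


data = [1, 2]
direct = np.asarray(data)                        # platform default integer
subarr = to_object_array(data)                   # dtype=object
converted = convert_objects_fixed_width(subarr)  # int64 under the assumption

print(direct.dtype, subarr.dtype, converted.dtype)
```

Under this model, `direct` keeps the platform default while `converted` is fixed at `int64`, so the two only agree on builds where the default integer is already 64-bit.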
hmm

On Feb 10, 2013, at 6:40 PM, stephenwlin wrote:

ok made a PR to just fix the build for now (leaving that particular test referencing int64 instead of platform int)
0 and 1 len ndarrays not inferring dtype correctly
datetimes that are single objects not inferring dtype
mixed datetimes and objects (GH #2751), casting datetimes to object
timedelta64 creation on series subtraction (of datetime64[ns])
astype on datetimes to object is now handled (as well as NaT conversions to np.nan)
astype conversion
construction with datetimes
1 len ndarrays